Advance to content

Galois
Technology
Client Services
Company
Blog

Entries in the “Open Source” category

Tech talk: abcBridge: Functional interfaces for AIGs and SAT solving

August 19th, 2010 by Iavor S. Diatchki

Galois is pleased to host the following tech talk. These talks are open to the interested public. Please join us!

title:
abcBridge: Functional interfaces for AIGs and SAT solving (slides, video)
presenter:
Edward Z. Yang
time:

10:30am, Tuesday, 24 August 2010

location:

Galois Inc.
421 SW 6th Ave. Suite 300, Portland, OR, USA
(3rd floor of the Commonwealth building)

abstract:

SAT solvers are perhaps the most under-utilized high-tech tools that the modern software engineer has at their fingertips. An industrial strength SAT solver can solve most human generated NP-complete problems in time for lunch, and there are many, many practical problem domains which involve NP-complete problems. However, a major roadblock to using a SAT solver in your every day routine is translating your problem into SAT, and then running it on a highly optimized SAT solver, which is probably implemented in C or C++ and not your usual favorite programming language.

This talk is about the use, design and implementation of abcBridge, a set of Haskell bindings for ABC, a system for sequential synthesis and verification produced by the Berkeley Logic Synthesis and Verification Group. ABC looks at SAT solving from the following perspective: given two circuits of logic gates (ANDs and NOTs), are they equivalent? ABC is imperative C code: abcBridge provides a pure and type-safe interface for building and manipulating and-inverter graphs. We hope to release abcBridge soon as open source.

bio:
Edward Z. Yang is an undergraduate studying computer science at MIT. He has been interning at Galois over the summer and enjoying every second of it. His interests include blogging, functional programming and practical applications of computer science research. You can read his blog at http://blog.ezyang.com/

Posted in Formal Methods, Functional Programming, Haskell, Open Source, Tech Talks | Comments (0)

Galois Tech Talk Video: Large-Scale Static Analysis at Mozilla

July 9th, 2010 by Iavor S. Diatchki

We are pleased to announce the availability of a new Galois tech talk video: “Large-Scale Static Analysis at Mozilla”, presented by Taras Glek.
Slides and more details about the talk are available on the announcement page.

Large-Scale Static Analysis at Mozilla from Galois Video on Vimeo.

For more videos, please visit http://vimeo.com/channels/galois.

Posted in Community, Open Source, Tech Talks, Technology, Video | Comments (0)

Tech Talk: Large-Scale Static Analysis at Mozilla

June 3rd, 2010 by Iavor S. Diatchki

Galois is pleased to host the following tech talk. These talks are open to the interested public. Please join us!

title:
Large-Scale Static Analysis at Mozilla (slides, video)
presenter:
Taras Glek
time:
10:30 am, 8 June 2010, Tuesday
location:
Galois Inc.
421 SW 6th Ave. Suite 300, Portland, OR, USA
(3rd floor of the Commonwealth building)
abstract:
A competitive browser market requires fast-paced improvements to the codebase. Such improvements may require significant refactoring of large parts of the codebase. Mozilla Firefox is one of the largest open source C++ projects. Unfortunately C++ is a complex language: method overloading, virtual functions, template instantiation, pointer arithmetic, etc reduce developer productivity. Mozilla developed C++ static analysis and refactoring tools to increase developer leverage in C++. Static analysis is done via Dehydra/Treehydra GCC plugins and refactoring is accomplished by extending the Elsa C++ parser. This talk will discuss why Mozilla needs static analysis, why there are so few tools for C++, and specific projects that we’ve embarked on.
bio:
Taras Glek is a software engineer at Mozilla Corporation. He works on static analysis and startup performance. Taras blogs about it at http://blog.mozilla.com/tglek/.

Posted in Open Source, Tech Talks, Technology | Comments (4)

Tech Talk: An Introduction to Communicating Haskell Processes

March 10th, 2010 by Iavor S. Diatchki

Please note the unusual time-slot for this talk!

Details:

  • Title: An Introduction to Communicating Haskell Processes
  • Presenter: Neil Brown
  • Date: Monday, 15 March 2010
  • Time: 10:30am
  • Location: Galois Inc.
    421 SW 6th Ave. Suite 300, Portland, OR, USA
    (3rd floor of the Commonwealth building)

Abstract: Haskell is an excellent language for combining the power of functional programming with imperative constructs. This characteristic led to the development of the Communicating Haskell Processes (CHP) libraries, which support imperative synchronous message-passing in Haskell. The core ‘chp’ library provides basic message-passing, concurrency and choice, as well as integrated support for tracing. The ‘chp-plus’ library provides higher-level features such as process composition operators and behaviour combinators. This talk provides an introduction to the two libraries and the programming style they engender — as well as a brief look at the formal semantics underlying the libraries.

Bio: Neil Brown is a software researcher from the University of Kent in the UK. After graduating he worked for several years as a machine learning researcher in industry at QinetiQ, before returning to university to undertake his PhD. He started out writing a Haskell-based compiler for synchronous message-passing languages, and ended up programming some synchronous message-passing libraries for Haskell itself. As well as these CHP libraries, he also developed the Progression benchmark-graphing library for Haskell. More detail on both projects can be found on his blog.

Posted in Functional Programming, Haskell, Open Source, Tech Talks | Comments (1)

Tech Talk: Modern Benchmarking in Haskell

February 19th, 2010 by Iavor S. Diatchki

Details:

  • Title: Modern Benchmarking in Haskell (slides)
  • Presenter: Don Stewart
  • Date: Tuesday, 23 February 2010
  • Time: 10:30am
  • Location: Galois Inc.
    421 SW 6th Ave. Suite 300, Portland, OR, USA
    (3rd floor of the Commonwealth building)

Abstract: Thanks to work by Bryan O’Sullivan, there has been a renaissance in performance benchmarking tools for Haskell, built upon Criterion.

“Compared to most other benchmarking frameworks (for any programming language, not just Haskell), criterion focuses on being easy to use, informative, and robust.”

Criterion uses statistically robust mechanisms for sampling and computing sound microbenchmark results, and is more stable in the presence of noise on the system than naive timings.

Criterion has in turn spawned some extensions:

  • Progrssion: compare different criterion graphs
  • NoSlow: a new array benchmark suite based on Criterion

In this talk I will present these tools, how to use them, and how to make your performance benchmarks in Haskell, or languages Haskell can talk to, more reliable.

Bio: Don is an Australian open source hacker, and engineer at Galois, Inc, in Portland, Oregon, where he works on creating trustworthiness and assurance in critical systems with an emphasis on language design and compiler techniques. Don is co-author of the book, Real World Haskell, published by O’Reilly, and the XMonad window manager.


Galois has been holding weekly technical seminars for several years on topics from functional programming, formal methods, compiler and language design, to cryptography, and operating system construction, with talks by many figures from the programming language and formal methods communities. The talks are open and free. An RSVP is not required, but feel free to contact the organizer with questions and comments.

Posted in Haskell, Open Source, Tech Talks | Comments (1)

GHC Nominated for Programming Language Award

January 6th, 2010 by john

ACM SIGPLAN recently announced a new award:

The SIGPLAN Programming Languages Software Award is awarded to an institution or individual(s) to recognize the development a software system that has had a significant impact on programming language research, implementations, and tools. The impact may be reflected in the wide-spread adoption of the system or its underlying concepts by the wider programming language community either in research projects, in the open-source community, or commercially. The award includes a prize of $2,500.

I think that GHC (Glasgow Haskell Compiler) and the two Simons (Peyton Jones and Marlow) are prime candidates for this. So, being careful to stay within the 500 word limit, I submitted a nomination statement for GHC as follows:

For the last decade, Haskell has been at the center of the most exciting innovations within programming languages, with work on software transactional memory, generalized algebraic data types, rank-N polymorphism, monads, multi-parameter type classes, embedded domain-specific languages, property-based testing, data parallelism, thread-profiling, and so on, generating hundreds of research papers from many diverse research groups.

GHC, the Glasgow Haskell Compiler, is the vehicle that made this research possible.

It is hard to explore radical ideas on real systems, yet the GHC team created a flexible platform that allows other researchers to explore the implications of their ideas and to test whether they really work in the large. From the first beta release in 1991, GHC emphasized collaboration and open “bazaar” style development, as opposed to the “cathedral” development of most of its contemporaries. GHC was open source even before Linux made open source cool. GHC has continued in the same vein, now listing over 60 contributors to the codebase.

In those early days, efficient compilation of a higher-order, allocation-rich, lazy functional language seemed to be a pipe dream. Yet GHC has risen to be a top-flight performer in the online language performance shootout (shootout.alioth.debian.org), comparable with Java Server-6, and approaching native C in performance overall. This is a tribute to the incredible amount of profound optimization built into the compiler, with techniques like cross-module code migration, unboxed data types, and automated removal of intermediate data structures, all done through correctness-preserving transformations that exploit the algebraic simplicity of Haskell terms, even in the presence of monadic effects.

The impact GHC has had on programming language research would be sufficient to merit an award by itself, but GHC is having a corresponding influence in industry. By showing the feasibility of purely functional, statically-typed programming in the large, GHC Haskell has also had clear influence on many of the newest generation of languages, such as C#, F#, Java Generics, LINQ, Perl 6, Python, and Visual Basic 9.0. As Soma Somasegar, Microsoft Developer Division Chief, said in 2007, “One of the important themes in programming languages over recent years has been a move to embrace ideas from functional programming, [which] are helping us address some of the biggest challenges facing the industry today, from the impedance mismatch between data and objects to the challenges of the multi-core and parallel computing space.”

GHC now supports a burgeoning professional Haskell world. The O’Reilly book Real World Haskell, targeted to professional programmers and oriented to GHC, was published in 2008. It went on to win the Jolt Award for best technical book of the year. In 2009 there were 3500+ Haskell package updates, with more than 100,000 package downloads in November alone. GHC is now used across the financial sector in institutions like Credit Suisse and Standard Chartered Bank, and for high assurance software in companies like Amgen, Eaton, and Galois. Some of these companies came together in 2009 to create the Industrial Haskell Group, whose purpose is to ensure the health and longevity of GHC.

499 words. Whew! There is so much that could be said, but let’s hope this is enough. I think the case is very strong, and both Simon’s deserve honor and accolade for their work. Thank you both so much!

Posted in Functional Programming, Haskell, Open Source | Comments (4)

Engineering Large Projects in Haskell: A Decade of FP at Galois

April 27th, 2009 by donstewart

Galois has been building systems in Haskell for the past decade. This talk describes some of what we’ve learned about in-the-large, commercial Haskell programming in that time. (Download slides :: .pdf).

  • When and where we use Haskell
  • Correctness, productivity, scalabilty, maintainability
  • What language features we like: types, purity, types, abstractions, types, concurrency, types!
  • The Haskell toolchain: FFI, HPC, Cabal, compiler, libraries, build systems, etc.
  • Being a commercial entity in a largely open source community

This talk was presented Monday 20th April at λondon HUG.

Posted in Community, Functional Programming, Haskell, Open Source | Comments (7)

One Million Haskell Downloads…

March 23rd, 2009 by admin

Galois engineers write a lot of Haskell (in fact, our tech catalogue is built pretty much entirely on it). We find we’re able to build systems faster, with fewer errors, and in turn are able to apply techniques to increase assurance, helping us deliver value to our clients. We’ve successfully engineered large systems in the language for nearly a decade.

We also use and write a lot of open source Haskell code.

Since 2004 we’ve been investing in improving packaging and distribution infrastructure for Haskell code, and since 2007 Galois has been hosting hackage.haskell.org: the central online database of open source Haskell libraries and applications. These packages are built via Cabal (dreamed up by Galois’ own Isaac Potoczny-Jones), and distributed via cabal-install. Hackage now hosts more than 1100 released libraries and tools, and has been growing rapidly (and, incidentally, Galois employees have released or been significant contributors to just shy of 10% of all Hackage projects).

We’ve wondered for a while now just how busy Hackage was becoming, and in turn, what other interesting information about Haskell were buried in the Hackage logs. This post answers those questions for the first time.

We’ll see:

  • Total, and growing, Haskell source downloads
  • The most popular Haskell projects hosted on Hackage
  • The most popular development categories
  • The most popular methods for distributing Haskell source

And speculate a little on where Hackage is heading.

Background

We’ve known for a while that uploads to Hackage were growing. You might have seen this graph elsewhere (it’s derivable from the RSS logs of package uploads):

There’s a pretty clear trend upwards. Average daily Hackage releases have increased 4-fold since Hackage was launched, and it’s now averaging 10 packages a day released. The question is: was anyone using this code?

Measuring Downloads

To measure downloads, we processed the apache logs for Hackage going back to its launch (incidentally, the log processing script – in Haskell of course – uses the Haskell zlib, bytestring, filepath, containers, bytestring-csv libraries). This generates a two dimensional map of downloads per project, per month. (You can find links to the raw data at the end of the article).

We can now play with the data to see some interesting trends.

Some cautionary notes interpreting this data: we only process Hackage package downloads (i.e. “GET binary-0.5..tar.gz” requests). We are only able to measure source downloads – that is, someone downloads a package they will build from source, with GHC. We cannot measure downloads from open source mirrors (such as those provided by the major unix distributions), nor can we measure users of binary packages (such as on Debian, Ubuntu or Fedora), nor can we measure downloads of packages not hosted on Hackage (such as gtk2hs, pugs, darcs or ghc). So this doesn’t represent all Haskell downloads, only downloads of source packages.

We have complete data for Hackage, from the initial alpha tests in July 2006, through to launch in January 2007, until March 2009.

Total Downloads

To begin with, here’s the cumulative downloads from Hackage, over time:

As you can see, we’re just shy of 900 thousand package downloads, and from January 2008 to December 2008 – the first complete year of live operation – there were 500 thousand downloads, with a further 150 thousand downloads in the first 2 months of 2009.

To visualize the growth trend, here is the same cumulative download line on a logarithmic scale:

In the alpha-testing period from July 2006 to July 2007, downloads grew exponentially (4 orders of magnitude in 4 quarters) as existing developers started to use the system. Since the middle of 2007, the rate of growth has slowed, increasing by an order of magnitude in a 15 month period.

Hackage is on target to reach its 1 millionth download next month. We’ll have a party.

Downloads by Month

Next up is the downloads per month, over time. I dropped a bezier curve on top to give a sense of the trend. As we only have partial data for March 2009, I’m excluding that. This graph essentially confirms the same trend as we see in the cumulative graphs. Interestingly, download spikes roughly correspond with upload spikes in our first graph (presumably as people scramble to get the new code).

Hackage is currently seeing 100 package downloads an hour, and that value has been doubling every 4 months for the past year and a half.

Package Popularity

Besides overall downloads, there’s a wealth of per-package information. In the following graphs we extract total downloads for each package (ignoring version numbers).

The popularity graph displays a classic “long tail“, where the download frequency of any package is inversely proportional to its rank in the frequency table.

This suggests that a good interface to a large Hackage database should behave something like the “long tail” sites like Amazon – where we rely on recommendations and other interlinking to ensure visibility of projects in the tail.

The frequency of downloads is even more visible on a double log scale, where the popularity pretty much matches Zipf’s law:

The download frequency doesn’t quite match the classic distribution at the top and tail of the curve. There are two reasons for this (and maybe other factors at play). Firstly, the tail of the curve drops off, as the bottom 10% of popular packages tend to be the newest packages, and so are unde-represented in the “market”.

The other interesting part of the popularity curve is at the top. The top 10 to 50 packages are, to varying degrees, less popular than we might predict. Why is this?

The Distribution Effect

Remember that Hackage is a source-only repository. So it is of primary interest to developers. As a Haskell package becomes more popular, it tends to get picked up by Linux and BSD distributions such as Debian or Ubuntu (and also distributed in binary form for Mac and Windows), removing the need to download the source. Popular packages are doomed to become less popular in source form if the distributions are doing their work!

This is particularly apparent for libraries distributed with GHC (the “extra libs”). Libraries such as containers, arrays, bytestring, parsec and network rarely need to be downloaded in source form, as they’re bundled with GHC forming a platform base. They should thus be under-repesented in source downloads.

You can see this “distribution effect” in this overlay of xmonad and its Debian package installs where source installs decline dramatically as soon as the binary packaging takes off.

Popular packages are doomed to be distributed through other channels.

Most popular packages

And here – for the first time – are the per-package popularity statistics for Hackage. First, the top 25 packages sorted by their cumulative total downloads. Executable applications are marked in blue.

1 xmonad 35428
2 HTTP 26203
3 zlib 24431
4 Cabal 23691
5 X11 21563
6 binary 15752
7 utf8-string 12633
8 mtl 12517
9 cabal-install 12274
10 regex-posix 11351
11 X11-extras 10509
12 xmonad-contrib 9794
13 haddock 9209
14 parsec 8468
15 bytestring 7473
16 regex-base 7438
17 HaXml 6307
18 network 6285
19 xmobar 6272
20 yi 6268
21 hscolour 6264
22 QuickCheck 5697
23 hslogger 5434
24 regex-compat 5266
25 ghc-paths 4653

And the next 75 Haskell packages, in order:

filepath, X11-xft, alex, happy, vty, cgi, terminfo, unix, GLUT, chunks, fingertree, OpenGL, time, pureMD5, regex-tdfa, xhtml, bzlib, Crypto, syb-with-class, hxt, tagsoup, HDBC, MissingH, SDL, haskell-src-exts, plugins, Stream, frag, curl, pcre-light, unix-compat, uniplate, wxcore, hinstaller, stm, html, Diff, polyparse, leksah, HUnit, hmp3, haskell-src, RJson, fastcgi, pandoc, arrows, YamlReference, parsedate, HGL, GLFW, process, extensible-exceptions, zip-archive, iconv, HDBC-sqlite3, TypeCompose, cpphs, hmatrix, HPDF, HAppS-Server, haskell98, hspread, HAppS-Util, rosezipper, gd, dlist, array, yi-gtk, haskeline, HStringTemplate, HAppS-Data, fgl, haskelldb, xml, cabal-rpm

Congratulations to authors of these packages – you made the top 10% most popular releases.

Most Popular Downloads in February

The previous table looked at the cumulative most popular projects. But that doesn’t necessarily reflect what is popular at the moment to download in source form. This table compares January downloads against February downloads, for the top 25 packages in Feburary:

Package Downloads Rank Change
HTTP 2926
zlib 2345
Cabal 2148
cabal-install 1490
utf8-string 1352
xmonad 1280
binary 1174
regex-posix 901 +8
parsec 842 +2
X11 834 -1
xmonad-contrib 754 +1
hscolour 739 -2
terminfo 713 +1
haddock 669 -6
ghc-paths 630 +2
HaXml 600 +12
extensible-exceptions 596 +16
QuickCheck 584 +9
regex-base 558 -4
time 529 -1
darcs 501 1
leksah 500 +18
regex-tdfa 496 +24
hslogger 441 -2

Most popular applications

The 25 most popular Haskell applications hosted on Hackage, to download in source form are:

xmonad, cabal-install, haddock, xmobar, yi, hscolour, alex, happy, frag, leksah, hmp3, pandoc, cpphs, cabal-rpm, darcs, c2hs, hoogle, lambdabot, cabal2arch, hpodder, monadius, lhs2tex, mkcabal, pugs, ghc-core

Note pugs and darcs have been primarily distributed separately to Hackage, until recently.

Most popular libraries by category

We can also determine the most popular libraries and tools in each semantic category on Hackage:

Task Library Downloads
Client-side HTTP HTTP 26203
Database HDBC 3098
XML HaXml 6307
Control mtl 12517
Parsing parsec 8468
Binary Parsing binary 15752
Logging hscolour 6264
Testing QuickCheck 5697
Regex regex-base + regex-posix 7438
Lexing alex 4360
Codec zlib 24431
Unicode IO utf8-string 12633
Sockets network 6285
Build System Cabal 23691
Documentation haddock 9209
Syntax hscolour 6264
3D Graphics GLUT + OpenGL 7345
2D Graphics SDL * 3016
Hashing pureMD5 3460
HTML xhtml 3391
Cryptography Crypto 3243
Generics syb-with-class 3230
IDE leksah 2408
JSON RJson 2222
Markup pandoc 2210
Numerics hmatrix 1844
Web Framework HAppS-Server * 1759
Graphs fgl 1658
Parallelism parallel 1370
Charting chart 1300
Code generation llvm 970
RSS feed 726
Wiki gitit 759

Note that SDL is represented here as gtk2hs (which provides many cairo-based 2D graphics functions) isn’t distributed on Hackage. Also, HAppS-Server has been superceded by happstack.

Honourable mentions

These packages didn’t quite take first place, but still have significant user support. I include happstack, although it is only a month old, as it replaces the previously popular HAppS-Server.

Parsing happy, polyparse 6685
Regex Regex-tdfa, regexpr, pcre-light 8791
Codec bzlib 3275
XML hxt, xml 4871
HTML tagsoup 3141
Client-side HTTP curl 2678
Generics uniplate 2675
2D Graphics wxcore 2601
Testing HUnit 2325
Database haskelldb, hsql 3164
Network Network-bytestring 1210
Control monadLib 1081
Web Framework happstack 549

Future

Finally, we can speculate on what would happen if the current download growth rate continued for a couple more years (projecting forward 18 months, using the trend of the last 18). We’d reach a cumulative total of 10 million source downloads around the end of 2010 (continuing the order of magnitude growth of the last 18 months).

Of course, a lot is unknown in this scenario. If everyone starts installing all their code via cabal-install, downloads will rocket, as does increasing reuse, by using more libraries.

However, if the top Haskell applications and libraries become an order of magnitude more popular, the distros will take them up, slowing growth. Growth will also slow if we run out of resources in some form or another: no more easy libraries to bind to, for example, just as we ran out of existing things to cabalize in 2007.

Get the data yourself

You can play with the data set yourself here:

  • Monthly downloads per package (CSV, HTML)
  • Packages by download frequency (CSV, HTML)

You can also get the full data in a sqlite database, courtesy of mmorrow on #haskell.

Posted in Community, Cryptography, Functional Programming, Galois News, Haskell, Open Source | Comments (8)

Trustworthy Voting Systems

March 2nd, 2009 by Aaron Tomb

Accurate and reliable elections are a critical component of an effective democracy. However, completely secure and trustworthy voting procedures are difficult to design, and no perfect solutions are known. Ideally, a trustworthy voting system should guarantee both verifiability (the ability to prove that the counted vote matches the submitted ballots) and privacy (the inability to link the contents of a vote with the voter who cast it).

These guarantees may now be achievable. Many researchers have proposed voting protocols that achieve verifiability and privacy in theory, and a few do so under assumptions that are satisfied by current election practices. Most of the protocols involve posting an encrypted version of the contents of every ballot in some public place (likely a web site), and depend on the properties of cryptographic operations to achieve privacy while allowing anyone to verify the final tally.

Now that practical, secure voting protocols exist, the time has come to bring them into use. One existing solution that comes close to achieving these goals while retaining compatibility with current voting practices is the Scantegrity II system. It has the advantage that it can operate under current US election conditions, without requiring any modification to existing optical ballot scanners, and with very little change to the individual voting process.

However, the software used in this system is only a prototype, with a number of shortcomings. Voter privacy depends on ability of a computer system to keep a key database completely secret, and accurate vote counting depends on the correct implementation of complex cryptographic algorithms. The software is tens of thousands of lines of code, and as with any other software of that size, many bugs certainly exist.

We believe that the importance of trustworthy election results and the past lack of success in creating reliable solutions warrants a new approach to the design of voting systems. In particular, we advocate a class of techniques known as formal methods that allow us to make precise mathematical assertions about how software should behave, and determine whether it satisfies those assertions. Government agencies within the Department of Defense make use of formal methods to ensure the reliability of important computer systems, and the draft update to the development standards used by the Federal Aviation Administration, DO178C, includes provisions for the use of formal methods. Voting systems deserve similar care.

One Approach

Formal Methods

Software designers employ a number of techniques to combat the difficulty of producing correct software. At the simple extreme, running the software directly through a few scenarios that match expected operating conditions and observing the results can help eliminate egregious mistakes, but does little to prevent subtle mistakes or malicious attacks. Most software, including that in existing voting machines, undergoes little more than this preliminary sanity checking.

The goal of formal methods is instead to prove that a piece of software necessarily behaves properly under all possible circumstances. The first step toward this goal is to make a connection between the concrete software and some well-defined mathematical model. It can be possible to extract a model directly from the concrete software. Alternatively, software architects can build the model first, and derive the concrete software from the mathematical abstraction.

In either case, the connection between the model and the concrete software should be such that a proving a property of the model guarantees that the property holds of the software.

A Formal Model of the Voting Process

We advocate developing a formal model of the complete pipeline of voting events, including ballot creation, vote marking, ballot recording or scanning, ballot aggregation and counting, and final certification. If described appropriately, in a fully-executable language, portions of this formal model can be used directly in a software implementation. This method of software development also makes feasible a wide variety of techniques to gain confidence in the correctness of the software, potentially including a complete formal proof of correctness for key components. It has the additional benefit of allowing us to make confident statements about the security and reliability of the non-computerized stages of the process.

The Voluntary Voting System Guidelines recommendation released in 2007 by the National Institute of Standards and Technology makes significant steps toward a formal definition of voting, going as far as including UML process diagrams and mathematical descriptions of a few important properties. We advocate taking this work a step farther, and describing as much of the process as possible in a precise, formal, and machine-readable language.

High-Assurance Cryptography

Voting protocols that provide simultaneous guarantees of integrity and privacy typically depend heavily on the properties of cryptographic algorithms. While, in theory, these algorithms transform data in ways that achieve the properties necessary for trustworthy voting, it is absolutely critical that the algorithms be implemented properly.

Recent work has made significant progress toward high-assurance cryptography. Researchers at Stanford have done work to formally verify that Java code implementing a cryptographic algorithm matches a specification written in a functional language. Galois’ own programming language, Cryptol, is also a functional language, designed specifically for implementation of cryptographic algorithms at a high level of abstraction, and also allows the high-level specification to be proven equivalent to code generated in C or VHDL. The use of a functional language in both approaches allows developers to describe cryptographic algorithms in a form that closely resembles their mathematical specification, making it significantly easier to gain confidence in their correctness.

The existence of these approaches demonstrates the feasibility of high-assurance implementations of cryptographic algorithms. These specific technologies may or may not be the most appropriate solutions for a standard, highly trustworthy voting system, but they show that such a system can be built.

Guaranteed Privacy

Secure voting protocols often depend on the secrecy of certain pieces of information to protect voter privacy. To prevent vote buying and voter intimidation, it must be infeasible to prove that a specific voter cast a particular vote.

Technology exists to provide this guarantee, as well. Work on cross domain solutions has resulted in a number of highly trustworthy systems. As an example, Galois’ Trusted Services Engine (TSE) includes a software component that enforces strong separation between different information domains. A formal proof of correctness, designed for evaluation at EAL6, guarantees that private information can never be accessed by unauthorized parties. General Dynamics also has a whole suite of products aimed at this market.

In the context of voting systems, a cross domain component could be used to protect the database used to generate ballots. This database could be inaccessible to any human user, while still allowing auditors, and even the general public, to view the data necessary to tally votes, and verify that any particular vote was recorded correctly.

Though these existing systems, as they stand, may not be  entirely appropriate for use in a voting system (none of them are available to the general public, for instance) they do show that it is feasible to gain a high degree of assurance in the separation of secret and public data.

Openness

Because of the importance of voting to an effective democracy, and the temptations to subvert the process, we believe that transparency is essential. Therefore, we advocate that all software source code, design documents, and assurance artifacts involved in a voting system be made freely available to the public (either in the public domain or under an OSI-approved open source license). In particular, any interested party should be able to obtain the source code and use it to build executable software identical to that stored in the voting systems in use.

Conclusion

We argue that a key component of a secure voting system is trustworthy software that is developed using the best known assurance techniques and completely open to public inspection. In addition, the essence of the voting process should remain software independent, in that a manual recount should always be possible. Higher software reliability should make the necessity of a hand recount less common, but software should never be designed in such a way that a hand recount is impossible, or even more difficult than it would be in a completely manual election.

Posted in Cryptography, Formal Methods, Open Source, Technology | Comments (6)

Tech Talk: Android G1: Experiences with an open mobile platform

January 19th, 2009 by donstewart

The second Galois Tech Talk, for 2009, was Isaac Potoczny-Jones (aka SyntaxNinja) on developing for the Android G1 phone. You can read the slides..

The Android G1 is a TMobile phone whose operating system, Android is based on Linux and was developed by Google. It’s a very open smart-phone platform that rivals the iPhone.

While I’m no expert in Android or mobile platform development, I will discuss my experiences in Android development and demonstrate the toolchain used to develop software for the Android. I’ll outline the basic features of the platform, with a focus on the factors that make its openness so powerful:

* the inter-process communications mechanism whereby applications can advertise the services they offer and other applications can take advantage of those services,

* The open-source Java, Eclipse, and Linux-based toolchain,

* the OpenIntents project.

This will be an informal demonstration and discussion.

A group of us in collaboration between the Android Password Safe project and the Openintents project have implemented a cryptography service and a keystore service which other Android applications can use to keep data and passwords safe, in a way that’s convenient for the end user.

Our system allows a single password, and periodic single sign-on so that all applications can encrypt, decrypt, and store keys using the same master password that the user enters once.

  • Date: Tuesday, January 20, 2008
  • Time: 10:30am – 11:30am
  • Location: Galois, Inc.
    421 SW 6th Ave. Suite 300
    (3rd floor of the Commonwealth Building)
    Portland, OR 97204

Galois has been holding weekly technical seminars for several years on topics from functional programming, formal methods, compiler and language design, to cryptography, and operating system construction, with talks by many figures from the programming language and formal methods communities. The talks are open and free.

Posted in Community, Open Source, Tech Talks, Technology | Comments (0)

©2000–2010 Galois, Inc. All rights reserved. Terms of use. Contact Us