Specifications Don't Exist

This article is a companion piece to What Works (and Doesn't) Selling Formal Methods. It began life as part of a talk I gave in late 2024.

Some things are difficult, and some things are merely expensive. 

Imagine this: aliens are coming to kill us all. Humanity sends its best negotiators, and the aliens announce that they will call off their horrible assault only if we release a formally verified version of GCC, every damn line of code… in ONE YEAR. 

After some understandable confusion, the heads of the G7 summon Xavier Leroy, hand him an enormous check, and tell him to get on with it. Pretty soon, warehouses of mathematicians around the world are proving Rocq theorems day and night. The coordination challenges are enormous, but they get it done! The aliens go home, Xavier wins a Fields medal, EGOT, and all five Nobel prizes,1 and everyone else goes back to worrying about AI. 

The point I’m making is that we basically know how to build a formally verified compiler—we already did it. The same is true for formally verified cryptographic libraries, parsers, and microkernels. The reason that GCC hasn’t been formally verified is that to do so would be absurdly expensive. It wouldn’t be beneficial enough to justify the cost. But if aliens do show up, I bet we could do it for less than the cost of MCU Phase 3 ($2.2–2.4 billion). 

Okay different story: aliens are coming to kill us all and they demand a formally verified web browser. Xavier sticks his hand up (they brought him this time) and asks, er… what is it exactly they want? Like, what formally is a web browser, anyway? This question confuses and infuriates the aliens! Surely we have a formal specification for Chrome, didn’t we build it? And, well, that’s it for humanity. 

The problem is that we don’t have a formal specification for Chrome, or a word processor, or even something simpler like the PDF document format. Outside a few domains, specifications don’t exist—I don’t just mean that no-one wrote them yet, but rather we have reason to believe that no precise and coherent specification can be written. If the second type of alien shows up, we’ll have to quickly invent a lot of brand-new computer science, and I don’t know if even MCU money could get it done. 

Formal Specifications Should be Formal and Specific   

Why do we want formal specifications in the first place (aliens aside)? 

A formal specification says what we want in a precise enough way to serve as a summary or explanation for a system. Rather than examining the system, we can examine the specification instead. This generally doesn’t dictate every single thing about the system (it’s a map, not the territory). For example, the Rust borrow checker guarantees a simple specification, “this program will not generate memory errors.” If you give me a safe Rust program, it might do a lot of things, but it won’t do that

We need a formal specification if we want to formally verify anything. Otherwise, what are we verifying? Actually doing the verification might mean a warehouse of math grads writing Rocq theorems, or it might mean running the Rust borrow checker. But without a specification, we can’t get started. 

Formal methods are old, and people have tried specifying a lot of things. But as a rule of thumb, the most useful formal specifications often have the following characteristics: 

  • Mathematically clean: we can write the specification in some relatively simple collection of mathematical notions. 
  • Easy to reason about: we can use the specification as a way of understanding the system, either by human intuition or formal analysis.  
  • Encapsulated: there’s a clear boundary between the ‘inside’ of the system and the ‘outside’, and the specification describes what happens at this boundary. 
  • Broadly agreed upon: the designers and users of the system agree it should in fact match the specification—the map is a good description of the territory. 
  • Stable under change: when the system evolves, the specification doesn’t change too much—the map abstracts away from small differences. 

Some systems seem to fit these requirements very naturally—for example, compilers, cryptographic libraries, parsers, and microkernels. We might say these systems are naturally formalizable.2 If that list looks familiar, you’re right: these are exactly the success stories for formal verification!

I have a hypothesis: once you have a good-quality formal specification, verifying a system is not difficult, merely expensive. And there’s a lot of unsqueezed juice here for formal verification fans. Compilers, microkernels, and the rest are security-critical components, but almost no such system we use today is formally verified. Call the G7, it’s time, let's verify GCC! *end article, close laptop* 

Wait, no, there’s more. Some people look at CompCert and SeL4 and ask, “why can’t we formally verify my system?” Well, I think naturally specifiable systems are a tiny and very unrepresentative niche. Out in the wild, most systems are very un-specifiable, and that’s a big problem if we want to formally verify a lot more of the world. 

Actually, Too Many Specifications Exist   

We might not have a formal specification, but developers write informal specifications all the time. A system might have many types of specifications which carry important information: 

  • Prose documentation - internal-facing design documents, external-facing user manuals and guides, and also white papers, RFCs, and similar. 
  • Slide decks - astoundingly common, especially in the US Government. 
  • The system itself - especially for a legacy system, the system may ‘do what it does’ and any change is by definition against specification. 
  • User stories - for a user-facing application like a web browser, the set of user stories might be the real top-level specification, but it's typically very hard to express these formally.
  • Unit and integration tests - often the closest thing to a formal specification, but restricted to a single input or scenario. 
  • Many many other things, including reference implementations, inline comments, tacit / practical know-how in the team, customer preferences, regulatory requirements, scribbled notes on coffee-shop napkins… 

In fact, there are too many partial specifications, with incomplete coverage, none of which match each other. When I’m scoping projects for Galois clients, I’ve often had this kind of conversation: 

              Me: “Do you have a specification for [your giant system]” 

              Client: “Yes, here’s a two-slide powerpoint deck”

                   ~ and/or ~

              Client: “Yes, here’s a 7000-page requirements document in semi-structured prose” 

All of which leads to this kind of conversation, when trying to formally verify the system in question:

              Me: “We found the system does [something] but your specification implies [something else].”  

              Client: “Oh, huh, that doesn’t matter”

                    ~ or ~

              Client: “Yeah, we changed that 6 months ago” 

After a person has this conversation about 40,000 times, I find they consider retraining as a circus clown or just throwing themselves in the river. 

You can also try sitting down with clients and writing formal specifications: 

              Me: “Do you want the specification to allow [some behavior]?”  

              Client: “Er… I don’t know, I haven’t thought about that situation”

              Me: “If you allow it, you’ll also have to permit [some other behavior]. Does that matter?”  

              Client: “I don’t know about that situation either… how long is this going to take?”

I’ve come to think writing formal specifications is just a very difficult task. It requires a top-down view of the system that designers and engineers typically don’t have or, more importantly, need. In contrast, informal specifications can be ambiguous, partial, flexible. Informal specifications are intended as communication mechanisms between humans, and as a result they can be ‘wrong but useful’, and elide aspects of the system that are not of interest. This is a strength, but it also results in systems that can’t be easily formalized. 

I think systems are typically both designed from the top and grown incrementally. Most systems have some degree of top-down structure, but few systems have a mathematically coherent specification that covers every behavior. The effect is that most systems obey some formal specification for some core functionality, but if outside this core, we rapidly enter muddy territory where it is unclear what the system should do, or whether the designer should even care.3

Le PDF n’a pas eu lieu4  

Let’s dig into a particular example, the Portable Document Format, PDF. In many ways, this should be an easy case for specification. After all, a PDF isn’t some esoteric thing - you can download one and open it on any one of a billion devices. PDF also has a well-maintained standard developed by the PDF Association, a huge body of examples, and many many implementations, ranging from hobbyist projects to security-critical systems. 

But even so, PDF does not have a consensus about what implementations should do, nor does it have a specification that matches the behavior of existing implementations, nor a clear definition of a bad or insecure document. As a result, it seems impossible today to write a precise and coherent formal specification for PDF. 

In a large PDF dataset, we will have some examples that are known-bad—ill-formed in a way designed to elicit some vulnerability—and some that are known-good, in that they conform tightly to the standard. But we also have a large number (the majority in fact) where all that we can do is shrug. They might be good or bad, we just don’t know. 

We called this diagram "the potato of doom"

This seems like an absurd situation, what’s going on? One reason for this is that PDF readers have an incentive to parse as many documents as possible. If you receive a PDF and it doesn’t open, you’re likely to blame the PDF reader, and not the program that created the document. To avoid this, every PDF reader tries to fix up mistakes in non-standard documents, and all of them do it slightly differently. The same non-standard document might get interpreted two different ways by different PDF readers. 

I feel like Jacques Derrida writing this, but the result is that PDF does not exist, at least in the sense of a discrete category that can be formalized. Instead, there’s a fuzzy boundary where something might or might not be considered a PDF. That wouldn’t be so bad, except that a large proportion of actual PDFs exist in this liminal region! 

On the DARPA SafeDocs project, Galois spent several years working on formalizing PDF. We worked closely with the PDF Association using a new language we built for the purpose, DaeDaLus. I’m proud of the work we did, but at the end of the project, our formal PDF specification was non-descriptive (it differs from real PDF readers), non-normative (it doesn’t characterize what we think should be correct behavior), and it’s unclear how to get to a more rigorous and accepted specification.

You can read about Galois’s work on PDF here, and you can read about DaeDaLus, the language we built to help describe document formats, here

Conclusion: Do What I Mean    

xkcd 568. license: CC BY-NC 2.5

Okay here’s another story: the current era of formal verification has been dominated by the cost of proof. Specification has taken second place—we can’t even verify systems with simple specs, so why worry about everything else? Now, thanks to advances in modern AI, we may soon live in a strange world where proofs are cheap and abundant. If that happens, I think we will quickly verify every compiler and microkernel, then find that we’re stuck. Even Claude can’t tell us what to want. 

Formal verification today is very useful, but for most systems it’s very difficult to write the kinds of complete specifications that verification needs. However, other kinds of specifications are popular: for example, a test case is a kind of limited, partial specification. The key is that a test case is immediately useful, and doesn’t impose undue costs on the development team. We need to find ways to specify systems that have these virtues, and avoid the trap of imposing a complete and coherent view that fundamentally does not exist. 

It’s a formal verification cliché that writing the specification tends to uncover most of the bugs in a system. To me, this suggests an analogy between specification and programming—both are tools for expressing what we want. In one way, this is a pessimistic thought: no tool can remove the burden of clarifying our ideas. But also, it gives me some hope. Programming is very difficult, but through careful tool design, we’ve made it available to hundreds of millions of people. With luck and skill, perhaps we can do the same for specifications. 

We’re just getting started. There’s a lot to do. And we need to keep our eye out for those aliens. 

1  Not the fake one for Economics 

2  I don’t want to undersell the work it took to invent the specification technologies we have now. The systems that are naturally specifiable are only so with the specification tools of today, which represent half a century of concerted work 

3  Ah-ha, you say, what we should do is insist that engineers build every system starting with a rigorous formal specification. In most cases, I’m skeptical this would pass the cost/benefit test

4  Avec toutes mes excuses à Jean Baudrillard