What Are Formal Methods?

What are formal methods?

Formal Methods are a set of mathematically rigorous techniques used to specify, write, analyze, and verify software systems. They go beyond traditional testing by using logic-based reasoning to prove that a system behaves correctly under all possible conditions – no matter the inputs or states.

Think of formal methods as a mathematical safety net for critical systems.

At the heart of formal methods is a shift in how systems are represented. In traditional software development, system behavior is often described using diagrams or natural language – methods that can be imprecise or outdated. Formal methods replace this ambiguity with formal logic and rigorous proofs, ensuring that your system’s behavior and design are both correct and verifiable.

How do formal methods work?

Formal methods work much like mathematical proofs. To prove the Pythagorean theorem, we don’t measure every right triangle in existence – we use mathematical logic to show that the relationship holds for all right triangles, no matter their size. Similarly, formal methods use mathematical reasoning to prove that a software system will behave correctly under every possible scenario, not just the ones we test.

In Traditional Software Development, we rely on manual or automated testing to check whether a system meets its requirements. But just as we can’t prove the Pythagorean theorem by testing a few triangles, we can’t verify the correctness of a complex system by checking a handful of inputs. Testing only covers a limited subset of behaviors, leaving room for errors to escape detection.

By contrast, with Formal Methods we translate the description of a system’s intended behavior into a precise mathematical problem, allowing us to reason about every possible input and outcome using formal proofs. By shifting from testing specific cases like a software developer to proving general correctness like a mathematician, formal methods bridge the gap between real-world systems and mathematical certainty – delivering far stronger assurance of safety, security, and reliability.

Side-by-side comparison table of a traditional software development approach to formal methods and mathematical theorems.

examples of formal methods

Examples of formal methods in software engineering include:
‍

Using automated theorem proving to prove the functional correctness of hash table implementations
Creating formal mathematical models to verify implementations of side-channel resistant microchips
Using static analysis to ensure memory safety of a C program
Using SMT solvers to verify the correctness of a statistical classifier
Applying formal verification to neural network-based collision avoidance systems that ensure commercial aircraft maintain a safe distance
Verifying security of blockchain protocols and safety of smart contracts
Using Rust, which internally uses formal methods to prove memory safety for performance-critical code

Let’s say you are developing a nuclear reactor and a component in the system is supposed to monitor reactor temperature and automatically shut the system down if readings fall outside a safe range. If this function fails, it would result in catastrophic damage.

In traditional development, engineers would rely on manual or automated testing to check whether the temperature monitoring software behaves as expected. However, testing can only cover a finite number of cases / inputs. Testing might check a few values around the safe range, but it can’t account for every possible input the system might receive in deployment.

Formal methods offer a more robust alternative. By mathematically modeling the system and its expected behaviors, we can prove that the system will respond correctly to any possible temperature input – even edge cases engineers didn’t think to test.

For example, the method shown below is a snippet of software that checks some input temperature and reacts if it is out of bounds.
‍

Traditional Development Approach Example:‍

bool is_valid_temperature_range(int temperature) {
	return (temperature < 2200);
}

A traditional testing approach would throw values at this function within the range, outside of the range, and at the boundary of the range. In comparison, an example formal methods approach would be to annotate this function with a logical claim that specifies how it should behave – as shown below.

Formal Methods Approach Example:

/* 
    behavior temperatureInRange:
        assumes (temperature < 2200);
        ensures \result == true;

    behavior temperatureOutOfRange:
        assumes (2200 <= temperature);
        ensures \result == false;
*/

bool is_valid_temperature_range(int temperature) {
    return (temperature < 2200);
}

Formal methods tooling then creates a math model out of the contract (lines 2-7 of Listing 2) and the entire software program. It then tries to prove that:
‍

For every value inside of this range (line 3), the method returns true (line 4)
For every value outside this range (line 6), the method returns false (line 7)

If a tool can prove this contract, it provides assurance that the method is correct for all temperature values.

principles of formal methods

Applying these principles effectively across a system or engineering process can be intricate, and involves tradeoffs. To realize their full value, formal methods must be applied with strategic intent – targeting the components that matter most, using tools and techniques that fit the development context, and balancing cost and assurance. These principles are not intended as requirements that must be met, but rather as guidance that can help guide the adoption of formal methods across various stages of system development:

principle 1: Apply with precision

Applied Formal Methods (AFM) techniques and tools exist on a spectrum, as illustrated in the figure below.
‍

A colored spectrum bar showing various formal methods, moving left to right from easy to challenging. From left to right, the list includes: Controlled Natural Language Specifications, Code-Level Contracts, Property-Based Testing, Modeling & Analysis, and Theorem Provers.

On one end of the spectrum are simple applied concepts that are understandable by and available to engineers. They are akin to dipping one’s toes into the AFM world. One example of this is inserting a few code-level contracts into the software. One can perform a little bit of AFM to a part of their system and get some value out of it.
‍
On the other end lies pure formal methods in all their complex glory. You will find that researchers usually live here. They provide precise mathematical claims backed by rigorous proofs using tools such as theorem provers. While this is a worthy goal, it is simply not tenable to apply across an entire system: (1) it is not always scalable, (2) the tools are not at industrial usability standards, (3) engineers in industry are often not mathematicians, and (4) not all systems can be proven to be correct with today’s technology due to complexities of interactions with the environment.

Not only does which formal method you choose make a difference, so does where you choose to apply them. Applying formal methods to an entire system is rarely necessary, or even practical. Instead, it’s important to choose the right tool or technique and apply it to the specific part of a system where it will make the biggest impact.

For example, a commercial aircraft is a mixed-criticality complex system. Within the aircraft as a whole, the entertainment system is low criticality (if it fails, it will be inconvenient), the toilet flusher control is medium criticality (if it fails, it will be a more significant problem that may result in reputational or financial costs), and the flight controls are high criticality (if they fail, the plane crashes and everyone dies). In this example, the Airline may choose to formally verify the flight control system, but choose a lower assurance verification technique for the entertainment and toilet flusher system.

Graphic of an airplane showing low, medium, and high criticality components within a system, and pointing out that formal methods should be applied to just high criticality components

Think of your system as a network of mixed-criticality components. Choose where formal assurance is most needed, then apply just enough formal methods to reduce risk without slowing progress. Use the following framework to guide your strategy:

1. Identify critical components, whether for safety, security, or business impact
‍
2. Select the right level of formal method based on the risk profile. For low-to-medium criticality, easier but slightly lower assurance techniques may suffice. For highly-critical applications, consider full formal verification.
‍
3. Start small and focused. Choose one area to apply formal methods. Train a small group of engineers and give them a clearly defined, attainable goal.

4. Work incrementally. Many formal methods live on a spectrum, from lightweight annotation to full theorem proving. Even minimal use can deliver meaningful results.

There are often two axes of technical approaches when you're trying to increase the assurance of a system: (a) find as many bugs as possible, and then eliminate those bugs, or (b) prove the absence of all bugs (within a well specified class). Techniques that fall into category (a) are generally lightweight approaches that can be extremely useful in raising the assurance of a system (e.g., symbolic execution, fuzzing). Techniques that fall into category (b) tend to be more heavyweight approaches that can be applied to components of very high criticality.

principle 2: specification before implementation

Crafting clear, unambiguous specifications is very useful in formal methods. During system development, engineers typically begin by writing a specification: a description of the system’s design, features, requirements, and intended behavior which serves as the blueprint of the system. Yet these specifications vary widely – from formal documents to napkin sketches – and are rarely precise, consistent, or agreed on by all users of a system. As a result, the implemented system may not match the specification. This can lead to cost or schedule overruns due to misunderstandings of what should be implemented. It also can lead to subtle security vulnerabilities, safety concerns, or performance issues.

Rewriting or translating a specification using a controlled natural language or formal specification language removes ambiguity, enables verification of the specification itself (e.g., checking consistency and completeness), and lays the groundwork for mathematically verifying that an implementation conforms to its intended design. Even if no further formal methods are used, refining your specification process alone can produce enormous benefits: Reduced defects, clearer communication, and fewer integration surprises.

What if you do not have any specifications? If you have existing subsystems and legacy code and no specifications, there are some lightweight formal methods that can help check existing implementations, e.g., equivalence testing, binary analysis, some static analyses. However, for many tools and formal approaches, some form of understandable behavior specification is critical, even if that is developed after the fact, e.g., for legacy systems.

principle 3: validation and verification

Testing and formal proofs are complimentary approaches, not alternatives.
‍
Testing helps validate that a system behaves as expected under a finite set of test inputs. Verification goes further: it proves, mathematically, that the implementation always satisfies its specification under all possible inputs and execution paths.
‍
Some formal verification techniques can be woven into existing development pipelines. Static analyzers and contract checkers, for example, can validate software correctness without requiring wholesale changes to engineering practices. Tools such as Frama-C for C, PMD for Java, Logika for Scala, and SPARK Ada for Ada can help teams get started quickly with manageable effort.

principle 4: abstraction and modeling

Formal methods depend on well-chosen abstractions. The right model captures essential system behaviors while omitting irrelevant details – allowing meaningful analysis early and often in the development lifecycle.
‍

Domain modeling identifies important system-environment relationships
System architecture modeling serves as a blueprint that defines a system’s design principles, including its structure, components, interfaces, and behavior. It is a refinement of the domain model to a level at which many analyses and tests can be performed very early in the design process, including architectural modeling, behavioral analysis, timing and schedulability of threads, safety analysis, code-generation, and property-based testing.
Hardware modeling can detect logic and timing bugs that might otherwise go unnoticed until late in development.

When combined with strong tool support (e.g., AADL, Simulink, or Mathworks tools), these models enable scalable assurance across system layers.

principle 5: refinement

Systems evolve from abstract ideas into detailed implementations through a series of refinements. At each stage, formal methods can ensure that each step preserves correctness.
‍
Take an example of Railways: a broad rule like “only one train on a track at a time” can be refined into concrete designs involving interlocks, signals, and movement logic – each verified to preserve the core safety requirement.
‍
Refinement frameworks make it possible to mathematically link high-level requirements to low-level implementations – reducing the risk of gaps.

Benefits of formal methods

These aren’t just theoretical techniques and tools – Formal Methods deliver practical, high-impact value – including cost reduction, risk mitigation, and better system outcomes across domains.

In High Assurance Software Development, Formal Methods help engineering teams deeply understand, verify, and improve the systems they build:
‍

Saving time and money by replacing arduous manual testing with mathematical reasoning
Achieving near-perfect confidence in critical properties of your system
Ensuring that system designs and implementations stay perfectly synchronized as the system evolves
Going beyond “whack-a-mole” security measures, achieving security by default

In Cryptography, Formal Methods offer mathematical guarantees in a field where correctness is non-negotiable:
‍

Proving the absence of entire classes of vulnerabilities (e.g., side-channel attacks, logic flaws)
Demonstrating compliance with security standards and specifications
Ensuring implementation correctness
Preserving trust in systems where a single flaw could undermine security at a global scale.

In Model Based Systems Engineering, Formal Methods provide engineers with repeatable and thorough tools to analyze system models at varying levels of detail:
‍

Link high-level requirements to implementations, supporting traceability and easing compliance
Enabling experimentation, testing, and quick iteration in a virtual environment for better quality and results
Reducing total lifecycle cost, timeline, and risk via correct-by-construction design techniques
Verifying and validating that cyber-physical systems and software that runs them are correct and secure

In Artificial Intelligence and Machine Learning (AI/ML), a deep expertise in formal methods gives a clear-eyed understanding of the limits of formal verification and high-assurance techniques for complex, neural network-based architectures. That understanding shapes a more practical, impactful approach:
‍

Using AI techniques to automate and accelerate formal verification and model-based systems engineering processes
Developing novel algorithmic approaches to make AI systems that are more interpretable, auditable, and aligned with human intent than the current AI SoTA
Advising on the amenability of particular AI architectures to formal analysis

Frequently asked questions

Do I have to mathematically verify the whole system?
‍
No, not at all. In fact, it would be untenable with large-scale systems. When applying formal methods, one should consider the scope of where to apply it and what tools are best suited for the application. For example, in an aircraft, the flight control system is a safety-critical software component that must meet very stringent standards such as DO-178C. This component may be a good target for formal verification approaches.

How do I know where in the system formal methods could be applied?

‍To start, find components that have high safety or security criticality, are challenging to understand, or are software-intensive. AFM approaches are very good at reasoning about complex logic. Likewise, there are many available tools for software-specific reasoning; it is easy to incrementally apply a little bit of formal methods into software analysis.

How do I know where in system development formal methods could be applied?

‍This question very much depends on your system development process, the teams that work at each level of design, and their willingness to participate in formal verification activities. There are varying tools, disciplines, and techniques that can be applied at different levels, so finding the right team who is willing to try will be your best bet.

Do I have to change my existing engineering techniques to support the inclusionof formal methods?

You can wrap formal methods around your existing techniques to increase assurance, and you do not have to rip out existing processes. If you use formal methods, there are likely things in your process that you won’t need to do any longer. There are also likely things that can be improved to enable formal methods. Since formal method techniques live on a spectrum, it is easy to identify places within your existing approaches where a little bit of formal methods could be applied. Even doing this in small places will result in value added.

‍Do you have to be highly trained in order to learn and apply AFM?

‍No. Over the last decades, much research has focused on automating mathematical approaches with the goal of making these tools more usable by any engineer. However, there still exists some tools and techniques that are very difficult to use and understand, so be careful when you choose what approach you want to incorporate into your workflow. For example, theorem provers like Coq or Isabelle are challenging to learn and use, but code-level contract tooling such as Frama-C are much easier for a programmer to learn and use.

Can you do a little bit of AFM and still get value out of it?

‍Yes, we recommend a simple approach such as static analysis for software analysis or architectural modeling and analysis for early stage development. These techniques are relatively easy to learn and use while having great benefit during system development.

Can testing and AFM supplement each other?

‍Very much so. In fact, there are some software verification tools that aid testing by automatically generating test cases for you. Like we described earlier, formal verification and validation address two different problems, both of which are valuable in system engineering.

references

[1] Alloy Analyzer. Alloy tools. Website, Accessed Feb 2025. https://alloytools.org/.

[2] Dines Bjørner and Cliff B. Jones. "The Vienna Development Method: The Meta-Language," volume 61 of LNCS. Springer, May 1978.

[3] Paul Black, Mark Badger, Barbara Guttman, and Elizabeth Fong. "Dramatically reducing software vulnerabilities: report to the white house office of science and technology policy." Technical report, National Institute of Standards and Technology, 2016.

[4] B. Bonvoisin. "25 years of formal methods at RATP." In International Railway Safety Council (IRSC), 2017. Accessed: March 31, 2025.

[5] Bernard Carré and Jonathan Garnsworthy. "Spark—an annotated ada subset for safety-critical programming." In Proceedings of the Conference on TRI-ADA’90, pages 392–402, 1990.

[6] ClearSy. "Atelier b: A toolset for the b method, 2021." Accessed: 2025-03-15.

[7] Dessault. Catia embedded code generator. Website, Accessed Feb 2025. https://docs.nomagic.com/display/MD2022x/CATIA+Embedded+Code+Generator.

[8] Jeremy Doerr, Jeremy Doerr, Conrad Bock, and Raphael Barbau. "Verifying executability of SysML behavior models using Alloy analyzer." US Department of Commerce, National Institute of Standards and Technology, 2021.

[9] Jonathan Draper and Helen Treharne. "The refinement of embedded software with the b-method." In Proceedings of the BCS-FACS Northern Formal Methods Workshop. BCS Learning & Development, 1996.

[10] Peter H. Feiler, Jorgen Hansson, Dionisio de Niz, and Lutz Wrage. "System Architecture Virtual Integration: An Industrial Case Study." Technical Report. CMU/SEI-2009-TR-017 ESC-TR-2009-017, Software Engineering Institute, 2009.

[11] Galois, Inc. "The BESSPIN Lando System Specification Sublanguage." GitHub repository, May 2019. Available at: https://github.com/GaloisInc/BESSPIN-Lando.

[12] Andy Greenberg. "Inside a five-year battle between chinese hackers and sophos firewall defenders." Wired, 2025. Accessed: March 31, 2025.

[13] John Hatcliff, Jason Belt, Robby, and Todd Carpenter. "HAMR: an AADL multi-platform code generation toolset." In Tiziana Margaria and Bernhard Steffen, editors, Leveraging Applications of Formal Methods, Verification and Validation - 10th International Symposium on Leveraging Applications of Formal Methods, ISoLA 2021, Rhodes, Greece, October 17-29, 2021, Proceedings, volume 13036 of Lecture Notes in Computer Science, pages 274–295. Springer, 2021.

[14] John Hatcliff, Jason Belt, Robby, Jacob Legg, Danielle Stewart, and Todd Carpenter. "Automated property-based testing from aadl component contracts." In International Conference on Formal Methods for Industrial Critical Systems, pages 131–150. Springer, 2023.

[15] John Hatcliff et al. "Slang: The sireum programming language." In International Symposium on Leveraging Applications of Formal Methods, pages 253–273. Springer, 2021.

[16] John Hatcliff, Danielle Stewart, Jason Belt, and August Schwerdfeger. "An aadl contract language supporting integrated model-and code-level verification." ACM SIGAda Ada Letters, 42(2):45–54, 2023.

[17] SAE International. Architecture Analysis & Design Language (AADL) Rev. C, January 2017. Published at https://www.sae.org/standards/content/as5506c/.

[18] Florent Kirchner, Nikolai Kosmatov, Virgile Prevosto, Julien Signoles, and Boris Yakobowski. "Frama-c: A software analysis perspective." Formal aspects of computing, 27(3):573–609, 2015.

[19] Daniel Kroening and Michael Tautschnig. "Cbmc - c bounded model checker." In Proceedings of the 20th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS ’14, pages 389–391. Springer, 2014.

[20] Tobias Kuhn. "A survey and classification of controlled natural languages." Computational linguistics, 40(1):121–170, 2014.

[21] Waterloo Intelligent Systems Engineering (WISE) Lab. Clafer Model Wiki. Website, 2023. http://t3-necsis.cs.uwaterloo.ca:8091/#learning-clafer.

[22] Gary T. Leavens and Peter Müller. "Tutorial on jml, the java modeling language." In Proceedings of the 31st International Conference on Software Engineering, ICSE ’06, pages 704–705. ACM, 2006.

[23] George D. Lemp and Steven Jenkins. "Ambiguities in natural language requirements." Technical report, NASA, 2013. Accessed: March 31, 2025.

[24] CAMET Library. Fastar compositional schedulability. Website, Accessed Feb 2025. https://tools.galois.com/camet/overview/camet-tools.

[25] CAMET Library. Sliced behavioral analysis. Website, Accessed Feb 2025. https://tools.galois.com/camet/overview/camet-tools.

[26] David Lin, Ted Hong, Yanjing Li, S Eswaran, Sharad Kumar, Farzan Fallah, Nagib Hakim, Donald S Gardner, and Subhasish Mitra. "Effective post-silicon validation of system-on-chips using quick error detection." IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 33(10):1573–1590, 2014.

[27] Logika. Sireum logika: automated and interactive verification framework for slang. Website, 2025. https://logika.sireum.org/.

[28] MathWorks. Implement behaviors for architecture model simulation. Accessed: April 3, 2025.

[29] Kevin Moran, Mario Linares-Vásquez, Carlos Bernal-Cárdenas, and Denys Poshyvanyk. "On the utility of marrying gin and pmd for improving stack overflow code snippets." In Proceedings of the 44th International Conference on Software Engineering: Companion Proceedings, ICSE ’22, pages 189–190. ACM, 2022.

[30] NASA. "Simulink code generation: Tutorial for generating c code from simulink models using simulink coder." Technical report, NASA Technical Reports Server, 2012.

[31] No Magic, a Dassault Systèmes company. Cameo Systems Modeler. No Magic, Inc., Allen, Texas. Accessed: April 2, 2025.

[32] Object Management Group. OMG Systems Modeling Language (OMG SysML), 2019.

[33] S. Owre, J. M. Rushby, and N. Shankar. Pvs: A prototype verification system. https://pvs.csl.sri.com, 1992. Accessed: 2025-03-15.

[34] Nicolás Pérez, Santiago Matalonga, Xabier Larrucea, and Rory V. O’Connor. "The persistence of ambiguities in software requirements specification documents." Journal of Systems and Software, 165:110567, 2020.

[35] PMD. Pmd: An extensible cross-language static code analyzer. Tool website, 2024. https://pmd.github.io/.

[36] Eshan Singh, David Lin, Clark Barrett, and Subhasish Mitra. "Logic bug detection and localization using symbolic quick error detection." IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2018.

[37] Danielle Stewart, Jing (Janet) Liu, Darren Cofer, Mats Heimdahl, Michael W. Whalen, and Michael Peterson. "AADL-based safety analysis using formal methods applied to aircraft digital systems." Reliability Engineering & System Safety, 213:107649, 2021.

[38] Masoud Taram, Soheil Khodayari, Nael Abu-Ghazaleh, Ronny Krashinsky, and Kevin Skadron. "When a patch is not enough - hardfails: Software-exploitable hardware bugs." arXiv, abs/1812.00197, 2018. Accessed: March 31, 2025.

[39] Atelier B Team. "Formal methods in railway signaling: The B method at RATP." In RSSRail 2017 - Reliability, Safety, and Security of Railway Systems, 2017. Accessed: March 31, 2025.

[40] The MathWorks, Inc. Simulink. The MathWorks, Inc., Natick, Massachusetts. Accessed: April 2, 2025.

[41] Alvaro Veizaga, Mauricio Alferez, Damiano Torre, Mehrdad Sabetzadeh, and Lionel Briand. "On systematically building a controlled natural language for functional requirements." Empirical Software Engineering, 26(4):79, 2021.

[42] Donalt T. Ward and Steven B. Helton. "Estimating Return on Investment for SAVI (a Model-Based Virtual Integration Process." In SAE International Journal of Aerospace, November 2011.

[43] Clifford Wolf. Yosys open synthesis suite. 2016.

[44] Wen Xu, Man Yue, and Kangjie Lu. "Analyzing the security gaps in patched android systems." arXiv, abs/1905.09352, 2019. Accessed: March 31, 2025.

[45] Dockins, R., Foltzer, A., Hendrix, J., Huffman, B., McNamee, D., & Tomb, A. (2016). "Constructing semantic models of programs with the Software Analysis Workbench." In Verified Software: Theories, Tools, and Experiments (VSTTE) (Vol. 9971, pp. 134–151). Springer. https://doi.org/10.1007/978-3-319-48869-1_9

a foundation for high-assurance solutions

formal methods