Rigorous Digital Engineering (RDE) is arguably the most powerfully transformative methodology in engineering today. So why isn’t everyone using it?
Think of RDE as virtual cartography for complex systems. By meticulously defining the function and relationships of each component, as well as requirements, digital engineers create detailed models of a system at different levels of abstraction. These models aren’t just diagrams; they’re reusable, analyzable artifacts that enable rapid design exploration, cybersecurity optimization, and even automated code generation. What’s more, RDE uses formal methods to mathematically prove that a system behaves as specified.
The result? An extremely high level of assurance, a high degree of traceability between components and different levels of abstraction, and significantly faster, safer, cheaper critical system development. This is a methodology that can literally shave months or years (and millions of dollars) off system development timelines, while simultaneously dramatically increasing system reliability.
The catch? RDE often has a prohibitively high cost in terms of the level of expertise required. Scaling RDE to its full potential across the DOD and industry would require an army of specialized computer science PhDs that simply doesn’t exist.
With the rise of increasingly capable generative AI (GAI), the natural question emerges: “Can we use GAI to automate RDE?”
With the Generative Artificial Intelligence for Rigorous Digital Engineering (GAI4RDE) Project, Galois is exploring this core question and experimenting with using LLMs to automate RDE workflows and applications. A key challenge in making this dream a reality is the need for LLMs to learn a wide variety of domain-specific languages (DSLs) and tools to accomplish RDE workflows, combined with their limited (albeit still substantial) ability to master an extremely large and complex body of knowledge.
“An LLM’s context is finite, and you don’t want to overwhelm it,” explained Galois Research Engineer Alexander Grushin. “It’s probably not going to work very well to take a huge body of documents that covers every one of these RDE tools and techniques, give it to the LLM in one shot and just tell it: ‘Here’s what I want to build. Now, build it for me.’ The knowledge has to be broken down. We need a less monolithic approach and more something where we have specialized components that automate the different parts of RDE. Ultimately, this led us to look at a multi-agent system approach.”
Put differently: a single AI agent trying to tackle an RDE project all by itself will be overwhelmed, but a team of agents working together has a better chance at collectively handling the multi-stage, complex task.
While there is no universally accepted definition of what makes an “agent,” Grushin favors the one he first encountered as an undergrad at the University of Delaware: “A computer system that is situated in some environment, and that is capable of autonomous action in this environment in order to meet its design objectives” (Weiss, 1999).
“A key aspect is autonomy,” said Grushin. “An agent isn’t told: ‘In this situation, always do this,’ or ‘in that situation, always do that.’ Rather, it picks actions to satisfy higher level design goals.”
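To make that definition concrete, here is a minimal sketch of the idea in Python. The names and the toy thermostat are ours, purely for illustration; the point is simply that an agent maps observations of its environment to actions of its own choosing, in service of a design objective rather than a fixed script.

```python
from abc import ABC, abstractmethod
from typing import Any


class Agent(ABC):
    """A minimal agent: it perceives an environment and autonomously
    chooses actions in pursuit of a design objective (after Weiss, 1999)."""

    def __init__(self, objective: str):
        self.objective = objective

    @abstractmethod
    def act(self, observation: Any) -> Any:
        """Choose an action based on the current observation.
        No fixed 'in situation X, always do Y' table: the agent decides."""
        ...


class ThermostatAgent(Agent):
    """A toy example: keeps a room near a target temperature."""

    def act(self, observation: float) -> str:
        # The policy here is trivial, but the decision is the agent's own,
        # driven by its objective rather than an external script.
        target = 21.0
        if observation < target - 1:
            return "heat"
        if observation > target + 1:
            return "cool"
        return "idle"


if __name__ == "__main__":
    agent = ThermostatAgent(objective="keep the room near 21 °C")
    print(agent.act(18.5))  # -> "heat"
```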
In multi-agent systems, these autonomous entities interact with one another – often in surprising ways – to perform more complex tasks that would have been infeasible for individual agents. Back in grad school, long before the current GAI boom, Grushin focused his research on “swarm intelligence,” in which agents, inspired by social insects like ants or wasps, had limited capabilities individually, but collectively displayed complex, emergent behavior, allowing them to solve non-trivial problems.
“I built simulations where individual blocks moved around a virtual space, following simple reactive rules,” Grushin explained. “Alone, these blocks couldn’t do much. Together, they built elaborate structures without any centralized control. The key challenge was in figuring out how to design local rules that led to global order.”
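As a toy illustration of that “local rules, global order” idea (not a reconstruction of Grushin’s actual simulations), consider blocks wandering a one-dimensional strip: each follows the same simple reactive rule, settling only when it lands next to an already-settled block, and a contiguous structure grows with no central controller.

```python
import random

# Toy illustration of 'local rules, global order': blocks wander a 1-D strip
# and settle only when adjacent to a settled block. No block knows the global
# plan, yet a contiguous structure emerges around the seed.

WIDTH = 41
settled = {WIDTH // 2}          # a single seed block in the middle


def step(position: int) -> int:
    """Simple reactive rule: take one random step, staying on the strip."""
    return max(0, min(WIDTH - 1, position + random.choice((-1, 1))))


for _ in range(20):                              # release 20 wandering blocks
    pos = random.randrange(WIDTH)
    while not ({pos - 1, pos + 1} & settled):    # settle only next to structure
        pos = step(pos)
    settled.add(pos)

print("".join("#" if i in settled else "." for i in range(WIDTH)))
```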
After grad school, Grushin worked on agent-based simulations for NASA, modeling U.S. air traffic. In this system simulation, agents represented flights, airports, and controllers – each with its own individual logic, but together capturing real-world dynamics like cascading flight delays across the network.
These projects yielded fascinating results, but because AI was simply not as advanced as it is today, there was a hard limit to how much intelligence each individual agent could display and what they could accomplish, even as a multi-agent system. Now, LLM-powered agents are capable of behavior that was previously out of reach, and the potential applications for a well-designed, multi-agent, swarm-intelligence framework are many.
When Galois first began experimenting with using GAI for RDE in late 2024, the available emerging agent frameworks were too unstable and changing too frequently to be effective for the kind of work the Galois team wanted to do. Instead, they decided to build their own framework for the GAI4RDE project, which they dubbed “RDE Wingman.”
The RDE Wingman framework is fairly generic, though with a significant emphasis on RDE, software engineering, and formal methods. Galois Principal Scientist Joe Kiniry, Grushin, and the rest of the Galois team began by training RDE Wingman on the RDE methodology.
“Over the years, I’ve written a bunch of papers about RDE, and even an RDE course that’s about 2000 pages long,” said Kiniry. “We basically fed all that information to the model to train it – plus a bunch of example projects.”
By using LLMs to summarize this information, and then designing effective prompts based on the summaries, the team set up RDE Wingman to understand not only the big picture goals of Rigorous Digital Engineering, but how to do RDE: how to break down a problem into solvable steps and the system into coherent components with traceable connections, how to generate specifications and code, and how to maintain and evolve those specifications, models, code, and assurance artifacts. Because each step requires specialized knowledge, the team designed RDE Wingman as a multi-agent framework.
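In rough outline, that setup might look something like the sketch below, where `call_llm`, the helper names, and the specialties are illustrative placeholders rather than RDE Wingman’s actual implementation: a large corpus is summarized per specialty, and each summary seeds the system prompt of one specialized agent.

```python
# A simplified sketch of the 'summarize, then specialize' setup described
# above. `call_llm`, the helper names, and `rde_course_chapters` are
# illustrative placeholders, not RDE Wingman's actual implementation.

def call_llm(prompt: str) -> str:
    """Stub for an LLM call; swap in a real local model or hosted API here."""
    return prompt[:200]  # a real call would return generated text


def summarize_corpus(documents: list[str], specialty: str) -> str:
    """Condense a large body of reference material into a focused summary."""
    joined = "\n\n".join(documents)
    return call_llm(
        f"Summarize the following material, keeping only what a "
        f"{specialty} specialist needs:\n\n{joined}"
    )


def build_system_prompt(specialty: str, summary: str) -> str:
    """Turn a focused summary into the system prompt for one specialist agent."""
    return (
        f"You are the {specialty} agent in a Rigorous Digital Engineering "
        f"workflow. Work only within your specialty and report results back "
        f"to the coordinating RDE agent.\n\nReference notes:\n{summary}"
    )


# Each specialty gets its own compact prompt instead of one monolithic context.
rde_course_chapters = ["<chapter on requirements>", "<chapter on refinement>"]
specialist_prompts = {
    specialty: build_system_prompt(
        specialty, summarize_corpus(rde_course_chapters, specialty)
    )
    for specialty in ("requirements", "system design", "formal verification")
}
```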
“It’s a collection of agents that work together,” said Kiniry. “At its core is an RDE agent that understands Rigorous Digital Engineering, and then it farms out specialized tasks to other subagents.”
The requirements agent, for example, knows how to write and understand system requirements, while the design agent has focused knowledge of system design, and so on. Zooming in further, some of these assistant agents have assistants of their own, such as a shell agent that understands and writes command-line operations.
The whole system is thus set up in a hierarchical structure, with the LLM-powered core RDE agent autonomously delegating tasks to increasingly specialized layers of agents and sub-agents. This prevents the LLM from becoming overwhelmed while simultaneously allowing the system as a whole to accomplish the complex, multi-stage task that is Rigorous Digital Engineering.
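A rough sketch of that hierarchy, in illustrative Python rather than RDE Wingman’s actual code: a coordinating RDE agent routes each step of a plan to the specialist that handles it, and specialists can delegate further to sub-agents of their own.

```python
from dataclasses import dataclass, field


@dataclass
class SpecialistAgent:
    """A specialized agent (requirements, design, shell, ...) that may itself
    delegate to sub-agents. Illustrative only, not RDE Wingman's code."""
    name: str
    handles: set[str]                                    # task kinds it accepts
    subagents: list["SpecialistAgent"] = field(default_factory=list)

    def perform(self, task: str, kind: str) -> str:
        for sub in self.subagents:
            if kind in sub.handles:
                return sub.perform(task, kind)           # delegate further down
        return f"[{self.name}] completed: {task}"


@dataclass
class RDEAgent:
    """The coordinating agent: understands the overall RDE workflow and
    farms out each step to the appropriate specialist."""
    specialists: list[SpecialistAgent]

    def run(self, plan: list[tuple[str, str]]) -> list[str]:
        results = []
        for kind, task in plan:
            agent = next(s for s in self.specialists if kind in s.handles)
            results.append(agent.perform(task, kind))
        return results


shell = SpecialistAgent("shell", {"shell"})
design = SpecialistAgent("design", {"design", "shell"}, subagents=[shell])
requirements = SpecialistAgent("requirements", {"requirements"})

core = RDEAgent([requirements, design])
print(core.run([
    ("requirements", "derive requirements from the stakeholder brief"),
    ("design", "propose a component architecture"),
    ("shell", "run the model checker from the command line"),
]))
```

The deliberately simple routing above stands in for what an LLM-powered coordinator does with judgment rather than a lookup table, but the shape is the same: narrow contexts at the leaves, orchestration at the root.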
The team has already successfully used RDE Wingman on a number of experimental applications, including using it to automatically draw a system architecture diagram, translate English-language system requirements into a test bench, and more. If you give RDE Wingman a system specification, you can ask it to generate code, create a more or less detailed version of that same specification, or translate the specification from one language to another. You can also go the other way: give RDE Wingman a codebase, and ask it to generate a specification.
“Because RDE is built in, the agent has an understanding of abstraction layers and how they must relate to each other,” said Kiniry. “It understands how to check that a specification and implementation are in alignment, either because they should be equivalent, or because one should refine the other. It will also independently problem-solve to accomplish the goal you set for it – for example, searching online for a needed tool, and downloading it to your computer so that it can do the job.”
Best of all, this capability is accessed by simply telling RDE Wingman what you want it to do and setting it loose. The tool currently offers two interfaces. More advanced developers can use a command-line tool, while more general or mid-level users can work through a prompt-like interface that integrates into the developer experience, letting them run their RDE project by having a conversation with RDE Wingman.
“The front end includes a local LLM that has a whole personality and script behind it to walk you through the RDE process,” explained Kiniry. “Think of it as a GAI4RDE wizard.”
Through a chat-like back-and-forth now familiar to anybody who has been playing with ChatGPT or similar LLMs, RDE Wingman can gauge a user’s knowledge and goals, recommend a plan of action, and then put that plan into action on the backend.
The tool’s inherent malleability also makes it uniquely adapted to a rapidly changing technological environment. As AI agents continue to improve over time, RDE Wingman can integrate the latest developments. As new tools or DSLs are introduced to the RDE ecosystem, RDE Wingman can simply learn them, adding to its toolbelt. This reactive dynamism also makes RDE Wingman well-suited for integration into CI/CD/CV pipelines: once a system model has been built, the tool can be used to ensure it stays up-to-date as changes are made.
“Let’s say you update your AADL model,” said Kiniry. “You could just tell RDE Wingman: ‘Hey, I updated the model. Update the code for me.’ That’s literally the only prompt you need, and it will do it automatically for you. Or let’s say you have a codebase that already has one set of requirements. You can tell RDE Wingman: ‘Here’s two new requirements. Fix it.’ And it will go do it.”
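Wired into a pipeline, that workflow might look something like the following sketch. The `rde-wingman` command and its arguments are hypothetical placeholders for whatever interface the tool exposes; the only real piece is the git check for a changed model file.

```python
# Hypothetical CI glue for the workflow described above. The `rde-wingman`
# command name and its arguments are placeholders, not a documented interface.
import subprocess
import sys


def changed_files(base_ref: str = "origin/main") -> list[str]:
    """List files changed relative to the base branch."""
    out = subprocess.run(
        ["git", "diff", "--name-only", base_ref],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.split()


def main() -> int:
    if any(path.endswith(".aadl") for path in changed_files()):
        # The entire instruction is a natural-language prompt, as in the quote.
        prompt = "I updated the AADL model. Update the code for me."
        subprocess.run(["rde-wingman", "run", "--prompt", prompt], check=True)
    return 0


if __name__ == "__main__":
    sys.exit(main())
```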
This isn’t theory – it’s a use case that Galois Research Engineer Benoit Razet has successfully proven in a GAI4RDE experiment. In the experiment, when tasked with updating a system and generating code based on new requirements, RDE Wingman had to understand the new requirements, understand how they fit in with existing requirements, figure out if they were consistent, turn the requirements into features, and write test code to demonstrate that those features work.
RDE Wingman was able to do all of that successfully, but when it ran the tests, the tool realized that there was an inconsistency in the code. In a fascinating turn, the tool caught its own mistake, fixed it, and then propagated the change back up to the requirements – iterating until it accomplished its original task.
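In outline, the behavior described above resembles the fix-and-propagate loop sketched below. This is our paraphrase of the reported behavior, with placeholder callables standing in for the LLM-driven steps, not Galois’s implementation.

```python
# An outline of the iterate-until-consistent behavior described above.
# The callables are placeholders for LLM-driven steps; this is a paraphrase
# of the reported behavior, not Galois's implementation.
from typing import Callable


def reconcile(
    requirements: list[str],
    generate_code: Callable[[list[str]], str],
    run_tests: Callable[[str], list[str]],
    fix: Callable[[str, list[str]], tuple[str, list[str]]],
    max_iterations: int = 5,
) -> tuple[str, list[str]]:
    """Generate code from requirements, test it, and on failure let the agent
    repair the code and propagate any needed changes back to the requirements,
    iterating until the tests pass or the iteration budget runs out."""
    code = generate_code(requirements)
    for _ in range(max_iterations):
        failures = run_tests(code)
        if not failures:
            return code, requirements              # consistent: done
        code, requirements = fix(code, failures)   # repair, propagate upward
    raise RuntimeError("could not reconcile code and requirements in budget")
```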
While RDE Wingman’s results are already remarkable, it’s still AI, and subject to both the advantages and limitations that come with the technological territory.
“Sometimes it gets things wrong. Sometimes it literally lies to you,” said Kiniry. “It’s doing really cool stuff, and saving a ton of time, but you still have to have a human expert review the results.”
Precision makes a difference, too. For example, an instruction like “Create a pictorial representation of this pile of C code” is underspecified – there are many different ways to accomplish that task. Faced with that sort of request, RDE Wingman will think through its many options and then either present several of them and ask for clarification, or simply pick one itself and move forward. More precise instructions such as “Use Taphos to analyze this C++ code and give me a SysMLv1 model” or “I like agile programming and Python, but have types” yield more precise results.
While the results are, as yet, inconsistent and imperfect, RDE Wingman is still automatically generating multi-level models and code with extraordinary capability and speed. In other words, it’s not a reliable plug-and-play tool, but in the hands of an expert user like Kiniry it is already capable of dramatically speeding up an RDE workflow.
“In a single day, using RDE Wingman could save me two weeks’ worth of work,” said Kiniry. “It’s making RDE 10 to 100X faster.”
RDE Wingman also acts as a co-pilot for engineers who want to use and learn formal methods and RDE without spending weeks of professional development time taking the RDE course and conducting RDE “skunkworks” or self-educational projects. While using RDE Wingman, engineers learn how to do applied formal methods and model-based engineering by example, working hand in hand with an AI agent.
Even more significantly, engineers aren’t limited to working with a single RDE Wingman. As many RDE Wingmen can run simultaneously as desired, so long as LLM resources are available, working together as a kind of RDE Squadron or Swarm: what we call the H.O.A.R.D.E., the Hierarchical Orchestration of Autonomous Rigorous Digital Entities. Imagine if each engineer on your team had ten AI collaborators who knew formal methods and could work 24/7, helping them achieve their engineering goals. That’s where RDE Wingman is going next.
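What that might look like mechanically: a thin asynchronous layer that fans a set of engineering goals out across several Wingmen at once. The `run_wingman` helper below is hypothetical glue, not an actual RDE Wingman API.

```python
# A minimal sketch of running several RDE Wingmen concurrently, assuming each
# one can be driven through an async call. `run_wingman` is hypothetical glue,
# not an actual RDE Wingman API.
import asyncio


async def run_wingman(name: str, goal: str) -> str:
    """Stand-in for one Wingman working toward a goal (e.g., via an API call)."""
    await asyncio.sleep(0)          # placeholder for real, long-running work
    return f"{name}: finished '{goal}'"


async def run_squadron(goals: list[str]) -> list[str]:
    """Fan a set of engineering goals out across a 'squadron' of Wingmen."""
    tasks = [run_wingman(f"wingman-{i}", g) for i, g in enumerate(goals, 1)]
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    results = asyncio.run(run_squadron([
        "keep the SysML model in sync with the code",
        "draft a test bench from the new requirements",
        "check that the implementation refines the specification",
    ]))
    print("\n".join(results))
```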
The implications are significant. After all, the current version of RDE Wingman is just a prototype – the result of talented engineers experimenting on a relatively small budget and timeline. Fully realized, RDE Wingman could streamline the development and modification of virtually every imaginable cyber-physical, software, or hardware system, both new and legacy, from aircraft to computer chips to hospital IOT systems and beyond.