Generative AI (gAI) holds immense promise for the defense sector, enabling rapid data aggregation, synthesis, and analysis at scale, as well as enhanced decision support, predictive insights, and operational efficiency. Yet, we understand far too little about how these new technologies actually work. All too often, large language models (LLMs) are treated like magic boxes – prompts go in, results come out – with little comprehension of what is happening inside the model itself.
Despite its potential benefits, gAI is inherently unpredictable and carries serious risks, including data exposure, reverse engineering of inputs, reconstruction of sensitive information, embedded biases, and untrustworthy or incorrect outputs. There is an urgent need to demystify gAI and develop a deeper understanding of its inner workings in order to leverage its full potential and mitigate risk.
Galois is developing a mathematical evaluation framework for analyzing and characterizing exactly what is happening inside gAI models. By quantifying the latent geometry inside these models, we can understand what a model has learned and set guardrails to achieve predictable, consistent results. In addition, we are building a future-forward toolkit that can be used sustainably to evaluate and certify the safety of gAI tools, ensuring that tomorrow's technologies are designed to resist information leakage, manipulation, and jailbreaking.
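To make the idea of "latent geometry" concrete, consider a minimal sketch of the kind of measurement involved. This is an illustrative example, not Galois's actual framework: the model, prompts, and similarity metric are assumptions chosen for demonstration. It extracts hidden-state embeddings from a small open model and measures how the model positions different prompts relative to one another in its internal representation space.

```python
# Illustrative sketch only: probe a model's latent geometry by embedding a few
# prompts and comparing their positions in representation space.
# The model name ("gpt2"), prompts, and cosine-similarity metric are
# assumptions for demonstration, not the actual evaluation framework.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "gpt2"  # small open model, chosen only for illustration

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME, output_hidden_states=True)
model.eval()

prompts = [
    "Summarize the logistics report for the northern supply route.",
    "Summarize the logistics report for the southern supply route.",
    "Write a poem about the ocean at sunset.",
]

def embed(text: str) -> torch.Tensor:
    """Mean-pool the final-layer hidden states into a single vector."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # outputs.hidden_states[-1] has shape (1, seq_len, hidden_dim)
    return outputs.hidden_states[-1].mean(dim=1).squeeze(0)

vectors = torch.stack([embed(p) for p in prompts])

# Pairwise cosine similarity: a crude geometric fingerprint of how the model
# places these prompts relative to one another in its latent space.
normed = torch.nn.functional.normalize(vectors, dim=1)
print(normed @ normed.T)
```

Even this simple measurement hints at the larger program: if we can characterize the geometry of a model's internal representations, we can begin to reason about what it has learned, where its behavior is likely to be consistent, and where guardrails are needed.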
Our approach treats generative technologies as tools, not magic boxes. By embracing rigorous analysis, Galois is not only safeguarding these advances but also setting the stage for efficient, secure, and reliable AI deployment, steering technological progress along a trajectory that is beneficial, well understood, and above all, safe for high-stakes environments.