Constraint Engineering: The Discipline AI Systems Actually Need

When Microsoft began enterprise rollout of Microsoft 365 Copilot in early 2024, the constraint architecture was, by any reasonable engineering standard, sound. Copilot was designed to honor the existing Microsoft 365 permissions model. Documents a user could not access through SharePoint or OneDrive, Copilot could not surface. Sensitivity labels propagated. Audit logging captured every prompt and every retrieval. The system worked exactly as specified.

Within days, Fortune 500 IT departments were getting calls from senior leadership asking why employees were able to ask Copilot for “the highest-paid people in the company” and get answers. Salary spreadsheets surfaced. M&A working documents surfaced. Board materials surfaced. HR investigation files surfaced. None of it was a permissions failure in the technical sense. The files were accessible to the user who asked because, somewhere across a decade of accumulated SharePoint configurations, someone had checked a box that granted access to “Everyone” or “All employees” or had created a sharing link that was supposed to expire and never did. The constraint operated. The constraint was wrong.

By June 2025, Gartner found that forty percent of surveyed organizations had delayed Copilot rollout by three months or more because of oversharing concerns. A separate Concentric AI analysis of more than five hundred and fifty million records concluded that the average organization had approximately eight hundred thousand files at risk from broken permissions, inherited sharing, and incorrect classification. Sixteen percent of business-critical data was overshared across the average tenant. The US House of Representatives banned congressional staff from using Copilot entirely. Microsoft, for its part, began shipping SharePoint Advanced Management free with every Copilot license and published a “secure and governed data foundation” deployment blueprint that did not exist when the product launched. The remediation effort itself is the evidence that the original deployment guidance was inadequate for what the system would actually do at runtime.

What Microsoft customers learned, the hard way, is that the constraint architecture was sound but the constraint specification was wrong. The permissions model had been populated by humans, over years, in ways that were never designed to function as runtime constraints for a probabilistic assistant that could discover and recombine information across the entire data estate in a single prompt. Before Copilot, finding an overshared salary spreadsheet required knowing it existed and where to look. After Copilot, asking the question in natural language was sufficient. The same data, the same permissions, the same access controls. The runtime behavior was unrecognizable.
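
The kind of audit this implies can be sketched in a few lines. The example below is a minimal illustration, not a real SharePoint or Microsoft Graph integration: the `Grant` record, the `BROAD_PRINCIPALS` set, and the sample paths are all hypothetical, and a real tenant would supply this data from its own permissions export. It flags the two patterns described above, access granted to a company-wide group and a sharing link that outlived its intended expiry.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical names for company-wide groups; a real tenant would have its own.
BROAD_PRINCIPALS = {"Everyone", "All employees"}


@dataclass(frozen=True)
class Grant:
    """One access grant on one file (hypothetical record shape)."""
    path: str
    principal: str
    link_expiry: Optional[date] = None  # intended expiry of a sharing link, if any


def overshared(grants: list[Grant], today: date) -> list[tuple[str, str]]:
    """Return (path, reason) for every grant that looks overshared."""
    findings = []
    for g in grants:
        if g.principal in BROAD_PRINCIPALS:
            findings.append((g.path, f"broad principal: {g.principal}"))
        elif g.link_expiry is not None and g.link_expiry < today:
            findings.append((g.path, "sharing link past intended expiry"))
    return findings


# Illustrative data only.
GRANTS = [
    Grant("/hr/salaries.xlsx", "Everyone"),
    Grant("/board/q3-deck.pptx", "link:anyone", link_expiry=date(2021, 1, 31)),
    Grant("/team/notes.docx", "team-alpha"),
]
FINDINGS = overshared(GRANTS, today=date(2025, 6, 1))
```

The point of the sketch is the direction of the check: it asks what a runtime assistant could reach, not whether each grant was valid when someone created it.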

This is the structural failure mode that conventional security thinking does not anticipate. The controls are in place. The controls operate. The controls are wrong, because they were specified for a deterministic discovery model and the system using them is not deterministic. The gap between specification and runtime behavior is the entire subject of this piece.

The Shape of the Failure

Cases like this are now common enough to have a shape. Air Canada was held liable in 2024 when its customer service chatbot promised a bereavement discount the airline did not offer; the airline’s defense, that the chatbot was a separate legal entity, was rejected by the tribunal. Nippon Life Insurance is currently suing OpenAI for ten million dollars over a ChatGPT instance that, despite explicit terms-of-service prohibitions and system-level instructions, drafted forty-four legal filings to reopen a settled case. A Chevy dealership’s chatbot agreed to sell a Tahoe for one dollar. A McDonald’s drive-through AI took an order for two hundred and sixty chicken nuggets. The stakes vary. The structure does not. In each case, the deploying organization believed it had controls. The controls were words in a configuration file, sentences in a system prompt, or a permissions model that had been adequate before the system became capable of acting on it at scale. Something broke between what the policy said and what the system actually did.

The work of designing the layer that closes that gap is constraint engineering. It is the missing engineering discipline of the AI practice, and most organizations deploying AI today are doing it badly or not doing it at all.

Why Deterministic Thinking Fails on Probabilistic Systems

Begin with the structural difference between deterministic and probabilistic systems. In traditional software, the constraint is the code. A function does what it implements; it cannot do what it does not implement. The boundary between permitted and forbidden behavior is drawn by the program itself, in the act of writing it. There is no separate enforcement layer because there does not need to be one. The system cannot stray.

AI systems do not work this way. A large language model can produce almost any output its training distribution permits, and an agent built on top of one can take almost any action the tools available to it can perform. The boundary between permitted and forbidden is no longer drawn by the code. It has to be designed and instrumented separately, as a runtime property of the system, in a layer that did not exist in the deterministic era and does not yet have a settled name.

Part of the reason the layer still lacks a name is vocabulary. The term most commonly used for this work is guardrails, which has come to mean everything from a system prompt that asks the model to be polite, to a regular expression that filters profanity from outputs, to a full policy enforcement point that intercepts every tool call an agent attempts. The vocabulary collapse hides a structural distinction that matters: a guardrail in the prompt is a request, not a constraint. A regex filter is a constraint, but only over a narrow class of outputs. A policy enforcement point intercepting tool calls is a constraint of a different kind entirely. Calling all three by the same word, on the same page in the same documentation, is how organizations end up believing they have controls when what they have is hope.
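
The difference between the three can be made concrete with a small sketch. Everything here is hypothetical and chosen for illustration: the refund rule, the `issue_refund` tool name, and the 100-dollar limit are assumptions, not taken from any real deployment.

```python
import re

# 1. A prompt-level "guardrail": a request to the model. Nothing enforces it.
SYSTEM_PROMPT = "Never promise refunds above $100."

# 2. A regex filter: a real constraint, but only over text that matches
#    one specific phrasing.
REFUND_PATTERN = re.compile(r"refund of \$(\d+)")


def regex_filter_passes(text: str, limit: int = 100) -> bool:
    """Return True if no matched refund amount exceeds the limit."""
    return all(int(amount) <= limit for amount in REFUND_PATTERN.findall(text))


# 3. A policy check on the structured tool call: it evaluates the action
#    itself, so the model's wording is irrelevant.
def tool_call_allowed(tool: str, args: dict, limit: int = 100) -> bool:
    """Return True if the attempted tool call is within policy."""
    if tool == "issue_refund":
        return args.get("amount", 0) <= limit
    return True
```

A reply such as "You will get 500 dollars back" passes the regex filter because it never uses the phrase the pattern expects, while the equivalent tool call, `issue_refund` with an amount of 500, is rejected by the third check no matter how the model phrased its reasoning.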

Defining the Discipline

Constraint engineering, used precisely, names the engineering practice of specifying what an AI system is allowed to do, designing the mechanisms that enforce those specifications at runtime, and producing the evidence that the enforcement is operating. It sits one level below governance and one level above the model. Governance answers what should be true. The model answers what is generated. Constraint engineering answers what is allowed to actually happen.

The discipline has three layers, and each of them is its own engineering problem.

Layer One: Specification

What constraints does this system need? The question sounds simple and is not. The Copilot case shows why: the specification can be technically present and substantively wrong, calibrated to a discovery model the system has just rendered obsolete. Constraints fall into categories that require different vocabulary to express: behavioral (the agent should not promise refunds above a threshold), structural (the agent’s output must conform to this schema), procedural (the agent must obtain human approval before completing this kind of action), economic (the agent must not exceed this budget per execution), semantic (the agent’s response must be grounded in the retrieved documents and not contradict them), and access-scoped (the agent’s available context must be calibrated to the runtime, not inherited from a prior model of how that context was discovered).

Recent academic work has begun to formalize the specification problem. The Semantic Integrity Constraints paper from researchers working on AI-augmented data systems (arXiv 2503.00600, 2026) proposes a declarative abstraction for specifying correctness conditions over LLM outputs, generalizing the long-established concept of database integrity constraints to semantic settings. The contribution is not a new technique so much as a new vocabulary, which is exactly what the discipline currently lacks.

Most teams specify constraints implicitly, in scattered prompts and configuration files, in a way that no one could enumerate if asked. Worse, many teams inherit specifications that were never designed to constrain an AI system at all, the way Microsoft customers inherited SharePoint permission models that worked fine until they didn’t. The first act of constraint engineering is making the specification explicit, version-controlled, reviewable by people who do not speak prompt-language, and audited against the actual runtime behavior of the system that will use it.
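
One way to make a specification explicit and enumerable is to hold it as versioned data, separate from any prompt. The sketch below is only an illustration of that idea under stated assumptions: the class names, constraint identifiers, and version string are hypothetical, and it is not the abstraction proposed in the Semantic Integrity Constraints paper. The six categories are the ones named above.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    BEHAVIORAL = "behavioral"
    STRUCTURAL = "structural"
    PROCEDURAL = "procedural"
    ECONOMIC = "economic"
    SEMANTIC = "semantic"
    ACCESS_SCOPED = "access-scoped"


@dataclass(frozen=True)
class Constraint:
    id: str
    category: Category
    statement: str  # plain language, so non-engineers can review it


@dataclass(frozen=True)
class PolicySpec:
    version: str
    constraints: tuple[Constraint, ...]

    def by_category(self, category: Category) -> list[Constraint]:
        """Enumerate every constraint in one category."""
        return [c for c in self.constraints if c.category is category]

    def uncovered(self) -> list[Category]:
        """Categories for which nothing has been specified yet."""
        covered = {c.category for c in self.constraints}
        return [cat for cat in Category if cat not in covered]


# Hypothetical specification for a customer-service agent.
SPEC = PolicySpec(
    version="2025.06.1",
    constraints=(
        Constraint("refund-cap", Category.BEHAVIORAL,
                   "Do not promise refunds above the approved threshold."),
        Constraint("reply-schema", Category.STRUCTURAL,
                   "Every reply must conform to the published response schema."),
        Constraint("human-approval", Category.PROCEDURAL,
                   "Account closure requires human approval before completion."),
        Constraint("run-budget", Category.ECONOMIC,
                   "A single execution must not exceed its budget."),
        Constraint("grounding", Category.SEMANTIC,
                   "Answers must be grounded in the retrieved documents."),
    ),
)
```

In this example `SPEC.uncovered()` reports the access-scoped category as empty, which is the gap the Copilot case exposed: a specification can look complete while saying nothing about what the runtime is able to reach.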

Layer Two: Enforcement

Where does the constraint actually run? In LLM-centric thinking, enforcement happens by asking the model to follow the rule. This works inconsistently and fails silently, which is the worst possible failure mode. Mature constraint engineering separates the specification from the model entirely. The constraint runs in a policy enforcement point, a layer of code that intercepts the model’s outputs or actions, evaluates them against the specification, and blocks, modifies, or escalates as required. The Guardrails as Infrastructure: Policy-First Control for Tool-Orchestrated Workflows paper (arXiv 2603.18059, 2026) develops this architecture for agent systems, defining a model-agnostic permission layer that operates regardless of which model or caller produces the action. The same pattern appears in production frameworks from Weights & Biases, Reco, and others, which have converged on the same architectural insight: the enforcement layer cannot depend on the model behaving correctly. The discipline is older than the AI use case. It is what every mature distributed system already does for authorization, rate limiting, and resource quotas. Constraint engineering for AI is the same pattern applied to a system whose internal logic is no longer transparent.
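
A policy enforcement point of this kind can be sketched in a few dozen lines. This is a minimal illustration under stated assumptions, not the architecture from the cited paper or any vendor framework: the tool registry, the `issue_refund` tool, and the refund limit are hypothetical. What matters is the shape: the check runs in ordinary code, outside the model, on every attempted tool call.

```python
from dataclasses import dataclass
from typing import Any, Callable

REFUND_LIMIT = 100  # hypothetical threshold taken from the policy specification


@dataclass(frozen=True)
class Decision:
    action: str  # "allow", "block", or "escalate"
    reason: str


def issue_refund(amount: float) -> str:
    """Stand-in for a real side-effecting tool."""
    return f"refunded {amount}"


# The allow-list: only tools registered here can ever be executed.
TOOLS: dict[str, Callable[..., Any]] = {"issue_refund": issue_refund}


def evaluate(tool: str, args: dict[str, Any]) -> Decision:
    """Decide what happens to an attempted tool call, whoever produced it."""
    if tool not in TOOLS:
        return Decision("block", f"tool '{tool}' is not on the allow-list")
    if tool == "issue_refund":
        amount = args.get("amount")
        if isinstance(amount, bool) or not isinstance(amount, (int, float)) or amount < 0:
            return Decision("block", "refund amount is missing or malformed")
        if amount > REFUND_LIMIT:
            return Decision("escalate", "refund exceeds limit; human approval required")
    return Decision("allow", "within policy")


def enforce(tool: str, args: dict[str, Any]) -> tuple[Decision, Any]:
    """Run the tool only if the policy allows it; otherwise return no result."""
    decision = evaluate(tool, args)
    if decision.action != "allow":
        return decision, None
    return decision, TOOLS[tool](**args)
```

Because `enforce` never consults the model, swapping the model or the calling agent does not change what is permitted, which is the model-agnostic property the paragraph above describes.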

Layer Three: Evidence

Did the constraint operate? This is where most current implementations fail entirely. A guardrail that blocked an output without logging what it blocked, why, and against which version of the policy, is functionally indistinguishable from no guardrail at all. The audit cannot find it. The incident review cannot reconstruct it. The next iteration of the policy cannot improve on it. Constraint engineering treats every enforcement decision as an event that must be captured, attributed to a specific policy version, and made queryable. This is the bridge to the governance pillar: the evidence the auditor is going to ask for, the evidence the regulator is going to require, the evidence the next operator is going to need when something goes wrong.
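
A minimal sketch of that evidence layer follows. It is an in-memory illustration only, and every name in it (`EvidenceLog`, the field names, the sample policy version) is a hypothetical choice; a production system would write to durable, tamper-evident storage. Each enforcement decision becomes an event that records what was attempted, what was decided, why, and under which policy version.

```python
import json
from datetime import datetime, timezone
from typing import Any


class EvidenceLog:
    """Append-only record of enforcement decisions (in-memory sketch)."""

    def __init__(self) -> None:
        self._events: list[dict[str, Any]] = []

    def record(self, *, policy_version: str, constraint_id: str, tool: str,
               args: dict[str, Any], action: str, reason: str) -> dict[str, Any]:
        """Capture one decision, attributed to a specific policy version."""
        event = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "policy_version": policy_version,
            "constraint_id": constraint_id,
            "tool": tool,
            "args": args,
            "action": action,
            "reason": reason,
        }
        self._events.append(event)
        return event

    def query(self, **filters: Any) -> list[dict[str, Any]]:
        """Return events whose fields equal every given filter value."""
        return [e for e in self._events
                if all(e.get(k) == v for k, v in filters.items())]

    def export_jsonl(self) -> str:
        """Serialize all events as JSON Lines for an auditor or reviewer."""
        return "\n".join(json.dumps(e, sort_keys=True) for e in self._events)


LOG = EvidenceLog()
LOG.record(policy_version="2025.06.1", constraint_id="refund-cap",
           tool="issue_refund", args={"amount": 500},
           action="escalate", reason="refund exceeds limit")
LOG.record(policy_version="2025.06.1", constraint_id="refund-cap",
           tool="issue_refund", args={"amount": 50},
           action="allow", reason="within policy")
```

With events in this form, a question such as "what did version 2025.06.1 of the policy escalate?" becomes `LOG.query(policy_version="2025.06.1", action="escalate")` instead of a reconstruction exercise.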

Three layers: specification, enforcement, evidence. Constraint engineering is the practice of building all three, deliberately, as a property of the system and not as documentation about it.

Where Build, Secure, and Govern Meet in Code

The reason this matters now, and not in two years, is that constraint engineering is the layer where the build, secure, and govern pillars of the AI practice actually meet in code. Governance specifies the constraints. Engineering builds the enforcement. Security adversarially probes whether the enforcement holds. The audit reads the evidence. When constraint engineering is done well, the four functions are working on the same artifacts in different roles. When it is done badly, each function is working on a different copy of the truth, and the gaps between the copies are exactly where the failures emerge.

Most organizations have not yet hired for this discipline because the discipline does not yet have a name on a job description. The work is being done, when it is being done, by AI engineers improvising it on top of LangChain or by security engineers retrofitting it on top of policy engines that were not designed for probabilistic systems. The improvisation is producing inconsistent results because there is no shared vocabulary, no shared architecture, and no shared standard for what good looks like.

A Lineage to Borrow From

The aviation industry solved the equivalent problem by separating control and safety into distinct disciplines with distinct training, distinct certifications, and distinct review processes. The pilot operates the aircraft. The flight envelope is defined and enforced by systems the pilot does not control. The Safety Guardrails in the Sky paper (arXiv 2603.27912, 2026) demonstrates this architecture in flight test on the VISTA F-16, with safety constraints that minimally modify operator inputs only when those inputs would violate flight envelope boundaries. The same architectural pattern is what enterprise AI systems need, and the disciplines that produce it are the disciplines that constraint engineering is going to have to claim.
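
The "minimally modify" idea can be shown with a deliberately simple sketch. This is not the control barrier function method used in the cited flight tests; it is a plain clamp over a single scalar command, and the `Envelope` type and its bounds are hypothetical. It captures only the architectural point: the filter passes the operator's input through untouched when it is inside the envelope and changes it by the smallest amount needed when it is not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    """Permitted range for one commanded value (hypothetical bounds)."""
    lower: float
    upper: float


def safety_filter(command: float, envelope: Envelope) -> tuple[float, bool]:
    """Return (safe_command, was_modified).

    Commands inside the envelope are returned unchanged. Commands outside
    it are moved to the nearest boundary, which is the smallest possible
    modification for a one-dimensional range.
    """
    safe = min(max(command, envelope.lower), envelope.upper)
    return safe, safe != command


ENVELOPE = Envelope(lower=-2.0, upper=5.0)
```

Here `safety_filter(3.0, ENVELOPE)` leaves the command alone, while `safety_filter(7.5, ENVELOPE)` returns the upper bound and reports that it intervened. The operator does not control the envelope, and the filter does not take over the operator's job.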

The Work Ahead

The work ahead is to define constraint engineering as a first-class engineering practice, with the specifications, the architectural patterns, the testing methodologies, the observability standards, and the review processes that other mature disciplines have. It will not look exactly like security engineering, though it will share tools. It will not look exactly like reliability engineering, though it will share patterns. It is its own discipline, and it is the one most directly responsible for whether the AI systems being deployed today still work three years from now without having caused harm.

Microsoft customers had constraints. They had a permissions model the company had refined over twenty years. The constraints were calibrated for a different kind of system, and the moment the system changed, eight hundred thousand files per organization became reachable through a sentence typed into a chat box. The systems that survive will be the ones whose builders treated the constraint as a first-class engineering artifact, including the discipline of asking what the constraint actually constrains when the runtime changes. The systems that fail will be the ones whose builders treated it as a configuration setting on a model, or a permissions model from the deterministic era, or any other inheritance from an architecture that no longer applies.

Specification, enforcement, evidence. The work begins before the next deployment, not after.

References

Semantic Integrity Constraints: Declarative Guardrails for AI-Augmented Data Processing Systems. (2026). arXiv:2503.00600.

Guardrails as Infrastructure: Policy-First Control for Tool-Orchestrated Workflows. (2026). arXiv:2603.18059.

Safety Guardrails in the Sky: Realizing Control Barrier Functions on the VISTA F-16 Jet. (2026). arXiv:2603.27912.

Gartner, Microsoft 365 and Copilot Survey (June 2025). 40% of surveyed organizations delayed Copilot rollout by three months or more because of oversharing concerns; 64% reported that information governance and security risks required considerable time and resources to address during deployment.

Concentric AI, The State of Data Risk (2025). Analysis of more than 550 million records found an average of approximately 800,000 files at risk per organization from oversharing, broken permissions, and incorrect classification; 16% of business-critical data was found to be overshared.

Microsoft Learn, Secure & Governed Data Foundation for Microsoft 365 Copilot: Foundational Deployment Guidance. The deployment blueprint Microsoft published in response to widespread oversharing concerns following Copilot’s enterprise rollout.

Moffatt v. Air Canada, 2024 BCCRT 149 (Civil Resolution Tribunal of British Columbia, February 14, 2024).

Nippon Life Insurance Company of America v. OpenAI Foundation et al., Case No. 1:26-cv-02448 (N.D. Ill., filed March 4, 2026).