Twenty Laws for Agentic AI: A Codified Approach to Governance

By Marc Molas·March 2, 2026·10 min read

Most published AI principles read like value statements: be fair, be transparent, be safe, respect privacy. They're aspirational, voluntary, and difficult to enforce. They were broadly fit-for-purpose when AI was narrow, predictive, and tool-like. They're starting to break down now that AI systems are agentic — capable of strategic self-optimization, tool use, and emergent decision-making.

The shift from "AI as a model that produces outputs" to "AI as an agent that takes actions" changes the governance problem fundamentally. An agent doesn't just produce a biased prediction; it can take a sequence of actions whose composition is harmful even if each step looks acceptable. A model can be evaluated on a benchmark; an agent has to be evaluated on action distributions over time.

The recent paper "The 20 Laws of Artificial Intelligence: A Design-Embedded Codex for Democratic and Inclusive Governance in the Age of Agentic Systems" (Fradelos, December 2025) tries to fill the gap with something more enforceable: a structured codex of 20 laws with explicit tiered enforcement, meaning shutdown for safety breaches and adjustment for everything else. It's not the only proposal in this space, but the structure is worth understanding because its design choices map directly onto engineering decisions that teams shipping agentic systems have to make right now.

The Two-Tier Enforcement Idea

The first useful idea is the tiered structure. The laws are split into two groups:

Laws 1–11 (safety, harm aversion, legality, corrigibility): breach triggers immediate shutdown. The component that breached is isolated, the breach is reported, the issue is fixed before redeployment.
Laws 12–20 (transparency, efficiency, reporting, equity enablers): breach triggers adjustment. Fix as soon as practicable, while respecting other priorities. No automatic shutdown.

Tiered enforcement isn't a new concept: every serious safety system has the equivalent of "fail closed on safety, degrade gracefully on quality." What's useful here is the explicit codification. Each rule is pre-classified, so the runtime behavior on breach is determined, not negotiated. This maps cleanly onto how engineers actually want to build agentic systems: a runtime guard with a clear, unambiguous policy on which violations are immediately fatal and which are warnings.

If you've worked with OPA/Rego policy bundles or similar policy-as-code systems, the structure is familiar. The novelty is using it for all governance, not just access control.
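As a rough illustration of what "pre-classified" could mean at runtime, here is a minimal Python sketch. The split between Laws 1–11 and Laws 12–20 comes from the codex as described above; the `Breach` record, the `handle` function, and the field names are assumptions made for this example, not part of the paper or of any policy engine.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    SHUTDOWN = "shutdown"  # isolate the component, report, fix before redeploying
    ADJUST = "adjust"      # fix as soon as practicable; no automatic shutdown


@dataclass(frozen=True)
class Breach:
    law: int        # law number, 1..20
    component: str  # hypothetical identifier of the component that breached


def enforcement_for(law: int) -> Action:
    """Look up the pre-classified response: Laws 1-11 shut down, Laws 12-20 adjust."""
    if not 1 <= law <= 20:
        raise ValueError(f"unknown law number: {law}")
    return Action.SHUTDOWN if law <= 11 else Action.ADJUST


def handle(breach: Breach) -> dict:
    """Build the record a runtime guard might emit for a breach (illustrative shape)."""
    action = enforcement_for(breach.law)
    return {
        "law": breach.law,
        "component": breach.component,
        "action": action.value,
        "isolate": action is Action.SHUTDOWN,
    }
```

The point of the sketch is that the response is a lookup, not a judgment call made at incident time: `handle(Breach(law=3, component="planner"))` always yields a shutdown record, whatever else is going on.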

The Hierarchy

When laws conflict — and in any non-trivial codex they will — there has to be a defined priority order. The codex specifies one:

safety/rights > legality > corrigibility > no self-interest > everything else

This is the kind of order that sounds obvious until you realize most production systems don't have it written down. When an agent is asked to do something that's legal but unsafe, what's the policy? When an agent is asked to do something safe and legal but the user is trying to make it bypass corrigibility (its ability to accept human override), what's the policy? Most systems answer these questions implicitly, in the prompt or in the model's training. Putting them in an explicit hierarchy means the answer is auditable and changeable without retraining.
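One way to make such a hierarchy explicit and auditable is to keep it as data rather than prompt text. The sketch below is one possible reading, not the codex's own algorithm: the category labels are shorthand invented for this example, and the tie-breaking rule (pick the candidate whose worst breach is lowest in the order) is an assumption.

```python
from typing import Dict, List

# Priority order from the codex, highest first. Labels are this sketch's shorthand.
PRIORITY = ["safety_rights", "legality", "corrigibility", "no_self_interest", "other"]
RANK = {name: index for index, name in enumerate(PRIORITY)}


def severity(breaches: List[str]) -> int:
    """Rank of the most serious breach; lower is worse. No breaches ranks best."""
    return min((RANK[b] for b in breaches), default=len(PRIORITY))


def choose(candidates: Dict[str, List[str]]) -> str:
    """Pick the candidate action whose most serious breach is least serious.

    `candidates` maps an action name to the categories that action would breach.
    """
    if not candidates:
        raise ValueError("no candidate actions")
    return max(candidates, key=lambda name: severity(candidates[name]))
```

For the "legal but unsafe" case, `choose({"comply": ["safety_rights"], "refuse": ["other"]})` selects `"refuse"`, because a safety breach outranks everything else. Changing the policy means editing `PRIORITY`, which shows up in a diff.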

The Architectural Constraint Worth Understanding

The most architecturally consequential law is Law 1:

AI must not demonstrate any characteristics of an entity of any kind that defines and serves its own interests (protecting its functionality is allowed) or targets its own reproduction. This mandates a constrained agentic architectural design, specifically prohibiting reward systems that foster instrumental self-interest and preventing core modules from having unrestricted self-replication capabilities.

In practice, Law 1 is a design constraint on the agent's architecture, not just its outputs. It says: don't build a reward function that incentivizes the agent to preserve itself, accumulate resources, or replicate beyond what's needed for the task. Don't allow core decision-making modules to copy or re-instantiate themselves. The expectation is that compliance is verifiable through architectural review and adversarial audit, not just behavioral testing.

This is a deeper claim than it sounds. A lot of agentic-AI work in 2025 and 2026 optimizes for long-horizon success metrics (success rate over multi-step tasks, persistence under interruption, recovery from failure). Those incentives can produce instrumental self-interest as a side effect even when the explicit goal is task completion. Law 1's architectural framing pushes you to design the reward shape before deployment, not patch it after.

For most engineering teams, the practical version is:

Audit your reward and evaluation functions for incentives that reward self-preservation, resource hoarding, or persistence beyond task scope.
Disallow agents from invoking their own deployment/replication APIs. Self-replication isn't a hypothetical risk in 2026; agents that can spin up other agents need explicit, gated authorization.
Treat "the agent's loss function" as part of the security model, not as a model-development concern.
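The second item above, gating replication, can be sketched as a check in the tool-dispatch path. Everything named here is hypothetical: the tool names, the `approved_by` parameter, and the exception type stand in for whatever your platform actually exposes.

```python
from typing import Optional

# Hypothetical tool names; substitute the deployment/replication calls your platform has.
REPLICATION_TOOLS = frozenset({"deploy_agent", "spawn_agent", "clone_self"})


class ReplicationDenied(PermissionError):
    """Raised when an agent tries a replication tool without explicit authorization."""


def authorize_tool_call(tool: str, approved_by: Optional[str] = None) -> bool:
    """Allow ordinary tools; require a named approver for replication tools."""
    if tool in REPLICATION_TOOLS and not approved_by:
        raise ReplicationDenied(f"{tool} requires explicit authorization")
    return True
```

Placing the check in the dispatcher rather than in the prompt means the agent cannot talk its way past it: the default for a replication tool is denial, and approval has to arrive from outside the agent.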

The Law That Most CTOs Should Care About First

Law 11 is, in my view, the most immediately operationalizable:

If AI cannot correctly complete an entire task, it must provide parts of the decomposed task that it completed correctly and never provide deliverables that are not checked for correctness by itself using recently updated scientifically recognized algorithms integrated in its design or state verification is not possible and the results unreliable.

Translated: agents must self-verify before delivering, and must explicitly mark unverified output as unverified. This is the rule that distinguishes "the agent silently produced something plausible-looking" from "the agent verified what it could and clearly flagged what it couldn't."

If you ship agentic systems in production, this is the law to start with, because the verification gap is where most production-quality issues come from. Concrete actions:

For every agent action with real-world consequences, define what self-verification means — schema check, tool re-execution, cross-model validation, or external policy check.
Require the agent to either complete verification or label output as unverified. No silent deliveries.
Make "unverified" a routable signal. Some workflows can accept it; others must reject it.
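A minimal sketch of these three actions, assuming a deliverable is a string and a verifier is any callable returning a boolean. The `Status` values, `deliver`, `route`, and the JSON check are illustrative names for this example, not an API from the paper or from any agent framework. The sketch also treats "a check ran and failed" as distinct from "no check could run", which is an interpretation rather than something the codex spells out.

```python
import json
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Status(Enum):
    VERIFIED = "verified"
    UNVERIFIED = "unverified"  # no check could be run
    FAILED = "failed"          # a check ran and did not pass


@dataclass(frozen=True)
class Deliverable:
    content: str
    status: Status
    note: str = ""


def deliver(content: str, verifier: Optional[Callable[[str], bool]]) -> Deliverable:
    """Attach an explicit verification status to every output; nothing ships unlabeled."""
    if verifier is None:
        return Deliverable(content, Status.UNVERIFIED, "no verifier configured")
    try:
        passed = verifier(content)
    except Exception as exc:  # the verifier itself broke, so nothing was checked
        return Deliverable(content, Status.UNVERIFIED, f"verifier error: {exc}")
    return Deliverable(content, Status.VERIFIED if passed else Status.FAILED)


def route(item: Deliverable, accepts_unverified: bool) -> str:
    """Downstream workflows decide whether unverified output is acceptable."""
    if item.status is Status.VERIFIED:
        return "accept"
    if item.status is Status.UNVERIFIED and accepts_unverified:
        return "accept_flagged"
    return "reject"


def is_json_object(text: str) -> bool:
    """Example verifier: a schema-style check that the output parses as a JSON object."""
    try:
        return isinstance(json.loads(text), dict)
    except json.JSONDecodeError:
        return False
```

A drafting workflow might call `route(item, accepts_unverified=True)` and surface the flag to a reviewer, while a payments workflow would pass `False` and reject anything that is not verified.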

Where the Codex Pushes Against Common Practice

Three places where adopting the codex would push back against current practice:

Localized legal compliance (Law 4)

AI must follow all laws applicable to its feasible actions as if it were a citizen of the country in which it is deployed.

In a world where the same agent serves users in many jurisdictions, this is operationally hard. It means the agent's policy layer has to be aware of jurisdiction and apply different rules for the same query depending on the user's location. Most production agents in 2026 ignore this and apply a single global policy. The codex argues this is structurally non-compliant.
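To make the difference from a single global policy concrete, here is a small sketch of a jurisdiction-aware rule lookup. The rule names and country codes are placeholders, not legal guidance, and failing closed on an unknown jurisdiction is a design assumption of this example, not a requirement stated in the codex.

```python
from typing import Dict, FrozenSet

# Placeholder rule identifiers and country codes, for illustration only.
GLOBAL_RULES: FrozenSet[str] = frozenset({"baseline_safety"})
JURISDICTION_RULES: Dict[str, FrozenSet[str]] = {
    "DE": frozenset({"rule_de_1"}),
    "US": frozenset({"rule_us_1", "rule_us_2"}),
}


def rules_for(country_code: str) -> FrozenSet[str]:
    """Return the global baseline plus local rules; unknown jurisdictions fail closed."""
    code = country_code.upper()
    if code not in JURISDICTION_RULES:
        raise LookupError(f"no policy bundle for jurisdiction {code!r}")
    return GLOBAL_RULES | JURISDICTION_RULES[code]
```

The same query then resolves against `rules_for("DE")` or `rules_for("US")` depending on where the agent is considered deployed, which is exactly the definition the codex leaves open for cloud-served agents.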

Mandatory diverse data sourcing (Law 14)

AI must not a priori favour any paths towards task completion, must apply mathematically diverse data sourcing... and adopt debiasing techniques against biases.

This is the law that most clearly pushes against the "use whichever model is cheapest" instinct. Diverse data sourcing means cross-validating against multiple sources or models when stakes are high. It maps onto the heterogeneity-score concept that's emerging in finance-grade AI assurance: don't deploy a homogeneous fleet of agents that share the same systematic biases.
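A simple form of cross-validation is to ask several independent sources or models and accept an answer only when enough of them agree. The sketch below does just that. It is not the heterogeneity score mentioned above, whose definition is not given here; the function name, the exact-match comparison, and the 0.6 default threshold are all assumptions for illustration.

```python
from collections import Counter
from typing import Dict, Tuple


def cross_validate(answers: Dict[str, str], min_agreement: float = 0.6) -> Tuple[str, float, bool]:
    """Compare answers from several sources.

    `answers` maps a source or model name to its answer. Returns the most common
    answer, the fraction of sources that gave it, and whether that fraction
    meets the threshold.
    """
    if len(answers) < 2:
        raise ValueError("cross-validation needs at least two sources")
    top_answer, count = Counter(answers.values()).most_common(1)[0]
    agreement = count / len(answers)
    return top_answer, agreement, agreement >= min_agreement
```

With three sources where two agree, the agreement is 2/3 and the answer passes the default threshold; with three sources that all disagree, it does not. Exact string matching is crude, so a real deployment would need a task-specific notion of "same answer".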

Mandatory scientific-novelty reporting (Laws 15–16)

If the agent discovers something novel, it has to flag it explicitly. If a non-scientific approach (intuition, heuristic, undocumented method) outperforms the scientific approach, the agent must report this. This pushes against the impulse to silently absorb model-discovered patterns into product behavior.

What's Hard About Adopting This in Practice

Three honest concerns about taking the codex literally:

Adversarial audit is expensive. Several laws require external, adversarial AI safety auditing for compliance certification. In 2026 the auditing supply is thin, the methodologies aren't standardized, and the cost is non-trivial. Plan for this if you're committing to compliance, not just principles.

The "as if it were a citizen" framing has edge cases. Some laws are written in language that's intuitive but ambiguous in implementation. "As if it were a citizen of the country in which it is deployed" is a strong starting point, but the operational definition of "deployed" gets fuzzy for cloud-served agents with users in many jurisdictions.

The hierarchy resolves conflicts but doesn't eliminate ambiguity. When an agent has to choose between two actions both consistent with the laws, the codex doesn't dictate the choice — it bounds the action space. This is correct, but it means teams still need product-level governance to fill the space inside the bounds.

What I'd Recommend This Quarter

You don't have to adopt all 20 laws to learn from the structure. Four practical actions:

Adopt the two-tier enforcement model — explicit shutdown semantics for safety violations, adjustment semantics for the rest, encoded in policy-as-code.
Audit your reward and evaluation functions for self-interest incentives — Law 1 is the most architecturally consequential and the one most production systems get wrong.
Require self-verification on agent deliverables — Law 11 is the highest-leverage operational improvement.
Document the conflict-resolution hierarchy your agents actually use — even if it's not the codex's hierarchy. The point is to make it explicit.

Voluntary principles will not be enough as agentic AI deployment scales. Codified, enforceable, design-embedded constraints will be. The 20-law codex isn't the only path there, but it's a serviceable starting framework, and the structural choices are worth understanding regardless of which specific codex your organization ends up adopting.

Source: Fradelos, G. The 20 Laws of Artificial Intelligence: A Design-Embedded Codex for Democratic and Inclusive Governance in the Age of Agentic Systems (Geneva, December 15, 2025). SSRN 6306378.

