What Counts as AI? An Engineering-Useful Definition for 2026

By Marc Molas · February 23, 2026

The standard textbook definition of AI, "systems that perform tasks normally requiring human intelligence," was already showing its age a decade ago. In 2026 it's actively unhelpful. Almost any modern software performs tasks that "normally require human intelligence." Summarizing a table by hand, which is what a spreadsheet's pivot table does, normally requires human intelligence. So does answering the question behind a SQL query. Calling such software AI on those grounds alone is a category error; insisting it can never be AI is also a category error.

Engineers need a definition that drives decisions, not one that drives press releases. The proposal Georgios Fradelos lays out in Updated Definition of Artificial Intelligence (December 2025) is the cleanest engineering-useful framing I've seen in a while. It's worth working through, because adopting it changes how you scope projects, evaluate vendors, and budget infrastructure.

The Definition

Strip the references and it reads as follows:

Any piece of software that analyzes and/or generates data (even for its own training) and can routinely produce, with only exceptional human intervention:

Better results for analysis and/or synthesis tasks than the average human professional today, within the same professional culture, independent of energy cost; or

Comparable results for analysis and/or synthesis tasks, achieved faster and/or with measurably lower energy cost; or

Both.
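The three clauses can be read as a small decision procedure. The sketch below is one possible encoding, not part of the source definition: the names `Measurement` and `meets_definition`, the `routine` flag, and the 2% `quality_tolerance` used to decide what "comparable" means are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """Per-result averages for either the system or the human baseline."""
    quality: float        # higher is better, scored on the same rubric for both
    seconds: float        # wall-clock time per result
    energy_joules: float  # energy per result


def meets_definition(system: Measurement, human: Measurement,
                     routine: bool, quality_tolerance: float = 0.02) -> str:
    """Return which clause of the definition the system satisfies, if any.

    `routine` is True when the system runs with only exceptional human
    intervention. `quality_tolerance` is an assumed band for "comparable".
    """
    if not routine:
        return "none"
    band = human.quality * quality_tolerance
    better = system.quality > human.quality + band
    comparable = abs(system.quality - human.quality) <= band
    faster_or_cheaper = (system.seconds < human.seconds
                         or system.energy_joules < human.energy_joules)
    if better and faster_or_cheaper:
        return "both"
    if better:
        return "better results"  # clause 1 holds independent of energy cost
    if comparable and faster_or_cheaper:
        return "comparable, faster or cheaper"
    return "none"


# Placeholder figures, chosen only to exercise each branch.
human = Measurement(quality=0.80, seconds=3600, energy_joules=400_000)
print(meets_definition(Measurement(0.81, 30, 5_000), human, routine=True))
print(meets_definition(Measurement(0.90, 7200, 1_000_000), human, routine=True))
```

The useful property of writing it this way is that every input is something a team has to measure before the function can be called at all.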

Three things make this definition useful for engineering decisions.

It's outcome-bound, not architecture-bound. It doesn't care whether the system uses neural networks, symbolic AI, search, or hand-written rules. If the deployed system routinely outperforms a professional baseline (in quality, speed, or energy), it counts. If it doesn't, it doesn't, regardless of what's under the hood. This kills the "is X technique really AI?" debate and replaces it with a measurable test.

It's energy-aware. Energy cost is a first-class dimension of the definition, not an externality. A system that produces "comparable" results to a human professional but uses 100× the energy to do it is not an upgrade — it's a downgrade with extra steps. This forces honesty about whether AI deployment is actually a net improvement or a fashionable cost increase.

It's bounded by professional culture. The reference baseline is "the average human professional today, within the same professional culture." Not the median human. Not a 1990s baseline. Not an aspirational target. This anchors the comparison in something measurable and current, and it forces re-evaluation as the baseline moves.

Why This Helps a CTO

Three concrete decisions get easier under this framing.

Vendor evaluation

When a vendor says their product is "AI-powered," the question becomes specific: against which professional baseline does it outperform, by how much, and at what energy cost? If the vendor can't answer, you don't have an AI claim — you have a marketing claim.

In practice this conversation surfaces a lot. A non-trivial fraction of "AI" tools in 2026 deliver quality comparable to a junior professional's at a higher infrastructure cost than employing one. That's not an AI deployment; that's an AI experiment. Both are legitimate, but they should be budgeted differently.
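The three vendor questions can be kept as a simple record, so that an unanswered question is visible rather than implied. This is a minimal sketch; `VendorClaim`, its field names, and `assess_claim` are hypothetical names introduced here, not anything from the source paper.

```python
from dataclasses import dataclass, fields
from typing import List, Optional, Tuple


@dataclass
class VendorClaim:
    """Answers a vendor gives to the three questions; None means unanswered."""
    professional_baseline: Optional[str] = None  # against which baseline?
    measured_improvement: Optional[str] = None   # by how much?
    energy_per_output: Optional[str] = None      # at what energy cost?


def assess_claim(claim: VendorClaim) -> Tuple[str, List[str]]:
    """Label the claim and list the questions the vendor left unanswered."""
    missing = [f.name for f in fields(claim) if getattr(claim, f.name) is None]
    label = "AI claim" if not missing else "marketing claim"
    return label, missing


# Hypothetical vendor that answered only the first question.
label, missing = assess_claim(VendorClaim(professional_baseline="junior paralegal"))
print(label, missing)
```

An answered question is not necessarily a true one, so this only filters out claims that fail at the first step.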

Project scoping

The definition gives you a shipping criterion. The system is shippable when it routinely produces (a) better, (b) faster, or (c) cheaper-per-result outputs than the human baseline it's replacing or augmenting. Not "the demo is impressive." Not "leadership likes the screenshots." Routine, measurable, baseline-anchored.

This is the threshold most GenAI pilots quietly fail to hit. Ask whether a given pilot cleared it and the honest answer is usually: we never measured the baseline, so we can't say. Setting the baseline first, and re-measuring as the model and the baseline both move, removes that ambiguity.
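One way to make "routinely" testable is to score the system and the human baseline on the same set of tasks and require a minimum win rate. The sketch below assumes paired quality scores per task; the definition does not quantify "routinely", so the 90% threshold and the names `win_rate` and `is_shippable` are illustrative assumptions.

```python
from typing import Sequence


def win_rate(system_scores: Sequence[float],
             baseline_scores: Sequence[float]) -> float:
    """Fraction of paired tasks where the system matches or beats the baseline."""
    if not system_scores or len(system_scores) != len(baseline_scores):
        raise ValueError("need the same non-empty set of tasks for both sides")
    wins = sum(s >= b for s, b in zip(system_scores, baseline_scores))
    return wins / len(system_scores)


def is_shippable(system_scores: Sequence[float],
                 baseline_scores: Sequence[float],
                 min_win_rate: float = 0.9) -> bool:
    """True when the system clears the baseline on at least min_win_rate of tasks."""
    return win_rate(system_scores, baseline_scores) >= min_win_rate


# Placeholder scores for four tasks, not real measurements.
system = [0.90, 0.80, 0.70, 0.95]
baseline = [0.80, 0.80, 0.75, 0.90]
print(win_rate(system, baseline), is_shippable(system, baseline))
```

The same shape works for speed or cost per result by swapping the comparison; the point is that the baseline scores have to exist before the check can run.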

Energy and cost accounting

Most AI cost discussions today focus on token costs. That's necessary but not sufficient. The full energy and infrastructure cost — GPU time, cooling, data egress, query orchestration, eval infrastructure — is what matters for the energy comparison. A definition that puts energy on equal footing with output quality forces this accounting to actually happen.

Practical version: for any AI initiative, you should be able to answer "how much energy does the AI system consume per useful output, and how does that compare to the human baseline performing the same task." If you can't, you're not yet ready to ship to production — you're ready to instrument.
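As a rough sketch of that question in code: sum every energy component attributable to the system over a period, divide by the number of outputs that were actually useful, and compare with the human figure. The function names, the component keys, and every number below are placeholders for illustration; how to measure the human baseline's energy per output is left open here and is simply taken as an input.

```python
from typing import Mapping


def energy_per_useful_output(components_kwh: Mapping[str, float],
                             useful_outputs: int) -> float:
    """Total energy across all components divided by useful outputs (kWh each)."""
    if useful_outputs <= 0:
        raise ValueError("need at least one useful output")
    return sum(components_kwh.values()) / useful_outputs


def ratio_to_human(system_kwh_per_output: float,
                   human_kwh_per_output: float) -> float:
    """Below 1.0 means the system uses less energy per output than the human."""
    if human_kwh_per_output <= 0:
        raise ValueError("human baseline must be positive")
    return system_kwh_per_output / human_kwh_per_output


# Illustrative monthly figures only, not measurements.
monthly_kwh = {
    "gpu_time": 120.0,
    "cooling": 40.0,
    "data_egress": 5.0,
    "query_orchestration": 5.0,
    "eval_infrastructure": 30.0,
}
per_output = energy_per_useful_output(monthly_kwh, useful_outputs=8000)
ratio = ratio_to_human(per_output, human_kwh_per_output=0.05)
print(f"{per_output:.4f} kWh per useful output, {ratio:.2f}x the human baseline")
```

Dividing by useful outputs rather than all outputs matters: a system that discards a large share of its generations pays for them anyway.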

The "Exceptional Human Intervention" Clause

There's a clause buried in the definition that does a lot of work: "with only exceptional human intervention." The clarifying note specifies that "exceptional" means reorienting the software in mathematical situations that break its approximations.

In engineering terms: a constant stream of prompt engineering, RAG pipeline tuning, output filtering, and human-in-the-loop correction is not exceptional intervention. It's routine intervention. A system that requires routine human intervention to produce useful output is, by this definition, AI software under development, not deployed AI.

This matters because most "AI deployments" I see in 2026 are still in the routine-intervention regime. They produce outputs that need human curation to be reliable. That's a legitimate stage of development, but calling such systems "deployed AI" misrepresents the operational cost. Adopting this definition forces honesty about which systems have actually graduated.
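A simple way to tell the two regimes apart is to track how often a human has to touch an output before it can be used. The sketch below assumes that count is logged; the source gives no numeric cut-off for "exceptional", so the 1% threshold and the names `intervention_rate` and `deployment_stage` are assumptions for illustration.

```python
def intervention_rate(total_outputs: int, corrected_outputs: int) -> float:
    """Share of outputs that needed human correction before use."""
    if total_outputs <= 0:
        raise ValueError("total_outputs must be positive")
    if not 0 <= corrected_outputs <= total_outputs:
        raise ValueError("corrected_outputs must be between 0 and total_outputs")
    return corrected_outputs / total_outputs


def deployment_stage(rate: float, exceptional_threshold: float = 0.01) -> str:
    """Classify a system by how routine its human intervention is.

    The threshold is an assumed value, not one given by the definition.
    """
    if rate <= exceptional_threshold:
        return "deployed AI"
    return "AI software under development"


# Placeholder counts: 300 of 1000 outputs needed human curation.
print(deployment_stage(intervention_rate(1000, 300)))
```

Prompt rewrites and pipeline retuning would need their own counters; this only covers per-output correction.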

What This Definition Pushes You Away From

A few things get harder to claim once you adopt this framing:

"AI for AI's sake." If the system doesn't beat the human baseline on quality, speed, or energy, the deployment isn't producing value. That doesn't mean kill it — it might be a stepping stone toward a system that will. But it should be honestly labeled as R&D, not as production AI.

"AI as a feature checkbox." Putting an LLM-powered button in the product because competitors have one is fine as marketing. It is not, by this definition, an AI deployment, because it doesn't have a measured outperformance against a baseline. Don't budget it like one.

"Quantum-leaping" architecture claims. The definition is architecture-agnostic. A finely tuned classical algorithm that beats a neural baseline at lower energy cost is, by this definition, more AI than the neural baseline it replaced. This is a useful corrective against the assumption that bigger and more complex is always more AI.

What This Definition Lets You Defend

It also makes some unfashionable positions defensible.

A 200-line SQL query that consistently outperforms a junior analyst on a specific class of reports, runs in seconds, and costs cents to execute is, by this definition, AI, provided junior analysts are the professionals who would otherwise produce those reports. It analyzes data and produces better results than that baseline, faster and at lower energy cost. The fact that it's not a neural network is irrelevant.

This isn't a rhetorical flourish. It's a practical posture. The SQL query is doing the job. The expensive LLM-based alternative might be doing it worse and slower at higher cost. Adopting the definition lets you ship the SQL query and be honest that you've shipped AI — without the apologetic framing that "real" AI requires neural architectures.

What I'd Suggest Doing With This

Three concrete actions for the quarter:

For every system you currently call "AI," write down the human baseline. Quality, speed, and energy. If you can't, you don't yet know whether the system clears the bar.
Budget production AI separately from everything else. If a system doesn't routinely beat a baseline, it's R&D or marketing, and it belongs in that budget with matching expectations.
Add energy/cost-per-useful-output to your AI dashboards. If your AI roadmap has a budget but no per-output cost telemetry, you're flying blind on the part of the definition that actually constrains long-term viability.

The textbook definitions of AI exist for academic continuity. The engineering definition exists to make better deployment decisions. Worth adopting one of each.

Source: Fradelos, G. Updated Definition of Artificial Intelligence (Geneva, December 14, 2025). SSRN 6292000.

