Standards Brief

ISO/IEC 42001 in practice: an AI management system that does more than pass an audit

ISO/IEC 42001 gives AI governance a real management-system backbone. But a certificate proves you have a system — not that the system stops a bad action when it happens. Here is how to make the standard operational rather than ornamental.

A management system, not a one-time assessment

ISO/IEC 42001, published in 2023, is the first international management-system standard for artificial intelligence. It defines an "AI management system" (AIMS): the set of policies, roles, processes, and controls an organization uses to govern AI on an ongoing basis. It is certifiable — though a clarification matters: ISO itself does not certify organizations. Independent, accredited certification bodies do, the same way they certify an ISO/IEC 27001 information-security management system. (A companion standard, ISO/IEC 42006:2025, sets the requirements for the bodies that audit and certify an AIMS against 42001.)

That word — management system — is the point, and it is easy to skim past. A management-system standard is not a checklist you complete once and file away. It follows the harmonized management-system structure often associated with Annex SL — shared across modern ISO standards — and it runs on a Plan-Do-Check-Act (PDCA) cycle of continual improvement. You establish context and leadership commitment, set objectives, assess risk and impact, define operations and controls, then monitor, review, and improve — on a loop, indefinitely. The standard carries a set of Annex A controls, and it sits alongside companion guidance that fills in the detail — ISO/IEC 23894:2023 for AI risk management and ISO/IEC 42005:2025 for AI system impact assessment.

This is a meaningful step up from a point-in-time audit or a self-attestation. A one-time assessment tells you the state of things on the day someone looked. A management system institutionalizes the looking: it names who is accountable, requires that risks be reassessed as systems and uses change, and forces gaps surfaced in the "Check" phase to be closed in the "Act" phase. For AI specifically — where models drift, data shifts, and the behavior of a system in production rarely matches the behavior described in the design document — that ongoing discipline matters more than it does for a static IT control.

It is worth being precise about how 42001 relates to the other framework most teams have on their desk. The NIST AI Risk Management Framework is a voluntary risk framework. It is influential, well-constructed, and not certifiable. ISO/IEC 42001 is a certifiable management-system standard. The two are complementary, not competing: NIST AI RMF gives you a vocabulary and structure for reasoning about AI risk; 42001 gives you a certifiable program for running governance as an ongoing operation. Many organizations will sensibly use both. Neither, on its own, reaches down to the moment an AI system actually takes an action.

The honest failure mode: a binder that passes

Here is the risk no certificate will warn you about. A management system can be built well enough to pass an audit while the AI it governs still acts without real controls.

This is not a hypothetical born of cynicism, and it is not a knock on auditors. Certification evaluates whether you have a system: documented policies, assigned roles, a risk-assessment procedure, and evidence that you reviewed and improved. A good auditor can and will review sampled evidence that operational and technical controls, runtime controls included, are defined, implemented, and working. What certification does not provide is exhaustive assurance that every future AI action will be stopped at the moment of violation.

The gap is between the management layer and the runtime. Your policy may say "high-risk decisions require human review." Your binder may document that policy, name its owner, and show meeting minutes where it was reviewed. The audit passes. And then, in production, an agent issues a refund, sends a communication, modifies a record, or approves a transaction — and nothing in the actual path of that action checked the policy. The control lived in a document. The action lived in a system. They never met.

When that happens, the certificate is not wrong. It accurately reports that you have a management system. It simply does not — and was never designed to — prove that the system governs behavior at the point of action. That is the difference between a system that is ornamental and one that is operational.

What a certificate does — and doesn't — tell a buyer

A few practical cautions follow directly from how certification works.

Scope is everything. A 42001 certificate applies to a defined AIMS scope — the systems, processes, and business units the organization chose to include. It does not automatically cover every AI system the company runs, or every unit. "We're 42001 certified" and "the AI system you're about to rely on is in scope" are different claims; ask which one is true.

Verify the certificate, not just the logo. The AI-certification market is young. A certificate is only as good as the body that issued it, so confirm the certification body's accreditation and read the certificate's stated scope before you treat it as assurance.

Certification is not a legal safe harbor. Under the EU AI Act, high-risk AI carries its own obligations — risk management, a quality management system, logging, technical documentation, and conformity processes. A 42001 program can produce evidence that helps meet some of these, but a certificate is not, by itself, a defense or a substitute for the Act's specific requirements. Treat 42001 as supporting evidence, not as compliance.

Making PDCA fire at the point of action

The fix is not to abandon the standard. The standard is worth having; the structure is sound. The fix is to wire its loop into the place where AI actually acts, so that the controls 42001 asks you to define are controls that fire, and the evidence its "Check" phase consumes is real rather than reconstructed.

Concretely, that means two things.

First, the controls a 42001 program defines should be enforced at runtime, not just described in policy. When a policy says an action requires review, the system should be unable to take that action without the review. The decision to allow, block, or escalate is made at the point of action, against bounded authority, not after the fact. A control that cannot stop the thing it governs is a description, not a control.
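To make the idea concrete, here is a minimal sketch of a runtime gate in Python. It is an illustration under stated assumptions, not KAiM Helm's implementation or anything ISO/IEC 42001 prescribes: the names Authority, decide, and execute, and the two monetary limits, are hypothetical and chosen only to show a deterministic allow, block, or escalate decision made before an action runs.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    ESCALATE = "escalate"


@dataclass(frozen=True)
class Authority:
    """Hypothetical bounded authority granted to one agent."""
    allowed_actions: frozenset  # action names the agent may attempt at all
    auto_limit: float           # amounts up to this need no human review
    hard_limit: float           # amounts above this are never permitted


def decide(authority: Authority, action: str, amount: float) -> Decision:
    """Deterministic gate: the same inputs always yield the same decision."""
    if action not in authority.allowed_actions:
        return Decision.BLOCK
    if amount > authority.hard_limit:
        return Decision.BLOCK
    if amount > authority.auto_limit:
        return Decision.ESCALATE  # route to a human reviewer
    return Decision.ALLOW


def execute(authority: Authority, action: str, amount: float, perform) -> Decision:
    """Call `perform` only when the gate allows; otherwise nothing runs."""
    decision = decide(authority, action, amount)
    if decision is Decision.ALLOW:
        perform()
    return decision
```

The design point is that the action is reachable only through execute, so a policy such as "refunds above a threshold require review" sits in the path of the action rather than in a document beside it.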

Second, "Check" should be fed by evidence captured as decisions happen. The strongest input to a monitoring and review process is not a sample of logs assembled before the audit; it is a continuous stream of signed records showing what was allowed, what was blocked, what was escalated, and why. That turns the PDCA loop from a periodic paperwork exercise into a feedback system: real decision records reveal real gaps, and "Act" closes them against evidence instead of opinion.
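One way to picture such a stream is the sketch below, which signs each decision record with an HMAC and then tallies verified outcomes for review. The assumptions are ours, not the standard's or any product's: the record fields, the function names, and the choice of a shared-key HMAC (a real deployment might prefer asymmetric signatures and managed keys) are all hypothetical.

```python
import hashlib
import hmac
import json
from collections import Counter


def _canonical(record: dict) -> bytes:
    # Stable serialization so the same record always signs to the same bytes.
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_record(key: bytes, record: dict) -> dict:
    """Attach an HMAC-SHA256 signature to a decision record."""
    signature = hmac.new(key, _canonical(record), hashlib.sha256).hexdigest()
    return {"record": record, "signature": signature}


def verify_record(key: bytes, signed: dict) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = hmac.new(key, _canonical(signed["record"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signed["signature"])


def summarize(key: bytes, signed_records: list) -> dict:
    """Count verified decisions by outcome; failed signatures are counted apart."""
    counts = Counter()
    for signed in signed_records:
        if verify_record(key, signed):
            counts[signed["record"]["decision"]] += 1
        else:
            counts["invalid_signature"] += 1
    return dict(counts)
```

A review meeting can then work from something like summarize(key, records), where a record altered after the fact shows up as an invalid signature instead of silently changing the tally.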

This is where KAiM Helm fits, and we will be candid about what is built versus sequenced. KAiM Helm is the enforcement and evidence layer: deterministic control at the point of action (allow, block, or escalate), operating within bounded authority and producing signed records of every decision. In 42001 terms, that gives an AIMS something real to monitor and improve. The "Do" of your management system gains an enforcement surface; the "Check" gains trustworthy evidence; the "Act" gains specifics to act on. KAiM Helm does not certify you, and it does not replace the program work the standard requires: context, leadership, risk assessment, objectives. It makes the program's controls consequential. KAiM is at the design-partner pilot and controlled-demonstration stage. We are not claiming production deployments or audit outcomes we have not earned, and where the integration between an enforcement layer and a formal AIMS is not yet built, we describe it as sequenced.

Our position is straightforward. Certifications are worth pursuing. A 42001 program brings discipline that most AI efforts badly need. But the goal is controls that fire, not a certificate on the wall. 42001 plus enforced controls beats 42001 alone — and the difference shows up precisely on the day something tries to go wrong.

In practice, the useful question is not "Are we certified?" but "Which AI systems are in scope, which controls are enforced in production, and what evidence proves those controls worked when decisions were made?"

Key takeaways

ISO/IEC 42001 is a certifiable management-system standard for AI: an AIMS built on the harmonized management-system structure (often associated with Annex SL) and a PDCA continual-improvement cycle. It institutionalizes roles, policy, risk and impact assessment, operations, monitoring, and improvement as an ongoing program, not a one-time check. (ISO doesn't certify; accredited bodies do, and ISO/IEC 42006 sets the requirements those bodies must meet.)
It is complementary to, not interchangeable with, NIST AI RMF. NIST AI RMF is a voluntary, non-certifiable risk framework; 42001 is a certifiable management system. Use both.
Certification proves you have a system; it does not provide exhaustive assurance that every runtime action is stopped at the moment of violation. Auditors can review sampled control evidence, but the gap between documented controls and enforced ones is where to focus.
A certificate has a scope and is not a legal safe harbor. Confirm which AI systems are in scope and the certification body's accreditation; under the EU AI Act, 42001 can supply supporting evidence but does not by itself satisfy high-risk obligations.
Make 42001 operational by wiring its loop to the point of action: enforce controls where AI acts, and feed "Check" with signed decision records captured as decisions happen, so "Act" closes real gaps.
KAiM Helm is the enforcement and evidence layer (deterministic allow/block/escalate within bounded authority, with signed records) that gives an AIMS something real to monitor and improve. We state plainly what is built versus sequenced; KAiM is at the design-partner pilot and controlled-demonstration stage.

If your 42001 program is well-documented but you cannot say with confidence what would stop an AI system from taking a prohibited action today, that is the gap worth measuring. Our Control Gap Assessment maps your defined controls against what actually fires at the point of action — a short, honest read on where your management system ends and your runtime begins.