What "AI proposes, deterministic evaluators enforce" actually means
KAiM's core principle, stated plainly and made rigorous — plus an eight-question FAQ for the people who have to sign off on it. The short version: the AI suggests; a deterministic gate decides; every decision is logged, signed, and reproducible.
You have heard the slogan. It is on our home page: AI proposes, deterministic evaluators enforce. It sounds reassuring. It is also the kind of sentence that should make a serious risk officer suspicious, because slogans are cheap and accountability is not.
So here is the principle taken apart and checked. What we mean. What we claim. And — just as important — what we do not claim.
The part that is not deterministic, and the part that is
Start with an honest admission. The AI is not deterministic. It is a model. Ask it the same question twice and you may get two different answers. It samples. It has a temperature. It drifts as it is retrained. This is not a flaw we are hiding; it is the nature of the technology, and any vendor who tells you their model is "deterministic" is selling you something.
The AI's job in our world is narrow: it proposes a consequential action. Send this wire. Approve this loan. Email this customer. Close this account.
That proposal then hits KAiM Helm.
KAiM Helm is the deterministic part. It is a control engine — code, not a model. When the AI proposes an action, Helm evaluates that proposal on six axes:
- Authority — is this actor permitted to do this?
- Policy — does it satisfy internal rules?
- Evidence — is the supporting data present and valid?
- Harm — what is the blast radius if this is wrong?
- Regulation — does it comply with the rules that bind this business?
- Escalation — does a human need to look at this first?
Helm returns one of three verdicts: ALLOW, BLOCK, or ESCALATE. And it leaves behind a signed, reproducible record of why.
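Here is a minimal sketch of a gate with this shape. It is not Helm's code. The `Proposal` fields, the rule names, and every threshold are assumptions invented for illustration. What it shows is the structure: each axis is a plain check on the proposal, and the verdict is a pure function of those checks, with no sampling, no clock, and no network call in the decision path.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    ALLOW = "ALLOW"
    BLOCK = "BLOCK"
    ESCALATE = "ESCALATE"


@dataclass(frozen=True)
class Proposal:
    # Hypothetical shape of an AI-proposed action; field names are illustrative.
    actor: str
    action: str
    amount: float
    evidence: tuple  # names of the supporting documents attached


# Illustrative rule set: plain data, invented thresholds.
RULES = {
    "permitted_actions": {"agent-7": {"send_wire"}},
    "policy_max_amount": 50_000.0,
    "required_evidence": {"invoice", "kyc_check"},
    "harm_review_amount": 10_000.0,
    "regulated_block_actions": {"close_account"},
}


def evaluate(p: Proposal, rules: dict) -> tuple:
    """Return (verdict, reasons). Pure function: same inputs, same output."""
    blocks, escalations = [], []
    # Authority: is this actor permitted to do this?
    if p.action not in rules["permitted_actions"].get(p.actor, set()):
        blocks.append("authority: actor not permitted")
    # Policy: does it satisfy internal rules?
    if p.amount > rules["policy_max_amount"]:
        blocks.append("policy: amount over limit")
    # Evidence: is the supporting data present?
    missing = rules["required_evidence"] - set(p.evidence)
    if missing:
        escalations.append("evidence: missing " + ", ".join(sorted(missing)))
    # Harm: large amounts get a human look.
    if p.amount >= rules["harm_review_amount"]:
        escalations.append("harm: amount needs review")
    # Regulation: some actions are restricted outright.
    if p.action in rules["regulated_block_actions"]:
        blocks.append("regulation: action restricted")
    # Escalation: any block wins; otherwise any open question goes to a human.
    if blocks:
        return Verdict.BLOCK, tuple(blocks)
    if escalations:
        return Verdict.ESCALATE, tuple(escalations)
    return Verdict.ALLOW, ("all axes passed",)


wire = Proposal("agent-7", "send_wire", 12_000.0, ("invoice", "kyc_check"))
first = evaluate(wire, RULES)
second = evaluate(wire, RULES)
print(first[0].value, first == second)  # ESCALATE True
```

The two calls return identical results because nothing in `evaluate` depends on anything except its arguments.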
"Deterministic" means exactly this: the same proposed action, evaluated against the same rule set, produces the same verdict every time. No sampling. No temperature. No drift. If Helm blocks a wire today, it will block the identical wire tomorrow, and the log will show the same rule firing for the same reason. You can re-run it. You can diff it. You can hand it to an examiner.
What we claim — and what we do not
This is where most "responsible AI" marketing falls apart, so let us be precise.
We are not claiming the AI is correct. It proposes bad actions sometimes. That is assumed.
We are not claiming the codified rules are complete. No rule set anticipates everything. Yours will have gaps, and so will the first version we build with you.
We are not claiming the human on the escalation path will always judge well. Human judgment is, by definition, not deterministic either.
Here is the narrower claim, and it is the one that survives scrutiny:
The gate between the AI and your customer behaves the same way every time, and every ALLOW, BLOCK, and ESCALATE is logged, signed, and reproducible.
That is a smaller promise than "our AI is safe." It is also a promise we can actually keep, and that you can actually verify. The determinism lives in one place — the evaluator and its verdict logic — and that place is code: inspectable, version-controlled, testable. The non-deterministic parts are quarantined on either side of it. The model's creativity stays upstream. Human discretion stays downstream, on escalation. In the middle sits something that does not improvise.
Why this maps to things you already trust
None of this is exotic. We are assembling familiar, battle-tested ideas at a new point in the workflow.
Policy-as-code at the point of action. Your engineers already express infrastructure rules as code that is evaluated automatically. KAiM Helm does the same thing for consequential business actions — the rule is code, the check runs at the moment of action, not in a quarterly review.
An independent challenge function. Model-risk practice has a name for this: effective challenge — an independent function with the standing to say no. Helm is that function, made continuous and automatic. The system proposing the action is not the system that approves it. That separation is the entire point.
An auditable decision record. Every verdict is a record: the proposed action, the rules evaluated, the verdict, the signature, the timestamp. When someone asks "who approved this, and on what basis?" — the question that follows every incident — there is an answer that does not depend on memory or goodwill.
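To make "signed" concrete, here is one way such a record could be sealed and checked. This is a sketch under stated assumptions, not a description of Helm's record format: the field names, the `rule_set_version` label, and the use of a shared-secret HMAC are all illustrative. A production system would more plausibly sign with an asymmetric key held in a key-management service.

```python
import hashlib
import hmac
import json

# Assumption: a shared secret, used here only to keep the example short.
SIGNING_KEY = b"demo-key-not-for-production"


def canonical(record: dict) -> bytes:
    """Serialize deterministically so the same record always yields the same bytes."""
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_record(record: dict, key: bytes = SIGNING_KEY) -> dict:
    """Return a copy of the record with an HMAC-SHA256 signature attached."""
    signature = hmac.new(key, canonical(record), hashlib.sha256).hexdigest()
    return {**record, "signature": signature}


def verify_record(signed: dict, key: bytes = SIGNING_KEY) -> bool:
    """Recompute the signature over everything except the signature field."""
    body = {k: v for k, v in signed.items() if k != "signature"}
    expected = hmac.new(key, canonical(body), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signed.get("signature", ""))


record = sign_record({
    "proposed_action": {"actor": "agent-7", "action": "send_wire", "amount": 12000},
    "rules_evaluated": ["authority", "policy", "evidence", "harm", "regulation", "escalation"],
    "rule_set_version": "v1",             # hypothetical version label
    "verdict": "ESCALATE",
    "timestamp": "2025-01-01T00:00:00Z",  # fixed value so the example is repeatable
})

print(verify_record(record))                          # True
print(verify_record({**record, "verdict": "ALLOW"}))  # False: the record was altered
```

Changing any field after signing, including the verdict, makes verification fail. That is the property an examiner relies on.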
The accountability question underneath all of this is simple. When the AI is wrong, who answers for it? With an opaque model, the honest answer is "no one can say." With Helm in the path, the answer is on the record before the action ever ships.
Buyer FAQ
1. What exactly is "deterministic" here? The evaluator and its verdict logic. Same proposed action plus same rule set equals same verdict, every time, with no randomness in the decision path. It is not the AI that is deterministic — it is the gate.
2. If the AI proposes and the evaluator decides, what stops the AI from routing around it? Architecture, not trust. Helm sits in the execution path for the consequential action: the AI cannot send the wire or approve the loan itself; it can only propose, and the proposal must clear Helm to take effect. An action that bypasses the gate is not an allowed action; it is a broken integration. Keeping Helm in the execution path is a deployment requirement we treat as non-negotiable.
3. Which rules are codified? The ones that govern the six axes for your specific actions — authority limits, internal policy, evidence requirements, harm thresholds, the regulations that bind you, and your escalation triggers. We do not pretend to codify everything. We codify what is consequential and what is knowable, and we are explicit about the boundary.
4. Who writes and maintains the rules? You own them; we build them with you. They are not a black box we hand over. They are version-controlled, reviewed, and changed deliberately — every change is itself a recorded event. Your compliance and risk people should be able to read a rule and recognize their own policy in it.
5. How are ambiguous cases handled? They default to ESCALATE, not ALLOW. When the evidence is thin or the rules do not clearly cover the situation, the safe verdict is to route to a human. Uncertainty resolves toward caution, by design.
6. How are conflicting rules resolved? With codified precedence. When two rules disagree, the resolution order is itself written down and version-controlled — so the outcome is predictable and explainable, not a coin flip. If a stricter regulatory rule conflicts with a looser internal one, precedence is defined in advance, not argued after the fact.
7. Can the evaluator itself be audited? Yes — that is the point of building it as deterministic code. The verdict logic is inspectable. The decision records are reproducible. You can re-run a past decision against the rule set that was live at the time and confirm the same verdict. An examiner can do the same.
8. What about false positives and negatives? Do you publish rates? Honestly: not yet. KAiM is at the design-partner and controlled-demonstration stage. Any error rate published without real production volume across many customers would be a fabricated number, and we will not manufacture statistics to win a meeting. What we can show you today is the mechanism, the decision records it produces, and how it behaves on your own scenarios. Measured error rates come with deployment and time, and we will report them when they are real.
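Questions 5, 6, and 7 describe three mechanical behaviors: ambiguity falls through to ESCALATE, conflicts resolve by a written precedence order, and a past decision can be re-run against the rule set that was live at the time. The sketch below shows one simple way those could fit together. The rule names, dollar limits, version labels, and the "first applicable rule wins" scheme are assumptions made up for illustration; a real precedence policy would likely be richer.

```python
from typing import Callable, Optional

# Codified precedence: earlier entries win when rules disagree. (Illustrative order.)
PRECEDENCE = ["regulation", "internal_policy"]

# A rule returns "ALLOW" or "BLOCK", or None when it does not cover the action.
Rule = Callable[[dict], Optional[str]]


def reg_wire_cap(action: dict) -> Optional[str]:
    """Hypothetical regulatory rule: stricter wire limit."""
    if action.get("type") != "send_wire":
        return None
    return "BLOCK" if action["amount"] > 10_000 else "ALLOW"


def internal_wire_cap(action: dict) -> Optional[str]:
    """Hypothetical internal rule: looser wire limit."""
    if action.get("type") != "send_wire":
        return None
    return "BLOCK" if action["amount"] > 25_000 else "ALLOW"


# Rule sets are stored by version so an old decision can be replayed exactly.
RULE_SETS = {
    "v1": {"internal_policy": internal_wire_cap},
    "v2": {"regulation": reg_wire_cap, "internal_policy": internal_wire_cap},
}


def decide(action: dict, version: str) -> str:
    rules = RULE_SETS[version]
    for source in PRECEDENCE:
        rule = rules.get(source)
        if rule is None:
            continue  # this version has no rule from that source
        verdict = rule(action)
        if verdict is not None:
            return verdict  # highest-precedence applicable rule decides
    return "ESCALATE"  # no rule covers this action: route to a human


wire = {"type": "send_wire", "amount": 15_000}
print(decide(wire, "v2"))                       # BLOCK: regulation outranks internal policy
print(decide(wire, "v1"))                       # ALLOW: replay under the older rule set
print(decide({"type": "close_account"}, "v2"))  # ESCALATE: no rule covers it
```

Replaying the same wire under `v1` and `v2` gives different verdicts, and each one is reproducible. The rule change is what explains the difference, and the change is itself on the record.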
Key takeaways
- The AI proposes; it is a model and is not deterministic. KAiM Helm enforces; it is code and is deterministic.
- "Deterministic" means same action plus same rules equals same verdict — ALLOW, BLOCK, or ESCALATE — every time.
- We do not claim the AI is right or the rules are complete. We claim the gate is consistent and every decision is logged, signed, and reproducible.
- It is policy-as-code at the point of action, an independent challenge function, and an auditable decision record — assembled where AI meets your customer.
- Ambiguity defaults to escalate; rule conflicts resolve by codified precedence; the evaluator itself can be audited.
- We are at the design-partner and controlled-demonstration stage. No customer counts and no published error rates yet.
Where to start
If your AI can already take consequential actions but you cannot say, on the record, who approved each one and on what basis — that is a control gap. It is also exactly what we measure.
Ask us for a Control Gap Assessment. We will map where your AI proposes consequential actions, where a deterministic gate belongs, and what a signed decision record would look like for your highest-stakes workflow. No fabricated numbers. Just your actual exposure, made legible.
Audit-first AI governance