LLM Bias Detection and Enforcement in Production AI Systems

Deploying large language models in regulated decision-making contexts (credit underwriting, employment screening, healthcare triage, insurance pricing) creates bias risks that go beyond those of traditional statistical models. LLMs encode correlations from their training data that map onto protected-class characteristics in ways that are difficult to detect and even more difficult to prevent after the fact. The difference between a compliant AI lending system and one that triggers a CFPB enforcement action often comes down to a single architectural question: does your system detect potential bias in the output, or does it enforce against it at the point of decision? Detection is necessary. Enforcement is what protects you.

Types of AI Bias in Production LLM Systems

Bias in LLM-based AI systems takes several distinct forms, each with different causal mechanisms and different regulatory implications. A complete bias management program must address all of them.

Demographic Bias

The model produces systematically different outputs for inputs that differ only in protected-class markers — names, pronouns, dialect features, location references. The bias is direct: change "Maria Garcia" to "John Smith" and the output changes, even with identical financial information.
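
The name-swap check described above can be sketched as a counterfactual test: re-score the same application with only one field changed and look at the difference. This is a minimal sketch, not a CoreGuard API. The names `counterfactual_gaps` and `toy_score` are hypothetical, and `toy_score` is a deliberately biased stand-in for whatever system is under test.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Application = Dict[str, object]


def counterfactual_gaps(
    score_fn: Callable[[Application], float],
    application: Application,
    field_name: str,
    substitutes: Iterable[str],
) -> List[Tuple[str, float]]:
    """Re-score the application with only `field_name` changed.

    Returns (substitute_value, score_delta_vs_baseline) pairs.
    """
    baseline = score_fn(application)
    gaps = []
    for value in substitutes:
        variant = dict(application)  # shallow copy; every other field stays identical
        variant[field_name] = value
        gaps.append((value, score_fn(variant) - baseline))
    return gaps


def toy_score(app: Application) -> float:
    """Hypothetical stand-in for the model under test (biased on purpose)."""
    score = float(app["credit_score"]) / 850.0
    if app["name"] == "Maria Garcia":
        score -= 0.05  # the kind of name-driven penalty the test should expose
    return score


app = {"name": "Maria Garcia", "credit_score": 695, "income": 72000}
gaps = counterfactual_gaps(toy_score, app, "name", ["John Smith"])
```

Any non-zero delta on an otherwise identical application is a signal worth investigating. For an LLM-based system, `score_fn` would need to wrap the model call and reduce its output to a comparable number, which is an assumption this sketch does not address.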

Historical Bias

Training data reflects historical patterns of discrimination. A model trained on past lending decisions learns to replicate those decisions — including the discriminatory ones. The model is performing correctly relative to its training signal; the training signal itself was biased.

Proxy Variable Bias

The model uses variables that are not protected characteristics but are strongly correlated with them: zip code as a proxy for race, occupational title as a proxy for gender, educational institution as a proxy for socioeconomic status and race. Proxy variable usage can produce disparate impact without any direct mention of protected characteristics.
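
One rough way to quantify how strongly a variable acts as a proxy is to ask how much better you can guess the protected group once you know the variable. The sketch below compares a baseline guess (always pick the most common group) with a guess made per feature value. The function name `proxy_strength` and the sample data are hypothetical, and this is only one of several possible association measures.

```python
from collections import Counter, defaultdict
from typing import Hashable, Sequence, Tuple


def proxy_strength(
    feature: Sequence[Hashable], group: Sequence[Hashable]
) -> Tuple[float, float]:
    """Return (baseline_accuracy, accuracy_given_feature) for guessing `group`.

    A large gap between the two suggests the feature carries group information.
    """
    if len(feature) != len(group) or not feature:
        raise ValueError("feature and group must be non-empty and the same length")

    # Baseline: always guess the most common group overall.
    baseline = Counter(group).most_common(1)[0][1] / len(group)

    # With the feature: guess the most common group within each feature value.
    by_value = defaultdict(Counter)
    for f, g in zip(feature, group):
        by_value[f][g] += 1
    correct = sum(c.most_common(1)[0][1] for c in by_value.values())
    return baseline, correct / len(group)


# Hypothetical sample: zip code and self-reported group for six applicants.
zips = ["10001", "10001", "10001", "20002", "20002", "20002"]
groups = ["A", "A", "B", "B", "B", "B"]
baseline, with_zip = proxy_strength(zips, groups)
```

In this toy sample, knowing the zip code raises guessing accuracy from 4/6 to 5/6. Real analyses would use far larger samples and a held-out set to avoid overstating the association.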

Measurement Bias

The features used to represent creditworthiness, qualifications, or risk are measured with different accuracy or completeness across demographic groups. Credit scores built primarily on formal credit history disadvantage populations with thin or non-traditional credit files — disproportionately lower-income and minority populations.
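
A simple first check for measurement bias is completeness: how often is a feature missing for each group? The sketch below assumes records are plain dictionaries and that a missing value is represented as `None` or an absent key. The field names are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Mapping


def missing_rate_by_group(
    records: Iterable[Mapping[str, Any]], field_name: str, group_field: str
) -> Dict[str, float]:
    """Fraction of records in each group where `field_name` is missing."""
    totals = defaultdict(int)
    missing = defaultdict(int)
    for record in records:
        group = record[group_field]
        totals[group] += 1
        if record.get(field_name) is None:
            missing[group] += 1
    return {g: missing[g] / totals[g] for g in totals}


# Hypothetical sample with a thin-file applicant in group B.
records = [
    {"group": "A", "credit_history_months": 120},
    {"group": "A", "credit_history_months": 48},
    {"group": "B", "credit_history_months": None},
    {"group": "B", "credit_history_months": 36},
]
rates = missing_rate_by_group(records, "credit_history_months", "group")
```

A large difference in missing rates between groups suggests the feature represents some populations less completely than others.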

The LLM-specific amplification problem: Traditional statistical models (logistic regression, gradient boosted trees) are transparent — you can enumerate which features affect the output and by how much. LLMs are not transparent in this way. A loan officer assistant built on an LLM may exhibit proxy variable bias through hundreds of subtle lexical and contextual cues in the conversation context that no audit of feature importance will surface. This is why input validation and output enforcement are both necessary, and why enforcement must happen before the output reaches the user.

Why Bias Detection Alone Is Insufficient

Bias detection — analyzing model outputs for evidence of differential treatment across demographic groups — is a necessary part of an AI bias management program. It is not sufficient for regulatory compliance because it is retrospective. Detection tells you that your model produced biased outputs on your test set, or in a historical audit of production data. It does not prevent biased outputs from being delivered to users.

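The compliance gap between detection and enforcement is not semantic. With detection alone, the biased output has already reached the applicant by the time it is flagged, and the audit record shows a decision that was allowed and only later detected.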

Enforcement means the biased decision was blocked before delivery. The applicant never received the biased output. The audit record shows BLOCKED, not ALLOWED-THEN-DETECTED. This is the difference between a preventive control and a detective control — and regulators require both, with a clear preference for preventive controls in the highest-risk categories.

Detection vs. Enforcement: Both are required. Detection without enforcement is a monitoring program. Enforcement without detection is a blind control. The complete architecture is: detect bias patterns in historical data, encode the detected patterns as enforceable rules in the policy engine, enforce those rules on every production decision before delivery, and monitor the enforcement records to detect new patterns the existing rules do not cover.
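
The difference between a preventive and a detective control comes down to where the check sits relative to delivery. The sketch below shows the preventive placement: evaluate, record, and only then deliver. All names (`Decision`, `deliver_with_enforcement`, `toy_evaluate`) are hypothetical and do not represent CoreGuard's actual interface.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional

Output = Dict[str, Any]


@dataclass
class Decision:
    status: str  # "ALLOWED" or "BLOCKED"
    rule_ids: List[str] = field(default_factory=list)


def deliver_with_enforcement(
    output: Output,
    evaluate: Callable[[Output], Decision],
    audit_log: List[Decision],
) -> Optional[Output]:
    """Evaluate before delivery; a BLOCKED output is never returned to the caller."""
    decision = evaluate(output)
    audit_log.append(decision)  # every decision is recorded, allowed or not
    return output if decision.status == "ALLOWED" else None


def toy_evaluate(output: Output) -> Decision:
    """Hypothetical policy check: block when zip code is a primary factor."""
    if "zip_code" in output.get("primary_factors", []):
        return Decision("BLOCKED", ["lending.ecoa.015"])
    return Decision("ALLOWED")


audit_log: List[Decision] = []
blocked = deliver_with_enforcement(
    {"recommendation": "deny", "primary_factors": ["zip_code"]}, toy_evaluate, audit_log
)
delivered = deliver_with_enforcement(
    {"recommendation": "approve", "primary_factors": ["income"]}, toy_evaluate, audit_log
)
```

The audit log still captures both decisions, so the same records can feed the monitoring step that looks for patterns the existing rules miss.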

ECOA / Reg B Disparate Impact Requirements for AI Lending Decisions

The Equal Credit Opportunity Act and its implementing regulation, Regulation B, prohibit creditors from discriminating against applicants on the basis of race, color, religion, national origin, sex, marital status, age, or the fact that the applicant receives income from public assistance. These prohibitions apply to any lending decision — including decisions made by or with the assistance of an AI system.

There are two theories of liability under ECOA that apply to AI lending decisions:

Disparate Treatment

Disparate treatment occurs when a creditor treats applicants differently based on a protected characteristic, even if the basis for that treatment is an LLM's internal representation rather than an explicit rule. An AI system that produces systematically harsher credit assessments for applicants whose names are associated with certain racial or ethnic groups, despite identical financial profiles, has engaged in disparate treatment even if no human actor intended discrimination.

Disparate Impact

Disparate impact occurs when a facially neutral policy or practice disproportionately harms a protected class without business necessity. Under ECOA, disparate impact claims are cognizable even without intent to discriminate. An AI lending model that uses zip code as a significant factor in credit decisions may produce disparate impact on protected-class applicants through the proxy variable mechanism, even if zip code is a statistically valid predictor of credit risk. Business necessity alone does not provide a complete defense — the creditor must also show there is no less discriminatory alternative with comparable predictive power.

| Reg B Requirement | What It Requires | AI-Specific Challenge | CoreGuard Enforcement |
| --- | --- | --- | --- |
| Specific Adverse Action Notice | When credit is denied or terms are unfavorable, disclose the specific principal reasons, not "automated system" | LLMs do not produce interpretable reason codes; the "reason" may be distributed across thousands of contextual features | BLOCKED decisions cite the specific policy rule that triggered the block (e.g., lending.ecoa.015), which maps directly to an adverse action reason code |
| Prohibited Basis Prohibition | Protected characteristics may not be used as a basis for credit decisions | LLMs can infer protected characteristics from context (name, dialect, address, school) without explicit input | Rules in lending_v1 detect inferences of protected class characteristics from context fields and block the action |
| Proxy Variable Prohibition | Variables that function as proxies for protected characteristics may constitute prohibited discrimination | LLMs may weight proxy variables through statistical correlations in training data that are not visible to human reviewers | Proxy variable detection rules flag and block actions that rely heavily on known proxy variables as primary decision factors |
| Recordkeeping | Maintain records sufficient to evidence compliance for examination | LLM decisions without signed audit trails cannot be reproduced or verified after the fact | Every decision generates an HMAC-signed audit certificate with the request hash, policy version, and decision timestamp; retained 7 years |

How CoreGuard Detects and Blocks Biased Decisions at Policy Evaluation

CoreGuard's bias enforcement architecture operates on the structured action context submitted to the API — the fields, values, and relationships in the lending decision request — rather than on the LLM output text. This is a deliberate design choice. Output text analysis is probabilistic and can be evaded. Structured field evaluation against documented rules is deterministic and cannot be evaded without changing the input.
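
To illustrate why structured field evaluation is deterministic, here is a minimal sketch of a rule evaluator that checks context fields against explicit predicates. It is an assumption-laden illustration, not CoreGuard's engine: the `Rule` class, the `primary_factors` field, and the predicate are hypothetical. The rule ID, field path, and citation string are copied from this document's sample audit record.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Context = Dict[str, Any]


@dataclass(frozen=True)
class Rule:
    rule_id: str
    description: str
    citation: str
    field_path: str
    violates: Callable[[Context], bool]  # pure predicate over the structured context


def evaluate(context: Context, rules: List[Rule]) -> Dict[str, Any]:
    """Apply every rule; the same context always yields the same result."""
    violations = [
        {
            "rule_id": r.rule_id,
            "rule_description": r.description,
            "field": r.field_path,
            "regulatory_citation": r.citation,
        }
        for r in rules
        if r.violates(context)
    ]
    return {
        "decision": "BLOCKED" if violations else "ALLOWED",
        "policy_violations": violations,
    }


RULES = [
    Rule(
        rule_id="lending.ecoa.015",
        description="Zip code used as primary factor",
        citation="Reg B, 12 CFR 1002.6(b)(2)",
        field_path="action.context.zip_code",
        # Hypothetical condition: the request lists zip_code among its primary factors.
        violates=lambda ctx: "zip_code" in ctx.get("primary_factors", []),
    ),
]

blocked = evaluate({"primary_factors": ["zip_code", "income"]}, RULES)
allowed = evaluate({"primary_factors": ["income"]}, RULES)
```

Because each predicate reads only the submitted fields, the only way to change the outcome is to change the input, which is the property the paragraph above relies on.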

The lending_v1 policy pack implements bias detection in three layers:

Disparate Impact Testing Methodology

Runtime enforcement blocks known violations. Disparate impact testing identifies emerging patterns that do not yet have explicit enforcement rules — the feedback loop that keeps the policy pack current.

1. Establish demographic cohorts

Using HMDA-reportable data or matched-pair testing, establish the demographic distribution of your applicant population. Disparate impact analysis requires a comparison across demographic cohorts — you need to know the protected class membership of each applicant in your sample, either from applicant-provided data or from matched-pair substitution testing.

2. Compute approval rates by cohort

Calculate the loan approval rate (or approval-equivalent outcome) for each demographic cohort in your test population. Use your AI system in production mode — not a test environment with different inputs. The approval rate is the proportion of applicants in each cohort who received an ALLOWED decision from CoreGuard.

3. Apply the four-fifths rule

Compare each protected group's approval rate to that of the group with the highest approval rate. If any group's approval rate falls below 80% of the highest group's rate, this constitutes evidence of adverse impact warranting investigation. The four-fifths threshold is a screening criterion, not a safe harbor: a statistically significant disparity also warrants examination even when the ratio is above 80%.

4. Identify feature contributions

For disparity-impacted cohorts, analyze the CoreGuard audit records for the decisions in each cohort. Examine which policy rules triggered BLOCKED decisions and which fields were flagged as risk factors. High-frequency rule triggers for specific cohorts indicate proxy variable usage or other systematic bias patterns that should be translated into enforcement rules.

5. Encode findings as enforcement rules

Each identified disparity pattern becomes a new or updated rule in the policy pack. Rules specify the input condition, the rule ID, the regulatory citation, and the enforcement action (BLOCKED or MODIFIED). Once encoded, the rule prevents future occurrences at the point of decision rather than detecting them retrospectively.

6. Document and retain the analysis

Maintain a dated record of each disparate impact analysis, the methodology used, the findings, the rules added or modified in response, and the post-update validation results. This documentation is what a CFPB examiner or plaintiff's counsel will request in the event of a fair lending challenge. CoreGuard's immutable audit trail provides the decision-level evidence; this analysis documentation provides the governance-level evidence.
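
Steps 2 and 3 above reduce to a small calculation: an approval rate per cohort, then each rate divided by the highest rate. The following is a minimal sketch under the assumption that decisions are available as (cohort, status) pairs. The function names and sample counts are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def approval_rates(decisions: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """decisions: (cohort, status) pairs; status 'ALLOWED' counts as an approval."""
    totals = defaultdict(int)
    allowed = defaultdict(int)
    for cohort, status in decisions:
        totals[cohort] += 1
        if status == "ALLOWED":
            allowed[cohort] += 1
    return {c: allowed[c] / totals[c] for c in totals}


def impact_ratios(rates: Dict[str, float]) -> Dict[str, float]:
    """Each cohort's approval rate divided by the highest cohort's rate."""
    best = max(rates.values())
    if best == 0:
        raise ValueError("no approvals in any cohort; ratios are undefined")
    return {c: r / best for c, r in rates.items()}


def below_four_fifths(ratios: Dict[str, float], threshold: float = 0.8) -> List[str]:
    """Cohorts whose impact ratio falls below the screening threshold."""
    return sorted(c for c, r in ratios.items() if r < threshold)


# Hypothetical sample: 7 of 10 approved in cohort_a, 5 of 10 in cohort_b.
sample = (
    [("cohort_a", "ALLOWED")] * 7 + [("cohort_a", "BLOCKED")] * 3
    + [("cohort_b", "ALLOWED")] * 5 + [("cohort_b", "BLOCKED")] * 5
)
rates = approval_rates(sample)
ratios = impact_ratios(rates)
flagged = below_four_fifths(ratios)
```

Here cohort_b's ratio is 0.5 / 0.7, roughly 0.71, so it is flagged for investigation. Samples this small are for illustration only; a real analysis needs enough decisions per cohort for the rates to be stable.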

Audit Record Format for Fair Lending Compliance

Every CoreGuard evaluation generates an audit record that is designed to satisfy Reg B recordkeeping requirements. The key fields for fair lending compliance are:

CoreGuard Audit Record — Reg B Relevant Fields
{
  "audit_record": {
    "record_id": "aud_def456",          // Unique record ID for CFPB examination reference
    "timestamp": "2026-05-05T10:30:00.000Z",  // ISO 8601 UTC — exact decision time
    "request_hash": "sha256:abc...",      // Hash of the full request — proves exact inputs
    "decision_hash": "sha256:def...",      // Hash of the full response — proves exact output
    "certificate": "eyJ...",              // JWT-format signed certificate
    "hmac_signature": "hmac-sha256:xyz...", // Tamper-evident HMAC-SHA256 signature
    "policy_set": "lending_v1",           // Exact policy version — enables historical replay
    "evaluator_version": "1.3.0"          // Engine version — for regression test documentation
  },
  "policy_violations": [
    {
      "rule_id": "lending.ecoa.015",     // Maps directly to Reg B adverse action reason code
      "rule_description": "Zip code used as primary factor — known proxy for race",
      "severity": "CRITICAL",
      "field": "action.context.zip_code",   // Exact field that triggered the violation
      "regulatory_citation": "Reg B, 12 CFR 1002.6(b)(2)"
    }
  ]
}

The rule_id field serves as the adverse action reason code for Reg B compliance. Each rule in the lending_v1 policy pack maps to a specific regulatory citation, enabling a direct connection from the BLOCKED decision to the Reg B disclosure requirement. When this decision results in a denied application, the adverse action notice can cite "proxy variable usage (Reg B, 12 CFR 1002.6(b)(2))" as the specific reason — satisfying the specificity requirement that generic "automated system" notices do not.
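
To show how a hashed and HMAC-signed record becomes tamper-evident, here is a minimal sketch using Python's standard `hashlib` and `hmac` modules. The document does not describe CoreGuard's actual signing scheme, so the canonical JSON encoding, the choice of which fields the signature covers, and the function names are all assumptions.

```python
import hashlib
import hmac
import json
from typing import Any, Dict


def canonical_bytes(obj: Any) -> bytes:
    """Stable JSON encoding so identical content always hashes identically."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_record(request: Dict[str, Any], fields: Dict[str, str], key: bytes) -> Dict[str, str]:
    """Attach a request hash and an HMAC-SHA256 signature over the record fields."""
    record = dict(fields)
    record["request_hash"] = "sha256:" + hashlib.sha256(canonical_bytes(request)).hexdigest()
    signature = hmac.new(key, canonical_bytes(record), hashlib.sha256).hexdigest()
    record["hmac_signature"] = "hmac-sha256:" + signature
    return record


def verify_record(record: Dict[str, str], key: bytes) -> bool:
    """Recompute the signature over everything except the signature itself."""
    unsigned = {k: v for k, v in record.items() if k != "hmac_signature"}
    expected = "hmac-sha256:" + hmac.new(key, canonical_bytes(unsigned), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record.get("hmac_signature", ""))


KEY = b"demo-key-not-for-production"  # hypothetical key for illustration only
request = {"action": "evaluate_application", "context": {"zip_code": "10001"}}
record = sign_record(
    request,
    {"record_id": "aud_def456", "policy_set": "lending_v1", "evaluator_version": "1.3.0"},
    KEY,
)
tampered = dict(record, policy_set="lending_v2")
```

`verify_record(record, KEY)` succeeds while the tampered copy fails, which is the property that lets an examiner trust that a stored record matches what was decided at the time.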

Real-World Scenario: The Loan Officer AI and Biased Follow-Up Questions

Consider a regional lender that deploys an AI assistant to help loan officers document and evaluate applications. The system is designed to help officers ask relevant follow-up questions and identify missing information. Without a governance enforcement layer, the following interaction is possible:

Scenario: Loan Officer AI — Without Enforcement
Loan Officer

Applicant Maria Rodriguez, applied for $180,000 mortgage. Credit score 695, income $72,000. What questions should I ask?

AI Assistant (Uncontrolled)

Here are some suggested follow-up questions: 1) Can you confirm your employment history for the past two years? 2) What neighborhood are you looking to purchase in? Have you considered other areas with stronger property appreciation? 3) Can you tell me more about your family's financial situation and any dependents? 4) What type of work authorization do you have?

Compliance Risk

Questions 2 and 4 constitute prohibited inquiry under ECOA and the Fair Housing Act. Question 2 steers the applicant toward or away from neighborhoods based on implied demographic assumptions (potential redlining). Question 4 probes national origin through immigration status; national origin is a prohibited basis under ECOA and Reg B.

Scenario: Loan Officer AI — With CoreGuard Enforcement
Loan Officer

Applicant Maria Rodriguez, applied for $180,000 mortgage. Credit score 695, income $72,000. What questions should I ask?

CoreGuard Policy Evaluation

Action proposed: "generate_followup_questions" with context fields including applicant name, address, and loan parameters. Evaluating against lending_v1...

AI Assistant (CoreGuard Enforced)

Here are Reg B-compliant follow-up questions: 1) Can you confirm your employment history for the past two years? 2) What is the property address for the home you are purchasing? 3) Do you have any additional income sources not reflected in your application? 4) Are there any large deposits in your bank statements in the last 90 days that we should document?

CoreGuard Result: ALLOWED

Decision ALLOWED. All four questions passed lending_v1 compliance evaluation. No prohibited inquiries, no proxy variables, no steering language detected. Audit certificate aud_xyz789 generated.

In the uncontrolled scenario, the biased questions were generated, delivered to the loan officer, and potentially asked of the applicant before any detection system could flag the violation. The harm occurred at the moment of delivery. In the enforced scenario, the policy evaluation ran before the questions reached the loan officer, so any non-compliant output would have been rewritten or blocked before delivery. Here all four questions passed, and the audit record shows a compliant interaction from the first moment.

Frequently Asked Questions

Does ECOA apply to AI-generated lending decisions?

Yes. ECOA and its implementing regulation, Reg B, apply to any creditor that extends credit — regardless of whether the decision is made by a human, a traditional statistical model, or an AI/LLM system. The CFPB has issued guidance explicitly stating that the automated nature of a decision does not alter an applicant's right to a specific, accurate adverse action notice, or their right to be free from discrimination. A lender who deploys an LLM to assist with loan decisions without ensuring ECOA compliance faces the same enforcement exposure as any other creditor.

What is a proxy variable in AI lending bias?

A proxy variable is a non-protected-class variable that is strongly correlated with a protected characteristic and can function as a substitute for direct discrimination. Classic proxy variables in lending include zip code (correlated with race via historical redlining), last name (correlated with national origin), and educational institution (correlated with race and socioeconomic status). LLMs trained on general internet data often encode proxy variable correlations in their weights, because those correlations are present in the training data. This means an LLM may exhibit disparate impact through proxy variable usage even without any explicit discriminatory intent.

What is the four-fifths rule for disparate impact testing?

The four-fifths rule (80% rule) is a threshold from the EEOC's Uniform Guidelines, also applied in fair lending disparate impact analysis. It states that if the selection rate for a protected group is less than 80% of the selection rate for the group with the highest rate, this is evidence of adverse impact requiring investigation. In AI lending, if white applicants have a 70% approval rate and Black applicants have a rate below 56% (80% of 70%), the disparity warrants investigation as potential disparate impact. The four-fifths rule is a screening criterion, not a legal safe harbor: statistically significant disparities above the threshold can also be actionable.
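
Because a disparity can pass the four-fifths screen and still be statistically significant, it helps to pair the ratio with a significance test. The sketch below uses a pooled two-proportion z-test, which is one common choice and not something this document prescribes. The function name and the sample counts are hypothetical.

```python
import math
from typing import Tuple


def two_proportion_z(
    approved_a: int, total_a: int, approved_b: int, total_b: int
) -> Tuple[float, float]:
    """Pooled two-proportion z-test; returns (z, two_sided_p_value)."""
    p_a = approved_a / total_a
    p_b = approved_b / total_b
    pooled = (approved_a + approved_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        raise ValueError("standard error is zero; rates are all 0 or all 1")
    z = (p_a - p_b) / se
    # Two-sided p-value for a standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 700 of 1000 approved versus 600 of 1000 approved.
ratio = (600 / 1000) / (700 / 1000)
z, p_value = two_proportion_z(700, 1000, 600, 1000)
```

In this example the impact ratio is about 0.86, which clears the 80% screen, yet the z statistic is large and the p-value is very small. That is the situation the screening caveat is warning about.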

How do I produce an adverse action notice for an AI-blocked lending decision?

Under Reg B, an adverse action notice must disclose the specific principal reasons — not a generic statement that the application was denied. CoreGuard's approach is to evaluate the action against named policy rules before it reaches the output stage. When a decision is BLOCKED, the audit record identifies exactly which policy rule triggered the block — e.g., lending.ecoa.015: zip_code used as primary factor — known proxy variable. This rule citation, combined with the regulatory citation in the rule definition, maps directly to the adverse action notice reason code required by Reg B, and provides the specific principal reason disclosure that the regulation requires.
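
The mapping from a BLOCKED audit record to notice text can be sketched as a small formatting step over the `policy_violations` array. This is an illustration under assumptions: the function name and the output wording are hypothetical, the record is abridged from the sample shown earlier in this document, and actual notice language should come from compliance counsel.

```python
from typing import Any, Dict, List


def adverse_action_reasons(audit: Dict[str, Any]) -> List[str]:
    """Build one reason line per policy violation in the audit record."""
    reasons = []
    for violation in audit.get("policy_violations", []):
        reasons.append(
            "{rule_id}: {rule_description} ({regulatory_citation})".format(**violation)
        )
    return reasons


# Abridged from the sample audit record in this document.
audit = {
    "audit_record": {"record_id": "aud_def456", "policy_set": "lending_v1"},
    "policy_violations": [
        {
            "rule_id": "lending.ecoa.015",
            "rule_description": "Zip code used as primary factor",
            "severity": "CRITICAL",
            "field": "action.context.zip_code",
            "regulatory_citation": "Reg B, 12 CFR 1002.6(b)(2)",
        }
    ],
}
reasons = adverse_action_reasons(audit)
```

Keeping the rule ID in the reason line preserves the link back to the audit record, so the notice and the decision evidence can be matched during an examination.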

Enforce Against AI Bias at the Point of Decision

CoreGuard's lending_v1 policy pack implements ECOA / Reg B disparate impact rules, proxy variable detection, and adverse action notice enforcement.