Medical coding has always lived at an uncomfortable intersection: clinical nuance on one side, payer rules on the other, and revenue integrity in the middle. When it works, it’s invisible. When it doesn’t, the impact shows up everywhere – denials, rework, delayed cash, and a coding team stuck in permanent triage.
That’s why the conversation around AI in medical coding has shifted so quickly. In 2026, the question isn’t “Will AI touch coding?” It already has. The real question is: what can AI reliably do today, what should it never do unattended, and how do you implement it without introducing new compliance and denial risk?
This post is a practical, current-state view, grounded in what’s happening in healthcare adoption and what coding workflows actually need.
Where AI is today – and why healthcare is talking about it more than ever
A few years ago, AI in healthcare mostly lived in pilots, innovation labs, and conference slides. Now it’s making its way into real workflows – especially operational ones.
One clear indicator is clinician adoption: the American Medical Association reported that 66% of physicians used AI in 2024, up from 38% in 2023. That kind of year-over-year jump is rare in healthcare technology adoption. Another signal comes from Menlo Ventures, which reported that 22% of healthcare organizations have implemented domain-specific AI tools – meaning tools built for particular healthcare workflows rather than generic chatbots.
This acceleration is happening against a backdrop of sustained cost pressure. CMS estimates 2024 hospital spending at ~$1.63T and physician/clinical services at ~$1.11T. Meanwhile, administrative complexity remains one of the biggest “hidden” costs in the system. A peer-reviewed analysis estimated $812B in administrative spending (2017), representing 34.2% of US national health expenditures.
So the interest in AI is not just curiosity. It’s a response to a system that has a massive administrative surface area and growing pressure to deliver more throughput without growing headcount at the same pace.
Why adoption is moving faster now than the last wave of health IT
Healthcare has lived through many technology waves – EHR rollouts, patient portals, RPA, analytics platforms. Most improved parts of the system, but they rarely reduced operational burden in a way that teams could feel.
What’s different now is that modern AI is unusually strong at dealing with the exact inputs healthcare runs on: narrative notes, unstructured documentation, and messy context. And access to data is slowly improving as policy and industry momentum pushes against information blocking and toward greater interoperability.
There’s also a workforce reality. HIM and revenue cycle leaders have been dealing with staffing challenges for years, and AHIMA has explicitly discussed how AI adoption is likely to shift coding work toward validation, auditing, and governance rather than simply removing the function. In other words, AI is arriving in an environment that’s already stretched – and that makes operational adoption easier to justify.
Why medical coding is a good use case in healthcare ops
Medical coding is a compelling AI use case because it is both measurable and repeatable. Every encounter has documentation. Every claim needs codes. And downstream, there’s a scoreboard: denials, audit variance, rework, throughput, and revenue integrity.
At the same time, coding has long struggled with three realities: humans vary, rules change, and payers interpret everything differently.
Coding error rates vary widely by setting and specialty, but the overall error surface is significant. A 2024 peer-reviewed overview cites contexts where coding error rates have been reported as high as 38% (for example, anesthesia CPT coding). That is not a universal rate, but it does underline how hard consistent coding can be in real operations.

On the reimbursement side, the cost of rework and improper payment is also non-trivial: CMS’ CERT program reported a Medicare FFS improper payment rate of 6.55%, often tied to documentation and coverage issues rather than fraud. Add the fact that rules evolve regularly – AAPC notes ICD-10-CM updates effectively occur twice a year, with the major update cycle typically effective October 1 – and you get a system that demands consistency in an environment that constantly produces variability.
This is exactly where AI can help – not by “replacing coders,” but by reducing friction and variance in the most repetitive parts of the work.
What AI can do well in medical coding today
In practice, the best coding AI systems are less like an autopilot and more like a high-quality first pass that makes human review faster.
AI is strong at reading large volumes of documentation quickly and turning it into structured outputs: what happened, what diagnoses are present, what procedures were performed, what setting and provider type applies, and what evidence in the note supports the coded story. This matters because a surprising amount of coding time is spent not on the final code selection, but on simply navigating documentation and extracting the relevant facts.
AI is also useful for consistency. Given two similar encounters, a well-designed system will generally reach a more standardized interpretation than two humans working under time pressure. It can also flag common documentation gaps – missing specificity, mismatches between what’s documented and what’s billed, or missing supporting details that often lead to payer edits.
And when AI is implemented thoughtfully, it improves over time through feedback loops: coder overrides, audit outcomes, denial reason codes, and payer-specific behavior patterns. That last point matters because coding correctness is not purely theoretical – it’s operational, payer-shaped, and local.
What AI can’t do reliably today
Here’s the part most blogs gloss over: AI doesn’t usually fail by being obviously wrong. It fails by being plausibly wrong – and in the revenue cycle, “plausible” can still be expensive.
Behavioral health is a great example. On paper, psychotherapy coding looks straightforward. In practice, it’s packed with time thresholds, pairing rules, and documentation nuance – and payer scrutiny varies more than most teams expect.
CMS guidance distinguishes psychotherapy without E/M (such as 90832/90834/90837) from E/M + psychotherapy add-on codes (90833/90836/90838), and documentation must support the time and context for what is billed. In this world, small ambiguities – missing time language, unclear session structure, vague assessment elements – can be the difference between a defensible claim and a denial.
This is where AI introduces risk if it hasn’t been trained and tuned on the nuances that actually matter in your environment. If the note is unclear, an LLM may still choose a code and produce a rationale that sounds reasonable – even if the time documentation doesn’t fully support it, or the pairing logic is off. And even when the clinical logic is directionally correct, AI can miss payer-specific expectations that drive denials in the real world unless you condition it on those rules and learn from your outcomes.
The net effect is that AI doesn’t remove governance work; it raises the value of it. That aligns with AHIMA’s framing: as AI becomes more present, the work shifts toward validation, auditing, and ensuring the integrity of what’s submitted.
So the right mental model is: AI reduces routine effort; it does not reduce accountability. It can absolutely perform well in complex areas like behavioral health – but only when it’s implemented with specialization, feedback loops, and controls, not as a generic out-of-the-box model.
How to know if you need medical coding AI
Medical coding AI isn’t something you adopt because it’s what everyone else is doing. It pays off when it targets a real, measurable bottleneck: one that’s already costing you time, cash, or control.
You’re likely to see ROI if two or more of these are true:
- Coding-related denials are rising, especially denials tied to medical necessity, documentation gaps, or coding edits.
- Audit variance is meaningful and persistent: you see recurring disagreement among coders, auditors, and external reviewers.
- DNFB is prolonged, and staffing pressure feels chronic rather than temporary.
- Coders spend excessive time on chart navigation (hunting for the right evidence) versus actual coding decision-making.
- Outsourcing costs are growing without improving consistency, turnaround times, or governance.
- You can access the core data needed for a closed loop: clinical note + charges + remits (even if imperfect).
If you can’t baseline any metrics or you can’t reliably access the documentation and outputs you’d need to measure impact, start there first. Coding AI is only as valuable as your ability to operationalize it, measure it, and continuously tune it.
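A baseline doesn’t need a data warehouse project – even a frozen snapshot of a few metrics over a fixed period is enough to make later impact attributable. The structure below is a hypothetical sketch; the metric names mirror the ones discussed in this post, but the schema itself is an illustrative assumption.

```python
# A frozen pre-rollout baseline so later impact can be attributed.
# Field names are illustrative, not a prescribed schema.
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class CodingBaseline:
    period_start: date
    period_end: date
    coding_denial_rate: float       # coding-related denials / claims submitted
    coder_touches_per_chart: float  # avg. times a coder opens a chart
    turnaround_days: float          # avg. documentation-to-claim time
    audit_variance_rate: float      # audited charts with disagreement / audited

def denial_rate_delta(baseline: CodingBaseline, current_rate: float) -> float:
    """Percentage-point change vs. baseline; negative means improvement."""
    return current_rate - baseline.coding_denial_rate
```

Freezing the baseline (note `frozen=True`) is deliberate: the comparison point must not drift after the rollout starts, or the ROI story stops being credible to finance.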
How to think about implementing medical coding AI
Once you’ve established that medical coding AI is likely to deliver ROI for you, the next step is resisting the temptation to “roll it out everywhere.” The safest implementations look boring on paper because they’re designed to control risk, prove impact, and scale only after the workflow is stable.
A safe implementation pattern looks like this:
- Start with a narrow wedge: pick one specialty, one encounter type, and a defined payer set. Avoid cross-specialty rollouts until governance and performance are predictable.
- Define success metrics finance will accept and baseline them for 2 weeks before you change anything. Track:
- coding-related denial rate categories
- coder touches per chart
- turnaround time
- audit variance
- net collection impact (when attributable)
- Make evidence and explainability mandatory. For every suggested code, require evidence snippets from the documentation, a clear rationale, and (where relevant) time and pairing logic – this is especially important in behavioral health.
- Design the human-in-the-loop system upfront. Be explicit about what is suggest-only, what can eventually be auto-coded, how escalations work, and what your audit sampling cadence will be.
- Operationalize updates. ICD and guideline changes are ongoing; without a structured update + validation workflow, performance will degrade quietly over time – and you’ll only notice after denials or audit findings move the wrong way.
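The human-in-the-loop design from the steps above can be sketched as an explicit routing policy: everything starts suggest-only, and auto-coding is enabled per specialty-and-code pair only after audited performance clears a bar. The allowlist mechanism, confidence threshold, and function names below are hypothetical, one possible shape of such a policy.

```python
# Hypothetical human-in-the-loop routing policy for a narrow-wedge rollout.
# Auto-coding is opt-in per (specialty, code) pair, never the default.

AUTO_CODE_ALLOWLIST: set[tuple[str, str]] = set()  # pairs cleared by audit

def route(specialty: str, code: str, model_confidence: float,
          has_evidence: bool) -> str:
    """Decide how an AI suggestion enters the coding workflow."""
    if not has_evidence:
        return "escalate"      # no evidence snippet: never surface as-is
    if (specialty, code) in AUTO_CODE_ALLOWLIST and model_confidence >= 0.95:
        return "auto_code"     # still subject to audit sampling
    return "suggest_only"      # coder reviews and confirms every code
```

The point of the allowlist is governance: promotion to auto-coding is a deliberate, auditable decision made after the workflow is stable, not an emergent model behavior.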
Conclusion
Medical coding AI can be a real lever, mainly by speeding up chart review, standardizing routine decisions, and catching documentation gaps earlier. But it only performs reliably when it’s tuned to your specialty and payer nuances, with clear evidence trails and a review/audit loop. If you implement it narrowly, measure outcomes, and operationalize updates, you get faster throughput without compromising defensibility.