A few years ago, a leading U.S. hospital system rolled out a predictive AI model for patient readmissions. It had all the right components: a large training dataset, modern architecture and a promising use case. But cracks soon appeared: interviews with multiple stakeholders revealed concerns, and ultimately the off-the-shelf model failed to win user buy-in.
This failure reflects a pattern seen across the board; in recent research, Microsoft Research calls it the illusion of readiness. Over the past few years, healthcare has witnessed an explosion in AI experimentation. Yet, for all the prototypes and pilots, very few projects have made it into clinical routine or operational reality. The reason isn’t always technical. Often it’s foundational: these tools weren’t designed for healthcare’s reality.
The Limits of Domain-Agnostic Platforms in Healthcare
The limitations of general-purpose AI platforms quickly surface when faced with the complex realities of healthcare.
- Data: Healthcare data isn’t just structured and tabular. It lives in unstructured notes, image formats, clinical devices and legacy systems. Extracting meaning requires more than parsing. It demands clinical literacy.
- Workflows: From intake to diagnosis, billing to discharge, healthcare runs on a maze of handoffs, approvals and human-in-the-loop moments. An AI system that doesn’t fit these workflows will either be ignored or create risk.
- Governance: In healthcare, it’s not enough for a model to perform well; it must also show how it got there. Traceability, explainability and policy alignment are regulatory and ethical requirements.
- The stakes: A missed insight in e-commerce might lose a sale. In healthcare, it might cost a life. That changes the bar for deployment.
LLMs may score well and even outperform on standard benchmarks, but that doesn’t mean they can handle real medical reasoning reliably. Microsoft’s MedFuzz adversarial benchmark, designed to simulate the messy complexity of real clinical tasks, reveals how easily these systems exploit test-taking shortcuts instead of demonstrating true medical reasoning.
It’s clear that benchmark excellence isn’t clinical readiness. The real test is whether a model can reason consistently, withstand edge cases and align with medical standards of accountability. There is growing consensus across the field, as recent studies in Nature and other journals point out, that large models may assist but cannot yet replace human judgment in high-stakes healthcare decisions. To move forward on that path, healthcare needs purpose-built, domain-native models.
What a Domain-Native Platform Looks Like
If AI is to move beyond isolated pilots, healthcare needs platforms that are not only intelligent but structurally compatible with how the industry works. That means architectures built with domain logic at their core: able to ingest and interpret data in context, orchestrate human-in-the-loop workflows and enforce compliance through policy-based guardrails.
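In practice, a policy-based guardrail with human-in-the-loop routing can be very simple: a rule that decides whether a model output may proceed automatically or must be escalated to a clinician. The sketch below is purely illustrative; the field names, risk tiers and confidence threshold are hypothetical, not taken from any specific product.

```python
from dataclasses import dataclass

# Illustrative sketch only: route high-risk or low-confidence model
# outputs to a human reviewer instead of acting on them automatically.
@dataclass
class Recommendation:
    patient_id: str
    action: str
    confidence: float  # model's self-reported confidence, 0..1 (hypothetical)
    risk_tier: str     # "low" or "high", from a site-defined policy (hypothetical)

def route(rec: Recommendation, min_confidence: float = 0.9) -> str:
    """Return the workflow lane a recommendation should enter."""
    if rec.risk_tier == "high":
        return "clinician_review"   # high stakes: always human-in-the-loop
    if rec.confidence < min_confidence:
        return "clinician_review"   # uncertain: escalate rather than act
    return "auto_queue"             # only low-risk, high-confidence proceeds

print(route(Recommendation("p1", "schedule_followup", 0.95, "low")))   # auto_queue
print(route(Recommendation("p2", "adjust_dosage", 0.97, "high")))      # clinician_review
```

The design point is that escalation is the default and automation the exception, which is the opposite of how most general-purpose platforms are tuned.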
They must also enable explainability by design. As research on interpretable ML systems shows, transparency is not an add-on feature but a prerequisite for clinical trust. Each recommendation should be explainable in human terms, particularly in ambiguous or adverse scenarios. Auditability must be native, ensuring that every decision can be traced and reviewed to preserve confidence in the process.
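Native auditability can be sketched just as simply: every recommendation is written to a tamper-evident record linking the output back to its inputs, model version, human-readable rationale and reviewer. This is a minimal illustration under assumed conventions; the field names and the `audit_record` helper are hypothetical.

```python
import datetime
import hashlib
import json

# Illustrative sketch of auditability by design (all names hypothetical):
# each decision becomes a self-describing, hash-sealed record.
def audit_record(inputs: dict, output: str, model_version: str,
                 explanation: str, reviewer: str) -> dict:
    record = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "inputs": inputs,              # what the model saw
        "output": output,              # what it recommended
        "model_version": model_version,
        "explanation": explanation,    # rationale in human terms
        "reviewer": reviewer,          # who signed off
    }
    # A content hash lets a later audit detect tampering with the record.
    record["hash"] = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()
    ).hexdigest()
    return record

rec = audit_record({"hba1c": 8.1}, "flag_for_followup", "v2.3",
                   "HbA1c above the 7.0 follow-up threshold", reviewer="dr_lee")
```

Because the rationale and reviewer travel with the decision itself, review does not depend on reconstructing state after the fact.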
This shift isn’t about verticalizing technology for its own sake. It’s about recognizing that the translation layer between AI and healthcare cannot be treated as an afterthought. In other industries, agility is a virtue, but in healthcare, predictability matters more. Domain-native platforms are how that consistency is engineered.
From Product Development to Product Confidence
The hardest part of healthcare AI is not building a model or deploying it. It is the unglamorous middle: backlog creation, architecture design, compliance testing, stakeholder validation and audit preparation. Studies show that most failures stem not from flawed algorithms, but from socio-technical gaps, such as limited clinician buy-in, poor workflow alignment and the absence of robust governance.
For AI to thrive in healthcare, platforms must support this full life cycle. This requires what can be called a hybrid humility model: just as medical residents hone their skills under expert supervision, AI requires supervision and refinement as it grows into its potential.
Success in AI is also about adoption and getting people to rethink their workflows and welcome AI as a new part of the team. That means capturing the tacit knowledge of a product manager who knows what completeness looks like. It means validating features against policy before they’re coded. It means embedding explainability into every handoff, so trust isn’t lost in translation. Clinicians, meanwhile, should remain active participants by shaping how AI decisions are surfaced and interpreted. Involving key opinion leaders early and planning for adoption matters as much as building the technical solution itself.
Much of the early AI enthusiasm centered on acceleration. How fast can we build? How much can we automate? But in healthcare, speed without safety is not progress. What the industry needs are platforms that treat compliance, auditability and patient safety as core features. Without these, AI won’t scale. Not because the technology failed, but because the trust wasn’t engineered.
The approach to healthcare AI needs to be grounded in transparency and collaboration. Rather than building black-box systems that demand acceptance, the focus must be on creating modular, decision-support tools that clinicians can question, adjust and trust. Each recommendation must be designed to be explainable in human terms and fully auditable, ensuring accountability at every step.
AI cannot be a stand-alone product; it must co-evolve with healthcare systems, developed with clinicians so it can be piloted and scaled responsibly.
The Next Decade of Healthcare AI
The last decade was about proving that AI could work in healthcare. The next decade is about proving it can be trusted, integrated and scaled. This shift will not be led by generic tools retrofitted for clinical scenarios. It will be led by platforms that treat domain context as infrastructure, not overhead.
This article is published on Forbes.

