AI-Enhanced Clinical Decision Support

Diagnostic-assist study integrating AI models into a clinical decision support system, evaluated across 1,000+ simulated patient scenarios.

  • Python
  • AI Models
  • Healthcare
  • Simulation

/ Outcomes

  • 20% reduction in diagnostic errors across 1,000+ simulated cases
  • Closed-loop feedback design that improved model performance over successive runs
  • Per-decision provenance: every recommendation traces to the model and feature inputs that produced it

Overview

Clinical decision support tools earn or lose trust on a single question: can the clinician see why the system thinks what it thinks? I built a study around a CDSS that integrates AI models into the diagnostic loop, evaluated it against 1,000+ simulated patient cases, and instrumented every recommendation with the inputs and model that produced it.

Approach

The study was structured to answer two questions:

  • Does the AI-augmented loop reduce diagnostic errors versus a non-AI baseline?
  • Does a feedback loop on clinician corrections actually improve the model over time, or just add noise?

To get a defensible answer, I drove the system through a large simulated case set rather than a small real-patient cohort, then measured error rates on a held-out slice.
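The evaluation shape can be sketched in a few lines: run every case through both branches on identical inputs, then compare error rates on a held-out slice only. This is a minimal illustration, not the study harness itself; `run_case` and the error rates inside it are hypothetical stand-ins.

```python
import random

def run_case(case, use_ai):
    """Stand-in for one pass through the diagnostic loop.

    Returns True when the branch makes a diagnostic error. The
    error rates here are placeholders for illustration only.
    """
    error_rate = 0.08 if use_ai else 0.10
    return random.random() < error_rate

def compare_branches(cases, holdout_frac=0.2):
    """Run each case through both branches on the same inputs,
    then report error rates on the held-out slice only."""
    random.shuffle(cases)
    holdout = cases[: int(len(cases) * holdout_frac)]
    baseline_errors = sum(run_case(c, use_ai=False) for c in holdout)
    ai_errors = sum(run_case(c, use_ai=True) for c in holdout)
    n = len(holdout)
    return baseline_errors / n, ai_errors / n

baseline_rate, ai_rate = compare_branches([{"id": i} for i in range(1000)])
```

The point of the paired design is that both branches see the same inputs, so any difference in error rate is attributable to the AI augmentation rather than to case mix.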

What I built

  • The integration layer between the CDSS surface and the underlying AI models — input normalization, feature assembly, recommendation rendering
  • A simulation harness that ran 1,000+ synthetic cases through the loop, comparing AI-augmented and baseline branches on the same inputs
  • A feedback-capture mechanism: each clinician correction was logged with the original recommendation and used to retrain the model on the next pass
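The feedback-capture mechanism in the last bullet can be sketched as a small append-only log: each clinician correction is stored alongside the recommendation it overrides, so the pair can be replayed as a training example on the next retraining pass. The class and field names below are hypothetical, chosen for illustration.

```python
from datetime import datetime, timezone

class FeedbackLog:
    """Illustrative sketch of feedback capture for retraining.

    Each record pairs the model's original recommendation with the
    clinician's correction, plus the model version, so the next
    training pass can consume corrections as labeled examples.
    """

    def __init__(self):
        self.records = []

    def capture(self, case_id, recommendation, correction, model_version):
        self.records.append({
            "case_id": case_id,
            "recommendation": recommendation,   # what the model said
            "correction": correction,           # what the clinician said instead
            "model_version": model_version,
            "captured_at": datetime.now(timezone.utc).isoformat(),
        })

    def training_pairs(self):
        # Replay corrections as (model output, corrected label) pairs
        # for the next retraining pass.
        return [(r["recommendation"], r["correction"]) for r in self.records]

log = FeedbackLog()
log.capture("case-042", "pneumonia", "pulmonary embolism", "v1.3")
```

Keeping the original recommendation in the same record as the correction is what makes the correction usable as a supervised signal rather than a free-floating note.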

Results

  • Diagnostic errors fell 20% in the AI-augmented branch versus the baseline across the 1,000+ simulated cases.
  • The feedback loop produced measurable per-iteration improvement on the simulated cohort, validating the design choice to keep clinician corrections as a first-class training signal.
  • Every recommendation was traceable back to the inputs and the specific model version that produced it, which is the table-stakes requirement for any real-world clinical deployment.
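The provenance requirement above amounts to a simple invariant: every recommendation carries its model version and feature inputs, so the full audit record is reconstructable from the recommendation alone. A minimal sketch, with hypothetical names and example values:

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class Recommendation:
    """Provenance envelope for a single diagnostic recommendation.

    Frozen so the record cannot be mutated after it is emitted:
    what the clinician saw is exactly what gets audited.
    """
    case_id: str
    diagnosis: str
    model_version: str
    feature_inputs: dict

def trace(rec: Recommendation) -> dict:
    # The complete audit record, derived from the recommendation itself.
    return asdict(rec)

rec = Recommendation(
    case_id="case-007",
    diagnosis="sepsis",
    model_version="v2.1",
    feature_inputs={"wbc": 15.2, "lactate": 4.1, "temp_c": 39.0},
)
```

Because the envelope travels with the recommendation rather than living in a separate log, provenance cannot silently drift out of sync with what was actually shown.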

Lessons

Decision-support systems do not fail on raw model accuracy — they fail on legibility and trust. The simulation harness ended up being more valuable than any single model improvement: it made it possible to A/B workflows (with-AI vs. without-AI) rather than just models, which is the comparison clinicians actually care about.