Once upon a time, at the end of a patient exam, doctors scribbled notes almost as an afterthought, for no one but themselves. In the modern medical era, that informal note has become the highly orchestrated electronic health record (EHR), used to manage care across specialties and institutions but also bearing on billing and malpractice lawsuits.
Primary care doctors now spend, on average, two hours a day filling out patients' electronic health records. From 2009 to 2018, the length of EHR notes increased by 60 percent. And yet most of this work is simply documentation, not problem-solving or reasoning. Why take doctors away from their patients, or hire other teams of humans, like medical scribes, to transcribe and enter all this data, when AI can now do it instead?
Indeed, new AI scribes are being piloted around the country, backed by companies including Microsoft (Nuance DAX), Amazon (HealthScribe), and others in a market worth more than $2 billion. And it's not just private hospitals; the U.S. Department of Veterans Affairs signed contracts for ambient scribe pilots with two companies, Nuance and Abridge, last summer. Some estimates suggest that as many as 30 percent of physician practices are already using the technology.
These AI systems all work similarly: they listen in on the conversation between a patient and physician, create a transcript, then organize the information from the transcript into the standard format doctors use, generating a note within minutes or even seconds of the clinical conversation. This is, for many medical professionals, promising, exciting, and nerve-racking. AI is powerful, but it is also known for being biased, for making untrue statements with great confidence, and for being trained on data of questionable quality, to say nothing of the possibility that models are not up to date with the newest medical literature and guidelines.
And yet, all these pitfalls—biases, incorrect interpretations, outdated guidelines—are human problems, too.
As flawed as AI notes may be, it's entirely possible that human-generated medical notes could be more flawed. One study conducted at the VA found that when compared with secret recordings of medical encounters, 90 percent of doctor-written medical notes contained at least one error. Another study found that 96 percent of speech recognition–generated notes contained errors; 42 percent of final signed notes still had errors. An emergency room–based study found that physical exams are often “documented” even if they didn't occur; only slightly above 50 percent of documented physical exams, by body system, were verified by an observer. And many issues raised by patients during visits don't make it into the medical record.
Compared with this status quo, AI scribes don't look so dangerous. Right now, there is already inaccurate information in medical records that is used to make clinical decisions, evaluate risk profiles, and develop new predictive algorithms. AI may reinforce biases, but those biases likely come from humans; if the human-authored texts on which the tools are based contain biased views of race or gender (which they often do), the resulting tools will as well.
None of this addresses the privacy risk of AI tools, which is serious. Recordings of your doctor's appointments may be stored on a server hosted by a third party and at risk for breach. But your medical data is already at high risk for breach. Since 2021, there have been more than 700 health data breaches each year. The 703 in 2024 alone affected more than half of the U.S. population (over 181 million people). And if you wanted to ask your hospital not to use your data for AI or for research, you likely can't. Sharing your data is a prerequisite for getting care at many, if not most, hospitals. The primary U.S. law governing health data, HIPAA, may be long overdue for an update, but AI tools for documentation aren't really going to make what is already a bad situation any worse.
AI may be just as (un)reliable as our current methods of medical documentation, but it is also much, much faster. This speed can bring about many benefits. It may allow doctors more time with their patients. It may decrease doctors' after-hours workload. It may help reduce rates of burnout among medical professionals, which the pandemic pushed to all-time highs. These are all good things. If AI can help doctors get back to the point of doctoring—focusing on their patients rather than documentation—and the notes it produces are no worse than what we already have, then that would be a step forward.