Automated medical scribe
Automated medical scribes are tools that transcribe medical speech, such as patient consultations and dictated clinical notes. These tools produce summaries of consultations as well, aiming to reduce the administrative burden on clinicians and improve efficiency in documentation improvement|documentation]. Automated medical scribes based on Large Language Models became increasingly popular in 2024.
The privacy protections of automated medical scribes vary widely. While it is possible to do all the transcription and summarizing locally, with no connection to the internet, most closed-source providers require that data be sent to their own servers, securely processed, and the results sent back. Some retailers use zero-knowledge encryption. Select AI scribes do not use patient data to train their AIs, or rent or resell it to third parties. Meanwhile, few providers have published safety or utility data in academic journals, and are actually responsive to requests from medical researchers studying their products.
Privacy
Some providers unclear about what happens to user data. Some may sell data to third parties. Some explicitly send user data to for-profit tech companies for secondary purposes, which may not be specified. Some require users to sign consents to such reuse of their data. Some ingest user data to train the software, promising to anonymize it; however, deanonymization may be possible. It is intrinsically impossible to prevent an LLM from correlating its inputs; they work by finding similar patterns across very large data sets. Some information on the patient will be known from other sources. The software may correlate such information with the "anonymized" clinical consultation record, and, asked about the named patient, provide information which they only told their doctor privately. Because a patient's record is all about the same patient, it is all unavoidably linked; in very many cases, medical histories are intrinsically identifiable. Depending on how common a condition and what other data is available, K-anonymity may be useless. Differential privacy could theoretically preserve privacy.Data broker companies like Google, Amazon, and Microsoft have produced or bought up medical scribes, some of which use user data for secondary purposes, which has led to antitrust concerns. Transfer of patient records for AI training has, in the past, prompted legal action.
Open-source programs typically do all the transcription locally, on the doctor's own computer. Open-source software is widely used in healthcare, with some national public healthcare bodies holding hack days.
Data resale and commercialization
Several automated medical scribe providers include terms in their service agreements that allow the reuse, sale, or commercialization of de-identified or user-submitted data. Although such data are generally described as anonymized or aggregated, these practices have raised ethical concerns among clinicians and privacy advocates regarding secondary uses of medical information beyond clinical documentation.Freed, an AI transcription and scribe platform, states in its Terms of Use that it may “collect, use, publish, disseminate, sell, transfer, and otherwise exploit” de-identified and aggregated data derived from user inputs.OpenEvidence similarly states that it may “collect, use, transfer, sell, and disclose non-personal information and customer usage data for any purpose including commercial uses.”Doximity, which offers an AI-enabled medical scribe as part of its physician platform, grants itself a “nonexclusive, irrevocable, worldwide, perpetual, unlimited, assignable, sublicensable, royalty-free” license to “copy, prepare derivative works from, improve, distribute, publish,... analyze, index, tag, commercialize” content submitted by users, subject to its Privacy Policy.Because these terms allow broad secondary use—including sale, licensing, model-training, derivative works, and commercial exploitation of de-identified or user-submitted data—some commentators have recommended that clinicians review data-handling provisions carefully when adopting AI-scribe tools, particularly in clinical environments where patient privacy and regulatory compliance are critical.
Encryption
Multifactor authentication for access to the data is expected practice.Typically, Diffie–Hellman key exchange is used for encryption; this is the standard method commonly used for things like online banking. This encryption is expensive but not impossible to break; it is not generally considered safe against eavesdroppers with the resources of a nation-state.
If content is encrypted between the client and the service provider's remote server, then the server has an unencrypted copy. This is necessary if the data is used by the service provider. Zero-knowledge encryption implies that the only unencrypted copy is at the client, and the server cannot decrypt the data any more easily than a monster-in-the-middle attacker.
Platforms
Scribes may operate on desktops, laptop, or mobile computers, under a variety of operating systems. These vary in their risks; for instance, mobiles can be lost. The underlying mobile or desktop operating systems are also part of the trusted computing base, and if they are not secure, the software relying on them cannot be secure either.Some AI medical scribe platforms are designed to operate as cloud-based applications that generate structured clinical documentation from clinician–patient conversations. These systems may offer features such as real-time transcription, document generation, and integration with electronic health record systems.
Examples include Heidi Health, an AI healthcare technology company whose software is used to produce clinical notes and related documentation from recorded consultations. Like other digital health tools, these platforms must address data security, privacy requirements, and compatibility with existing health information systems.
Confabulation, omissions, and other errors
Like other LLMs, medical-scribe LLMs are prone to confabulation, where they make up content based on statistically associations between their training data and the transcription audio. LLMs do not distinguish between trying to transcribe the audio and guessing what words will come next, but perform both processes mixed together. They are especially likely to take short silences or non-speech noises and invent some sort of speech to transcribe them as.LLM medical scribes have been known to confabulate racist and otherwise prejudiced content; this is partly because the training datasets of many LLMs contain pseudoscientific texts about medical racism. They may misgender patients. A survey found that most doctors preferred, in principle, that scribes be trained on data reviewed by medical subject experts. Relevant, accurate training data increases the probability of an accurate transcription, but does not guarantee accuracy. Software trained on thousands of real clinical conversations generated transcripts with lower word error rates. Software trained on manually-transcribed training data did better than software trained with automatically transcribed training data.
Autoscribes omit parts of the conversation classes as irrelevant. The may wrongly classify pertinent information as irrelevant and omit it. They may also confuse historic and current symptoms, or otherwise misclassify information. They may also simply wrongly transcribe the speech, writing something incorrect instead. If clinicians do not carefully check the recording, such mistakes could make their way into their medical records and cause patient harms.
Patient consent
Professional organizations generally require that scribes be used only with patient consent; some bodies may require written consent. Medics must also abide by local surveillance laws, which may criminalize recording private conversations without consent. Full information on how data is encrypted, transmitted, stored, and destroyed should be provided. In some jurisdictions, it is illegal to transmit the data to any country without equivalent privacy laws, or process or store the data there; vendors who cannot guarantee that their products won't illegally send data abroad cannot be legally used.Some vendors collect data for reuse or resale. Medical professionals are generally considered to have a duty to review the terms and conditions of the user agreement and identify such data reuse. General practices are generally required to provide information on secondary uses to patients, allow them to opt out of secondary uses, and obtain consent for each specific secondary use. Data must only be used for agreed-upon purposes.
Technology and market
The medical scribe market is, as of 2024, highly competitive, with over 50 products on the market. Many of these products are just proprietary wrappers around the same LLM backends, including backends whose designers have warned they are not to be used for critical applications like medicine. Some vendors market scribes specialized to specific branches of medicine. Increasingly, vendors market their products as more than scribes, claiming that they are intelligent assistants and co-pilots to doctors. These broader uses raise more accuracy concerns. Extracting information from the conversation to autopopulate a form, for instance, may be problematic, with symptoms incorrectly auto-labelled as "absent" even if they were repeatedly discussed. Models failed to extract many indirect descriptions of symptoms, like a patient saying they could only sleep for four hours.LLMs are not trained to produce facts, but things which look like facts. The use of templates and rules can make them more reliable at extracting semantic information, but "confabulations" or "hallucinations" are an intrinsic part of the technology.
Pricing
With the exception of fully open-source programs, which are free, medical scribe computer programs are rented rather than sold. Monthly fees vary from mid-two figures to four figures, in US dollars. Some companies run on a freemium model, where a certain number of transcriptions per month are free.Scribes that integrate into Electronic Health Records, removing the need for copy-pasting, typically cost more.
Fully open-source scribes provide the software for free. The user can install it on hardware of their choice, or pay to have it installed. Some open-source scribes can be installed on the local device or on a local server. They can typically be set not to send any information externally, and can indeed be used with no internet connection.
Impact in healthcare
AI medical scribes are transforming the healthcare industry by directly addressing some of the most pressing challenges clinicians face, especially the administrative burden that contributes to burnout.Reducing clinician burnout
One of the most significant impacts of AI scribes is their ability to alleviate the overwhelming documentation workload that healthcare professionals face. By automating the transcription and summarization of consultations, AI scribes free up valuable time that clinicians would otherwise spend on administrative tasks. Studies have shown that the average clinician spends a significant portion of their workday on documentation, leading to fatigue and diminishing patient interaction. For example, in the UK's largest clinical rollout of ambient AI, 4 in 5 GPs using the tool said it saved them time, with the same number reporting that it enabled them to build a better rapport with patients.By automating these repetitive tasks, AI scribes create a healthier work-life balance for clinicians, allowing them to focus on patient care and reduce after-hours charting. This reduction in administrative burden directly contributes to lower levels of stress and burnout, a concern that has been exacerbated in healthcare settings in recent years. The ability to offload routine documentation tasks helps clinicians reclaim their time and mental energy, leading to improved overall job satisfaction.