Teaching Machines to Read

Unlocking the potential of the electronic health record with Natural Language Processing

by: Mary Mohr

Artificial intelligence, also termed AI, is a phrase most commonly associated with technology companies like Google and Microsoft, but AI also has the capacity to revolutionize health care, and MUSC researchers are paving the way.

One of these researchers, Stephane Meystre, M.D., Ph.D., SmartState Chair of the Translational Biomedical Informatics Center for Economic Excellence, is developing commercial products that use AI to gather patient data more efficiently. His company, Clinacuity, applies a type of AI known as Natural Language Processing (NLP) to health care tasks such as analyzing clinical notes and de-identifying patient data.

Currently, only structured data, or information that is entered into data fields in the electronic health record (EHR), can be easily read by computers. NLP algorithms are trying to unlock the rich information in the clinical narrative. By turning unstructured text into structured data, these algorithms enable a computer to “read” the extra information written in clinical notes.

For example, CliniWhiz, one of the products being developed by Clinacuity, attempts to solve the problem of confusing health records. Multiple doctors may see a single patient, and each will write notes in the EHR about what allergies or comorbid conditions the patient has and what medications he or she is taking. Such information is critical for preventing medical errors, such as prescribing a patient a drug to which he or she is allergic.

Although structured data fields exist for allergies and comorbid conditions, physicians do not always use them because they are time-consuming to complete. Instead, they document this critical information in the notes, which until now have been unreadable by computers. By using NLP algorithms to look for mentions of allergies or conditions in the clinical notes, CliniWhiz accesses once-hidden patient information and, through a dashboard, makes it immediately visible to any physician accessing that patient’s record in the EHR.

“CliniWhiz uses natural language processing to extract different types of key patient information,” said Meystre. “It makes it easy for the provider or the physician then to use this information and add it to a dashboard that summarizes that information.” 
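
To make the idea concrete, here is a minimal Python sketch of turning a free-text note into a structured allergy list. It is an illustration only: the article does not describe how CliniWhiz works internally, and the pattern, the function name extract_allergies, and the sample note are all hypothetical. A production NLP system would also have to handle negation ("denies allergy to latex"), misspellings, and abbreviations, which this simple rule does not.

```python
import re
from typing import List

# Hypothetical rule for illustration; not CliniWhiz's actual method.
# Matches phrases such as "allergic to penicillin and sulfa".
ALLERGY_PATTERN = re.compile(
    r"\b(?:allergic to|allergy to|allergies to)\s+"
    r"([A-Za-z]+(?:\s+and\s+[A-Za-z]+)*)",
    re.IGNORECASE,
)


def extract_allergies(note: str) -> List[str]:
    """Return the allergens mentioned in a note, lowercased, without duplicates."""
    found: List[str] = []
    for match in ALLERGY_PATTERN.finditer(note):
        # Split "penicillin and sulfa" into separate allergens.
        for allergen in re.split(r"\s+and\s+", match.group(1)):
            allergen = allergen.lower()
            if allergen not in found:
                found.append(allergen)
    return found


# Made-up sample note, not real patient data.
note = (
    "Patient is allergic to penicillin and sulfa. History of asthma. "
    "She also reports an allergy to latex."
)
print(extract_allergies(note))  # ['penicillin', 'sulfa', 'latex']
```

The resulting list is the kind of structured data a dashboard could display next to the coded allergy fields, so a physician would not have to reread every note to find it.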

But NLP has more to offer than just decoding clinical notes. It also has the potential to do a better job of de-identifying EHR data, an essential first step in unlocking these data for research. For example, EHR data could be used to study how well subsets of patients with a given disease respond to various treatments, according to Meystre.

“We are trying to create a learning health system, where data on the care given to patients are then used, analyzed, and interpreted to drive improvements in health care quality,” said Meystre.

CliniDeID, another product being developed by Clinacuity, offers a novel, more efficient and more effective way to de-identify patient information in clinical notes in the EHR. CliniDeID is unique in that it does not just delete identifying information but replaces it with a randomly generated name or place, making inadvertent reidentification virtually impossible. It finds identifiers using language patterns, such as the title Mr., Mrs., or Ms. before a name, and substitutes a randomly generated surrogate. For example, Mr. Robertson could become Mr. Smith. With CliniDeID, the time needed to de-identify a record is reduced from 13 minutes to less than a second, and sensitivity, the share of identifiers that are correctly detected, is increased to 99%.
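
The title-based replacement described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not CliniDeID's implementation: the surrogate list, the pattern, and the function name replace_titled_names are hypothetical, and reusing the same surrogate for a repeated surname within one note is a design choice made here so the text stays readable.

```python
import random
import re

# Hypothetical surrogate list; a real system would draw from a much larger one.
SURROGATE_SURNAMES = ["Smith", "Jones", "Garcia", "Lee", "Patel"]

# A title (Mr., Mrs., or Ms.) followed by a capitalized surname.
TITLED_NAME = re.compile(r"\b(Mr|Mrs|Ms)\.\s+([A-Z][a-z]+)")


def replace_titled_names(text: str, rng: random.Random) -> str:
    """Replace each surname that follows a title with a random surrogate."""
    mapping = {}  # real surname -> surrogate, so repeats stay consistent

    def substitute(match):
        title, surname = match.group(1), match.group(2)
        if surname not in mapping:
            # Never "replace" a surname with itself.
            choices = [s for s in SURROGATE_SURNAMES if s != surname]
            mapping[surname] = rng.choice(choices)
        return f"{title}. {mapping[surname]}"

    return TITLED_NAME.sub(substitute, text)


# Made-up sample note, not real patient data.
sample = "Mr. Robertson was seen today. Mrs. Robertson accompanied him."
print(replace_titled_names(sample, random.Random()))
```

A rule this narrow would miss names that appear without a title, which is why it should be read only as an illustration of the surrogate idea.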

“CliniDeID is probably the most accurate clinical text de-identification system currently available, and we are making it available here, to support the MUSC research enterprise,” explained Meystre.

AI has great potential for advancing patient care and helping MUSC investigators perform vital medical research.

Diagram of how a trial eligibility surveillance system works

One potential application of NLP is to extract information from clinical narratives, automatically identify patients eligible for clinical trials, and notify their care team members early to enhance clinical trial enrollment. Figure courtesy of Dr. Meystre.