ClinicalBERT: Unlocking Patient Clinical and Business Insights from Electronic Health Records
- Manoj Bapat

- Sep 18, 2022

Main idea:
Electronic Health Records (EHRs) store patient medical history, such as demographics, vitals, progress notes, prescriptions, and lab reports, along with administrative data such as billing and insurance claims. Because this data is a mix of structured fields and unstructured free text, it is hard to analyze as a whole and remains relatively underutilized. Breakthroughs in AI/ML, such as natural language processing (NLP) using transformer models, can help unlock the insights in EHRs, which can support a variety of clinical and business use cases.
Where models like ClinicalBERT come into the picture:
General-purpose large language models (LLMs) are trained on general corpora such as Wikipedia and cannot reliably interpret medical and clinical language. Because the unstructured data in EHRs is high-dimensional and sparse, the task becomes especially challenging. This is where a model such as ClinicalBERT can make a material difference in the accuracy of NLP predictions: it has been trained specifically on clinical and medical language using a domain-specific dataset, such as the EHRs of 58,976 unique hospital admissions from 38,597 patients of the Beth Israel Deaconess Medical Center.
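To make "high-dimensional and sparse" concrete, here is a minimal Python sketch. It is not taken from ClinicalBERT or any EHR system. The note snippets are made up for illustration, and the whitespace tokenizer is a stand-in for the subword tokenizers real models use. It builds a simple term-count matrix and reports how many cells are empty.

```python
# Toy, made-up note snippets for illustration only (not real patient data).
notes = [
    "pt c/o sob and chest pain, hx of chf",
    "afebrile, bp stable, continue lasix 40 mg",
    "s/p cabg, wound clean, ambulating with assist",
    "denies chest pain, lungs clear, d/c home",
]


def tokenize(text):
    # Naive whitespace split after dropping commas (an assumption for this sketch).
    return text.replace(",", " ").split()


# Every distinct term becomes one dimension of the representation.
vocab = sorted({tok for note in notes for tok in tokenize(note)})
index = {tok: i for i, tok in enumerate(vocab)}

# One row per note, one column per vocabulary term, holding term counts.
matrix = [[0] * len(vocab) for _ in notes]
for row, note in zip(matrix, notes):
    for tok in tokenize(note):
        row[index[tok]] += 1

# Sparsity is the fraction of cells that are zero.
nonzero = sum(1 for row in matrix for value in row if value)
sparsity = 1 - nonzero / (len(notes) * len(vocab))
print(f"{len(notes)} notes x {len(vocab)} terms, {sparsity:.0%} of cells are zero")
```

Even with four short notes, most terms appear in only one note, so most of the matrix is empty. Real clinical notes have far larger vocabularies, full of abbreviations and shorthand, which makes the effect much stronger.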
What to look out for:
Expansion of research in academic settings on clinical applications of transfer learning
Moves by big tech companies such as Amazon, Google, and Meta
Open sourcing of large language clinical models that can drive focused innovation on
Find out more:



