Informatics Research Seminar: Natural Language Processing

 November 14 @ 4:00 – 5:00 pm


Speakers: Emily Pfaff and Ashraf Farrag, MSIS
Presented from UNC-CH

Broadcast Link: Seminar



Screening patients for lung cancer nodules is important for both retrospective and prospective clinical research and operations. This presentation describes how teams at the North Carolina Translational and Clinical Sciences (NC TraCS) Institute and the Carolina Data Warehouse for Health (CDW-H) applied Natural Language Processing (NLP) to the free text of radiology reports to screen for lung cancer nodules. The presentation focuses on the implementation of the NLP infrastructure and on the challenges of translating the requirements envisioned by the clinician into a set of specifications suitable for programmatic information retrieval. Issues encountered in validating the NLP results will also be addressed.


Emily Pfaff is a research analyst at the NC TraCS Institute with the Carolina Data Warehouse. She earned a B.A. in Russian and Eastern European Studies from Wesleyan University, and a master’s in Information Science from UNC Chapel Hill. In addition, she holds a Certificate in Clinical Informatics from UNC Chapel Hill.

Ashraf Farrag is a research analyst at the NC TraCS Institute with the Carolina Data Warehouse – Health group. One of his specialties is working with natural language processing infrastructure to extract structured information from free-text provider reports. He did his undergraduate coursework in Biology and Computer Science at UNC Chapel Hill and his graduate coursework in Information Science at the UNC School of Information and Library Science. He also holds a certificate in Health Informatics from the Carolina Health Informatics Program and has trained as a Nationally Registered Emergency Medical Technician.