September 2012 Newsletter



Transforming Healthcare Through Informatics and Analytics
2012 September
Volume 1, Issue 1
From DCHI…

This is the inaugural issue of the quarterly Duke Center for Health Informatics (DCHI) Newsletter, so a few comments about its purpose are in order. The Center is distinguished by a strong history in applied health and research informatics, a mission and vision of improving human health, a tight integration with Duke Health System operations and research programs, and an interdisciplinary curriculum and training environment. The Center is a home for multidisciplinary work, and strives to leverage informatics expertise from across Duke. The quarterly Newsletter is one way to keep the informatics community informed, and provides a means for sharing and coordinating the important work of this community. Thanks go to the people who have so graciously agreed to contribute; their names are included with each section. We are always open to comments about the Newsletter or suggestions for information to include. Please send comments or suggestions to the DCHI mailbox:

In cleaning out some files recently, I came across several interesting articles from years past that are wonderful and informative documentation of the rich history of Duke informatics. I have included one of these articles here. It describes the cardiovascular database developed at Duke with quotes from past and present Duke faculty. Titled “Learning by numbers,” it was written by Renee Twombly and published by the Duke News Service on 10 June 1994 in the Duke Dialogue, a faculty and staff newsletter that is the predecessor to today’s Working@Duke. I think you will find it both newsworthy and entertaining!

Ed Hammond, PhD

Director, DCHI




Health Level Seven – HL7

Health Level Seven® (HL7®) International, the global authority on standards for interoperability of health information technology, earlier this month announced its decision to make its standards available without cost under licensing terms effective early 2013. Dr. Ed Hammond, who is a founding member of HL7, was a member of the 2011-2012 Board of Directors making this landmark decision.

HL7 announced election results on 12 September 2012 for its Board of Directors at the 26th Annual Plenary and Working Group Meeting in Baltimore, MD.  Dr. Ed Hammond was re-elected Secretary for the 2012-2013 term.

Duke’s two HL7 Clinical Interoperability Council (CIC) co-chairs, Dr. Ed Hammond and Anita Walden, have been working with CDISC on approaches for developing data standards, the process for balloting, and a unified public comment process. CDISC and HL7’s CIC co-chairs met several times during the summer to discuss a collaborative approach for developing data standards and to prepare for the HL7 meeting in Baltimore that included the FDA in the data standards development discussion.


Upcoming Duke/UNC Seminars

All seminars are held from 4:00-5:00 pm in Hock Plaza (Ground Floor Auditorium),  2424 Erwin Rd, Durham.
Seminars at UNC are held in the UNC Health Sciences Library, Room 328, Chapel Hill.

Sept 19, 4:00 – 5:00 pm
Barbara Massoudi, MPH, PhD – BreathEasy:  A PHR Tool for Managing Asthma in an Underserved Population

Sept 26, 4:00 – 5:00 pm
Carol Hamilton, MD – PhenX Tool Kit

Oct 3, 4:00 – 5:00 pm
Dennis Schmidt, MS, CISSP – IT Security in a Decentralized Environment

Oct 10, 4:00 – 5:00 pm
Helena Ellis – Duke BioBank

Oct 24, 4:00 – 5:00 pm
Eric Eisenstein, DBA – Take Two Alerts and Call Me in the Morning

Oct 31, 4:00 – 5:00 pm
Emilie Lamb, MSPH – Implementing Electronic Laboratory Reporting to the North Carolina Division of Public Health for Attesting to Public Health Meaningful Use Objectives

Nov 14, 4:00 – 5:00 pm
Emily Pfaff, Ashraf Farrag, MSIS and Samuel Cykert, MD – Natural Language Processing

Nov 28, 4:00 – 5:00 pm
Salvatore Mungal – National Cancer Informatics Program (NCIP)

Nov 28, 4:00 – 5:00 pm
Stephanie Haas, PhD – Natural Language Processing

Upcoming Conferences

SCDM 2012 Annual Conference
Sept 22-25, Los Angeles, CA

Healthcare Data Warehouse Association (HDWA)
October 2-4, Ottawa, Ontario

AMIA 2012 Annual Symposium
Nov 3-7, Chicago, IL

Data Warehouse and Enterprise (reported by Howard Shang)

DEDUCE (Duke Enterprise Data Unified Content Explorer)
DEDUCE v4.3 will be released in mid-November with a major new component: geospatial visualization.

Users will benefit from:

  • More accurate county and zip code data
    • New automated ability to correct for misspellings in original data
    • Verification process means that data are more dependable
    • Counties are accurately plotted, not based on weights of zip codes
  • New ability to easily join demographic and socioeconomic Census data to each patient through use of the Federal Information Processing Standard (FIPS) code
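As a rough illustration of the FIPS-based join mentioned in the list above, the following Python sketch attaches county-level Census attributes to patient records using a five-digit county FIPS code as the key. It is not DEDUCE code: the field names (`county_fips`, `median_income`), the placeholder FIPS codes, and all sample values are assumptions invented for the example.

```python
def normalize_fips(code):
    """Return a county FIPS code as a zero-padded five-character string."""
    return str(code).strip().zfill(5)


def join_census(patients, census_rows):
    """Left-join census attributes onto patient records by county FIPS code."""
    census_by_fips = {normalize_fips(row["fips"]): row for row in census_rows}
    joined = []
    for patient in patients:
        key = normalize_fips(patient["county_fips"])
        merged = dict(patient)
        merged["county_fips"] = key
        # Copy every census attribute except the join key itself.
        for name, value in census_by_fips.get(key, {}).items():
            if name != "fips":
                merged[name] = value
        # Flag unmatched patients so they can be reviewed rather than dropped.
        merged["census_matched"] = key in census_by_fips
        joined.append(merged)
    return joined


# Hypothetical sample data; codes and values are placeholders, not real figures.
patients = [
    {"patient_id": "P1", "county_fips": "99001"},
    {"patient_id": "P2", "county_fips": 123},      # stored as a number, leading zeros lost
    {"patient_id": "P3", "county_fips": "99999"},  # no matching census row
]
census_rows = [
    {"fips": "99001", "median_income": 50000},
    {"fips": "00123", "median_income": 45000},
]

result = join_census(patients, census_rows)
for row in result:
    print(row)
```

Normalizing the key before the join matters because FIPS codes are identifiers rather than numbers; a code stored as an integer silently loses its leading zeros and would otherwise fail to match.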

Work also proceeds on integrating data from Epic into DEDUCE, which will continue to contain historic data as far back as 1996 and add new availability of Maestro Care data. Epic Healthcare is the software company providing the platform for the integrated electronic health record (EHR) being deployed within Duke Medicine. The Duke-specific implementation of the Epic platform is called Maestro Care. The first implementation of Maestro Care occurred in July with the Wave 1 ambulatory go-live. The next milestone will occur on 10 October with the Wave 2 go-live.

Analytic Workspace Server (AWS)

Since mid-summer, the Data Warehouse has been hosting a new service called the Analytic Workspace Server (AWS), a virtual desktop that provides access to analytic software and storage of data. The AWS allows investigators and analysts to perform their analyses within a secured environment, and removes the need to copy PHI or other sensitive data to potentially insecure environments outside of the DHTS firewall. The AWS has been fully configured with installations of analytical software (including SAS, R, and ArcGIS), and is actively being used.

HDWA Abstract Acceptances
All of the abstracts submitted to the Healthcare Data Warehouse Association (HDWA) by the Data Warehouse were accepted for the annual conference, to be held 2-4 October.

Presentation: “Geospatially Enabling Duke’s Enterprise Data Warehouse Tool”

Poster: “Connecting the Dots Between Text and Data”

Poster: “Conceptual Model for Research-Driven Data Marts”

In addition, Information Architect Stephanie Brinson was invited to join a roundtable discussion on “Research Data Warehousing” and Howard Shang was invited to lead a roundtable discussion on “New Data Delivery Methods.” Four members of the Data Warehouse team will travel to Ottawa, Ontario to participate in the conference.

See appendices for usage statistics for DEDUCE/Discern.


Westat Child Electronic Health Record project (reported by Anita Walden)

The Child Electronic Health Record project, funded by the Agency for Healthcare Research and Quality (AHRQ) (PI, J. Ferranti), is an initiative to improve EHR systems for children’s health. The project team created a requirements document that will be provided to vendors to facilitate EHR system development. Dr. Ken Gersing’s EHR development team produced two EHR prototypes using the Child EHR Format.  Dr. Constance Johnson performed usability testing with clinicians. The format is expected to become part of the meaningful use requirements for EHR systems. Currently the team is working on plans for obtaining public comment on the format requirements, and a maintenance and governance plan.


SCDM Informatics Education

To train data managers in clinical research informatics, Anita Walden worked with the clinical data management professional organization, the Society for Clinical Data Management (SCDM), to develop a clinical research data standards course. It covers national and international clinical research standards and data standards implementation. Anita taught the first online five-week, instructor-facilitated class over the summer.


The Human Studies Database (HSDB) Project (reported by Swati Chakraborty)

The HSDB project is a federated database for the design and results of human studies from multiple institutions. HSDB uses the Ontology of Clinical Research (OCRe) and common clinical vocabularies to standardize information collection and maintenance. Participants come from CTSAs at the University of California-San Francisco, Duke University, Johns Hopkins, Stanford, The Rockefeller University, Mayo Clinic, University of California-Davis, University of Texas Southwestern, University of Texas Health Science Center-San Antonio, and University of Washington, among others. For further details see HSDB wiki –

Duke University has led the administrative subgroup to produce a list of about 135 common administrative data elements identified from the CTSA participants’ Institutional Review Boards (IRBs) and Clinical Trial Management Systems (CTMS). Duke also led the harmonization with the Biomedical Research Integrated Domain Group (BRIDG) model of the semantics of protocol-driven clinical research. Release 3.1 of the model is publicly available for download. Along with other leaders like Johns Hopkins University and The Rockefeller University, Duke has participated in developing strategy for local implementation.

Work from the HSDB Project has been presented at national meetings including the TBI-CRI joint summit, the AMIA annual symposium, and the SCTS/ACRT meeting; all published materials are available from the HSDB wiki. A session titled “Ontology-Based Federated Data Access to Human Studies Information” will be presented in November at the AMIA 2012 Annual Symposium in Chicago. Swati Chakraborty from Duke is a co-author along with Ida Sim (University of California-San Francisco) and other CTSA participants.

Data Standards Projects (Controlled Vocabulary) (reported by Anita Walden)

Tuberculosis Data Standards

The ORISE (Oak Ridge Institute for Science and Education) has identified data elements for pediatric pulmonary and extrapulmonary TB. An ORISE fellow, the Duke project leader and the FDA project officer are meeting bi-weekly to review data elements and discuss overlap with adult TB data elements. A group of clinical stakeholders has agreed to participate on the Clinical Review Committee and will hold its first meeting in early October.

With help from an intern from the Master of Management in Clinical Informatics (MMCi) program, work has been completed on resolution of the critical errors for the model submitted to the National Cancer Institute (NCI) data element repository. The model has been submitted to the Enterprise Vocabulary Services team to create new data elements not currently in the thesaurus.

Anesthesiology Preoperative Domain Analysis Model (DAM)

The Anesthesiology working group had to withdraw their intent to ballot the preoperative data standards due to lack of resources to complete the changes to the data elements and the class model in time for the final content submission. The majority of the changes have now been made to the DAM; the goal is to ballot in January 2013.

Biorepository Data Elements

The biorepository data element development initiative, led by Helena Ellis and an oversight committee consisting of Mary Beth Joshi, Aenoch Lynn and Anita Walden, kicked off as a cross-campus effort in July. There are four enthusiastic working groups, each made up of a facilitator and an informatics leader, who are developing common data elements for the biorepository. A monthly facilitators-oversight committee meeting was held at the beginning of September to review the working groups’ progress. The informatics leaders also meet monthly to review the data elements identified and promote harmonization across the working groups.

FDA R24 Grants  
Duke is participating in several FDA projects.


Duke Office of Clinical Research (DOCR) (reported by Denise Snyder and Cory Ennis)


The Research Management Team (RMT) and the Clinical Research Support Office (CRSO) have merged into a new office called the Duke Office of Clinical Research (DOCR), now located on the 11th floor of the North Carolina Mutual Life Building. Several staff have been assigned new roles for DOCR. SBRs also have a new name: Clinical Research Units (CRUs). If you have a clinical research question or need assistance, please email

Data Security

DOCR continues to partner with the Information Security Office (ISO) on data security vulnerabilities. The ISO is planning to establish a project governance committee with input from multiple entities. This will enable a transparent process for establishing the policies and workflows used to manage security incidents. A representative from DOCR will participate in this governance process.

DOCR is working with OIT and SOM Finance to investigate an alternative method of securely capturing social security numbers directly into SAP for study participant payments and reporting.


REDCap will be utilized to request retrospective Research Data Security Plans (RDSPs) from studies started before 21 November 2011.  Approximately 2,000 surveys have gone out to researchers to date.  Additional details regarding the plan will be released via the CRU Research Practice Managers.

REDCap is being migrated to a 3-tier structure to enhance security. A proxy server will be available from the Internet and the REDCap application server will join the database server behind the PHI firewall. Traffic will be monitored and controlled as it flows through the proxy server. In addition, the database server is being migrated to a 64-bit environment to improve performance. Following these changes, the system will be upgraded to the latest stable version of the software.

Results of the Duke Information Security Office (ISO) system penetration test on REDCap have been shared with the CTSA consortium members.

New procedures for registrations are being implemented for clinical trials where the investigator holds an IND/IDE for the study. Registrations for these studies will now list the Responsible Party as “Sponsor-Investigator”, meaning the investigator has the authority and responsibility to approve and release the registration record for publication. For current clinical trial registrations where the investigator holds the IND or IDE, the Responsible Party field is being updated and investigators are being contacted to make them aware of this change. For clinical trials not conducted under an IND or IDE, the Responsible Party field will continue to be listed as “Sponsor” and the Sponsor listed as “Duke University”. DOCR is posting three FTEs this week to assist study teams with results reporting.


DOCR is working with the Office of Corporate Research Collaborations (OCRC) to finalize a contract with ResearchMatch, a national registry to match prospective study volunteers with studies they might qualify for. Denise Snyder will serve as the primary contact for approving the initial registration of a Duke investigator, with backup from Cory Ennis and Shelly Epps. DOCR will work closely with DCRU to integrate ResearchMatch with DCRU subject recruitment.


eIRB 12.1 was released in July.  eIRB 12.2 is targeted for release at the end of September.  Planned updates include the ability to edit previously entered Research Data Security Plans (RDSPs), a question asking if the study will utilize biospecimens, branding updates to reflect the change of organization to DOCR, and various application fixes. DOCR IT staff has implemented in-depth system monitoring to track and address ongoing performance issues with the application.

In its first step to address the need for verified disaster recovery capabilities, DOCR IT staff now has a staging instance online and has instituted server mirroring to copy the production environment to a backup environment.

For detailed metrics, see Metrics for Data and Research Management Support.


Velos (reported by Cory Ennis, Denise Snyder, Matt Gardner, Julie Eckstrand)

A new test environment has been created and Velos v9.0 installed for evaluation and testing. Discussion regarding future direction of the CTMS is ongoing.  The DOCR IT team is working with Velos developers to address data requirements needed to potentially retire the Subject Billing Registry legacy application. New code has been delivered and tested. Velos is working on issues identified by DOCR IT. Once the technical solution is functioning as expected, DOCR will work with business stakeholders currently utilizing the Subject Billing Registry to address their needs.

Development of a technology solution to support electronic acquisition of early phase research data and operational tasks at the Duke Clinical Research Unit (DCRU), which began in 2010, is underway as a Phase 1 suite of applications.  A Phase 1 Core product will allow for deconstruction/translation of a protocol (electronic source document) and will interact with volunteer management, scheduling and resource management of research unit space, equipment and personnel, biospecimen chain of custody management, and ultimately a comprehensive electronic study plan/source document.  The system will leverage existing functionality and core services built within the eResearch CTMS, and include custom built pieces which cover the functionality gaps. It interacts with larger Duke IT infrastructure such as the Master Patient Index, eIRB, and Duke eResearch.  Over time information will be pulled from and pushed to other IT systems as required for operational efficiency, including communicating with the EHR (EPIC Research), increased interactions with the CTMS (eResearch), and many other research IT systems.

DCRU is part of a Global Proof of Concept Network (Singapore IMU and MDRI in India), and expects to share protocol configuration with its global partners. This will allow for real-time access to integrated, multi-site, early phase clinical trial data by sponsors and study teams, global alerting, remote sponsor monitoring, and construction of a global repository of data for secondary research purposes.

Duke Bioinformatics Shared Resources (DBSR) (reported by Salvatore Mungal)

GenePattern and caArray were deployed in 2010 and are widely used to meet the data storage and data sharing needs of Duke Cancer Institute (DCI) investigators with high-dimensional gene expression datasets. The DCI Bioinformatics unit has launched an aggressive campaign to build GenePattern modules to meet the requirements of reproducible research.  Currently, a total of 22 new GenePattern modules have been developed, ranging from data management and preprocessing to data analysis and visualization. The modules have been piloted for analysis of existing mRNA microarray datasets in breast and ovarian cancer from several DCI investigators, and externally from The Cancer Genome Atlas (TCGA) project.

A URL has been established and security clearance obtained to use a special port outside the firewall so that the RProteomics software application can be accessed through caGrid. Implementation and deployment of the software to the training grid is underway. When implementation is complete, data, translation, and analytical grid services for proteomics mass spectrometry, using provenance and the latest security measures, will facilitate reproducible research. It will also extend the functionality of existing grid technology to make sharing data and applications tractable.

These applications use NCI’s caBIG® syntactic, semantic and data standards to facilitate interoperability. They will use the National Cancer Informatics Program’s (NCIP) formal standards when they become available.

Next steps will include evaluation of existing data standards, recommendation for omics-related data elements to be used in the context of the Duke Biobank, and recommendation of standards to be used for the MIDR and other types of initiatives that fall along a spectrum of required structure and rigor.

Enterprise Biobanking Informatics (reported by Helena Ellis)

The Duke Biobank continues to work towards obtaining institutional funding to support implementation and deployment of LabVantage as Duke’s enterprise-wide Biobank Information Management System, to be used by all Duke clinical researchers who use or hold human biological specimens. A detailed 5-year business plan, which includes implementation and migration costs for the major biobanking groups at Duke, was reviewed by DHTS’s financial group. The proposal has been reviewed and approved by the DukeQuEST committee (Quality and Excellence in Scientific Translation, formerly TMQF), and a request for funding will be brought to the Chancellor’s office in September 2012. Additional information and links to resources can be found on the Duke Biobank website.

Index of Biospecimens

The Duke Biobank Index of Biospecimens, also known as “the Index”, is a simple web-based searchable database for Duke researchers to look for biospecimen collections that may be available for collaboration at Duke. FAQs for the Index are available on the Duke Biobank website. The Duke Biobank is actively working with many researchers to register more collections in the Index and will assist biobanks or investigators who have collections they wish to register. Please contact the biospecimen-expediter for details.

Institutional Biobanking Terminology

On 13 July 2012 the Duke Biobank and members of the Biobank Informatics Working Group launched an initiative to establish standard biobanking terminology. This biobanking terminology, which will focus on essential biobanking terms and data elements, will be implemented in the LabVantage system during the deployment of the system and will be made available to all biobanking groups. The standardization of biobanking terminology will help promote semantic interoperability across all biobanking and clinical research groups at Duke. Terminology working groups consisting of individuals with biobanking expertise have been formed and are meeting regularly. The project is expected to last up to one year. Interested parties should contact Helena Ellis for more information.

MURDOCK Informatics (reported by Jessie Tenenbaum, Doug Wixted, and Anita Walden)

MIDR i2b2 deployment

A proof-of-concept i2b2 instance is being developed for the MIDR to evaluate i2b2 as a technology solution moving forward. Loading metadata into the i2b2 instance has been completed. Additional progress was largely on hold while the developer’s effort was needed on other tasks, but work resumed on 5 September. The first task will be loading a 1,000-row selected subset of registry data. Next steps after that will include review of the metadata structure by the MIDR team, design and implementation of a streamlined process for future data transfer, and performance testing with a more realistic number of records.

Omics data

Several separate but related initiatives are coming together at an opportune time for next steps in incorporating omic data into the MIDR:

  1. The Biobank Standard Terminology Project is underway, including a group focused on omic data.
  2. Some Horizon 2 projects are beginning to think about genomic data.
  3. The CTSA Omics Data Standards Working Group has completed preliminary work and is ready to evaluate existing data standards in depth in a specific omic area. This is likely to be transcriptomics, but the choice will depend in part on the needs of H2 investigators.
  4. Two potential projects under consideration involve secondary analysis of an existing genomic dataset, providing a real-world use case to guide requirements.


A revised version of the MURDOCK Study Governance document has been completed based on input from a number of stakeholders. It was distributed to MURDOCK Leadership and will be discussed in an upcoming Leadership meeting. This document describes in detail the orchestration of people, processes, technology, and policy to leverage, optimize, and maximize the value of both MURDOCK data and other aspects of the Study, including biospecimens, infrastructure, and human expertise. In particular, two key areas the document addresses are data governance (how data is generated/acquired, stored, and accessed using role-based permissions) and MURDOCK Study research proposals (how sub-studies are to be proposed, reviewed, approved, and executed).

In parallel, a distinct IRB Protocol for the MIDR itself (as opposed to references to it in the Horizon 1 and Horizon 1.5 protocols) has been developed and is being submitted for IRB approval.  Having this separate protocol is necessary to enable use of the MIDR (which holds a limited data set) for review preparatory to research.

MURDOCK Registry Web Enrollment

The MURDOCK Registry development team is working to provide Registry participants with the option of completing their enrollment forms online and submitting them to the Kannapolis office prior to their enrollment visit. Completing enrollment forms and any follow-up forms via the web is expected to increase enrollment and improve follow-up compliance.

Progress on the web enrollment functionality completed since the last iteration includes:

  • Email Template Prototypes – The team demonstrated several prototypes for the email notifications to the web users. These templates can be modified by the Kannapolis study office.
  • Completed Nightly Processor Web Enrollment Form Version Change Notification – When a new version of the form is ready to be released, a notification email informs participants that they have a set number of days to complete and submit their current version of the form before it is deleted and replaced with the new version.
  • Web Participant Management Tool Implementation – A tool for administrative users to manage web participant accounts and forms.
  • New Form Data Elements – Alternate contact data elements have been added to the Registry system.
  • Reason for Change for the Web Enrollment form review – This functionality was completed for forms submitted by the participant and reviewed by the study staff during a visit with the participant.
  • Import physical addresses and produce GIS data extract – (partially complete).

It is anticipated that testing the developed items and evaluation of the tools needed to code the medications will take three weeks. Results will be reviewed by the team at their meeting on 18 September.

Usability Testing

The project team is reviewing the heuristics test results and the user acceptance test results. The findings will be ranked in order of severity and then discussed with the developers for implementation. Members of the public will also participate in the testing to ensure the tool is user friendly, intuitive, and provides the information needed. The Institutional Review Board documents are being reviewed by the Kannapolis office prior to being submitted for approval.


Horizon 2 Data Transfers

Access and procedures have been established to enable investigators and staff for the Severe Acne and Physical Performance Studies to access PDF copies of CRFs that Kannapolis study staff have scanned and placed on a secure FTP site. In addition to CRFs, files from Actigraph accelerometers (used in the Physical Performance Study) are being transferred in this manner.

Medication Coding

Systems have been identified and will be evaluated for coding the medications collected in the registry, a step needed for analysis and reporting.  Initial investigation has been done on the National Center for Biomedical Ontology (NCBO) tools using the RxNorm terminology for coding both legacy free text data and data entry moving forward.  A research side project will evaluate the accuracy rate for coding existing data and characterize the types of entries that cannot be mapped through the tool.

RxNorm, which is produced by the National Library of Medicine, has an annotator specifically used for coding medications; it is also being evaluated for accuracy, ease of use, and ease of implementation.
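The kind of evaluation described in this section, measuring how many free-text medication entries can be coded and characterizing those that cannot, could be prototyped along the following lines. This Python sketch is only an illustration under stated assumptions: it does not call the NCBO tools or any RxNorm service, the tiny lexicon is hypothetical, and the `CODE-…` values are placeholders rather than real RxNorm identifiers. It reports a mapping rate only; measuring accuracy would additionally require a manually reviewed reference set.

```python
import re

# Hypothetical lexicon: normalized drug name -> placeholder code.
# These are NOT real RxNorm identifiers.
LEXICON = {
    "aspirin": "CODE-0001",
    "metformin": "CODE-0002",
    "lisinopril": "CODE-0003",
}


def normalize(entry):
    """Lowercase, drop dose/strength tokens and punctuation, collapse spaces."""
    text = entry.lower()
    text = re.sub(r"\d+(\.\d+)?\s*(mg|mcg|g|ml|units?)\b", " ", text)
    text = re.sub(r"[^a-z\s]", " ", text)
    return " ".join(text.split())


def code_entry(entry):
    """Return the code of the first token found in the lexicon, else None."""
    for token in normalize(entry).split():
        if token in LEXICON:
            return LEXICON[token]
    return None


def evaluate(entries):
    """Code each entry; return coded pairs, unmapped entries, and mapping rate."""
    coded, unmapped = [], []
    for entry in entries:
        code = code_entry(entry)
        if code is None:
            unmapped.append(entry)
        else:
            coded.append((entry, code))
    rate = len(coded) / len(entries) if entries else 0.0
    return coded, unmapped, rate


# Invented sample entries resembling legacy free-text data.
entries = ["Aspirin 81 mg", "METFORMIN 500MG tab", "asprin 81mg", "fish oil"]
coded, unmapped, rate = evaluate(entries)
print(f"mapped {len(coded)} of {len(entries)} entries ({rate:.0%})")
print("unmapped:", unmapped)
```

Keeping the unmapped entries, rather than only counting them, is what allows misspellings, supplements, and other recurring failure types to be characterized afterwards.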

Electronic Health Record (EHR) Data

MURDOCK Study informaticians are working with the two major healthcare providers in Kannapolis/Cabarrus County, Carolinas Healthcare System (CHS; source of primary care for 55% of current participants) and Novant Health (source of primary care for 10% of participants) to share electronic health record (EHR) data for MURDOCK participants.

The CHS data warehouse and analytics team (DA2, formerly the Dickson Institute) and Duke Data Warehouse group, with support from the MURDOCK Study, are planning a visit to Duke so CHS can learn more about building the Duke enterprise data warehouse [i.e. Decision Support Repository (DSR)].

Novant leadership is interested in using Epic as a common EHR platform between Novant and Duke, leveraging the interoperability between systems to share data securely for common patients. This would facilitate integration of clinical data from Novant for the MURDOCK Study, and may provide a secure method to do so. Doug Wixted, Informatics Project Leader for the MURDOCK Study, is working with Keith Griffin, Novant CMIO (MURDOCK champion and lead contact) and Novant compliance personnel regarding a Data Use Agreement (DUA) and related privacy and security considerations. Evaluation of the Epic Care Everywhere application for the MURDOCK Study is in nascent stages; however, efforts to determine feasibility have been initiated from both sides.

Visit the DCHI website for additional information including informatics news, publications, conferences, faculty, education, and research.

Copyright © 2012 Duke Center for Health Informatics

Standardized Collection and Submission of Cardiovascular Endpoint Data




by Rebecca Wilgus

Building on the work of an FDA writing group chaired by Karen A. Hicks, MD, a multidisciplinary expert panel convened in 2011 to standardize cardiovascular endpoints (such as death and MI) to achieve computational interoperability. Representatives from the DCRI, ACC, and CDISC partnered to harmonize clinical definitions, represent the terms in UML and the CDISC Study Data Tabulation Model, and create a demonstration test dataset. The project was sponsored by FDA grant 1R24FD004411-01 and led by James E. Tcheng, MD (PI) and David Kong, MD (Co-PI). Robert Anderson, James Topping, Rebecca Wilgus, and Brian McCourt (DCRI), Maria Isler (ACC), and Steve Kopko and Amy Palmer (CDISC) completed the project team.

Next steps include publication of a CDISC Cardiovascular Therapeutic Area User Guide and an ACC-AHA Task Force on Clinical Data Standards manuscript. Future goals include publication as a controlled terminology in the NCI Thesaurus and balloting through HL7 in a future version of the CV Domain Analysis Model.

Tenenbaum Named Associate Editor of JBI




Jessica D. Tenenbaum, PhD, the Associate Director for Bioinformatics for the Duke Translational Medicine Institute, will join the Journal of Biomedical Informatics as an Associate Editor beginning in January 2014. Of the appointment, Dr. Tenenbaum said, “I’m very excited to take on this new role and honored to be joining the JBI editorial team. It’s a great opportunity to both observe and impact the literature in our field, and the peer review process itself.”

Upcoming DOCR Training

The December schedule of DOCR training sessions for research staff at Duke is listed below. Registration instructions are available on the DOCR website.

  • Research Data Integrity/Data Security — December 3 and December 19
  • Human Subjects Research at Duke — December 3 and December 19
  • Investigator Responsibilities — December 4
  • Urine Pregnancy Screening for Research — December 4
  • Biobanking Research Specimens — December 5
  • Industry Funded Clinical Research- Process for Contracts — December 9
  • Informed Consent — December 10
  • Study Documentation: Regulations and Best Practices — December 12
  • IRB Overview — December 17
  • Workshop: Results Entry — December 18


Validation and Submission of Data and Analysis to the Gene Expression Omnibus (GEO) Repository

by Salvatore Mungal

The GEO repository, hosted by the National Center for Biotechnology Information (NCBI), is a public functional genomics repository that accepts array- and sequence-based data and supports MIAME-compliant submissions. To confirm that a microarray analysis could be reproduced accurately before submission to GEO, the DCI Bioinformatics and Information Systems Shared Resource groups duplicated the analysis environment. They created multiple environments running the 64-bit Debian GNU/Linux operating system on local and networked virtual servers. In each one, they installed the same R version used in the original analysis and then loaded the required R packages at the exact version numbers used originally. The data were copied to each environment, and their integrity was confirmed by matching MD5 checksums against those of the original data. The Shared Resource groups have successfully validated and submitted their first study to GEO, with the validation confirmed in all of the duplicated environments.
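As a rough illustration of the checksum step described above, the following Python sketch builds an MD5 manifest for a data directory and compares it with the manifest from a duplicated environment. It is not the Shared Resource groups' actual tooling; the function names and the idea of a per-directory manifest are assumptions made for this example.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir):
    """Map each file path (relative to data_dir) to its MD5 digest."""
    root = Path(data_dir)
    return {
        p.relative_to(root).as_posix(): md5_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_manifests(original, replica):
    """List files that are missing, extra, or changed in the replica."""
    shared = set(original) & set(replica)
    return {
        "missing": sorted(set(original) - set(replica)),
        "extra": sorted(set(replica) - set(original)),
        "changed": sorted(n for n in shared if original[n] != replica[n]),
    }
```

Calling `compare_manifests(build_manifest(original_dir), build_manifest(replica_dir))` returns three empty lists when the copied data match the original byte for byte. Any non-empty list names the files to investigate before rerunning the analysis.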

NCBI’s goal is to advance science and health by providing access to biomedical and genomic information.

DHTS Staff Present Poster at Healthcare Data Warehouse Association Conference (HDWA 2013)

Michael Kahl, MBA, PMP, a Senior Data Analyst for Information Management & Enterprise Reporting at Duke Health Technology Solutions (DHTS), presented a poster titled “Traceability in Healthcare Data Sharing Projects Through the Use of Data Warehousing Artifacts: Methods from the Southeastern Diabetes Initiative (SEDI)” at the Healthcare Data Warehouse Association (HDWA) conference, held October 1-3, 2013 in Scottsdale, Arizona. The poster was authored by members of the SEDI Informatics team, including Kahl and fellow DHTS Information Management department staff members Felicia Dunston, Lori Morris, and Shelley Rusincovitch. It details the standard process the team follows to gather, analyze, profile, and evaluate data shared from the varying EHR platforms in use at the current data sharing partner sites, which are located in four counties across the southeastern United States (Cabarrus County and Durham County, NC; Mingo County, WV; Quitman County, MS).

The goal of the process is to provide a mechanism that drives traceability of the data sourced from each EHR platform and facilitates thorough analysis and profiling of the shared data during Phase 1 of the SEDI Datamart development. To meet this goal, each step of the process produces output documentation that serves as input documentation for the next step. The resulting artifacts ultimately serve as cross-references and reality checks during development of the SEDI Datamart. The SEDI Informatics team has determined that following the process and creating thorough documentation at each step (Source System Analysis Documentation, Data Dictionaries for the Data Extracts, and Data Submission Reports) will make the building of the SEDI Datamart more consistent across data sharing partners. The team also expects less duplication of effort when future data sharing partners are integrated into the datamart. The team believes the approach allows adequate flexibility and scalability while maintaining the stability needed to ensure data quality when integration is implemented in Phase 2 of the SEDI Datamart development.

Richesson Text Receives Record Number of Downloads

Clinical Research Informatics, edited by Rachel Richesson, PhD and James E. Andrews, PhD, has been recognized by the publisher (Springer) for garnering a record number of chapter downloads since its publication online in February 2012. This achievement placed the book in the top 25% of most-downloaded texts in Springer’s eBook collection for 2012. Print sales of the book are also doing well; at the request of the editors, royalties totaling roughly $650 were donated to support research and patient education programs for the Primary Ciliary Dyskinesia Foundation, a rare disease patient advocacy organization whose leaders and members are inspirational and engaged in national discussions to transform research in the U.S.

In a recent letter to colleagues associated with the text, Dr. Richesson said: “We hope that you all are proud of your association with this text, and want you to know that your contributions have influenced the training of thousands of research and informatics professionals, as well as improved the prospects of a small community of patients and underfunded advocates affected by a devastating and poorly understood lung disease.”

To purchase this text, please visit Springer’s website.

Online Questionnaires Now Available for MURDOCK Participants



The Measurement to Understand Reclassification of Disease of Cabarrus/Kannapolis (MURDOCK) Registry now includes an online system, integrated into the Registry in February 2014, that allows participants to complete their visit questionnaires online. The system is designed to assist office staff with study management by expanding the options for form completion, increasing registry enrollment, and reducing paper processing.

The new system distributes automated notices to participants to remind them to submit their forms using the online system prior to their enrollment visit, with email notification to study staff when a form is submitted. During a participant enrollment visit, staff can make changes with the participant’s approval. The forms can also be printed by participants if enrollment will take place at one of the study sites that does not have an internet connection.

The online system was a major undertaking, built from the ground up. The MURDOCK home page accessed by participants shows the completion status of their forms. They can manage their account information, and in the future will be able to complete follow-up forms online. The controlled release is currently in a 30-day pilot phase to identify any issues or enhancements that are needed prior to a full-scale release.

Development of the system was led by Julie Frund, DTMI. Workflows and revised processes for the study staff were facilitated by Kimberly Ellis, Kannapolis Clinical Data Specialist. Because the system is highly visible and used by the general public, a usability study was conducted last summer to improve the interface. The usability study team included Constance Johnson, PhD, MS, RN (Duke School of Nursing); Michelle Smerek (Kannapolis); and Anita Walden (DTMI).