Clinical Data Management and the Data Manager

Posted by Laura Dalfonso on Mon, Jan 12, 2015 @ 02:59 PM

What is Data Management?

Data Management is a term that encompasses various functions and applies within several industries. Within the field of research it is often referred to more specifically as Clinical Data Management. Data Management is an integral part of doing research and can best be described as a process for collecting, validating and reporting the data produced during a clinical trial or other type of research study. Highly effective Data Management is crucial for generating reproducible and reliable study results. The degree of Data Management will vary from one research effort to another, but all research efforts require some level of data management before the data is analyzed and published. How the data is to be collected, validated and reported is precisely outlined in a document called the Data Management Plan. This plan helps to ensure that data is collected and reported consistently across all sites participating in the research effort, and that the data is analyzed consistently.


What is the role of a data manager?

A data manager is an important member of the research team whose main priority is to ensure the integrity of the data that is generated for use during a clinical trial or other research effort. Data managers can be employed by pharmaceutical or medical device firms, as well as by contract research organizations. Some large hospitals or clinics may also hire data managers if their involvement in research is great enough to support the position(s). Data managers at hospitals and clinics often have other responsibilities as well, including direct patient care. For the remainder of this section, we will focus on the responsibilities of data managers employed by either pharmaceutical/medical device firms or by contract research organizations.

The data manager is charged with a variety of tasks related to developing the processes and procedures for the collection, validation and reporting of the data generated for use during a clinical trial or other research effort. Among these responsibilities are the development and processing of case report forms (CRFs), the identification and generation of necessary logic checks, and the writing and resolving of queries. Because much of a data manager’s time is spent performing one of these three tasks, we will take a deeper look at each of them.

Case Report Forms (CRFs) are forms used during a clinical trial or other research effort to collect and report the required subject level data. They can be in either electronic (eCRF) or paper (CRF) format. The number of CRFs used for a given research effort will depend on how much data is being collected, what types of data are being collected and how often data is being collected on each subject. It will also depend on how each form is designed and how much information is included on a single form. For example, one study may combine Medical History and the results of a Physical Exam on a single CRF, while another places this information on two separate forms.

Logic Checks, as indicated by their name, look for a logical pattern in the data that has been reported on the CRF/eCRF. They are also referred to as edit checks and are generated to identify errors in the data. The errors range from simple cases, such as missing data, to more complex issues, such as a lack of consistency between a data point reported on Form A and a related data point reported on Form B. For example, the subject’s gender is reported as Male on Form A, but the result of the screening pregnancy test on Form B is reported as negative instead of not applicable. Logic checks also identify data that is out of range, which means that the value is abnormal and not likely to be true. An example of this would be a subject who is reported as being 152 years old.
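
The three kinds of check described above (missing data, out-of-range values and cross-form consistency) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the form names, field names, the 0 to 120 age range and the `Finding` structure are all hypothetical and are not taken from any particular data management system.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    subject_id: str
    form: str
    field: str
    message: str


# Hypothetical study configuration, for illustration only.
REQUIRED_FIELDS = {
    "demographics": ["sex", "age"],
    "screening": ["pregnancy_test"],
}
AGE_RANGE = (0, 120)  # assumed plausible range in years


def run_edit_checks(subject_id, forms):
    """Return a list of Findings for one subject's forms (a dict of dicts)."""
    findings = []

    # 1. Missing data: every required field must have a value.
    for form_name, required in REQUIRED_FIELDS.items():
        record = forms.get(form_name, {})
        for field_name in required:
            if record.get(field_name) in (None, ""):
                findings.append(Finding(subject_id, form_name, field_name,
                                        "Required value is missing"))

    # 2. Out of range: a numeric age outside the plausible range.
    age = forms.get("demographics", {}).get("age")
    if isinstance(age, (int, float)) and not (AGE_RANGE[0] <= age <= AGE_RANGE[1]):
        findings.append(Finding(subject_id, "demographics", "age",
                                f"Age {age} is outside {AGE_RANGE[0]}-{AGE_RANGE[1]}"))

    # 3. Cross-form consistency: a male subject should have the
    #    pregnancy test reported as "Not applicable".
    sex = forms.get("demographics", {}).get("sex")
    test = forms.get("screening", {}).get("pregnancy_test")
    if sex == "Male" and test not in (None, "", "Not applicable"):
        findings.append(Finding(subject_id, "screening", "pregnancy_test",
                                "Expected 'Not applicable' for a male subject"))

    return findings


subject_forms = {
    "demographics": {"sex": "Male", "age": 152},
    "screening": {"pregnancy_test": "Negative"},
}
for finding in run_edit_checks("0007", subject_forms):
    print(finding.form, finding.field, finding.message)
```

Each finding would typically become the basis for a query to the site; how findings are routed is study specific and is not shown here.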

Writing and resolving queries is often the most time intensive part of a data manager’s job, especially early on in the data collection process. It is not unusual for several queries to be written on each CRF submitted for the first several subjects. Generally speaking, as the research effort progresses, the queries begin to decrease in number because the necessary corrections are made in the collection and reporting process. Although eCRFs cannot eliminate queries completely, they significantly decrease the number generated by preventing certain data from being entered, such as an age of 152 years, and by requiring in real time that all mandatory fields are completed.
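
As a rough illustration of how an eCRF can stop such data at the point of entry, the sketch below refuses to save a form until every mandatory field is filled in and every value is within its allowed range. The field names, limits and the `save_form` function are hypothetical; real eCRF systems configure these rules in their own ways.

```python
# Hypothetical field rules for one eCRF; names and limits are illustrative only.
FIELD_RULES = {
    "age": {"required": True, "min": 0, "max": 120},
    "weight_kg": {"required": False, "min": 1, "max": 400},
}


def validate_entry(entry):
    """Return a list of problems; an empty list means the form can be saved.

    Values are assumed to be numeric once they are present.
    """
    problems = []
    for field_name, rule in FIELD_RULES.items():
        value = entry.get(field_name)
        if value is None or value == "":
            if rule["required"]:
                problems.append(f"{field_name}: required field is empty")
            continue
        if not rule["min"] <= value <= rule["max"]:
            problems.append(
                f"{field_name}: {value} is outside {rule['min']}-{rule['max']}"
            )
    return problems


def save_form(entry, storage):
    """Store the entry only if it passes validation; otherwise return the problems."""
    problems = validate_entry(entry)
    if problems:
        return problems  # nothing is stored, the user must correct the entry
    storage.append(dict(entry))
    return []


saved_forms = []
print(save_form({"age": 152}, saved_forms))  # rejected, nothing stored
print(save_form({"age": 52}, saved_forms))   # accepted
```

Because the rejected entry never reaches the database, no query has to be written for it later.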

Querying advice for the data manager

When writing queries for research conducted using paper CRFs, double check the Site Number, Subject Number, CRF Title and Question Number before sending the query. Often there is an error in one of these, which makes the query difficult or impossible for the site to answer. Even if the site is able to determine the correct Site Number, Subject Number, CRF Title or Question Number, doing so takes time that most site staff do not have to spare. These errors will also most likely result in an additional query being sent to the site, which the site will again have to spend time resolving. Other helpful tips include:

- Do not combine queries on separate CRFs into a single query.

- If a response is to be reported as a predefined value, list the options for the response, such as negative, positive, or not applicable.
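
The advice above can be partly automated. The sketch below checks the four identifiers against study reference data before a query is sent, keeps each query tied to a single CRF, and appends the predefined response options when there are any. The `Query` structure, the lookup tables and the sample identifiers are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Query:
    # One query refers to exactly one CRF and one question.
    site_number: str
    subject_number: str
    crf_title: str
    question_number: str
    text: str
    allowed_responses: list = field(default_factory=list)


def find_identifier_errors(query, enrolled, crf_questions):
    """Check identifiers against hypothetical study reference data.

    enrolled:      {site_number: set of subject numbers at that site}
    crf_questions: {crf_title: set of question numbers on that CRF}
    """
    errors = []
    if query.site_number not in enrolled:
        errors.append("unknown site number")
    elif query.subject_number not in enrolled[query.site_number]:
        errors.append("subject not enrolled at this site")
    if query.crf_title not in crf_questions:
        errors.append("unknown CRF title")
    elif query.question_number not in crf_questions[query.crf_title]:
        errors.append("question number not on this CRF")
    return errors


def format_query(query):
    """Render the query as plain text, listing response options if predefined."""
    lines = [
        f"Site {query.site_number} / Subject {query.subject_number}",
        f"{query.crf_title}, Question {query.question_number}",
        query.text,
    ]
    if query.allowed_responses:
        lines.append("Please respond with one of: "
                     + ", ".join(query.allowed_responses))
    return "\n".join(lines)


enrolled = {"012": {"0007"}}
crf_questions = {"Screening": {"4"}}
query = Query("012", "0007", "Screening", "4",
              "Pregnancy test result is missing.",
              ["negative", "positive", "not applicable"])

if not find_identifier_errors(query, enrolled, crf_questions):
    print(format_query(query))
```

A query that fails the identifier check is corrected before it leaves the data manager’s desk, which spares the site the extra work described above.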

The best advice you can give to a participating hospital or clinic regarding queries is to complete them as soon as possible after receiving them. Doing so alerts the site to errors and omissions sooner rather than later, and prevents staff from completing additional forms in a way that continues to generate the same queries.

Topics: Outcomes Research, CRO, Query, Data Management, Clinical Trial, CRF, Validation