GCP: FDA Draft on Data Integrity in Clinical Studies

The U.S. Food and Drug Administration (FDA) has published a draft guidance on Documenting Electronic Data Files and Statistical Analysis Programs for comment. The guidance informs sponsors of recommendations for submissions to the Center for Veterinary Medicine (CVM) in support of new animal drug applications. Its aim is to reduce the number of revisions that may be required before CVM can effectively review data submissions. It also clarifies submission preparation for sponsors by providing a suggested framework, including examples, for describing and organizing the information on electronic data files and statistical analysis programs.

The revised version of the draft guidance replaces the version made available in December 2015. FDA says, "the document has been revised to update contact information, clarify existing language, remove recommendations that are no longer applicable, and provide additional details on the README file". Comments on this draft guidance should be submitted by July 20, 2018.

Submissions to support new animal drug applications generally include a Final Study Report (FSR). For each study that includes electronic data files, CVM needs additional documentation of how the data were generated and which statistical analyses were conducted. Therefore, the submission should include:

  • readable electronic data files, 
  • a description of how the data are processed, and 
  • a description of the statistical analyses employed.

The documentation should clearly describe the entire process by which the data were collected, including the computer programs that processed all electronic data files for analysis, and the programs that implemented the statistical analyses. The draft guidance describes the information which should be submitted, together with the FSR and electronic datasets, and gives recommendations for how README files should be organized and completed.

The README file (typically a PDF file with the filename README.pdf) is an overview of the electronic data files (e.g. Case Report Forms, CRFs), documentation, and programming files included in the submission. An effective README file should quickly orient the assessor to the crucial information needed to understand the electronic files. The README file should include a brief introduction, background, and other information relevant to analyzing and interpreting the data. A description of the data flow could also be provided, such as audit trail processes or how the data were captured and merged to derive the analysis datasets. According to FDA, it is acceptable to submit a separate document (e.g., a Statistical Report) as an appendix to the FSR that provides details on the statistical analysis; in that case, the information does not have to be repeated in the README file.

Furthermore, the draft guidance contains an appendix with examples of SAS code that converts data files to acceptable formats (i.e. non-proprietary XML files or XPT files). Finally, the document gives detailed examples for the following topics:

Electronic Data Files (e.g. Data Collection Forms) should be provided as:

  • a List of Data Files including comments. Comments should include information on how data were collected (paper data collection forms or electronic data capture (EDC) system) or any reference(s) to location of information in the FSR (e.g., reference ranges or scoring definitions) that are needed to interpret the dataset.
  • Data File Contents (e.g. variable names, abbreviations used in the file, variable label or description, formulas for derived variables, and additional details)
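
As an illustration of what a Data File Contents listing might contain, the following minimal Python sketch builds one row per variable from the header of a data file. The variable names, labels, formula, and sample data are hypothetical and are not taken from the draft guidance; the guidance's own examples use SAS.

```python
import csv
import io

# Hypothetical labels and derivation formulas, keyed by variable name.
# In a real submission these would come from the study documentation.
VARIABLE_NOTES = {
    "animal_id": ("Animal identifier", ""),
    "wt_day0": ("Body weight on day 0 (kg)", ""),
    "wt_day28": ("Body weight on day 28 (kg)", ""),
    "wt_gain": ("Body weight gain (kg)", "wt_day28 - wt_day0"),
}


def describe_data_file(csv_text, notes):
    """Return one dict per variable: name, label, and formula if derived."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)  # the first row holds the variable names
    rows = []
    for name in header:
        label, formula = notes.get(name, ("(no label provided)", ""))
        rows.append({"variable": name, "label": label, "formula": formula})
    return rows


# Hypothetical one-record data file used only to demonstrate the listing.
SAMPLE = "animal_id,wt_day0,wt_day28,wt_gain\nA001,10.2,12.9,2.7\n"

for row in describe_data_file(SAMPLE, VARIABLE_NOTES):
    print(f"{row['variable']:<10} {row['label']:<30} {row['formula']}")
```

Variables without an entry in the notes are flagged rather than silently skipped, which makes missing descriptions easy to spot before submission.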

Audit Trail Files

For Data Analysis Programs, the following should be included:

  • Program File Listings (i.e. programs used to perform randomization, process the data, generate summaries, and perform the statistical analysis).
  • Overview of Data and Analysis Flow (i.e. a general overview of the data and analysis process used in the study and submission). For example, a description of whether datasets were directly downloaded from data management systems and whether any processing (merging, validation, editing) was performed prior to analysis. However, the submitted datasets should be unmodified.
  • Instructions for Running Programs should describe the sequence of program calls needed for CVM to run the programs. For each program, all directories and files referenced to access or store data should be provided (including directory and file names, locations, and aliases if used).
  • Randomization Programs: a list of any programs used to generate random assignments. Details of the randomization process, the programs used, and the resulting allocation tables should be included in the FSR.
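
To show why a randomization program is worth submitting alongside its allocation table, here is a minimal Python sketch of a seeded block randomization. It is an assumption-laden illustration, not the method described in the draft guidance: the function name, animal identifiers, treatment labels, and seed value are all hypothetical.

```python
import random


def block_randomize(animal_ids, treatments, seed):
    """Assign treatments in shuffled blocks; each full block holds every treatment once."""
    rng = random.Random(seed)  # fixed seed so the allocation can be regenerated
    block_size = len(treatments)
    allocation = []
    for start in range(0, len(animal_ids), block_size):
        block = list(treatments)
        rng.shuffle(block)  # random order of treatments within this block
        for animal_id, treatment in zip(animal_ids[start:start + block_size], block):
            allocation.append((animal_id, treatment))
    return allocation


# Hypothetical study: eight animals, two treatment groups, arbitrary seed.
animals = [f"A{n:03d}" for n in range(1, 9)]
table = block_randomize(animals, ["control", "treated"], seed=20180720)
for animal_id, treatment in table:
    print(animal_id, treatment)
```

Because the seed is recorded in the program, rerunning it reproduces the same allocation table, so a reviewer can check the table in the FSR against the submitted program.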
