PREMIER Documentation and Data Storage
Objectives
The aim is to ensure the traceability and integrity of the research data so that the reported results can be documented and verified.
Background
Experiments should be documented as they are performed. The entries should include all data and relevant details in a way that allows other researchers to trace and, if needed, repeat the experiment.
To ensure traceability, the source of the data (primary and secondary), including the identity of the scientists involved in generating it, should be available to authorized personnel, while respecting the personal data protection rights of the people involved.
Tasks / Actions
The first step in creating a lab-specific action plan is an assessment, which will be carried out by the PREMIER team. The assessment will determine the status quo of the laboratory with regard to existing quality tools. Below you will find the general tasks / actions that are necessary to implement the module.
Both electronic and paper data can be collected. Electronic data are collected on a device and/or its software application. The information required to verify and/or analyze electronic data may include metadata such as machine settings, software type and version, etc. Paper data are generated, for example, by using a paper notebook or the map record.
Experimental data consist of two components:
1. Primary data (raw data): These are original data which are the result of the original measurements, observations and experimental activities.
2. Secondary data (derived data): These are raw data that have been analyzed and processed.
All data should be recorded immediately after generation and stored permanently and securely.
In PREMIER, protocols can be recorded as SOPs (standard operating procedures) or WIs (work instructions). When generating data, the researcher documents which protocol was used and whether there were any deviations from it, including a description of each deviation. Any changes to data already entered should be clearly described and explained. Subsequent changes to source data should not obscure or delete the original data. Each change should identify the person who made it and the date on which it was made. In general, it is recommended to ensure that records are attributable, legible, contemporaneous, original, accurate and complete (commonly referred to as ALCOA; completeness is one of the criteria added in ALCOA+):
- Attributable: The author(s) and all persons who participated in and/or contributed to the experiment, including the recorders if applicable, must be uniquely identified so that the data can be traced back to each individual contribution by name and date.
- Legible: The record must be readable and recorded in and/or on a permanent medium (paper or electronic).
- Contemporaneous: Newly obtained/collected data and new scientific discussions and ideas should be recorded at the time of observation.
- Original: The initial recording of the data should be retained.
- Accurate: The recorded observations must be true and accurate.
- Complete: Records should be complete to ensure traceability, immediate and accurate retrieval, and exact reconstruction or review of the work described.
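The rules above for changing entries (keep the original, name the person, date the change, explain it) can be illustrated with a small append-only change history. This is only a sketch under stated assumptions: the class names `Entry` and `Correction` and all field names are hypothetical and do not come from PREMIER or from any particular ELN.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class Correction:
    """One documented change; the earlier value is kept, never overwritten."""
    old_value: str
    new_value: str
    changed_by: str
    changed_at: datetime
    reason: str


@dataclass
class Entry:
    """A recorded observation with an append-only change history (hypothetical)."""
    value: str
    recorded_by: str
    recorded_at: datetime
    history: List[Correction] = field(default_factory=list)

    def correct(self, new_value: str, changed_by: str, reason: str) -> None:
        # A change without an explanation is rejected.
        if not reason.strip():
            raise ValueError("a correction needs an explanation")
        self.history.append(
            Correction(self.value, new_value, changed_by,
                       datetime.now(timezone.utc), reason)
        )
        self.value = new_value

    @property
    def original_value(self) -> str:
        # The initial recording stays retrievable after any number of changes.
        return self.history[0].old_value if self.history else self.value


entry = Entry("12.4 mg", "A. Researcher",
              datetime(2024, 3, 5, 9, 30, tzinfo=timezone.utc))
entry.correct("12.9 mg", "B. Reviewer", "transcription error from balance printout")
```

After the correction, `entry.value` holds the new value while `entry.original_value` still returns the first recording, and `entry.history` shows who changed what, when and why.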
In order to minimize the probability of loss or damage to experimental data, they should be entered directly in the standard system used in the lab, e.g. an ELN. For computer applications used to collect, analyze, plot, summarize or otherwise characterize experimental data, the following information should be provided: name, version and provider of the application, and where in the experiment or protocol the application was used.
Each organization / laboratory must implement rules regarding the recording of raw data. Raw data and other records should be sufficiently detailed and complete to ensure study traceability and reconstruction. If computers are used to acquire, modify or archive data, the raw data must be clearly labeled as such.
Research data can be documented on paper or, even better where local conditions allow, in an electronic laboratory notebook (ELN). Electronic documentation has multiple advantages. It facilitates cooperation within and between teams ("Team Science"), because projects, templates and stocks can be exchanged electronically. All entries are visible, which allows a real-time workflow. Furthermore, electronic documentation increases the quality and reliability of research documentation and is an essential element not only in the digitization process but also in increasing the robustness and efficiency of research.
The FAIR Guiding Principles for scientific data management and stewardship provide guidelines to improve the Findability, Accessibility, Interoperability, and Reuse of digital assets. They emphasize machine-actionability, i.e. the capacity of computational systems to find, access, interoperate, and reuse data with no or minimal human intervention, because humans increasingly rely on computational support to deal with data as its volume, complexity, and creation speed increase.
Besides an ELN, there are multiple other options for documentation management. Examples include an open and guided Wiki, a SharePoint® or a LIMS that gives all team members free access to documents such as SOPs, protocols and templates. The content of SOPs and protocols needs to be checked regularly to keep it up to date (see PREMIER module "Conducting Experiments").
An ELN (electronic laboratory notebook) is a software tool that in its simplest form replicates an interface similar to a page in a traditional paper laboratory notebook. Raw data, protocols, observations, notes and other data can be entered into this electronic notebook via computer or mobile device. In particular, sharing data and working jointly on projects, even across work groups, offer clear advantages over the traditional paper laboratory notebook.
The number of available ELN tools is growing rapidly and the functions of individual tools are changing rapidly. Therefore, it can be difficult to evaluate all the advantages and limitations when searching for the best solution for your project. A regularly updated overview of existing ELNs and a structured approach to selecting suitable ELNs can be helpful in the evaluation and selection process.
The traceability and integrity of the data ensure that the reported results can be reproduced. Traceability is the ability to identify the source of the data (raw and/or analyzed) and any person with a relevant influence on the data sets mentioned in a publication.
Each experimental record should contain the following important references, if possible:
- Names of all persons involved in the experiment.
- Specific experimental (research) plan (see module "Planning of Experiments") with hypothesis and counter-hypothesis to which the experiment should refer.
- Any protocols, standard procedures, test methods, statistical tools (and/or data analysis software) used.
- Description of all materials and equipment used.
- Date on which each experiment was performed.
- Location of records and materials.
- All raw data generated, processed and reported in the experiment.
- An appropriate reference to the path where the raw data are stored, or to their source if the raw data come from other researchers conducting supporting experiments.
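The list of references above can be read as a checklist. The following minimal sketch models it as a record type with a completeness check; the class `ExperimentRecord`, its field names and the example values are hypothetical illustrations, not a PREMIER or ELN schema.

```python
from dataclasses import dataclass, fields
from datetime import date
from typing import List, Optional


@dataclass
class ExperimentRecord:
    """Hypothetical record mirroring the references listed above."""
    persons: List[str]                  # everyone involved in the experiment
    research_plan: str                  # reference to the experimental plan
    protocols_and_tools: List[str]      # SOPs, test methods, analysis software
    materials_and_equipment: List[str]
    performed_on: Optional[date]
    records_location: str               # where records and materials are kept
    raw_data_references: List[str]      # paths or sources of the raw data


def missing_fields(record: ExperimentRecord) -> List[str]:
    """Return the names of all fields that are still empty."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


record = ExperimentRecord(
    persons=["A. Researcher", "B. Technician"],
    research_plan="plan-007",
    protocols_and_tools=["SOP-12 v3"],
    materials_and_equipment=["antibody lot 4711", "plate reader 2"],
    performed_on=date(2024, 3, 5),
    records_location="ELN project 'Example', freezer 3",
    raw_data_references=[],             # not yet filled in
)
print(missing_fields(record))
```

A check like this could run before an entry is signed off, so that a missing raw data reference is noticed while the experiment is still fresh.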
All associated experiment records should be referenced in the experiment. The raw data obtained in an experiment should be stored in a separate archive system and referenced in the experiment record. The laboratory should establish conventions for the file names of all data files and experimental records to ensure consistency and traceability, including a unique identifier assigned to each experimental data record in accordance with the applicable SOPs.
The responsibility for creating the experiment records and documenting the resulting data lies with the researcher who generates the data. Where several researchers collaborate on data generation, they should be identified as collaborators.
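As an illustration of such a file naming convention, the sketch below builds and checks names of the form date, project code, experiment identifier and short description. The pattern itself is an assumption made for this example; each laboratory defines its own convention in its SOPs.

```python
import re
from datetime import date

# Assumed convention: YYYYMMDD_PROJECT_EXPnnnn_short-description.ext
NAME_PATTERN = re.compile(r"^\d{8}_[A-Z0-9]+_EXP\d{4}_[a-z0-9-]+\.[a-z0-9]+$")


def build_filename(day: date, project: str, experiment_no: int,
                   description: str, ext: str) -> str:
    """Compose a file name following the assumed convention."""
    # Collapse everything that is not a letter or digit into single hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    return f"{day:%Y%m%d}_{project.upper()}_EXP{experiment_no:04d}_{slug}.{ext.lower()}"


def follows_convention(name: str) -> bool:
    """Check whether an existing file name matches the assumed convention."""
    return NAME_PATTERN.match(name) is not None


name = build_filename(date(2024, 3, 5), "premier", 17, "Western Blot raw", "TIFF")
print(name)
```

The experiment number serves as the unique identifier in this example; the same identifier would then appear in the experiment record that references the file.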
Due to the steadily growing use of information technology, a large part of the data collected in research today is available in digital form. It is therefore essential to handle these data comprehensively, especially with regard to archiving. According to Good Scientific Practice (GSP) guidelines, all primary and secondary data must be securely stored or retained for at least 10 years after their creation.
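The 10-year minimum can be turned into a simple date calculation. The sketch below is one possible reading, assuming the period starts on the creation date of the data; the function names are hypothetical, and local policy may define the start of the period differently.

```python
from datetime import date

RETENTION_YEARS = 10  # minimum retention period stated above


def retain_until(created: date, years: int = RETENTION_YEARS) -> date:
    """Earliest date on which the minimum retention period has elapsed."""
    try:
        return created.replace(year=created.year + years)
    except ValueError:
        # Data created on 29 February: the target year is not a leap year,
        # so the period is taken to end on 1 March.
        return created.replace(year=created.year + years, month=3, day=1)


def retention_elapsed(created: date, today: date) -> bool:
    """True once the minimum retention period is over."""
    return today >= retain_until(created)


print(retain_until(date(2020, 6, 15)))
```

Because the guideline says "at least", the computed date is a lower bound and not a deletion date.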
Every organization or laboratory should define an internal policy ensuring the integrity of the research data. All research data must be stored in the Electronic Laboratory Notebook (ELN) as soon as possible after they have been generated.
If a large number of files is involved or a file's size exceeds the limit allowed in the ELN, the primary data must be stored on the archive storage and the secondary data on the standard storage.
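This storage rule can be sketched as a small decision function. The two limits below are placeholder assumptions, since the actual limits depend on the ELN and the local setup, and the function name is hypothetical.

```python
from typing import List

# Placeholder limits, assumed for illustration only.
ELN_MAX_FILE_BYTES = 50 * 1024 * 1024
ELN_MAX_FILES = 20


def storage_target(kind: str, file_sizes: List[int]) -> str:
    """Decide where a data set goes: 'eln', 'archive' or 'standard'."""
    if kind not in ("primary", "secondary"):
        raise ValueError("kind must be 'primary' or 'secondary'")
    fits_in_eln = (
        len(file_sizes) <= ELN_MAX_FILES
        and all(size <= ELN_MAX_FILE_BYTES for size in file_sizes)
    )
    if fits_in_eln:
        return "eln"
    # Too many or too large files: primary data go to the archive storage,
    # secondary data to the standard storage.
    return "archive" if kind == "primary" else "standard"


print(storage_target("primary", [200 * 1024 * 1024]))
```

Whichever target is chosen, the experiment record in the ELN would still carry a reference to the storage path.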
For further information, see module Planning of Experiments - Data storage.