This page is designed to help you create file names with standard characters to prevent computational errors.
Below you'll find a series of common scenarios and best practices for addressing each scenario.
You are using a new laboratory instrument to analyze a sample. Its software outputs data files whose names are very long strings of random-appearing alphanumeric characters. You have saved the files in a folder, but they are difficult to browse, and some will not open on your laptop, so you have to use another computer in the lab.
This random-appearing string is more meaningful to the machine than to the humans conducting the experiment. Further, long file names can cause errors when a file is opened with a different type or version of software or in a different operating system. The best practice is to rename each file with a short, descriptive name, such as:
20190102_ac_smithlab_utra_exp01_gel_003
This file name contains information meaningful to the creator and team: the date ("20190102" for January 2, 2019), creator initials ("ac"), project ID ("smithlab_utra" for Smith Lab Undergraduate Teaching and Research Award), experiment ID ("exp01"), experiment methodology ("gel" for gel electrophoresis), and the number in the sequence of files generated during the experiment ("003").
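As a rough illustration, a name like the one above could be assembled from its components with a short script. This is only a sketch: the function name `build_file_name` and its parameters are hypothetical, and the component order simply mirrors the example file name.

```python
from datetime import date


def build_file_name(created, initials, project_id, experiment_id, method, sequence):
    """Join the naming components with underscores (hypothetical helper)."""
    parts = [
        created.strftime("%Y%m%d"),   # date written as YYYYMMDD
        initials.lower(),             # creator initials
        project_id.lower(),           # project ID
        f"exp{experiment_id:02d}",    # experiment ID, padded to two digits
        method.lower(),               # experiment methodology
        f"{sequence:03d}",            # sequence number, padded to three digits
    ]
    return "_".join(parts)


example_name = build_file_name(date(2019, 1, 2), "AC", "smithlab_utra", 1, "gel", 3)
print(example_name)  # 20190102_ac_smithlab_utra_exp01_gel_003
```

Generating names this way, rather than typing them by hand, helps every team member produce the same pattern.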
In the same way that a very long file name can cause problems, so can non-standard symbols and spaces in a file name. In some cases, a non-standard symbol has a specific meaning to the software or system that is unknown to the user. Similarly, some software or operating systems may fail to open a file if they treat spaces in the file name as an error.
You and your team members name files inconsistently. This prevents you from sorting the files in a directory in a logical manner, such as chronologically by creation date, by numeric sequence in ascending or descending order, or by version number.
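One possible way to clean up existing names is sketched below. It assumes that letters, digits, underscores, and hyphens count as "standard" characters; that set is an assumption made for this example, not a rule stated on this page, and `sanitize_file_name` is a hypothetical helper.

```python
import re


def sanitize_file_name(name):
    """Replace spaces and non-standard symbols with underscores (hypothetical helper)."""
    stem, dot, extension = name.rpartition(".")
    if not dot:  # no extension present, so the whole name is the stem
        stem, extension = name, ""
    # Assumption: only letters, digits, underscores, and hyphens are kept.
    cleaned = re.sub(r"[^A-Za-z0-9_-]+", "_", stem).strip("_")
    return f"{cleaned}.{extension}" if extension else cleaned


print(sanitize_file_name("gel results #3 (final).tif"))  # gel_results_3_final.tif
```

The extension is left untouched so that the file still opens with the expected program.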
Enable the logical sorting of files by adding the creation date, a sequence number, or a version number to the file name.
Date formats differ around the world (e.g., 04-05-05 can mean April 5, 2005, May 4, 2005, or May 5, 2004, depending on the convention used), so a best practice is to follow the format endorsed by the International Organization for Standardization (ISO), which puts the parts in the order year, month, day (YYYYMMDD). April 5, 2005 would be written as 20050405.
The best practice for adding a sequence number is to pad it with leading zeros, starting with ‘01’ instead of ‘1’, so that files sort in numeric order; without padding, a computer that sorts names character by character places ‘10’ before ‘2’. For example, for tens of files use 01-99, for hundreds use 001-999, for thousands use 0001-9999, and so on. Some researchers like to include a README file, a descriptive file that provides information about the other files in the directory. Assigning it “00” (e.g., 00_README) makes it appear at the top.
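The YYYYMMDD form can be produced with Python's standard `datetime` module, as in this minimal sketch. The file names in the list are made up for illustration; the point is that names beginning with such a date sort chronologically with an ordinary alphabetical sort.

```python
from datetime import date

# April 5, 2005 written in year-month-day order.
stamp = date(2005, 4, 5).strftime("%Y%m%d")
print(stamp)  # 20050405

# Made-up file names: a plain alphabetical sort is also a chronological sort.
names = ["20050405_notes", "20041231_notes", "20050110_notes"]
ordered = sorted(names)
print(ordered)  # ['20041231_notes', '20050110_notes', '20050405_notes']
```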
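The effect of leading zeros on sorting can be checked with a few lines of Python. The file names here are hypothetical and serve only to show the difference.

```python
numbers = (1, 2, 10)

# Without padding, names are compared character by character, so 10 lands before 2.
unpadded = sorted(f"sample_{n}" for n in numbers)
print(unpadded)  # ['sample_1', 'sample_10', 'sample_2']

# With two-digit padding, alphabetical order matches numeric order.
padded = sorted(f"sample_{n:02d}" for n in numbers)
print(padded)  # ['sample_01', 'sample_02', 'sample_10']

# A README prefixed with "00" sorts ahead of the numbered files.
listing = sorted(["02_results.csv", "00_README", "01_raw.csv"])
print(listing)  # ['00_README', '01_raw.csv', '02_results.csv']
```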
If you plan on having multiple drafts of a file, you can add a version number. This way you can sort the files and recognize the last version saved when you completed the document (and avoid having several files with "final" in the name).
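As one possible convention, a version can be written as a zero-padded suffix such as `_v01`. The suffix format and the helper `next_version` below are assumptions for this sketch; this page does not prescribe a particular version format.

```python
import re


def next_version(file_name):
    """Return the file name with its _vNN suffix increased by one (hypothetical helper)."""
    # Assumption: the version appears as "_v" plus digits, right before the extension.
    match = re.search(r"_v(\d+)(?=\.|$)", file_name)
    if match is None:
        raise ValueError(f"no version suffix found in {file_name!r}")
    digits = match.group(1)
    bumped = str(int(digits) + 1).zfill(len(digits))  # keep the zero padding
    return file_name[:match.start(1)] + bumped + file_name[match.end(1):]


print(next_version("proposal_v02.docx"))  # proposal_v03.docx
print(next_version("proposal_v09.docx"))  # proposal_v10.docx
```

Because the padding is preserved, the drafts keep sorting in order and the highest number is always the most recent draft.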
Technology can change quickly. Software can have many versions over its lifetime with small (v1.1) or large updates (v2) to repair a bug or add a new feature. In some cases, if you have not updated the software in a while it can lead to issues when trying to open or work with files created using earlier versions. You may have created the file with software for which you or your school purchased a license, but you may have let the license expire; or you do not have the program on your own computer and are unable to open the file.
There are a few best practices for selecting file formats to use for long-term storage and sharing of data.
Strasser, Carly (2015). Research Data Management: A Primer. National Information Standards Organization (NISO). https://www.niso.org/publications/primer-research-data-management.
This page was designed to help you:
Create file names with standard characters to prevent computational errors