Student Dataset Instructions

Undergraduates can submit data collected for their senior thesis or other research to an Academic Commons data collection.

When you submit data, you must include metadata describing your data files so that future researchers know enough about your research, and how the data were collected, to incorporate them into their own work.

This information is included in a README file that you will submit along with your data. When creating a README file, always ask yourself what types of information you would need in order to use someone else’s data.

See this sample README.

Creating your README File

To create a README file for your dataset, we recommend that you use this Google Doc template or create your own Word document containing the same elements. Add the required information and save it as a text or PDF file with the following file name structure: "README_{Metadata/Dataset title}_{Today's date}.txt" (for example, "README_BeetleData_20080131.txt", where the date is written as yyyymmdd). It may be appropriate to include multiple README files when more than one data file is submitted.
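The file name structure above can be sketched in a few lines of Python. This is only an illustration: the function name `readme_filename` and its parameters are hypothetical, the yyyymmdd date format is inferred from the "20080131" example, and stripping everything except ASCII letters and digits from the title is an assumption, not a stated requirement.

```python
import re
from datetime import date


def readme_filename(title: str, day: date, extension: str = "txt") -> str:
    """Build a name of the form README_{title}_{yyyymmdd}.{extension}."""
    # Assumption: keep only ASCII letters and digits so the title is safe in a file name.
    safe_title = re.sub(r"[^A-Za-z0-9]+", "", title)
    if not safe_title:
        raise ValueError("title must contain at least one letter or digit")
    return f"README_{safe_title}_{day.strftime('%Y%m%d')}.{extension}"


print(readme_filename("Beetle Data", date(2008, 1, 31)))
# prints: README_BeetleData_20080131.txt
```

Passing `date.today()` as the second argument gives a name stamped with the current date, and passing "pdf" as the extension covers a README saved as a PDF.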

The template contains the following elements, some of which may not be applicable to your project.

Element: Title
Description: Name of the dataset or research project that produced it.
Examples: Continued Growth of Trees at Valley Oak Reserve

Element: Creator
Description: Names and addresses of the organizations or people who created the data; preferred format for personal names is surname first.
Examples: Smith, Jane

Element: Date
Description: Key dates associated with the data, including: project start and end date; release date; time period covered by the data.
Examples: yyyy-mm-dd, or yyyy.mm.dd-yyyy.mm.dd for a range

Element: Method
Description: How the data were generated, listing equipment and software used (including model and version numbers if possible), formulae, algorithms, experimental protocols, and other things one might include in a lab notebook. Description of uncertainty, precision, or accuracy of measurements. Links or references to publications or other documentation containing experimental design or protocols used in data collection.
Examples:
Visual count recorded on paper, simulation program, survey administered in person.
Ten 1 square meter plots were randomly placed throughout the Oak Creek research facility. Plastic sample bags (Ziplock) are labeled with the randomly assigned plot number. Approximately 0.5 g of soil, free from plant debris, is collected from the middle of the plot. Soil samples are placed ...

Element: Processing
Description: How the data have been altered or processed (e.g., normalized). Quality assurance and quality control that have been applied. Known problems that limit the data's use.
Examples:
Raw data from field / interviews was entered manually into an Excel spreadsheet.
All sampling was done by Jane Researcher, Sally Labmanager and John Fieldassistant. Data was plotted and reviewed for data entry errors by a second lab member before submission.

Element: Source
Description: Citations to data derived from other sources, including details of where the source data is held and how it was accessed.
Examples: Citations for the external data sources relied upon for key data elements as described in Method section.

Element: Funder
Description: Organizations or agencies who funded the research.

Element: Subject
Description: Keywords or phrases describing the subject or content of the data. Follow the conventions of your discipline for keywords and use standardized taxonomies or vocabularies when available.
Examples: Grasslands, Biomass, Eastern Oregon, Aetiology

Element: Place
Description: All applicable physical locations.
Examples: Data were collected in coastal range of Oregon, in the McDonald-Dunn Research Forest (Oregon State University), in the vicinity of 44°38'11.7"N 123°17'47.0"W.

Element: Variable List and Codes
Description: All variables in the data files. Special codes or abbreviations used in either the file names or the variables in the data files. Column headings for any tabular data.
Examples:
“Fe2-ion” is the measurement of Fe+2 ions found dissolved in water samples, measured in PPM (parts per million).
N is the measurement of soil nitrogen.
999 indicates a missing value in the data.

Element: File structure
Description: Organization of the data file(s). If the dataset is composed of multiple files, describe the organization (e.g. file names, directories or folders in a .zip archive). Brief description of each file or file type, including where in the research process it lies (e.g. raw/unanalyzed data, processed/analyzed data, rendered/visualized data).
Examples:
Zipped files:
2014_Oak_Creek_Inventory.txt - analyzed nitrogen and phosphorus levels for Oak Creek plots
2014_Zena_Forest_Inventory.txt - analyzed nitrogen and phosphorus levels for Zena Forest plots
projectLevel_raw.csv - raw results from a query of the NSF Awards Database for Oregon State University awards that started on or after the start date of the DMP requirement (18 January 2011) through the end of 2013 and ended by 1 July 2015

Element: Necessary Software
Description: Names of any special-purpose software packages required to create, view, analyze, or otherwise use the data.
Examples: Tab-separated text files, originally created in Microsoft Excel. Files should be readable in any basic text editor such as Notepad, Open Office, TextEdit etc.

Element: Rights
Description: Any known intellectual property rights, statutory rights, licenses, or restrictions on use of the data.
Examples: This data is freely available for re-use. Please acknowledge Jane Smith in any publications that use this data.
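If you prefer to start from a plain-text skeleton rather than the template document, the element list can be turned into one with a short script. The following is a minimal sketch, not part of the official template: the function name `build_readme`, the "Name:" block layout, and the choice to mark unused elements as "Not applicable" instead of leaving them out are all assumptions made for illustration.

```python
# Element names follow the template elements listed above.
# The output layout is an assumption, not an Academic Commons requirement.
ELEMENTS = [
    "Title", "Creator", "Date", "Method", "Processing", "Source", "Funder",
    "Subject", "Place", "Variable List and Codes", "File structure",
    "Necessary Software", "Rights",
]


def build_readme(values: dict) -> str:
    """Return README text with one labeled block per element, in template order."""
    unknown = set(values) - set(ELEMENTS)
    if unknown:
        raise KeyError(f"not a template element: {sorted(unknown)}")
    blocks = []
    for name in ELEMENTS:
        # Assumption: elements that do not apply are marked rather than dropped.
        text = values.get(name, "").strip() or "Not applicable"
        blocks.append(f"{name}:\n{text}\n")
    return "\n".join(blocks)


readme_text = build_readme({
    "Title": "Continued Growth of Trees at Valley Oak Reserve",
    "Creator": "Smith, Jane",
    "Rights": "This data is freely available for re-use.",
})
print(readme_text)
```

The returned text can be written to a file named according to the structure described under "Creating your README File" and then completed by hand.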

Submitting your Data and README

Go to the Student Research Data area of the Academic Commons and choose your collection (e.g. Environmental and Earth Sciences Data).

Choose the "Submit a new item to this collection" link. Enter the required descriptive information and submit your data and README files to the collection.

If you plan to submit, or have already submitted, a senior thesis or other research paper to the Academic Commons, your Subject Librarian will later create a link from your research paper to the supporting dataset.