dataset template
Annotations for datasets on the ELITE Portal
| Attribute | Description | Required | columnType | DependsOn | Source | Parent | Valid Values |
|---|---|---|---|---|---|---|---|
| dataType | The category or format of data generated or collected in an experiment, describing the kind of information the dataset contains (for example, genomics, imaging, proteomics, or behavioral data). | True | STRING_LIST | | Sage Bionetworks | ManifestColumn | clinical,drugScreen,electrophysiology,epigenetics,geneExpression,genomeAssembly,genomicVariants,imaging,lipidomics,metabolomics,metagenomics,Not applicable,Not collected,Not specified,Other,phenotype,proteomics,Unknown,wearableData |
| measurementTechnique | The name of the measurement technique describing the assay method. Provide a value, or one of: Unknown, Not collected, Not applicable, Not specified. | True | STRING | | | ManifestColumn | Bisulfite sequencing,Clinical data,Genome-wide association study,High-performance liquid chromatography tandem mass spectrometry,Liquid chromatography mass spectrometry,Liquid chromatography tandem mass spectrometry,Mass spectrometry,Metabolomics,MetaX-processed metabolomics data,Proximity extension assay,RNA sequencing,Single-cell RNA sequencing,Shotgun metagenomic sequencing,Single nucleotide polymorphism array,Tandem Mass Tag proteomics,Whole-genome sequencing,Unknown,Other,Not collected,Not applicable,Not specified |
| specimenType | Type of biological material sample taken from a biological entity for research purposes | True | STRING_LIST | | | ManifestColumn | blood, brain, buffy coat, cell line, cells, nervous system, organoid, plasma, saliva, serum, skin, stool, tissue, urine, Not applicable, Not collected, Not specified, Other, Unknown |
| studyKey | The short acronym for a study name in a URL-friendly format (e.g. LLFS_Metabolomics or ELPSCRNA) | True | STRING | | Sage Bionetworks | ManifestColumn | |
| contributor | | True | STRING | | Sage Bionetworks | | |
| project | The ELITE project short name associated with the tool | True | STRING | | Sage Bionetworks | | ILO BU, ILO TGEN, LC, LG, LLFS, NECS APOE, Not assigned |
| alternateName | An alternate name that can be used to improve search and discovery. | False | STRING | | SageBionetworks | | |
| conditionsOfAccess | Additional requirements a user may need outside of Data Use Modifiers. This could include additional registration, updating profile information, joining a Synapse Team, or using specific authentication methods like 2FA or RAS. Omit property if not applicable/unknown. | False | STRING | | SageBionetworks | | |
| countryOfOrigin | Origin of individuals from which data were generated. Omit if not applicable/unknown. | False | STRING | | SageBionetworks | | |
| creator | Main researchers involved in producing the data, in priority order. Usually matches the project PI(s) and data lead(s) responsible for conception and initial content creation. For tools, this is the manufacturer or developer of the instrument. Expects a properly formatted name of the organization or person (e.g. "NF-OSI" or "Robert Allaway"), not an id. See https://datacite-metadata-schema.readthedocs.io/en/4.5/properties/creator/. | True | STRING_LIST | | SageBionetworks | | |
| croissant_file_s3_object | Link to croissant file for dataset. | False | STRING | | SageBionetworks | | |
| dataRestriction | Indicates the restriction level of files/folders. | True | STRING | | SageBionetworks | | Controlled, Registered, Open |
| dataUseModifiers | List of Data Use Ontology (DUO) terms that apply to the dataset and describe the allowable scope and terms for data use. Most datasets allow "General Research Use" unless otherwise specified. | False | STRING_LIST | | SageBionetworks | | Clinical Care Use,Collaboration Required,Disease Specific Research,Ethics Approval Required,General Research Use,Genetic Studies Only,Geographical Restriction,Health or Medical or Biomedical Research,Institution Specific Restriction,No General Methods Research,No Restriction,Non-Commercial Use Only,Not-for-Profit Non-Commercial Use Only,Not-for-Profit Organisation Use Only,Population Origins or Ancestry Research Only,Population Origins or Ancestry Research Prohibited,Project Specific Restriction,Publication Moratorium,Publication Required,Research Specific Restrictions,Return to Database or Resource,Time Limit on Use,User Specific Restriction |
| datasetType | The classification of a dataset based on its role, scope, or purpose within a study. | True | STRING | | SageBionetworks | | experimental, publication |
| datePublished | Date the dataset was published or made available on Synapse, formatted as YYYY-MM-DD. Maps to schema.org datePublished. | False | STRING | | SageBionetworks | | |
| individualCount | Number of unique individuals included in the dataset (whether as individual-level or as aggregate data). Omit if not applicable/unknown. | False | INTEGER | | SageBionetworks | | |
| keywords | Typically 1 to 5 informative terms or phrases that help users find the dataset. | True | STRING_LIST | | SageBionetworks | | |
| license | License attached to the data. If the license is UNKNOWN or RESTRICTED-USE, the data may not be used without further contact for terms. | True | STRING_LIST | | SageBionetworks | | CC BY-NC,CC BY-NC 4.0,CC BY-NC 3.0,CC BY-NC 2.5,CC BY-NC 2.0,CC BY-NC 1.0,CC BY-NC-ND,CC BY-NC-ND 4.0,CC BY-NC-ND 3.0,CC BY-NC-ND 2.5,CC BY-NC-ND 2.0,CC BY-NC-ND 1.0,CC BY-NC-SA,CC BY-NC-SA 4.0,CC BY-NC-SA 3.0,CC BY-NC-SA 2.5,CC BY-NC-SA 2.0,CC BY-NC-SA 1.0,CC BY-ND,CC BY-ND 4.0,CC BY-ND 3.0,CC BY-ND 2.5,CC BY-ND 2.0,CC BY-ND 1.0,CC BY-SA,CC BY-SA 4.0,CC BY-SA 3.0,CC BY-SA 2.5,CC BY-SA 2.0,CC BY-SA 1.0,CC-0,CC0 1.0,CC-BY,CC-BY 4.0,CC-BY 3.0,CC-BY 2.5,CC-BY 2.0,CC-BY 1.0,ODC-BY,ODC-BY 1.0,ODC-ODbL,ODC-ODbL 1.0,ODC-PDDL,ODC-PDDL 1.0,Public Domain,UNKNOWN |
| subject | Applicable subject term(s) for dataset cataloging; use the Library of Congress Subject Headings (LCSH) scheme. | False | STRING_LIST | | SageBionetworks | | |
| assay | The analysis or technology used to generate the data in this file | True | STRING_LIST | | sage.annotations-experimentalData.assay-0.0.26 | ManifestColumn | 10x multiome, 16SrRNAseq, active avoidance learning behavior, anxiety-related behavior, ATACSeq, atomicForceMicroscopy, autoradiography, Baker Lipidomics, Biocrates Bile Acids, Biocrates p180, Biocrates Q500, bisulfiteSeq, Blood Chemistry Measurement, brightfieldMicroscopy, cellViabilityAssay, ChIPSeq, CITESeq, contextual conditioning behavior, CUT&Tag, DIA, DNA optical mapping, electrochemiluminescence, elevated plus maze test, elevated T maze apparatus method, ELISA, errBisulfiteSeq, exomeSeq, FIA-MSMS, FitBark, frailty assessment, Genotyping, HI-C, HiChIPseq, high content screen, HPLC, HPLC-MSMS, Immunocytochemistry, immunofluorescence, immunohistochemistry, in vivo bioluminescence, ISOSeq, jumpingLibrary, kinesthetic behavior, label free mass spectrometry, Laser Speckle Imaging, LC-MS, LC-MSMS, LC-SRM, Leiden Oxylipins, lentiMPRA, LFP, liquid chromatography-electrochemical detection, lncrnaSeq, locomotor activation behavior, long-read rnaSeq, LTP, MDMS-SL, memory behavior, Metabolon, methylationArray, MIB/MS, microRNAcounts, mirnaArray, mirnaSeq, MRI, mRNAcounts, MudPIT, m6A-rnaSeq, nextGenerationTargetedSequencing, Nightingale NMR, NOMe-Seq, novelty response behavior, open field test, oxBS-Seq, pharmacodynamics, pharmacokinetics, photograph, polymeraseChainReaction, Positron Emission Tomography, proximity extension assay, questionnaire, Rader Lipidomics, Real Time PCR, Ribo-Seq, rotarod performance test, rnaArray, rnaSeq, RPPA, sandwich ELISA, Sanger sequencing, scale, scATACSeq, scCGIseq, scirnaSeq, scrnaSeq, scwholeGenomeSeq, SiMoA, snpArray, snATACSeq, snrnaSeq, spontaneous alternation, STARRSeq, TMT quantitation, tractionForceMicroscopy, UPLC-MSMS, UPLC-ESI-QTOF-MS, UC Davis GCTOF, UCSD Untargeted Metabolomics, Vernier Caliper, von Frey test, westernBlot, wheel running, whole-cell patch clamp, wholeGenomeSeq, Wishart Catecholamines, Wishart High Value Metabolites, Zeno Electronic Walkway, Not collected, Not specified, Not applicable, Other, Unknown |
| species | The name of a species (typically a taxonomic group) of organism. | True | STRING | | Sage Bionetworks | ManifestColumn | Cross-Species Avian,Cross-Species Mammalian,Human,Mouse |
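
The Required, columnType, and Valid Values columns can be checked mechanically before a manifest is submitted. The following is a minimal sketch, not the portal's actual validator: `SCHEMA` and `validate_row` are hypothetical names, the dictionary layout is an assumption, and only a handful of attributes from the table are included for illustration.

```python
import re
from datetime import datetime

# Illustrative subset of the template; the dict layout is an assumption.
SCHEMA = {
    "dataRestriction": {"required": True, "type": "STRING",
                        "valid": {"Controlled", "Registered", "Open"}},
    "datasetType": {"required": True, "type": "STRING",
                    "valid": {"experimental", "publication"}},
    "species": {"required": True, "type": "STRING",
                "valid": {"Cross-Species Avian", "Cross-Species Mammalian",
                          "Human", "Mouse"}},
    "keywords": {"required": True, "type": "STRING_LIST", "valid": None},
    "individualCount": {"required": False, "type": "INTEGER", "valid": None},
    "datePublished": {"required": False, "type": "STRING", "valid": None},
}


def validate_row(row):
    """Return a list of human-readable problems found in one manifest row."""
    problems = []
    for name, rule in SCHEMA.items():
        value = row.get(name)
        # Treat None, empty strings, and empty lists as "not provided".
        if value is None or value == "" or value == []:
            if rule["required"]:
                problems.append(f"{name}: required value is missing")
            continue
        # Check the value against the declared columnType.
        if rule["type"] == "STRING_LIST":
            if not isinstance(value, list):
                problems.append(f"{name}: expected a list of strings")
                continue
            values = value
        elif rule["type"] == "INTEGER":
            if isinstance(value, bool) or not isinstance(value, int):
                problems.append(f"{name}: expected an integer")
                continue
            values = [value]
        else:
            values = [value]
        # Check each value against the Valid Values column, if one is given.
        if rule["valid"] is not None:
            for v in values:
                if v not in rule["valid"]:
                    problems.append(f"{name}: {v!r} is not a valid value")
    # datePublished must be a real calendar date written as YYYY-MM-DD.
    date = row.get("datePublished")
    if date:
        well_formed = re.fullmatch(r"\d{4}-\d{2}-\d{2}", date) is not None
        if well_formed:
            try:
                datetime.strptime(date, "%Y-%m-%d")
            except ValueError:
                well_formed = False
        if not well_formed:
            problems.append("datePublished: expected YYYY-MM-DD")
    return problems
```

Calling `validate_row` on a dictionary of annotation values returns an empty list when every check passes, and one message per problem otherwise. Returning messages instead of raising on the first failure lets a contributor fix a whole manifest row in one pass.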