assay genotyping annotation template
Template for file-based Genotyping annotations
| Attribute | Description | Required | columnType | DependsOn | Source | Parent | Valid Values |
|---|---|---|---|---|---|---|---|
| ModelSystemType | Type of model system | False | BOOLEAN | | Sage Bionetworks | ManifestColumn | animal, cerebral organoid, immortalized cell line, iPSC, organoid, primary cell culture, Not assigned |
| isModelSystem | Boolean flag indicating whether or not a file has data from a model system | True | BOOLEAN | | Sage Bionetworks | ManifestColumn | True, False, Not assigned |
| adjustedCovariates | Any covariates the GWAS is adjusted for. Separate multiple values with a semicolon (;). Example - age;sex | False | STRING | | | ManifestColumn | |
| analysisType | Type of analysis | False | STRING | | Sage Bionetworks | ManifestColumn | ANOVA,batch effect correction,clustering,data normalization,de-novo assembly,dose response study,enrichment analysis,genome-wide association,mendelian randomization analysis,network analysis,polygenic risk scores, Not assigned |
| arrayInformation | Additional information about the genotyping array. For example, for targeted arrays, please provide the specific type of array. Example - Immunochip | False | STRING | | | ManifestColumn | |
| arrayManufacturer | Manufacturer of the genotyping array used for the discovery stage. Separate multiple manufacturers by a semicolon (;). Example - Illumina, Affymetrix, Perlegen | False | STRING | | | ManifestColumn | |
| backgroundTrait | Any background trait(s) shared by all individuals in the GWAS (e.g. in both cases and controls). Example - Nicotine dependence | False | STRING | | | ManifestColumn | |
| consortium | The name of the consortium | True | STRING | | Sage Bionetworks | ManifestColumn | ELITE, ELITE CDCP, Not assigned |
| dataSubtype | Further qualification of dataType, which may be used to indicate the state of processing of the data, aggregation of the data, or presence of metadata | True | STRING | | Sage Bionetworks | ManifestColumn | raw,processed,results,normalized,metadata, Not assigned |
| dataType | The category or format of data generated or collected in an experiment, describing the kind of information the dataset contains (for example, genomics, imaging, proteomics, or behavioral data). | True | STRING_LIST | | Sage Bionetworks | ManifestColumn | clinical,drugScreen,electrophysiology,epigenetics,geneExpression,genomeAssembly,genomicVariants,imaging,lipidomics,metabolomics,metagenomics,Not applicable,Not collected,Not specified,Other,phenotype,proteomics,Unknown,wearableData |
| fileFormat | Defined format of the data file, typically corresponding to the file extension, but sometimes indicating a more general group of files produced by the same tool or software | True | STRING | | Sage Bionetworks | ManifestColumn | FASTQ, FASTA, SAM, BAM, CRAM, VCF, BCF, GTF, GFF, GFF3, BED, BigBed, WIG, BigWig, CSV, TSV, MTX, H5AD, LOOM, RDS, PED, MAP, BED_PLINK, TPED, TFAM, BEDGRAPH, NARROWPEAK, BROADPEAK, TAGALIGN, OME-TIFF, ND2, CZI, LSM, H5, HDF5, PARQUET, PDB, MMCIF, JSON, YAML, XML, TXT, XLSX, CSV, Not assigned |
| genotypeTechnology | Method(s) used to genotype variants in the discovery stage. Separate multiple methods with a semicolon (;). Example - genome-wide genotyping array, exome-wide sequencing, targeted genotyping array | True | STRING | | | ManifestColumn | |
| imputation | Were SNPs imputed for the discovery GWAS? | True | STRING | | | ManifestColumn | |
| imputationPanel | Panel used for imputation. Example - 1000 Genomes Phase 3 | False | STRING | | | ManifestColumn | |
| imputationSoftware | Imputation software. Example - IMPUTE | False | STRING | | | ManifestColumn | |
| measurementTechnique | The name of the measurement technique describing the assay method. Provide a value OR provide one of these values - Unknown, Not collected, Not applicable, Not specified | True | STRING | | | ManifestColumn | Bisulfite sequencing,Clinical data,Genome-wide association study,High-performance liquid chromatography tandem mass spectrometry,Liquid chromatography mass spectrometry,Liquid chromatography tandem mass spectrometry,Mass spectrometry,Metabolomics,MetaX-processed metabolomics data,Proximity extension assay,RNA sequencing,Single-cell RNA sequencing,Shotgun metagenomic sequencing,Single nucleotide polymorphism array,Tandem Mass Tag proteomics,Whole-genome sequencing,Unknown,Other,Not collected,Not applicable,Not specified |
| numberIndividuals | Number of individuals in a group. Example - 2000 | True | INTEGER | | | ManifestColumn | |
| readmeFile | The name of any readme file uploaded to Synapse. Example - example.tsv | False | STRING | | | ManifestColumn | |
| reagentCatalogNumber | If the assay reagent is a commercial product, enter the vendor's catalog identifier. If the reagent is a custom preparation enter 'NA'. | False | STRING | | | ManifestColumn | |
| reagentContact | The contact information is particularly helpful when the reagent is not from a commercial vendor. | False | STRING | | | ManifestColumn | |
| reagentIDs | One or more identifiers, separated by a semicolon (;). The reagent identifier(s) must be stored in a data dictionary .csv file uploaded to Synapse. | False | STRING | | | ManifestColumn | |
| reagentLotNumber | The lot number is often provided by a reagent source when the reagent is replenished over time. | False | STRING | | | ManifestColumn | |
| reagentManufacturer | The manufacturer is the source of a reagent and may include commercial vendors as well as non-commercial sources (e.g., collaborating labs). | False | STRING | | | ManifestColumn | |
| reagentName | The reagent name is an alternative to the Reagent ID. | False | STRING | | | ManifestColumn | |
| reagentWeblink | An internet address that may provide details of an assay reagent. | False | STRING | | | ManifestColumn | |
| reportedTrait | The trait under investigation. Please describe the trait concisely but with enough detail to be clear to a non-specialist. Avoid the use of abbreviations; if these are necessary, please define them or their source in the readme file. Example - Reticulocyte count | True | STRING | | | ManifestColumn | |
| resourceType | The type of resource being stored and annotated | True | STRING | | Sage Bionetworks | ManifestColumn | experimentalData,metadata,tool,analysis,computationalNotebook,softwareTool,Not assigned |
| speciesGroup | The taxonomic ranking including both species and subspecies the individual belongs to. | True | STRING | | | ManifestColumn | Amphibian,Bird,Fish,Invertebrate,Mammal,Not applicable,Not collected,Not specified,Reptile,Unknown |
| speciesName | The scientific name of the species (typically a taxonomic group, ex. "Eremophila alpestris") the individual belongs to. | True | STRING | | | ManifestColumn | Acomys cahirinus, Acomys russatus, Accipiter cooperii, Actitis macularius, Aix sponsa, Anas acuta, Anas carolinensis, Anas platyrhynchos, Antigone canadensis, Archilochus colubris, Artibeus jamaicensis, Baeolophus bicolor, Balaena mysticetus, Blarina brevicauda, Bombycilla cedrorum, Bonasa umbellus, Bos taurus, Branta canadensis, Bubo virginianus, Buteo jamaicensis, Buteo lineatus, Canis latrans, Cardinalis cardinalis, Castor canadensis, Cavia porcellus, Charadrius vociferus, Chinchilla lanigera, Columba livia, Condylura cristata, Corvus brachyrhynchos, Cricetomys ansorgei, Cricetulus barabensis, Cricetulus griseus, Cryptomys damarensis, Cuniculus paca, Cyanocitta cristata, Cygnus olor, Dryobates pubescens, Dumetella carolinensis, Ellobius lutescens, Ellobius talpinus, Eonycteris spelaea, Eptesicus fuscus, Equus caballus, Eremophila alpestris, Fukomys damarensis, Haemorhous mexicanus, Heterocephalus glaber, Hirundo rustica, Homo sapiens, Hydrochoerus hydrochaeris, Hydroprogne caspia, Hylocichla mustelina, Icteria virens, Larus argentatus, Larus delawarensis, Macaca fascicularis, Macaca mulatta, Mareca strepera, Marmota monax, Melanerpes carolinus, Meleagris gallopavo, Melospiza melodia, Meriones unguiculatus, Mesocricetus auratus, Microtus pennsylvanicus, Mimus polyglottos, Molothrus ater, Multi-species, Mus musculus, Myocastor coypus, Myotis lucifugus, Nannospalax galili, Neosciurus carolinensis, Neotoma cinerea, Neotoma floridana, Not applicable, Not collected, Not provided, Not specified, Octodon degus, Odocoileus virginianus, Ondatra zibethicus, Other, Pan troglodytes, Papio anubis, Passer domesticus, Passerina caerulea, Passerina cyanea, Peromyscus gossypinus, Peromyscus leucopus, Peromyscus maniculatus, Phalacrocorax auritus, Phasianus colchicus, Picoides villosus, Pipilo erythrophthalmus, Poecile carolinensis, Quiscalus quiscula, Rattus norvegicus, Rattus rattus, Regulus calendula, Saimiri boliviensis, Sayornis phoebe, Scalopus aquaticus, Sciurus carolinensis, Sciurus niger, Sciurus vulgaris, Scolopax minor, Setophaga citrina, Setophaga coronata, Setophaga dominica, Setophaga petechia, Setophaga pinus, Sialia sialis, Sigmodon hispidus, Sitta carolinensis, Spatula clypeata, Spinus tristis, Spizella passerina, Spizella pusilla, Spizelloides arborea, Struthio camelus, Sturnus vulgaris, Sus scrofa, Sylvilagus floridanus, Tachycineta bicolor, Tamias striatus, Tamiasciurus hudsonicus, Thryothorus ludovicianus, Toxostoma rufum, Troglodytes aedon, Turdus migratorius, Tursiops truncatus, Vicugna pacos, Vireo griseus, Vireo olivaceus, Zalophus californianus, Zenaida macroura, Zonotrichia albicollis, Unknown |
| specimenType | Type of biological material sample taken from a biological entity for research purposes | True | STRING_LIST | | | ManifestColumn | blood, brain, buffy coat, cell line, cells, nervous system, organoid, plasma, saliva, serum, skin, stool, tissue, urine, Not applicable, Not collected, Not specified, Other, Unknown |
| stage | Stage of the experimental design. | True | STRING | | | ManifestColumn | Discovery,Not applicable,Not collected,Not specified,Other,Replication,Unknown |
| statisticalModel | A brief description of the statistical model used to determine association significance. It is important to distinguish studies that would otherwise appear identical (e.g., the same trait analyzed using additive, dominant, and recessive models). Example - additive model | False | STRING | | | ManifestColumn | |
| studyKey | The short acronym for a study name in a URL-friendly format (ex. LLFS_Metabolomics OR ELPSCRNA) | True | STRING | | Sage Bionetworks | ManifestColumn | |
| summaryFile | The name of any summary statistics file uploaded to Synapse. Example - example.tsv | False | STRING | | | ManifestColumn | |
| summaryStatisticsAssembly | Genome assembly for the summary statistics. Example - GRCh38 | False | STRING | | | ManifestColumn | |
| treatmentAmountUnit | Unit of treatment amount. | False | NUMBER | | | ManifestColumn | AFU,AI,AU/ml,DK units/ml,g/dl,g/l,gm,HAU,IU,iu/l,IU/ml,Kallikrein Inactivator Unit per Milliliter,kg,l,M,mg,mg/dl,mg/l,mg/ml,miu/ml,ml,mM,MOI,ng,ng/dl,ng/ml,ng/nl,ng/ul,nl,nM,Not specified,NPX,optical density,other,PFU,PFUe,pg,pg/mg creatinine,pg/ml,pg/nl,pg/ul,pl,pM,Pound,TCID50,ug,ug/dl,ug/l,ug/ml,ug/ul,uiu/ml,ul,uM,umol/l,units/ml |
| treatmentAmountValue | The Amount Value indicates how much (concentration, mass, volume) of a treatment agent was applied to a sample. | False | STRING | | | ManifestColumn | |
| treatmentDurationUnit | Unit of treatment duration. | False | STRING | | | ManifestColumn | d.p.c.,Days,Hours,Minutes,Months,Not applicable,Not collected,Not specified,Seconds,Unknown,Weeks,Years |
| treatmentDurationValue | Duration of treatment. | False | NUMBER | | | ManifestColumn | |
| treatmentIDs | One or more identifiers, separated by a semicolon (;). The treatment identifier(s) must be stored in a data dictionary .csv file uploaded to Synapse. | False | STRING | | | ManifestColumn | |
| treatmentName | Treatments refer to in vitro modifications of samples. Three treatment types are supported: agent amount, duration, and temperature. The treatment name is an alternate identifier to the Treatment ID. | False | STRING | | | ManifestColumn | |
| treatmentTemperatureUnit | Unit of treatment temperature. | False | STRING | | | ManifestColumn | C,F,K,Not specified |
| treatmentTemperatureValue | Value of treatment temperature. | False | NUMBER | | | ManifestColumn | |
| useReadMeFile | Was a readme file uploaded to Synapse? | True | STRING | | | ManifestColumn | |
| useReagent | Was a reagent applied to the sample? | True | BOOLEAN | | | ManifestColumn | FALSE,TRUE |
| useSummaryFile | Was a summary statistics file uploaded to Synapse? | True | STRING | | | ManifestColumn | |
| useTreatment | Was a treatment applied to the sample? | True | BOOLEAN | | | ManifestColumn | |
| variantCount | The number of variants analysed in the discovery stage (after QC). Example - 52500 | True | STRING | | | ManifestColumn | |
| Filename | | False | STRING | | | | |
| metadataType | For files of dataSubtype: metadata, a description of the type of metadata in the file | False | STRING | | Sage Bionetworks | ManifestColumn | individual,biospecimen,assay,supplementary files,Not Assigned |
| project | The ELITE project short name associated with the tool | True | STRING | | Sage Bionetworks | | ILO BU, ILO TGEN, LC, LG, LLFS, NECS APOE, Not assigned |
| organ | Indicate the organ the specimen is from. An organ is a unique macroscopic (gross) anatomic structure that performs specific functions. It is composed of various tissues. | True | STRING | | sage.annotations-experimentalData.organ-0.0.4 | ManifestColumn | blood, bone marrow, brain, breast, Bursa Of Fabricius, cerebrospinal fluid, colon, kidney, large intestine, liver, lung, lymph node, mammary gland, nerves, nose, ovary, pancreas, prostate, skin, spleen, Not collected, Not specified, Not applicable, Other, Unknown, plasma, gonadal fat, inguinal fat, gastrocnemius muscle |
| tissue | Indicate the tissue the specimen is from. A tissue is a multicellular anatomical structure that consists of many cells of one or a few types arranged in an extracellular matrix. | True | STRING_LIST | | sage.annotations-experimentalData.tissue-0.0.11 | ManifestColumn | amygdala, amygdaloid complex, anterior cingulate cortex, angular gyrus, blood, bone marrow, Buccal Mucosa, Buffy Coat, caudate nucleus, cecum derived fecal material, cerebellar cortex, cerebellum, cerebral cortex, cortical plate, dorsal anterior cingulate cortex, dorsal pallium, Dorsal Root Ganglion, dorsolateral prefrontal cortex, dorsomedial prefrontal cortex, embryonic tissue, entorhinal cortex, fecal material, forebrain, frontal cortex, frontal lobe, frontal pole, fusiform gyrus, hippocampus, head of caudate nucleus, inferior frontal gyrus, inferior temporal cortex, inferior temporal gyrus, inferolateral temporal cortex, insula, insular cortex, lateral entorhinal cortex, left cerebral hemisphere, liver, mammillary body, medial dorsal nucleus of thalamus, medial entorhinal cortex, medial frontal cortex, medial ganglionic eminence, medial orbital frontal cortex, medial prefrontal cortex, meninges, midbrain, middle frontal gyrus, middle temporal gyrus, nerve tissue, Not Applicable, nucleus accumbens, occipital lobe, occipital visual cortex, olfactory neuroepithelium, orbitofrontal cortex, parahippocampal gyrus, parietal cortex, parietal lobe, plasma, posterior cingulate cortex, posteroinferior parietal cortex, posterior inferior parietal cortex, posterior superior temporal cortex, precentral gyrus, prefrontal cortex, primary auditory cortex, primary motor cortex, primary somatosensory cortex, primary tumor, primary visual cortex, putamen, right cerebral hemisphere, serum, splenocyte, striatum, subgenual anterior cingulate cortex, subgenual cingulate cortex, superior parietal lobe, superior temporal gyrus, temporal cortex, temporal lobe, temporal pole, thalamus, unspecified, ventricular zone, ventrolateral prefrontal cortex, VZ/SVZ, whole brain, gonadal fat, inguinal fat, kidney, plasma, liver, gastrocnemius muscle |
| familyStudyParticipant | Indicates whether or not a file has data from a human participant involved in a family study (ex. LLFS) | False | STRING | | sage.annotations-demographics.ethnicityfamilyStudyParticipant-0.0.2 | ManifestColumn | Yes, No, Not Assigned |
| assay | The analysis or technology used to generate the data in this file | True | STRING_LIST | | sage.annotations-experimentalData.assay-0.0.26 | ManifestColumn | 10x multiome, 16SrRNAseq, active avoidance learning behavior, anxiety-related behavior, ATACSeq, atomicForceMicroscopy, autoradiography, Baker Lipidomics, Biocrates Bile Acids, Biocrates p180, Biocrates Q500, bisulfiteSeq, Blood Chemistry Measurement, brightfieldMicroscopy, cellViabilityAssay, ChIPSeq, CITESeq, contextual conditioning behavior, CUT&Tag, DIA, DNA optical mapping, electrochemiluminescence, elevated plus maze test, elevated T maze apparatus method, ELISA, errBisulfiteSeq, exomeSeq, FIA-MSMS, FitBark, frailty assessment, Genotyping, HI-C, HiChIPseq, high content screen, HPLC, HPLC-MSMS, Immunocytochemistry, immunofluorescence, immunohistochemistry, in vivo bioluminescence, ISOSeq, jumpingLibrary, kinesthetic behavior, label free mass spectrometry, Laser Speckle Imaging, LC-MS, LC-MSMS, LC-SRM, Leiden Oxylipins, lentiMPRA, LFP, liquid chromatography-electrochemical detection, lncrnaSeq, locomotor activation behavior, long-read rnaSeq, LTP, MDMS-SL, memory behavior, Metabolon, methylationArray, MIB/MS, microRNAcounts, mirnaArray, mirnaSeq, MRI, mRNAcounts, MudPIT, m6A-rnaSeq, nextGenerationTargetedSequencing, Nightingale NMR, NOMe-Seq, novelty response behavior, open field test, oxBS-Seq, pharmacodynamics, pharmacokinetics, photograph, polymeraseChainReaction, Positron Emission Tomography, proximity extension assay, questionnaire, Rader Lipidomics, Real Time PCR, Ribo-Seq, rotarod performance test, rnaArray, rnaSeq, RPPA, sandwich ELISA, Sanger sequencing, scale, scATACSeq, scCGIseq, scirnaSeq, scrnaSeq, scwholeGenomeSeq, SiMoA, snpArray, snATACSeq, snrnaSeq, spontaneous alternation, STARRSeq, TMT quantitation, tractionForceMicroscopy, UPLC-MSMS, UPLC-ESI-QTOF-MS, UC Davis GCTOF, UCSD Untargeted Metabolomics, Vernier Caliper, von Frey test, westernBlot, wheel running, whole-cell patch clamp, wholeGenomeSeq, Wishart Catecholamines, Wishart High Value Metabolites, Zeno Electronic Walkway, Not collected, Not specified, Not applicable, Other, Unknown |
| isMultiSpecimen | Boolean flag indicating whether or not a file has data for multiple specimens | True | STRING | | Sage Bionetworks | ManifestColumn | true,false,Not assigned |
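
As a rough illustration of how the Required and Valid Values columns might be applied, the sketch below checks one manifest row against a handful of attributes copied from the table. It assumes the manifest is read as a mapping from attribute name to cell text; the names `TEMPLATE_SUBSET`, `split_multi`, and `check_row` are hypothetical and are not part of any Synapse or Sage Bionetworks tool.

```python
# Hypothetical subset of the template: attribute -> (required, valid values or None).
# The entries are copied from the table above; None means free text.
TEMPLATE_SUBSET = {
    "consortium": (True, {"ELITE", "ELITE CDCP", "Not assigned"}),
    "stage": (True, {"Discovery", "Not applicable", "Not collected",
                     "Not specified", "Other", "Replication", "Unknown"}),
    "treatmentTemperatureUnit": (False, {"C", "F", "K", "Not specified"}),
    "numberIndividuals": (True, None),
    "adjustedCovariates": (False, None),
}


def split_multi(value):
    """Split a semicolon-separated cell such as 'age;sex' into a list."""
    return [part.strip() for part in value.split(";") if part.strip()]


def check_row(row):
    """Return a list of human-readable problems found in one manifest row."""
    problems = []
    for attribute, (required, valid_values) in TEMPLATE_SUBSET.items():
        value = row.get(attribute, "").strip()
        if not value:
            # Empty cells only matter for attributes marked Required = True.
            if required:
                problems.append(f"{attribute}: required value is missing")
            continue
        if valid_values is not None and value not in valid_values:
            problems.append(f"{attribute}: '{value}' is not a valid value")
    return problems


row = {
    "consortium": "ELITE",
    "stage": "Discovery",
    "numberIndividuals": "2000",
    "adjustedCovariates": "age;sex",
}
print(check_row(row))
print(split_multi(row["adjustedCovariates"]))
```

The check is deliberately simple: it ignores columnType and treats every cell as a single value, so a STRING_LIST attribute such as `dataType` would need its cell split before comparison.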