STRs for identity ENFSI Reference database, v3/R3

Quality Control

STRidER provides quality control (QC) of autosomal STR datasets. STRidER accepts datasets from diverse worldwide populations and forensically relevant autosomal STR markers that comply with ethical standards 1 2. The minimum requirements for population datasets for STRidER are 15 autosomal STR loci typed in 100 samples. Up to 1,000 samples can be accepted per dataset. Larger datasets should not simply be split up into smaller ones; please contact us in case of questions. Further journal requirements may apply when datasets are intended for peer-reviewed publication.

A suite of software tools has been developed to scrutinize STR population data and thus increase the quality of datasets to ensure reliable allele frequency estimates. The necessary steps for submission of CE-based STR data to STRidER are outlined below. After positive evaluation, the authors will receive a unique STRidER accession number that serves as an indicator of successful QC for editors and reviewers. Please contact STRidER if you want to submit STR sequence data or have any comments concerning your submission.

The Executive Board of the International Society of Forensic Genetics (ISFG) and the editors of Forensic Science International: Genetics and Forensic Science International: Reports invited STRidER to logistically organize and perform QC of autosomal STR population data in the course of manuscript preparation for these journals 3. Before STR population papers are put forward to the editors for review, the authors are requested to submit the data to STRidER. The minimum requirements for population datasets for these two journals are 15 autosomal STR loci typed in 500 samples 3 (for exceptional populations, the latter number can be smaller; please contact the editors directly before submission).
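The size limits stated above can be checked before a dataset is prepared for submission. The following is a minimal sketch, not part of the STRidER software; the function name check_dataset_size and its parameters are hypothetical and only encode the numbers given on this page.

```python
def check_dataset_size(n_samples, n_loci, for_journal=False):
    """Return a list of problems; an empty list means the size limits are met.

    Thresholds are taken from the text above: 15 loci, 100 samples for
    STRidER, 500 samples for the two journals, at most 1,000 per dataset.
    Exceptional populations may be smaller for the journals (ask the editors);
    this sketch does not model that exception.
    """
    min_samples = 500 if for_journal else 100
    problems = []
    if n_loci < 15:
        problems.append(f"only {n_loci} autosomal STR loci; at least 15 are required")
    if n_samples < min_samples:
        problems.append(f"only {n_samples} samples; at least {min_samples} are required")
    if n_samples > 1000:
        problems.append(f"{n_samples} samples; at most 1,000 are accepted per dataset")
    return problems


if __name__ == "__main__":
    # A dataset of 573 samples typed at 16 loci, intended for a journal.
    print(check_dataset_size(573, 16, for_journal=True))
```

An empty list means that neither the minimum nor the maximum is violated; it says nothing about the quality of the genotypes themselves.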

Step 1

Prepare your STR data file as shown in the example file, which can be downloaded and used as a template. It is a tab-delimited text file that can be created using a standard text editor or MS Excel (in that case, save the file in .txt format).

Typical errors found in STR datasets and strategies for avoiding them are described in 4.

The initial lines (identified using the "#" symbol) specify details of the dataset and origin of the samples. Line 1 must contain a description of the population(s) reported (e.g., the title of the study), the number of samples, the geographic origin, and the number of STR loci. Line 2 must indicate the contact author’s name with email address. Further text lines marked with "#" can be included for comments or description of the detailed geographic background and the appropriate metapopulation affiliation of the genotypes. Lines below these text lines list the original STR genotypes including amelogenin. Allele nomenclature criteria are applied as described here. The order of loci does not matter. Alleles for the same locus have to be reported in adjacent columns. Locus names must not contain spaces.
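To illustrate the layout described above, the following sketch reads such a file into "#" comment lines and per-sample genotypes. It rests on assumptions that should be checked against the downloadable example file: one column header row follows the "#" lines, the first column holds the sample identifier, and each locus name is repeated over its two adjacent allele columns. The function name parse_str_file and the sample data are made up for illustration.

```python
def parse_str_file(text):
    """Split a tab-delimited STR file into comments, locus names and genotypes."""
    comments, table = [], []
    for line in text.splitlines():
        if line.startswith("#"):
            comments.append(line[1:].strip())   # dataset description lines
        elif line.strip():
            table.append(line.split("\t"))      # header row and genotype rows
    header, records = table[0], table[1:]
    # Assumed layout: ID column, then two adjacent columns per locus.
    loci = header[1::2]
    genotypes = []
    for rec in records:
        sample_id, alleles = rec[0], rec[1:]
        per_locus = {
            locus: (alleles[2 * i], alleles[2 * i + 1])
            for i, locus in enumerate(loci)
        }
        genotypes.append((sample_id, per_locus))
    return comments, loci, genotypes


# Made-up two-sample file for illustration only (amelogenin omitted for brevity).
example = (
    "# Example population study, 2 samples, Austria, 2 STR loci\n"
    "# Jane Doe, jane.doe@example.org\n"
    "ID\tD3S1358\tD3S1358\tTH01\tTH01\n"
    "S001\t15\t16\t6\t9.3\n"
    "S002\t14\t17\t7\t9\n"
)

if __name__ == "__main__":
    comments, loci, genotypes = parse_str_file(example)
    print(loci)
    print(genotypes[0])
```

Alleles are kept as text rather than converted to numbers, so that microvariant designations such as 9.3 are preserved exactly as typed.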

Note that only complete genotypes are accepted for QC. It is imperative that STR genotypes are reported individually and unshuffled using a unique identifier for each genotype in the dataset. The names are necessary for correspondence. The STR data file should be named Author_country_number of samples.txt (e.g. Parson_AUT_573.txt).
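The requirements above (file naming, unique identifiers, complete genotypes) lend themselves to a quick self-check before upload. The sketch below is an illustration under assumptions, not the STRidER in-house software: it assumes that author and country contain no underscores or spaces, that the number in the file name equals the number of genotypes, and that a missing allele appears as an empty cell. The name check_submission is hypothetical.

```python
import re

# Assumed reading of "Author_country_number of samples.txt".
FILENAME_PATTERN = re.compile(r"^[^_\s]+_[^_\s]+_(\d+)\.txt$")


def check_submission(filename, rows):
    """rows: list of (sample_id, [allele, allele, ...]) tuples.

    Returns a list of problems; an empty list means no problem was found.
    """
    problems = []
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        problems.append("file name does not follow Author_country_number.txt")
    elif int(match.group(1)) != len(rows):
        problems.append("sample count in file name does not match number of genotypes")
    seen = set()
    for sample_id, alleles in rows:
        if sample_id in seen:
            problems.append(f"duplicate identifier: {sample_id}")
        seen.add(sample_id)
        # Assumption: an empty cell marks a missing allele.
        if any(allele.strip() == "" for allele in alleles):
            problems.append(f"incomplete genotype: {sample_id}")
    return problems


if __name__ == "__main__":
    rows = [("S001", ["15", "16"]), ("S002", ["14", ""]), ("S001", ["6", "9.3"])]
    for problem in check_submission("Parson_AUT_3.txt", rows):
        print(problem)
```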

Step 2

Enter accompanying information per dataset in the online submission form and upload your STR data file (Step 1). This information is necessary for evaluation of the dataset. Keep raw data files available for any later inquiries. Inspection may be necessary for quality control purposes. By submitting the data, you confirm that informed consent and ethics approval for data generation and publication have been granted according to your national laws. You also confirm that you are submitting unshuffled (original) genotypes and that complete raw data are available for all genotypes for quality control purposes. By submitting, you agree that allele frequencies may be uploaded onto the STRidER database when QC is passed.

The data will be immediately checked for plausibility as outlined in 5 using in-house software. When submission is complete, you will receive a confirmation by e-mail.

Step 3

During STRidER quality control and evaluation, communication with respect to individual genotypes may follow. Once your data have passed QC, you will receive the STRidER accession number(s) for your data together with allele frequencies calculated from the dataset(s). Please provide the accession number(s) to the journal editor and cite STRidER 5 in your manuscript.
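For orientation, allele frequencies at a single locus can be estimated by simple allele counting, as sketched below. This is a generic illustration and makes no claim about how STRidER computes the frequencies it returns; the function name allele_frequencies and the genotypes are made up.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """genotypes: list of (allele_1, allele_2) tuples for one locus.

    Each individual contributes two alleles, so the denominator is twice
    the number of genotypes.
    """
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


if __name__ == "__main__":
    # Four made-up genotypes at one locus (8 alleles in total).
    th01 = [("6", "9.3"), ("7", "9"), ("6", "6"), ("9", "9.3")]
    print(allele_frequencies(th01))
```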

Step 4

Data that have successfully passed QC may be uploaded to the STRidER database. Any new release will be announced via the STRidER newsletter.

References

1 D'Amato ME, Bodner M, Butler JM, Gusmao L, Linacre A, Parson W, Schneider PM, Vallone P, Carracedo A (2020) Ethical publication of research on genetics and genomics of biological material: guidelines and recommendations; Forensic Sci Int Gen 48:102299
2 D'Amato ME, Joly Y, Lynch V, Machado H, Scudder N, Zieger M (2024) Ethical considerations for Forensic Genetic Frequency databases: First Report conception and development; Forensic Sci Int Gen 71:103053
3 Gusmão L, Butler JM, Linacre A, Parson W, Roewer L, Schneider PM, Carracedo A (2017) Revised guidelines for the publication of genetic population data; Forensic Sci Int Gen 30:160-163
4 Bodner M & Parson W (2020) The STRidER Report on Two Years of Quality Control of Autosomal STR Population Datasets; Genes 11(8): 901 (doi: 10.3390/genes11080901)
5 Bodner M, Bastisch I, Butler JM, Fimmers R, Gill P, Gusmão L, Morling N, Phillips C, Prinz M, Schneider PM, Parson W (2016) Recommendations of the DNA Commission of the International Society for Forensic Genetics (ISFG) on quality control of autosomal Short Tandem Repeat allele frequency databasing (STRidER); Forensic Sci Int Gen 24:97-102