Commit 1dfcb3c2 authored by peguerin

readme update

parent f430b363
# subset_30_samples_vcf
Here we provide VCF files of filtered SNPs detected by RAD sequencing for the paper "Genomic resources for Mediterranean fishes".
# Samples
A total of 90 samples (30 per species) from three Western Mediterranean fish species:
* Diplodus sargus
* Mullus surmuletus
* Serranus cabrilla
# Variant calling and filtering
## Variant calling
* Using `stacks2 populations` we generated SNPs for 90 samples among 3 species from RADseq data.
* Only one randomly selected SNP was retained per locus, and a locus was retained only if it was present in at least 85% of individuals. Individuals with excess coverage depth (>1,000,000x) or more than 30% missing data were filtered out. We kept loci with a maximum observed heterozygosity of 0.6.
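
The per-locus thresholds above (presence in at least 85% of individuals, observed heterozygosity of at most 0.6) can be illustrated with a minimal Python sketch. This is not the `stacks2 populations` implementation; the function names are hypothetical, and genotypes are assumed to be diploid VCF-style `GT` strings such as `0/1`, with `./.` for missing data.

```python
def call_rate(genotypes):
    """Fraction of individuals with a non-missing genotype at one locus."""
    called = [g for g in genotypes if "." not in g]
    return len(called) / len(genotypes)


def observed_heterozygosity(genotypes):
    """Fraction of called (diploid) genotypes whose two alleles differ."""
    called = [g.replace("|", "/").split("/") for g in genotypes if "." not in g]
    if not called:
        return 0.0
    het = sum(1 for a, b in called if a != b)
    return het / len(called)


def keep_locus(genotypes, min_call_rate=0.85, max_obs_het=0.6):
    """Apply both thresholds; the defaults mirror the values quoted above."""
    return (call_rate(genotypes) >= min_call_rate
            and observed_heterozygosity(genotypes) <= max_obs_het)
```

For example, a locus genotyped in 9 of 10 individuals, all homozygous, passes both thresholds, whereas the same locus with all 9 called individuals heterozygous would be dropped.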
## Filtering steps
* Keep all pairs of loci that are closer than 5000 bp
* Keep pairs of loci with linkage disequilibrium r² > 0.8
* Keep SNPs with a minimum minor allele frequency (MAF) of 1%
See the https://gitlab.mbb.univ-montp2.fr/reservebenefit/snps_statistics repository for details.
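
As an illustration of the MAF step only, the sketch below computes a minor allele frequency from genotype strings and applies the 1% threshold. It is not the code from the repository linked above: the function names are hypothetical, and it assumes biallelic SNPs encoded as VCF-style `GT` strings (`0` for the reference allele, `.` for missing).

```python
def minor_allele_frequency(genotypes):
    """Minor allele frequency at one biallelic SNP, ignoring missing alleles."""
    alleles = []
    for g in genotypes:
        for a in g.replace("|", "/").split("/"):
            if a != ".":
                alleles.append(a)
    if not alleles:
        return 0.0
    alt_freq = sum(1 for a in alleles if a != "0") / len(alleles)
    # The minor allele is whichever of the two alleles is rarer.
    return min(alt_freq, 1 - alt_freq)


def passes_maf(genotypes, min_maf=0.01):
    """True if the SNP reaches the minimum MAF (1% by default)."""
    return minor_allele_frequency(genotypes) >= min_maf
```

With 30 diploid samples per species, a single heterozygous individual already gives a frequency of 1/60, which is above the 1% threshold.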
# Data
## 3 VCFs, one for each species: *Diplodus sargus*, *Mullus surmuletus*, *Serranus cabrilla*
* diplodus_subset30.recode.vcf
* mullus_subset30.recode.vcf
* serran_subset30.recode.vcf
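
A quick way to check one of these files is to count its samples and variant records. The following is a minimal sketch using only the Python standard library; `summarize_vcf` is a hypothetical helper, and it assumes an uncompressed, tab-delimited VCF whose `#CHROM` header line lists the sample names after the nine fixed columns.

```python
def summarize_vcf(lines):
    """Return (sample_names, number_of_variant_records) for VCF text lines."""
    samples = []
    n_variants = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        if line.startswith("#CHROM"):
            # Columns 1-9 are CHROM..FORMAT; sample names follow.
            samples = line.split("\t")[9:]
        else:
            n_variants += 1
    return samples, n_variants


# Example usage (file name taken from the list above):
# with open("diplodus_subset30.recode.vcf") as handle:
#     samples, n_variants = summarize_vcf(handle)
#     print(len(samples), "samples,", n_variants, "SNPs")
```

For larger analyses, a dedicated VCF library would be a more robust choice than this line-by-line parser.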
## 1 CSV, table of the 90 samples
* sample.csv
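
The sample table can be inspected with Python's `csv` module. The sketch below counts rows per value of a chosen column; the column name `species` used in the example is an assumption, since the layout of `sample.csv` is not described here, and the file is assumed to be comma-delimited with a header row.

```python
import csv
from collections import Counter


def count_by_column(handle, column):
    """Count rows of a CSV file (with a header row) per value of one column."""
    reader = csv.DictReader(handle)
    return Counter(row[column] for row in reader)


# Example usage; "species" is a hypothetical column name:
# with open("sample.csv", newline="") as handle:
#     print(count_by_column(handle, "species"))
```

If the column exists, each of the three species should appear 30 times.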