# subset_30_samples_vcf

Here we provide VCF files of filtered SNPs detected from RAD sequencing for the paper "Genomic resources for Mediterranean fishes".

# Samples

A total of 90 samples (30 per species) from three Western Mediterranean fish species:
* Diplodus sargus
* Mullus surmuletus
* Serranus cabrilla


# Variant calling and filtering

## Variant calling

* SNPs were called from RADseq data with `stacks2 populations` for the 90 samples across the 3 species.
* Only one randomly selected SNP was retained per locus, and a locus was retained only if it was present in at least 85% of individuals. Individuals with excess coverage depth (>1,000,000x) or more than 30% missing data were filtered out. We kept loci with a maximum observed heterozygosity of 0.6.
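The locus-level thresholds above (presence in at least 85% of individuals, observed heterozygosity of at most 0.6) can be illustrated with a minimal Python sketch. This is not the `stacks2 populations` implementation; the function names (`call_rate`, `observed_heterozygosity`, `keep_locus`) are hypothetical, and genotypes are assumed to be diploid VCF `GT` strings such as `0/1` or `./.`.

```python
# Illustrative only: hypothetical helpers, not part of Stacks.
MISSING = {".", "./.", ".|."}


def split_alleles(gt):
    """Split a VCF GT string such as '0/1' or '0|1' into its alleles."""
    return gt.replace("|", "/").split("/")


def call_rate(genotypes):
    """Fraction of individuals with a called (non-missing) genotype."""
    if not genotypes:
        return 0.0
    called = [g for g in genotypes if g not in MISSING]
    return len(called) / len(genotypes)


def observed_heterozygosity(genotypes):
    """Fraction of called genotypes that carry two different alleles."""
    called = [g for g in genotypes if g not in MISSING]
    if not called:
        return 0.0
    het = sum(1 for g in called if len(set(split_alleles(g))) > 1)
    return het / len(called)


def keep_locus(genotypes, min_call_rate=0.85, max_obs_het=0.6):
    """Apply the two locus-level thresholds quoted in this README."""
    return (call_rate(genotypes) >= min_call_rate
            and observed_heterozygosity(genotypes) <= max_obs_het)
```

For example, `keep_locus(["0/0"] * 9 + ["0/1"])` passes both thresholds, while a locus where every individual is `0/1` is rejected for excess heterozygosity.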

## Filtering steps

* Keep all pairs of loci that are closer than 5000 bp
* Keep pairs of loci with linkage disequilibrium r² > 0.8
* Keep SNPs with a minimum minor allele frequency (MAF) of 1%

See the https://gitlab.mbb.univ-montp2.fr/reservebenefit/snps_statistics repository for details.
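As a rough illustration of the two statistics named in the filtering steps, the sketch below computes a minor allele frequency and a genotype-based r² between two SNPs. It assumes biallelic, diploid `GT` strings and uses the squared correlation of ALT-allele dosages, which may differ from the estimator used in the linked repository; all function names are hypothetical.

```python
# Illustrative only: assumes biallelic, diploid genotypes.

def alt_dosage(gt):
    """Number of ALT alleles (0, 1 or 2) in a GT string, or None if missing."""
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return None
    return sum(1 for a in alleles if a != "0")


def minor_allele_frequency(genotypes):
    """Frequency of the rarer allele among called genotypes."""
    dosages = [d for d in map(alt_dosage, genotypes) if d is not None]
    if not dosages:
        return 0.0
    p = sum(dosages) / (2 * len(dosages))
    return min(p, 1 - p)


def r_squared(gts_a, gts_b):
    """Squared Pearson correlation of ALT dosages between two SNPs."""
    pairs = [(alt_dosage(a), alt_dosage(b)) for a, b in zip(gts_a, gts_b)]
    pairs = [(x, y) for x, y in pairs if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        return 0.0
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return cov * cov / (var_x * var_y)
```

With these helpers, the MAF step corresponds to a check such as `minor_allele_frequency(genotypes) >= 0.01`, and the linkage step compares `r_squared(snp_a, snp_b)` against 0.8 for pairs of loci less than 5000 bp apart.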


# Data

## 3 VCFs, one for each species: *Diplodus sargus*, *Mullus surmuletus*, *Serranus cabrilla*

* [diplodus_subset30.recode.vcf](diplodus_subset30.recode.vcf)
* [mullus_subset30.recode.vcf](mullus_subset30.recode.vcf)
* [serran_subset30.recode.vcf](serran_subset30.recode.vcf)
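A quick way to check one of these files is to list its sample names and count its variant records. The sketch below uses only the Python standard library and relies on the standard VCF layout (`##` meta lines, one `#CHROM` header line, then one tab-separated record per variant); `summarize_vcf` is a hypothetical helper, not part of this repository.

```python
import os


def summarize_vcf(lines):
    """Return (sample_names, number_of_variant_records) for VCF text lines."""
    samples = []
    n_variants = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        if line.startswith("#CHROM"):
            # Sample names follow the 9 fixed VCF columns.
            samples = line.split("\t")[9:]
            continue
        n_variants += 1
    return samples, n_variants


if __name__ == "__main__":
    path = "diplodus_subset30.recode.vcf"
    if os.path.exists(path):
        with open(path) as handle:
            samples, n_variants = summarize_vcf(handle)
        print(f"{path}: {len(samples)} samples, {n_variants} SNPs")
```

Each of the three files is expected to report 30 samples, in line with the sample counts given above.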

## 1 CSV, table of the 90 samples

* [sample.csv](sample.csv) 