# RadSex

## Overview

The RADSex pipeline implements several functions for the analysis of RAD-Sequencing data with a focus on sex. This pipeline was developed for the PhyloSex project, which investigates sex-determining factors in a wide range of fish species.

The RADSex pipeline was developed by Romain Feron and Yann Guiguen while working at INRA, Rennes, France.

## Requirements

- A C++11 compliant compiler (GCC >= 4.8.1, Clang >= 3.3)
- The zlib library (usually installed by default on Linux)

## Installation

- Clone: `git clone git@github.com:INRA-LPGP/RadSex.git`
- Alternative: Download the archive and unzip it
- Go to the RadSex directory (`cd RadSex`)
- Run `make`
- The compiled `radsex` binary is located in `RadSex/bin/`

## Usage

### General

`radsex <command> [options]`

**Available commands** :

Command            | Description
------------------ | ------------
`process` | Compute a matrix of coverage from a set of demultiplexed read files
`distrib` | Compute the distribution of sequences between sexes
`subset` | Extract a subset of the coverage matrix
`signif` | Extract sequences significantly associated with sex
`loci` | Recreate polymorphic loci from a subset of the coverage matrix
`mapping` | Map a subset of sequences (coverage table or fasta) to a reference genome and output sex-association metrics for each mapped sequence
`freq` | Compute sequence frequencies for the population

### process

`radsex process -d input_dir_path -o output_file_path [ -t n_threads -c min_cov ]`

*Generates a matrix of coverage for all individuals and all sequences. The output is a tabulated file in which each line contains the ID and sequence of a marker, along with its coverage in each individual.*

**Options** :

Option | Full name | Description
--- | --- | ---
`-d` | `input_dir_path` | Path to a folder containing demultiplexed reads
`-o` | `output_file_path` | Path to the output file
`-t` | `n_threads` | Number of threads to use (default: 1)
`-c` | `min_cov` | Minimum coverage to consider a marker in an individual (default: 1)
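
To make the output format more concrete, here is a minimal Python sketch that reads a coverage matrix of the kind described above. It is not part of RADSex. The header row, the column order (`id`, `sequence`, then one column per individual) and the individual names are assumptions made for illustration; check them against the actual output of `radsex process`.

```python
import csv
import io

# Hypothetical excerpt of a coverage matrix. The header row, column order
# and individual names are assumptions made for this illustration.
EXAMPLE = (
    "id\tsequence\tind_1\tind_2\tind_3\n"
    "0\tACGTAC\t12\t0\t7\n"
    "1\tTTGCAA\t0\t3\t0\n"
)


def read_coverage_matrix(handle):
    """Yield (marker_id, sequence, {individual: coverage}) for each row."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    individuals = header[2:]  # every column after id and sequence
    for row in reader:
        coverage = {name: int(value) for name, value in zip(individuals, row[2:])}
        yield row[0], row[1], coverage


markers = list(read_coverage_matrix(io.StringIO(EXAMPLE)))
for marker_id, sequence, coverage in markers:
    print(marker_id, sequence, coverage)
```

To read a real file, pass an open file handle instead of the `io.StringIO` wrapper.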

### distrib

`radsex distrib -f input_file_path -o output_file_path -p popmap_file_path [ -c min_cov --output-matrix ]`

*Generates a table which contains, for every combination of number of males and number of females, the number of sequences present with coverage higher than min_cov and the probability of association with sex.*

**Options** :

Option | Full name | Description
--- | --- | ---
`-f` | `input_file_path` | Path to a coverage matrix obtained with `process`
`-o` | `output_file_path` | Path to the output file
`-p` | `popmap_file_path` | Path to a popmap file indicating the sex of each individual
`-c` | `min_cov` | Minimum coverage to consider a sequence present in an individual (default: 1)
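
The counting step behind this table can be sketched in a few lines of Python. This is an illustration only, not the RADSex implementation. The individual names, sexes and coverage values are made up, the popmap is modelled as a plain dictionary, and a sequence is treated as present when its coverage is at least `min_cov`, which is an assumption about how the threshold is applied. The probability of association with sex is left out because this README does not state which test is used.

```python
from collections import Counter

# Hypothetical inputs: names, sexes and coverage values are made up.
popmap = {"ind_1": "M", "ind_2": "M", "ind_3": "F", "ind_4": "F"}
coverage_rows = [
    {"ind_1": 12, "ind_2": 8, "ind_3": 0, "ind_4": 0},
    {"ind_1": 5, "ind_2": 0, "ind_3": 4, "ind_4": 6},
    {"ind_1": 9, "ind_2": 7, "ind_3": 0, "ind_4": 0},
]


def count_sexes(coverage, popmap, min_cov=1):
    """Return (males, females) in which the sequence is present."""
    # Assumption: "present" means coverage >= min_cov.
    present = [ind for ind, cov in coverage.items() if cov >= min_cov]
    males = sum(1 for ind in present if popmap[ind] == "M")
    females = sum(1 for ind in present if popmap[ind] == "F")
    return males, females


def sex_distribution(rows, popmap, min_cov=1):
    """Count sequences for each (males, females) combination."""
    return Counter(count_sexes(row, popmap, min_cov) for row in rows)


distribution = sex_distribution(coverage_rows, popmap)
for (males, females), n_sequences in sorted(distribution.items()):
    print(males, females, n_sequences)
```

Raising `min_cov` moves sequences toward combinations with fewer individuals, since low-coverage occurrences stop counting as present.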

### subset

`radsex subset -f input_file_path -o output_file_path -p popmap_file_path [ -c min_cov --min-males min_males --min-females min_females --max-males max_males --max-females max_females --min-individuals min_individuals --max-individuals max_individuals ]`

*Filters the coverage matrix to export only sequences present in any combination of M males and F females, with min_males ≤ M ≤ max_males, min_females ≤ F ≤ max_females, and min_individuals ≤ M + F ≤ max_individuals.*

**Options** :

Option | Full name | Description
--- | --- | ---
`-f` | `input_file_path` | Path to a coverage matrix obtained with `process`
`-o` | `output_file_path` | Path to the output file
`-p` | `popmap_file_path` | Path to a popmap file indicating the sex of each individual
`-c` | `min_cov` | Minimum coverage to consider a sequence in an individual (default: 1)
`--min-males` | `min_males` | Minimum number of males with the sequence
`--min-females` | `min_females` | Minimum number of females with the sequence
`--max-males` | `max_males` | Maximum number of males with the sequence
`--max-females` | `max_females` | Maximum number of females with the sequence
`--min-individuals` | `min_individuals` | Minimum number of individuals with the sequence
`--max-individuals` | `max_individuals` | Maximum number of individuals with the sequence
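
The filter condition can be written as a small predicate. The Python sketch below mirrors the three inequalities from the description above. It is an illustration, not the RADSex source. The function names and the default bounds (0 for minimums, no limit for maximums) are assumptions, since this README does not state the defaults used by `radsex subset`.

```python
def in_range(value, lower, upper):
    """True if lower <= value <= upper; an upper of None means no limit."""
    return value >= lower and (upper is None or value <= upper)


def keep_sequence(males, females, *, min_males=0, max_males=None,
                  min_females=0, max_females=None,
                  min_individuals=0, max_individuals=None):
    """Apply the bounds on M, F and M + F (defaults are assumptions)."""
    return (
        in_range(males, min_males, max_males)
        and in_range(females, min_females, max_females)
        and in_range(males + females, min_individuals, max_individuals)
    )


# (males, females) counts for three hypothetical sequences.
counts = [(10, 0), (9, 1), (5, 5)]

# Keep sequences found in at least 8 males and at most 1 female.
male_biased = [c for c in counts if keep_sequence(*c, min_males=8, max_females=1)]
print(male_biased)  # [(10, 0), (9, 1)]
```

On the command line, the same selection would correspond to passing `--min-males 8 --max-females 1`.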

## LICENSE

Copyright (C) 2018 Romain Feron and INRA LPGP

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/