Getting started
===============

Installation
------------

Requirements
~~~~~~~~~~~~

* A C++11 compliant compiler (GCC >= 4.8.1, Clang >= 3.3)
* The zlib library (usually installed on Linux by default)

.. _install-release:

Installation
~~~~~~~~~~~~

There are three ways to install RADSex:

**1. Install the latest release**

* Download the latest release from `GitHub <https://github.com/RomainFeron/RadSex/releases>`_
* Unzip the archive
* Navigate to the **RADSex** directory
* Run ``make``

The compiled ``radsex`` binary will be located in **RADSex/bin/**.

**2. Install the latest stable development version**

To install the latest stable version of RADSex directly from the GitHub repository, run the following commands:

::

    git clone https://github.com/RomainFeron/RADSex.git
    cd RADSex
    make

The compiled ``radsex`` binary will be located in **RADSex/bin/**.

**3. Install RADSex with conda**

RADSex is available in `Bioconda <https://bioconda.github.io/recipes/radsex/README.html?#recipe-Recipe%20'radsex'>`_. To install RADSex with Conda, run the following command:

::

    conda install -c bioconda radsex


Update RADSex
~~~~~~~~~~~~~

To update RADSex, you can download the latest stable release and install it as described in the :ref:`install-release` section.

If you installed RADSex directly from the GitHub repository, update RADSex by running the following commands from the **RADSex** directory:

::

    git pull
    make rebuild

If you installed RADSex with Conda, run:

::

    conda update -c bioconda radsex


Before starting
---------------

Before running the pipeline, you should prepare the following files:

* A **set of demultiplexed reads**. The current version of RADSex does not implement demultiplexing. Raw sequencing reads can be demultiplexed using `Stacks <http://catchenlab.life.illinois.edu/stacks/comp/process_radtags.php>`_ or `pyRAD <http://nbviewer.jupyter.org/gist/dereneaton/af9548ea0e94bff99aa0/pyRAD_v.3.0.ipynb#The-seven-steps-described>`_.

* A **group information file (popmap)**: a tabulated file with individual ID as the first column and group as the second column. It is important that the individual IDs in the popmap are the same as the names of the demultiplexed reads files (see the :ref:`population-map` section).

* To align markers to a genome: the **genome file** in fasta format.

.. note:: When visualizing ``map`` results with ``radsex-vis``, linkage groups / chromosomes are automatically inferred from scaffold names in the reference sequence if their name starts with *LG*, *CHR*, or *NC* (case-insensitive). If chromosomes are named differently in the reference genome, you should prepare a tabulated file with the reference contig ID in the first column and the corresponding chromosome name in the second column (see the :ref:`chromosomes-names` section).
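
The naming requirement for the popmap can be checked before running RADSex. The following shell sketch builds a small popmap and reports individual IDs that have no matching reads file. The directory **example_samples**, the file **example_popmap.tsv**, the individual IDs, and the ``.fq.gz`` extension are all hypothetical and only serve the illustration; how RADSex itself derives individual IDs from file names is not shown here.

.. code-block:: shell

    # Hypothetical layout: one demultiplexed reads file per individual
    mkdir -p example_samples
    touch example_samples/ind1.fq.gz example_samples/ind2.fq.gz example_samples/ind3.fq.gz

    # Popmap: individual ID in the first column, group in the second, tab-separated
    printf 'ind1\tM\nind2\tM\nind3\tF\n' > example_popmap.tsv

    # Count popmap IDs for which no file named "<ID>.<extension>" exists
    missing=0
    while IFS="$(printf '\t')" read -r id group; do
        if ! ls example_samples/"$id".* > /dev/null 2>&1; then
            echo "No reads file found for individual: $id"
            missing=$((missing + 1))
        fi
    done < example_popmap.tsv
    echo "Individuals without reads file: $missing"

With the three example files created above, every ID has a matching file and the final count is 0.
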


Running RADSex
--------------

.. _computing-depth-table:

Computing the markers depth table
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The first step of RADSex is to create a table of marker depths for the entire dataset using the ``process`` command:

::

    radsex process --input-dir ./samples --output-file markers_table.tsv --threads 16 --min-depth 1

In this example, demultiplexed reads are located in **./samples** and the markers table generated by ``process`` will be saved to **markers_table.tsv**. The parameter ``--threads`` specifies the number of threads to use, and ``--min-depth`` specifies the minimum depth to consider a marker present in an individual: markers that do not reach this depth in at least one individual are not retained in the markers table. It is advised to keep the minimum depth at the default value of 1 for this step, as it can be adjusted for each analysis later.

The resulting file **markers_table.tsv** is a tabulated file described in the :ref:`markers-depths-table-file` section.


Computing the distribution of markers between groups
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The ``distrib`` command computes the distribution of markers between groups from a markers depth table:

::

    radsex distrib --markers-table markers_table.tsv --output-file distribution.tsv --popmap popmap.tsv --min-depth 5 --groups M,F

In this example, ``--markers-table`` is the table generated in the :ref:`computing-depth-table` section, and the distribution of markers between groups will be saved to **distribution.tsv**. The group of each individual in the population is given by **popmap.tsv** (see the :ref:`population-map` section). Groups of individuals to compare (as defined in the :ref:`population-map`) are specified manually with the parameter ``--groups``. The minimum depth to consider a marker present in an individual is set to 5, meaning that markers with depth lower than 5 in an individual will not be considered present in this individual.
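
Since the values passed to ``--groups`` refer to groups defined in the popmap, it can help to list the group labels and the number of individuals in each group before running the command. A minimal sketch on a made-up popmap (the file name **example_popmap.tsv** and the individual IDs are hypothetical):

.. code-block:: shell

    # Made-up popmap: individual ID in the first column, group in the second
    printf 'ind1\tM\nind2\tM\nind3\tF\nind4\tF\nind5\tF\n' > example_popmap.tsv

    # Count individuals per group label (second column) and join as label=count
    group_counts=$(cut -f2 example_popmap.tsv | sort | uniq -c | awk '{print $2 "=" $1}' | paste -sd, -)
    echo "$group_counts"    # prints F=3,M=2

Here the popmap defines the labels ``F`` and ``M``, so ``--groups M,F`` names two groups that exist in the file.
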

The resulting file **distribution.tsv** is a table described in the :ref:`sex-distribution-file` section.

This distribution can be visualized with the ``plot_sex_distribution()`` function of `RADSex-vis <https://github.com/RomainFeron/RADSex-vis>`_, which generates a tile plot of marker counts with number of males on the x-axis and number of females on the y-axis.


Extracting markers significantly associated with sex
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Markers significantly associated with sex are obtained with the ``signif`` command:

::

    radsex signif --markers-table markers_table.tsv --output-file markers.tsv --popmap popmap.tsv --min-depth 5 --groups M,F [ --output-fasta ]

In this example, ``--markers-table`` is the table generated in the :ref:`computing-depth-table` section, and markers significantly associated with sex are saved to **markers.tsv**. The sex of each individual in the population is given by **popmap.tsv** (see the :ref:`population-map` section). Groups of individuals to compare (as defined in the :ref:`population-map`) are specified manually with the parameter ``--groups``. The minimum depth to consider a marker present in an individual is set to 5, meaning that markers with depth lower than 5 in an individual will not be considered present in this individual.

By default, the ``signif`` command generates an output file in the same format as the markers depth table. Markers can also be exported to a fasta file using the ``--output-fasta`` parameter (see the :ref:`fasta-file` section).

The markers table generated by ``signif`` can be visualized with the ``plot_depth()`` function of `RADSex-vis <https://github.com/RomainFeron/RADSex-vis>`_, which generates a heatmap showing the depth of each marker in each individual.


Aligning markers to a genome
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Markers can be aligned to a genome using the ``map`` command:

::

    radsex map --markers-file markers_table.tsv --output-file alignment_results.tsv --popmap popmap.tsv --genome-file genome.fasta --min-quality 20 --min-frequency 0.1 --min-depth 5 --groups M,F

In this example, ``--markers-file`` is the markers depth table generated in the :ref:`computing-depth-table` step, and the path to the reference genome file is given by ``--genome-file``; results are saved to **alignment_results.tsv**. The sex of each individual in the population is given by **popmap.tsv** (see the :ref:`population-map` section), and the minimum depth to consider a marker present in an individual is set to 5, meaning that markers with depth lower than 5 in an individual will not be considered present in this individual. Groups of individuals to compare (as defined in the :ref:`population-map`) are specified manually with the parameter ``--groups``.

The parameter ``--min-quality`` specifies the minimum mapping quality (as defined in `BWA <http://bio-bwa.sourceforge.net/bwa.shtml>`_) to consider a marker properly aligned and is set to 20 in this example. The parameter ``--min-frequency`` specifies the minimum frequency of a marker in the population to retain this marker and is set to 0.1 here, meaning that only sequences present in at least 10% of individuals of the population are aligned to the genome.
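
As a worked example of the ``--min-frequency`` threshold, the smallest number of individuals in which a marker must be present can be derived from the number of individuals in the popmap. The sketch below assumes that the frequency is computed over all individuals listed in the popmap and that the threshold is inclusive; neither detail is confirmed here. The popmap content and the file name **example_popmap.tsv** are made up.

.. code-block:: shell

    # Made-up popmap with 36 individuals (18 males, 18 females)
    awk 'BEGIN { for (i = 1; i <= 36; i++) printf "ind%d\t%s\n", i, (i <= 18 ? "M" : "F") }' > example_popmap.tsv

    n_individuals=$(awk 'END { print NR }' example_popmap.tsv)
    min_frequency_percent=10    # --min-frequency 0.1 expressed as a percentage

    # Smallest integer count c such that c / n_individuals >= 10% (integer ceiling)
    min_individuals=$(( (n_individuals * min_frequency_percent + 99) / 100 ))
    echo "$min_individuals of $n_individuals individuals"    # prints 4 of 36 individuals

Under these assumptions, 10% of 36 individuals is 3.6, so a marker would need to be present in at least 4 individuals to be aligned.
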

The resulting file **alignment_results.tsv** is a table described in the :ref:`mapping-results-file` section.

Alignment results from ``map`` can be visualized with the ``plot_genome()`` function of `RADSex-vis <https://github.com/RomainFeron/RADSex-vis>`_, which generates a circular plot showing bias and association with sex for each marker aligned to the genome.

Alignment results for a specific contig can be visualized with the ``plot_contig()`` function, which shows the same metrics for a single contig.