The ``signif`` command filters the coverage table and exports only the markers significantly associated with sex. The probability of association with sex is computed using a chi-squared test with Yates' correction for continuity. A marker is considered significantly associated with sex when p ≤ (0.05 / total number of markers), which implements a Bonferroni correction. Significant markers can be exported either in table format (same as the output of ``process``) or in fasta format, with marker information stored in the sequence IDs.
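The test described above can be sketched in a few lines of Python. This is only an illustration, not the tool's own code: the function names and the layout of the 2x2 contingency table (sex by presence/absence) are assumptions, and the p-value uses the closed form for a chi-squared distribution with one degree of freedom.

```python
import math

def yates_chi2_p(males_with, females_with, total_males, total_females):
    """p-value of a 2x2 chi-squared test with Yates' continuity correction.

    Assumed table layout (hypothetical, for illustration):
                 present  absent
        males       a       b
        females     c       d
    """
    a, b = males_with, total_males - males_with
    c, d = females_with, total_females - females_with
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A row or column is empty: no evidence of association.
        return 1.0
    # Yates' correction subtracts n/2 from |ad - bc|, clamped at zero.
    diff = max(0.0, abs(a * d - b * c) - n / 2)
    chi2 = n * diff ** 2 / denom
    # Survival function of chi-squared with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))

def is_significant(p, n_markers, alpha=0.05):
    """Bonferroni rule from the text: p <= alpha / total number of markers."""
    return p <= alpha / n_markers

# Marker present in all 10 males and none of the 10 females.
p = yates_chi2_p(10, 0, 10, 10)
print(p, is_significant(p, n_markers=100))
```

Because the threshold is divided by the total number of markers, the same p-value can pass with a small coverage table and fail with a large one.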
**Options**
===================== ===========
Option                Description
===================== ===========
``--input-file``      Path to a coverage table obtained with ``process``
``--output-file``     Path to the output file (in tsv or fasta format)
``--popmap-file``     Path to a popmap file indicating the sex of each individual
``--output-format``   Output format, either "table" or "fasta" (default: "table")
``--min-coverage``    Minimum coverage to consider a marker present in an individual (default: 1)
===================== ===========
**Sample output**
* Table format:

::

    ID    Sequence      individual_1    individual_2    individual_3    individual_4    individual_5
    15    TGCA..TATT    0               15              24              17              21
    27    TGCA..GACC    20              18              3               26              4
    43    TGCA..ATCG    2               1               5               16              0
    86    TGCA..CCGA    14              29              23              2               19

* FASTA format:

In FASTA format, sequence IDs are generated with the following pattern: ``<marker_ID>_<number_of_males>M_<number_of_females>F_cov:<minimum_coverage>``.
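As a small illustration of the ID pattern, the sketch below builds and parses such an ID. The helper names are hypothetical and the example values are made up; only the pattern itself comes from the description above.

```python
import re

def fasta_id(marker_id, n_males, n_females, min_coverage):
    """Build a sequence ID following the documented pattern."""
    return f"{marker_id}_{n_males}M_{n_females}F_cov:{min_coverage}"

# Regular expression mirroring the same pattern (hypothetical helper).
_ID_PATTERN = re.compile(
    r"^(?P<marker>.+)_(?P<males>\d+)M_(?P<females>\d+)F_cov:(?P<cov>\d+)$"
)

def parse_fasta_id(seq_id):
    """Split an ID back into (marker_id, males, females, min_coverage)."""
    match = _ID_PATTERN.match(seq_id)
    if match is None:
        raise ValueError(f"unexpected sequence ID: {seq_id!r}")
    return (
        match.group("marker"),
        int(match.group("males")),
        int(match.group("females")),
        int(match.group("cov")),
    )

seq_id = fasta_id(15, 0, 4, 1)
print(">" + seq_id)            # header line of the FASTA record
print(parse_fasta_id(seq_id))
```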
The ``loci`` command attempts to find markers belonging to the same locus for a list of markers (in tsv format) obtained with ``subset`` or ``signif``. For each specified marker, the Levenshtein distance to every marker in the original coverage table is computed, and markers with a distance shorter than ``--max-distance`` are retained. The output is a tabulated file where each line corresponds to a marker. The first column gives the ID of the reconstructed polymorphic locus containing this marker, the second column gives the marker ID from the coverage table, and the third column gives the marker's sequence. The last column indicates whether the marker comes from the specified list of markers ("Original") or was recovered from the coverage table ("Recovered").
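The distance-based recovery step can be sketched as follows. This is a minimal illustration rather than the tool's implementation: the function names and the toy sequences are hypothetical, and the sketch treats the threshold as inclusive (distance ≤ maximum), which is an assumption based on the "maximum distance" wording of the option.

```python
def levenshtein(a, b):
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]

def recover_markers(query, coverage_table, max_distance=1):
    """Return IDs of markers whose sequence is within max_distance of query.

    coverage_table is a hypothetical dict mapping marker ID -> sequence.
    """
    return [
        marker_id
        for marker_id, sequence in coverage_table.items()
        if levenshtein(query, sequence) <= max_distance
    ]

# Toy sequences, made up for the example.
table = {1: "ACGT", 2: "ACGA", 3: "TTTT"}
print(recover_markers("ACGT", table))
```

Comparing every query against every marker in the table is quadratic in the number of sequences, which is presumably why the command exposes a ``--threads`` option.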
**Options**
===================== ===========
Option                Description
===================== ===========
``--input-file``      Path to a coverage table obtained with ``subset`` or ``signif``
``--coverage-table``  Path to a coverage table obtained with ``process``
``--output-file``     Path to the output file (in tsv format)
``--popmap-file``     Path to a popmap file indicating the sex of each individual
``--max-distance``    Maximum Levenshtein distance between two sequences to group them in a locus (default: 1)
``--threads``         Number of threads to use (default: 1)
``--min-coverage``    Minimum coverage to consider a marker present in an individual (default: 1)
===================== ===========
The ``map`` command aligns all markers from a coverage table (obtained from ``process``, ``subset``, or ``signif``) to a reference genome provided in fasta format. The output is a tabulated file where each line gives a marker ID, the contig where the marker mapped, the mapping position of the marker on this contig, the sex-bias of the marker, the probability of association with sex for this marker (obtained with a chi-squared test with Yates' correction for continuity), and whether the association with sex is significant after Bonferroni correction. Sex-bias is defined as ``M / Tm - F / Tf``, where M and F are the numbers of males and females in which the marker is present, and Tm and Tf are the total numbers of males and females in the population.
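The sex-bias value can be illustrated with a short sketch. The function names are hypothetical, the popmap below is made up, and encoding sexes as "M" and "F" is an assumption about the popmap format; the depths are those of marker 15 in the sample table for ``signif``.

```python
def count_by_sex(depths, popmap, min_coverage=1):
    """Count males and females in which a marker is present.

    depths: dict individual -> coverage depth for one marker.
    popmap: dict individual -> "M" or "F" (assumed encoding).
    """
    males = females = 0
    for individual, depth in depths.items():
        if depth < min_coverage:
            continue  # marker considered absent in this individual
        if popmap[individual] == "M":
            males += 1
        elif popmap[individual] == "F":
            females += 1
    return males, females

def sex_bias(males_with, females_with, total_males, total_females):
    """Sex-bias as defined in the text: M / Tm - F / Tf."""
    return males_with / total_males - females_with / total_females

# Hypothetical sex assignment for the five individuals.
popmap = {
    "individual_1": "M", "individual_2": "F", "individual_3": "F",
    "individual_4": "F", "individual_5": "M",
}
depths = {
    "individual_1": 0, "individual_2": 15, "individual_3": 24,
    "individual_4": 17, "individual_5": 21,
}
total_males = sum(1 for sex in popmap.values() if sex == "M")
total_females = sum(1 for sex in popmap.values() if sex == "F")

m, f = count_by_sex(depths, popmap)
bias = sex_bias(m, f, total_males, total_females)
print(m, f, bias)
```

By this definition the value ranges from -1 (present in all females and no males) to 1 (present in all males and no females), with 0 meaning the marker is equally frequent in both sexes.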
**Options**
===================== ===========
Option                Description
===================== ===========
``--input-file``      Path to a coverage table obtained with ``process``, ``subset``, or ``signif``
``--output-file``     Path to the output file (in tsv format)
``--popmap-file``     Path to a popmap file indicating the sex of each individual
``--genome-file``     Path to a reference genome file in fasta format
``--min-coverage``    Minimum coverage to consider a marker present in an individual (default: 1)
``--min-quality``     Minimum mapping quality, as defined in BWA, to consider a sequence properly mapped (default: 20)
``--min-frequency``   Minimum frequency in at least one sex for a sequence to be retained (default: 0.25)
===================== ===========
The ``freq`` command computes the distribution of markers in the population. The output is a tabulated file where the first column gives the number of individuals, and the second column gives the number of markers found in this number of individuals.
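The distribution described above can be sketched as a simple tally. The function name is hypothetical, and treating a marker as present when its depth is at least the minimum coverage is an assumption based on the option description; whether the tool also reports counts of zero is not stated, so the sketch lists only the individual counts that occur. The depths are the four rows of the sample table for ``signif``.

```python
from collections import Counter

def marker_distribution(coverage_rows, min_coverage=1):
    """Return sorted (number_of_individuals, number_of_markers) pairs.

    coverage_rows: iterable of per-marker depth lists, one depth per individual.
    """
    counts = Counter()
    for depths in coverage_rows:
        # Number of individuals in which this marker is considered present.
        n_present = sum(1 for depth in depths if depth >= min_coverage)
        counts[n_present] += 1
    return sorted(counts.items())

rows = [
    [0, 15, 24, 17, 21],   # marker 15
    [20, 18, 3, 26, 4],    # marker 27
    [2, 1, 5, 16, 0],      # marker 43
    [14, 29, 23, 2, 19],   # marker 86
]

for n_individuals, n_markers in marker_distribution(rows):
    print(n_individuals, n_markers, sep="\t")
```

Raising ``min_coverage`` shifts the distribution towards fewer individuals, since low-depth observations stop counting as presence.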
**Options**
===================== ===========
Option                Description
===================== ===========
``--input-file``      Path to a coverage table obtained with ``process``
``--output-file``     Path to the output file (in tsv format)
``--min-coverage``    Minimum coverage to consider a marker present in an individual (default: 1)
===================== ===========