Commit ade8baaa authored by Bastien MACE

Readme redaction

parent 237fc13a
......@@ -87,7 +87,7 @@ cp -r ./bin/swarm /usr/local/bin
<a name="step1"></a>
## STEP 1 : Pair-end sequencing
## STEP 1 : Pair-end sequencing (OBITools)
First, unzip your data in your shell if needed:
......@@ -116,7 +116,7 @@ obigrep -p 'mode!="joined"' Atl.fastq > Atl.ali.fastq
<a name="step2"></a>
## STEP 2 : Demultiplexing
## STEP 2 : Demultiplexing (OBITools)
The _.txt_ files assign each sequence to its sample by its tag: each tag corresponds to a forward or a reverse sequence from one sample.
......@@ -146,7 +146,7 @@ mv -t ./dada2_and_obitools Med.ali.assigned.fastq Atl.ali.assigned.fastq
Now you have as many files as samples, each containing paired-end, demultiplexed sequences.
<a name="step3"></a>
## STEP 3 : Dereplication
## STEP 3 : Dereplication (OBITools)
Now that you have the sequences corresponding to the barcode you want to study, dereplicate them to keep only the unique amplicons, with their abundance stored in the header:
......@@ -164,7 +164,7 @@ obigrep -l 20 -p 'count>=10' Aquarium_2.uniq.fasta > Aquarium_2.grep.fasta
<a name="step5"></a>
## STEP 5 : Elimination of PCR errors
## STEP 5 : Elimination of PCR errors (OBITools)
_obiclean_ is a command that eliminates point errors introduced during PCR. The algorithm performs pairwise alignments of all the amplicons. For each pair, it counts the number of differences and computes the ratio between the abundances of the two amplicons. If there is only one difference (the default, which can be changed) and the ratio is lower than a chosen threshold, the less abundant amplicon is considered a variant of the more abundant one.
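
The decision rule described above can be sketched in a few lines of shell. This is only an illustration, not the actual _obiclean_ implementation: `is_variant` is a hypothetical helper, and the counts, number of differences and threshold are passed by hand instead of being computed from alignments.

```shell
# Hypothetical helper (not part of OBITools).
# Arguments: abundant_count rare_count differences ratio_threshold [max_differences]
# Succeeds when the rarer amplicon would be treated as a variant of the abundant one.
is_variant() {
    awk -v big="$1" -v small="$2" -v diff="$3" -v r="$4" -v d="${5:-1}" \
        'BEGIN { exit !(diff <= d && small / big < r) }'
}

# 30/1000 = 0.03 is below 0.05 with a single difference
if is_variant 1000 30 1 0.05; then echo "variant"; else echo "not a variant"; fi
# 80/1000 = 0.08 is above 0.05
if is_variant 1000 80 1 0.05; then echo "variant"; else echo "not a variant"; fi
```

The first call prints `variant` and the second prints `not a variant`. The `0.05` value mirrors the `-r 0.05` used in the _obiclean_ command of this step.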
......@@ -176,7 +176,7 @@ obiclean -r 0.05 -H Aquarium_2.grep.fasta > Aquarium_2.clean.fasta
<a name="step6"></a>
## STEP 6 : Taxonomic assignment
## STEP 6 : Taxonomic assignment (OBITools)
_ecotag_ is the command that assigns each head amplicon to its corresponding taxon. The algorithm compares the amplicons with the sequences of the reference database. If the similarity score is higher than the chosen threshold, the amplicon is assigned to its "taxid" using the taxonomy database.
......@@ -191,14 +191,15 @@ obiannotate -k count Aquarium_2.tag.fasta > Aquarium_2.tag_1.fasta
<a name="step7"></a>
## STEP 7 : Gathering in OTU
## STEP 7 : Gathering in OTU (swarm)
swarm -z -d 1 -o stats_Aq2.txt -w pipeline3_Aq2.fasta < pipeline1_Aq2.tag_1.fasta
# "-z" accepts the abundance in the header, provided that the header contains no space and that the value is preceded by "size="
# "-d" is the maximum number of differences tolerated between 2 sequences for them to be gathered in the same OTU
# "-o" writes a ".txt" file in which each line corresponds to an OTU and lists all the amplicons belonging to it
# "-w" writes a "fasta" file with the representative sequence of each OTU
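
Since "-z" expects a header without spaces and an abundance preceded by "size=", headers carrying the abundance as a `count=` attribute may need rewriting first. Below is a minimal sketch, assuming headers of the form `>id count=N; ...`. `to_swarm_header` is a hypothetical helper, not an OBITools or swarm command, and defaulting to an abundance of 1 when no `count=` is found is an assumption.

```shell
# Hypothetical helper: rewrite ">id count=N; ..." headers as ">id;size=N".
# Sequence lines are passed through unchanged.
to_swarm_header() {
    awk '/^>/ {
             id = substr($1, 2)          # identifier without the leading ">"
             n = 1                       # assumed default when no count is present
             if (match($0, / count=[0-9]+/))
                 n = substr($0, RSTART + 7, RLENGTH - 7)
             print ">" id ";size=" n
             next
         }
         { print }'
}

printf '>seq1 count=12; taxid=7742;\nACGT\n' | to_swarm_header
```

On this example input, the header becomes `>seq1;size=12` and the sequence line `ACGT` is unchanged. Any other attribute in the header is dropped, so keep a copy of the annotated file if you need them later.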
<a name="step8"></a>
## STEP 8 : Analyse your results