snakemake obitools workflow to process rapidrun eDNA data
# Prerequisites
* python3
* snakemake
* singularity
Python3 dependencies:
```
pip3 install pandas
pip3 install biopython
```
Additional Python3 dependencies to run `snakemake`:
```
pip3 install datrie
pip3 install ConfigArgParse
pip3 install appdirs
pip3 install gitdb2
```
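A quick way to check that the prerequisites are available on your system (a minimal sketch; adjust to your environment):
```
python3 --version
snakemake --version
singularity --version
```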
# Configuration / input files
You have to provide 2 files:
* `RAPIDRUN_METADATA` *e.g.* [all_samples.tsv](resources/test/all_samples.tsv)
* `CONFIG_FILE` *e.g.* [config/config.yaml](config/)
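The snakemake commands in the workflow below also refer to two shell variables, `CONFIGFILE` and `CORES`. A minimal sketch of defining them (the path and core count are illustrative):
```
# path of the configuration file, relative to the repository root (illustrative)
CONFIGFILE="config/config.yaml"
# number of CPU cores given to snakemake (illustrative)
CORES=16
```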
# The Workflow
## 1 Set rapidrun
Indicate the absolute path of the rapidrun metadata table [all_samples.tsv](resources/test/all_samples.tsv) in the `rapidrun` field of the `fichiers` section of [config/config.yaml](config/).
```
cd 01_settings
snakemake --configfile "../"$CONFIGFILE -s readwrite_rapidrun_demultiplexing.py --cores $CORES
cd ..
```
This command reads the configuration file and the `RAPIDRUN_METADATA` table, then writes [results/01_settings/all_demultiplex.csv](results/01_settings/), which is used to process the rapidrun data in the next steps of the workflow.
[results/01_settings/all_demultiplex.csv](results/01_settings/) has 14 fields: *demultiplex,projet,marker,run,plaque,sample,barcode5,barcode3,primer5,primer3,min_f,min_r,lenBarcode5,lenBarcode3*. Each row is a unique sample belonging to a unique marker and project.
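A quick way to inspect this table (a sketch assuming comma-separated values in the field order listed above):
```
# preview the demultiplexing table
head -n 3 results/01_settings/all_demultiplex.csv
# count samples per projet/marker pair (fields 2 and 3)
awk -F',' 'NR > 1 {print $2 "," $3}' results/01_settings/all_demultiplex.csv | sort | uniq -c
```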
## 2 Assembly
```
cd 02_assembly
snakemake --configfile "../"$CONFIGFILE -s Snakefile --cores $CORES --use-singularity --singularity-args "--bind /media/superdisk:/media/superdisk --home $HOME" --latency-wait 20
cd ..
```
Merge paired-end sequences and remove unaligned sequence records. Results are stored in [results/02_assembly](results/02_assembly).
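Before launching this step (or any of the following ones), you can preview the scheduled jobs with a snakemake dry run by adding the `-n` flag; a sketch for the assembly step:
```
cd 02_assembly
# dry run: list the jobs that would be executed, without running them
snakemake -n --configfile "../"$CONFIGFILE -s Snakefile --cores $CORES
cd ..
```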
## 3 Demultiplexing
```
cd 03_demultiplex
snakemake --configfile "../"$CONFIGFILE -s Snakefile --cores $CORES --use-singularity --singularity-args "--bind /media/superdisk:/media/superdisk --home $HOME" --latency-wait 20
cd ..
```
Tags are short, sample-specific sequences added to the 5’ end of each primer to distinguish the different samples.
Check the demultiplexing results in [results/03_demultiplex](results/03_demultiplex).
## 4 Filter samples
```
cd 04_filter_samples
snakemake --configfile "../"$CONFIGFILE -s Snakefile --cores $CORES --use-singularity --singularity-args "--bind /media/superdisk:/media/superdisk --home $HOME" --latency-wait 20
cd ..
```
For each sample: dereplicate sequences, filter sequences by length, remove sequences with IUPAC ambiguities, remove PCR clones, and remove sequences classified as 'internal' by `obiclean`.
Check the results in [results/04_filter_samples](results/04_filter_samples).
## 5 Concatenate samples into runs
```
# walk through results/04_filter_samples/04_filtered/<projet>/<marker>/<run>/
for projet in `ls results/04_filter_samples/04_filtered/`; do
    for marker in `ls results/04_filter_samples/04_filtered/${projet}/`; do
        for run in `ls results/04_filter_samples/04_filtered/${projet}/${marker}/`; do
            echo results/04_filter_samples/04_filtered/${projet}/${marker}/${run}
            # concatenate the filtered sample files of this run into a single fasta
            mkdir -p results/05_assignment/01_runs/${projet}/${marker}/
            cat results/04_filter_samples/04_filtered/${projet}/${marker}/${run}/*.c.r.l.u.fasta > results/05_assignment/01_runs/${projet}/${marker}/${run}.fasta
        done
    done
done
```
Sample files are concatenated by run into [results/05_assignment/01_runs](results/05_assignment/01_runs).
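To check the concatenation, you can count the sequences in each run file, for example:
```
# count sequences per concatenated run fasta
grep -c ">" results/05_assignment/01_runs/*/*/*.fasta
```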
## 6 Assignment
```
cd 05_assignment
snakemake --configfile "../"$CONFIGFILE -s Snakefile --cores $CORES --use-singularity --singularity-args "--bind /media/superdisk:/media/superdisk --home $HOME" --latency-wait 20
cd ..
```
Assign each sequence to a taxon and format the output into CSV tables stored in [results/06_final_tables](results/06_final_tables).
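A quick way to locate the final tables (the layout inside the folder may vary by projet and marker):
```
# list the csv tables produced by the assignment step
find results/06_final_tables/ -name "*.csv"
```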
Visit the project [wiki](https://gitlab.mbb.univ-montp2.fr/edna/snakemake_rapidrun_obitools/-/wikis/home) for documentation.