# snakemake_rapidrun_swarm

OTU clustering with SWARM on RAPIDRUN data encapsulated in SNAKEMAKE

# Prerequisites

* python3
* snakemake
* singularity

python3 dependencies:

```
pip3 install pandas
pip3 install biopython
```

python3 dependencies to run `snakemake`:

```
pip3 install datrie
pip3 install ConfigArgParse
pip3 install appdirs
pip3 install gitdb2
```

# Configuration / input files

You need to set up two files:

* [01_infos/all_samples.tsv](01_infos/all_samples.tsv)
* [01_infos/config_test.yaml](01_infos/config_test.yaml)

# Run the workflow

```
CORES=32
CONFIGFILE="../01_infos/config_test.yaml"
bash main.sh $CORES $CONFIGFILE
```

# Run from scratch

## clone the repository

```
git clone git@gitlab.mbb.univ-montp2.fr:edna/snakemake_rapidrun_swarm.git
cd snakemake_rapidrun_swarm
```

## write demultiplexing table

From

- `01_infos/config_test.yaml`
    * `fichiers:` `rapidrun`
    * `fichiers:` `dat`

:warning: The marker names in the `marker` column of `fichiers:` `rapidrun` must match the marker keys of `fichiers:` `dat` in `01_infos/config_test.yaml`.

This step generates the file `01_infos/all_demultiplex.csv`, in which each line holds the arguments for one command.

```
snakemake --configfile 01_infos/config_test.yaml -s readwrite_rapidrun_demultiplexing.py
```

## merge fastq

From

- `01_infos/config_test.yaml`
    * `fichiers:` `rapidrun`

Deduces which paired-end fastq files of each `{run}` to merge.

:warning: guadeloupe has been removed by hand in `02_assembly/Snakefile`!

```
cd 02_assembly
snakemake --configfile $CONFIGFILE -s Snakefile -j $CORES --use-singularity --singularity-args "--bind /media/superdisk:/media/superdisk" --latency-wait 20
cd ..
```

## demultiplexing

From

- `01_infos/config_test.yaml`
    * `fichiers:` `rapidrun`

Generates demultiplexed `.fasta` and `.qual` files for each `{projet}`/`{marker}`/`{sample}` in `03_demultiplexing`.

```
cd 03_demultiplexing
snakemake --configfile $CONFIGFILE -s Snakefile -j $CORES --use-singularity --singularity-args "--bind /media/superdisk:/media/superdisk" --latency-wait 20
cd ..
```

## cat qualities

From

- `01_infos/config_test.yaml`
    * `fichiers:` `rapidrun`

Concatenates and formats the `.qual` files by `{projet}`/`{marker}`.

```
cd 04_cat_quality
snakemake --configfile $CONFIGFILE -s Snakefile -j $CORES --use-singularity --singularity-args "--bind /media/superdisk:/media/superdisk" --latency-wait 20
cd ..
```

## clustering

```
cd 05_clustering
snakemake --configfile $CONFIGFILE -s Snakefile -j $CORES --use-singularity --singularity-args "--bind /media/superdisk:/media/superdisk" --latency-wait 20
cd ..
```
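
## run the four steps in a loop (sketch)

The merge, demultiplexing, cat qualities and clustering steps above all run the same `snakemake` command from inside their own directory. The sketch below loops over those four directories. It is not the content of `main.sh`, which is not shown here. The names `BIND`, `EXECUTE`, `STEPS`, `step_command` and `run_step` are assumptions made for illustration. By default it only prints the commands; set `EXECUTE=1` to run them. It does not cover the "write demultiplexing table" step, which is run from the repository root with its own Snakefile.

```shell
#!/usr/bin/env sh
# Minimal sketch, assuming it is started from the repository root.
set -eu

CORES="${CORES:-32}"
# Relative to the step directories, hence the leading "../".
CONFIGFILE="${CONFIGFILE:-../01_infos/config_test.yaml}"
# Bind mount passed to singularity (hypothetical variable, value from the README).
BIND="${BIND:-/media/superdisk:/media/superdisk}"
# Hypothetical switch: 0 prints the commands, 1 executes them.
EXECUTE="${EXECUTE:-0}"

STEPS="02_assembly 03_demultiplexing 04_cat_quality 05_clustering"

# Print the snakemake command shared by all four steps.
step_command() {
  printf 'snakemake --configfile %s -s Snakefile -j %s --use-singularity --singularity-args "--bind %s" --latency-wait 20\n' \
    "$CONFIGFILE" "$CORES" "$BIND"
}

# Run the shared command inside one step directory (subshell keeps the cwd).
run_step() {
  (cd "$1" && snakemake --configfile "$CONFIGFILE" -s Snakefile -j "$CORES" \
      --use-singularity --singularity-args "--bind $BIND" --latency-wait 20)
}

for step in $STEPS; do
  if [ "$EXECUTE" = "1" ]; then
    run_step "$step"
  else
    echo "cd $step && $(step_command) && cd .."
  fi
done
```

Saved under a hypothetical name such as `run_steps.sh`, it could be called with `EXECUTE=1 CORES=8 sh run_steps.sh`. With `set -e`, the loop stops at the first step that fails, so later steps never run on incomplete outputs.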