# snakemake_rapidrun_swarm

OTU clustering with SWARM on RAPIDRUN data, encapsulated in SNAKEMAKE

# Prerequisites

* python3
* snakemake
* singularity

python3 dependencies:

```
pip3 install pandas
pip3 install biopython
```

python3 dependencies to run `snakemake`:

```
pip3 install datrie
pip3 install ConfigArgParse
pip3 install appdirs
pip3 install gitdb2
```

# Configuration / input files

You have to set up 2 files:

* [01_infos/all_samples.tsv](01_infos/all_samples.tsv)
* [01_infos/config.yaml](01_infos/config_test.yaml)

# Run the workflow

```
CORES=32
CONFIGFILE="01_infos/config_test.yaml"
bash main.sh $CORES $CONFIGFILE
```

# Run from scratch

## clone repositories

```
git clone git@gitlab.mbb.univ-montp2.fr:edna/snakemake_rapidrun_swarm.git
cd snakemake_rapidrun_swarm
```

## write demultiplexing table

From

- `01_infos/config_test.yaml`
    * `fichiers:` `rapidrun`
    * `fichiers:` `dat`
    * `blacklist:`

* :warning: the values of the column "marker" in the `fichiers:` `rapidrun` file must match the marker keys of `fichiers:` `dat` in `01_infos/config_test.yaml`
* This will generate a file `01_infos/all_demultiplex.csv` in which each line holds the arguments of one command to run in parallel.
* `blacklist:` `projet:` contains a list of projects you don't want to process
* `blacklist:` `run:` contains a list of runs you don't want to process

```
snakemake --configfile 01_infos/config_test.yaml -s readwrite_rapidrun_demultiplexing.py
```

## merge fastq

From

- `01_infos/config_test.yaml`
    * `fichiers:` `rapidrun`
    * `blacklist:`

* Deduces the paired-end fastq files to merge for each `{run}`
* `blacklist:` `projet:` contains a list of projects you don't want to process
* `blacklist:` `run:` contains a list of runs you don't want to process

```
cd 02_assembly
snakemake --configfile $CONFIGFILE -s Snakefile -j $CORES --use-singularity --singularity-args "--bind /media/superdisk:/media/superdisk" --latency-wait 20
cd ..
```

## demultiplexing

From

- `01_infos/config_test.yaml`
    * `fichiers:` `rapidrun`

* will generate demultiplexed .fasta files for each `{projet}`/`{marker}`/`{sample}` in `03_demultiplexing`

```
cd 03_demultiplexing
snakemake --configfile $CONFIGFILE -s Snakefile -j $CORES --use-singularity --singularity-args "--bind /media/superdisk:/media/superdisk" --latency-wait 20
cd ..
```

## cat qualities

From

- `01_infos/config_test.yaml`
    * `fichiers:` `rapidrun`

* will concatenate and format .qual files by `{projet}`/`{marker}`

```
cd 04_cat_quality
snakemake --configfile $CONFIGFILE -s Snakefile -j $CORES --use-singularity --singularity-args "--bind /media/superdisk:/media/superdisk" --latency-wait 20
cd ..
```

## clustering

...

```
cd 05_clustering
snakemake --configfile $CONFIGFILE -s Snakefile -j $CORES --use-singularity --singularity-args "--bind /media/superdisk:/media/superdisk" --latency-wait 20
cd ..
```
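
The "Run from scratch" steps all follow the same pattern: enter a step directory, run `snakemake` with the same options, and come back. The sketch below chains them from Python. It is only an illustration and not the content of `main.sh`, which is not shown here. The names `plan`, `run_all`, `STEP_DIRS` and `BIND` are hypothetical, and resolving the config file to an absolute path is an assumption made so that the path stays valid inside each step directory.

```python
import shlex
import subprocess
from pathlib import Path

# Step directories, in the order listed in "Run from scratch".
STEP_DIRS = ["02_assembly", "03_demultiplexing", "04_cat_quality", "05_clustering"]

# Bind mount copied from the commands above; adjust it to your storage layout.
BIND = "/media/superdisk:/media/superdisk"


def plan(configfile, cores):
    """Return (working_directory, command) pairs in execution order."""
    # Assumption: use an absolute path so it remains valid in every step directory.
    config = str(Path(configfile).resolve())

    # First step: write the demultiplexing table from the repository root.
    steps = [
        (".", ["snakemake", "--configfile", config,
               "-s", "readwrite_rapidrun_demultiplexing.py"]),
    ]

    # Remaining steps: the same snakemake call inside each step directory.
    for step_dir in STEP_DIRS:
        steps.append((step_dir, [
            "snakemake",
            "--configfile", config,
            "-s", "Snakefile",
            "-j", str(cores),
            "--use-singularity",
            "--singularity-args", f"--bind {BIND}",
            "--latency-wait", "20",
        ]))
    return steps


def run_all(configfile, cores, dry_run=True):
    """Print each command; execute it as well when dry_run is False."""
    for cwd, cmd in plan(configfile, cores):
        print(f"[{cwd}] {shlex.join(cmd)}")
        if not dry_run:
            # check=True stops the chain at the first failing step.
            subprocess.run(cmd, cwd=cwd, check=True)
```

Called from the repository root, `run_all("01_infos/config_test.yaml", 32)` only prints the commands. Passing `dry_run=False` executes them in order and stops at the first step that fails.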