From RAPIDRUN FASTQ to OTU table
You can browse the different steps using the sidebar on the right.
Installation
Prerequisites
Installation via Conda
The default conda solver is slow and sometimes struggles to select the latest package releases. Therefore, we recommend installing Mamba as a drop-in replacement via
conda install -c conda-forge mamba
Then, you can install Snakemake, pandas, Biopython and their dependencies with
mamba create -n snakemake_rapidrun -c conda-forge -c bioconda snakemake biopython pandas
from the conda-forge and bioconda channels. This installs all required software into an isolated environment, which must be activated with
conda activate snakemake_rapidrun
Get started
- Open a shell
- Clone the project and switch to its main folder; this is your working directory
git clone https://gitlab.mbb.univ-montp2.fr/edna/snakemake_rapidrun_swarm
cd snakemake_rapidrun_swarm
- Activate the conda environment to access the required dependencies
conda activate snakemake_rapidrun
You are ready to run the analysis!
Download example data
curl -JLO http://gitlab.mbb.univ-montp2.fr/edna/snakemake_rapidrun_data_test/-/raw/master/test_rapidrun_data.tar.gz; tar zxf test_rapidrun_data.tar.gz -C resources/test/
- The data is downloaded to resources/test/test_rapidrun_data
- This folder contains a reference database for four markers (Teleo01, Mamm01, Vert01, Chond01), raw NGS metabarcoding data, and the metadata required for demultiplexing RAPIDRUN-format runs
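One pitfall in the download command above: tar's -C option extracts into a directory that must already exist. A minimal, self-contained sketch of the pattern, using a throwaway archive and directory names (demo.tar.gz, demo_src/, demo_out/ are illustrative, not part of the workflow):

```shell
# Build a throwaway archive to demonstrate extraction with -C.
mkdir -p demo_src
echo "hello" > demo_src/file.txt
tar czf demo.tar.gz -C demo_src file.txt

# -C fails if the target directory is missing, so create it first,
# just as you would `mkdir -p resources/test/` before extracting the real data.
mkdir -p demo_out
tar xzf demo.tar.gz -C demo_out
cat demo_out/file.txt   # -> hello
```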
Run the workflow
Simply type the following command to process the example data:
snakemake --configfile config/config_test_rapidrun.yaml --cores 4 --use-conda
- The first argument, --configfile config/config_test_rapidrun.yaml, points to the configuration file with the parameters to apply
- The second argument, --cores 4, sets the maximum number of CPU cores the workflow is allowed to use
- The third argument, --use-conda, is only necessary if you installed the program and its dependencies with conda
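Before committing to a full run, Snakemake's standard -n/--dry-run flag previews the scheduled jobs, and the core count can be detected from the machine instead of hard-coded. A small sketch (the dry-run line is left commented out, since it requires the activated environment):

```shell
# Detect the available CPU cores instead of hard-coding 4.
CORES=$(nproc)
echo "cores: $CORES"

# Preview the jobs without executing them (-n / --dry-run is a standard
# Snakemake flag); run this once the conda environment is activated:
# snakemake --configfile config/config_test_rapidrun.yaml --cores "$CORES" --use-conda -n
```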
MBB cluster
Clone the workflow
git clone https://gitlab.mbb.univ-montp2.fr/edna/snakemake_rapidrun_swarm.git
cd snakemake_rapidrun_swarm
Copy the test data (resources/test) to the cluster
scp -r resources/test/ peguerin@162.38.181.66:~/snakemake_rapidrun_swarm/resources
Copy the Singularity image (sedna.simg) to the cluster
scp -r sedna.simg peguerin@162.38.181.66:~/snakemake_rapidrun_swarm/
Load the Singularity module
module load singularity/3.5.3
Run the workflow locally
singularity exec sedna.simg snakemake --use-conda --configfile config/config_test_rapidrun.yaml -j 4
Run the workflow through SGE
singularity exec sedna.simg snakemake --use-conda --cluster '/opt/gridengine/bin/linux-x64/qsub -q cemeb20.q -N testsmk -j y -pe robin 16' --configfile config/config_test_rapidrun.yaml -j 16
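Rather than retyping the long SGE invocation each time, you can save it to a small wrapper script. A sketch, assuming the file name run_sge.sh (hypothetical) and reusing the queue, parallel environment and paths from the command above:

```shell
# Write the SGE launch command into a reusable wrapper script
# (run_sge.sh is an assumed name; all options are copied from above).
cat > run_sge.sh <<'EOF'
#!/bin/bash
singularity exec sedna.simg snakemake --use-conda \
  --cluster '/opt/gridengine/bin/linux-x64/qsub -q cemeb20.q -N testsmk -j y -pe robin 16' \
  --configfile config/config_test_rapidrun.yaml -j 16
EOF
chmod +x run_sge.sh
head -n 1 run_sge.sh   # -> #!/bin/bash
```

Launch it on the cluster with ./run_sge.sh once the Singularity module is loaded.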