You can browse the different steps using the sidebar on the right.
# Installation
## Prerequisites
* Linux system
* [python3](https://www.python.org/)
* [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/)
## Installation via Conda
The default conda solver is slow and sometimes has trouble selecting the latest package releases, so we recommend installing Mamba as a drop-in replacement:
```
conda install -c conda-forge mamba
```
Then, you can install Snakemake, pandas, Biopython, and their dependencies with
```
mamba create -n snakemake_rapidrun -c conda-forge -c bioconda snakemake biopython pandas
```
from the conda-forge and bioconda channels. This installs all required software into an isolated environment, which must be activated with
```
conda activate snakemake_rapidrun
```
# Get started
* Open a shell
* Clone the project and switch into its folder, which will be your working directory
```
git clone https://gitlab.mbb.univ-montp2.fr/edna/snakemake_rapidrun_swarm
cd snakemake_rapidrun_swarm
```
* Activate the conda environment to access the required dependencies
```
conda activate snakemake_rapidrun
```
You are ready to run the analysis!
## Download data
The complete data set can be downloaded and extracted into the [resources/tutorial](https://gitlab.mbb.univ-montp2.fr/edna/snakemake_rapidrun_swarm/-/tree/master/resources/tutorial) folder with the following command:
```
wget -c https://gitlab.mbb.univ-montp2.fr/edna/tutorial_metabarcoding_data/-/raw/master/tutorial_rapidrun_data.tar.gz -O - | tar -xz -C ./resources/tutorial/
```
* The data are downloaded into [resources/tutorial](https://gitlab.mbb.univ-montp2.fr/edna/snakemake_rapidrun_swarm/-/tree/master/resources/tutorial)
* This is a tiny subset of a real metabarcoding analysis in rapidrun format
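The download command streams the archive straight into `tar` (`wget -O -` writes to stdout), which extracts it into the directory given by `-C` without keeping a local copy of the tarball. A self-contained sketch of the same stream-and-extract pattern, using a throwaway archive instead of the real download:

```shell
# Demo of the stream-extract pattern used above, with throwaway files
# (not the real tutorial data).
mkdir -p demo_src demo_dest
echo "ACGT" > demo_src/reads.txt
tar -cz -C demo_src reads.txt > demo.tar.gz
# Stand-in for: wget -O - <url> | tar -xz -C <dir>
cat demo.tar.gz | tar -xz -C demo_dest
cat demo_dest/reads.txt   # → ACGT
rm -r demo_src demo_dest demo.tar.gz
```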
# Run the workflow
Simply type the following command to process the data (estimated run time: about 25 minutes):
```
bash main.sh config/config_tutorial.yaml 8
```
* This will generate OTU occurrence tables in [results/06_assignment/04_table](https://gitlab.mbb.univ-montp2.fr/edna/snakemake_rapidrun_swarm/-/tree/master/results/06_assignment/04_table)
* The first argument, [config/config_tutorial.yaml](https://gitlab.mbb.univ-montp2.fr/edna/snakemake_rapidrun_swarm/-/blob/master/config/config_tutorial.yaml), contains the mandatory workflow parameters
* The second argument, **8**, is the number of CPU cores the workflow is allowed to use
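The contents of `main.sh` are not reproduced here, but a wrapper of this shape (hypothetical; `run_workflow` and the exact Snakemake flags are assumptions, not the script's actual code) shows how the two arguments would typically be forwarded to Snakemake:

```shell
# Hypothetical sketch of a main.sh-style wrapper; the real script may differ.
run_workflow() {
    config="${1:?usage: run_workflow <config.yaml> <cores>}"  # 1st arg: config file
    cores="${2:-4}"                                           # 2nd arg: CPU cores
    # Echo the command instead of executing it, so the sketch is side-effect free.
    echo "snakemake --use-conda --configfile $config --cores $cores"
}
run_workflow config/config_tutorial.yaml 8
# → snakemake --use-conda --configfile config/config_tutorial.yaml --cores 8
```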
# To go further
Please check the [wiki](https://gitlab.mbb.univ-montp2.fr/edna/snakemake_rapidrun_swarm/-/wikis/home).
# Cluster MBB
Clone the workflow:
```
git clone https://gitlab.mbb.univ-montp2.fr/edna/snakemake_rapidrun_swarm.git
cd snakemake_rapidrun_swarm
```
Upload the test data (`resources/test`) to the cluster:
```
scp -r resources/test/ peguerin@162.38.181.66:~/snakemake_rapidrun_swarm/resources
```
Upload the Singularity image (`sedna.simg`) to the cluster:
```
scp -r sedna.simg peguerin@162.38.181.66:~/snakemake_rapidrun_swarm/
```
Load the Singularity module:
```
module load singularity/3.5.3
```
Run the workflow locally:
```
singularity exec sedna.simg snakemake --use-conda --configfile config/config_test_rapidrun.yaml -j 4
```
Run the workflow through SGE:
```
singularity exec sedna.simg snakemake --use-conda --cluster '/opt/gridengine/bin/linux-x64/qsub -q cemeb20.q -N testsmk -j y -pe robin 16' --configfile config/config_test_rapidrun.yaml -j 16
```
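The SGE command above hard-codes the core count in two places: the `-pe robin` slot request inside the `qsub` string and Snakemake's `-j`. A small helper (hypothetical; the function name is an assumption) that assembles the command from a single core count keeps the two values in sync:

```shell
# Hypothetical helper assembling the SGE-mode command shown above,
# so -pe and -j always receive the same core count.
build_sge_cmd() {
    cores="${1:-16}"
    qsub="/opt/gridengine/bin/linux-x64/qsub -q cemeb20.q -N testsmk -j y -pe robin $cores"
    echo "singularity exec sedna.simg snakemake --use-conda --cluster '$qsub' --configfile config/config_test_rapidrun.yaml -j $cores"
}
build_sge_cmd 16
```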