# snakemake_rapidrun_swarm

OTU clustering based on [TARA Fred's metabarcoding pipeline](https://github.com/frederic-mahe/swarm/wiki/Fred%27s-metabarcoding-pipeline), applied to RAPIDRUN data and managed with [Snakemake](https://snakemake.readthedocs.io/en/stable/).

# Installation

## Prerequisites

* Linux system
* [python3](https://www.python.org/)
* [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/)
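
Before installing, it can help to confirm that the prerequisites are reachable from your shell. The snippet below is a minimal sketch, not part of this repository: `missing_tools` is a hypothetical helper name, and it only checks that each command is on your `PATH`, not which version is installed.

```shell
# Hypothetical helper (not shipped with this project):
# print the name of every given command that is not found on PATH.
# Returns 0 if all commands are available, 1 otherwise.
missing_tools() {
    local tool status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return "$status"
}

# Example: check the prerequisites listed above.
# missing_tools python3 conda
```

If the call prints nothing, both commands were found.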

## Installation via Conda

The default conda solver is a bit slow and sometimes has issues with selecting the latest package releases. Therefore, we recommend installing Mamba as a drop-in replacement with
```
conda install -c conda-forge mamba
```

Then you can install Snakemake, pandas, Biopython and their dependencies with
```
mamba create -n snakemake_rapidrun -c conda-forge -c bioconda snakemake biopython pandas
```

The packages are installed from the conda-forge and bioconda channels into an isolated software environment, which has to be activated with
```
conda activate snakemake_rapidrun
```


# Get started

* Open a shell
* Clone the project and switch to its main folder, which is your working directory

```
git clone https://gitlab.mbb.univ-montp2.fr/edna/snakemake_rapidrun_swarm
cd snakemake_rapidrun_swarm
```

* Activate the conda environment to access the required dependencies

```
conda activate snakemake_rapidrun
```

You are ready to run the analysis!


## Download data


The complete data set can be downloaded and extracted into the [resources/tutorial](https://gitlab.mbb.univ-montp2.fr/edna/snakemake_rapidrun_obitools/-/tree/master/resources/tutorial) folder with the following command:

```
wget -c https://gitlab.mbb.univ-montp2.fr/edna/tutorial_metabarcoding_data/-/raw/master/tutorial_rapidrun_data.tar.gz -O - | tar -xz -C ./resources/tutorial/
```
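
The command above streams the archive straight into `tar`, so an interrupted download has to start over, and `tar -C` fails if the target folder does not exist. As an alternative, here is a minimal two-step sketch that keeps the archive on disk, so that `wget -c` can resume it. The function names `extract_into` and `fetch_tutorial_data` are hypothetical and not part of this repository; the sketch assumes it is run from the working directory.

```shell
# Same archive URL as in the command above.
DATA_URL="https://gitlab.mbb.univ-montp2.fr/edna/tutorial_metabarcoding_data/-/raw/master/tutorial_rapidrun_data.tar.gz"

# Hypothetical helper: extract a gzipped tar archive into a target
# directory, creating the directory first if it does not exist.
extract_into() {
    local archive="$1" target="$2"
    mkdir -p "$target" && tar -xzf "$archive" -C "$target"
}

# Hypothetical helper: download the archive into the current directory
# (resumable with -c because the file is kept on disk), then extract it.
fetch_tutorial_data() {
    wget -c "$DATA_URL" &&
        extract_into "$(basename "$DATA_URL")" ./resources/tutorial/
}

# Example (requires network access):
# fetch_tutorial_data
```

Once the extraction has succeeded, the downloaded `tutorial_rapidrun_data.tar.gz` file can be deleted.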

# Run the workflow

Run the following command to process the data (estimated time: 20 minutes):

```
bash main.sh config/config_tutorial.yaml 8
```


* This will generate OTU occurrence tables in [results/06_assignment/04_table](https://gitlab.mbb.univ-montp2.fr/edna/snakemake_rapidrun_swarm/-/tree/master/results/06_assignment/04_table)
* The first argument, [config/config_tutorial.yaml](https://gitlab.mbb.univ-montp2.fr/edna/snakemake_rapidrun_swarm/-/blob/master/config/config_tutorial.yaml), contains the mandatory parameters
* The second argument, **8**, is the number of CPU cores the workflow is allowed to use
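
Because the two arguments are positional, a typo in the config path or the core count is easy to miss. The following is a minimal pre-flight sketch, assuming only what is described above (a config file path followed by a core count). `check_run_args` is a hypothetical helper, not part of this repository, and it does not inspect what `main.sh` does internally.

```shell
# Hypothetical pre-flight check (not shipped with this project):
# succeeds only if the config file exists and the core count
# is a positive integer.
check_run_args() {
    local config="$1" cores="$2"
    if [ ! -f "$config" ]; then
        echo "config file not found: $config" >&2
        return 1
    fi
    case "$cores" in
        ''|*[!0-9]*)
            # Empty string or any non-digit character.
            echo "core count must be a positive integer: $cores" >&2
            return 1
            ;;
    esac
    if [ "$cores" -le 0 ]; then
        echo "core count must be greater than zero: $cores" >&2
        return 1
    fi
    return 0
}

# Example, from the working directory:
# check_run_args config/config_tutorial.yaml 8 && bash main.sh config/config_tutorial.yaml 8
```

On Linux, `nproc` prints the number of cores available, which can help in choosing the second argument.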

# To go further

Please check the [wiki](https://gitlab.mbb.univ-montp2.fr/edna/snakemake_rapidrun_swarm/-/wikis/home).