# STACKS2 using SNAKEMAKE

RADseq workflow using [STACKS2](http://creskolab.uoregon.edu/stacks/).

It was designed to process RADseq data from the [RESERVEBENEFIT](https://www.biodiversa.org/1023) project.

# Table of contents

1. [Introduction](#1-introduction)
2. [Installation](#2-installation)
    1. [Prerequisite](#21-prerequisite)
    2. [Data Files](#22-data-files)
    3. [Set up](#23-set-up)
3. [Reporting bugs](#3-reporting-bugs)
4. [Running the pipeline](#4-running-the-pipeline)
    1. [Initialisation](#41-initialisation)
    2. [Configuration](#42-configuration)
    3. [Run the pipeline into a single command](#43-run-the-pipeline-into-a-single-command)
    4. [Run the pipeline step by step](#44-run-the-pipeline-step-by-step)

# 1. Introduction

WORK IN PROGRESS !!!!

# 2. Installation

## 2.1 Prerequisite

You must install the following software and packages:

- [SNAKEMAKE 5.3.0](https://snakemake.readthedocs.io/en/stable/getting_started/installation.html)
  * Check the version and that the program is correctly installed by typing:
    ```
    snakemake --version
    ## should give you the output
    5.3.0
    ```

- [STACKS 2.0b](http://catchenlab.life.illinois.edu/stacks/)
  * Check the version and that the programs are correctly installed by typing:
    ```
    process_radtags --version
    clone_filter --version
    gstacks --version
    populations --version
    ## should give you the output
    2.0b
    ```

- [BWA 0.7.17](https://icb.med.cornell.edu/wiki/index.php/Elementolab/BWA_tutorial)
  * Download `bwa` at: http://sourceforge.net/projects/bio-bwa/files/
    ```
    tar -xvf bwa-x.x.x.tar.bz2
    cd bwa-x.x.x
    make
    ## bwa ships no configure script and no install target:
    ## copy the binary to a directory in your PATH
    cp bwa /where/to/install/bin/
    ```
  * Check the version and that the program is correctly installed by typing:
    ```
    bwa
    ## should give you the output
    Program: bwa (alignment via Burrows-Wheeler transformation)
    Version: 0.7.17-r1188
    ...
    ```

- [SAMTOOLS 1.9](http://www.htslib.org/)
  * Download `htslib` and `samtools` at: http://www.htslib.org/download/
  * Building each desired package from source is very simple:
    ```
    cd htslib-1.x
    ./configure --prefix=/where/to/install
    make
    make install
    cd ..
    ## and similarly for samtools :
    cd samtools-1.x
    ./configure --prefix=/where/to/install
    make
    make install
    ```
  * Check the version and that the programs are correctly installed by typing:
    ```
    samtools --version
    ## should give you the output
    samtools 1.9
    Using htslib 1.9
    Copyright (C) 2018 Genome Research Ltd.
    ```

## 2.2 Data Files

The included data files are:

* [config.yaml](01-info_files/config.yaml)
* [barcodes.txt](01-info_files/barcodes.txt)
* [infos.csv](01-info_files)
* [populations_map.txt](01-info_files)

## 2.3 Set Up

Clone the project and switch to the main folder, which is your working directory:

```
git clone http://gitlab.mbb.univ-montp2.fr/reservebenefit/snakemake_stacks2.git
cd snakemake_stacks2
```

You will see the following folders :

# 3. Reporting bugs

If you're sure you've found a bug (for example, one of my programs crashes with an obscure error message, or the resulting file is missing part of the original data), then by all means submit a bug report.

I use [GitLab's issue system](https://gitlab.com/reservebenefit/snakemake_stacks2/issues) as my bug database. You can submit your bug reports there. Please be as verbose as possible: for example, include the exact command line you ran.

# 4. Running the pipeline

## 4.1 Initialisation

* Open a shell.
* Make a folder with a name of your choice (here it is named `workdir`):
```
mkdir workdir
cd workdir
```
* Clone the project and switch to the main folder, which is your working directory:
```
git clone http://gitlab.mbb.univ-montp2.fr/reservebenefit/snakemake_stacks2.git
cd snakemake_stacks2
```

## 4.2 Configuration

WORK IN PROGRESS !!!!

## 4.3 Run the pipeline into a single command

Once you have finished the [Initialisation](#41-initialisation) and [Configuration](#42-configuration) steps, you can run the whole pipeline by typing:

```
## number of CPU cores available for running the pipeline (for instance here 64 cores)
N_CORES=64
## run the pipeline into a single command
bash main.sh $N_CORES
```

## 4.4 Run the pipeline step by step

WORK IN PROGRESS !!!!

That's it! The pipeline is running and crunching your data. Look for the log folder and the output folder after the pipeline has finished.
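
Before launching `main.sh`, it can be useful to confirm that every program listed in the [Prerequisite](#21-prerequisite) section is reachable from your `PATH`. The snippet below is only a minimal sketch: `check_tools` and `REQUIRED_TOOLS` are hypothetical names chosen for illustration and are not part of the repository. It tests for the presence of each program only; it does not check versions.

```shell
## Hypothetical helper (not part of the repository): report which required
## programs cannot be found in PATH.

## programs named in the Prerequisite section
REQUIRED_TOOLS="snakemake process_radtags clone_filter gstacks populations bwa samtools"

## print "missing: <name>" for every program not found in PATH;
## return 0 if all were found, 1 otherwise
check_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            status=1
        fi
    done
    return $status
}
```

One possible way to use it, assuming the function has been pasted into your shell, is `check_tools $REQUIRED_TOOLS && bash main.sh "$N_CORES"`, so that the pipeline only starts when nothing is reported as missing.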