Commit 7a1f316f authored by peguerin

initial commit

Only_obitools pipeline using NEXTFLOW
=====================================
# Table of contents
1. [Introduction](#1-introduction)
2. [Installation](#2-installation)
    1. [Requirements](#21-requirements)
    2. [Optional components](#22-optional-components)
3. [Reporting bugs](#3-reporting-bugs)
4. [Running the pipeline](#4-running-the-pipeline)
-----------------
# 1. Introduction
Here, we reproduce the bioinformatics pipeline used by [SPYGEN](http://www.spygen.com/) to infer species presence in the environment from raw eDNA data. This pipeline is based on [OBItools](https://git.metabarcoding.org/obitools/obitools/wikis/home), a set of Python programs designed to analyse Next Generation Sequencer outputs (Illumina) in the context of DNA metabarcoding.
# 2. Installation
You will need to install the programs listed below.
## 2.1. Requirements
In order to run "only_obitools", you need a couple of programs. Most of
them should be available pre-compiled for your distribution. The
programs and libraries you absolutely need are:
- [OBItools](https://pythonhosted.org/OBITools/welcome.html#installing-the-obitools)
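
As a quick sanity check, the sketch below reports which OBItools commands are missing from your `PATH`. The command list is taken from the calls made in `scripts/step1.nf`; the helper name `missing_commands` is hypothetical and only serves this illustration.

```shell
#!/bin/sh
# Sketch: report which OBItools commands used by the pipeline are not on PATH.
# missing_commands is a hypothetical helper, not part of OBItools or nextflow.

# Print the names of the given commands that cannot be found, one per line.
missing_commands() {
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            printf '%s\n' "$cmd"
        fi
    done
}

# OBItools commands called by scripts/step1.nf
OBITOOLS_CMDS="illuminapairedend obigrep ngsfilter obisplit obiuniq"

# Unquoted on purpose so that the list splits into separate arguments.
missing=$(missing_commands $OBITOOLS_CMDS)
if [ -n "$missing" ]; then
    echo "Missing OBItools commands:" $missing
else
    echo "All OBItools commands found."
fi
```

If any command is reported as missing, install OBItools (or activate the environment it lives in) before running the pipeline.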
## 2.2. Optional components
Other libraries are optional and only limit the features that are
built. These include:
# 3. Reporting bugs
If you're sure you've found a bug (e.g. if one of my programs crashes
with an obscure error message, or if the resulting file is missing part
of the original data), then by all means submit a bug report.
I use [GitLab's issue system](https://gitlab.com/edna/nextflow_obitools/issues)
as my bug database. You can submit your bug reports there. Please be as
verbose as possible (e.g. include the command line you ran).
# 4. Running the pipeline
Quickstart
1. create a new folder for nextflow to work in
2. switch to this new folder
3. open a shell
4. type this command to download nextflow into this folder
```
curl -fsSL get.nextflow.io | bash
```
5. make sure that the programs listed in the Requirements section above are installed on your machine. After nextflow is downloaded, replace the `--datafolder` path in the following command with the path to your own data folder
6. run your command
```
./nextflow run scripts/step1.nf --datafolder 'path/to/fastq/and/dat/files'
```
That's it! The pipeline is now running and crunching your data. Once it has finished, look for `overview.txt` or `overview_new.txt` in your output folder.
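
The pipeline script reads paired files matching `*_R{1,2}.fastq.gz` and barcode files matching `*.dat` from the folder given with `--datafolder`. The sketch below checks that a folder follows this layout before you launch a run. The function name `check_datafolder` is hypothetical and not part of the pipeline.

```shell
#!/bin/sh
# Sketch: check that a data folder matches what scripts/step1.nf expects.
# check_datafolder is a hypothetical helper, not part of the pipeline.
# It succeeds if the folder holds at least one *.dat file and every
# *_R1.fastq.gz file has a matching *_R2.fastq.gz file.
check_datafolder() {
    folder=$1
    status=0

    found_r1=0
    for r1 in "$folder"/*_R1.fastq.gz; do
        # An unmatched glob stays literal, so skip names that do not exist.
        [ -e "$r1" ] || continue
        found_r1=1
        r2="${r1%_R1.fastq.gz}_R2.fastq.gz"
        if [ ! -e "$r2" ]; then
            echo "missing mate for $r1" >&2
            status=1
        fi
    done
    if [ "$found_r1" -eq 0 ]; then
        echo "no *_R1.fastq.gz file in $folder" >&2
        status=1
    fi

    found_dat=0
    for dat in "$folder"/*.dat; do
        if [ -e "$dat" ]; then
            found_dat=1
        fi
    done
    if [ "$found_dat" -eq 0 ]; then
        echo "no *.dat file in $folder" >&2
        status=1
    fi

    return "$status"
}

# Example: check_datafolder 'path/to/fastq/and/dat/files'
```

A non-zero return status means the folder is missing a read mate or a barcode file; the messages on standard error say which.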
params.str = 'Hello world!'
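// "$(pwd)" is shell syntax and is not valid inside a Groovy string; use the JVM working directory instead
params.workingfolder = System.getProperty('user.dir')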
params.datafolder="/media/superdisk/edna/donnees/rhone_test"
sequences= Channel.fromFilePairs(params.datafolder+"/*_R{1,2}.fastq.gz",flat:true)
barcodes=Channel.fromPath(params.datafolder+"/*.dat")
process illuminapairedend {
"""
[t=2h]paired end alignment then keep reads with quality > 40
"""
input:
set val(id), file(R1_fastq), file(R2_fastq) from sequences
output:
file fastqMerged into fastqMergeds
script:
"""
illuminapairedend -r $R2_fastq $R1_fastq --score-min=40 > fastqMerged
"""
}
process remove_unaligned {
"""
[t=1h]remove unaligned sequence records
"""
input:
file fastqMerged from fastqMergeds
output:
file mergedAligned into mergedAligneds
script:
"""
obigrep -p 'mode!="joined"' $fastqMerged > mergedAligned
"""
}
process assign_sequences {
"""
[t=6h]assign each sequence record to the corresponding sample/marker combination
"""
input:
file mergedAligned from mergedAligneds
file barcode from barcodes
output:
file assignedMerged into assigedMergeds
file unassignedMerged into unassignedMergeds
script:
"""
ngsfilter -t $barcode -u unassignedMerged $mergedAligned --fasta-output > assignedMerged
"""
}
process split_sequences {
"""
split the input sequence file in a set of subfiles according to the values of attribute "sample"
"""
input:
file assignedMerged from assigedMergeds
output:
file 'sample_*.fasta' into demultiplexed mode flatten
script:
"""
obisplit -p "sample_" -t sample --fasta $assignedMerged
"""
}
process dereplicate {
input:
file sampleSplit from demultiplexed
output:
file dereplicated into dereplicateds
script:
"""
# dereplicate reads into unique sequences
obiuniq -m sample $sampleSplit > dereplicated
"""
}
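
For debugging a single sample outside nextflow, the five processes above can be chained by hand. The sketch below reuses the exact OBItools commands and flags from the script. The function name `run_obitools_steps` and the `dereplicated_` output prefix are assumptions made for this illustration; the pipeline itself writes each result to a file named `dereplicated` in its own work directory.

```shell
#!/bin/sh
# Sketch: run the five OBItools steps of the pipeline on one pair of FASTQ files.
# run_obitools_steps and the dereplicated_ prefix are hypothetical names.
# Usage: run_obitools_steps R1.fastq.gz R2.fastq.gz barcodes.dat
run_obitools_steps() {
    r1=$1
    r2=$2
    barcode=$3

    # paired-end alignment, keeping reads with an alignment score above 40
    illuminapairedend -r "$r2" "$r1" --score-min=40 > fastqMerged || return 1

    # remove unaligned sequence records
    obigrep -p 'mode!="joined"' fastqMerged > mergedAligned || return 1

    # assign each record to its sample/marker combination
    ngsfilter -t "$barcode" -u unassignedMerged mergedAligned --fasta-output > assignedMerged || return 1

    # split into one file per value of the "sample" attribute
    obisplit -p "sample_" -t sample --fasta assignedMerged || return 1

    # dereplicate each per-sample file
    for sample in sample_*.fasta; do
        [ -e "$sample" ] || continue
        obiuniq -m sample "$sample" > "dereplicated_$sample" || return 1
    done
}
```

All intermediate files are written to the current directory, so run the function from an empty scratch folder.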