Commit 437b1ffb authored by peguerin
readme update

parent 7fcb88b1
......@@ -121,40 +121,42 @@ Useful parameters for each program are stored into the file [config.yaml](config
Before running the workflow, you have to set your parameters. Please edit [config.yaml](config.yaml).
```diff
container:
/workdir/obitools.simg
fastqFolderPath:
/workdir/edna_miseq_rawdata/
fastqFiles:
- seqrunA
- seqrunB
barcodeFiles:
seqrunA : sample_description1.dat
seqrunB : sample_description2.dat
illuminapairedend:
- s_min : 40
s_min : 40
good_length_samples:
- count : 10
- seq_length : 20
count : 10
seq_length : 20
clean_pcrerr_samples:
- r : 0.05
r : 0.05
assign_taxon:
- bdr : bdr/embl_std
- fasta : bdr/db_embl_std.fasta
bdr : /workdir/reference_database/embl_std
fasta : /workdir/reference_database/db_embl_std.fasta
```
* `s_min : 40` : minimum score for keeping an alignment. If the alignment score is below this threshold, the two sequences are simply concatenated and the mode attribute is set to the value joined.
- software : option `--s_min` from [illuminapairedend](https://pythonhosted.org/OBITools/scripts/illuminapairedend.html?highlight=illumina#module-illuminapairedend)
- step : merge illumina paired-end sequences by pair
- we set this value at 40
* `count : 10` : minimum number of copies for keeping a sequence.
- software : option `-p 'count>{count}'` from [obigrep](https://pythonhosted.org/OBITools/scripts/obigrep.html?highlight=obigrep#module-obigrep)
- step : filter unique sequences according to their qualities and abundances
- we set this value at 10
* `seq_length : 20` : minimum length for keeping a sequence.
- software : option `-p 'seq_length>{seq_length}'` from [obigrep](https://pythonhosted.org/OBITools/scripts/obigrep.html?highlight=obigrep#module-obigrep)
- step : filter unique sequences according to their qualities and abundances
- we set this value at 20
* `r : 0.05` : threshold ratio between counts (rare/abundant counts) of two sequence records so that the less abundant one is a variant of the more abundant
- software : option `-r` from [obiclean](https://pythonhosted.org/OBITools/scripts/obiclean.html?highlight=obiclean#module-obiclean)
- step : remove singletons and PCR errors
- we set this value at 0.05
* `bdr : bdr/embl_std` : relative path to the folder `bdr`, which contains the reference database files, followed by the prefix of those files (for instance "embl_something").
- software : option `-d` from [ecotag](https://pythonhosted.org/OBITools/scripts/ecotag.html?highlight=ecotag#module-ecotag)
- step : assign each sequence to a species
* `fasta : bdr/db_embl_std.fasta` : relative path to the fasta file of the reference database.
- software : option `-R` from [ecotag](https://pythonhosted.org/OBITools/scripts/ecotag.html?highlight=ecotag#module-ecotag)
- step : assign each sequence to a species
parameter | description | software | rule | default value | expected type
---------|------------------|-------|------------------|---------|----
container | absolute path of the Singularity container file `obitools.simg` | [singularity](https://singularity.lbl.gov/) | every rule needs this container to work | /workdir/obitools.simg | absolute path of a `simg` file
fastqFolderPath | absolute path of a folder which contains the paired-end raw read `.fastq.gz` files and the sample description `.dat` files | [illuminapairedend](https://pythonhosted.org/OBITools/scripts/illuminapairedend.html?highlight=illumina#module-illuminapairedend), [ngsfilter](https://pythonhosted.org/OBITools/scripts/ngsfilter.html) | illuminapairedend, assign_sequences | /workdir/edna_miseq_rawdata/ | absolute path of a folder
fastqFiles | list of wildcard `{run}` where `{run}` is the name of the sequencing run | [illuminapairedend](https://pythonhosted.org/OBITools/scripts/illuminapairedend.html?highlight=illumina#module-illuminapairedend) | illuminapairedend | seqrunA, seqrunB | list of strings
barcodeFiles | a dictionary with `{run}` as key and associated sample description file as value | ngsfilter | assign_sequences | {seqrunA : sample_description1.dat, seqrunB : sample_description2.dat} | dictionary
s_min | score for keeping alignment. If the alignment score is below this threshold both the sequences are just concatenated. The mode attribute is set to the value joined | [illuminapairedend](https://pythonhosted.org/OBITools/scripts/illuminapairedend.html?highlight=illumina#module-illuminapairedend) | illuminapairedend | 40 | integer
count | minimum number of copies for keeping a sequence | [obigrep](https://pythonhosted.org/OBITools/scripts/obigrep.html?highlight=obigrep#module-obigrep) | good_length_samples | 10 | integer
seq_length | minimum length for keeping a sequence | [obigrep](https://pythonhosted.org/OBITools/scripts/obigrep.html?highlight=obigrep#module-obigrep) | good_length_samples | 20 | integer
r | threshold ratio between counts (rare/abundant counts) of two sequence records so that the less abundant one is a variant of the more abundant | [obiclean](https://pythonhosted.org/OBITools/scripts/obiclean.html?highlight=obiclean#module-obiclean) | clean_pcrerr_samples | 0.05 | float
bdr | absolute path to the folder which contains the reference database files, followed by the prefix of those files | [ecotag](https://pythonhosted.org/OBITools/scripts/ecotag.html?highlight=ecotag#module-ecotag) | assign_taxon | /workdir/reference_database/embl_std | absolute path of a folder + prefix
fasta | absolute path to the fasta file of the reference database | [ecotag](https://pythonhosted.org/OBITools/scripts/ecotag.html?highlight=ecotag#module-ecotag) | assign_taxon | /workdir/reference_database/db_embl_std.fasta | absolute path to a fasta file
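
The expected types listed in the table can be checked before launching the workflow. The snippet below is a minimal sketch, not part of the repository: `EXPECTED_TYPES` and `validate_config` are hypothetical names, and it assumes that `config.yaml` has already been loaded into a Python dictionary (for instance with `yaml.safe_load` from PyYAML) and that `s_min`, `count`, `seq_length`, `r`, `bdr` and `fasta` are nested under their rule name as in the example above.

```python
# Hypothetical helper: these names are illustrative, not part of the workflow.
# Keys are paths into the config dictionary, values are the expected types.
EXPECTED_TYPES = {
    ("container",): str,
    ("fastqFolderPath",): str,
    ("fastqFiles",): list,
    ("barcodeFiles",): dict,
    ("illuminapairedend", "s_min"): int,
    ("good_length_samples", "count"): int,
    ("good_length_samples", "seq_length"): int,
    ("clean_pcrerr_samples", "r"): float,
    ("assign_taxon", "bdr"): str,
    ("assign_taxon", "fasta"): str,
}


def validate_config(config):
    """Return a list of problems found in config; an empty list means OK."""
    problems = []
    for path, expected in EXPECTED_TYPES.items():
        node = config
        for key in path:
            if not isinstance(node, dict) or key not in node:
                problems.append("missing parameter: " + ".".join(path))
                break
            node = node[key]
        else:
            # The whole path exists, so check the type of the value.
            if not isinstance(node, expected):
                problems.append(
                    "{}: expected {}, got {}".format(
                        ".".join(path), expected.__name__, type(node).__name__
                    )
                )
    # Every run listed in fastqFiles needs a sample description file.
    runs = config.get("fastqFiles", [])
    barcodes = config.get("barcodeFiles", {})
    if isinstance(runs, list) and isinstance(barcodes, dict):
        for run in runs:
            if run not in barcodes:
                problems.append("no barcode file for run: {}".format(run))
    return problems


# Same values as the default configuration shown above.
example = {
    "container": "/workdir/obitools.simg",
    "fastqFolderPath": "/workdir/edna_miseq_rawdata/",
    "fastqFiles": ["seqrunA", "seqrunB"],
    "barcodeFiles": {
        "seqrunA": "sample_description1.dat",
        "seqrunB": "sample_description2.dat",
    },
    "illuminapairedend": {"s_min": 40},
    "good_length_samples": {"count": 10, "seq_length": 20},
    "clean_pcrerr_samples": {"r": 0.05},
    "assign_taxon": {
        "bdr": "/workdir/reference_database/embl_std",
        "fasta": "/workdir/reference_database/db_embl_std.fasta",
    },
}

print(validate_config(example))  # → []
```

A non-empty list points at the parameters to fix in `config.yaml`, for example a run name present in `fastqFiles` but absent from `barcodeFiles`.
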
## 3.3 Run the workflow in a single command
......