Commit 3b34884c authored by peguerin

readme update

parent 7e471414
@@ -234,61 +234,61 @@ Let's define some wildcards `*` :
## Input files
Provided data (see [Initialisation](#41-initialisation)) as input are stored into 3 folders.
Provided data (see [Initialisation](#31-initialisation)):
* `bdr` : this folder contains reference database files. Relative path of fasta files and used prefix must be written in [config.yaml](config.yaml) (see [Configuration](#42-configuration).
* Singularity container
* `raw` : this folder contains raw paired-end reads files `{run}_R1.fastq.gz` and `{run}_R2.fastq.gz` where `{run}` is the name of the sequencing run.
* Reference database files. Relative path of fasta files and used prefix must be written in [config.yaml](config.yaml) (see [Configuration](#32-configuration)).
* `barcodes`: this folder contains sample description `.dat` files
* FASTQ files. Raw paired-end reads files `{run}_R1.fastq.gz` and `{run}_R2.fastq.gz` where `{run}` is the name of the sequencing run (see [Configuration](#32-configuration)).
## Intermediate files
* Sample description .dat files (see [Configuration](#32-configuration)).
Generated files at each step of the workflow. Anything needed for building the final results output files.
## Intermediate files
* `assembled`
- `assembled/{run}/{run}.fastq` : merged illumina paired-end sequences. It was made by [illuminapairedend](https://pythonhosted.org/OBITools/scripts/illuminapairedend.html?highlight=illumina#module-illuminapairedend).
A Snakemake workflow is defined by specifying rules in a Snakefile. Rules decompose the workflow into small steps by specifying how to create sets of output files from sets of input files. Snakemake automatically determines the dependencies between the rules by matching file names.
- `assembled/{run}/{run}.ali.fastq` : only well merged paired-end sequences. It was filtered by [obigrep](https://pythonhosted.org/OBITools/scripts/obigrep.html?highlight=obigrep#module-obigrep)
- `assembled/{run}/{run}.ali.assigned.fastq` : {sample}-assigned merged sequences. It was made by [ngsfilter](https://pythonhosted.org/OBITools/scripts/ngsfilter.html?highlight=ngsfilter#module-ngsfilter)
- `assembled/{run}/{run}.ali.unidentified.fastq`: unidentified by {sample} merged sequences. It was made by [ngsfilter](https://pythonhosted.org/OBITools/scripts/ngsfilter.html?highlight=ngsfilter#module-ngsfilter)
A rule is designed to generate output files and log files.
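The file-name matching described above can be illustrated with a small Python sketch. This is not Snakemake's actual implementation, only an approximation of the idea: a pattern such as `{run}_R1.fastq.gz` is turned into a regular expression, a concrete path is matched against it to recover the wildcard values, and those values are then used to fill in another pattern (here a log path). The helper names `pattern_to_regex` and `match_wildcards`, and the run name `lib1`, are hypothetical and chosen for illustration.

```python
import re


def pattern_to_regex(pattern):
    """Compile a '{wildcard}' file pattern into a regular expression.

    Illustrative only: the first occurrence of a wildcard becomes a named
    group, and later occurrences must repeat the same value.
    """
    # re.split with one capture group alternates literal text and wildcard names
    parts = re.split(r"\{(\w+)\}", pattern)
    seen = set()
    regex = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            regex += re.escape(part)       # literal text
        elif part in seen:
            regex += f"(?P={part})"        # repeated wildcard: same value required
        else:
            seen.add(part)
            regex += f"(?P<{part}>.+)"     # first occurrence of the wildcard
    return re.compile(regex)


def match_wildcards(pattern, path):
    """Return the wildcard values if `path` fits `pattern`, else None."""
    match = pattern_to_regex(pattern).fullmatch(path)
    return match.groupdict() if match else None


# "lib1" is a made-up run name used only for this example.
wildcards = match_wildcards("{run}_R1.fastq.gz", "lib1_R1.fastq.gz")
print(wildcards)  # {'run': 'lib1'}

# The recovered values fill in other patterns, for example a log path.
print("99-log/01-assembly/{run}.log".format(**wildcards))  # 99-log/01-assembly/lib1.log
```

A pattern with two wildcards, such as `samples/{run}_sample_{sample}.fasta`, is handled the same way and yields one value per wildcard.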
* `samples`
- `samples/{run}_sample_{sample}.fasta` : sequences of {sample}. Demultiplexing was processed by [obisplit](https://pythonhosted.org/OBITools/scripts/obisplit.html?highlight=obisplit#module-obisplit)
- `samples/{run}_sample_{sample}.uniq.fasta` : dereplicated sequences of {sample}. Sequences are grouped together by [obiuniq](https://pythonhosted.org/OBITools/scripts/obiuniq.html?highlight=obiuniq#module-obiuniq)
- `samples/{run}_sample_{sample}.l.u.fasta` : filtered unique sequences according to their qualities and abundances. Done by [obigrep](https://pythonhosted.org/OBITools/scripts/obigrep.html?highlight=obigrep#module-obigrep)
- `samples/{run}_sample_{sample}.r.l.u.fasta` : classified unique sequences as head, internal or singleton. Done by [obiclean](https://pythonhosted.org/OBITools/scripts/obiclean.html?highlight=obiclean#module-obiclean).
- `samples/{run}_sample_{sample}.c.r.l.u.fasta` : filtered internal/singleton unique sequences of {sample}. Done by [obigrep](https://pythonhosted.org/OBITools/scripts/obigrep.html?highlight=obigrep#module-obigrep)
* [01-assembly](01-assembly)
* [02-demultiplex](02-demultiplex)
* [03-filtered](03-filtered)
* `runs`
- `runs/{run}_run.fasta` : concatenated {sample} fasta files from the same {run}
- `runs/{run}_run.uniq.fasta` : dereplicated sequences of {run}. Sequences are grouped together by [obiuniq](https://pythonhosted.org/OBITools/scripts/obiuniq.html?highlight=obiuniq#module-obiuniq)
- `runs/{run}_run.tag.u.fasta` : sequences are assigned to a species. It's assigned by [ecotag](https://pythonhosted.org/OBITools/scripts/ecotag.html?highlight=ecotag#module-ecotag)
- `runs/{run}_run.a.t.u.fasta` : species-assigned unique sequences without unuseful attributes. Done by [obiannotate](https://pythonhosted.org/OBITools/scripts/obiannotate.html?highlight=obiannotate#module-obiannotate)
- `runs/{run}_run.s.a.t.u.fasta` : sorted *by number of copy* species-assigned unique sequences. Done by [obisort](https://pythonhosted.org/OBITools/scripts/obisort.html?highlight=obisort#module-obisort)
## Final results *output files*
## Final results
* `tables` : this folder contains all the matrix species/sample for each {run}
* [04-final_tables](04-final_tables): this folder contains the species/sample matrix for each `{run}`.
## Log files
For every ran task, log have been written into `log`'s folders
* `log`
- `log/illuminapairedend/{run}.log`
- `log/remove_unaligned/{run}.log`
- `log/assign_sequences/{run}.log`
- `log/split_sequences/{run}.log`
- `log/dereplicate_samples/{sample}.log`
- `log/goodlength_samples/{sample}.log`
- `log/clean_pcrerr/{sample}.log`
- `log/rm_internal_samples/{sample}.log`
- `log/dereplicate_runs/{sample}.log`
- `log/assign_taxon/{run}.log`
- `log/rm_attributes/{run}.log`
- `log/sort_runs/{run}.log`
- `table_runs/{run}.log`
* [99-log](99-log): this folder contains the log files of every rule, for every `{run}` and every `{sample}`.
```
99-log/
├── 01-assembly
│   └── {run}.log
├── 02-remove_unaligned
│   └── {run}.log
├── 03-assign_sequences
│   └── {run}.log
├── 04-split_sequences
│   └── {run}.log
├── 05-dereplicate_samples
│   └── 161124_SND393_A_L005_GWM-849
├── 06-goodlength_samples
│   └── {run}/{sample}.log
├── 07-clean_pcrerr
│   └── {run}/{sample}.log
├── 08-rm_internal_samples
│   └── {run}/{sample}.log
├── 09-dereplicate_runs
│   └── {run}.log
├── 10-assign_taxon
│   └── {run}.log
├── 11-rm_attributes
│   └── {run}.log
├── 12-sort_runs
│   └── {run}.log
└── 13-table_runs
    └── {run}.log
```
\ No newline at end of file