Update Validate authored by peguerin
......@@ -17,9 +17,8 @@ This table summarizes the command-line arguments which are using by `mkbdr valid
| `--fasta` | `-f` | NA | path of the input barcodes sequences FASTA file |
| `--output_prefix` | `-o` | NA | Output files prefix names |
| `--curate` | `-c` | NA | path of the input taxonomy curation CSV file. The header must be `current_name;ncbi_name;genus;family;ncbi_rank`. A curation CSV file can be generated with the `curegen` command (see the [Curegen section](https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/wikis/Curegen) to learn how to produce a curation file) |
| `--ncbi_taxdump` | `-n` | NA | path of NCBI taxonomy folder |
| `--ncbi_taxdump_load` | `-l` | FALSE | load NCBI taxonomy from NCBI taxonomy folder path |
| `--ncbi_taxdump_edition` | `-e` | FALSE | allow curation to edit NCBI taxonomy files in order to add new taxonomy nodes |
| `--ncbi_taxdump_load` | `-l` | NA | load NCBI taxonomy archive |
| `--ncbi_taxonomy_edition` | `-e` | NA | folder of NCBI taxonomy files to edit in order to add new taxonomy nodes |
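
Since `--curate` expects an exact header, it can help to check a curation file before running `mkbdr validate`. The following is a minimal sketch, not part of mkbdr: the function name `check_curation_header` is hypothetical, and it assumes the file is semicolon-delimited, as the header in the table suggests.

```python
import csv
import io

# Header required by `--curate`, as stated in the arguments table.
EXPECTED_HEADER = ["current_name", "ncbi_name", "genus", "family", "ncbi_rank"]


def check_curation_header(handle):
    """Return True if the first row of a semicolon-delimited CSV is the expected header."""
    reader = csv.reader(handle, delimiter=";")
    header = next(reader, None)  # None when the file is empty
    return header == EXPECTED_HEADER


# Demo on in-memory text (hypothetical content, not a real curation file).
good = io.StringIO("current_name;ncbi_name;genus;family;ncbi_rank\n")
bad = io.StringIO("current_name,ncbi_name,genus,family,ncbi_rank\n")
print(check_curation_header(good))  # True
print(check_curation_header(bad))   # False
```

With a real file, pass an open handle instead, for example `open("curation_table.csv", newline="")`.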
# Example of commands
......@@ -61,8 +60,8 @@ mkbdr validate --fasta raw.fasta --output_prefix res --curate curation_table.csv
```
mkbdr validate --fasta raw.fasta \
--output_prefix res \
--ncbi_taxdump path/to/ncbi_taxo \
--ncbi_taxdump_load
--ncbi_taxdump_load path/to/ncbi_taxo
```
ete3 will load the files located in `path/to/ncbi_taxo` and store the NCBI taxonomy tree object in your home folder.
......@@ -70,7 +69,7 @@ ete3 will load the files located in `path/to/ncbi_taxo` and stores the NCBI taxo
### Using a local NCBI taxonomy, perform a curation which adds new species to your local taxonomy:
* To run mkbdr in order to add new species, you have to allow editing of the local NCBI taxonomy files with the argument `--ncbi_taxdump_edition`. To specify the location of the NCBI taxonomy folder to edit, add the argument `--ncbi_taxdump`. To apply a curation, add the argument `--curate`. A thorough description of the curation CSV file is available [here](https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/wikis/Files-definition#curation-file).
* To run mkbdr in order to add new species, you have to allow editing of the local NCBI taxonomy files by giving their folder to the argument `--ncbi_taxonomy_edition`. To apply a curation, add the argument `--curate`. A thorough description of the curation CSV file is available [here](https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/wikis/Files-definition#curation-file).
The following curation CSV instructs MKBDR to create a new custom species called _Distichodus perspicillatus_, in genus _Distichodus_ and family _Distichodontidae_.
......@@ -86,9 +85,7 @@ The MKBDR complete command is:
mkbdr validate --fasta raw.fasta \
--output_prefix res \
--curate curation_new_species.csv \
--ncbi_taxdump path/to/ncbi_taxo \
--ncbi_taxdump_load \
--ncbi_taxdump_edition
--ncbi_taxonomy_edition path/to/ncbi_taxo
```
This will edit the NCBI taxonomy files located in `path/to/ncbi_taxo`, adding a new custom species _Distichodus perspicillatus_. Records for this species will be written with a custom taxid to the valid FASTA output file, `res_valid.fasta`.
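
Because this command modifies files in place, a quick pre-flight check of the taxonomy folder may be worthwhile. The sketch below is an assumption-laden illustration, not mkbdr's own logic: it assumes the folder is an unpacked NCBI taxdump, whose core files are `nodes.dmp` and `names.dmp`, and the helper name `missing_taxdump_files` is hypothetical.

```python
import tempfile
from pathlib import Path

# Assumption: the folder holds an unpacked NCBI taxdump. Its core files are
# nodes.dmp and names.dmp; which files mkbdr actually edits is not stated here.
REQUIRED_FILES = ("nodes.dmp", "names.dmp")


def missing_taxdump_files(folder):
    """Return the names of required taxdump files that are absent from `folder`."""
    folder = Path(folder)
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]


# Demo in a throwaway directory that contains only nodes.dmp.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "nodes.dmp").touch()
    print(missing_taxdump_files(tmp))  # ['names.dmp']
```

An empty list means both files are present; in practice you would call `missing_taxdump_files("path/to/ncbi_taxo")` before launching the curation.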
......@@ -100,7 +97,7 @@ This will edit NCBI taxonomy files located on `path/to/ncbi_taxo` adding a new
* Valid FASTA file (see description [here](https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/wikis/Files-definition#valid-fasta-file))
* Faulty taxonomy FASTA file (see description [here](https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/wikis/Files-definition#faulty-taxonomy-fasta-file))
* Faulty format FASTA file (see description [here](https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/wikis/Files-definition#faulty-format-fasta-file))
* Edited taxonomy files: with the `--curate`, `--ncbi_taxdump_edition` and `--ncbi_taxdump` options, the `validate` module edits the taxonomy files mentioned by `--ncbi_taxdump` so that nodes are added to the tree of life according to the curation file specification.
* Edited taxonomy files: with the `--curate` and `--ncbi_taxonomy_edition` options, the `validate` module edits the taxonomy files located in the folder given to `--ncbi_taxonomy_edition` so that nodes are added to the tree of life according to the curation file specification.
To perform taxonomic assignment in further analysis, you need the valid FASTA file and the corresponding taxonomy files.
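
Before moving on to taxonomic assignment, it can be useful to confirm that the valid FASTA file actually contains records. This is a minimal sketch under the usual FASTA convention that each record starts with a `>` header line; the function name `count_fasta_records` is hypothetical and not part of mkbdr.

```python
def count_fasta_records(lines):
    """Count FASTA records by counting header lines, which start with '>'."""
    return sum(1 for line in lines if line.startswith(">"))


# Demo on in-memory lines (made-up identifiers and sequences).
example = [
    ">seq1 example header\n",
    "ACGTACGT\n",
    ">seq2 example header\n",
    "TTGACA\n",
]
print(count_fasta_records(example))  # 2
```

With the output of the commands above, the same function accepts an open file handle, for example `with open("res_valid.fasta") as handle: count_fasta_records(handle)`.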