| `--curate` | `-c` | NA | path of the input taxonomy curation CSV file. The header must be `current_name;ncbi_name;genus;family;ncbi_rank`. A curation CSV file can be generated with the `curegen` command (see the [Curegen section](https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/wikis/Curegen) to learn how to produce a curation file) |
| `--ncbi_taxdump` | `-n` | NA | path of the NCBI taxonomy folder |
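
Because `--curate` expects a fixed header, it can help to check a curation file before running the command. The following is a minimal sketch, not part of mkbdr: the function name `has_valid_curation_header` is hypothetical, and the semicolon delimiter is an assumption inferred from the header as written in the table above.

```python
import csv
import io

# Header required by the `--curate` option, as stated in the options table.
EXPECTED_HEADER = ["current_name", "ncbi_name", "genus", "family", "ncbi_rank"]


def has_valid_curation_header(handle):
    """Return True if the first row read from `handle` is the expected header.

    Hypothetical helper for illustration; assumes a semicolon-delimited file.
    """
    reader = csv.reader(handle, delimiter=";")
    first_row = next(reader, None)  # None when the file is empty
    return first_row == EXPECTED_HEADER


# Usage with in-memory files; a real file would be opened with open(path, newline="").
good = io.StringIO("current_name;ncbi_name;genus;family;ncbi_rank\n")
bad = io.StringIO("current_name,ncbi_name,genus,family,ncbi_rank\n")
print(has_valid_curation_header(good))  # True
print(has_valid_curation_header(bad))   # False
```

A comma-separated header fails this check because the whole line is read as a single field.
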
ete3 will load the files located in `path/to/ncbi_taxo` and store the NCBI taxonomy tree object in your home folder.
...
...
@@ -70,7 +69,7 @@ ete3 will load the files located in `path/to/ncbi_taxo` and stores the NCBI taxo
### Using a local NCBI taxonomy, perform a curation that adds new species to your local taxonomy:
* To run mkbdr to add new species, you must allow editing of the local NCBI taxonomy files with the argument `--ncbi_taxdump_edition`. To specify the location of the NCBI taxonomy folder to edit, add the argument `--ncbi_taxdump`. To apply a curation, add the argument `--curate`. A thorough description of the curation CSV file is available [here](https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/wikis/Files-definition#curation-file).
* To run mkbdr to add new species, you must allow editing of the given local NCBI taxonomy files with the argument `--ncbi_taxonomy_edition`. To apply a curation, add the argument `--curate`. A thorough description of the curation CSV file is available [here](https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/wikis/Files-definition#curation-file).
The following curation CSV instructs MKBDR to create a new custom species called _Distichodus perspicillatus_, with genus _Distichodus_ and family _Distichodontidae_.
...
...
@@ -86,9 +85,7 @@ The MKBDR complete command is:
mkbdr validate --fasta raw.fasta \
--output_prefix res \
--curate curation_new_species.csv \
--ncbi_taxdump path/to/ncbi_taxo \
--ncbi_taxdump_load \
--ncbi_taxdump_edition
--ncbi_taxonomy_edition path/to/ncbi_taxo
```
This will edit the NCBI taxonomy files located in `path/to/ncbi_taxo`, adding a new custom species, _Distichodus perspicillatus_. Records for this species will be written with a custom taxid to the valid FASTA output file, `res_valid.fasta`.
...
...
@@ -100,7 +97,7 @@ This will edit NCBI taxonomy files located on `path/to/ncbi_taxo` adding a new
* Valid FASTA file (see description [here](https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/wikis/Files-definition#valid-fasta-file))
* Faulty taxonomy FASTA file (see description [here](https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/wikis/Files-definition#faulty-taxonomy-fasta-file))
* Faulty format FASTA file (see description [here](https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/wikis/Files-definition#faulty-format-fasta-file))
* Edited taxonomy files: with the `--curate`, `--ncbi_taxdump_edition` and `--ncbi_taxdump` options, the `validate` module edits the taxonomy files in the folder given by `--ncbi_taxdump` so that nodes are added to the tree of life according to the curation file specification.
* Edited taxonomy files: with the `--curate` and `--ncbi_taxdump_edition` options, the `validate` module edits the taxonomy files located in the folder given to `--ncbi_taxdump_edition` so that nodes are added to the tree of life according to the curation file specification.
To perform taxonomic assignment in further analyses, you need the valid FASTA file and the corresponding taxonomy files.