| `--curate` | `-c` | NA | path of the input taxonomy curation CSV file. The header must be `current_name;ncbi_name;genus;family;ncbi_rank`. A curation CSV file can be generated with the `curegen` command (see the [Curegen section](https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/wikis/Curegen) to learn how to produce one) |
| `--ncbi_taxdump` | `-n` | NA | path of NCBI taxonomy folder |
ete3 will load the files located in `path/to/ncbi_taxo` and store the NCBI taxonomy tree object in your home folder.
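The header requirement stated for the `--curate` option can be checked before running mkbdr. The following is a minimal sketch, not part of mkbdr itself: the function names `check_curation_header` and `check_curation_file` are hypothetical, and the semicolon delimiter is an assumption inferred from the header string given in the options table.

```python
import csv
import io

# Header required by the --curate option, as given in the options table.
EXPECTED_HEADER = ["current_name", "ncbi_name", "genus", "family", "ncbi_rank"]


def check_curation_header(handle):
    """Return True if the first row of the file matches EXPECTED_HEADER.

    Assumption: fields are separated by semicolons, as in the documented header.
    """
    reader = csv.reader(handle, delimiter=";")
    header = next(reader, None)  # None when the file is empty
    return header == EXPECTED_HEADER


def check_curation_file(path):
    """Hypothetical convenience wrapper: open `path` and check its header."""
    with open(path, newline="", encoding="utf-8") as handle:
        return check_curation_header(handle)
```

For instance, `check_curation_file("curation_new_species.csv")` would tell you whether the file starts with the expected header before it is passed to `--curate`. The check only covers the header line; it does not validate the taxonomy content of the rows.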
...
@@ -70,7 +69,7 @@ ete3 will load the files located in `path/to/ncbi_taxo` and stores the NCBI taxo
### Using a local NCBI taxonomy, perform a curation that adds new species to your local taxonomy:
- * To run mkbdr so that it adds new species, you must allow editing of the local NCBI taxonomy files with the argument `--ncbi_taxdump_edition`. To specify the location of the NCBI taxonomy folder to edit, add the argument `--ncbi_taxdump`. To apply a curation, add the argument `--curate`. A thorough description of the curation CSV file is available [here](https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/wikis/Files-definition#curation-file).
+ * To run mkbdr so that it adds new species, you must allow editing of the files of a given local NCBI taxonomy with the argument `--ncbi_taxonomy_edition`. To apply a curation, add the argument `--curate`. A thorough description of the curation CSV file is available [here](https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/wikis/Files-definition#curation-file).
The following curation CSV instructs MKBDR to create a new custom species called _Distichodus perspicillatus_, with genus _Distichodus_ and family _Distichodontidae_.
...
@@ -86,9 +85,7 @@ The MKBDR complete command is:
mkbdr validate --fasta raw.fasta \
--output_prefix res \
--curate curation_new_species.csv \
- --ncbi_taxdump path/to/ncbi_taxo \
- --ncbi_taxdump_load \
- --ncbi_taxdump_edition
+ --ncbi_taxonomy_edition path/to/ncbi_taxo
```
This will edit the NCBI taxonomy files located in `path/to/ncbi_taxo`, adding a new custom species _Distichodus perspicillatus_. The records for this species will be written with a custom taxid to the valid FASTA output file, `res_valid.fasta`.
...
@@ -100,7 +97,7 @@ This will edit NCBI taxonomy files located on `path/to/ncbi_taxo` adding a new
* Valid FASTA file (see description [here](https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/wikis/Files-definition#valid-fasta-file))
* Faulty taxonomy FASTA file (see description [here](https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/wikis/Files-definition#faulty-taxonomy-fasta-file))
* Faulty format FASTA file (see description [here](https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/wikis/Files-definition#faulty-format-fasta-file))
- * Edited taxonomy files: with the `--curate`, `--ncbi_taxdump_edition` and `--ncbi_taxdump` options, the `validate` module edits the taxonomy files specified by `--ncbi_taxdump` so that nodes are added to the tree of life according to the curation file specification.
+ * Edited taxonomy files: with the `--curate` and `--ncbi_taxdump_edition` options, the `validate` module edits the taxonomy files located in the folder given to `--ncbi_taxdump_edition` so that nodes are added to the tree of life according to the curation file specification.
To perform taxonomic assignment in further analyses, you need the valid FASTA file and the corresponding taxonomy files.