... | ... | @@ -17,9 +17,8 @@ This table summarizes the command-line arguments which are using by `mkbdr valid |
|
|
| `--fasta` | `-f` | NA | path of the input barcodes sequences FASTA file |
|
|
|
| `--output_prefix` | `-o` | NA | Output files prefix names |
|
|
|
| `--curate` | `-c` | NA | path of the input taxonomy curation CSV file. The header must be `current_name;ncbi_name;genus;family;ncbi_rank`. A curation CSV file can be generated with the `curegen` command (see the [Curegen section](https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/wikis/Curegen) to learn how to produce a curation file) |
|
|
|
| `--ncbi_taxdump` | `-n` | NA | path of NCBI taxonomy folder |
|
|
|
| `--ncbi_taxdump_load` | `-l` | FALSE | load NCBI taxonomy from NCBI taxonomy folder path |
|
|
|
| `--ncbi_taxdump_edition` | `-e` | FALSE | allow curation to edit NCBI taxonomy files in order to add new taxonomy nodes |
|
|
|
| `--ncbi_taxdump_load` | `-l` | NA | load NCBI taxonomy archive |
|
|
|
| `--ncbi_taxonomy_edition` | `-e` | NA | folder of NCBI taxonomy files to edit in order to add new taxonomy nodes |
|
|
|
|
|
|
|
|
|
# Example of commands
|
... | ... | @@ -61,8 +60,8 @@ mkbdr validate --fasta raw.fasta --output_prefix res --curate curation_table.csv |
|
|
```
|
|
|
mkbdr validate --fasta raw.fasta \
|
|
|
--output_prefix res \
|
|
|
--ncbi_taxdump path/to/ncbi_taxo \
|
|
|
--ncbi_taxdump_load
|
|
|
--ncbi_taxdump_load path/to/ncbi_taxo
|
|
|
|
|
|
```
|
|
|
|
|
|
ete3 loads the files located in `path/to/ncbi_taxo` and stores the NCBI taxonomy tree object in your home folder.
|
... | ... | @@ -70,7 +69,7 @@ ete3 will load the files located in `path/to/ncbi_taxo` and stores the NCBI taxo |
|
|
|
|
|
### Using a local NCBI taxonomy, perform a curation which adds new species to your local taxonomy:
|
|
|
|
|
|
* To run mkbdr in order to add new species, you have to allow editing of the local NCBI taxonomy files with the argument `--ncbi_taxdump_edition`. To specify the location of the NCBI taxonomy folder to edit, add the argument `--ncbi_taxdump`. To apply a curation, add the argument `--curate`. A thorough description of the curation CSV file is available [here](https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/wikis/Files-definition#curation-file).
|
|
|
* To run mkbdr in order to add new species, you have to allow editing of a given set of local NCBI taxonomy files with the argument `--ncbi_taxonomy_edition`. To apply a curation, add the argument `--curate`. A thorough description of the curation CSV file is available [here](https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/wikis/Files-definition#curation-file).
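
Before passing a file to `--curate`, it can help to check that its header matches the one documented in the arguments table. The snippet below is a minimal sketch of such a check, not part of mkbdr: the function name `has_valid_curation_header` is hypothetical, and it only assumes the `;`-delimited header `current_name;ncbi_name;genus;family;ncbi_rank`.

```python
import csv
import io

# Header documented for the curation CSV file (';'-delimited).
EXPECTED_HEADER = ["current_name", "ncbi_name", "genus", "family", "ncbi_rank"]


def has_valid_curation_header(handle):
    """Return True if the first row of a ';'-delimited CSV is the expected header.

    Hypothetical helper for illustration; mkbdr may perform its own checks.
    """
    reader = csv.reader(handle, delimiter=";")
    first_row = next(reader, None)
    if first_row is None:
        # Empty file: there is no header at all.
        return False
    return [field.strip() for field in first_row] == EXPECTED_HEADER


if __name__ == "__main__":
    sample = io.StringIO("current_name;ncbi_name;genus;family;ncbi_rank\n")
    print(has_valid_curation_header(sample))
```

To check a real file, open it with `open("curation_new_species.csv", newline="")` and pass the handle to the function.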
|
|
|
|
|
|
The following curation CSV instructs MKBDR to create a new custom species called _Distichodus perspicillatus_ in the genus _Distichodus_ and the family _Distichodontidae_.
|
|
|
|
... | ... | @@ -86,9 +85,7 @@ The MKBDR complete command is: |
|
|
mkbdr validate --fasta raw.fasta \
|
|
|
--output_prefix res \
|
|
|
--curate curation_new_species.csv \
|
|
|
--ncbi_taxdump path/to/ncbi_taxo \
|
|
|
--ncbi_taxdump_load \
|
|
|
--ncbi_taxdump_edition
|
|
|
--ncbi_taxonomy_edition path/to/ncbi_taxo
|
|
|
```
|
|
|
|
|
|
This edits the NCBI taxonomy files located in `path/to/ncbi_taxo`, adding a new custom species _Distichodus perspicillatus_. Records for this species are written with a custom taxid to the valid FASTA output file, `res_valid.fasta`.
|
... | ... | @@ -100,7 +97,7 @@ This will edit NCBI taxonomy files located on `path/to/ncbi_taxo` adding a new |
|
|
* Valid FASTA file (see description [here](https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/wikis/Files-definition#valid-fasta-file))
|
|
|
* Faulty taxonomy FASTA file (see description [here](https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/wikis/Files-definition#faulty-taxonomy-fasta-file))
|
|
|
* Faulty format FASTA file (see description [here](https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/wikis/Files-definition#faulty-format-fasta-file))
|
|
|
* Edited taxonomy files: with the `--curate`, `--ncbi_taxdump_edition` and `--ncbi_taxdump` options, the `validate` module edits the taxonomy files referenced by `--ncbi_taxdump` so that nodes are added to the tree of life according to the curation file specification.
|
|
|
* Edited taxonomy files: with the `--curate` and `--ncbi_taxdump_edition` options, the `validate` module edits the taxonomy files located in the folder given to `--ncbi_taxdump_edition` so that nodes are added to the tree of life according to the curation file specification.
|
|
|
|
|
|
To perform taxonomic assignment in further analysis, you need the valid FASTA file and the corresponding taxonomy files.
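
As a rough illustration of that requirement, the sketch below lists which of the expected inputs are missing before a downstream step is started. It rests on assumptions: the valid FASTA file is taken to be `<output_prefix>_valid.fasta` (as in the `res_valid.fasta` example), and the taxonomy folder is assumed to hold the standard NCBI taxdump files `names.dmp` and `nodes.dmp`. The helper name `missing_inputs` is hypothetical, and mkbdr may rely on additional files.

```python
from pathlib import Path


def missing_inputs(output_prefix, taxdump_dir):
    """Return the paths (as strings) of expected files that do not exist.

    Assumptions: the valid FASTA is '<output_prefix>_valid.fasta', and the
    taxonomy folder contains the standard NCBI 'names.dmp' and 'nodes.dmp'.
    """
    required = [
        Path(f"{output_prefix}_valid.fasta"),
        Path(taxdump_dir) / "names.dmp",
        Path(taxdump_dir) / "nodes.dmp",
    ]
    return [str(path) for path in required if not path.is_file()]


if __name__ == "__main__":
    # 'res' and 'path/to/ncbi_taxo' mirror the example commands on this page.
    for path in missing_inputs("res", "path/to/ncbi_taxo"):
        print(f"missing: {path}")
```

An empty list means both the valid FASTA file and the two taxonomy files were found.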
|
|
|
|
... | ... | |