... | ... | @@ -156,3 +156,60 @@ CNNNCTCAACACAAAAAAATCACTACATAAACAAACTT--CCAACAAGAGGAGGCAAGTC |

* ID07: the species name is incorrectly formatted (expected 2 words, _Genus_ _species_, separated by ` ` or `_`)
* ID19: the DNA sequence is incorrectly formatted (it contains IUPAC ambiguity codes `NNN` and gaps `-`)

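
The two format rules above can be sketched as simple checks. This is only an illustration under stated assumptions: the function names `is_valid_species_name` and `is_valid_dna` are hypothetical, and the exact rules (letters only in names, only `A`, `C`, `G`, `T` in sequences) are a guess at what the validation accepts, not MKBDR's actual implementation.

```python
import re

# Assumption: a valid name is exactly two alphabetic words joined by one space or underscore.
SPECIES_NAME = re.compile(r"[A-Za-z]+[ _][A-Za-z]+")
# Assumption: a valid sequence holds only unambiguous bases, with no gaps.
DNA_SEQUENCE = re.compile(r"[ACGT]+")


def is_valid_species_name(name):
    """Return True if `name` looks like 'Genus species' or 'Genus_species'."""
    return SPECIES_NAME.fullmatch(name) is not None


def is_valid_dna(sequence):
    """Return True if `sequence` has no IUPAC ambiguity codes and no gaps."""
    return DNA_SEQUENCE.fullmatch(sequence.upper()) is not None


print(is_valid_species_name("Albula argentea"))
print(is_valid_dna("CNNNCTCAAC--CCAACAAG"))
```
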
# Curation File

The curation file is both the input of the `mkbdr validate --curate` command and the output of the `mkbdr curegen` command. It is required to curate faulty taxonomy records.

The curation file is a CSV table with `;` as the delimiter and one record per row. It must have the following columns:

* `current_name` is the species name of the record in the input FASTA file
* `ncbi_name` is the curated species name of the record
* `genus` is the curated genus name of the record
* `family` is the curated family of the record
* `ncbi_rank` is the lowest rank at which the curated names are known in NCBI (species, genus or family)
* `method` gives optional information about the source of the curation of the record (e.g. geonames or NCBI synonym search)

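
As a rough illustration of the layout described above, a curation table can be read with Python's standard `csv` module. The helper `read_curation_table` is hypothetical and not part of MKBDR; only the column names and the `;` delimiter come from this section.

```python
import csv
import io

# Column names taken from the list above; the helper itself is illustrative.
REQUIRED_COLUMNS = ["current_name", "ncbi_name", "genus", "family", "ncbi_rank", "method"]


def read_curation_table(handle):
    """Read a `;`-delimited curation table and return its rows as dicts."""
    reader = csv.DictReader(handle, delimiter=";")
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {', '.join(missing)}")
    return list(reader)


# In-memory stand-in for a curation file on disk.
example = io.StringIO(
    "current_name;ncbi_name;genus;family;ncbi_rank;method\n"
    "Albula forsteri;Albula argentea;Albula;Albulidae;species;NCBI synonym score=1.0\n"
)
rows = read_curation_table(example)
print(rows[0]["ncbi_name"])
```

Checking the header up front gives a clear error when a column is missing, rather than a `KeyError` later on.
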
### Examples of curation files

* To replace a wrong species name with the correct NCBI species name

```
current_name;ncbi_name;genus;family;ncbi_rank;method
Albula forsteri;Albula argentea;Albula;Albulidae;species;NCBI synonym score=1.0
```

Here MKBDR replaces the wrong species name `Albula forsteri` (genus `Albula`, family `Albulidae`) with the NCBI species name `Albula argentea` (genus `Albula`, family `Albulidae`).

* To create a new custom species when the genus is known in NCBI

```
current_name;ncbi_name;genus;family;ncbi_rank;method
Distichodus perspicillatus;NA;Distichodus;Distichodontidae;genus;Catalogue of Life
```

The known rank in NCBI is genus, so the NCBI species name is unknown. In this case, a new species `Distichodus perspicillatus` will be created in NCBI by MKBDR under the genus `Distichodus` (family `Distichodontidae`).

* To create a new custom genus and species when the family is known in NCBI

```
current_name;ncbi_name;genus;family;ncbi_rank;method
Neoaploactis_tridorsalis;NA;Neoaploactis;Aploactinidae;family;Catalogue of Life
```

The known rank in NCBI is family, so the NCBI species name and NCBI genus are unknown. In this case, a new species `Neoaploactis tridorsalis` and a new genus `Neoaploactis` will be created in NCBI by MKBDR under the family `Aploactinidae`.
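
The three examples above suggest a simple rule: `ncbi_rank` names the lowest rank already known in NCBI, and every rank below it is created. The sketch below encodes that reading. The names `NEW_TAXA_BY_RANK` and `taxa_to_create` are hypothetical, and the mapping is inferred from the examples rather than taken from MKBDR's code.

```python
# Hypothetical mapping inferred from the three examples; not MKBDR's actual code.
NEW_TAXA_BY_RANK = {
    "species": [],                   # species known in NCBI: the name is only replaced
    "genus": ["species"],            # genus known: a new species is created
    "family": ["genus", "species"],  # family known: a new genus and species are created
}


def taxa_to_create(row):
    """Return the ranks that a curation row asks to create, based on `ncbi_rank`."""
    rank = row["ncbi_rank"]
    if rank not in NEW_TAXA_BY_RANK:
        raise ValueError(f"unexpected ncbi_rank: {rank!r}")
    return NEW_TAXA_BY_RANK[rank]


row = {
    "current_name": "Neoaploactis_tridorsalis",
    "ncbi_name": "NA",
    "genus": "Neoaploactis",
    "family": "Aploactinidae",
    "ncbi_rank": "family",
    "method": "Catalogue of Life",
}
print(taxa_to_create(row))
```
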