Update Validate authored by peguerin
The **validate** module produces a FASTA file of valid records with taxid attributes. This FASTA file is consistent with the NCBI taxonomy, so it can be used for further analyses such as taxonomic assignment with ecotag.
### Usage
#### The simplest case:
```
mkbdr validate --fasta raw.fasta --output_prefix res
```
This reads the FASTA file `raw.fasta` and checks the taxonomy and format of each record. mkbdr then writes three FASTA files (valid records, faulty-format records and faulty-taxonomy records), each prefixed with `res`:
```
res_valid.fasta
res_faulty_format.fasta
res_faulty_taxonomy.fasta
```
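To get a quick overview of how the records were split, the headers in each file can be counted. This is a minimal sketch that relies only on standard shell tools and on the file names listed above; the helper `count_records` is a hypothetical name introduced here and is not part of mkbdr.

```shell
# Hypothetical helper (not part of mkbdr): count FASTA records in a file
# by counting header lines, i.e. lines starting with '>'.
count_records() {
    if [ -f "$1" ]; then
        # grep -c prints 0 and returns a non-zero status when nothing
        # matches, so the status is ignored here.
        grep -c '^>' "$1" || true
    else
        # A missing file is reported as zero records.
        echo 0
    fi
}

# Print one line per file: file name, then number of records.
for file in raw.fasta res_valid.fasta res_faulty_format.fasta res_faulty_taxonomy.fasta; do
    printf '%s\t%s\n' "$file" "$(count_records "$file")"
done
```

Comparing the count for `raw.fasta` with the counts of the three output files gives a rough idea of how many records were rejected and for which reason.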
#### Basic curation:
* To curate names without editing the NCBI taxonomy files:
```
mkbdr validate --fasta raw.fasta --output_prefix res --curate curation_table.csv
```
`curation_table.csv` is read to curate the species names of the records in `raw.fasta`.
#### Using a local NCBI taxonomy
* To run mkbdr with your own NCBI taxonomy located at `path/to/ncbi_taxo`, add the argument `--ncbi_taxdump` to your command the first time you run it:
```
mkbdr validate --fasta raw.fasta --output_prefix res --ncbi_taxdump path/to/ncbi_taxo
```
ete3 loads the files located in `path/to/ncbi_taxo` and stores the NCBI taxonomy tree object in your home folder. Once the NCBI taxonomy has been loaded locally, you can simply run:
```
mkbdr validate --fasta raw.fasta --output_prefix res
```
The default NCBI taxonomy used by mkbdr is now your local NCBI taxonomy.
To change your local NCBI taxonomy again, for instance to load the taxonomy located at `path/to/an/other/ncbi_taxo2`, run:
```
mkbdr validate --fasta raw.fasta --output_prefix res --ncbi_taxdump path/to/an/other/ncbi_taxo2
```
### Inputs
### Outputs
### Options