... | ... | @@ -21,17 +21,42 @@ The custom reference database generated by MKBDR can be use in further analysis |
|
|
See [Installing MKBDR](https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/wikis/Installing-MKBDR) for installation instructions.

Download example data with:
```
curl -LJO https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/raw/master/tests/data/raw.fasta
```
* `raw.fasta`: a FASTA file containing 4 records, the representative sequences of 4 taxon groups. More details about the input are available [here](https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/wikis/Files-definition#representative-sequences).

If you have installed MKBDR, you can run it on the example data with:
```
mkbdr init_ncbi_taxdump --folder_path arbre/nomcompres
mkbdr init_ncbi_taxdump --folder_path nouveau/arbre/ --decompress
mkbdr validate --fasta teleo_ok.fasta --output_prefix checked_teleo
mkbdr curegen --database_globalnames 'Catalogue of Life' --output_prefix cure_teleo --fasta checked_teleo_faulty_taxon.fasta
mkbdr validate --fasta teleo_ok.fasta --curate cure_teleo_curation.csv --output_prefix curated_teleo
mkbdr validate --fasta teleo_ok.fasta --curate cure_teleo_curation.csv --ncbi_taxonomy_edition nouveau/arbre/ --output_prefix curated_teleo
mkbdr validate --fasta tests/data/raw.fasta --output_prefix res_raw
```
This will output:
```
Checking arguments...done.
Validate records...
Loading local NCBI taxonomy...done.
4 processed records.
On these records, 2 are valid, 0 are faulty format and 2 are faulty taxon.
```
The `validate` module checks whether the format and the taxonomy of each record are valid. It then writes 3 files:

* `res_raw_faulty_format.fasta`: a FASTA file with the records whose format is faulty (empty in this example)
* `res_raw_faulty_taxon.fasta`: a FASTA file with the records whose taxonomy is faulty (2 faulty records in this example)
* `res_raw_valide.fasta`: a FASTA file with the correct records, which can be used as a reference database for taxonomic assignment (2 valid records in this example)

Read more details about the output files [here](https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/wikis/Files-definition#output-files).
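To cross-check the summary printed by `validate` against the files on disk, you can count the records in each output file. The snippet below is a minimal sketch and not part of MKBDR: the helper name `summarize_validate_outputs` is hypothetical, and it assumes that every FASTA record starts with a header line beginning with `>` and that the output files are in the current directory.

```shell
# Hypothetical helper (not an MKBDR command): print the number of records
# in each file written by `mkbdr validate` for a given --output_prefix.
summarize_validate_outputs() {
    prefix="${1:-res_raw}"
    for suffix in faulty_format faulty_taxon valide; do
        file="${prefix}_${suffix}.fasta"
        if [ -f "$file" ]; then
            # A FASTA record starts with a '>' header line.
            # grep -c exits non-zero when nothing matches (empty file), hence `|| true`.
            count=$(grep -c '^>' "$file" || true)
            echo "$file: $count records"
        else
            echo "$file: not found"
        fi
    done
}

summarize_validate_outputs res_raw
```

For the example data, the counts should agree with the summary line printed by `validate` (2 valid, 0 faulty format, 2 faulty taxon).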
# Next steps
Now that you've gotten the example to work, use the menu in the upper right to navigate to the more detailed descriptions and instructions for exploring your own data.
# Software Requirements
|
... | ... | |