Update home authored by peguerin
@@ -21,17 +21,42 @@ The custom reference database generated by MKBDR can be used in further analysis
See [Installing MKBDR](https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/wikis/Installing-MKBDR) for installation instructions.
Download example data with:
```
curl -LJO https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/raw/master/tests/data/raw.fasta
```
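* `raw.fasta`: a FASTA file containing 4 records, the representative sequences of 4 taxon groups. More details about the input are available [here](https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/wikis/Files-definition#representative-sequences).
If you have installed MKBDR, you can run it on the example data with the commands below. The last command reads `tests/data/raw.fasta`; if you downloaded `raw.fasta` into the current directory with the `curl` command above, adjust the `--fasta` path accordingly.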
```
mkbdr init_ncbi_taxdump --folder_path arbre/nomcompres
mkbdr init_ncbi_taxdump --folder_path nouveau/arbre/ --decompress
mkbdr validate --fasta teleo_ok.fasta --output_prefix checked_teleo
mkbdr curegen --database_globalnames 'Catalogue of Life' --output_prefix cure_teleo --fasta checked_teleo_faulty_taxon.fasta
mkbdr validate --fasta teleo_ok.fasta --curate cure_teleo_curation.csv --output_prefix curated_teleo
mkbdr validate --fasta teleo_ok.fasta --curate cure_teleo_curation.csv --ncbi_taxonomy_edition nouveau/arbre/ --output_prefix curated_teleo
mkbdr validate --fasta tests/data/raw.fasta --output_prefix res_raw
```
This outputs:
```
Checking arguments...done.
Validate records...
Loading local NCBI taxonomy...done.
4 processed records.
On these records, 2 are valid, 0 are faulty format and 2 are faulty taxon.
```
The `validate` module checks whether the format and the taxonomy of each record are valid. It then writes 3 files:
* `res_raw_faulty_format.fasta`: a FASTA file with the records whose format is faulty (empty in this example)
* `res_raw_faulty_taxon.fasta`: a FASTA file with the records whose taxonomy is faulty (2 faulty records in this example)
* `res_raw_valide.fasta`: a FASTA file with the correct records, which can be used as a reference database for taxonomic assignment (2 valid records in this example)

Read more details about the output files [here](https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/wikis/Files-definition#output-files).
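To double-check the three output files against the summary printed by `validate`, you can count the FASTA records in each of them. The snippet below is a minimal sketch, not part of MKBDR: `count_fasta_records` is a hypothetical helper, and it assumes that every record starts with a header line beginning with `>` and that the files were written with the `res_raw` prefix used above.

```shell
# Hypothetical helper (not provided by MKBDR): count FASTA records,
# assuming each record starts with a header line beginning with ">".
# Prints 0 for an empty or missing file.
count_fasta_records() {
    if [ -s "$1" ]; then
        # grep -c prints the number of matching lines; "|| true" keeps the
        # exit status at 0 when there is no match.
        grep -c '^>' "$1" || true
    else
        echo 0
    fi
}

# Report the record count of each file written with --output_prefix res_raw.
for suffix in valide faulty_format faulty_taxon; do
    file="res_raw_${suffix}.fasta"
    echo "${file}: $(count_fasta_records "$file") record(s)"
done
```

For the example run, the counts should agree with the summary line printed by `validate` (2 valid, 0 faulty format, 2 faulty taxon).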
# Next steps
Now that you've gotten the example to work, use the menu in the upper right to navigate to the more detailed descriptions and instructions for exploring your own data.
# Software Requirements