Several input files may be needed, depending on the analysis. These files, as well as the files output by MKBDR, are described here.

# Input Files

* Representative sequences

The representative sequences must be stored in a FASTA file. See the definition of the FASTA format on Wikipedia [here](https://en.wikipedia.org/wiki/FASTA_format).

The FASTA file is a set of records, each holding a representative sequence of a taxon you want to put into your custom reference database.

The description line of each record must follow the format below; otherwise, MKBDR treats the record as having a faulty format:

```
> ID; species_name=Mullus_surmuletus
ATGCATGCATGCATGCATGCATGCATGCATGCATGCATGC
```

or, with a space as the delimiter:

```
> ID; species_name=Mullus surmuletus
ATGCATGCATGCATGCATGCATGCATGCATGCATGCATGC
```

* The first character must be `>`.
* `ID` is the unique identifier of the sequence.
* `;` is the delimiter between the identifier and the species name.
* `species_name=` is mandatory and must immediately precede the species name.
* `Mullus surmuletus` is the species name as it appears in the NCBI taxonomy. It must match the NCBI taxonomy name exactly; otherwise, MKBDR reports a taxonomy fault. The species name consists of two words, _Genus_ and _species_, separated by a delimiter, which can be `_` or a space (` `); any other delimiter makes MKBDR report a format fault.
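
The format rules above can be sketched as a small Python check to run on a FASTA file before handing it to MKBDR. This is an illustrative sketch only, not MKBDR's own parser: the function name `parse_description_line`, the regular expression, and the choice to tolerate optional whitespace after `>` and around `;` are all assumptions. It checks the format only and does not verify the species name against the NCBI taxonomy.

```python
import re

# Hypothetical pattern mirroring the rules listed above (an assumption,
# not MKBDR's actual implementation):
#   '>' + ID + ';' + 'species_name=' + Genus + ('_' or ' ') + species
HEADER_RE = re.compile(
    r"^>\s*(?P<id>[^;\s]+)\s*;\s*"
    r"species_name=(?P<genus>[^\s_;]+)[_ ](?P<species>[^\s_;]+)\s*$"
)


def parse_description_line(line):
    """Return (sequence_id, 'Genus species') or raise ValueError.

    Only the layout of the description line is checked; the name is
    not looked up in the NCBI taxonomy.
    """
    match = HEADER_RE.match(line.rstrip("\n"))
    if match is None:
        raise ValueError("format fault: %r" % line)
    # Normalise the delimiter so both accepted spellings compare equal.
    name = "%s %s" % (match.group("genus"), match.group("species"))
    return match.group("id"), name


if __name__ == "__main__":
    for header in (
        "> ID; species_name=Mullus_surmuletus",
        "> ID; species_name=Mullus surmuletus",
    ):
        print(parse_description_line(header))
```

Both accepted delimiters are normalised to a single space, so the two example description lines yield the same identifier and species name. A line with a missing `>`, a missing `species_name=` prefix, or a name that is not exactly two words raises `ValueError`.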