custom_reference_database issues
https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/issues
2022-09-05T13:46:31Z

# error Ophidion rochei
https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/issues/9
2022-09-05T13:46:31Z, mbruno

The species *Ophidion rochei* is present in the NCBI taxonomy database: https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?name=Ophidion+rochei
## Step 2: mkbdr curegen result:
|current_name|ncbi_name|genus|family|ncbi_rank|method|
|--|--|--|--|--|--|
|Ophidion rochei |Ophidion rochei|Ophidion|Orchidaceae|species|NCBI synonym score=1.0|
## Step 3: mkbdr validate result:
*terminal*
```
error Ophidion rochei
error Ophidion rochei
```
*20220829_DB_Ref_CEFE_teleo_curated_faulty_taxon.fasta*
```
>Sample_ID51; species_name=Ophidion_rochei; faulty taxonomy: species name Ophidion rochei not found in NCBI; faulty taxonomy: species name Ophidion rochei not found in NCBI
CTCCTAAAATACCGGCTATATAACTTAATACATACACACGTTAAAGGGGAGGAAAGTCGT
AA
>Sample_ID52; species_name=Ophidion_rochei; faulty taxonomy: species name Ophidion rochei not found in NCBI; faulty taxonomy: species name Ophidion rochei not found in NCBI
CTCCTAAAATACCGGCTATATAACTTAATACATACACACGTTAAAGGGGAGGAAAGTCGT
AA
```

# Bug - when adding a new node, add ";"
https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/issues/6
2021-12-13T15:01:14Z, vmarques

When adding a new node, the program adds ";" after the sequence name.
This causes ecotag to behave weirdly later on.

# Bug - step 3 validate and curate
https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/issues/7
2021-12-13T12:04:17Z, mbruno

```bash
mkbdr validate --fasta data/teleo_ok_global+med.fasta --curate res_raw_curation.csv --ncbi_taxonomy_edition customtaxonomy/ --output_prefix res_taxo_curated
```
```bash
Checking arguments...done.
Validate records...
Loading local NCBI taxonomy...done.
Curating records with faulty taxonomy...
Traceback (most recent call last):
File "/home/mbruno/.local/bin/mkbdr", line 8, in <module>
sys.exit(main())
File "/home/mbruno/.local/lib/python3.9/site-packages/mkbdr/__main__.py", line 55, in main
results = curation(args.curate, rawResults, taxDic, ncbi, args.ncbi_taxonomy_edition)
File "/home/mbruno/.local/lib/python3.9/site-packages/mkbdr/curate.py", line 171, in curation
cureRankNCBI = convert_ncbirank_literal_to_integer(cureRecord.ncbi_rank.values[0])
AttributeError: 'bool' object has no attribute 'ncbi_rank'
```

Assignee: mbruno

# Bug - step 2 Curation generation - 'Catalogue of Life database not found'
https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/issues/8
2021-12-13T12:01:59Z, mbruno

```
mkbdr curegen --fasta res_raw_faulty_taxon.fasta \
--database_globalnames 'Catalogue of Life' \
--output_prefix res_raw
```
Results - 10/12/2021
```
current_name;ncbi_name;genus;family;ncbi_rank;method
Albula forsteri;Albula argentea;Albula;Albulidae;species;NCBI synonym score=1.0
Amphiprion fuscocaudatus;NA;NA;NA;NA;FAILURE: Catalogue of Life database not found in globalNames query
Atherinomorus lineatus;NA;NA;NA;NA;FAILURE: Catalogue of Life database not found in globalNames query
...
```
Results - 15/05/2021
```
current_name;ncbi_name;genus;family;ncbi_rank;method
Albula forsteri;Albula argentea;Albula;Albulidae;species;NCBI synonym score=1.0
Amphiprion fuscocaudatus;NA;Amphiprion;Pomacentridae;genus;Catalogue of Life
Atherinomorus lineatus;NA;Atherinomorus;Atherinidae;genus;Catalogue of Life
...
```

Assignee: mbruno

# several taxid for one species - bug
https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/issues/5
2021-06-21T13:56:09Z, vmarques

Example with the CSV for curation:
```
current_name;ncbi_name;genus;family;ncbi_rank;method
Albula forsteri;Albula argentea;Albula;Albulidae;species;NCBI synonym score=1.0
Albula forsteri;Albula argentea;Albula;Albulidae;species;NCBI synonym score=1.0
Albula forsteri;Albula argentea;Albula;Albulidae;species;NCBI synonym score=1.0
Amphiprion fuscocaudatus;NA;Amphiprion;Pomacentridae;genus;Catalogue of Life
Atherinomorus lineatus;NA;Atherinomorus;Atherinidae;genus;Catalogue of Life
Haemulon chrysargyreum;Brachygenys chrysargyreum;Brachygenys;Haemulidae;species;NCBI synonym score=1.0
Haemulon chrysargyreum;Brachygenys chrysargyreum;Brachygenys;Haemulidae;species;NCBI synonym score=1.0
Canthigaster epilampra;NA;Canthigaster;Tetraodontidae;genus;Catalogue of Life
Distichodus perspicillatus;NA;Distichodus;Distichodontidae;genus;Catalogue of Life
Distichodus perspicillatus;NA;Distichodus;Distichodontidae;genus;Catalogue of Life
Hirundichthys rondeleti;Hirundichthys rondeletii;Hirundichthys;Exocoetidae;species;NCBI synonym score=0.9565217391304348
Haemulon chrysargyreum;Brachygenys chrysargyreum;Brachygenys;Haemulidae;species;NCBI synonym score=1.0
Hyporhamphus melanopterus;NA;Hyporhamphus;Hemiramphidae;genus;Catalogue of Life
Haemulopsis corvinaeformis;Pomadasys corvinaeformis;Pomadasys;Haemulidae;species;NCBI synonym score=1.0
Neoglyphidodon crossi;NA;Neoglyphidodon;Pomacentridae;genus;Catalogue of Life
Neoglyphidodon crossi;NA;Neoglyphidodon;Pomacentridae;genus;Catalogue of Life
Neoploactis tridorsalis;NA;NA;Aploactinidae;family;Catalogue of Life
Ophidion barbatum;NA;Ophidion;Ophidiidae;genus;Catalogue of Life
Ostorhinchus monospilus;NA;Ostorhinchus;Apogonidae;genus;Catalogue of Life
Ostorhinchus monospilus;NA;Ostorhinchus;Apogonidae;genus;Catalogue of Life
Cynoponticus savanna;NA;Cynoponticus;Muraenesocidae;genus;Catalogue of Life
Pseudanthias randali;Pseudanthias randalli;Pseudanthias;Serranidae;species;NCBI synonym score=0.95
Pseudanthias randali;Pseudanthias randalli;Pseudanthias;Serranidae;species;NCBI synonym score=0.95
Pseudanthias randali;Pseudanthias randalli;Pseudanthias;Serranidae;species;NCBI synonym score=0.95
Pseudanthias randali;Pseudanthias randalli;Pseudanthias;Serranidae;species;NCBI synonym score=0.95
Pseudanthias randali;Pseudanthias randalli;Pseudanthias;Serranidae;species;NCBI synonym score=0.95
Pseudanthias randali;Pseudanthias randalli;Pseudanthias;Serranidae;species;NCBI synonym score=0.95
Rhinobatos sainsburyi;NA;Rhinobatos;Rhinobatidae;genus;Catalogue of Life
Aspitrigla cuculus;Chelidonichthys cuculus;Chelidonichthys;Triglidae;species;NCBI synonym score=1.0
Aspitrigla cuculus;Chelidonichthys cuculus;Chelidonichthys;Triglidae;species;NCBI synonym score=1.0
Carcharhinus taurus;Carcharhinus cautus;Carcharhinus;Carcharhinidae;species;NCBI synonym score=0.8947368421052632
Glaucostegus cemicullus;Glaucostegus cemiculus;Glaucostegus;Glaucostegidae;species;NCBI synonym score=0.9565217391304348
Glaucostegus cemicullus;Glaucostegus cemiculus;Glaucostegus;Glaucostegidae;species;NCBI synonym score=0.9565217391304348
Glaucostegus cemicullus;Glaucostegus cemiculus;Glaucostegus;Glaucostegidae;species;NCBI synonym score=0.9565217391304348
Glaucostegus cemicullus;Glaucostegus cemiculus;Glaucostegus;Glaucostegidae;species;NCBI synonym score=0.9565217391304348
Gobius ater;NA;Gobius;Gobiidae;genus;Catalogue of Life
Ophidion rochei;NA;Ophidion;Ophidiidae;genus;NA
Ophidion rochei;NA;Ophidion;Ophidiidae;genus;NA
```
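
Several rows in the CSV above are exact duplicates (for instance, `Albula forsteri` appears three times). As a possible workaround, and not as a description of how mkbdr itself works, here is a minimal sketch that collapses exact duplicate rows before the file is used any further. It assumes the curation file is the semicolon-separated CSV shown above; `unique_curation_rows` is a hypothetical helper name.

```python
import csv
import io


def unique_curation_rows(csv_text):
    """Return curation rows with exact duplicates removed, in first-seen order."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=";")
    seen = set()
    unique = []
    for row in reader:
        key = tuple(row.values())  # the whole row is the deduplication key
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Small excerpt of the curation CSV from this issue.
sample = """current_name;ncbi_name;genus;family;ncbi_rank;method
Albula forsteri;Albula argentea;Albula;Albulidae;species;NCBI synonym score=1.0
Albula forsteri;Albula argentea;Albula;Albulidae;species;NCBI synonym score=1.0
Distichodus perspicillatus;NA;Distichodus;Distichodontidae;genus;Catalogue of Life
Distichodus perspicillatus;NA;Distichodus;Distichodontidae;genus;Catalogue of Life
"""

rows = unique_curation_rows(sample)
print([row["current_name"] for row in rows])
```

Keying on the whole row rather than on `current_name` alone keeps rows that share a name but differ in any other column, so conflicting curations stay visible.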
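The customtaxonomy names.dmp contains several taxids for the same species (for example, `Distichodus perspicillatus` has both 10000003 and 10000004):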
```
10000000 | Amphiprion fuscocaudatus | | scientific name |
10000001 | Atherinomorus lineatus | | scientific name |
10000002 | Canthigaster epilampra | | scientific name |
10000003 | Distichodus perspicillatus | | scientific name |
10000004 | Distichodus perspicillatus | | scientific name |
10000005 | Hyporhamphus melanopterus | | scientific name |
10000006 | Neoglyphidodon crossi | | scientific name |
10000007 | Neoglyphidodon crossi | | scientific name |
10000008 | Ophidion barbatum | | scientific name |
10000009 | Ostorhinchus monospilus | | scientific name |
10000010 | Ostorhinchus monospilus | | scientific name |
10000011 | Cynoponticus savanna | | scientific name |
10000012 | Rhinobatos sainsburyi | | scientific name |
10000013 | Gobius ater | | scientific name |
10000014 | Ophidion rochei | | scientific name |
10000015 | Ophidion rochei | | scientific name |
```

# Error text written names.dmp
https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/issues/1
2021-06-21T13:55:53Z, vmarques

```
names.dmp:3416120:2839645 | ANK:collector:H.Duman:10209 | | isotype |
names.dmp:3416121:2839645 | GAZI:collector:H.Duman:10209 | | holotype |
names.dmp:3416122:2839645 | HUB:collector:H.Duman:10209 | | isotype |
```
causing ecotag to fail.

# faulty format -- needs correction
https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/issues/3
2021-06-21T13:55:37Z, vmarques

Some species names are flagged as having a faulty format when they contain more than Genus_species, for example Genus_species_subspecies (or even Genus_sp_cf_species, used when there may be a new undescribed species).
```
>RBM2_194; species_name=Syngnathus_typhle_rondeleti ; faulty species name format Syngnathus_typhle_rondeleti
CCCCTAATATCTCATAAATTTAAGTAAAACACCTGAAAAATTAAGGGGAGGCAAGTCGTA
A
```
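
To make the request concrete, here is a minimal sketch of a more permissive name check. The pattern is an assumption chosen for illustration and is not mkbdr's actual validation rule; `is_accepted_species_name` is a hypothetical helper name. It accepts a capitalised genus followed by one to three lowercase, underscore-separated parts, which covers Genus_species, Genus_species_subspecies and Genus_sp_cf_species.

```python
import re

# Assumed pattern: capitalised genus, then 1 to 3 lowercase parts joined by "_".
SPECIES_NAME_PATTERN = re.compile(r"[A-Z][a-z]+(?:_[a-z]+){1,3}")


def is_accepted_species_name(name):
    """Return True if the name matches the assumed, more permissive format."""
    # Strip surrounding whitespace first: the header above has a space before ";".
    return SPECIES_NAME_PATTERN.fullmatch(name.strip()) is not None


for candidate in ["Syngnathus_typhle_rondeleti ", "Ophidion_rochei", "Ophidion"]:
    print(candidate.strip(), is_accepted_species_name(candidate))
```

A genus-only name such as `Ophidion` is still rejected under this pattern, so the check keeps catching incomplete names.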
The format check needs to be corrected so that such names are accepted.

Assignee: peguerin (pierre-edouard.guerin@cefe.cnrs.fr)

# error installation
https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/issues/2
2021-06-14T08:36:34Z, vmarques

I get an error when trying to install with `pip3 install .`:
```
Processing /media/superdisk/edna/training/vmarques/custom_reference_database
Collecting PyQt5 (from mkbdr==1.0.1)
Using cached https://files.pythonhosted.org/packages/8e/a4/d5e4bf99dd50134c88b95e926d7b81aad2473b47fde5e3e4eac2c69a8942/PyQt5-5.15.4.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/lib/python3.7/tokenize.py", line 447, in open
buffer = _builtin_open(filename, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pip-build-_4w3l78w/PyQt5/setup.py'
----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-_4w3l78w/PyQt5/
```

# How to finish the installation of ete3? (Fix issue: mkbdr curegen sqlite3.OperationalError)
https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/issues/4
2021-06-14T08:35:48Z, peguerin (pierre-edouard.guerin@cefe.cnrs.fr)

I can't manage to fix the bug described in: https://gitlab.mbb.univ-montp2.fr/edna/custom_reference_database/-/wikis/How-to-guide#fix-issue-mkbdr-curegen-sqlite3operationalerror
I'm stuck because I can't find the folder:
```
which ete3
/home/vmarques/.local/bin/ete3
```
I tried with the local python3/ete3 install and also in a conda env:
```
which ete3
/opt/anaconda3/envs/mkbdr_dep/bin/ete3
```
```
cd /opt/anaconda3/envs/mkbdr_dep/bin/ete3/
-bash: cd: /opt/anaconda3/envs/mkbdr_dep/bin/ete3/: Not a directory
```
```
cd /home/vmarques/.local/bin/ete3
-bash: cd: /home/vmarques/.local/bin/ete3: Not a directory
```
If you have any idea, let me know, because this is blocking the next steps.
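
Both `cd` attempts fail because `which ete3` returns the path of the ete3 launcher script, which is a file and not a folder. If the folder in question is the installed ete3 Python package (an assumption: the wiki page may point to a different location), a minimal sketch for finding it from the interpreter in use follows. `package_dir` is a hypothetical helper name.

```python
import importlib.util
import os


def package_dir(module_name):
    """Return the directory holding an importable module, or None if not found."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or spec.origin is None:
        return None
    return os.path.dirname(spec.origin)


# Prints None when ete3 is not importable from this interpreter.
print(package_dir("ete3"))
```

Run it with the same interpreter that runs mkbdr (for example, inside the `mkbdr_dep` conda env), since the local and conda installs live in different directories.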