Commit 447e2f5a authored by peguerin

quast metrics

parent a93bbc90
## dependencies
sudo apt-get update && sudo apt-get install -y pkg-config libfreetype6-dev libpng-dev python-matplotlib
## Quast
git clone https://github.com/ablab/quast.git
cd quast
./setup.py install
| Assembly | serranus.1 | serran_mbb_gapClosed | improved_serran | mullus_gapclose_mbb_gapClosed | mullus_gapclose_hpc_gapClosed | sar_hpc_g_gapClosed |
|---|---:|---:|---:|---:|---:|---:|
| # contigs (>= 0 bp) | 34168 | 177982 | 3102 | 442644 | 272725 | 221585 |
| # contigs (>= 1000 bp) | 34168 | 20164 | 3102 | 34257 | 27638 | 19129 |
| # contigs (>= 5000 bp) | 10222 | 2190 | 2122 | 5417 | 2940 | 2344 |
| # contigs (>= 10000 bp) | 6329 | 1713 | 1645 | 4435 | 2159 | 1779 |
| # contigs (>= 25000 bp) | 4063 | 1485 | 1418 | 3594 | 1771 | 1285 |
| # contigs (>= 50000 bp) | 2975 | 1355 | 1294 | 2813 | 1517 | 940 |
| Total length (>= 0 bp) | 663224709 | 697914092 | 630579612 | 666444108 | 630495908 | 860656326 |
| Total length (>= 1000 bp) | 663224709 | 656076523 | 630579612 | 566759743 | 577710634 | 813268755 |
| Total length (>= 5000 bp) | 610460294 | 626940096 | 626940776 | 517116744 | 534778755 | 785047582 |
| Total length (>= 10000 bp) | 583798943 | 623752971 | 623753651 | 510327740 | 529438335 | 781123307 |
| Total length (>= 25000 bp) | 548865149 | 620167381 | 620187813 | 496231759 | 523327264 | 773135675 |
| Total length (>= 50000 bp) | 508844184 | 615500364 | 615743990 | 467474763 | 514184196 | 760968506 |
| # contigs | 12204 | 2469 | 2401 | 5907 | 3425 | 2627 |
| Largest contig | 1807726 | 2764103 | 2764103 | 1111082 | 3278074 | 22692753 |
| Total length | 619323479 | 628172492 | 628173172 | 519292505 | 536930553 | 786295233 |
| GC (%) | 40.44 | 40.32 | 40.32 | 45.05 | 45.14 | 42.09 |
| N50 | 180366 | 613332 | 629123 | 191830 | 479354 | 3371708 |
| N75 | 73870 | 397188 | 421491 | 102184 | 257789 | 789582 |
| L50 | 910 | 352 | 342 | 825 | 317 | 58 |
| L75 | 2251 | 663 | 643 | 1751 | 698 | 182 |
| # N's per 100 kbp | 1086.23 | 3770.02 | 3770.13 | 9764.56 | 6706.19 | 2934.13 |
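
For readers unfamiliar with the N50/L50 and N75/L75 rows, the sketch below shows the usual definition: sort contigs from longest to shortest and walk down the list until the running sum reaches the chosen fraction of the total length. The function name `nx_lx` and its `min_contig` parameter are hypothetical and are not part of QUAST; this illustrates the metric only and is not QUAST's own implementation.

```python
def nx_lx(lengths, fraction=0.5, min_contig=0):
    """Return (Nx, Lx) for a list of contig lengths.

    fraction=0.5 gives (N50, L50); fraction=0.75 gives (N75, L75).
    Contigs shorter than min_contig are ignored (hypothetical parameter,
    loosely mirroring a minimum-contig filter).
    """
    # Keep only contigs that pass the length filter, longest first.
    ordered = sorted((n for n in lengths if n >= min_contig), reverse=True)
    threshold = sum(ordered) * fraction
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            # 'length' is Nx; 'count' (number of contigs used) is Lx.
            return length, count
    raise ValueError("no contigs left after filtering")


if __name__ == "__main__":
    toy = [80, 70, 50, 40, 30, 20, 10]  # made-up lengths, not real data
    print(nx_lx(toy))                   # N50, L50
    print(nx_lx(toy, fraction=0.75))    # N75, L75
```

A larger N50 together with a smaller L50 means that fewer, longer contigs cover half of the assembly.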
/home/pguerin/src/anaconda3/lib/python3.6/site-packages/quast-5.0.2-py3.6.egg/EGG-INFO/scripts/quast.py --min-contig 4000 --eukaryote --circos --output-dir metrics/diplodus/platanus/mesolr /donnees/RESERVEBENEFIT/whole_genome_assembly/assemblage_sar/gapclose_platanus_hpc/sar_hpc_g_gapClosed.fa
Version: 5.0.2, aa6e8430
System information:
OS: Linux-4.15.0-54-generic-x86_64-with-debian-buster-sid (linux_64)
Python version: 3.6.8
CPUs number: 8
Started: 2019-07-11 11:31:36
Logging to /home/pguerin/working/reservebenefit/genome_assembly/projets/genome_assemblies_collection/measuring/metrics/diplodus/platanus/mesolr/quast.log
NOTICE: Maximum number of threads is set to 2 (use --threads option to set it manually)
CWD: /home/pguerin/working/reservebenefit/genome_assembly/projets/genome_assemblies_collection/measuring
Main parameters:
MODE: default, threads: 2, eukaryotic: true, min contig length: 4000, min alignment length: 65, \
min alignment IDY: 95.0, ambiguity: one, threshold for extensive misassembly size: 1000
Contigs:
Pre-processing...
/donnees/RESERVEBENEFIT/whole_genome_assembly/assemblage_sar/gapclose_platanus_hpc/sar_hpc_g_gapClosed.fa ==> sar_hpc_g_gapClosed
2019-07-11 11:32:05
Running Basic statistics processor...
Contig files:
sar_hpc_g_gapClosed
Calculating N50 and L50...
sar_hpc_g_gapClosed, N50 = 3371708, L50 = 58, Total length = 786295233, GC % = 42.09, # N's per 100 kbp = 2934.13
Drawing Nx plot...
saved to /home/pguerin/working/reservebenefit/genome_assembly/projets/genome_assemblies_collection/measuring/metrics/diplodus/platanus/mesolr/basic_stats/Nx_plot.pdf
Drawing cumulative plot...
saved to /home/pguerin/working/reservebenefit/genome_assembly/projets/genome_assemblies_collection/measuring/metrics/diplodus/platanus/mesolr/basic_stats/cumulative_plot.pdf
Drawing GC content plot...
saved to /home/pguerin/working/reservebenefit/genome_assembly/projets/genome_assemblies_collection/measuring/metrics/diplodus/platanus/mesolr/basic_stats/GC_content_plot.pdf
Drawing sar_hpc_g_gapClosed GC content plot...
saved to /home/pguerin/working/reservebenefit/genome_assembly/projets/genome_assemblies_collection/measuring/metrics/diplodus/platanus/mesolr/basic_stats/sar_hpc_g_gapClosed_GC_content_plot.pdf
Done.
NOTICE: Genes are not predicted by default. Use --gene-finding or --glimmer option to enable it.
2019-07-11 11:32:33
Creating large visual summaries...
This may take a while: press Ctrl-C to skip this step..
1 of 2: Creating Icarus viewers...
2 of 2: Creating PDF with all tables and plots...
Done
2019-07-11 11:32:39
RESULTS:
Text versions of total report are saved to /home/pguerin/working/reservebenefit/genome_assembly/projets/genome_assemblies_collection/measuring/metrics/diplodus/platanus/mesolr/report.txt, report.tsv, and report.tex
Text versions of transposed total report are saved to /home/pguerin/working/reservebenefit/genome_assembly/projets/genome_assemblies_collection/measuring/metrics/diplodus/platanus/mesolr/transposed_report.txt, transposed_report.tsv, and transposed_report.tex
HTML version (interactive tables and plots) is saved to /home/pguerin/working/reservebenefit/genome_assembly/projets/genome_assemblies_collection/measuring/metrics/diplodus/platanus/mesolr/report.html
PDF version (tables and plots) is saved to /home/pguerin/working/reservebenefit/genome_assembly/projets/genome_assemblies_collection/measuring/metrics/diplodus/platanus/mesolr/report.pdf
Icarus (contig browser) is saved to /home/pguerin/working/reservebenefit/genome_assembly/projets/genome_assemblies_collection/measuring/metrics/diplodus/platanus/mesolr/icarus.html
Log is saved to /home/pguerin/working/reservebenefit/genome_assembly/projets/genome_assemblies_collection/measuring/metrics/diplodus/platanus/mesolr/quast.log
Finished: 2019-07-11 11:32:42
Elapsed time: 0:01:05.914062
NOTICEs: 2; WARNINGs: 0; non-fatal ERRORs: 0
Thank you for using QUAST!
\documentclass[12pt,a4paper]{article}
\begin{document}
\begin{table}[ht]
\begin{center}
\caption{All statistics are based on contigs of size $\geq$ 4000 bp, unless otherwise noted (e.g., "\# contigs ($\geq$ 0 bp)" and "Total length ($\geq$ 0 bp)" include all contigs).}
\begin{tabular}{|l*{1}{|r}|}
\hline
Assembly & sar\_hpc\_g\_gapClosed \\ \hline
\# contigs ($\geq$ 0 bp) & 221585 \\ \hline
\# contigs ($\geq$ 1000 bp) & 19129 \\ \hline
\# contigs ($\geq$ 5000 bp) & 2344 \\ \hline
\# contigs ($\geq$ 10000 bp) & 1779 \\ \hline
\# contigs ($\geq$ 25000 bp) & 1285 \\ \hline
\# contigs ($\geq$ 50000 bp) & 940 \\ \hline
Total length ($\geq$ 0 bp) & 860656326 \\ \hline
Total length ($\geq$ 1000 bp) & 813268755 \\ \hline
Total length ($\geq$ 5000 bp) & 785047582 \\ \hline
Total length ($\geq$ 10000 bp) & 781123307 \\ \hline
Total length ($\geq$ 25000 bp) & 773135675 \\ \hline
Total length ($\geq$ 50000 bp) & 760968506 \\ \hline
\# contigs & 2627 \\ \hline
Largest contig & 22692753 \\ \hline
Total length & 786295233 \\ \hline
GC (\%) & 42.09 \\ \hline
N50 & 3371708 \\ \hline
N75 & 789582 \\ \hline
L50 & 58 \\ \hline
L75 & 182 \\ \hline
\# N's per 100 kbp & 2934.13 \\ \hline
\end{tabular}
\end{center}
\end{table}
\end{document}
Assembly sar_hpc_g_gapClosed
# contigs (>= 0 bp) 221585
# contigs (>= 1000 bp) 19129
# contigs (>= 5000 bp) 2344
# contigs (>= 10000 bp) 1779
# contigs (>= 25000 bp) 1285
# contigs (>= 50000 bp) 940
Total length (>= 0 bp) 860656326
Total length (>= 1000 bp) 813268755
Total length (>= 5000 bp) 785047582
Total length (>= 10000 bp) 781123307
Total length (>= 25000 bp) 773135675
Total length (>= 50000 bp) 760968506
# contigs 2627
Largest contig 22692753
Total length 786295233
GC (%) 42.09
N50 3371708
N75 789582
L50 58
L75 182
# N's per 100 kbp 2934.13
All statistics are based on contigs of size >= 4000 bp, unless otherwise noted (e.g., "# contigs (>= 0 bp)" and "Total length (>= 0 bp)" include all contigs).
Assembly sar_hpc_g_gapClosed
# contigs (>= 0 bp) 221585
# contigs (>= 1000 bp) 19129
# contigs (>= 5000 bp) 2344
# contigs (>= 10000 bp) 1779
# contigs (>= 25000 bp) 1285
# contigs (>= 50000 bp) 940
Total length (>= 0 bp) 860656326
Total length (>= 1000 bp) 813268755
Total length (>= 5000 bp) 785047582
Total length (>= 10000 bp) 781123307
Total length (>= 25000 bp) 773135675
Total length (>= 50000 bp) 760968506
# contigs 2627
Largest contig 22692753
Total length 786295233
GC (%) 42.09
N50 3371708
N75 789582
L50 58
L75 182
# N's per 100 kbp 2934.13
\documentclass[12pt,a4paper]{article}
\begin{document}
\begin{table}[ht]
\begin{center}
\caption{All statistics are based on contigs of size $\geq$ 4000 bp, unless otherwise noted (e.g., "\# contigs ($\geq$ 0 bp)" and "Total length ($\geq$ 0 bp)" include all contigs).}
\begin{tabular}{|l*{21}{|r}|}
\hline
Assembly & \# contigs ($\geq$ 0 bp) & \# contigs ($\geq$ 1000 bp) & \# contigs ($\geq$ 5000 bp) & \# contigs ($\geq$ 10000 bp) & \# contigs ($\geq$ 25000 bp) & \# contigs ($\geq$ 50000 bp) & Total length ($\geq$ 0 bp) & Total length ($\geq$ 1000 bp) & Total length ($\geq$ 5000 bp) & Total length ($\geq$ 10000 bp) & Total length ($\geq$ 25000 bp) & Total length ($\geq$ 50000 bp) & \# contigs & Largest contig & Total length & GC (\%) & N50 & N75 & L50 & L75 & \# N's per 100 kbp \\ \hline
sar\_hpc\_g\_gapClosed & 221585 & 19129 & 2344 & 1779 & 1285 & 940 & 860656326 & 813268755 & 785047582 & 781123307 & 773135675 & 760968506 & 2627 & 22692753 & 786295233 & 42.09 & 3371708 & 789582 & 58 & 182 & 2934.13 \\ \hline
\end{tabular}
\end{center}
\end{table}
\end{document}
Assembly # contigs (>= 0 bp) # contigs (>= 1000 bp) # contigs (>= 5000 bp) # contigs (>= 10000 bp) # contigs (>= 25000 bp) # contigs (>= 50000 bp) Total length (>= 0 bp) Total length (>= 1000 bp) Total length (>= 5000 bp) Total length (>= 10000 bp) Total length (>= 25000 bp) Total length (>= 50000 bp) # contigs Largest contig Total length GC (%) N50 N75 L50 L75 # N's per 100 kbp
sar_hpc_g_gapClosed 221585 19129 2344 1779 1285 940 860656326 813268755 785047582 781123307 773135675 760968506 2627 22692753 786295233 42.09 3371708 789582 58 182 2934.13
All statistics are based on contigs of size >= 4000 bp, unless otherwise noted (e.g., "# contigs (>= 0 bp)" and "Total length (>= 0 bp)" include all contigs).
Assembly # contigs (>= 0 bp) # contigs (>= 1000 bp) # contigs (>= 5000 bp) # contigs (>= 10000 bp) # contigs (>= 25000 bp) # contigs (>= 50000 bp) Total length (>= 0 bp) Total length (>= 1000 bp) Total length (>= 5000 bp) Total length (>= 10000 bp) Total length (>= 25000 bp) Total length (>= 50000 bp) # contigs Largest contig Total length GC (%) N50 N75 L50 L75 # N's per 100 kbp
sar_hpc_g_gapClosed 221585 19129 2344 1779 1285 940 860656326 813268755 785047582 781123307 773135675 760968506 2627 22692753 786295233 42.09 3371708 789582 58 182 2934.13
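
A minimal sketch for loading a single-assembly report (metric name, then value) into a dictionary, for example to compare several runs. It assumes the file is tab-separated, as the `.tsv` extension suggests; the tabs are not visible on this page. The helper name `parse_quast_report` is hypothetical, and the sample string reuses a few values from the report above.

```python
import csv
import io


def parse_quast_report(text):
    """Map metric name -> value (as a string) from a two-column TSV report."""
    metrics = {}
    # QUOTE_NONE keeps characters such as the apostrophe in "# N's" literal.
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) >= 2:
            metrics[row[0]] = row[1]
    return metrics


if __name__ == "__main__":
    # Assumed layout: one metric per line, name and value separated by a tab.
    sample = (
        "Assembly\tsar_hpc_g_gapClosed\n"
        "N50\t3371708\n"
        "L50\t58\n"
        "# N's per 100 kbp\t2934.13\n"
    )
    report = parse_quast_report(sample)
    print(report["N50"], report["L50"])
```

Values are kept as strings so that the caller decides how to convert counts and percentages.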