Use less space

The pipeline uses a lot of disk space. Intermediate results that are no longer needed should be removed.

Example of space used for the med_coastal dataset, by folder:

460K  ./04_demultiplex_dat
28K   ./05_demultiplex_flags
283G  ./03_remove_unaligned
730G  ./06_assign_marker_sample_to_sequence
148M  ./13_cat_samples_into_runs
161M  ./18_table_assigned_sequences
149M  ./16_remove_annotations
1.7G  ./10_goodlength_samples
132M  ./14_dereplicate_runs
237M  ./15_taxonomic_assignment
4.0K  ./00_flags
1.2M  ./01_settings
155M  ./12_rm_internal_samples
2.2G  ./11_clean_pcrerr_samples
323G  ./08_samples
328K  ./07_split_fastq_by_sample
6.7G  ./09_dereplicate_samples
607G  ./02_illuminapairedend
149M  ./17_sort_abundance_assigned_sequences
323G  ./02b_scaterred
2.3T  .
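To see which folders dominate, the listing above can be ranked by size. The following is a minimal sketch, assuming the listing is `du -h` style output with binary units; `parse_size` and `largest_folders` are hypothetical helper names, not part of the pipeline.

```python
# Multipliers for the unit suffixes seen in the listing (assumed binary units).
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Convert a human-readable size such as '283G' or '4.0K' to bytes."""
    if text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)


def largest_folders(du_lines, top=5):
    """Return the paths of the `top` largest folders, largest first.

    Each input line is expected to look like '<size> <path>'.
    The grand-total line for '.' is skipped.
    """
    entries = []
    for line in du_lines:
        size, path = line.split(maxsplit=1)
        path = path.strip()
        if path == ".":
            continue
        entries.append((parse_size(size), path))
    # Stable sort: folders of equal size keep their input order.
    entries.sort(key=lambda entry: entry[0], reverse=True)
    return [path for _, path in entries[:top]]
```

Applied to the listing above, the five largest folders are 06_assign_marker_sample_to_sequence, 02_illuminapairedend, 08_samples, 02b_scaterred and 03_remove_unaligned, which together hold roughly 2.2T of the 2.3T total.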

Cleanup is needed for:

  • several parts of 02_illuminapairedend
  • ./02b_scaterred: remove the folder during the pipeline run (and the files it contains within the 02_ folder as well)
  • 08_samples and 06_assign_marker_sample_to_sequence: reclaim space from these folders
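One way to do this cleanup safely is to delete an intermediate folder only once the step that consumes it has produced its output. The sketch below assumes that approach; `remove_intermediate` is a hypothetical helper, and which folder depends on which is not stated in this issue, so the pairing in the usage note is an assumption.

```python
import shutil
from pathlib import Path


def remove_intermediate(folder, required_outputs):
    """Delete `folder` only if every required output exists and is non-empty.

    Returns True if the folder was removed, False otherwise.
    """
    folder = Path(folder)
    if not folder.is_dir():
        return False
    for output in map(Path, required_outputs):
        if not output.exists():
            return False
        # An empty directory or empty file suggests the step did not finish.
        if output.is_dir() and not any(output.iterdir()):
            return False
        if output.is_file() and output.stat().st_size == 0:
            return False
    shutil.rmtree(folder)
    return True
```

For example, `remove_intermediate("02b_scaterred", ["03_remove_unaligned"])` would drop the scattered files once the next step has written results, assuming that step is the only consumer of 02b_scaterred.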