if LC_ALL=C grep -F -q -w -m1 "$ctg" "$out/$ref.lowcov" && LC_ALL=C grep -F -q -w -m1 "$ctg" "$out/$ref.all" ; then go="_lowcov" # caution: $out/$ref.lowcov also contains the non-suspect ones !!!
echo -e "\n`basename $0` is a program that can detect potential cross-contaminations in assembled transcriptomes using sequencing reads to find the true origin of transcripts.
--mode p|u :\t\t\t'p' for paired and 'u' for unpaired (default : 'p') [short: -m]
--in STR :\t\t\tName of the directory containing the fasta files to be analyzed (DEFAULT : working directory) [short: -i]
--in STR :\t\t\tName of the directory containing the input files to be analyzed (DEFAULT : working directory) [short: -i]
--tool B|B2|K|R|S|H :\t\t'B' for bowtie, 'B2' for bowtie2, 'K' for kallisto, 'S' for salmon, 'R' for rapmap (DEFAULT : 'B') [short: -t]
--tool B|K|R :\t\t'B' for bowtie, 'K' for kallisto, 'R' for rapmap (DEFAULT : 'R') [short: -t]
--fold-threshold FLOAT :\tValue between 1 and N (DEFAULT : 2) [short: -f]
--minimum-coverage FLOAT :\tValue in TPM (DEFAULT : 0.2) [short: -c]
--minimum-coverage FLOAT :\tTPM value (DEFAULT : 0.2) [short: -c]
--overexp FLOAT :\t\t\tTPM value (DEFAULT : 300) [short: -d]
--threads INT :\t\t\tNumber of threads to use (DEFAULT : 1) [short: -n]
--output-prefix STR :\t\tPrefix of output directory that will be created (DEFAULT : empty) [short: -p]
--output-level 1|2|3 :\t\tSelect the fasta files to output. '1' for none, '2' for clean and lowcov, '3' for all (DEFAULT : 2) [short: -l]
--graph yes|no :\t\tProduce graphical output using R (DEFAULT : no) [short: -g]
--add-option STR :\t\tThis text string will be understood as additional options for the mapper/quantifier used (DEFAULT : empty) [short: -a]
--add-option 'STR' :\t\tThis text string will be understood as additional options for the mapper/quantifier used (DEFAULT : empty) [short: -a]
--recat STR :\t\t\tName of the previous CroCo output folder of which you wish to re-categorize transcripts (DEFAULT : no) [short: -r]
--recat STR :\t\t\tName of a previous CroCo output directory you wish to use to re-categorize transcripts (DEFAULT : no) [short: -r]
--trim5 INT :\t\t\tNumber of bases trimmed from the 5' end (DEFAULT : 0) [short: -x]
--trim3 INT :\t\t\tNumber of bases trimmed from the 3' end (DEFAULT : 0) [short: -y]
--suspect-id INT :\t\tIndicate the minimum percent identity between two transcripts to suspect a cross contamination (DEFAULT : 95) [short: -s]
--suspect-len INT :\t\tIndicate the minimum length of an alignment between two transcripts to suspect a cross contamination (DEFAULT : 40) [short: -w]
--frag-length FLOAT :\t\tEstimated average fragment length (no default value). Only used in specific combinations of --mode and --tool [short: -u]
--frag-sd FLOAT :\t\tEstimated standard deviation of fragment length (no default value). Only used in specific combinations of --mode and --tool [short: -v]
--recat STR :\t\t\tIndicate the name of a previous CroCo output directory to be used to re-categorize transcripts (DEFAULT : no) [short: -r]
It is good practice to redirect information about each CroCo run into an output log file using the following structure :
'| tee log_file'
'2>&1 | tee log_file'
Minimal working example :
CroCo_v0.1.sh --mode p | tee log_file
CroCo_v0.1.sh --mode p 2>&1 | tee log_file
Exhaustive example :
CroCo_v0.1.sh --mode p --in data_folder_name --tool B --fold-threshold 2 --minimum-coverage 0.2 --threads 8 --output-prefix test1_ --output-level 2 --graph yes --add-option '-v 0' --trim5 0 --trim3 0 --suspect-id 95 --suspect-len 40 --recat no | tee log_file
CroCo_v0.1.sh --mode p --in data_folder_name --tool B --fold-threshold 2 --minimum-coverage 0.2 --overexp 300 --threads 8 --output-prefix test1_ --output-level 2 --graph yes --add-option '-v 0' --trim5 0 --trim3 0 --suspect-id 95 --suspect-len 40 --recat no 2>&1 | tee log_file
Exhaustive example using shortcuts :
CroCo_v0.1.sh -m p -i data_folder_name -t B -f 2 -c 0.2 -n 8 -p test1_ -l 2 -g yes -a '-v 0' -x 0 -y 0 -s 95 -w 40 -r no | tee log_file
CroCo_v0.1.sh -m p -i data_folder_name -t B -f 2 -c 0.2 -d 300 -n 8 -p test1_ -l 2 -g yes -a '-v 0' -x 0 -y 0 -s 95 -w 40 -r no 2>&1 | tee log_file
Example for re-categorizing previous CroCo results :