Commit bfc05ce5 authored by eortega

Corrected the input argument of scripts 02 and 03. Documented 3 scripts in the readme.

parent 6003a162
#! /bin/bash
## VARIABLES
path=/home/enrique/work/Gandon/coevolution/phages/
#path=/home/enrique/work/Gandon/coevolution/phages/
path=$1
n_threads=35
## SCRIPT
mkdir -p data/trimmed/{W_seq,R_seq,Other_seq}
trimm_summary=${path}data/summary_trimm
......@@ -45,7 +50,7 @@ do
echo "Working on file " $shortname;
java -jar /usr/local/src/Trimmomatic-0.38/trimmomatic-0.38.jar \
PE \
-threads 35 \
-threads $n_threads \
-phred33 \
-summary /tmp/tmp.trimm_summary \
-quiet \
......@@ -72,7 +77,7 @@ do
echo "Working on file " $shortname;
java -jar /usr/local/src/Trimmomatic-0.38/trimmomatic-0.38.jar \
PE \
-threads 35 \
-threads $n_threads \
-phred33 \
-summary /tmp/tmp.trimm_summary \
-quiet \
......@@ -100,7 +105,7 @@ do
echo "Working on file " $shortname;
java -jar /usr/local/src/Trimmomatic-0.38/trimmomatic-0.38.jar \
PE \
-threads 35 \
-threads $n_threads \
-phred33 \
-summary /tmp/tmp.trimm_summary \
-quiet \
......
......@@ -2,7 +2,8 @@
## DEFINE PATH
path=/home/enrique/work/Gandon/coevolution/phages/
path=$1
# path=/home/enrique/work/Gandon/coevolution/phages/
## CREATE INDEXES AND
......@@ -13,27 +14,29 @@ path=/home/enrique/work/Gandon/coevolution/phages/
# bowtie2 --phred33 -5 12 -p 35 -t -x ${path}data/refs/indexes_Sv/Sv -1 ${path}data/trimmed/W_seq/W4T3_S54_R1.fq.gz -2 ${path}data/trimmed/W_seq/W4T3_S54_R2.fq.gz -S ${path}results/test.sam
path_fasta=/home/enrique/work/Gandon/coevolution/phages/data/trimmed/
path_fasta=${path}data/trimmed/
path_results=/home/enrique/work/Gandon/coevolution/phages/results/
path_results=${path}results/
bacteria_index=/home/enrique/work/Gandon/coevolution/phages/data/refs/indexes_St/St
virus_index=/home/enrique/work/Gandon/coevolution/phages/data/refs/indexes_Sv/Sv
bacteria_index=${path}data/refs/indexes_St/St
virus_index=${path}data/refs/indexes_Sv/Sv
for i in $(find $path_fasta -name *_R1.fq.gz)
do
## Declare local variables
# echo $i
root_name=$(basename -s _R1.fq.gz $i)
var=$(dirname $i)
outdir=${var/data\/trimmed/results/mapping}/
## Give some feedback to the user
echo -e "\n"phage $root_name -\> ${outdir}${root_name}.sam
echo $i ${i/_R1/_R2}
echo $virus_index
## Mapping and indexing bam file
echo "#### MAPPING"
bowtie2 --phred33 -5 12 -p 24 -t -x $virus_index -1 $i -2 ${i/_R1/_R2} -S ${outdir}${root_name}.sam
......
......@@ -30,14 +30,14 @@ Folders:
* lib
* __pycache__
----
## Coding practices
I tried to use as much as possible the Python Enhancement Proposal 8 (PEP-8). https://www.python.org/dev/peps/pep-0008/
In Python I tried to follow the Python Enhancement Proposal 8 (PEP-8) as much as possible. https://www.python.org/dev/peps/pep-0008/
A difference I use regularl is using double `##` at the begining of a line containing comments.
During the developement stages I comment some code lines that would be uncommented as a block. Having two '#' signs un real comments allows not to mistake them for command lines.
One difference from PEP-8 that I use regularly is a double `##` at the beginning of a line containing *informative comments*.
During the development stages I comment out some code lines that are later uncommented as a block. Marking real comments with two `#` signs prevents them from being uncommented and executed as code.
Example:
......@@ -48,22 +48,30 @@ for i in input_list:
print(i / sum(list))
```
Concerning bash coding I use often double spaces to separate commands, parameters, and arguments. When using some long names it makes things more readable
For **bash** code I often use double spaces to separate commands, parameters, and arguments. With long names this makes things more readable.
Big chunks of code are commented with capitals and short phrases,
whereas longer phrases in comments are in lower case.
Example:
```bash
## BIG CODE CHUNK
for i in $(find $path_fasta -name *_R1.fq.gz)
do
## Declare local variables
# echo $i
root_name=$(basename -s _R1.fq.gz $i)
var=$(dirname $i)
outdir=${var/data\/trimmed/results/mapping}/
## Give some feedback to the user
echo -e "\n"phage $root_name -\> ${outdir}${root_name}.sam
echo $i ${i/_R1/_R2}
echo $virus_index
## Mapping and indexing bam file
echo "#### MAPPING"
bowtie2 --phred33 -5 12 -p 24 -t -x $virus_index -1 $i -2 ${i/_R1/_R2} -S ${outdir}${root_name}.sam
......@@ -87,3 +95,43 @@ samtools sort \
samtools index -b ${outdir}${root_name}.sort.bam
```
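The parameter expansions used in the loop above can be tried in isolation. This is only a minimal sketch: the `/tmp/project/` root is a hypothetical path chosen for illustration, and the sample name is borrowed from the commented bowtie2 test line in `03_mapping.sh`. It shows how the mate file, the sample name and the output folder are derived from one `_R1` file name.

```bash
#! /bin/bash
## HYPOTHETICAL INPUT
## the /tmp/project/ root is an assumption, not a path used by the scripts
r1=/tmp/project/data/trimmed/W_seq/W4T3_S54_R1.fq.gz

## DERIVE NAMES
## replace the first "_R1" with "_R2" to get the mate file
r2=${r1/_R1/_R2}
## strip the folder and the "_R1.fq.gz" suffix to get the sample name
root_name=$(basename -s _R1.fq.gz "$r1")
## swap "data/trimmed" for "results/mapping" in the folder name
var=$(dirname "$r1")
outdir=${var/data\/trimmed/results/mapping}/

echo "$r2"
echo "$root_name"
echo "$outdir"
```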
----
## File descriptions
### 00_create_py_env.sh
Creates a Python virtual environment using `virtualenv` and the system's default python3 version, and stores the environment in `~/envs/coev`. Packages are installed through pip.
### 01_quality_check.sh*
Uses FastQC to create quality control reports and then uses multiqc to assemble the reports into a single file. To make things easier, the input files are separated into 3 groups: R, W and Other. These groups come from different treatments.
This script takes one argument: The path to the working directory, which is the project directory: `/home/user/work/coevolution/phages`
### 02_trimm_and_clean.sh*
Launches Trimmomatic to clean data.
The parameters are embedded in the code, for now.
This script takes one argument: the path to the working directory, which is the project directory, e.g. `/home/user/work/coevolution/phages/`. The trailing slash is required because the script builds its paths as `${path}data/...`.
### 03_mapping.sh*
The index creation is commented out at the top of the script. It is only required once.
The paths to the input files are full paths built from the `path` variable.
The mapper is bowtie2. After mapping, the SAM file is sorted, converted to a BAM file and indexed, so it is ready for the next stage.
This script takes one argument: the path to the working directory, which is the project directory, e.g. `/home/user/work/coevolution/phages/`. The trailing slash is required because the script builds its paths as `${path}data/...`.
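Scripts 02 and 03 build their paths by gluing folder names directly onto the argument (`${path}data/trimmed/`), so a missing trailing slash gives a wrong path. A possible guard is sketched below. `normalize_path` is a hypothetical helper that does not exist in the scripts; it only illustrates one way to make the argument tolerant of both forms.

```bash
#! /bin/bash
## HYPOTHETICAL HELPER, not part of the scripts
normalize_path () {
    ## Strip one trailing slash if present, then append exactly one
    echo "${1%/}/"
}

## BUILD PATHS
## the argument is given here without the trailing slash on purpose
path=$(normalize_path /home/user/work/coevolution/phages)
path_fasta=${path}data/trimmed/
echo "$path_fasta"
```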
* 04_snpcalling.sh*
* 05b_convert_protospacer_dico2fasta.py*
* 06b_blast_protospaces.sh*
* 07_2_run_vcf_parser_all_files.py
* 07_2_test.py
* 07_run_vcf_parser_all_files.py*
\ No newline at end of file