Running bwa (Li and Durbin, Bioinformatics, 2009 Jul 15;25(14):1754-60)

This function executes the docker container bwa1, where BWA is installed. BWA is a read alignment package that efficiently aligns short sequencing reads against a large reference sequence. This aligner provides optimal results with DNA-seq data.

bwa(
  group = c("sudo", "docker"),
  fastq.folder = getwd(),
  scratch.folder = "/data/scratch",
  genome.folder,
  seq.type = c("se", "pe"),
  threads = 1,
  sample.id,
  circRNA = FALSE
)

Arguments

group: a character string. Two options: "sudo" or "docker", depending on which group the user belongs to
fastq.folder: a character string indicating where the gzipped fastq files are located
scratch.folder: a character string indicating the scratch folder where the docker container will be mounted
genome.folder: a character string indicating the folder where the indexed reference genome for bwa is located
seq.type: a character string indicating the type of reads to be aligned. Two options: "se" or "pe", for single-end and paired-end sequencing respectively
threads: a number indicating the number of cores to be used by the application
sample.id: a character string indicating the unique id to be associated with the bam file that will be created
circRNA: a boolean indicating whether the analysis concerns a circRNA prediction or not
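
The argument constraints above can be checked before the container is started. The following is only a minimal sketch: check_bwa_args is a hypothetical helper that is not part of this package, and the assumption that input files end in .fastq.gz is taken from the example file name on this page, not from the function's source.

```r
# Hypothetical helper (not part of the package): validates inputs before calling bwa().
check_bwa_args <- function(group = c("sudo", "docker"),
                           fastq.folder = getwd(),
                           genome.folder,
                           seq.type = c("se", "pe"),
                           threads = 1) {
  # match.arg() stops with an error if the value is not one of the documented options
  group <- match.arg(group)
  seq.type <- match.arg(seq.type)
  if (missing(genome.folder)) stop("genome.folder must be supplied")
  # Both folders must exist on the host
  stopifnot(dir.exists(fastq.folder), dir.exists(genome.folder))
  # threads must be a single number of at least 1
  stopifnot(is.numeric(threads), length(threads) == 1, threads >= 1)
  # Assumption: gzipped fastq files use the .fastq.gz suffix
  fastq <- list.files(fastq.folder, pattern = "\\.fastq\\.gz$")
  if (length(fastq) == 0) stop("no .fastq.gz files found in ", fastq.folder)
  invisible(list(group = group, seq.type = seq.type, fastq = fastq))
}
```

Calling check_bwa_args() with the same values intended for bwa() stops early with an R error if an option is misspelled or a folder is missing, instead of failing inside the container.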

Value

Three files: dedup_reads.bam, a sorted bam file with duplicates marked; dedup_reads.bai, the index of dedup_reads.bam; and dedup_reads.stats, which provides mapping statistics.
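
After a run, the presence of the three files named above can be verified from R. This is a sketch under one stated assumption: that the outputs are written to the folder passed as fastq.folder, which this page does not state explicitly. check_bwa_outputs is a hypothetical helper, not a function of this package.

```r
# Hypothetical helper (not part of the package): reports which expected outputs exist.
check_bwa_outputs <- function(folder = getwd()) {
  # File names as listed in the Value section
  expected <- c("dedup_reads.bam", "dedup_reads.bai", "dedup_reads.stats")
  # Assumption: outputs are written to `folder` (e.g. the fastq.folder used for the run)
  found <- file.exists(file.path(folder, expected))
  names(found) <- expected
  found
}
```

The result is a named logical vector, so all(check_bwa_outputs(folder)) is TRUE only when the bam file, its index, and the statistics file are all present.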

Author

Raffaele Calogero

Examples

if (FALSE) {
    # downloading fastq files
    system("wget http://130.192.119.59/public/test_R1.fastq.gz")
    # running bwa
    bwa(group = "docker", fastq.folder = getwd(), scratch.folder = "/data/scratch",
        genome.folder = "/data/scratch/hg19bwa", seq.type = "se",
        threads = 24, sample.id = "exome")
}