This function executes the two-step STAR procedure, as suggested by the GATK best practices for calling variants on RNA-seq data. Only paired-end (PE) data are accepted.

star2steps(
  group = c("sudo", "docker"),
  fastq.folder = getwd(),
  scratch.folder = "/data/scratch",
  genome.folder,
  groupid,
  threads = 1,
  opossum.preprocessing = FALSE
)

Arguments

group,

a character string. Two options: "sudo" or "docker", depending on which group the user belongs to

fastq.folder,

a character string indicating the folder where the gzipped fastq files are located

scratch.folder,

a character string indicating the scratch folder where the docker container will be mounted

genome.folder,

a character string indicating the folder where the indexed reference genome for STAR is located.

groupid,

a character string to be inserted in the BAM file as the identifier for the sample

threads,

a number indicating the number of cores to be used by the application

opossum.preprocessing,

a boolean, TRUE or FALSE, indicating whether to use Opossum for RNA-seq data preprocessing (https://wellcomeopenresearch.org/articles/2-6/v1)
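Because only PE data are accepted, it can be useful to confirm that fastq.folder holds matching read pairs before starting a run. The following is a minimal sketch, not part of the package: check_pe_fastq is a hypothetical helper, and the _R1.fastq.gz / _R2.fastq.gz naming convention is an assumption taken from the test files used in the Examples section.

```r
# Hypothetical helper (not part of the package): returns TRUE when every
# *_R1.fastq.gz file in the folder has a matching *_R2.fastq.gz mate.
# The _R1/_R2 suffix convention is an assumption based on the test files.
check_pe_fastq <- function(fastq.folder) {
  files <- list.files(fastq.folder, pattern = "\\.fastq\\.gz$")
  r1 <- files[grepl("_R1\\.fastq\\.gz$", files)]
  r2 <- files[grepl("_R2\\.fastq\\.gz$", files)]
  # Build the mate name expected for each R1 file
  expected_r2 <- sub("_R1\\.fastq\\.gz$", "_R2.fastq.gz", r1)
  length(r1) > 0 && length(r1) == length(r2) && all(expected_r2 %in% r2)
}
```

A call such as check_pe_fastq(getwd()) could be placed before star2steps() so that a missing mate file is detected before the container is launched.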

Value

three files: dedup_reads.bam, a sorted BAM file with duplicates marked; dedup_reads.bai, the index of dedup_reads.bam; and dedup_reads.stats, which provides mapping statistics
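After a run, the presence of the three files listed above can be checked with a few lines of R. This is a minimal sketch under stated assumptions: check_star2steps_output is a hypothetical helper, not a package function, and the folder in which the files are written is not stated here, so it is passed in explicitly as out.folder.

```r
# Hypothetical helper (not part of the package): reports which of the three
# expected output files are present in out.folder.
# Assumption: the caller knows where star2steps wrote its results.
check_star2steps_output <- function(out.folder) {
  expected <- c("dedup_reads.bam", "dedup_reads.bai", "dedup_reads.stats")
  present <- file.exists(file.path(out.folder, expected))
  names(present) <- expected
  present
}
```

The result is a named logical vector, so all(check_star2steps_output(folder)) is TRUE only when all three files exist.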

Author

Raffaele Calogero, raffaele.calogero [at] unito [dot] it, Bioinformatics and Genomics unit, University of Torino, Italy

Examples

if (FALSE) {
    # download the test fastq files
    system("wget http://130.192.119.59/public/test_R1.fastq.gz")
    system("wget http://130.192.119.59/public/test_R2.fastq.gz")
    # run star2steps on the PE test data, no strand
    star2steps(group="docker", fastq.folder=getwd(), scratch.folder="/data/scratch",
        genome.folder="/data/scratch/hg38star", groupid="test", threads=8,
        opossum.preprocessing=FALSE)
}