This function runs a set of docker containers to detect transcription factor (TF) and histone mark peaks.

chipseqCounts(
  group = c("sudo", "docker"),
  output.folder = getwd(),
  mock.folder,
  test.folder,
  scratch.folder,
  adapter5 = "AGATCGGAAGAGCACACGTCTGAACTCCAGTCA",
  adapter3 = "AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT",
  threads = 8,
  seq.type = "se",
  min.length = 30,
  genome.folder,
  mock.id = "igg",
  test.id = "tf",
  genome,
  read.size = 50,
  tool = "macs",
  macs.min.mfold = 10,
  macs.max.mfold = 30,
  macs.pval = "1e-5",
  sicer.wsize = 200,
  sicer.gsize = 200,
  sicer.fdr = 0.1,
  tss.distance = 0,
  max.upstream.distance = 10000,
  remove.duplicates = "N"
)

Arguments

group,

a character string. Two options: "sudo" or "docker", depending on which group the user belongs to

output.folder,

a character string indicating where final results will be saved

mock.folder,

a character string indicating the folder where the gzipped fastq file for the non-specific (mock) ChIP is located

test.folder,

a character string indicating the folder where the gzipped fastq file for the specific (test) ChIP is located

scratch.folder,

a character string indicating the scratch folder that will be mounted in the docker container

adapter5,

a character string indicating the forward adapter sequence

adapter3,

a character string indicating the reverse adapter sequence

threads,

a number indicating the number of cores to be used by the application

seq.type,

a character string indicating the type of reads to be trimmed. Only one option is available: "se" for single-end sequencing

min.length,

a number indicating the minimum length a trimmed read must have to be returned

genome.folder,

a character string indicating the folder where the indexed reference genome is located

mock.id,

a character string indicating the unique id to be associated with the mock bam that will be created

test.id,

a character string indicating the unique id to be associated with the test bam that will be created

genome,

a character string indicating the genome used as reference for data generation. Available options: hg19, hg38, mm9, mm10

read.size,

an integer indicating the length of the sequenced reads

tool,

a character string indicating the peak-calling algorithm. Available options: macs and sicer. Macs, v 1.14, is used to call TF peaks, whereas sicer, v 1.1, is used to call histone mark peaks

macs.min.mfold,

an integer indicating the minimum enrichment ratio against background

macs.max.mfold,

an integer indicating the maximum enrichment ratio against background

macs.pval,

a character string indicating the p-value cutoff used to filter out peaks with low statistical significance. The number must be provided in scientific notation, as the default value shows

sicer.wsize,

an integer indicating the window size to be used by sicer

sicer.gsize,

an integer indicating the gap size to be used by sicer. Suggested values: H3K4Me3=200; H3K27Me3=600

sicer.fdr,

a number indicating the false discovery rate (FDR) cutoff used by sicer to filter out peaks with low statistical significance

tss.distance,

an integer indicating the distance of TSS with respect to gene start

max.upstream.distance,

an integer indicating the maximum distance to associate a gene ID to a peak

remove.duplicates,

a character string indicating whether duplicated reads have to be removed. Available options: Y to remove duplicates, N to keep duplicates
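
Several of the arguments above accept only a fixed set of values, and macs.pval must be a string in scientific notation. As a minimal sketch, the checks below show how these constraints could be verified before a run. The helper check_chipseq_args and the lookup vector sicer_gap_sizes are hypothetical names introduced here for illustration; they are not part of docker4seq, and the function itself may validate its inputs differently.

```r
# Hypothetical helper, not part of docker4seq: sanity-check a few
# arguments against the options listed in the argument descriptions.
check_chipseq_args <- function(tool = "macs", macs.pval = "1e-5",
                               genome = "mm10", remove.duplicates = "N") {
  # tool must be one of the two documented peak callers
  tool <- match.arg(tool, c("macs", "sicer"))
  # genome must be one of the documented reference builds
  genome <- match.arg(genome, c("hg19", "hg38", "mm9", "mm10"))
  # remove.duplicates is documented as "Y" or "N"
  remove.duplicates <- match.arg(remove.duplicates, c("Y", "N"))
  # macs.pval is a character string in scientific notation, e.g. "1e-5"
  if (!is.character(macs.pval) ||
      !grepl("^[0-9]+(\\.[0-9]+)?[eE]-?[0-9]+$", macs.pval)) {
    stop("macs.pval must be a string in scientific notation, e.g. \"1e-5\"")
  }
  invisible(list(tool = tool, macs.pval = macs.pval, genome = genome,
                 remove.duplicates = remove.duplicates))
}

# Gap sizes suggested in the sicer.gsize description, kept as a named
# vector (hypothetical name) so they can be looked up by histone mark.
sicer_gap_sizes <- c(H3K4Me3 = 200, H3K27Me3 = 600)
```

For instance, check_chipseq_args(macs.pval = "0.00001") stops with an error because the value is not in scientific notation, while sicer_gap_sizes[["H3K27Me3"]] gives the gap size suggested for that mark.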

Value

Returns the output of skewer, bwa, chipseq

Author

Raffaele Calogero

Examples

if (FALSE) {
# download and unpack the test dataset
system("wget 130.192.119.59/public/test.chipseqCounts.zip")
unzip("test.chipseqCounts.zip")
setwd("test.chipseqCounts")
library(docker4seq)
# call TF peaks with macs, using the igg folder as mock and the prdm51 folder as test
chipseqCounts(group = "docker", output.folder = "/data/tests/chipseqCounts/test.chipseqCounts/prdm51.igg",
             mock.folder="/data/tests/chipseqCounts/test.chipseqCounts/igg",
             test.folder="/data/tests/chipseqCounts/test.chipseqCounts/prdm51", scratch.folder="/data/scratch/",
             adapter5 = "AGATCGGAAGAGCACACGTCTGAACTCCAGTCA",
             adapter3 = "AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT",
             threads = 8, min.length = 30, genome.folder="/data/genomes/mm10bwa",
             mock.id = "igg", test.id = "tf", genome="mm10", read.size = 50,
             tool = "macs", macs.min.mfold = 10, macs.max.mfold = 30,
             macs.pval = "1e-5", sicer.wsize = 200, sicer.gsize = 200,
             sicer.fdr = 0.1, tss.distance = 0, max.upstream.distance = 10000,
             remove.duplicates = "N")
}
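
The example above calls TF peaks with macs. For histone marks, the tool argument would be set to "sicer" instead. The sketch below illustrates what such a call might look like for an H3K27Me3 sample, using the gap size suggested in the sicer.gsize description. All folder paths are hypothetical placeholders, the remaining arguments are left at their defaults, and the call itself is guarded so that nothing runs unless it is enabled by hand.

```r
# Sketch only: every path below is a hypothetical placeholder.
sicer_args <- list(
  group = "docker",
  output.folder = "/path/to/output",
  mock.folder = "/path/to/mock",          # control ChIP fastq folder
  test.folder = "/path/to/h3k27me3",      # histone mark ChIP fastq folder
  scratch.folder = "/path/to/scratch",
  genome.folder = "/path/to/mm10bwa",     # bwa-indexed reference genome
  genome = "mm10",
  tool = "sicer",                         # histone mark peak caller
  sicer.wsize = 200,
  sicer.gsize = 600,                      # suggested value for H3K27Me3
  sicer.fdr = 0.1
)

if (FALSE) {
  library(docker4seq)
  # pass the argument list to chipseqCounts; unspecified arguments use defaults
  do.call(chipseqCounts, sicer_args)
}
```

Collecting the arguments in a list keeps the settings for one mark in one place, so a second mark can be run by changing test.folder and sicer.gsize only.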