This function executes the docker container annotate.1, in which refGenome is used to annotate a single-cell counts table, with Ensembl gene IDs in the first column, using an ENSEMBL GTF annotation.

  group = c("docker", "sudo"),
  biotype = NULL,
  mt = c(TRUE, FALSE),
  ribo.proteins = c(TRUE, FALSE),
  umiXgene = 3,
  riboStart.percentage = 20,
  riboEnd.percentage = 70,
  mitoStart.percentage = 1,
  mitoEnd.percentage = 100,
  thresholdGenes = 250



a character string. Two options: "sudo" or "docker", depending on which group the user belongs to


a character string indicating the full path of the input matrix, e.g. "/bin/users/matrix.csv"; the folder containing the matrix is where input data are located and where output will be written. The system automatically recognizes csv files as comma-separated and txt files as tab-separated
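The extension rule described for the input matrix can be illustrated with a minimal sketch. The helper `guess_separator` is hypothetical and is not part of rCASC; it only mirrors the stated convention (csv as comma-separated, txt as tab-separated) and assumes any other extension should be rejected.

```r
# Hypothetical helper, not an rCASC function: infer the field separator
# from the file extension, following the convention described above.
guess_separator <- function(path) {
  ext <- tolower(tools::file_ext(path))
  if (ext == "csv") {
    ","          # csv: comma-separated
  } else if (ext == "txt") {
    "\t"         # txt: tab-separated
  } else {
    # Rejecting other extensions is an assumption of this sketch.
    stop("unsupported extension: ", ext)
  }
}

sep_csv <- guess_separator("/bin/users/matrix.csv")
sep_txt <- guess_separator("/bin/users/matrix.txt")
```

The returned value could then be passed as the `sep` argument of `read.table` when loading the matrix by hand.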

a character string indicating the ENSEMBL gtf file


a character string indicating the biotypes of interest


a boolean defining whether mitochondrial genes have to be removed; FALSE means that mt genes are removed


a boolean defining whether ribosomal proteins have to be removed; FALSE means that ribosomal proteins (gene names starting with rpl or rps) are removed


an integer defining how many UMIs are required to call a gene as present. Default: 3


start of the range for ribosomal percentage; cells within the range are kept


end of the range for ribosomal percentage; cells within the range are kept


start range for mitochondrial percentage, cells within the range are retained


end range for mitochondrial percentage, cells within the range are retained


parameter to filter cells according to the number of significant genes expressed
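How the percentage ranges, umiXgene and thresholdGenes might interact can be sketched on a toy matrix. This is not the code run inside the annotate.1 container: the data are made up, the "mt-" prefix for mitochondrial genes and the inclusive comparisons (>= and <=) are assumptions, and the gene removal controlled by mt and ribo.proteins is not modelled.

```r
# Toy counts matrix (made-up data): genes in rows, cells in columns.
counts <- matrix(
  c(20, 50, 25,   # Rpl3   (ribosomal)
    10, 30, 15,   # Rps6   (ribosomal)
     5, 10,  0,   # mt-Nd1 (mitochondrial)
    65, 10, 60),  # Actb
  nrow = 4, byrow = TRUE,
  dimnames = list(c("Rpl3", "Rps6", "mt-Nd1", "Actb"),
                  paste0("cell", 1:3))
)

# Documented defaults for the percentage ranges and for umiXgene.
riboStart.percentage <- 20
riboEnd.percentage   <- 70
mitoStart.percentage <- 1
mitoEnd.percentage   <- 100
umiXgene             <- 3
# Toy value for the toy matrix; the documented default is 250.
thresholdGenes       <- 2

# Ribosomal genes: names starting with rpl or rps (case-insensitive).
is_ribo <- grepl("^rp[ls]", rownames(counts), ignore.case = TRUE)
# Mitochondrial genes: the "mt-" prefix is an assumption of this sketch.
is_mito <- grepl("^mt-", rownames(counts), ignore.case = TRUE)

# Percentage of each cell's UMIs coming from ribosomal / mitochondrial genes.
total    <- colSums(counts)
ribo_pct <- 100 * colSums(counts[is_ribo, , drop = FALSE]) / total
mito_pct <- 100 * colSums(counts[is_mito, , drop = FALSE]) / total

# Genes called present per cell: at least umiXgene UMIs (>= is assumed).
genes_per_cell <- colSums(counts >= umiXgene)

# A cell is kept only if it falls inside both ranges and expresses
# at least thresholdGenes genes.
keep <- ribo_pct >= riboStart.percentage & ribo_pct <= riboEnd.percentage &
  mito_pct >= mitoStart.percentage & mito_pct <= mitoEnd.percentage &
  genes_per_cell >= thresholdGenes

filtered <- counts[, keep, drop = FALSE]
print(data.frame(ribo_pct, mito_pct, genes_per_cell, keep))
```

In this toy case only cell1 is retained: cell2 has 80% ribosomal UMIs, above riboEnd.percentage, and cell3 has 0% mitochondrial UMIs, below mitoStart.percentage.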


An annotated_counts table, in which Ensembl IDs are linked to gene symbols, and a PDF showing the effect of ribosomal and mitochondrial gene removal. A filtered_annotated counts table containing only the cells and genes that pass the filtering thresholds. A PDF showing the effect of the filtering on gene counts, and a filteredStatistics.txt file indicating how many cells and genes were filtered out


Raffaele Calogero, Luca Alessandri


if (FALSE) {
  system("wget")
  library(rCASC)
  system("unzip")
  #filtering low quality cells
  lorenzFilter(group="docker", scratch.folder="/data/scratch/",
               file=paste(getwd(), "testSCumi_mm10.csv", sep="/"),
               p_value=0.05, separator=',')
  #running annotation and removal of mit and ribo proteins genes
  #download mouse GTF for mm10
  system("wget")
  system("gunzip Mus_musculus.GRCm38.92.gtf.gz")
  scannobyGtf(group="docker",
              file=paste(getwd(), "testSCumi_mm10.csv", sep="/"),
              "Mus_musculus.GRCm38.94.gtf",
              biotype="protein_coding", mt=TRUE, ribo.proteins=TRUE,
              umiXgene=3, riboStart.percentage=0, riboEnd.percentage=100,
              mitoStart.percentage=0, mitoEnd.percentage=100,
              thresholdGenes=100)
}