scannobyGtf.Rd
This function executes the docker container annotate.1, where refGenome is used to annotate a single-cell counts table, with ENSEMBL gene ids in the first column, using an ENSEMBL GTF annotation.
a character string. Two options: "sudo" or "docker", depending on which group the user belongs to
a character string indicating the folder where input data are located and where output will be written, followed by the matrix name, e.g. "/bin/users/matrix.csv". The system automatically recognizes csv files as comma separated and txt files as tab separated
a character string indicating the ENSEMBL gtf file
a character string indicating the biotypes of interest
a boolean defining whether mitochondrial genes have to be removed; FALSE means that mt genes are removed
a boolean defining whether ribosomal proteins have to be removed; FALSE means that ribosomal proteins (gene names starting with rpl or rps) are removed
an integer defining how many UMIs are required to call a gene as present. default: 3
start range for ribosomal percentage, cells within the range are kept
end range for ribosomal percentage, cells within the range are kept
start range for mitochondrial percentage, cells within the range are retained
end range for mitochondrial percentage, cells within the range are retained
parameter to filter cells according to the number of significant genes expressed
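The interplay of the UMI threshold, the ribosomal and mitochondrial percentage ranges, and the gene-count threshold described above can be illustrated with a minimal R sketch. This is not the code run inside the annotate.1 container. The helper filter_cells, the toy matrix, the "^mt-" pattern for mitochondrial gene names, and the reading of thresholdGenes as a minimum number of detected genes per cell are all assumptions made for illustration only.

```r
# Toy counts matrix (hypothetical data): genes in rows, cells in columns.
counts <- matrix(
  c(5, 0, 4,   # Rpl3   (ribosomal protein)
    0, 0, 3,   # Rps6   (ribosomal protein)
    1, 9, 3,   # mt-Co1 (mitochondrial, assumed "mt-" prefix)
    4, 1, 6,   # Actb
    3, 0, 4),  # Gapdh
  nrow = 5, byrow = TRUE,
  dimnames = list(c("Rpl3", "Rps6", "mt-Co1", "Actb", "Gapdh"),
                  c("cell1", "cell2", "cell3"))
)

# Hypothetical helper, not part of rCASC: keeps the cells whose ribosomal and
# mitochondrial percentages fall within the given ranges and that express
# enough genes. Cells with zero total counts are not handled here.
filter_cells <- function(counts, umiXgene = 3,
                         riboStart = 0, riboEnd = 100,
                         mitoStart = 0, mitoEnd = 100,
                         thresholdGenes = 100) {
  # Ribosomal proteins: gene names starting with rpl or rps (case-insensitive).
  is_ribo <- grepl("^rp[ls]", rownames(counts), ignore.case = TRUE)
  # Mitochondrial genes: assumed here to start with "mt-".
  is_mito <- grepl("^mt-", rownames(counts), ignore.case = TRUE)

  total    <- colSums(counts)
  ribo_pct <- 100 * colSums(counts[is_ribo, , drop = FALSE]) / total
  mito_pct <- 100 * colSums(counts[is_mito, , drop = FALSE]) / total

  # A gene is called present in a cell when it has at least umiXgene UMIs.
  n_genes <- colSums(counts >= umiXgene)

  keep <- ribo_pct >= riboStart & ribo_pct <= riboEnd &
    mito_pct >= mitoStart & mito_pct <= mitoEnd &
    n_genes >= thresholdGenes  # assumed meaning: minimum detected genes

  counts[, keep, drop = FALSE]
}

filtered <- filter_cells(counts, umiXgene = 3, mitoEnd = 50, thresholdGenes = 2)
print(colnames(filtered))
```

With this toy matrix, cell2 is dropped: 90% of its counts are mitochondrial and only one gene reaches 3 UMIs, whereas cell1 and cell3 fall inside both percentage ranges and express at least two genes.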
An annotated_counts table, where ENSEMBL ids are linked to gene symbols, and a PDF showing the effect of removing ribosomal and mitochondrial genes. A filtered_annotated counts table containing only the cells and genes that pass the filtering thresholds. A PDF showing the effect of the filtering on gene counts, and a filteredStatistics.txt file indicating how many cells and genes were filtered out
if (FALSE) {
system("wget http://130.192.119.59/public/testSCumi_mm10.csv.zip")
library(rCASC)
system("unzip testSCumi_mm10.csv.zip")
#filtering low quality cells
lorenzFilter(group="docker", scratch.folder="/data/scratch/",
             file=paste(getwd(), "testSCumi_mm10.csv", sep="/"),
             p_value=0.05, separator=',')
#running annotation and removal of mitochondrial and ribosomal protein genes
#download mouse GTF for mm10
system("wget ftp://ftp.ensembl.org/pub/release-92/gtf/mus_musculus/Mus_musculus.GRCm38.92.gtf.gz")
system("gunzip Mus_musculus.GRCm38.92.gtf.gz")
#gtf.name must match the GTF file downloaded and unzipped above (release 92)
scannobyGtf(group="docker", file=paste(getwd(), "testSCumi_mm10.csv", sep="/"),
            gtf.name="Mus_musculus.GRCm38.92.gtf", biotype="protein_coding",
            mt=TRUE, ribo.proteins=TRUE, umiXgene=3, riboStart.percentage=0,
            riboEnd.percentage=100, mitoStart.percentage=0, mitoEnd.percentage=100,
            thresholdGenes=100)
}