scannobyGtf.Rd
This function executes the docker container annotate.1, where refGenome is used to annotate a single-cell counts table, with ENSEMBL gene ids in the first column, using an ENSEMBL GTF annotation.
scannobyGtf( group = c("docker", "sudo"), file, gtf.name, biotype = NULL, mt = c(TRUE, FALSE), ribo.proteins = c(TRUE, FALSE), umiXgene = 3, riboStart.percentage = 20, riboEnd.percentage = 70, mitoStart.percentage = 1, mitoEnd.percentage = 100, thresholdGenes = 250 )
group, | a character string. Two options: "docker" or "sudo" |
---|---|
file, | a character string giving the full path to the counts matrix, e.g. "/bin/users/matrix.csv". The folder containing the input file is also where the output will be written. The system automatically recognizes csv as comma-separated files and txt as tab-separated files |
gtf.name, | a character string indicating the ENSEMBL gtf file |
biotype, | a character string indicating the biotypes of interest |
mt, | a boolean defining whether mitochondrial genes have to be removed; FALSE means that mt genes are removed |
ribo.proteins, | a boolean defining whether ribosomal proteins have to be removed; FALSE means that ribosomal proteins (gene names starting with rpl or rps) are removed |
umiXgene, | an integer defining how many UMIs are required to call a gene as present. Default: 3 |
riboStart.percentage, | start of the range for ribosomal percentage; cells within the range are kept |
riboEnd.percentage, | end of the range for ribosomal percentage; cells within the range are kept |
mitoStart.percentage, | start of the range for mitochondrial percentage; cells within the range are retained |
mitoEnd.percentage, | end of the range for mitochondrial percentage; cells within the range are retained |
thresholdGenes, | parameter to filter cells according to the number of significant genes expressed |
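
The filtering arguments above interact, so a toy illustration may help. The following is a minimal sketch of one plausible reading of these thresholds, not the code run inside the annotate.1 container. The helper name `filter_cells_sketch`, the `mt-` prefix used to spot mitochondrial genes, the use of inclusive range bounds, and the toy matrix are all assumptions made for illustration.

```r
# Toy UMI count matrix: genes in rows, cells in columns (made-up values).
counts <- matrix(
  c(5, 0, 4,
    3, 1, 6,
    0, 0, 9,
    4, 2, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("Actb", "Rpl3", "mt-Co1", "Gapdh"),
                  c("cell1", "cell2", "cell3"))
)

# Hypothetical helper, NOT part of rCASC: keeps cells that pass all thresholds.
filter_cells_sketch <- function(counts, umiXgene = 3, thresholdGenes = 250,
                                riboStart.percentage = 20, riboEnd.percentage = 70,
                                mitoStart.percentage = 1, mitoEnd.percentage = 100) {
  genes <- rownames(counts)
  # Ribosomal proteins: gene names starting with rpl or rps (as in the docs).
  is_ribo <- grepl("^rp[ls]", genes, ignore.case = TRUE)
  # Assumption: mitochondrial genes carry an "mt-" prefix.
  is_mito <- grepl("^mt-", genes, ignore.case = TRUE)

  # Percentage of each cell's UMIs assigned to ribosomal / mitochondrial genes.
  total <- colSums(counts)
  ribo_pct <- 100 * colSums(counts[is_ribo, , drop = FALSE]) / total
  mito_pct <- 100 * colSums(counts[is_mito, , drop = FALSE]) / total

  # A gene is called present in a cell when it has at least umiXgene UMIs.
  detected <- colSums(counts >= umiXgene)

  keep <- detected >= thresholdGenes &
    ribo_pct >= riboStart.percentage & ribo_pct <= riboEnd.percentage &
    mito_pct >= mitoStart.percentage & mito_pct <= mitoEnd.percentage
  keep[is.na(keep)] <- FALSE  # cells with zero total UMIs are dropped
  counts[, keep, drop = FALSE]
}

kept <- filter_cells_sketch(counts, thresholdGenes = 3, mitoStart.percentage = 0)
```

With these toy values, `cell2` has no gene reaching 3 UMIs and is dropped, while `cell1` and `cell3` are kept. Leaving `mitoStart.percentage` at its default of 1 would also drop `cell1`, which has no mitochondrial UMIs.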
Output files: an annotated_counts table, where ensembl ids are linked to gene symbols, and a PDF showing the effect of ribo and mito gene removal; a filtered_annotated counts table containing only the cells and genes that pass the filtering thresholds; a PDF showing the effect of the filtering on gene counts; and a filteredStatistics.txt file indicating how many cells and genes were filtered out
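
To illustrate how ensembl ids can be linked to gene symbols from a GTF file, here is a minimal base R sketch. It is an assumption-laden illustration and not the refGenome-based code used by the container: the GTF lines, the gene ids and names, the helper `get_attr`, and the `id:symbol` row label format are all hypothetical.

```r
# Two made-up GTF "gene" records in ENSEMBL style (9 tab-separated columns).
gtf_lines <- c(
  paste("1", "toy", "gene", "100", "900", ".", "+", ".",
        'gene_id "ENSMUSG_TOY_A"; gene_version "1"; gene_name "GeneA"; gene_biotype "protein_coding";',
        sep = "\t"),
  paste("1", "toy", "gene", "2000", "2600", ".", "-", ".",
        'gene_id "ENSMUSG_TOY_B"; gene_version "1"; gene_name "GeneB"; gene_biotype "lncRNA";',
        sep = "\t")
)

# Hypothetical helper: extract the quoted value of one key from the attribute column.
get_attr <- function(attributes, key) {
  pattern <- paste0(key, ' "([^"]*)"')
  m <- regmatches(attributes, regexec(pattern, attributes))
  vapply(m, function(x) if (length(x) >= 2) x[2] else NA_character_, character(1))
}

# The attribute string is the ninth GTF column.
fields <- strsplit(gtf_lines, "\t", fixed = TRUE)
attrs <- vapply(fields, `[`, character(1), 9)

annotation <- data.frame(
  gene_id = get_attr(attrs, "gene_id"),
  gene_name = get_attr(attrs, "gene_name"),
  gene_biotype = get_attr(attrs, "gene_biotype"),
  stringsAsFactors = FALSE
)

# Toy counts table with ensembl ids as row names (made-up values).
counts <- matrix(c(10, 0, 2, 7), nrow = 2, byrow = TRUE,
                 dimnames = list(c("ENSMUSG_TOY_A", "ENSMUSG_TOY_B"),
                                 c("cell1", "cell2")))

# Keep only genes of the requested biotype, then attach the gene symbol.
wanted <- annotation[annotation$gene_biotype == "protein_coding", ]
idx <- match(rownames(counts), wanted$gene_id)
annotated <- counts[!is.na(idx), , drop = FALSE]
rownames(annotated) <- paste(rownames(annotated),
                             wanted$gene_name[idx[!is.na(idx)]], sep = ":")
```

Here only the protein-coding toy gene survives, mirroring a call with `biotype="protein_coding"`.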
Raffaele Calogero, Luca Alessandri
    if (FALSE) {
        system("wget http://130.192.119.59/public/testSCumi_mm10.csv.zip")
        library(rCASC)
        system("unzip testSCumi_mm10.csv.zip")
        #filtering low quality cells
        lorenzFilter(group="docker", scratch.folder="/data/scratch/",
                     file=paste(getwd(),"testSCumi_mm10.csv",sep="/"),
                     p_value=0.05, separator=',')
        #running annotation and removal of mito and ribosomal protein genes
        #download mouse GTF for mm10
        system("wget ftp://ftp.ensembl.org/pub/release-92/gtf/mus_musculus/Mus_musculus.GRCm38.92.gtf.gz")
        system("gunzip Mus_musculus.GRCm38.92.gtf.gz")
        #gtf.name must match the file produced by gunzip above
        scannobyGtf(group="docker", file=paste(getwd(),"testSCumi_mm10.csv",sep="/"),
                    gtf.name="Mus_musculus.GRCm38.92.gtf", biotype="protein_coding",
                    mt=TRUE, ribo.proteins=TRUE, umiXgene=3,
                    riboStart.percentage=0, riboEnd.percentage=100,
                    mitoStart.percentage=0, mitoEnd.percentage=100,
                    thresholdGenes=100)
    }