This function estimates the information content of the principal components (PCs) of the experiment, which is required for Seurat clustering.

seuratPCAEval(
  group = c("sudo", "docker"),
  scratch.folder,
  file,
  separator,
  logTen = 0,
  seed = 1111,
  sparse = FALSE,
  format = "NULL"
)

Arguments

group,

a character string. Two options: sudo or docker, depending on which group the user belongs to

scratch.folder,

a character string indicating the path of the scratch folder

file,

a character string indicating the path of the file, with file name and extension included

separator,

separator used in count file, e.g. '\t', ','

logTen,

1 if the count matrix is already in log10, 0 otherwise

seed,

seed for random number generation; using the same seed with the same input reproduces the same results

sparse,

boolean indicating whether the input is a sparse matrix. A sparse matrix is a format that reduces the size of the matrix by storing only the positions different from 0. The format supported in rCASC is the one generated by 10XGenomics: genes.tsv, barcodes.tsv and matrix.mtx.

format,

output file format, csv or txt. Only required if a sparse matrix is used
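
To illustrate what the sparse argument refers to, the following is a minimal base R sketch of the idea behind a sparse representation: only the non-zero entries are kept, as (row, column, value) triplets. The toy matrix and the names dense, triplets and rebuilt are made up for illustration; this is not how rCASC reads 10XGenomics files. In the 10XGenomics output, the triplets are stored in the matrix file, while genes.tsv and barcodes.tsv hold the row and column labels.

```r
# Toy dense count matrix (hypothetical data): 3 genes x 4 cells, mostly zeros.
# matrix() fills column by column, so each line below is one cell.
dense <- matrix(c(0, 2, 0,
                  0, 0, 0,
                  5, 0, 0,
                  0, 0, 1),
                nrow = 3,
                dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4)))

# Triplet (coordinate) form: keep only the positions different from 0.
nz <- which(dense != 0, arr.ind = TRUE)
triplets <- data.frame(row   = unname(nz[, "row"]),
                       col   = unname(nz[, "col"]),
                       value = dense[nz])

# Rebuild the dense matrix from the triplets to check that nothing was lost.
rebuilt <- matrix(0, nrow = nrow(dense), ncol = ncol(dense),
                  dimnames = dimnames(dense))
rebuilt[cbind(triplets$row, triplets$col)] <- triplets$value

print(triplets)
```

Here 3 values are stored instead of 12. The saving grows with the fraction of zeros, which is typically large in single-cell count tables.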

Value

A plot of the PCA scores, used to identify the PCA dimensions to use in the Seurat clustering algorithm
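
As a rough illustration of how such a plot can guide the choice of dimensions, the sketch below uses base R's prcomp on simulated data and plots the proportion of variance explained by each PC. It is not the Seurat or rCASC implementation, and the simulated matrix, the group structure and all variable names are assumptions made for the example.

```r
set.seed(1111)  # fixed seed so the simulated data are reproducible

# Simulated expression matrix (hypothetical): 60 cells x 20 genes,
# where the first 5 genes separate two hidden groups of cells.
n_cells <- 60
n_genes <- 20
group <- rep(c(0, 1), each = n_cells / 2)
expr <- matrix(rnorm(n_cells * n_genes), nrow = n_cells)
expr[, 1:5] <- expr[, 1:5] + 4 * group

# PCA on centered and scaled genes.
pca <- prcomp(expr, center = TRUE, scale. = TRUE)

# Proportion of variance explained by each principal component.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Scree plot: PCs before the curve flattens carry most of the structure.
plot(var_explained, type = "b",
     xlab = "Principal component",
     ylab = "Proportion of variance explained")
```

In a plot of this kind, the components located before the point where the curve flattens are the usual candidates to pass on to clustering.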

Author

Luca Alessandri, alessandri [dot] luca1991 [at] gmail [dot] com, University of Torino

Examples

if (FALSE) {
 system("wget http://130.192.119.59/public/section4.1_examples.zip")
 unzip("section4.1_examples.zip")
 setwd("section4.1_examples")
 system("wget ftp://ftp.ensembl.org/pub/release-94/gtf/homo_sapiens/Homo_sapiens.GRCh38.94.gtf.gz")
 system("gzip -d Homo_sapiens.GRCh38.94.gtf.gz")
 system("mv Homo_sapiens.GRCh38.94.gtf genome.gtf")
 scannobyGtf(group="docker", file=paste(getwd(),"bmsnkn_5x100cells.txt",sep="/"),
 gtf.name="genome.gtf", biotype="protein_coding", mt=TRUE, ribo.proteins=TRUE,umiXgene=3)
 
 seuratPCAEval(group="docker",scratch.folder="/data/scratch/", 
          file=paste(getwd(), "annotated_bmsnkn_5x100cells.txt", sep="/"), 
          separator="\t", logTen = 0, seed = 111, format="NULL")
}