This function executes the whole autoencoder pipeline

wrapperAutoencoder(
  group = c("sudo", "docker"),
  scratch.folder,
  file,
  separator,
  nCluster,
  bias = c("mirna", "TF", "CUSTOM", "kinasi", "immunoSignature", "ALL"),
  permutation,
  nEpochs,
  patiencePercentage = 5,
  cl,
  seed = 1111,
  projectName,
  bN = "NULL",
  lr = 0.01,
  beta_1 = 0.9,
  beta_2 = 0.999,
  epsilon = 1e-08,
  decay = 0,
  loss = "mean_squared_error",
  clusterMethod = c("GRIPH", "SIMLR", "SEURAT", "SHARP"),
  pcaDimensions = 5,
  permAtTime = 3,
  largeScale = FALSE,
  Sp = 0.8,
  threads = 1,
  X = 0.15,
  K = 2,
  counts = c("False"),
  skipvis = c("False"),
  regularization = 10,
  variational = FALSE
)

Arguments

group,

a character string. Two options: sudo or docker, depending on which group the user belongs to

scratch.folder,

a character string indicating the path of the scratch folder

file,

a character string indicating the path of the file, with file name and extension included

separator,

separator used in the count file, e.g. '\t' or ','

nCluster,

number of clusters into which the dataset is divided

bias,

bias method to use: "mirna", "TF", "CUSTOM", "kinasi", "immunoSignature", "ALL"

permutation,

number of permutations to perform when computing the p-value used to evaluate the clustering

nEpochs,

number of epochs for neural network training

patiencePercentage,

percentage of the number of epochs that may pass without training improvement before training is stopped.

cl,

Clustering.output file. It can be the output of any clustering algorithm from rCASC, or a custom file with cell names in the first column and the cluster each cell belongs to in the second column. The full path must be provided.

seed,

seed value, needed to reproduce the same results with the same input

projectName,

name of the project. It may differ from the matrix name, so that different analyses can be performed on the same dataset

bN,

name of the custom bias file. This file needs a header; the first column must contain the source and the second column the gene symbol. The full path must be provided.

lr,

learning rate, i.e. the speed of learning. Higher values may speed up convergence but may also make the result less precise

beta_1,

see the keras optimizer parameters

beta_2,

see the keras optimizer parameters

epsilon,

see the keras optimizer parameters

decay,

see the keras optimizer parameters

loss,

loss function to use. For other loss functions, check the keras loss functions.

clusterMethod,

clustering method to use: "GRIPH", "SIMLR", "SEURAT", "SHARP"

pcaDimensions,

number of dimensions to use for Seurat PCA reduction.

permAtTime,

number of permutations run in parallel

largeScale,

boolean for SIMLR analysis. Set to TRUE if there are fewer rows than columns or if the computational time is too long

Sp,

minimum fraction of cells that must be in common between two permutations for them to be considered the same cluster.

threads,

integer referring to the maximum number of processes run in parallel. Default is 1; the maximum is the number of clusters under analysis, i.e. nCluster

X,

argument for XL-mHG, ranging from 0 to 1; default 0.15. For more information, see the cometsc help.

K,

the number of gene combinations to be considered. Possible values are 2, 3, 4; default 2. WARNING: increasing the number of combinations makes the matrices very big

counts,

if set to True, log(expression+1) is plotted. To be used when unlogged data are provided

skipvis,

set to True to skip visualizations

regularization,

this parameter balances between reconstruction loss and enforcing a normal distribution in the latent space

variational,

boolean. TRUE to use a variational autoencoder, FALSE to use the standard autoencoder
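
The cl and bN arguments described above both accept custom two-column files. As a minimal sketch, assuming comma-separated files and using hypothetical column names, cell names, sources, and gene symbols (none of them taken from rCASC; check the exact format your installation expects), such files could be prepared in base R along these lines:

```r
# Hypothetical output directory; replace with a real folder.
out_dir <- tempdir()

# Custom clustering file for `cl`: first column cell names,
# second column the cluster each cell belongs to.
# Column names are assumptions, not taken from rCASC.
cl_table <- data.frame(
  cellName = c("cell_1", "cell_2", "cell_3", "cell_4"),
  cluster = c(1, 1, 2, 2)
)
cl_path <- file.path(out_dir, "custom_clustering.output.csv")
write.table(cl_table, cl_path, sep = ",", row.names = FALSE, quote = FALSE)

# Custom bias file for `bN`: a header is required, the first column is
# the source and the second column the gene symbol.
# All values here are made-up placeholders.
bias_table <- data.frame(
  source = c("sourceA", "sourceA", "sourceB"),
  gene = c("GENE1", "GENE2", "GENE3")
)
bias_path <- file.path(out_dir, "custom_bias.csv")
write.table(bias_table, bias_path, sep = ",", row.names = FALSE, quote = FALSE)

# Both arguments expect the full path to the file.
cl_full <- normalizePath(cl_path)
bias_full <- normalizePath(bias_path)
```

The resulting cl_full and bias_full strings are what would be passed as cl and bN, with separator="," and bias="CUSTOM" in this sketch.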

Value

folders containing the complete autoencoder analysis.

Author

Luca Alessandri, alessandri [dot] luca1991 [at] gmail [dot] com, University of Torino

Examples

if (FALSE) {
 wrapperAutoencoder(
   group = "sudo",
   scratch.folder = scratch.folder,
   file = "/home/lucastormreig/test/setA.csv",
   separator = ",",
   nCluster = 5,
   bias = "mirna",
   permutation = 10,
   nEpochs = 10,
   cl = "/home/lucastormreig/test/setA_clustering.output.csv",
   projectName = "mirna",
   clusterMethod = "GRIPH"
 )
}
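
A further sketch, mirroring the call above but selecting the CUSTOM bias. The arguments are collected in a list first, so they can be inspected before the pipeline is launched. All paths, the project name, and the parameter values are hypothetical placeholders, and the call itself is kept inside if (FALSE) because it requires rCASC and a working docker setup.

```r
# All paths and values below are hypothetical placeholders.
ae_args <- list(
  group = "docker",
  scratch.folder = "/path/to/scratch",
  file = "/path/to/setA.csv",
  separator = ",",
  nCluster = 5,
  bias = "CUSTOM",
  permutation = 10,
  nEpochs = 10,
  cl = "/path/to/setA_clustering.output.csv",
  projectName = "customBias",
  bN = "/path/to/custom_bias.csv",  # full path to the custom bias file
  clusterMethod = "SIMLR",
  seed = 1111                       # fixed seed for reproducibility
)

if (FALSE) {
  # Not run: needs the rCASC package and docker.
  do.call(wrapperAutoencoder, ae_args)
}
```

Using a distinct projectName for each bias is in line with the projectName description: several analyses can then be run on the same dataset.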