This function executes the integrationPsblk analysis, which searches for correspondences between the clusters of two different experiments using cluster pseudo-bulks, z-scored by row, and subsets of randomly selected genes. The function clustersBulk must therefore be run on both datasets before they are compared.

integrationPsblk(
  group = c("sudo", "docker"),
  scratch.folder,
  fileX,
  fileY,
  separatorX,
  separatorY,
  max.genes = 500,
  split.by = 100,
  outputFolder
)

Arguments

group,

a character string. Two options: sudo or docker, depending on which group the user belongs to

scratch.folder,

a character string indicating the path of the scratch folder

fileX,

a character string indicating the path of the pseudobulkRow file of the X dataset, with file name and extension included

fileY,

a character string indicating the path of the pseudobulkRow file of the Y dataset, with file name and extension included

separatorX,

separator used in the count file of the X dataset, e.g. '\t', ','

separatorY,

separator used in the count file of the Y dataset, e.g. '\t', ','

max.genes,

maximum number of random genes to be used for each cluster, default 500

split.by,

value indicating the splitting range, default 100. For example, with max.genes = 500 and split.by = 100, five gene sets are selected, of 100, 200, 300, 400 and 500 genes

outputFolder,

path of the folder where the results are placed
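
The relation between max.genes and split.by described for the arguments above can be sketched in a few lines of R. This is only an illustration: gene_set_sizes is a hypothetical helper, not a function of the package, and how the package behaves when max.genes is not a multiple of split.by is not covered here.

```r
# Hypothetical helper (not part of the package): lists the gene-set sizes
# implied by max.genes and split.by.
gene_set_sizes <- function(max.genes = 500, split.by = 100) {
  seq(from = split.by, to = max.genes, by = split.by)
}

# With the defaults this yields the five thresholds 100, 200, 300, 400, 500.
sizes <- gene_set_sizes(max.genes = 500, split.by = 100)
print(sizes)
```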

Value

A folder called XYpb containing all the generated results. The final frequency table is saved in final_score.csv. For each possible comparison between a cluster of the X experiment and a cluster of the Y experiment, it reports how frequently the Pearson correlation between the two is >= 0.5. Pearson correlation is calculated on 10000 random selections of genes for each threshold.
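
The frequency score described above can be illustrated with a minimal sketch on synthetic data. It is not the package's implementation: pearson_frequency, the matrices pbX and pbY, and the reduced number of iterations are all assumptions made for the example, and the matrices are assumed to already hold row z-scored pseudo-bulk values.

```r
set.seed(1)

# Synthetic pseudo-bulk matrices (genes x clusters), assumed already z-scored.
# A real analysis would use the pseudobulkRow files given as fileX and fileY.
n_genes <- 1000
genes <- paste0("gene", seq_len(n_genes))
pbX <- matrix(rnorm(n_genes * 3), nrow = n_genes,
              dimnames = list(genes, paste0("X", 1:3)))
# Y1 is a noisy copy of X1, Y2 is unrelated noise.
pbY <- cbind(Y1 = pbX[, "X1"] + rnorm(n_genes, sd = 0.3),
             Y2 = rnorm(n_genes))
rownames(pbY) <- genes

# Hypothetical helper: fraction of random gene subsets of size n_sel for which
# the Pearson correlation between two clusters is >= threshold.
pearson_frequency <- function(x, y, n_sel, n_iter = 200, threshold = 0.5) {
  hits <- replicate(n_iter, {
    idx <- sample(length(x), n_sel)  # random selection of genes
    cor(x[idx], y[idx], method = "pearson") >= threshold
  })
  mean(hits)
}

# Frequency table: rows are X clusters, columns are Y clusters.
freq <- sapply(colnames(pbY), function(cy) {
  sapply(colnames(pbX), function(cx) {
    pearson_frequency(pbX[, cx], pbY[, cy], n_sel = 100)
  })
})
print(round(freq, 2))
```

In this toy setting only the X1/Y1 pair is built to be correlated, so it should be the only one with a high frequency. The sketch uses one gene-set size and 200 iterations to stay fast, whereas the documented analysis uses 10000 random selections for each threshold.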

Author

Luca Alessandri, alessandri [dot] luca1991 [at] gmail [dot] com, University of Torino

Examples
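
A possible call, following the usage shown above, might look like the sketch below. All paths and file names are hypothetical placeholders, and the call is guarded so that it only runs when integrationPsblk is available in the session and the input file exists. It also assumes that Docker is set up for the chosen group.

```r
# Hypothetical paths and file names: replace them with your own.
args <- list(
  group = "docker",
  scratch.folder = "/path/to/scratch",
  fileX = "/path/to/X/pseudobulkRow_X.csv",
  fileY = "/path/to/Y/pseudobulkRow_Y.csv",
  separatorX = ",",
  separatorY = ",",
  max.genes = 500,
  split.by = 100,
  outputFolder = "/path/to/output"
)

# Run only if the function is loaded and the X input file is present.
if (exists("integrationPsblk") && file.exists(args$fileX)) {
  do.call("integrationPsblk", args)
}
```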