Cloud-based LOcally linear Unbiased Dysbiosis (CLOUD) test
Source: R/cloudStatistic.R
Arguments
- x
A phyloseq object
- reference_samples
Vector of samples to use as reference.
- dist_mat
A distance matrix. Can be the output of phyloseq::distance or vegan::vegdist.
- k_num
Neighbors to use. Default is 5 percent. The user can define the percentage of samples. See Montassier E et al. 2018 for more details.
- ndim
Dimension of the space in which the data are to be represented. Default is -1. See Montassier E et al. 2018.
Details
Calculates CLOUD score.
The Cloud-based LOcally linear Unbiased Dysbiosis (CLOUD) test is a
non-parametric test that returns a measure of dysbiosis. The function was
adapted from the original article by Montassier E et al. 2018. The user
defines a set of reference samples, and the distance of every other sample
from this reference set is calculated. When calculating the CLOUD stats, k
(the number of neighbors) is an important parameter, specified by the
argument k_num. By default we use a conservative 80 percent of the samples
in each group.
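To make the role of the reference set and of k concrete, the following is a minimal base-R sketch of a CLOUD-style statistic, written from the general description in Montassier E et al. 2018. It is a simplified illustration under stated assumptions, not the code used by cloudStatistic: the helper names knn_mean_dist and cloud_sketch are hypothetical, k is taken as an absolute number of neighbors, and the reference baseline is computed once for all samples.

```r
# Hypothetical helper (not part of the package): mean distance from sample `i`
# to its k nearest neighbours among the samples named in `pool`.
knn_mean_dist <- function(d, i, pool, k) {
  pool <- setdiff(pool, i)  # a sample is never its own neighbour
  mean(sort(d[i, pool])[seq_len(k)])
}

# Hypothetical sketch of a CLOUD-style score for every sample in `d`.
# d   : distance matrix or "dist" object with sample names
# ref : character vector of reference sample names
# k   : number of nearest reference neighbours (assumed absolute count)
cloud_sketch <- function(d, ref, k) {
  d <- as.matrix(d)
  stopifnot(k >= 1, k < length(ref))
  # Baseline: how close each reference sample is to the other references
  ref_dists <- vapply(ref, function(r) knn_mean_dist(d, r, ref, k), numeric(1))
  samples <- rownames(d)
  # Same quantity for every sample, always measured against the reference set
  test_dists <- vapply(samples, function(s) knn_mean_dist(d, s, ref, k),
                       numeric(1))
  stats <- test_dists / mean(ref_dists)
  data.frame(
    stats = stats,
    # Fraction of reference samples that are farther from their neighbours
    pvals = vapply(test_dists, function(td) mean(td < ref_dists), numeric(1)),
    log2Stats = log2(stats),
    row.names = samples
  )
}

# Toy data: nine reference points on a grid plus one distant point
pts <- cbind(x = c(0, 1, 2, 0, 1, 2, 0, 1, 2, 10),
             y = c(0, 0, 0, 1, 1, 1, 2, 2, 2, 10))
rownames(pts) <- c(paste0("ref", 1:9), "out1")
res <- cloud_sketch(dist(pts), ref = paste0("ref", 1:9), k = 3)
print(res)
```

In this sketch a ratio above 1 means the sample lies farther from its nearest reference neighbors than a typical reference sample does, so the distant point out1 receives a large statistic and a small p-value. Larger values of k average over more of the reference cloud and make the score less sensitive to local structure.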
References
Montassier E et al. (2018). CLOUD: a non-parametric detection test for microbiome outliers. Microbiome, 6(1), pp.1-10.
Examples
data("WirbelJ_2018")
library(phyloseq)
ps <- WirbelJ_2018
# Define controls as reference samples
ref.samples <- sample_names(subset_samples(WirbelJ_2018,
                                           disease == "healthy"))
dist.data <- phyloseq::distance(ps, "bray")
cloud.results <- cloudStatistic(x = ps,
                                dist_mat = dist.data,
                                reference_samples = ref.samples,
                                ndim = -1,
                                k_num = 5)
head(cloud.results)
#> stats pvals log2Stats study_name subject_id
#> CCMD10032470ST-11-0 0.6673767 1 -0.5834267 WirbelJ_2018 CCMD10032470ST-11-0
#> CCMD10191450ST-11-0 0.6603512 1 -0.5986947 WirbelJ_2018 CCMD10191450ST-11-0
#> CCMD11006829ST-21-0 0.8141492 1 -0.2966349 WirbelJ_2018 CCMD11006829ST-21-0
#> CCMD12232071ST-21-0 0.7767293 1 -0.3645161 WirbelJ_2018 CCMD12232071ST-21-0
#> CCMD13071240ST-21-0 0.7656822 1 -0.3851824 WirbelJ_2018 CCMD13071240ST-21-0
#> CCMD13934959ST-21-0 0.6943361 1 -0.5262940 WirbelJ_2018 CCMD13934959ST-21-0
#> body_site study_condition disease age age_category gender
#> CCMD10032470ST-11-0 stool control healthy 45 adult male
#> CCMD10191450ST-11-0 stool control healthy 62 adult female
#> CCMD11006829ST-21-0 stool CRC CRC 42 adult female
#> CCMD12232071ST-21-0 stool CRC CRC 75 senior male
#> CCMD13071240ST-21-0 stool CRC CRC 66 senior female
#> CCMD13934959ST-21-0 stool CRC CRC 56 adult male
#> country non_westernized sequencing_platform
#> CCMD10032470ST-11-0 DEU no IlluminaHiSeq
#> CCMD10191450ST-11-0 DEU no IlluminaHiSeq
#> CCMD11006829ST-21-0 DEU no IlluminaHiSeq
#> CCMD12232071ST-21-0 DEU no IlluminaHiSeq
#> CCMD13071240ST-21-0 DEU no IlluminaHiSeq
#> CCMD13934959ST-21-0 DEU no IlluminaHiSeq
#> DNA_extraction_kit PMID number_reads number_bases
#> CCMD10032470ST-11-0 Gnome 30936547 37708359 4639213592
#> CCMD10191450ST-11-0 Gnome 30936547 34952407 4351849639
#> CCMD11006829ST-21-0 Gnome 30936547 83456496 7310970279
#> CCMD12232071ST-21-0 Gnome 30936547 65269931 5370021950
#> CCMD13071240ST-21-0 Gnome 30936547 63427722 5486983593
#> CCMD13934959ST-21-0 Gnome 30936547 36979669 3186342244
#> minimum_read_length median_read_length NCBI_accession
#> CCMD10032470ST-11-0 45 139 ERR2726404
#> CCMD10191450ST-11-0 45 140 ERR2726405
#> CCMD11006829ST-21-0 45 94 <NA>
#> CCMD12232071ST-21-0 2 88 <NA>
#> CCMD13071240ST-21-0 45 93 <NA>
#> CCMD13934959ST-21-0 45 93 <NA>
#> curator BMI tnm ajcc nspecies group
#> CCMD10032470ST-11-0 Jacob_Wirbel;Paolo_Manghi 30.7 <NA> <NA> 72 control
#> CCMD10191450ST-11-0 Jacob_Wirbel;Paolo_Manghi 28.5 <NA> <NA> 87 control
#> CCMD11006829ST-21-0 Paolo_Manghi 35.0 t3n0m0 ii 144 case
#> CCMD12232071ST-21-0 Paolo_Manghi 28.0 t3n2m0 iii 110 case
#> CCMD13071240ST-21-0 Paolo_Manghi 23.0 t3n0m0 ii 126 case
#> CCMD13934959ST-21-0 Paolo_Manghi 27.0 t3n0m0 ii 100 case
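The returned table can be summarised with ordinary data-frame tools. The sketch below is self-contained: it rebuilds only the first three result columns and the disease column from the six rows printed above, so it runs without the package or the data set. The object name cloud.head and the 0.05 cutoff are assumptions made for illustration, not package defaults.

```r
# Six rows copied from the head(cloud.results) output shown above
cloud.head <- data.frame(
  stats     = c(0.6673767, 0.6603512, 0.8141492,
                0.7767293, 0.7656822, 0.6943361),
  pvals     = c(1, 1, 1, 1, 1, 1),
  log2Stats = c(-0.5834267, -0.5986947, -0.2966349,
                -0.3645161, -0.3851824, -0.5262940),
  disease   = c("healthy", "healthy", "CRC", "CRC", "CRC", "CRC"),
  row.names = c("CCMD10032470ST-11-0", "CCMD10191450ST-11-0",
                "CCMD11006829ST-21-0", "CCMD12232071ST-21-0",
                "CCMD13071240ST-21-0", "CCMD13934959ST-21-0")
)

# log2Stats is the base-2 logarithm of stats (up to printing precision)
all.equal(log2(cloud.head$stats), cloud.head$log2Stats, tolerance = 1e-5)

# Median CLOUD statistic per disease group
med.by.group <- aggregate(stats ~ disease, data = cloud.head, FUN = median)
print(med.by.group)

# Samples flagged at an assumed cutoff of 0.05 (none among these six rows)
flagged <- subset(cloud.head, pvals < 0.05)
print(rownames(flagged))
```

With the full cloud.results object the same calls apply unchanged, since stats, pvals, log2Stats and disease are all columns of the returned table.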