Dysbiosis Score Based on Euclidean Distance to Group Centroids
Source: R/eucledianDistCentroids.R
Arguments
- x
A phyloseq object
- dist_mat
A distance matrix. Can be the output of phyloseq::distance or vegan::vegdist.
- use_squared
Logical. Default is FALSE. If TRUE, the score is calculated using the squared distance to group centroids. See usedist::dist_to_centroids.
- group_col
A column in phyloseq::sample_data with all control and case labels.
- control_label
A character string specifying control/healthy labels in group_col.
- case_label
A character string specifying case/disease labels in group_col.
Details
Calculates, for each sample, the difference between its Euclidean distance (ED) to the control centroid and its ED to the case centroid. For example, the distance from sample_1 to the control centroid minus the distance from sample_1 to the case centroid. The user can provide a custom distance matrix. This approach was used in AlShawaqfeh MK et al. (2017).
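The subtraction described above can be sketched in base R. This is only an illustration under stated assumptions: the coordinates and group labels are made-up toy data, centroids are taken as plain coordinate means, and the column names simply mirror those in the example output. The function itself works from a distance matrix rather than raw coordinates, so this is not its implementation.

```r
# Toy 2-D coordinates for four samples (hypothetical data, not WirbelJ_2018)
coords <- rbind(
  s1 = c(0, 0),
  s2 = c(0, 2),
  s3 = c(4, 0),
  s4 = c(4, 2)
)
group <- c("healthy", "healthy", "CRC", "CRC")

# Centroid of each group: column means of its member samples
centroid_control <- colMeans(coords[group == "healthy", , drop = FALSE])
centroid_case    <- colMeans(coords[group == "CRC", , drop = FALSE])

# Euclidean distance from every sample to a given centroid
dist_to <- function(centroid) {
  sqrt(rowSums(sweep(coords, 2, centroid)^2))
}

toy_score <- data.frame(
  CentroidDist_CRC     = dist_to(centroid_case),
  CentroidDist_healthy = dist_to(centroid_control)
)

# Score: distance to control centroid minus distance to case centroid
toy_score$CentroidDist_score <-
  toy_score$CentroidDist_healthy - toy_score$CentroidDist_CRC

print(toy_score)
```

With this sign convention, a sample that lies closer to the case centroid than to the control centroid gets a positive score, and a sample closer to the control centroid gets a negative one.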
References
AlShawaqfeh MK et al. (2017). A dysbiosis index to assess microbial changes in fecal samples of dogs with chronic inflammatory enteropathy. FEMS Microbiology Ecology, 93(11), fix136.
Examples
library(dysbiosisR)
# We use WirbelJ_2018 as test data
dist.mat <- phyloseq::distance(WirbelJ_2018, "bray")
db.1 <- euclideanDistCentroids(WirbelJ_2018,
                               dist_mat = dist.mat,
                               use_squared = TRUE,
                               group_col = "disease",
                               control_label = "healthy",
                               case_label = "CRC")
head(db.1)
#> CentroidDist_CRC CentroidDist_healthy CentroidDist_score
#> CCMD11006829ST-21-0 0.2912347 0.3684569 0.077222248
#> CCMD12232071ST-21-0 0.2744104 0.3415175 0.067107098
#> CCMD13071240ST-21-0 0.2172320 0.2697302 0.052498215
#> CCMD13934959ST-21-0 0.1578083 0.2092996 0.051491262
#> CCMD14479708ST-21-0 0.2095968 0.2218744 0.012277622
#> CCMD18872694ST-21-0 0.1859037 0.1906539 0.004750221
#> study_name subject_id body_site study_condition
#> CCMD11006829ST-21-0 WirbelJ_2018 CCMD11006829ST-21-0 stool CRC
#> CCMD12232071ST-21-0 WirbelJ_2018 CCMD12232071ST-21-0 stool CRC
#> CCMD13071240ST-21-0 WirbelJ_2018 CCMD13071240ST-21-0 stool CRC
#> CCMD13934959ST-21-0 WirbelJ_2018 CCMD13934959ST-21-0 stool CRC
#> CCMD14479708ST-21-0 WirbelJ_2018 CCMD14479708ST-21-0 stool CRC
#> CCMD18872694ST-21-0 WirbelJ_2018 CCMD18872694ST-21-0 stool CRC
#> disease age age_category gender country non_westernized
#> CCMD11006829ST-21-0 CRC 42 adult female DEU no
#> CCMD12232071ST-21-0 CRC 75 senior male DEU no
#> CCMD13071240ST-21-0 CRC 66 senior female DEU no
#> CCMD13934959ST-21-0 CRC 56 adult male DEU no
#> CCMD14479708ST-21-0 CRC 74 senior male DEU no
#> CCMD18872694ST-21-0 CRC 63 adult male DEU no
#> sequencing_platform DNA_extraction_kit PMID
#> CCMD11006829ST-21-0 IlluminaHiSeq Gnome 30936547
#> CCMD12232071ST-21-0 IlluminaHiSeq Gnome 30936547
#> CCMD13071240ST-21-0 IlluminaHiSeq Gnome 30936547
#> CCMD13934959ST-21-0 IlluminaHiSeq Gnome 30936547
#> CCMD14479708ST-21-0 IlluminaHiSeq Gnome 30936547
#> CCMD18872694ST-21-0 IlluminaHiSeq Gnome 30936547
#> number_reads number_bases minimum_read_length
#> CCMD11006829ST-21-0 83456496 7310970279 45
#> CCMD12232071ST-21-0 65269931 5370021950 2
#> CCMD13071240ST-21-0 63427722 5486983593 45
#> CCMD13934959ST-21-0 36979669 3186342244 45
#> CCMD14479708ST-21-0 65475493 5637718108 45
#> CCMD18872694ST-21-0 55300211 4780095276 45
#> median_read_length NCBI_accession curator BMI tnm
#> CCMD11006829ST-21-0 94 <NA> Paolo_Manghi 35 t3n0m0
#> CCMD12232071ST-21-0 88 <NA> Paolo_Manghi 28 t3n2m0
#> CCMD13071240ST-21-0 93 <NA> Paolo_Manghi 23 t3n0m0
#> CCMD13934959ST-21-0 93 <NA> Paolo_Manghi 27 t3n0m0
#> CCMD14479708ST-21-0 92 <NA> Paolo_Manghi 26 t3n0m1
#> CCMD18872694ST-21-0 93 <NA> Paolo_Manghi 25 t3n1m0
#> ajcc nspecies group
#> CCMD11006829ST-21-0 ii 144 case
#> CCMD12232071ST-21-0 iii 110 case
#> CCMD13071240ST-21-0 ii 126 case
#> CCMD13934959ST-21-0 ii 100 case
#> CCMD14479708ST-21-0 iv 110 case
#> CCMD18872694ST-21-0 iii 145 case
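As a quick sanity check on the sign convention, the first three columns printed above can be compared directly. The vectors below are copied by hand from the head(db.1) output (rounded as displayed), and the variable names are illustrative only.

```r
# Values copied from the printed head(db.1) output above
dist_crc     <- c(0.2912347, 0.2744104, 0.2172320, 0.1578083, 0.2095968, 0.1859037)
dist_healthy <- c(0.3684569, 0.3415175, 0.2697302, 0.2092996, 0.2218744, 0.1906539)
score        <- c(0.077222248, 0.067107098, 0.052498215,
                  0.051491262, 0.012277622, 0.004750221)

# The score column matches control (healthy) minus case (CRC),
# up to the rounding of the printed values
max_diff <- max(abs((dist_healthy - dist_crc) - score))
print(max_diff < 1e-6)
```

All six samples shown are CRC cases and have positive scores, meaning each is closer to the CRC centroid than to the healthy centroid.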