Dysbiosis Score Based on Euclidean Distance to Group Centroids

Arguments

x

A phyloseq object

dist_mat

A distance matrix. Can be the output of phyloseq::distance or vegan::vegdist.

use_squared

Logical. Default is FALSE. If TRUE, the score is calculated using the squared distance to group centroids. See usedist::dist_to_centroids.

group_col

Name of the column in phyloseq::sample_data that contains the control and case labels.

control_label

A character string specifying control/healthy labels in group_col.

case_label

A character string specifying case/disease labels in group_col.

Value

A data frame with the distance of each sample to each group centroid, the resulting score, and the sample information.

Details

Calculates, for each sample, the difference between its Euclidean distance (ED) to the two group centroids: for example, the distance from sample_1 to the control centroid minus the distance from sample_1 to the case centroid. The distance matrix is supplied by the user through dist_mat, so a custom distance matrix can be used. This approach was used in AlShawaqfeh MK et al. (2017).
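The arithmetic behind the score can be illustrated with a minimal base R sketch. It works on hypothetical 2-D coordinates rather than a distance matrix, and it is not the package's implementation, which takes dist_mat as input. The toy data, the helper dist_to, and the object names are assumptions made for illustration; only the column names mirror the output shown in the Examples.

```r
# Toy 2-D coordinates for four samples (hypothetical data, not WirbelJ_2018)
coords <- rbind(
  h1 = c(0, 0),
  h2 = c(2, 0),
  c1 = c(4, 3),
  c2 = c(6, 3)
)
group <- c(h1 = "healthy", h2 = "healthy", c1 = "CRC", c2 = "CRC")

# The centroid of each group is the column-wise mean of its members
centroid_healthy <- colMeans(coords[group == "healthy", , drop = FALSE])
centroid_crc     <- colMeans(coords[group == "CRC", , drop = FALSE])

# Euclidean distance from every sample to a given centroid
# (dist_to is a hypothetical helper, not part of dysbiosisR)
dist_to <- function(centroid) sqrt(rowSums(sweep(coords, 2, centroid)^2))

toy <- data.frame(
  CentroidDist_CRC     = dist_to(centroid_crc),
  CentroidDist_healthy = dist_to(centroid_healthy)
)

# Score: distance to the control centroid minus distance to the case centroid
toy$CentroidDist_score <- toy$CentroidDist_healthy - toy$CentroidDist_CRC
toy
```

In this toy data the two CRC samples lie closer to the CRC centroid than to the healthy centroid, so their score is positive, while the healthy samples get a negative score.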

References

  • AlShawaqfeh MK et al. (2017). A dysbiosis index to assess microbial changes in fecal samples of dogs with chronic inflammatory enteropathy. FEMS Microbiology Ecology, 93(11), fix136.

Author

Sudarshan A. Shetty

Examples

library(dysbiosisR)
# We use WirbelJ_2018 as test data
dist.mat <- phyloseq::distance(WirbelJ_2018, "bray")
db.1 <- euclideanDistCentroids(WirbelJ_2018,
                               dist_mat = dist.mat,
                               use_squared = TRUE,
                               group_col = "disease",
                               control_label = "healthy",
                               case_label = "CRC")

head(db.1)
#>                     CentroidDist_CRC CentroidDist_healthy CentroidDist_score
#> CCMD11006829ST-21-0        0.2912347            0.3684569        0.077222248
#> CCMD12232071ST-21-0        0.2744104            0.3415175        0.067107098
#> CCMD13071240ST-21-0        0.2172320            0.2697302        0.052498215
#> CCMD13934959ST-21-0        0.1578083            0.2092996        0.051491262
#> CCMD14479708ST-21-0        0.2095968            0.2218744        0.012277622
#> CCMD18872694ST-21-0        0.1859037            0.1906539        0.004750221
#>                       study_name          subject_id body_site study_condition
#> CCMD11006829ST-21-0 WirbelJ_2018 CCMD11006829ST-21-0     stool             CRC
#> CCMD12232071ST-21-0 WirbelJ_2018 CCMD12232071ST-21-0     stool             CRC
#> CCMD13071240ST-21-0 WirbelJ_2018 CCMD13071240ST-21-0     stool             CRC
#> CCMD13934959ST-21-0 WirbelJ_2018 CCMD13934959ST-21-0     stool             CRC
#> CCMD14479708ST-21-0 WirbelJ_2018 CCMD14479708ST-21-0     stool             CRC
#> CCMD18872694ST-21-0 WirbelJ_2018 CCMD18872694ST-21-0     stool             CRC
#>                     disease age age_category gender country non_westernized
#> CCMD11006829ST-21-0     CRC  42        adult female     DEU              no
#> CCMD12232071ST-21-0     CRC  75       senior   male     DEU              no
#> CCMD13071240ST-21-0     CRC  66       senior female     DEU              no
#> CCMD13934959ST-21-0     CRC  56        adult   male     DEU              no
#> CCMD14479708ST-21-0     CRC  74       senior   male     DEU              no
#> CCMD18872694ST-21-0     CRC  63        adult   male     DEU              no
#>                     sequencing_platform DNA_extraction_kit     PMID
#> CCMD11006829ST-21-0       IlluminaHiSeq              Gnome 30936547
#> CCMD12232071ST-21-0       IlluminaHiSeq              Gnome 30936547
#> CCMD13071240ST-21-0       IlluminaHiSeq              Gnome 30936547
#> CCMD13934959ST-21-0       IlluminaHiSeq              Gnome 30936547
#> CCMD14479708ST-21-0       IlluminaHiSeq              Gnome 30936547
#> CCMD18872694ST-21-0       IlluminaHiSeq              Gnome 30936547
#>                     number_reads number_bases minimum_read_length
#> CCMD11006829ST-21-0     83456496   7310970279                  45
#> CCMD12232071ST-21-0     65269931   5370021950                   2
#> CCMD13071240ST-21-0     63427722   5486983593                  45
#> CCMD13934959ST-21-0     36979669   3186342244                  45
#> CCMD14479708ST-21-0     65475493   5637718108                  45
#> CCMD18872694ST-21-0     55300211   4780095276                  45
#>                     median_read_length NCBI_accession      curator BMI    tnm
#> CCMD11006829ST-21-0                 94           <NA> Paolo_Manghi  35 t3n0m0
#> CCMD12232071ST-21-0                 88           <NA> Paolo_Manghi  28 t3n2m0
#> CCMD13071240ST-21-0                 93           <NA> Paolo_Manghi  23 t3n0m0
#> CCMD13934959ST-21-0                 93           <NA> Paolo_Manghi  27 t3n0m0
#> CCMD14479708ST-21-0                 92           <NA> Paolo_Manghi  26 t3n0m1
#> CCMD18872694ST-21-0                 93           <NA> Paolo_Manghi  25 t3n1m0
#>                     ajcc nspecies group
#> CCMD11006829ST-21-0   ii      144  case
#> CCMD12232071ST-21-0  iii      110  case
#> CCMD13071240ST-21-0   ii      126  case
#> CCMD13934959ST-21-0   ii      100  case
#> CCMD14479708ST-21-0   iv      110  case
#> CCMD18872694ST-21-0  iii      145  case