Score based on the distance to a healthy/reference plane

Arguments

x

A phyloseq object

dist_mat

A distance matrix. Can be output of phyloseq::distance or vegan::vegdist.

reference_samples

Vector of sample names to use as reference.

Value

A data frame with distance to reference plane ('dtrpScore') values and sample information.

Details

Calculates a 'healthy' or 'reference' plane. The plane is calculated in a space derived from PCoA and is based on user-defined distances between samples from healthy subjects. Briefly, the samples from healthy subjects are fitted, using the least-squares method, to a two-dimensional plane embedded in a three-dimensional space. The plane is then restricted to span only the three-dimensional ranges of the healthy control samples, and it is considered a proxy for the normal microbial variation in healthy subjects.

distanceToReferencePlane calculates the Euclidean distance (within PCoA space) from each sample to the plane, which can be considered a measure of (ab)normality. The user can provide a custom distance matrix. If a UniFrac distance matrix is provided, the resulting score is comparable to the dysbiosis score reported in Halfvarson J, Brislawn CJ, Lamendella R et al. (2017).
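The geometry described above can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: the toy abundance matrix, the object names (abund, ref, pcs, score), and the use of Euclidean distances are all made up for the example. It also treats PC3 as the response in an ordinary least-squares fit and measures distance to the unbounded plane, so the step that restricts the plane to the range of the reference samples is left out.

```r
# Illustrative sketch only; not the dysbiosisR implementation.
set.seed(1)

# Toy data: 20 samples x 10 taxa; the first 10 samples act as the reference
abund <- matrix(rexp(200), nrow = 20,
                dimnames = list(paste0("s", 1:20), paste0("taxon", 1:10)))
ref <- rownames(abund)[1:10]

# Any distance matrix could be used; Euclidean is chosen for convenience
d <- dist(abund)

# PCoA (classical multidimensional scaling), keeping three axes
pcs <- as.data.frame(cmdscale(d, k = 3))
colnames(pcs) <- c("PC1", "PC2", "PC3")

# Least-squares plane PC3 = a + b * PC1 + c * PC2, fitted on reference samples
fit <- lm(PC3 ~ PC1 + PC2, data = pcs[ref, ])
co <- coef(fit)

# Perpendicular distance from each sample to the plane b*x + c*y - z + a = 0
num <- abs(co[["PC1"]] * pcs$PC1 + co[["PC2"]] * pcs$PC2 -
             pcs$PC3 + co[["(Intercept)"]])
score <- num / sqrt(co[["PC1"]]^2 + co[["PC2"]]^2 + 1)
names(score) <- rownames(pcs)

head(score)
```

Because the plane here is unbounded, samples that lie far outside the reference range but close to the extended plane would receive a small score in this sketch; restricting the plane, as described above, would address that case.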

References

  • Halfvarson J, Brislawn CJ, Lamendella R et al. (2017) Dynamics of the human gut microbiome in inflammatory bowel disease. Nature Microbiology, 2, article number: 17004.

  • Vázquez-Baeza Y. (2017) Reference Plane, GitHub repository, https://github.com/ElDeveloper/reference-plane

Author

Wouter A.A. de Steenhuijsen Piters

Examples

library(dysbiosisR)
# We use WirbelJ_2018 as test data
dist.mat <- phyloseq::distance(WirbelJ_2018, "bray")
# Names of the healthy samples, used to define the reference plane
ref.samples <- phyloseq::sample_names(
  phyloseq::subset_samples(WirbelJ_2018, disease == "healthy"))
dtrp.results <- distanceToReferencePlane(WirbelJ_2018,
                                         dist_mat = dist.mat,
                                         reference_samples = ref.samples)

head(dtrp.results)
#>                      dtrpScore   study_name          subject_id body_site
#> CCMD11006829ST-21-0 0.05128249 WirbelJ_2018 CCMD11006829ST-21-0     stool
#> CCMD12232071ST-21-0 0.22944516 WirbelJ_2018 CCMD12232071ST-21-0     stool
#> CCMD13071240ST-21-0 0.14400941 WirbelJ_2018 CCMD13071240ST-21-0     stool
#> CCMD13934959ST-21-0 0.02996153 WirbelJ_2018 CCMD13934959ST-21-0     stool
#> CCMD14479708ST-21-0 0.03891631 WirbelJ_2018 CCMD14479708ST-21-0     stool
#> CCMD18872694ST-21-0 0.13630248 WirbelJ_2018 CCMD18872694ST-21-0     stool
#>                     study_condition disease age age_category gender country
#> CCMD11006829ST-21-0             CRC     CRC  42        adult female     DEU
#> CCMD12232071ST-21-0             CRC     CRC  75       senior   male     DEU
#> CCMD13071240ST-21-0             CRC     CRC  66       senior female     DEU
#> CCMD13934959ST-21-0             CRC     CRC  56        adult   male     DEU
#> CCMD14479708ST-21-0             CRC     CRC  74       senior   male     DEU
#> CCMD18872694ST-21-0             CRC     CRC  63        adult   male     DEU
#>                     non_westernized sequencing_platform DNA_extraction_kit
#> CCMD11006829ST-21-0              no       IlluminaHiSeq              Gnome
#> CCMD12232071ST-21-0              no       IlluminaHiSeq              Gnome
#> CCMD13071240ST-21-0              no       IlluminaHiSeq              Gnome
#> CCMD13934959ST-21-0              no       IlluminaHiSeq              Gnome
#> CCMD14479708ST-21-0              no       IlluminaHiSeq              Gnome
#> CCMD18872694ST-21-0              no       IlluminaHiSeq              Gnome
#>                         PMID number_reads number_bases minimum_read_length
#> CCMD11006829ST-21-0 30936547     83456496   7310970279                  45
#> CCMD12232071ST-21-0 30936547     65269931   5370021950                   2
#> CCMD13071240ST-21-0 30936547     63427722   5486983593                  45
#> CCMD13934959ST-21-0 30936547     36979669   3186342244                  45
#> CCMD14479708ST-21-0 30936547     65475493   5637718108                  45
#> CCMD18872694ST-21-0 30936547     55300211   4780095276                  45
#>                     median_read_length NCBI_accession      curator BMI    tnm
#> CCMD11006829ST-21-0                 94           <NA> Paolo_Manghi  35 t3n0m0
#> CCMD12232071ST-21-0                 88           <NA> Paolo_Manghi  28 t3n2m0
#> CCMD13071240ST-21-0                 93           <NA> Paolo_Manghi  23 t3n0m0
#> CCMD13934959ST-21-0                 93           <NA> Paolo_Manghi  27 t3n0m0
#> CCMD14479708ST-21-0                 92           <NA> Paolo_Manghi  26 t3n0m1
#> CCMD18872694ST-21-0                 93           <NA> Paolo_Manghi  25 t3n1m0
#>                     ajcc nspecies group
#> CCMD11006829ST-21-0   ii      144  case
#> CCMD12232071ST-21-0  iii      110  case
#> CCMD13071240ST-21-0   ii      126  case
#> CCMD13934959ST-21-0   ii      100  case
#> CCMD14479708ST-21-0   iv      110  case
#> CCMD18872694ST-21-0  iii      145  case
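Since the returned data frame carries the sample information alongside dtrpScore, scores can be summarised per group with base R. The sketch below uses a small made-up data frame (toy) whose column names mirror the output above; the values are hypothetical and are not results from WirbelJ_2018.

```r
# Hypothetical scores; only the column names follow the example output
toy <- data.frame(
  dtrpScore = c(0.05, 0.23, 0.14, 0.03, 0.04, 0.02),
  disease   = c("CRC", "CRC", "CRC", "healthy", "healthy", "healthy")
)

# Median distance to the reference plane per disease group
med <- tapply(toy$dtrpScore, toy$disease, median)
med
```

With real output, toy would be replaced by the data frame returned by distanceToReferencePlane, for example dtrp.results above.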