By Angus Ball
library’s
library(phyloseq)
library(ALDEx2)
library(microbiomeMarker)
Differential abundance time! First load the data
filepathing <- "C:\\Users\\angus\\OneDrive - UNBC\\Angus Ball\\Lab work\\Bioinformatics\\Kenzies Data\\physeq_Key.rds" #add your file location here
physeq_Key <- readRDS(file = filepathing)
Loading required package: phyloseq
Registered S3 method overwritten by 'data.table':
method from
print.data.table
Go read this massive scary as hell document, or at least skim the
interesting parts???? https://www.bioconductor.org/packages/devel/bioc/vignettes/ALDEx2/inst/doc/ALDEx2_vignette.html
Note Aldex the package doesn’t have an option for adding psuedocounts
during the clr transformation. Therefore we are going to apply the
psuedocount then run aldex without the transformation.
But HEY Angus you said the robust clr transformation was better! Why
are we using pseudocounts here? The answer is because all the papers
that I’ve read that compare differential abundance methodologies all use
a pseudocount. Therefore, the philosophy is using something that could
be worse than the best option, but has been proven better than the most
alternatives.
Now, there are many ways to add a pseudocount, but hopefully you’ve
come to appreciate my hamfisted approaches so…
physeq_psuedo <- physeq_Key #copy of physeq_Key
otu_table_pseudo <- otu_table(physeq_psuedo)
otu_table_pseudo<-as.data.frame(otu_table_pseudo)
otu_table_pseudo<-otu_table_pseudo +1 # literally justs adds one to every value in otu_table
# Then we need to take our sample data and convert it to a vector
sd = data.frame(sample_data(physeq_Key)) #takes the sample data out
vect <- as.vector(sd[,"Layer"]) #switch layer with whatever column you're interested in
Fun fact, the aldex package proper doesn’t use phyloseq objects, just
an honest to god OTU table, and the vector. BTW in our OTU table all the
species are called by OTU, i.e OTU1 OTU2 ect. you can change this here
and now on the otu table or be arsed with it later (what Ill show)
Time to run aldex
physeq.aldex<-aldex(otu_table_pseudo, # OTU table
vect, #sample data in a vector
mc.samples = 128, #how many monte carlo simulations to run, 128 is recommended
denom = "all", #which OTU's to use, i.e. all of them, but there may be cases where you don't want to use all of em.
verbose = TRUE, #tells you progress
test = "t", #t test because I'm looking at a binary comparision here, use "kw" if you have multiple comparisons
effect = TRUE) #determines if you want to calculate abundances and effect size
See https://www.bioconductor.org/packages/devel/bioc/vignettes/ALDEx2/inst/doc/ALDEx2_vignette.html#722_aldexttest(x)
for explanations on variable names
#extract taxa information
taxa <- tax_table(physeq_Key)
taxa <- as.data.frame(taxa)
aldex.treat.bound <- cbind(physeq.aldex, taxa)#bind taxa informaton to the aldex object
aldex_sig_treat <- aldex.treat.bound[ aldex.treat.bound$"wi.eBH" < 0.05, ] #extract significant values
aldex_sig_treat #there wasnt any signifigant values :( better luck next time!
Now we did this at a species level which typically means there are
just so many comparisons that nothing will ever be significant. use
taxa_glom to conglomerate all the taxa to different levels (i.e. genus
and class ect) and see if something pops up!
#e.g.
physeq_genus <- tax_glom(physeq_Key, taxrank="Genus")
PS. theres a method to do linear combinations of variable comparisons
but its beyond the scope of my time right now
LS0tDQp0aXRsZTogIkFsZGV4MiBkaWZmZXJlbnRpYWwgYWJ1bmRhbmNlIg0Kb3V0cHV0OiBodG1sX25vdGVib29rDQotLS0NCkJ5IEFuZ3VzIEJhbGwNCg0KDQpsaWJyYXJ5J3MNCmBgYHtyfQ0KbGlicmFyeShwaHlsb3NlcSkNCmxpYnJhcnkoQUxERXgyKQ0KbGlicmFyeShtaWNyb2Jpb21lTWFya2VyKQ0KYGBgDQoNCg0KRGlmZmVyZW50aWFsIGFidW5kYW5jZSB0aW1lIQ0KRmlyc3QgbG9hZCB0aGUgZGF0YQ0KYGBge3J9DQpmaWxlcGF0aGluZyA8LSAiQzpcXFVzZXJzXFxhbmd1c1xcT25lRHJpdmUgLSBVTkJDXFxBbmd1cyBCYWxsXFxMYWIgd29ya1xcQmlvaW5mb3JtYXRpY3NcXEtlbnppZXMgRGF0YVxccGh5c2VxX0tleS5yZHMiICNhZGQgeW91ciBmaWxlIGxvY2F0aW9uIGhlcmUNCnBoeXNlcV9LZXkgPC0gcmVhZFJEUyhmaWxlID0gZmlsZXBhdGhpbmcpDQpgYGANCg0KR28gcmVhZCB0aGlzIG1hc3NpdmUgc2NhcnkgYXMgaGVsbCBkb2N1bWVudCwgb3IgYXQgbGVhc3Qgc2tpbSB0aGUgaW50ZXJlc3RpbmcgcGFydHM/Pz8/DQpodHRwczovL3d3dy5iaW9jb25kdWN0b3Iub3JnL3BhY2thZ2VzL2RldmVsL2Jpb2MvdmlnbmV0dGVzL0FMREV4Mi9pbnN0L2RvYy9BTERFeDJfdmlnbmV0dGUuaHRtbA0KDQoNCg0KDQpOb3RlIEFsZGV4IHRoZSBwYWNrYWdlIGRvZXNuJ3QgaGF2ZSBhbiBvcHRpb24gZm9yIGFkZGluZyBwc3VlZG9jb3VudHMgZHVyaW5nIHRoZSBjbHIgdHJhbnNmb3JtYXRpb24uIFRoZXJlZm9yZSB3ZSBhcmUgZ29pbmcgdG8gYXBwbHkgdGhlIHBzdWVkb2NvdW50IHRoZW4gcnVuIGFsZGV4IHdpdGhvdXQgdGhlIHRyYW5zZm9ybWF0aW9uLg0KDQpCdXQgSEVZIEFuZ3VzIHlvdSBzYWlkIHRoZSByb2J1c3QgY2xyIHRyYW5zZm9ybWF0aW9uIHdhcyBiZXR0ZXIhIFdoeSBhcmUgd2UgdXNpbmcgcHNldWRvY291bnRzIGhlcmU/IFRoZSBhbnN3ZXIgaXMgYmVjYXVzZSBhbGwgdGhlIHBhcGVycyB0aGF0IEkndmUgcmVhZCB0aGF0IGNvbXBhcmUgZGlmZmVyZW50aWFsIGFidW5kYW5jZSBtZXRob2RvbG9naWVzIGFsbCB1c2UgYSBwc2V1ZG9jb3VudC4gVGhlcmVmb3JlLCB0aGUgcGhpbG9zb3BoeSBpcyB1c2luZyBzb21ldGhpbmcgdGhhdCBjb3VsZCBiZSB3b3JzZSB0aGFuIHRoZSBiZXN0IG9wdGlvbiwgYnV0IGhhcyBiZWVuIHByb3ZlbiBiZXR0ZXIgdGhhbiB0aGUgbW9zdCBhbHRlcm5hdGl2ZXMuIA0KDQoNCg0KTm93LCB0aGVyZSBhcmUgbWFueSB3YXlzIHRvIGFkZCBhIHBzZXVkb2NvdW50LCBidXQgaG9wZWZ1bGx5IHlvdSd2ZSBjb21lIHRvIGFwcHJlY2lhdGUgbXkgaGFtZmlzdGVkIGFwcHJvYWNoZXMgc28uLi4NCg0KYGBge3J9DQpwaHlzZXFfcHN1ZWRvIDwtIHBoeXNlcV9LZXkgI2NvcHkgb2YgcGh5c2VxX0tleQ0Kb3R1X3RhYmxlX3BzZXVkbyA8LSBvdHVfdGFibGUocGh5c2VxX3BzdWVkbykgI3N0ZWFscyB0aGUgT1RVIHRhYmxlIGZyb20gdGhlIHBoeWxvc2VxIG9iamVjdA0Kb3R1X3RhYmxlX3BzZXVkbzwtYXMuZGF0YS5mcmFtZShvdHVfdGFibGVfcHNldWRvKSAjY29udmVydHMgaXQgdG8gYSBkYXRhZnJhbWUNCm90dV90YWJsZV9wc2V1ZG88LW90dV90YWJsZV9wc2V1ZG8gKzEgIyBsaXRlcmFsbHkganVzdHMgYWRkcyBvbmUgdG8gZXZlcnkgdmFsdWUgaW4gb3R1X3RhYmxlDQoNCiMgVGhlbiB3ZSBuZWVkIHRvIHRha2Ugb3VyIHNhbXBsZSBkYXRhIGFuZCBjb252ZXJ0IGl0IHRvIGEgdmVjdG9yDQpzZCA9IGRhdGEuZnJhbWUoc2FtcGxlX2RhdGEocGh5c2VxX0tleSkpICN0YWtlcyB0aGUgc2FtcGxlIGRhdGEgb3V0DQp2ZWN0IDwtIGFzLnZlY3RvcihzZFssIkxheWVyIl0pICNzd2l0Y2ggbGF5ZXIgd2l0aCB3aGF0ZXZlciBjb2x1bW4geW91J3JlIGludGVyZXN0ZWQgaW4NCmBgYA0KRnVuIGZhY3QsIHRoZSBhbGRleCBwYWNrYWdlIHByb3BlciBkb2Vzbid0IHVzZSBwaHlsb3NlcSBvYmplY3RzLCBqdXN0IGFuIGhvbmVzdCB0byBnb2QgT1RVIHRhYmxlLCBhbmQgdGhlIHZlY3Rvci4gQlRXIGluIG91ciBPVFUgdGFibGUgYWxsIHRoZSBzcGVjaWVzIGFyZSBjYWxsZWQgYnkgT1RVLCBpLmUgT1RVMSBPVFUyIGVjdC4geW91IGNhbiBjaGFuZ2UgdGhpcyBoZXJlIGFuZCBub3cgb24gdGhlIG90dSB0YWJsZSBvciBiZSBhcnNlZCB3aXRoIGl0IGxhdGVyICh3aGF0IElsbCBzaG93KQ0KDQpUaW1lIHRvIHJ1biBhbGRleA0KDQoNCg0KDQpgYGB7cn0NCnBoeXNlcS5hbGRleDwtYWxkZXgob3R1X3RhYmxlX3BzZXVkbywgIyBPVFUgdGFibGUNCiAgICAgICAgICAgICAgICAgICAgdmVjdCwgI3NhbXBsZSBkYXRhIGluIGEgdmVjdG9yDQogICAgICAgICAgICAgICAgICAgIG1jLnNhbXBsZXMgPSAxMjgsICNob3cgbWFueSBtb250ZSBjYXJsbyBzaW11bGF0aW9ucyB0byBydW4sIDEyOCBpcyByZWNvbW1lbmRlZA0KICAgICAgICAgICAgICAgICAgICBkZW5vbSA9ICJhbGwiLCAjd2hpY2ggT1RVJ3MgdG8gdXNlLCBpLmUuIGFsbCBvZiB0aGVtLCBidXQgdGhlcmUgbWF5IGJlIGNhc2VzIHdoZXJlIHlvdSBkb24ndCB3YW50IHRvIHVzZSBhbGwgb2YgZW0uDQogICAgICAgICAgICAgICAgICAgIHZlcmJvc2UgPSBUUlVFLCAjdGVsbHMgeW91IHByb2dyZXNzDQogICAgICAgICAgICAgICAgICAgIHRlc3QgPSAidCIsICN0IHRlc3QgYmVjYXVzZSBJJ20gbG9va2luZyBhdCBhIGJpbmFyeSBjb21wYXJpc2lvbiBoZXJlLCB1c2UgImt3IiBpZiB5b3UgaGF2ZSBtdWx0aXBsZSBjb21wYXJpc29ucyANCiAgICAgICAgICAgICAgICAgICAgZWZmZWN0ID0gVFJVRSkgI2RldGVybWluZXMgaWYgeW91IHdhbnQgdG8gY2FsY3VsYXRlIGFidW5kYW5jZXMgYW5kIGVmZmVjdCBzaXplDQpgYGANCg0KU2VlIGh0dHBzOi8vd3d3LmJpb2NvbmR1Y3Rvci5vcmcvcGFja2FnZXMvZGV2ZWwvYmlvYy92aWduZXR0ZXMvQUxERXgyL2luc3QvZG9jL0FMREV4Ml92aWduZXR0ZS5odG1sIzcyMl9hbGRleHR0ZXN0KHgpDQoNCmZvciBleHBsYW5hdGlvbnMgb24gdmFyaWFibGUgbmFtZXMNCg0KYGBge3J9DQojZXh0cmFjdCB0YXhhIGluZm9ybWF0aW9uDQp0YXhhIDwtIHRheF90YWJsZShwaHlzZXFfS2V5KQ0KdGF4YSA8LSBhcy5kYXRhLmZyYW1lKHRheGEpDQoNCmFsZGV4LnRyZWF0LmJvdW5kIDwtIGNiaW5kKHBoeXNlcS5hbGRleCwgdGF4YSkjYmluZCB0YXhhIGluZm9ybWF0b24gdG8gdGhlIGFsZGV4IG9iamVjdA0KDQphbGRleF9zaWdfdHJlYXQgPC0gYWxkZXgudHJlYXQuYm91bmRbIGFsZGV4LnRyZWF0LmJvdW5kJCJ3aS5lQkgiIDwgMC4wNSwgXSAjZXh0cmFjdCBzaWduaWZpY2FudCB2YWx1ZXMNCmFsZGV4X3NpZ190cmVhdCAjdGhlcmUgd2FzbnQgYW55IHNpZ25pZmlnYW50IHZhbHVlcyA6KCAgYmV0dGVyIGx1Y2sgbmV4dCB0aW1lIQ0KYGBgDQoNCk5vdyB3ZSBkaWQgdGhpcyBhdCBhIHNwZWNpZXMgbGV2ZWwgd2hpY2ggdHlwaWNhbGx5IG1lYW5zIHRoZXJlIGFyZSBqdXN0IHNvIG1hbnkgY29tcGFyaXNvbnMgdGhhdCBub3RoaW5nIHdpbGwgZXZlciBiZSBzaWduaWZpY2FudC4gdXNlIHRheGFfZ2xvbSB0byBjb25nbG9tZXJhdGUgYWxsIHRoZSB0YXhhIHRvIGRpZmZlcmVudCBsZXZlbHMgKGkuZS4gZ2VudXMgYW5kIGNsYXNzIGVjdCkgYW5kIHNlZSBpZiBzb21ldGhpbmcgcG9wcyB1cCENCmBgYHtyfQ0KI2UuZy4NCnBoeXNlcV9nZW51cyA8LSB0YXhfZ2xvbShwaHlzZXFfS2V5LCB0YXhyYW5rPSJHZW51cyIpDQpgYGANCg0KDQoNClBTLiB0aGVyZXMgYSBtZXRob2QgdG8gZG8gbGluZWFyIGNvbWJpbmF0aW9ucyBvZiB2YXJpYWJsZSBjb21wYXJpc29ucyBidXQgaXRzIGJleW9uZCB0aGUgc2NvcGUgb2YgbXkgdGltZSByaWdodCBub3c=