27 References

1. Nyholm L, Koziol A, Marcos S, Botnen AB, Aizpurua O, Gopalakrishnan S, et al. Holo-omics: Integrated host-microbiota multi-omics for basic and applied biological research. iScience. 2020;23:101414.

2. Limborg MT, Alberdi A, Kodama M, Roggenbuck M, Kristiansen K, Gilbert MTP. Applied hologenomics: Feasibility and potential in aquaculture. Trends Biotechnol. 2018;36:252–64.

3. Theis KR, Dheilly NM, Klassen JL, Brucker RM, Baines JF, Bosch TCG, et al. Getting the hologenome concept right: An eco-evolutionary framework for hosts and their microbiomes. mSystems. 2016;1.

4. Rosenberg E, Zilber-Rosenberg I. The hologenome concept: Human, animal and plant microbiota. Springer, Cham; 2013.

5. Fischer CN, Trautman EP, Crawford JM, Stabb EV, Handelsman J, Broderick NA. Metabolite exchange between microbiome members produces compounds that influence Drosophila behavior. eLife. 2017;6.

6. Wu H-J, Wu E. The role of gut microbiota in immune homeostasis and autoimmunity. Gut Microbes. 2012;3:4–14.

7. Alberdi A, Andersen SB, Limborg MT, Dunn RR, Gilbert MTP. Disentangling host-microbiota complexity through hologenomics. Nat Rev Genet. 2022;23:281–97.

8. Mushegian AA, Arbore R, Walser J-C, Ebert D. Environmental sources of bacteria and genetic variation in behavior influence host-associated microbiota. Appl Environ Microbiol. 2019;85.

9. Probandt D, Eickhorst T, Ellrott A, Amann R, Knittel K. Microbial life on a sand grain: From bulk sediment to single grains. ISME J. 2018;12:623–33.

10. Yan W, Sun C, Zheng J, Wen C, Ji C, Zhang D, et al. Efficacy of fecal sampling as a gut proxy in the study of chicken gut microbiota. Front Microbiol. 2019;10:2126.

11. Fofanov VY, Furstenau TN, Sanchez D, Hepp CM, Cocking J, Sobek C, et al. Guano exposed: Impact of aerobic conditions on bat fecal microbiota. Ecol Evol. 2018;8:5563–74.

12. Ji BW, Sheth RU, Dixit PD, Huang Y, Kaufman A, Wang HH, et al. Quantifying spatiotemporal variability and noise in absolute microbiota abundances using replicate sampling. Nat Methods. 2019;16:731–6.

13. Griffin TW, Baer JG, Ward JE. Direct comparison of fecal and gut microbiota in the blue mussel (Mytilus edulis) discourages fecal sampling as a proxy for resident gut community. Microb Ecol. 2021;81:180–92.

14. Bjerre RD, Hugerth LW, Boulund F, Seifert M, Johansen JD, Engstrand L. Effects of sampling strategy and DNA extraction on human skin microbiome investigations. Sci Rep. 2019;9:17287.

15. Pérez-Losada M, Crandall KA, Freishtat RJ. Two sampling methods yield distinct microbial signatures in the nasopharynges of asthmatic children. Microbiome. 2016;4:25.

16. De Spiegeleer M, De Graeve M, Huysman S, Vanderbeke A, Van Meulebroek L, Vanhaecke L. Impact of storage conditions on the human stool metabolome and lipidome: Preserving the most accurate fingerprint. Anal Chim Acta. 2020;1108:79–88.

17. van Eijsden RGE, Stassen C, Daenen L, Van Mulders SE, Bapat PM, Siewers V, et al. A universal fixation method based on quaternary ammonium salts (RNAlater) for omics-technologies: Saccharomyces cerevisiae as a case study. Biotechnol Lett. 2013;35:891–900.

18. Schweighardt AJ, Tate CM, Scott KA, Harper KA, Robertson JM. Evaluation of commercial kits for dual extraction of DNA and RNA from human body fluids. J Forensic Sci. 2015;60:157–65.

19. Wang Z, Zolnik CP, Qiu Y, Usyk M, Wang T, Strickler HD, et al. Comparison of fecal collection methods for microbiome and metabolomics studies. Front Cell Infect Microbiol. 2018;8:301.

20. Barra GB, Santa Rita TH, de Almeida Vasques J, Chianca CF, Nery LFA, Santana Soares Costa S. EDTA-mediated inhibition of DNases protects circulating cell-free DNA from ex vivo degradation in blood samples. Clin Biochem. 2015;48:976–81.

21. Weidner L, Laner-Plamberger S, Horner D, Pistorius C, Jurkin J, Karbiener M, et al. Sample buffer containing guanidine-hydrochloride combines biological safety and RNA preservation for SARS-CoV-2 molecular diagnostics. Diagnostics (Basel). 2022;12.

22. Simões AES, Pereira DM, Amaral JD, Nunes AF, Gomes SE, Rodrigues PM, et al. Efficient recovery of proteins from multiple source samples after TRIzol or TRIzol LS RNA extraction and long-term storage. BMC Genomics. 2013;14:1–15.

23. Ryan BJ, Henehan GT. Avoiding proteolysis during protein purification. Methods Mol Biol. 2017;1485:53–69.

24. Bolt Botnen A, Bjørnsen MB, Alberdi A, Gilbert MTP, Aizpurua O. A simplified protocol for DNA extraction from FTA cards for faecal microbiome studies. Heliyon. 2023;e12861.

25. Straughen JK, Sitarik AR, Jones AD, Li J, Allo G, Salafia C, et al. Comparison of methanol fixation versus cryopreservation of the placenta for metabolomics analysis. Sci Rep. 2023;13:4063.

26. Hedges JB, Vahidi S, Yue X, Konermann L. Effects of ammonium bicarbonate on the electrospray mass spectra of proteins: Evidence for bubble-induced unfolding. Anal Chem. 2013;85:6469–76.

27. Cuthbertson L, Rogers GB, Walker AW, Oliver A, Hoffman LR, Carroll MP, et al. Implications of multiple freeze-thawing on respiratory samples for culture-independent analyses. J Cyst Fibros. 2015;14:464–7.

28. Zhang B, Brock M, Arana C, Dende C, van Oers NS, Hooper LV, et al. Impact of bead-beating intensity on the genus- and species-level characterization of the gut microbiome using amplicon and complete 16S rRNA gene sequencing. Front Cell Infect Microbiol. 2021;11:678522.

29. Fiedorová K, Radvanský M, Němcová E, Grombiříková H, Bosák J, Černochová M, et al. The impact of DNA extraction methods on stool bacterial and fungal microbiota community recovery. Front Microbiol. 2019;10:821.

30. Jones MB, Highlander SK, Anderson EL, Li W, Dayrit M, Klitgord N, et al. Library preparation methodology can influence genomic and functional predictions in human microbiome research. Proc Natl Acad Sci U S A. 2015;112:14024–9.

31. Kircher M, Sawyer S, Meyer M. Double indexing overcomes inaccuracies in multiplex sequencing on the Illumina platform. Nucleic Acids Res. 2012;40:e3.

32. Kivioja T, Vähärautio A, Karlsson K, Bonke M, Enge M, Linnarsson S, et al. Counting absolute numbers of molecules using unique molecular identifiers. Nat Methods. 2011;9:72–4.

33. Schroeder A, Mueller O, Stocker S, Salowsky R, Leiber M, Gassmann M, et al. The RIN: An RNA integrity number for assigning integrity values to RNA measurements. BMC Mol Biol. 2006;7:3.

34. Huang Y, Sheth RU, Kaufman A, Wang HH. Scalable and cost-effective ribonuclease-based rRNA depletion for transcriptomics. Nucleic Acids Res. 2020;48:e20.

35. Gu W, Crawford ED, O’Donovan BD, Wilson MR, Chow ED, Retallack H, et al. Depletion of abundant sequences by hybridization (DASH): Using Cas9 to remove unwanted high-abundance species in sequencing libraries and molecular counting applications. Genome Biol. 2016;17:41.

36. Kraus AJ, Brink BG, Siegel TN. Efficient and specific oligo-based depletion of rRNA. Sci Rep. 2019;9:12281.

37. Prezza G, Heckel T, Dietrich S, Homberger C, Westermann AJ, Vogel J. Improved bacterial RNA-seq by Cas9-based depletion of ribosomal RNA reads. RNA. 2020;26:1069–78.

38. Rhie A, McCarthy SA, Fedrigo O, Damas J, Formenti G, Koren S, et al. Towards complete and error-free genome assemblies of all vertebrate species. Nature. 2021;592:737–46.

39. Wang H, Liu B, Zhang Y, Jiang F, Ren Y, Yin L, et al. Estimation of genome size using k-mer frequencies from corrected long reads. 2020.

40. Ranallo-Benavidez TR, Jaron KS, Schatz MC. GenomeScope 2.0 and smudgeplot for reference-free profiling of polyploid genomes. Nat Commun. 2020;11:1432.

41. Cheng H, Concepcion GT, Feng X, Zhang H, Li H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat Methods. 2021;18:170–5.

42. Simão FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM. BUSCO: Assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics. 2015;31:3210–2.

43. Rhie A, Walenz BP, Koren S, Phillippy AM. Merqury: Reference-free quality, completeness, and phasing assessment for genome assemblies. Genome Biol. 2020;21:245.

44. Tanizawa Y, Fujisawa T, Nakamura Y. DFAST: A flexible prokaryotic genome annotation pipeline for faster genome publication. Bioinformatics. 2017;34:1037–9.

45. Shaffer M, Borton MA, McGivern BB, Zayed AA, La Rosa SL, Solden LM, et al. DRAM for distilling microbial metabolism to automate the curation of microbiome function. Nucleic Acids Res. 2020;48:8883–900.

46. Uritskiy GV, DiRuggiero J, Taylor J. MetaWRAP—a flexible pipeline for genome-resolved metagenomic data analysis. Microbiome. 2018;6:1–13.

47. Sieber CMK, Probst AJ, Sharrar A, Thomas BC, Hess M, Tringe SG, et al. Recovery of genomes from metagenomes via a dereplication, aggregation and scoring strategy. Nat Microbiol. 2018;3:836–43.

48. Bowers RM, Kyrpides NC, Stepanauskas R, Harmon-Smith M, Doud D, Reddy TBK, et al. Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea. Nat Biotechnol. 2017;35:725–31.

49. Eren AM, Esen ÖC, Quince C, Vineis JH, Morrison HG, Sogin ML, et al. Anvi’o: An advanced analysis and visualization platform for ’omics data. PeerJ. 2015;3:e1319.

50. Orakov A, Fullam A, Coelho LP, Khedkar S, Szklarczyk D, Mende DR, et al. GUNC: Detection of chimerism and contamination in prokaryotic genomes. Genome Biol. 2021;22:178.

51. Evans JT, Denef VJ. To dereplicate or not to dereplicate? mSphere. 2020;5.

52. Olm MR, Brown CT, Brooks B, Banfield JF. dRep: A tool for fast and accurate genomic comparisons that enables improved genome recovery from metagenomes through de-replication. ISME J. 2017;11:2864–8.

53. Parks DH, Chuvochina M, Rinke C, Mussig AJ, Chaumeil P-A, Hugenholtz P. GTDB: An ongoing census of bacterial and archaeal diversity through a phylogenetically consistent, rank normalized and complete genome-based taxonomy. Nucleic Acids Res. 2022;50:D785–94.

54. Chaumeil P-A, Mussig AJ, Hugenholtz P, Parks DH. GTDB-Tk v2: Memory friendly classification with the genome taxonomy database. Bioinformatics. 2022;38:5315–6.

55. Lee S-G, Na D, Park C. Comparability of reference-based and reference-free transcriptome analysis approaches at the gene expression level. BMC Bioinformatics. 2021;22 Suppl 11:310.

56. R Development Core Team. R: A language and environment for statistical computing. Vienna, Austria; 2008.

57. Ritchie MD, Holzinger ER, Li R, Pendergrass SA, Kim D. Methods of integrating data to uncover genotype–phenotype interactions. Nat Rev Genet. 2015;16:85–97.

58. Reel PS, Reel S, Pearson E, Trucco E, Jefferson E. Using machine learning approaches for multi-omics data analysis: A review. Biotechnol Adv. 2021;49:107739.

59. Lock EF, Hoadley KA, Marron JS, Nobel AB. Joint and individual variation explained (JIVE) for integrated analysis of multiple data types. Ann Appl Stat. 2013;7:523–42.

60. Ray P, Zheng L, Lucas J, Carin L. Bayesian joint analysis of heterogeneous genomics data. Bioinformatics. 2014;30:1370–6.

61. Shen R, Olshen AB, Ladanyi M. Integrative clustering of multiple genomic data types using a joint latent variable model with application to breast and lung cancer subtype analysis. Bioinformatics. 2009;25:2906–12.

62. Meng C, Helm D, Frejno M, Kuster B. moCluster: Identifying joint patterns across multiple omics data sets. J Proteome Res. 2016;15:755–65.

63. Wu D, Wang D, Zhang MQ, Gu J. Fast dimension reduction and integrative clustering of multi-omics data using low-rank approximation: Application to cancer molecular classification. BMC Genomics. 2015;16:1022.

64. Mo Q, Shen R, Guo C, Vannucci M, Chan KS, Hilsenbeck SG. A fully Bayesian latent variable model for integrative clustering analysis of multi-type omics data. Biostatistics. 2018;19:71–86.

65. Argelaguet R, Velten B, Arnol D, Dietrich S, Zenz T, Marioni JC, et al. Multi-Omics factor analysis—a framework for unsupervised integration of multi-omics data sets. Mol Syst Biol. 2018;14:e8124.

66. Acharjee A, Kloosterman B, Visser RGF, Maliepaard C. Integration of multi-omics data for prediction of phenotypic traits using random forest. BMC Bioinformatics. 2016;17 Suppl 5:180.

67. Li S, Chen X, Liu X, Yu Y, Pan H, Haak R, et al. Complex integrated analysis of lncRNAs-miRNAs-mRNAs in oral squamous cell carcinoma. Oral Oncol. 2017;73:1–9.

68. Lee G, Bang L, Kim SY, Kim D, Sohn K-A. Identifying subtype-specific associations between gene expression and DNA methylation profiles in breast cancer. BMC Med Genomics. 2017;10 Suppl 1:28.

69. Zhang L, Lv C, Jin Y, Cheng G, Fu Y, Yuan D, et al. Deep learning-based multi-omics data integration reveals two prognostic subtypes in high-risk neuroblastoma. Front Genet. 2018;9:477.

70. Yan KK, Zhao H, Pang H. A comparison of graph- and kernel-based -omics data integration algorithms for classifying complex traits. BMC Bioinformatics. 2017;18.

71. Speicher NK, Pfeifer N. Integrating different data types by regularized unsupervised multiple kernel learning with application to cancer subtype discovery. Bioinformatics. 2015;31:i268–75.

72. Tepeli YI, Ünal AB, Akdemir FM, Tastan O. PAMOGK: A pathway graph kernel-based multiomics approach for patient clustering. Bioinformatics. 2021;36:5237–46.

73. Kim S, Jhong J-H, Lee J, Koo J-Y. Erratum to: Meta-analytic support vector machine for integrating multiple omics data. BioData Min. 2017;10:8.

74. Rappoport N, Shamir R. NEMO: Cancer subtyping by integration of partial multi-omic data. Bioinformatics. 2019;35:3348–56.

75. Lanckriet GRG, De Bie T, Cristianini N, Jordan MI, Noble WS. A statistical framework for genomic data fusion. Bioinformatics. 2004;20:2626–35.

76. Seoane JA, Day INM, Gaunt TR, Campbell C. A pathway-based data integration framework for prediction of disease progression. Bioinformatics. 2014;30:838–45.

77. Tipping ME. Sparse Bayesian learning and the relevance vector machine. 2001.

78. Wu C-C, Asgharzadeh S, Triche TJ, D’Argenio DZ. Prediction of human functional genetic networks from heterogeneous data using RVM-based ensemble learning. Bioinformatics. 2010;26:807–13.

79. Kim D, Joung J-G, Sohn K-A, Shin H, Park YR, Ritchie MD, et al. Knowledge boosting: A graph-based integration approach with multi-omics data and genomic knowledge for cancer clinical outcome prediction. J Am Med Inform Assoc. 2015;22:109–20.

80. Shin H, Hill NJ, Lisewski AM, Park J-S. Graph sharpening. Expert Syst Appl. 2010;37:7870–9.

81. Mostafavi S, Morris Q. Fast integration of heterogeneous data sources for predicting gene function with limited annotation. Bioinformatics. 2010;26:1759–65.

82. Wang T, Shao W, Huang Z, Tang H, Zhang J, Ding Z, et al. MOGONET integrates multi-omics data using graph convolutional networks allowing patient classification and biomarker identification. Nat Commun. 2021;12:3445.

83. Hristoskova A, Boeva V, Tsiporkova E. A formal concept analysis approach to consensus clustering of multi-experiment expression data. BMC Bioinformatics. 2014;15:151.

84. Lock EF, Dunson DB. Bayesian consensus clustering. Bioinformatics. 2013;29:2610–6.

85. Nguyen H, Shrestha S, Draghici S, Nguyen T. PINSPlus: A tool for tumor subtype discovery in integrated genomic data. Bioinformatics. 2019;35:2843–6.

86. Bonnet E, Calzone L, Michoel T. Integrative multi-omics module network inference with Lemon-Tree. PLoS Comput Biol. 2015;11:e1003983.

87. Wang B, Mezlini AM, Demir F, Fiume M, Tu Z, Brudno M, et al. Similarity network fusion for aggregating data types on a genomic scale. Nat Methods. 2014;11:333–7.

88. Drăghici S, Potter RB. Predicting HIV drug resistance with neural networks. Bioinformatics. 2003;19:98–107.

89. Bavafaye Haghighi E, Knudsen M, Elmedal Laursen B, Besenbacher S. Hierarchical classification of cancers of unknown primary using multi-omics data. Cancer Inform. 2019;18:1176935119872163.

90. Ma A, McDermaid A, Xu J, Chang Y, Ma Q. Integrative methods and practical challenges for single-cell multi-omics. Trends Biotechnol. 2020;38:1007–22.

91. Poirion OB, Chaudhary K, Huang S, Garmire LX. Multi-omics-based pan-cancer prognosis prediction using an ensemble of deep-learning and machine-learning models. medRxiv. 2020.

92. Holzinger ER, Dudek SM, Frase AT, Pendergrass SA, Ritchie MD. ATHENA: The analysis tool for heritable and environmental network associations. Bioinformatics. 2014;30:698–705.

93. Tan K, Huang W, Hu J, Dong S. A multi-omics supervised autoencoder for pan-cancer clinical outcome endpoints prediction. BMC Med Inform Decis Mak. 2020;20 Suppl 3:129.