Patrick M. Gillevet, Ezzat Dadkhah, Masoumeh Sikaroodi, Swati Dalmet, Louis Y. Korman, Robert Hardi, Jeffrey H. Baybick, David K. Hanzel, Greg Kuehn, Thomas Kuehn

Background: Alterations in the colonic microbiota are likely to play a role in colonic neoplasia. Characterizing these complex changes could provide a non-invasive biomarker to screen patients for pre-malignant lesions. A prospective study of patients undergoing screening or surveillance colonoscopy was performed to determine whether a unique microbiome pattern could be used to identify patients with colorectal polyps (hyperplastic polyps and adenomas).

Methods: Home-collected stool samples (HS), rectal swabs (SS), and sigmoid biopsies (BS) were obtained from 232 subjects. The cohort consisted of 87 females and 131 males, of whom 193 were Caucasian, 21 African American, and 4 Asian American. The microbiome in these samples was interrogated using Multitag™ sequencing on an Ion Torrent PGM Next Generation Sequencing System. We analyzed 504 samples (168 HS, 211 SS, 125 BS) that met depth-of-coverage criteria based on rarefaction analysis, using three different operational taxonomic unit (OTU) clustering techniques (UPARSE, UPGMA, and UCLUST). Nonparametric statistical methods (Metastats, LEfSe, Kruskal-Wallis, Indicator) were used to identify the taxa that differed significantly between subjects with polyps and subjects without polyps. These informative OTUs were then used to build classifiers predicting the presence of polyps using Naïve Bayes, Support Vector Machine (SVM), and Neural Network models. The predictive power of the classifiers was highest when the informative UPARSE OTUs from the HS samples were used in the model.
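The depth-of-coverage filter mentioned above can be illustrated with a minimal rarefaction sketch. This is not the authors' pipeline: the function names (`rarefy`, `rarefaction_curve`), the OTU labels, and the read counts are all hypothetical, and the abstract does not state the depth threshold that was applied. The sketch subsamples reads without replacement to a fixed depth and reports how many OTUs are observed at each depth; a sample with fewer reads than the requested depth is flagged as too shallow.

```python
import random
from collections import Counter


def rarefy(otu_counts, depth, seed=0):
    """Subsample `depth` reads without replacement from one sample.

    `otu_counts` maps a (hypothetical) OTU label to its read count.
    Returns a Counter of subsampled reads per OTU, or None when the
    sample has fewer than `depth` reads and so fails the depth filter.
    """
    # Expand the count table into one entry per read.
    reads = [otu for otu, n in otu_counts.items() for _ in range(n)]
    if len(reads) < depth:
        return None
    rng = random.Random(seed)
    return Counter(rng.sample(reads, depth))


def rarefaction_curve(otu_counts, depths, seed=0):
    """Return (depth, observed OTU richness) pairs; richness is None
    for depths the sample cannot support."""
    curve = []
    for depth in depths:
        sub = rarefy(otu_counts, depth, seed)
        curve.append((depth, None if sub is None else len(sub)))
    return curve


if __name__ == "__main__":
    # Made-up counts for illustration only, not study data.
    sample = {"OTU_1": 50, "OTU_2": 30, "OTU_3": 20}
    print(rarefaction_curve(sample, [10, 50, 100, 200]))
```

A curve that flattens as depth increases suggests that additional sequencing would reveal few new OTUs, which is the usual rationale for choosing a minimum depth.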

Results: Polyps were found in 59% of the colonoscopies; one or more adenomas were found in 42% and advanced adenomas in 10%. Exploratory microbiome modeling and analysis yielded classification accuracies of 79% for the Naïve Bayes model, 68% for the SVM, and 81% for the Neural Network. A naïve holdout analysis of these models gave an average false positive rate (FPR) of 12% and an average false negative rate (FNR) of 11.5% across all models. Furthermore, only 7 of the 19 false negatives were misclassified by all three models and, of these, only two were advanced adenomas.

Conclusion: These results indicate that microbiome analysis combined with advanced machine learning methods could be used as a biomarker to screen patients for polyps, to optimize the use of colonoscopy, and to reduce the associated health care costs. Additional studies with larger populations are needed to improve model performance and clinical utility and to characterize a colon neoplasia microbiota model.
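The false positive rate, the false negative rate, and the count of false negatives shared by all models can be computed as in the following sketch. It assumes binary labels (1 = polyp present, 0 = no polyp); the function names and the toy predictions are hypothetical and are not the study's data or code.

```python
def error_rates(y_true, y_pred):
    """Return (FPR, FNR) for binary labels, where 1 = polyp present.

    FPR = FP / (FP + TN); FNR = FN / (FN + TP).
    """
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    if fp + tn == 0 or fn + tp == 0:
        raise ValueError("need at least one positive and one negative subject")
    return fp / (fp + tn), fn / (fn + tp)


def consensus_false_negatives(y_true, predictions_by_model):
    """Indices of subjects with a polyp that every model predicted as negative."""
    return [
        i
        for i, truth in enumerate(y_true)
        if truth == 1
        and all(preds[i] == 0 for preds in predictions_by_model.values())
    ]


if __name__ == "__main__":
    # Toy holdout labels and predictions, invented for illustration only.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    predictions = {
        "naive_bayes": [1, 0, 1, 0, 1, 0, 0, 0],
        "svm": [1, 0, 0, 0, 0, 0, 1, 0],
        "neural_network": [1, 0, 1, 0, 0, 1, 1, 0],
    }
    for name, preds in predictions.items():
        print(name, error_rates(y_true, preds))
    print("missed by all models:", consensus_false_negatives(y_true, predictions))
```

Counting only the cases missed by every model indicates how many false negatives an ensemble that flags a subject when any model predicts a polyp would still miss.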

Gastroenterology, Volume 152, Issue 5, Supplement 1, Page S152
DOI: https://doi.org/10.1016/S0016-5085(17)30830-2