Publications

Research work from our associates

Explore our latest, cutting-edge research fueling our AI-powered apps and partner projects.

Using word embeddings to learn a better food ontology

Food ontologies require significant effort to create and maintain as they involve manual and time-consuming tasks, often with limited alignment to the underlying food science knowledge. We propose a semi-supervised framework for the automated ontology population from an existing ontology scaffold by using word embeddings. Having applied this on the domain of food and subsequent evaluation against an expert-curated ontology, FoodOn, we observe that the food word embeddings capture the latent relationships and characteristics of foods. The resulting ontology, which utilizes word embeddings trained from the Wikipedia corpus, has an improvement of 89.7% in precision when compared to the expert-curated ontology FoodOn (0.34 vs. 0.18, respectively, p value = 2.6 × 10⁻¹³⁸), and it has a 43.6% shorter path distance (hops) between predicted and actual food instances (2.91 vs. 5.16, respectively, p value = 4.7 × 10⁻⁸⁴) when compared to other methods.
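
As a reading aid, the percentages quoted above are relative changes between the paired values in parentheses. The sketch below recomputes them from the rounded figures in the abstract; the helper name `relative_change` is our own and not taken from the paper. With the rounded precisions the gain comes out near 88.9% rather than the reported 89.7%, presumably because the published figure was computed from unrounded values (an assumption on our part).

```python
def relative_change(new: float, baseline: float) -> float:
    """Signed change of `new` relative to `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# Rounded values as quoted in the abstract.
precision_gain = relative_change(0.34, 0.18)    # precision: 0.34 vs. 0.18
path_reduction = -relative_change(2.91, 5.16)   # hops: 2.91 vs. 5.16 (shorter is better)

print(f"precision gain: {precision_gain:.1f}%")
print(f"path distance reduction: {path_reduction:.1f}%")
```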

Methane and fatty acid metabolism pathways are predictive of Low-FODMAP diet efficacy for patients with irritable bowel syndrome

Objective: Identification of microbiota-based biomarkers as predictors of low-FODMAP diet response and design of a diet recommendation strategy for IBS patients.
Design: We created a compendium of gut microbiome and disease severity data before and after a low-FODMAP diet treatment from published studies, followed by unified data processing, statistical analysis and predictive modeling. We employed data-driven methods that rely solely on the compendium data, as well as hypothesis-driven methods that focus on methane and short-chain fatty acid (SCFA) metabolism pathways that were implicated in the disease etiology.
Results: The patient’s response to a low-FODMAP diet was predictable using their pre-diet fecal samples with F1 accuracy…

An early prediction model for canine chronic kidney disease based on routine clinical laboratory tests

The aim of this study was to derive a model to predict the risk of dogs developing chronic kidney disease (CKD) using data from electronic health records (EHR) collected during routine veterinary practice. Data from 57,402 dogs were included in the study. Two-thirds of the EHRs were used to build the model, which included feature selection and identification of the optimal neural network type and architecture. The remaining unseen EHRs were used to evaluate model performance. The final model was a recurrent neural network with 6 features (creatinine, blood urea nitrogen, urine specific gravity, urine protein, weight, age). Identifying CKD at the time of diagnosis, the model displayed a sensitivity of 91.4% and a specificity of 97.2%. When predicting future risk of CKD, model sensitivity was 68.8% at 1 year and 44.8% at 2 years before diagnosis. Positive predictive value (PPV)…
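
For readers less familiar with the metrics named above, the following minimal sketch shows how sensitivity, specificity and PPV are derived from a confusion matrix. The counts are hypothetical toy numbers chosen for easy arithmetic; they are not data from the study, and the class name `ConfusionCounts` is our own.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # true positives: CKD cases the model flags
    fp: int  # false positives: healthy dogs the model flags
    tn: int  # true negatives: healthy dogs the model clears
    fn: int  # false negatives: CKD cases the model misses

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)


# Hypothetical counts, NOT taken from the publication.
toy = ConfusionCounts(tp=90, fp=50, tn=950, fn=10)
sens, spec, ppv = toy.sensitivity(), toy.specificity(), toy.ppv()
print(f"sensitivity={sens:.3f} specificity={spec:.3f} ppv={ppv:.3f}")
```

Note that PPV, unlike sensitivity and specificity, depends on how common the disease is in the evaluated population, so it cannot be inferred from the other two metrics alone.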

Chardonnay marc as a new model for upcycled co-products in the food industry: concentration of diverse natural products chemistry for consumer health and sensory benefits

Research continues to provide compelling insights into potential health benefits associated with diets rich in plant-based natural products (PBNPs). Coupled with evidence from dietary intervention trials, dietary recommendations increasingly include higher intakes of PBNPs. In addition to health benefits, PBNPs can drive flavor and sensory perceptions in foods and beverages. Chardonnay marc (pomace) is a byproduct of winemaking obtained after fruit pressing that has not undergone fermentation. Recent research has revealed that PBNP diversity within Chardonnay marc has potential relevance to human health and desirable sensory attributes in food and beverage products. This review explores the potential of Chardonnay marc as a valuable new PBNP…

Knowledge integration and decision support for accelerated discovery of antibiotic resistance genes

We present a machine learning framework to automate knowledge discovery through knowledge graph construction, inconsistency resolution, and iterative link prediction. By incorporating knowledge from 10 publicly available sources, we construct an Escherichia coli antibiotic resistance knowledge graph with 651,758 triples from 23 triple types after resolving 236 sets of inconsistencies. Iteratively applying link prediction to this graph and wet-lab validation of the generated hypotheses reveal 15 antibiotic-resistant E. coli genes, 6 of which had never been associated with antibiotic resistance for any microbe. Iterative link prediction leads to a performance improvement and more findings. The probability of positive findings highly correlates with experimentally validated findings (R² = 0.94). We also identify 5 homologs in Salmonella enterica that are all validated to confer…
