The noble beginnings and proudest moments











Designed for scientific discovery. Trusted by R&D leaders.

PIPA conducts research and development in the fields of bioinformatics, machine learning, and artificial intelligence. It also provides IT integration, software design, and software development services.

Our technology and infrastructure are the backbone of our core applications and services. Through a unique combination of Machine Learning, Engineering, and Bioinformatics, PIPA clients are able to automate cumbersome, costly workflows, de-risk decision-making and unlock novel insights that will enable the next generation of scientific breakthroughs across Nutrition, Food, Ingredients and Health.

PIPA's top-tier team of engineers and scientists has honed methods that enable transformative discoveries for the R&D communities we serve through our AI solutions: LEAP™, Ingredient Profiler, and OES.

Machine Learning & AI

Our Data Science & Machine Learning team leverages best-in-class public and in-house frameworks to harness Big Data and generate new knowledge through AI techniques. Integrative Knowledge Graphs, powerful search capabilities, and actionable dashboards give our clients a distinct advantage: they tap into exclusive, AI-generated insights to make transformative advancements during early research and discovery.

At PIPA, we address business and R&D problems by analyzing complex, often heterogeneous data. To do so, we leverage a range of machine learning algorithms, from traditional statistical models to state-of-the-art deep neural networks. Our toolset includes gradient boosting and ensemble classifiers; language models, used within a powerful NLP engine capable of automated information extraction; and representation learning approaches with deep embedding models applied to words, sentences, graph nodes, and graph edges. We understand that our analysis drives high-stakes decision-making, and we invest significantly in the interpretability and explainability of insights using meta-learning techniques.

Our vision is a multidimensional understanding of the data that combines algorithmic approaches and mathematical models with knowledge from domain experts to extract valuable, interpretable insights that our clients trust.

Stacked recurrent neural networks
Transfer learning
Ensemble methods
Classification systems
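
To illustrate the ensemble idea listed above, here is a minimal Python sketch of soft voting: several classifiers each return class probabilities, and the ensemble picks the label with the highest summed probability. It is an illustration only, not PIPA's implementation. The three toy models, their thresholds, and the labels `relevant` / `irrelevant` are hypothetical.

```python
from typing import Callable, Dict, List, Sequence

# A classifier maps a feature vector to per-label probabilities.
Classifier = Callable[[Sequence[float]], Dict[str, float]]


def soft_vote(classifiers: List[Classifier], features: Sequence[float]) -> str:
    """Sum each label's probability across classifiers and return the top label."""
    totals: Dict[str, float] = {}
    for clf in classifiers:
        for label, prob in clf(features).items():
            totals[label] = totals.get(label, 0.0) + prob
    return max(totals, key=lambda label: totals[label])


def _binary(p_relevant: float) -> Dict[str, float]:
    """Helper: build a two-label probability dictionary."""
    return {"relevant": p_relevant, "irrelevant": 1.0 - p_relevant}


# Hypothetical stand-ins for trained models (all thresholds are made up).
def model_a(x: Sequence[float]) -> Dict[str, float]:
    return _binary(0.9 if x[0] > 0.5 else 0.2)


def model_b(x: Sequence[float]) -> Dict[str, float]:
    return _binary(0.8 if x[1] > 0.5 else 0.4)


def model_c(x: Sequence[float]) -> Dict[str, float]:
    return _binary(0.7 if x[0] + x[1] > 1.0 else 0.3)


MODELS: List[Classifier] = [model_a, model_b, model_c]

if __name__ == "__main__":
    print(soft_vote(MODELS, (0.6, 0.1)))
```

In this toy setup one confident model can outweigh two mildly dissenting ones, which is the practical difference between soft voting and a simple majority vote.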

In recent years, research on automatically extracting information through knowledge graphs (KGs) has grown rapidly, a trend partially driven by the successful use of proprietary graphs by companies like Google and Facebook and by the recent availability of high-quality, manually curated knowledge bases (KBs). At PIPA, we realized that automated mining of biomedical KGs has the potential to significantly accelerate research and discovery across nutrition, biology, and drug discovery. PIPA develops state-of-the-art graph learning algorithms capable of reasoning on KGs and extracting relevant features from them.

Our goal is to supply our clients with accurate, actionable information and with the novel insights that emerge when AI is used to query the integrated knowledge bases.

Knowledge graph embeddings
Model-based reasoning
Link prediction
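
As a rough illustration of how knowledge graph embeddings can support link prediction, the Python sketch below scores candidate triples with a TransE-style function, where a triple (head, relation, tail) is plausible when head + relation lies close to tail in the embedding space. TransE is chosen here only because it is a well-known, simple model; the source does not say which models PIPA uses. The entity names, the `linked_to` relation, and the hand-set 2-D vectors are all hypothetical; real embeddings are learned from data.

```python
import math
from typing import Dict, List, Tuple

Vector = Tuple[float, float]

# Hypothetical toy embeddings, set by hand for illustration only.
ENTITY_VECS: Dict[str, Vector] = {
    "ingredient_a": (0.9, 0.1),
    "benefit_x": (1.0, 1.1),
    "ingredient_b": (-0.8, 0.2),
    "benefit_y": (-0.7, 1.2),
}
RELATION_VECS: Dict[str, Vector] = {"linked_to": (0.1, 1.0)}


def transe_score(head: str, relation: str, tail: str) -> float:
    """Return -||h + r - t|| (Euclidean); values closer to 0 are more plausible."""
    h = ENTITY_VECS[head]
    r = RELATION_VECS[relation]
    t = ENTITY_VECS[tail]
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def predict_tails(head: str, relation: str, k: int = 2) -> List[str]:
    """Rank every other entity as a candidate tail and return the top k."""
    candidates = [e for e in ENTITY_VECS if e != head]
    candidates.sort(key=lambda t: transe_score(head, relation, t), reverse=True)
    return candidates[:k]


if __name__ == "__main__":
    print(predict_tails("ingredient_a", "linked_to", k=1))
    print(predict_tails("ingredient_b", "linked_to", k=1))
```

Ranking all candidate tails for a given head and relation is the standard way a link prediction query is posed against an embedding model; a production system would also filter out triples already present in the graph.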

PIPA processes big, complex, heterogeneous data efficiently across the five V’s of big data: volume, velocity, variety, veracity, and value. Our natural language processing pipelines analyze large volumes of documents in a short period of time. Efficiently expanding and integrating novel data is a challenge that cannot be addressed simply by scaling infrastructure horizontally or vertically.

We invest significant effort in identifying the combination of algorithms and resources that scales efficiently without sacrificing accuracy. Probabilistic approaches and hashing schemes are integrated into our engines to significantly reduce the computational complexity of our solutions. Candidate objects of interest are quickly identified and then analyzed in depth by more complex pipelines. Furthermore, we have developed algorithmic mechanisms to identify and handle noisy and potentially erroneous data. Our fault-tolerant, distributed infrastructure design allows us to continuously and accurately analyze vast volumes of data in an ever-changing, messy, and exciting real-world environment.

Text mining
Natural Language Processing (Relation Extraction, Named Entity Recognition)
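
The "cheap filter first, expensive analysis second" pattern described above can be sketched with MinHash, a probabilistic hashing scheme that estimates how much two documents overlap without comparing them in full. The source does not name the specific schemes PIPA uses, so MinHash is an assumption made for illustration, as are the function names, the 3-word shingles, and the 64-hash signature size.

```python
import hashlib
from typing import List, Set


def shingles(text: str, k: int = 3) -> Set[str]:
    """Split text into overlapping k-word shingles."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}


def _hash(seed: int, item: str) -> int:
    """Deterministic 64-bit hash of an item, salted with a seed."""
    digest = hashlib.blake2b(f"{seed}:{item}".encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(items: Set[str], num_hashes: int = 64) -> List[int]:
    """For each seeded hash function, keep the minimum hash over all items."""
    return [min(_hash(seed, item) for item in items) for seed in range(num_hashes)]


def estimated_jaccard(sig_a: List[int], sig_b: List[int]) -> float:
    """The fraction of matching signature slots estimates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


if __name__ == "__main__":
    doc = "gut microbiome composition changes with dietary fiber intake in adults"
    near = "gut microbiome composition changes with dietary fiber intake in children"
    other = "transformer language models extract named entities from clinical text"
    sig = minhash_signature(shingles(doc))
    for candidate in (near, other):
        print(estimated_jaccard(sig, minhash_signature(shingles(candidate))))
```

Only document pairs whose estimated similarity exceeds a chosen threshold would be passed on to the slower, more detailed pipelines; the threshold itself is a tuning decision that depends on the data.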


We integrate public and proprietary omics data via in-house, scientifically proven bioinformatics pipelines to generate insights on cohort identification, taxonomic and metabolic pathway abundances, gene expression and differentially abundant features.

Our state-of-the-art bioinformatics pipelines process raw data from multiple high-throughput experiment types, from both public and proprietary sources. All of our pipelines pass strict scientific validation. The algorithms and reference data banks are updated frequently for optimal performance.

Systems biology
Metagenome analysis
Transcriptome analysis
Microbiome analysis

Bioinformatics analysis is impossible without high-quality data. We created the Omics Engine Service (OES), a highly automated application for the collection, quality control, and curation of high-throughput data and accompanying metadata. OES uses proprietary machine learning technologies and human-in-the-loop review to provide a more than 10-fold increase in the speed of curation without compromising quality.

Automated retrieval for public data
Metadata curation & enrichment


Our products and innovation services harness the PIPA Data & Analytics Platform (PDAP), enabling a multi-tenant performant solution that offers high scalability while also guaranteeing high levels of security.

The LEAP™ ecosystem is powered by a microservices-based architecture built on Azure. It leverages the scalability of Azure's native Kubernetes service to support multi-layered multi-tenancy and to guarantee data isolation and security.

The application layer is integrated with the PIPA Data and Analytics Platform (PDAP), which serves as the backbone of the data infrastructure of the entire ecosystem. PDAP leverages Azure Databricks and Azure Batch along with LEAP™ core’s Prefect-powered Execution Engine. Together, they enable highly performant and scalable data and bioinformatics pipelines which, coupled with the platform's inherent multi-tenant design, can generate unique insights by combining the platform’s data assets with user-provided ones.

Azure Infrastructure
RESTful Microservices
Distributed Processing

The LEAP™ core platform offers a wide range of powerful and intuitive data visualizations that give our end users a nuanced, contextualized understanding of the insights they are viewing. The pinnacle of our visualization capabilities is the interactive Network Graph. Users explore this rich, traversable graph to get an interactive view of insights and evidence, zooming in and out across levels of detail. To explore data more efficiently, users can narrow the visible dataset with easy-to-apply filters.

Customizable interactive visualizations
Exploratory visualizations
Visual explanatory data

Our advanced Data and Analytics Platform (PDAP) automates the generation and evaluation of multiple data artifacts and Machine Learning models, providing a scalable and repeatable solution that guarantees data lineage and governance while also supporting data isolation through its inherent multi-tenant design.

Scalable Data & Bioinformatics Pipelines as a Service
Data Lineage

Let’s advance scientific discovery together

Our mission is to offer our partners a faster, more cost-efficient path to innovation by leveraging our team’s top-tier scientific expertise and our proprietary AI technologies.