Why ontologies – and fruit flies – matter in the age of AI
April 14, 2026
SciBite’s Head of Scientific Curation on the hidden infrastructure powering scientific discovery and how it needs our constant support to let AI truly fly – reliably.
Rebecca Foulger’s passion lies in the structured, often unseen ontologies and controlled vocabularies that help interpret biological data. As Head of Scientific Curation at SciBite-Elsevier, she leads a team that develops the content supporting modern life sciences research – and increasingly, the artificial intelligence systems transforming the field.
Making the invisible visible
For Rebecca, it all started with fruit flies. “I still have a soft spot for them. I loved that you could see the genetics,” Rebecca recalls about her early fascination with Drosophila during her A-levels. “So much molecular biology is done in colorless liquids and tubes. But when you have the fruit flies and you cross them, you can see the different wing shapes, eye colors and phenotypes. It’s all very visible.”
“And when you then move on to the molecular side and genome annotation, it’s all connected back to that visible phenotype. So you have something to relate the knowledge to again, and I think it all feeds back to that – connecting the dots and building the jigsaw picture.”
This desire to connect dots would define her career trajectory, even as she moved, as she puts it with a laugh, “up the food chain” – from flies to frogs to zebrafish, and eventually to human diseases such as Parkinson’s. Today, she sees her job as “helping companies organize their data, make sense of it, and get it into a structure so they can query it and get the most out of it.”
Meanwhile, she continues to enjoy the strong sense of community she first experienced among fruit fly researchers, one that thrives on thinking up hilarious and memorable gene names like ‘Indy’ (which stands for ‘I’m not dead yet’) for a gene where some mutations extend lifespan, or ‘humpty dumpty’ for one that, when mutated, causes thinner eggshells.
Rebecca Foulger, Head of Scientific Curation, SciBite
From lab bench to data architecture
After completing her PhD on Drosophila in the early 2000s, just as the fruit fly genome was being sequenced and annotated, Rebecca pivoted from laboratory work to data analysis. A position at FlyBase, the model organism database that compiles all Drosophila-related data and publications, offered her first opportunity to contribute to the infrastructure she had been using as a researcher.
“It was my first experience building an ontology, specifically working on the Gene Ontology. It was great because I could both build the ontology and use the terms to annotate Drosophila gene products. This helped me understand how to make the ontology both usable and accurate.”
The Gene Ontology project, pioneered by a consortium including her then-boss Michael Ashburner, represented a fundamental shift in how biological knowledge could be structured and shared. Ashburner recognized early that ontologies – structured vocabularies for specific domains – could provide a common language readable by both humans and computers, maximizing the utility of scientific data.
“But Michael was also very skilled at securing funding to develop them, which is always crucial – paying people to build them and make them available to the community,” says Rebecca.
‘Everyone wants to know how everything is connected’
The real power of ontologies lies in the relationships between concepts. “Everyone wants to know how everything’s connected,” says Rebecca. “It's really about getting the bigger picture of how everything fits together and understanding why.”
These connections enable researchers to zoom in on areas of interest while maintaining context within the broader landscape of knowledge. “Contributing to downstream knowledge graphs, isolated data points can be transformed into networks of meaning, allowing scientists to identify patterns, generate hypotheses and make discoveries that would be impossible when working with fragmented information,” she says.
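To make the idea of zooming in while keeping context concrete, here is a minimal Python sketch of a toy ontology. Everything in it is an assumption made for illustration: the term names, the gene names and the annotations are hypothetical and are not taken from the Gene Ontology or from SciBite content. It shows how “is a” links between terms let a query on a broad term also pick up genes annotated to more specific ones.

```python
# Toy "is_a" hierarchy: child term -> broader parent terms.
# All term names are hypothetical, chosen only for illustration.
IS_A = {
    "eye color phenotype": ["eye phenotype"],
    "eye phenotype": ["visible phenotype"],
    "wing shape phenotype": ["wing phenotype"],
    "wing phenotype": ["visible phenotype"],
    "visible phenotype": [],
}

# Hypothetical annotations: gene -> the most specific term a curator chose.
ANNOTATIONS = {
    "gene_a": "eye color phenotype",
    "gene_b": "wing shape phenotype",
    "gene_c": "eye phenotype",
}


def ancestors(term):
    """Return every broader term reachable from `term` via is_a links."""
    found = set()
    stack = list(IS_A.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(IS_A.get(parent, []))
    return found


def genes_for(term):
    """Return genes annotated to `term` or to any more specific term."""
    return sorted(
        gene
        for gene, annotated in ANNOTATIONS.items()
        if annotated == term or term in ancestors(annotated)
    )


if __name__ == "__main__":
    print(genes_for("eye phenotype"))      # ['gene_a', 'gene_c']
    print(genes_for("visible phenotype"))  # ['gene_a', 'gene_b', 'gene_c']
```

Asking for the narrow term returns only the eye-related genes, while asking for the broad term returns all three, even though no gene was annotated to “visible phenotype” directly. Real ontologies carry more relationship types and far more terms, but the principle is the same.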
At Elsevier SciBite, Rebecca’s team helps deliver bespoke solutions across the life sciences and beyond, while developing curated content that feeds into the SciBite tool stack. In other words, she gets to witness the entire lifecycle of content – from initial construction through modification and expansion, to implementation in tools, and finally observing how customers use it in their research workflows.
“Seeing that entire process is quite rare,” Rebecca explains, “and getting that comprehensive view, and seeing how all these different stages are also all connected, I find very satisfying.”
The AI inflection point
The rise of generative AI has created both opportunities and imperatives for ontology work. As large language models become more central to scientific research, the need for structured, trustworthy knowledge has intensified.
“I believe we’re still on that journey, but we’re beginning to see the value that these new models can provide, and I think it’s from both sides,” Rebecca says. “The models help us in the curation process, where we can use these models to help streamline our curation workflow. On the other hand, the ontology feeds into the models.”
This two-way relationship is essential. As more people use LLMs for research queries, they increasingly recognize the importance of evidence-based answers. “One of the advantages of ontologies is that they are often a community standard,” Rebecca notes. “They are created and agreed upon by a group of experts, which means we can trust them because they are built by those who have specialized knowledge in that field.”
Meanwhile, the sheer mass and complexity of human health data make AI assistance essential. But as Rebecca emphasizes, human expertise still plays a critical role, both in providing the structured, high-quality content that AI systems draw on and in critiquing the ‘answers’ those systems produce.
“Using these models to expand our curation efforts is crucial because there's so much data,” she says. “But we can also use the models to help us construct our content: identifying the gaps, suggesting synonyms, validating, etcetera. This allows us to focus on the tasks that humans are best at.”
Ontology community as unsung AI heroes
Despite their fundamental importance, ontologies continue to face persistent funding challenges. Much of the work takes place in academia or through community projects, often initiated by volunteers in their spare time. As Rebecca pointedly observes, “I wish everyone knew how much work goes into the underlying ontologies.”
The problem is partly one of visibility and partly an underappreciation of the accumulated expertise and effort behind them. The Gene Ontology, for instance, represents more than 25 years of continuous development by domain experts. “In addition, creating an ontology isn’t a one-time effort,” Rebecca stresses. “Ongoing updates are necessary as our knowledge expands. Consistent maintenance is essential.”
Rebecca believes the solution lies in diversifying funding streams and promoting collaborations. Organizations like the Pistoia Alliance facilitate these collaborations, helping identify good fits where both sides can learn from each other.
Show, don't tell
The best way to demonstrate the value of ontologies, according to Rebecca, is through specific use cases: “Once the data is clean, structured, interoperable and out of people’s spreadsheet silos, the real work can begin, whether that’s drug repurposing, metabolic pathway modeling or any other workflow processes that pharmaceutical companies would only adopt if they generated real value and revenue.”
Individuals and organizations can also support the ontology cause by giving feedback and submitting improvement requests. “This creates a virtuous cycle that strengthens both the ontologies and the case for continued funding.”
As AI continues to transform scientific research, the infrastructure for knowledge representation becomes increasingly important. Ontologies provide a foundation that helps keep AI systems from hallucinating, grounds their answers in expert consensus, and allows researchers to trust the insights they obtain.
Meanwhile, the collaborative nature of the curation community suggests this is work that attracts people driven by a shared mission. “There’s a reason why people stay and work in curation for a long time, and I think that’s partly because of the people involved. I can certainly say I’ve never encountered any ogres,” smiles Rebecca.
For Rebecca, it all connects back to those fruit flies and their visible genetics, and to the urge to build a complete picture. The difference now is that the canvas is vastly larger, and the tools for painting this picture are evolving at breakneck speed. Yet the fundamental need remains: a structured knowledge base that helps us make sense of complexity and drives discovery forward.