Increasing the biological fidelity of single-cell AI models

Poster Abstract: Muhammad Asif, SRA, University of Cambridge

Abstract

Objective: Artificial intelligence and single-cell technologies are revolutionizing modern biology by providing insights into cellular mechanisms in health and disease. Their application to neurodegenerative diseases such as amyotrophic lateral sclerosis (ALS) is gaining traction, though challenges remain. Current models often fail to identify rare or transitional cell states, and the latent representations they generate are difficult to interpret biologically. To address these issues, we developed two complementary AI frameworks for single-cell analysis. The first, EL-PACA, is a deep learning method designed to accurately identify rare cell types. It integrates reference information through a hybrid model combining Principal Component Analysis (PCA) and Multiple Discriminant Analysis (MDA) with a deep classifier. By incorporating reference priors directly into its architecture, EL-PACA improves the detection of rare cellular populations and subtle state changes, particularly those overlooked by conventional methods. The second framework, scBioFM, captures disease signatures within a biologically informed low-dimensional space. This model integrates curated biological knowledge, such as protein–protein interaction networks and Gene Ontology structures, with embeddings from large foundation models. The result is a unified representation that encodes cellular and genetic relationships across functional, regulatory, and disease-related contexts. Applied to single-cell ALS-related datasets, these frameworks yield more precise identification of disease-associated states, greater enrichment of known pathways, and improved interpretability.

Conclusions: Together, our models represent a new generation of AI tools aimed at uncovering rare cell types and the mechanistic signatures underlying complex neurodegenerative disease processes.
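
Illustrative sketch: The hybrid design described for EL-PACA (PCA and MDA features learned from a labelled reference, passed to a deep classifier) could be outlined roughly as follows. This is not the EL-PACA implementation, whose details the abstract does not give. It assumes scikit-learn is available, uses LinearDiscriminantAnalysis as a stand-in for MDA (the multi-class form of Fisher's discriminant), uses a small MLPClassifier in place of the deep classifier, and replaces real single-cell data with synthetic data in which class 2 plays the role of a rare cell type. All names, sizes, and hyperparameters are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for a labelled expression matrix (cells x genes).
# Class 2 makes up about 5% of cells and plays the role of a rare cell type.
X, y = make_classification(
    n_samples=600, n_features=50, n_informative=10, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, weights=[0.6, 0.35, 0.05],
    random_state=0,
)

# Split into a labelled "reference" set and a held-out "query" set,
# stratified so the rare class appears in both.
X_ref, X_query, y_ref, y_query = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Unsupervised axes (PCA) capture overall variance; supervised axes
# (LDA, standing in for MDA) capture directions separating reference labels.
pca = PCA(n_components=10).fit(X_ref)
mda = LinearDiscriminantAnalysis(n_components=2).fit(X_ref, y_ref)


def hybrid_features(cells):
    """Concatenate PCA and discriminant projections for each cell."""
    return np.hstack([pca.transform(cells), mda.transform(cells)])


# Small multilayer perceptron as a placeholder for the deep classifier.
clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
clf.fit(hybrid_features(X_ref), y_ref)

# Predict query cells and measure recall on the rare class.
pred = clf.predict(hybrid_features(X_query))
rare_recall = float(np.mean(pred[y_query == 2] == 2))
print(f"rare-class recall on query cells: {rare_recall:.2f}")
```

In this sketch the reference labels enter the model only through the discriminant projection. How EL-PACA itself encodes reference priors may differ.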