Background/Aims: The Genomics England Rare Disease variant prioritisation algorithm is a core component of the Genomic Medicine Service within the National Health Service (NHS) in England. The algorithm supports the diagnosis of rare diseases by shortlisting a set of genetic variants for clinical assessment. Over the years, it has been significantly enhanced to increase its diagnostic capability by incorporating new data and tools. A persistent challenge in our software development lifecycle has been the absence of rapid and effective methods for assessing performance early in the development process. Such early assessment is essential for identifying bottlenecks during feature development. To address this gap, we implemented a Nextflow-based solution that enables comprehensive A/B testing across all combinations of parameters, software package versions and sample datasets. We also developed a custom, reusable Singularity overlay that injects a profiling tool into existing Nextflow processes to generate detailed CPU usage data. These data-driven approaches allowed us to identify low-level inefficiencies within the codebase.
During a recent exercise, we implemented several targeted optimisations that together halved the runtime of the variant prioritisation engine. More generally, this case study provides a template for routinely incorporating on-demand performance testing into the early stages of our software development cycle, ensuring that the Rare Disease pipeline remains robust and optimised for performance as we continue to actively develop its capabilities.
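As an illustration of the "all combinations" A/B testing design described above, the following Python sketch enumerates one run configuration per combination of parameter values, software package versions and sample datasets. It is not the Nextflow implementation itself: the function name `ab_test_matrix` and the example parameter names, version labels and sample identifiers are hypothetical and chosen only for illustration.

```python
from itertools import product


def ab_test_matrix(parameters, versions, samples):
    """Return one run configuration per combination of inputs.

    parameters: dict mapping a parameter name to the list of values to test
    versions:   list of software package version labels
    samples:    list of sample dataset identifiers
    (All names and values here are hypothetical placeholders.)
    """
    names = sorted(parameters)  # fixed order so the output is reproducible
    runs = []
    # Cartesian product over every parameter's candidate values ...
    for values in product(*(parameters[name] for name in names)):
        # ... crossed with every version and every sample dataset.
        for version, sample in product(versions, samples):
            runs.append(
                {
                    "params": dict(zip(names, values)),
                    "version": version,
                    "sample": sample,
                }
            )
    return runs


matrix = ab_test_matrix(
    parameters={"threads": [1, 4], "cache": [True, False]},
    versions=["v1.0", "v1.1"],
    samples=["sample_A", "sample_B", "sample_C"],
)
print(len(matrix))  # 2 * 2 * 2 * 3 = 24 runs
```

Each dictionary in `matrix` describes one run to schedule and time. Because the number of runs grows multiplicatively with each added dimension, a design like this benefits from a workflow manager that can dispatch the runs in parallel.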