Purpose: Circulating tumor cells (CTCs) offer a minimally invasive means of monitoring cancer evolution in patients. CTCs are generally isolated using antibodies against the EPCAM protein. A key question is how prevalent EPCAM-negative CTCs are, such as those that have undergone epithelial-mesenchymal transition (EMT) or whose cell of origin is EPCAM-low or EPCAM-negative. We studied 3,302 single-cell RNA transcriptomes reported as putative CTCs in public repositories.
Methods: Using copy number variation and cell-specific markers, we discriminated bona fide CTCs from contaminating blood cells mislabeled as CTCs. Integrating CTCs with PBMCs allowed us to identify novel markers to expand the range of recovered CTC subtypes. Based on the processed data and these findings, we developed CTCeek, the first web-based tool that automatically annotates bona fide CTCs from scRNA-sequencing profiles.