STARLING, a new probabilistic model from computational biology, improves the analysis of highly multiplexed imaging data.
STARLING, short for SegmenTation AwaRe cLusterING, addresses the troublesome issue of segmentation errors, which can cloud the interpretation of detailed cellular information.
This machine learning model lets scientists quantify cell populations more accurately from spatial proteomic data, which typically measure up to 40 proteins at subcellular resolution. Traditional analyses struggle because cells overlap and segmentation boundaries are often wrong, so the signal attributed to one cell can include signal from its neighbors, making it difficult to determine accurate cellular phenotypes.
“STARLING models true cluster identities and segmentation errors to deliver biologically meaningful cellular phenotypes more accurately than previous methods,” say the authors of the study, published on October 5, 2023.
The research team, composed of experts from institutions including the University of Toronto and Mount Sinai Hospital, developed STARLING to refine the study of complex tissues, such as lymphoid organs rich in immune cells. These densely packed cellular environments are notoriously difficult to segment.
By combining probabilistic clustering with plausibility scores, STARLING resolves ambiguities that affect existing methods such as PhenoGraph and FlowSOM. Its novelty lies not only in assigning cellular identities probabilistically but also in explicitly accounting for the segmentation errors common in high-resolution image data.
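To illustrate the general idea of scoring a cell while accounting for segmentation error, here is a minimal sketch. It is not the authors' implementation. It assumes known cluster means, a shared Gaussian noise variance, and a segmentation error that shows up as an equal blend of two clusters. The function names, the `error_prior` value, and the toy data are all hypothetical.

```python
import numpy as np

def log_gauss(x, mean, var):
    """Log density of a diagonal Gaussian, summed over protein channels."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var, axis=-1)

def plausibility(x, means, var, error_prior=0.2):
    """Posterior probability that each observed cell is a clean, single-cluster cell.

    x: (n_cells, n_proteins) expression values
    means: (n_clusters, n_proteins) cluster means, assumed known here
    var: shared noise variance (an illustrative simplification)
    error_prior: assumed prior probability of a segmentation error
    """
    k = means.shape[0]
    # Clean cell: drawn from exactly one cluster, uniform prior over clusters.
    log_clean = np.stack([log_gauss(x, m, var) for m in means], axis=1)
    log_p_clean = np.logaddexp.reduce(log_clean, axis=1) - np.log(k)
    # Segmentation error: signal is an equal blend of two different clusters.
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    log_err = np.stack(
        [log_gauss(x, 0.5 * (means[i] + means[j]), var) for i, j in pairs], axis=1
    )
    log_p_err = np.logaddexp.reduce(log_err, axis=1) - np.log(len(pairs))
    # Bayes' rule in log space for numerical stability.
    a = np.log1p(-error_prior) + log_p_clean
    b = np.log(error_prior) + log_p_err
    return np.exp(a - np.logaddexp(a, b))

# Toy data: two clusters, and three cells (two clean, one halfway between).
means = np.array([[0.0, 0.0], [10.0, 10.0]])
cells = np.array([[0.0, 0.0], [5.0, 5.0], [10.0, 10.0]])
scores = plausibility(cells, means, var=1.0)
```

In this toy setup, the cell lying midway between the two cluster means receives a low score, while the two cells sitting on a cluster mean score close to 1. A real model would also have to learn the cluster parameters from the data rather than take them as given.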
The model was first tested on cell pellets containing known, controlled cell types, which allowed its output to be validated against a ground truth. It was then applied to human tonsil samples, which are densely packed with immune cells.
“By considering error probabilities, STARLING significantly enhances our ability to interpret complex protein expression data and truly reflects the underlying cellular diversity,” the authors state.
The results show that STARLING outperformed the benchmark methods across the datasets tested. Its plausibility scores indicate how well the inferred clusters correspond to biological reality, which is valuable for researchers seeking to interpret cellular behavior within tumors and other complex biological systems.
This research holds significant promise for medical and biological fields, including cancer research and immunology, where accurate cellular profiling can lead to enhanced therapies and diagnostic tools.
With more than 240,000 cells analyzed from human tonsil tissues across 16 regions and multiple donors, STARLING has already begun to reveal insights about cellular diversity and spatial organization.
Looking forward, the authors envision STARLING not as just another tool but as part of the future standard for analyzing multiplexed imaging data. Its open-source implementation makes it widely available to the research community.
“Our plausible scores demonstrate how well inferred clusters align with biological reality, allowing researchers to trust their findings with greater confidence,” say the authors, encapsulating STARLING's mission to improve scientific interpretation.
By giving researchers a more reliable way to profile cells, STARLING stands to benefit both cellular biology and clinical research, connecting computational methods to tangible clinical outcomes.