Transcriptomics-guided Slide Representation Learning in Computational Pathology

Self-supervised learning (SSL) has been successful in building patch embeddings of small histology images (e.g., 224x224 pixels), but scaling these models to learn slide embeddings from the entirety of giga-pixel whole-slide images (WSIs) remains challenging. Here, we leverage complementary information from gene expression profiles to guide slide representation learning using multimodal pre-training. Expression profiles constitute highly detailed molecular descriptions of a tissue that we hypothesize offer a strong task-agnostic training signal for learning slide embeddings. Our slide and expression (S+E) pre-training strategy, called Tangle, employs modality-specific encoders, the outputs of which are aligned via contrastive learning. Tangle was pre-trained on samples from three different organs: liver (n=6,597 S+E pairs), breast (n=1,020), and lung (n=1,012) from two different species (Homo sapiens and Rattus norvegicus). Across three independent test datasets consisting of 1,265 breast WSIs, 1,946 lung WSIs, and 4,584 liver WSIs, Tangle shows significantly better few-shot performance compared to supervised and SSL baselines. When assessed using prototype-based classification and slide retrieval, Tangle also shows a substantial performance improvement over all baselines. Code available at https://github.com/mahmoodlab/TANGLE.
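
As an illustration of what aligning two modality-specific encoders via contrastive learning can look like, the sketch below computes a generic symmetric InfoNCE (CLIP-style) loss over a batch of paired slide and expression embeddings. The abstract does not specify the exact objective, so this form is an assumption, and the function names, the temperature value, and the toy embeddings are hypothetical. It is not the Tangle implementation, which is in the linked repository.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (assumes a non-zero vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(slide_emb, expr_emb, temperature=0.1):
    """Symmetric InfoNCE over paired embeddings (assumed form, hypothetical temperature).

    slide_emb[i] and expr_emb[i] are treated as a matching S+E pair;
    all other combinations in the batch act as negatives.
    """
    s = [l2_normalize(v) for v in slide_emb]
    e = [l2_normalize(v) for v in expr_emb]
    # Cosine similarity between every slide and every expression embedding.
    logits = [[sum(a * b for a, b in zip(si, ej)) / temperature for ej in e] for si in s]
    logits_t = [list(col) for col in zip(*logits)]
    # Average the slide-to-expression and expression-to-slide directions.
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))


# Toy embeddings (made up): correctly paired vs. deliberately mismatched.
aligned = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[2.0, 0.0], [0.0, 3.0]])
shuffled = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 3.0], [2.0, 0.0]])
```

In this toy case the loss is lower when each slide embedding points in the same direction as its paired expression embedding than when the pairs are mismatched, which is the behaviour a contrastive alignment objective is meant to reward.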
