Quantized multi-task learning for context-specific representations of gene network dynamics.

bioRxiv : the preprint server for biology

2024

https://researcherprofiles.org/profile/540986869

39229018

Chen H, Venkatesh MS, Ortega JG, Mahesh SV, Nandi TN, Madduri RK, Pelka K, Theodoris CV

Abstract

While often represented as static entities, gene networks are highly context-dependent. Here, we developed a multi-task learning strategy to yield context-specific representations of gene network dynamics. We assembled a corpus comprising ~103 million human single-cell transcriptomes from a broad range of tissues and diseases and performed a two-stage pretraining, first with non-malignant cells to generate a foundational model and then with continual learning on cancer cells to tune the model to the cancer domain. We performed multi-task learning with the foundational model to learn context-specific representations of a broad range of cell types, tissues, developmental stages, and diseases. We then leveraged the cancer-tuned model to jointly learn cell states and predict tumor-restricting factors within the colorectal tumor microenvironment. Model quantization allowed resource-efficient fine-tuning and inference while preserving biological knowledge. Overall, multi-task learning enables context-specific disease modeling that can yield contextual predictions of candidate therapeutic targets for human disease.
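
The two ideas the abstract combines, weight quantization and a joint multi-task objective, can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not state the quantization scheme or how the task losses are combined, so the sketch assumes symmetric 8-bit absmax quantization and a weighted sum of per-task cross-entropy losses. The function names and the task names `cell_type` and `tissue` are hypothetical.

```python
import math


def quantize_int8(weights):
    """Symmetric absmax quantization of floats to integer codes in [-127, 127].

    Assumption: the 8-bit absmax scheme is chosen for illustration only.
    """
    absmax = max(abs(w) for w in weights)
    scale = absmax / 127.0 if absmax > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_int8(codes, scale):
    """Map integer codes back to approximate float weights."""
    return [c * scale for c in codes]


def cross_entropy(logits, label):
    """Softmax cross-entropy for one example, using the log-sum-exp trick."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def multi_task_loss(task_logits, task_labels, task_weights=None):
    """Weighted sum of per-task cross-entropy losses (hypothetical objective).

    task_logits: dict mapping task name -> list of logits for one cell
    task_labels: dict mapping task name -> integer class label
    task_weights: optional dict mapping task name -> float (default 1.0)
    """
    total = 0.0
    for task, logits in task_logits.items():
        weight = 1.0 if task_weights is None else task_weights.get(task, 1.0)
        total += weight * cross_entropy(logits, task_labels[task])
    return total


# Usage: quantize a toy weight vector and score one cell on two tasks.
codes, scale = quantize_int8([0.5, -1.0, 0.25, 0.0])
restored = dequantize_int8(codes, scale)
loss = multi_task_loss(
    {"cell_type": [0.0, 0.0], "tissue": [0.0, 0.0, 0.0, 0.0]},
    {"cell_type": 0, "tissue": 2},
)
```

In this sketch each restored weight differs from the original by at most half a quantization step (`scale / 2`), which is the sense in which low-precision storage can reduce memory while approximately preserving the learned weights. A single scalar loss summed over tasks is one common way to let several labels shape a shared representation; other weighting schemes are possible.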