scLEAF is a versatile framework for single-cell multi-omics data analysis, which transfers cell representations to the LLM text space.
- Python 3.10, PyTorch>=1.21.0, and numpy>=1.24.0 are required for the current codebase.
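
The requirements above can be checked before launching any training script. The snippet below is a minimal sketch, not part of the scLEAF codebase: the helper names (`version_tuple`, `meets_minimum`, `check_environment`) are hypothetical, the PyPI distribution names `torch` and `numpy` are assumed, and the minimum versions are copied verbatim from the requirement line.

```python
import sys
from importlib import metadata

# Minimums copied verbatim from the requirement line above. The keys are the
# usual PyPI distribution names, which is an assumption of this sketch.
MINIMUMS = {"torch": "1.21.0", "numpy": "1.24.0"}


def version_tuple(version: str) -> tuple:
    """Turn a string such as '2.1.0+cu118' into (2, 1, 0).

    Local suffixes ('+cu118') and pre-release tags ('rc1') are ignored,
    and missing components are padded with zeros.
    """
    parts = []
    for piece in version.split("+")[0].split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True when the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_environment() -> list:
    """Return a list of human-readable problems; an empty list means no problem was found."""
    problems = []
    if sys.version_info[:2] != (3, 10):
        found = f"{sys.version_info[0]}.{sys.version_info[1]}"
        problems.append(f"Python 3.10 expected, found {found}")
    for name, minimum in MINIMUMS.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        if not meets_minimum(installed, minimum):
            problems.append(f"{name} {installed} is older than {minimum}")
    return problems
```

Calling `check_environment()` and printing each returned string gives a quick report; the comparison is deliberately simple and does not implement the full PEP 440 ordering rules.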
We use the Vicuna-7B model to extract the cell-level text embeddings. Download the embeddings from https://drive.google.com/drive/folders/1aArcZjDckc7my9gPvVqN0h8X-7a0brLV.
The original embeddings can be downloaded from https://sites.google.com/yale.edu/scelmolib. We also provide a preprocessed version at https://drive.google.com/drive/folders/1aArcZjDckc7my9gPvVqN0h8X-7a0brLV.
Download the dataset from https://github.com/SydneyBioX/scJoint/blob/main/data.zip.
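
The dataset link above points at a GitHub page rather than at the file itself, so a scripted download needs the direct form of the URL. The following is a rough sketch under stated assumptions: it assumes that swapping `/blob/` for `/raw/` yields a direct download for this file, the helper names `to_raw_url` and `fetch_and_unpack` are hypothetical, and the layout inside `data.zip` is not known here, so the function simply extracts everything and returns the member names.

```python
import urllib.request
import zipfile
from pathlib import Path


def to_raw_url(blob_url: str) -> str:
    """Rewrite a GitHub '/blob/' page URL into its '/raw/' download form."""
    return blob_url.replace("/blob/", "/raw/", 1)


def fetch_and_unpack(blob_url: str, dest: Path) -> list:
    """Download data.zip into dest (skipped if already present) and extract it.

    Returns the list of archive member names so the caller can inspect
    what was unpacked.
    """
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / "data.zip"
    if not archive.exists():
        # Assumption: the '/raw/' URL serves the archive directly.
        urllib.request.urlretrieve(to_raw_url(blob_url), str(archive))
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
        return zf.namelist()
```

For example, `fetch_and_unpack("https://github.com/SydneyBioX/scJoint/blob/main/data.zip", Path("data"))` would place the files under a local `data` directory. That directory name is illustrative only; the paths the training scripts expect should be taken from the scripts themselves.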
sh pretrain_cite.sh
sh finetune_cite.sh
sh pretrain_asap.sh
sh finetune_asap.sh
Our codebase is built on scCLIP, timm, transformers, and PyTorch Lightning. We thank the authors for the nicely organized code!