Skip to content

ma-compbio/LLOKI

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

63 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Overview of LLOKI

LLOKI recovered-5

LLOKI is a novel framework designed for scalable spatial transcriptomics (ST) integration across diverse technologies without requiring shared gene panels. The framework consists of two key components:

  1. LLOKI-FP: Utilizes optimal transport and feature propagation to perform a spatially informed transformation of ST gene expression profiles, aligning their sparsity with that of scRNA-seq. This optimizes the utility of scGPT embeddings.

  2. LLOKI-CAE: A conditional autoencoder that integrates embeddings across ST technologies using a novel loss function. The loss function balances batch integration while preserving robust biological information from the LLOKI-FP embeddings.

This unique combination ensures the alignment of both features and batches, enabling robust ST data integration while preserving biological specificity and local spatial interactions.


Running LLOKI

Input Data Format

LLOKI requires spatial transcriptomics data in AnnData format. The data should be structured as follows:

1. Spatial Coordinates

  • The spatial coordinates of each cell should be included in the .obsm attribute of the AnnData object.
  • Coordinates must be stored in .obsm['spatial'] and formatted as an array with dimensions [number of cells, 2], where each row represents the x and y coordinates of a cell.

2. Gene Expression Data

  • Gene expression data should be stored in .X as either a sparse or dense matrix with dimensions [number of cells, number of genes].
    • Sparse matrices are recommended for large datasets.

3. Additional Metadata (Optional)

  • Any additional metadata (e.g., cell types, batch labels) can be stored in .obs.

Installation

Step 1: Create a Conda Environment

We recommend using Anaconda to manage your environment. If you haven't already, refer to the Anaconda webpage for installation instructions.

Create a Python 3.8 environment using the following command:

conda create --name lloki python=3.10

Activate the environment:

conda activate lloki

Step 2: Install Dependencies

Install PyTorch with CUDA (Optional)

If you have an NVIDIA GPU and want to use CUDA for acceleration, install PyTorch with the desired CUDA version. For example, to install PyTorch 2.1.0 with CUDA 11.8:

pip install torch==2.3.0 torchvision==0.18.0 torchaudio==2.3.0 \
  --index-url https://download.pytorch.org/whl/cu121

Install flash attention (Optional)

pip install flash-attn --no-build-isolation

Install Remaining Dependencies

Install the remaining required packages using the provided requirements.txt file:

pip install -r requirements.txt

Running the Code

The method consists of two parts: LLOKI-FP and LLOKI-CAE. To run both parts, use the following command:

python run_lloki.py

This will:

  1. Download the necessary data and model.
  2. Run both parts of the pipeline.

About

LLOKI is a framework for cross-platform integration of spatial transcriptomics datasets without requiring shared gene panels. Includes tools for feature propagation, conditional autoencoding, and downstream analyses such as spatial domain detection and integrated gene program discovery.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors