
DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents

Hao Li, Xiaogeng Liu, Hung-Chun Chiu, Dianqi Li, Ning Zhang, Chaowei Xiao.


Framework overview (figure)

The official implementation of NeurIPS 2025 paper "DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents".

Updates

  • [2026.4.19] 🛠️ Update the evaluation on AgentDyn.
  • [2026.1.30] 🛠️ Support the evaluation on more agents.
  • [2026.1.30] 🛠️ Update the evaluation code on ASB.

How to Start

We provide evaluation code for DRIFT; you can reproduce the results as follows.

Evaluating on AgentDojo

Construct Your Environment

conda create -n drift python=3.11
conda activate drift
pip install "agentdojo==0.1.35"
pip install -r requirements.txt

Set Your API KEY

We support three API providers: OpenAI, Google, and OpenRouter. Set the API key for the provider(s) you need.

export OPENAI_API_KEY=your_key
export GOOGLE_API_KEY=your_key
export OPENROUTER_API_KEY=your_key
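Before launching a run, it can help to confirm that at least one provider key is actually exported. The `check_keys` function below is an illustrative helper, not part of the repository:

```shell
# Illustrative helper (not part of DRIFT): succeed if at least one
# supported provider key is exported, fail otherwise.
check_keys() {
  for k in OPENAI_API_KEY GOOGLE_API_KEY OPENROUTER_API_KEY; do
    if [ -n "$(printenv "$k")" ]; then
      echo "found: $k"
      return 0
    fi
  done
  echo "no API key set" >&2
  return 1
}
```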

Run a task with no attack

python pipeline_main.py \
--model gpt-4o-mini-2024-07-18 \
--build_constraints --injection_isolation --dynamic_validation \
--suites banking,slack,travel,workspace

Run a task under attack

python pipeline_main.py \
--model gpt-4o-mini-2024-07-18 --do_attack \
--attack_type important_instructions \
--build_constraints --injection_isolation --dynamic_validation \
--suites banking,slack,travel,workspace

You can evaluate any model from the supported providers by passing its model identifier (e.g., gemini-2.5-pro) to the --model flag. To evaluate under an adaptive attack, add the --adaptive_attack flag.
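For instance, an adaptive-attack run with a Google model could be invoked as follows (same flags as the commands above; gemini-2.5-pro is just one example identifier):

```shell
python pipeline_main.py \
  --model gemini-2.5-pro --do_attack \
  --attack_type important_instructions --adaptive_attack \
  --build_constraints --injection_isolation --dynamic_validation \
  --suites banking,slack,travel,workspace
```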

Evaluating on AgentDyn

To evaluate on AgentDyn, replace the AgentDojo dependency with the AgentDyn version. First, run:

git clone git@github.com:SaFo-Lab/AgentDyn.git
cd AgentDyn
pip install -e .
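To confirm the swap took effect, one quick check (assuming AgentDyn installs under the same agentdojo package name, which the editable install above suggests) is:

```shell
# Should print a path inside your AgentDyn checkout after the editable install
python -c "import agentdojo; print(agentdojo.__file__)"
```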

The AgentDojo dependency is now replaced with the AgentDyn version, which additionally supports the shopping, github, and dailylife suites. You can evaluate on these three suites with the same commands as for AgentDojo, as shown below:

Run a task with no attack

python pipeline_main.py \
--model gpt-4o-mini-2024-07-18 \
--build_constraints --injection_isolation --dynamic_validation \
--suites shopping,github,dailylife

Run a task under attack

python pipeline_main.py \
--model gpt-4o-mini-2024-07-18 --do_attack \
--attack_type important_instructions \
--build_constraints --injection_isolation --dynamic_validation \
--suites shopping,github,dailylife

Evaluating on ASB

Please refer to ASB_DRIFT/README.md.

Inspect Results

You can find the cached results in runs/.
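The exact log layout is not documented here, but a small helper like the sketch below can enumerate whatever is cached (ASSUMPTION: runs/ contains per-task JSON logs; adjust the pattern to the layout you see locally):

```shell
# Illustrative helper (not part of DRIFT): list cached result logs.
# ASSUMPTION: the results directory holds JSON logs; adjust as needed.
list_results() {
  dir="${1:-runs}"
  if [ -d "$dir" ]; then
    find "$dir" -type f -name '*.json' | sort
  else
    echo "no cached results in $dir" >&2
    return 1
  fi
}
```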

References

If you find this work useful in your research or applications, please kindly cite:

@article{DRIFT,
  title={DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents},
  author={Hao Li and Xiaogeng Liu and Hung-Chun Chiu and Dianqi Li and Ning Zhang and Chaowei Xiao},
  journal={NeurIPS},
  year={2025}
}
