
Multi-Agent Debate Protocols

This repository contains supporting materials for the preprint on debate-protocol design in multi-agent LLM systems. The study compares three debate protocols under matched prompting and decoding conditions to examine how protocol design affects peer-reference behavior, argument diversity, and consensus formation.
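The protocols themselves are implemented in the scripts below. As a rough intuition only, a debate protocol in this sense is a loop in which agents answer, see their peers' answers, and revise before a consensus check. A minimal sketch with stub agents follows; all names and the majority-vote consensus rule are illustrative assumptions, not the repository's actual API:

```python
# Hypothetical sketch of a multi-round debate; the protocols compared
# in the preprint live in scripts/rq1_quick_experiment.py, not here.
from collections import Counter

def run_debate(agents, question, rounds=2):
    """Each agent answers, then revises after seeing peers' answers."""
    answers = {name: agent(question, context=[]) for name, agent in agents.items()}
    for _ in range(rounds - 1):
        answers = {
            name: agent(question, context=[a for n, a in answers.items() if n != name])
            for name, agent in agents.items()
        }
    # Simple consensus check: majority vote over the final answers.
    winner, count = Counter(answers.values()).most_common(1)[0]
    return answers, winner, count == len(agents)

# Stub agents standing in for LLM calls.
agents = {
    "a": lambda q, context: "yes",
    "b": lambda q, context: "yes" if context else "no",  # revises after seeing peers
    "c": lambda q, context: "yes",
}
final, majority, unanimous = run_debate(agents, "Is the sky blue?")
```

Varying how `context` is constructed and ordered is one way to express different protocol designs within the same loop.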

This is a support repository for reproducing the main results reported in the preprint, not the full private development workspace. The preprint PDF itself is not stored in this repository.

Included

  • data/: the dataset used in the study
  • scripts/: core scripts for running the primary protocol comparison, validating the judge, aggregating runs, and plotting aggregate results
  • demo_streamlit/: a lightweight interactive demo for illustrating the debate protocols

Not Included

  • local virtual environments
  • recovery logs and chat exports
  • large intermediate run folders
  • trained adapters and other heavy artifacts
  • private or local-only scratch files

Minimal first public version

The smallest clean public release consists of:

  1. data/data.csv
  2. scripts/rq1_quick_experiment.py
  3. scripts/aggregate_rq1_runs.py
  4. scripts/plot_rq1_aggregate.py
  5. scripts/validate_judge_model.py
  6. scripts/judge_validation_examples.json
  7. demo_streamlit/

The remaining materials can be added in later releases if needed.

Reproducing the main protocol comparison

Install the script dependencies:

pip install -r requirements.txt

Validate the judge model:

python scripts/validate_judge_model.py \
  --examples scripts/judge_validation_examples.json \
  --judge-model mistral:latest \
  --out outputs/judge_validation.json
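The exact schema of `judge_validation_examples.json` is not documented here, so the following is only an assumed illustration: if each example pairs an expected label with the judge model's output, validation reduces to an agreement rate. The real check is `scripts/validate_judge_model.py`.

```python
# Hypothetical illustration of judge validation as label agreement.
# The JSON schema below is an assumption, not the repository's format.
import json

examples = json.loads("""[
  {"expected": "A", "judge": "A"},
  {"expected": "B", "judge": "B"},
  {"expected": "A", "judge": "B"},
  {"expected": "B", "judge": "B"}
]""")

matches = sum(ex["expected"] == ex["judge"] for ex in examples)
agreement = matches / len(examples)
print(f"judge agreement: {agreement:.0%}")
```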

Run the primary comparison:

python scripts/rq1_quick_experiment.py \
  --data data/data.csv

Aggregate the run folders:

python scripts/aggregate_rq1_runs.py \
  --runs-root outputs \
  --out outputs/rq1_aggregate.json \
  --data data/data.csv

Plot the aggregate results:

python scripts/plot_rq1_aggregate.py \
  --input outputs/rq1_aggregate.json \
  --output outputs/rq1_aggregate.png
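As a sketch of what consuming the aggregate file might look like, the snippet below summarises per-protocol metrics across runs. The protocol names, metric keys, and structure are assumptions for illustration; the actual format is whatever `scripts/aggregate_rq1_runs.py` writes to `outputs/rq1_aggregate.json`.

```python
# Hypothetical summary over an aggregate file; key names are assumed.
from statistics import mean, stdev

aggregate = {  # stand-in for json.load(open("outputs/rq1_aggregate.json"))
    "protocol_a": {"consensus_rate": [0.80, 0.84, 0.78]},
    "protocol_b": {"consensus_rate": [0.70, 0.66, 0.74]},
}

for protocol, metrics in aggregate.items():
    runs = metrics["consensus_rate"]
    print(f"{protocol}: mean={mean(runs):.2f} sd={stdev(runs):.2f} n={len(runs)}")
```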

Notes

  • The Streamlit demo is intentionally simplified. It is useful for intuition and communication, but it does not reproduce the full experimental workflow reported in the preprint.
  • Aggregate result files are not included. They can be regenerated with the commands above, or added later as a snapshot of the exact plotted outputs.

Possible later additions

  • add a repo-level requirements.txt or pyproject.toml
  • decide whether the Streamlit demo should live in this repo under demo/ or in a separate promotion-focused repo
