GitHub - asingularity/business-agent: project for experimenting with planning and executing business strategies under uncertainty

Business-Agent

A simulated business environment where an LLM agent autonomously sets prices, allocates marketing budget, and manages inventory. It uses Monte Carlo tree search (MCTS) to plan multi-week strategies and Bayesian belief tracking to learn demand curves from noisy observations.

Motivation

Inspired by the challenge of AI systems that autonomously plan and execute business strategies under uncertainty.

Experiment with agentic AI for business decisions in a simulated business environment
Try MCTS + Bayesian estimation for node uncertainty (as in the PlanU paper, https://arxiv.org/pdf/2510.18442) and quantify the advantages
Compare LLM-guided vs. random action proposals for MCTS
Experiment deeply with probabilistic programming tools like PyMC

Summary

Start simple (2 products, no seasonality, no cross-product effects, known costs) and progressively add complexity. This allows demonstrating the value of uncertainty handling at each level.

We expect:

With 2 products and no seasonality, even a simple agent does okay.
Add unknown elasticities and the Bayesian component becomes necessary.
Add seasonality and multi-week inventory lag, and MCTS planning becomes necessary.
Add cross-product substitution effects and you need both working together.

Test methods against baselines at every step to verify critical pieces.

Plan

Approach

The system under test is PlanU-style: the full agent stores quantile distributions on the MCTS nodes, so the search uses distributional value estimates rather than mean values.
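
To make the difference between a mean value and a distributional value concrete, here is a minimal sketch. It is not taken from this repository or from the PlanU paper: the class `DistributionalNode`, the function `select_child`, the nearest-rank quantile rule, and the UCB-style bonus are all illustrative assumptions. It also assumes every child has been visited at least once.

```python
import math
from dataclasses import dataclass, field


@dataclass
class DistributionalNode:
    """Hypothetical tree node that keeps every simulated return, not only the mean."""
    returns: list = field(default_factory=list)

    @property
    def visits(self) -> int:
        return len(self.returns)

    def update(self, simulated_return: float) -> None:
        self.returns.append(simulated_return)

    def mean(self) -> float:
        return sum(self.returns) / len(self.returns)

    def quantile(self, tau: float) -> float:
        """Empirical tau-quantile of the stored returns (nearest-rank method)."""
        ordered = sorted(self.returns)
        rank = max(1, math.ceil(tau * len(ordered)))
        return ordered[rank - 1]


def select_child(children: dict, tau: float, c: float = 1.0) -> str:
    """Pick the child with the best tau-quantile plus a UCB-style exploration bonus."""
    total = sum(node.visits for node in children.values())

    def score(node: DistributionalNode) -> float:
        return node.quantile(tau) + c * math.sqrt(math.log(total) / node.visits)

    return max(children, key=lambda name: score(children[name]))


# Two actions with the same mean return but very different spread.
steady = DistributionalNode([10.0, 10.0, 10.0, 10.0])
risky = DistributionalNode([0.0, 0.0, 20.0, 20.0])
children = {"steady": steady, "risky": risky}
```

Both nodes have the same mean, so a mean-based search cannot tell them apart. Selecting on a high quantile (`tau=0.9`) prefers `risky`, while a low quantile (`tau=0.1`) prefers `steady`.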

The LLM's primary role is proposing candidate actions for MCTS exploration; the mathematical demand model handles state transitions and the particle filter handles uncertainty tracking.
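
The particle filter's role can be sketched as follows for a single product. This is an illustration under stated assumptions, not the repository's implementation: the constant-elasticity demand curve, the Gaussian noise model, the prior ranges, the jitter sizes, and all function names are hypothetical.

```python
import math
import random


def expected_demand(base: float, elasticity: float, price: float) -> float:
    """Assumed constant-elasticity demand curve: base * price^(-elasticity)."""
    return base * price ** (-elasticity)


def init_particles(n: int, rng: random.Random) -> list:
    """Draw (base, elasticity) hypotheses from a broad uniform prior (assumed ranges)."""
    return [(rng.uniform(50.0, 150.0), rng.uniform(0.5, 3.0)) for _ in range(n)]


def update(particles, price, observed, noise_sd, rng):
    """One filter step: weight by Gaussian likelihood, resample, then jitter."""
    weights = [
        math.exp(-0.5 * ((observed - expected_demand(b, e, price)) / noise_sd) ** 2)
        for b, e in particles
    ]
    if sum(weights) == 0.0:  # every weight underflowed; fall back to uniform weights
        weights = [1.0] * len(particles)
    resampled = rng.choices(particles, weights=weights, k=len(particles))
    # Small jitter keeps the particle set from collapsing onto a few values.
    return [(b + rng.gauss(0.0, 0.5), e + rng.gauss(0.0, 0.01)) for b, e in resampled]


def posterior_mean(particles):
    n = len(particles)
    return (sum(b for b, _ in particles) / n, sum(e for _, e in particles) / n)


def run_demo(seed: int = 0, weeks: int = 30):
    """Simulate noisy weekly sales and return the filtered (base, elasticity) estimate."""
    rng = random.Random(seed)
    true_base, true_elasticity, noise_sd = 100.0, 1.5, 5.0
    particles = init_particles(2000, rng)
    for week in range(weeks):
        price = (1.0, 1.5, 2.0)[week % 3]  # vary the price so elasticity is identifiable
        noise = rng.gauss(0.0, noise_sd)
        observed = expected_demand(true_base, true_elasticity, price) + noise
        particles = update(particles, price, observed, noise_sd, rng)
    return posterior_mean(particles)
```

Calling `run_demo()` returns the posterior mean of the two parameters after 30 simulated weeks. The price is cycled through three levels because observations at a single price cannot separate base demand from elasticity.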

Phases

For each phase, the table below lists the least complex inference and planning methods expected to maximize performance.

Phase	Complexity	Unknown parameters	Inference Method	Planning Method
1. Basics	2 products, fixed costs, no seasonality, no inventory constraint	4 (base demand + elasticity per product)	Grid approximation	Greedy 1-step optimization (no MCTS)
2. Add marketing	2 products, marketing budget allocation	6 (+ marketing response per product)	Grid approximation or Conjugate priors	Greedy 1-step optimization (no MCTS)
3. Add seasonality	2–3 products, seasonality, inventory lag, stockout penalties	14–21 (+ 4 seasonal multipliers per product, for 2–3 products)	Particle filtering	MCTS
4. Cross-product effects	4–5 products, substitution/complementarity, seasonality, inventory lag	40–55 (all previous + cross-elasticity per product pair, for 4–5 products)	Particle filtering; variational inference	MCTS (with wider branching)
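
The parameter counts in the table can be reproduced with a short calculation. The helper `unknown_parameters` is hypothetical, and one reading is an inference rather than something the table states: the Phase 4 range of 40–55 works out only if cross-elasticities are counted per ordered product pair (A→B and B→A separately).

```python
def unknown_parameters(n_products: int,
                       marketing: bool = False,
                       seasonal_multipliers: int = 0,
                       cross_effects: bool = False) -> int:
    """Count unknown demand parameters for a given phase configuration."""
    per_product = 2  # base demand + price elasticity
    if marketing:
        per_product += 1  # marketing response
    per_product += seasonal_multipliers
    total = n_products * per_product
    if cross_effects:
        # Assumption: one cross-elasticity per ordered pair of distinct products.
        total += n_products * (n_products - 1)
    return total


phase_counts = {
    "phase1": unknown_parameters(2),
    "phase2": unknown_parameters(2, marketing=True),
    "phase3": [unknown_parameters(n, marketing=True, seasonal_multipliers=4)
               for n in (2, 3)],
    "phase4": [unknown_parameters(n, marketing=True, seasonal_multipliers=4,
                                  cross_effects=True)
               for n in (4, 5)],
}
```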

Baselines

The full agent uses MCTS planning + Bayesian posteriors (particle filter) + LLM-guided action proposals. To isolate the contribution of each component, baselines vary one axis at a time across three dimensions: planning horizon, uncertainty handling, and action proposal method.

Additionally, a fixed heuristic baseline (set all markups to 1.5×, split marketing budget equally, reorder inventory when stock drops below a threshold) serves as a "no ML at all" reference point.
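
As a rough sketch, the heuristic baseline could look like the following. The 1.5× markup and the equal marketing split come from the description above; the class name, the action format, and the values of `reorder_threshold` and `reorder_quantity` are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class HeuristicPolicy:
    """Fixed rule-based policy with no learning and no planning."""
    markup: float = 1.5
    reorder_threshold: int = 20  # hypothetical value
    reorder_quantity: int = 50   # hypothetical value

    def act(self, unit_costs: dict, stock: dict, marketing_budget: float) -> dict:
        n_products = len(unit_costs)
        return {
            # Price every product at a fixed markup over its unit cost.
            "prices": {p: self.markup * cost for p, cost in unit_costs.items()},
            # Split the marketing budget equally across products.
            "marketing": {p: marketing_budget / n_products for p in unit_costs},
            # Reorder a fixed quantity only when stock is below the threshold.
            "orders": {
                p: self.reorder_quantity if stock[p] < self.reorder_threshold else 0
                for p in unit_costs
            },
        }


action = HeuristicPolicy().act(
    unit_costs={"A": 10.0, "B": 4.0},
    stock={"A": 5, "B": 40},
    marketing_budget=100.0,
)
```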

#	Planning	Uncertainty	Action Proposals	What It Tests
B1	Greedy	Point estimate	Random	Floor baseline — no intelligence
B2	Greedy	Point estimate	LLM-guided	Value of LLM alone (no planning, no Bayesian)
B3	Greedy	Bayesian	LLM-guided	Value of Bayesian alone (no planning)
B4	MCTS	Point estimate	LLM-guided	Value of planning alone (no Bayesian)
B5	MCTS	Bayesian	Random	Value of MCTS + Bayesian without LLM guidance
Full	MCTS	Bayesian	LLM-guided	Everything together

"Bayesian" baselines use grid approximation in Phase 1–2 and particle filtering in Phase 3–4, matching the inference method from the Phases table.

Each comparison below isolates exactly one variable:

Comparison	Variable Isolated	Question Answered
B3 vs. Full	Greedy → MCTS	Does multi-step planning help?
B4 vs. Full	Point estimate → Bayesian	Does uncertainty handling help?
B5 vs. Full	Random → LLM-guided	Does the LLM as policy prior help?
B2 vs. B3	Point estimate → Bayesian (both greedy)	Does Bayesian help even without planning?
B2 vs. B4	Greedy → MCTS (both point estimate)	Does planning help even without Bayesian?
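
The one-variable-at-a-time property can be checked mechanically. The sketch below encodes the baseline table as data and lists which axes differ for each comparison; the names `AGENTS`, `AXES`, and `differing_axes` are illustrative and not taken from the repository.

```python
AXES = ("planning", "uncertainty", "proposals")

# Agent configurations copied from the baselines table.
AGENTS = {
    "B1": ("Greedy", "Point estimate", "Random"),
    "B2": ("Greedy", "Point estimate", "LLM-guided"),
    "B3": ("Greedy", "Bayesian", "LLM-guided"),
    "B4": ("MCTS", "Point estimate", "LLM-guided"),
    "B5": ("MCTS", "Bayesian", "Random"),
    "Full": ("MCTS", "Bayesian", "LLM-guided"),
}

COMPARISONS = [("B3", "Full"), ("B4", "Full"), ("B5", "Full"),
               ("B2", "B3"), ("B2", "B4")]


def differing_axes(a: str, b: str) -> list:
    """Return the axes on which two agent configurations differ."""
    return [axis for axis, x, y in zip(AXES, AGENTS[a], AGENTS[b]) if x != y]


isolated = {pair: differing_axes(*pair) for pair in COMPARISONS}
```

Every entry of `isolated` contains a single axis, which matches the "Variable Isolated" column. B1 vs. B2 is not in the table, but by the same check it differs only in the proposal method.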

Expectations

In Phase 1–2 (no inventory lag, no seasonality), the greedy baselines (B1–B3) should be competitive with or match the MCTS agents (B4–Full); that is, MCTS is not expected to show a significant advantage over greedy. The chart would show all the intelligent agents clustered together, well above random and heuristic. The story here is: "planning doesn't help much when decisions are independent."
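
For reference, a Phase 1 style agent (grid approximation plus greedy one-step pricing, as listed in the Phases table) can be sketched for one product as below. The constant-elasticity demand curve, the Gaussian noise model, the flat prior, the grid ranges, and all function names are assumptions made for illustration.

```python
import math


def demand(base, elasticity, price):
    """Assumed constant-elasticity demand curve."""
    return base * price ** (-elasticity)


def grid_posterior(observations, bases, elasticities, noise_sd):
    """Posterior over a (base, elasticity) grid with a flat prior and Gaussian noise."""
    log_post = {}
    for b in bases:
        for e in elasticities:
            log_post[(b, e)] = sum(
                -0.5 * ((d - demand(b, e, p)) / noise_sd) ** 2
                for p, d in observations
            )
    peak = max(log_post.values())  # subtract the peak to avoid underflow
    unnorm = {k: math.exp(v - peak) for k, v in log_post.items()}
    z = sum(unnorm.values())
    return {k: v / z for k, v in unnorm.items()}


def greedy_price(posterior, candidate_prices, unit_cost):
    """Pick the price with the highest posterior-expected one-step profit."""
    def expected_profit(price):
        return sum(w * (price - unit_cost) * demand(b, e, price)
                   for (b, e), w in posterior.items())
    return max(candidate_prices, key=expected_profit)


bases = [80 + 5 * i for i in range(9)]             # 80, 85, ..., 120
elasticities = [0.5 + 0.25 * i for i in range(9)]  # 0.5, 0.75, ..., 2.5
# Noise-free observations from a "true" curve with base 100 and elasticity 1.5.
observations = [(p, demand(100, 1.5, p)) for p in (1.0, 2.0, 4.0)]
posterior = grid_posterior(observations, bases, elasticities, noise_sd=1.0)
map_estimate = max(posterior, key=posterior.get)
best_price = greedy_price(posterior, [1.5, 2.0, 2.5, 3.0, 3.5, 4.0], unit_cost=1.0)
```

With these noise-free observations the posterior concentrates on the true grid point, and the greedy choice is a price of 3.0. That agrees with the closed-form optimum for this demand curve, price = cost × elasticity / (elasticity − 1). No lookahead is involved, which is the sense in which decisions are independent in these phases.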

In Phase 3–4 (inventory lag + seasonality), the separation is expected to appear. Greedy agents (B1–B2) would show periodic profit crashes: they get caught by stockouts when demand spikes seasonally because they do not order inventory in advance. B3 is expected to fare slightly better because it can learn the demand parameters. The MCTS agents would show smoother, higher cumulative profit because they anticipate the demand shift. The expected ordering is: MCTS+Bayesian > MCTS-only > Greedy+Bayesian > Greedy > Heuristic > Random.
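
A toy simulation shows why delivery lag penalizes an agent that only reacts to current demand. It is a deliberately simplified sketch, not the repository's environment: the demand series, the two-week lead time, and both ordering rules are hypothetical. The lookahead rule is given perfect knowledge of the seasonal pattern, standing in for what a planner would estimate.

```python
def lost_sales(demand, lead_time, order_rule):
    """Simulate weekly sales with delayed deliveries; return total unmet demand."""
    stock = 0
    pipeline = list(demand[:lead_time])  # deliveries already on the way at the start
    lost = 0
    for week, wanted in enumerate(demand):
        stock += pipeline.pop(0)         # this week's delivery arrives
        sold = min(stock, wanted)
        lost += wanted - sold
        stock -= sold
        # An order placed this week arrives lead_time weeks later.
        pipeline.append(order_rule(week, demand, lead_time))
    return lost


def reactive(week, demand, lead_time):
    """Order what is being demanded right now (no anticipation)."""
    return demand[week]


def lookahead(week, demand, lead_time):
    """Order what will be demanded when the delivery arrives."""
    arrival = week + lead_time
    return demand[arrival] if arrival < len(demand) else 0


seasonal_demand = [10, 10, 10, 10, 30, 30, 10, 10]  # two-week seasonal spike
reactive_lost = lost_sales(seasonal_demand, lead_time=2, order_rule=reactive)
lookahead_lost = lost_sales(seasonal_demand, lead_time=2, order_rule=lookahead)
```

In this toy setup the reactive rule loses 40 units of sales during the spike, while the lookahead rule loses none.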

The most interesting comparison is Bayesian-without-planning (B3) vs. planning-without-Bayesian (B4), because which one wins is expected to depend on the scenario. Bayesian-without-planning should excel when the main challenge is not knowing the demand parameters (early in the simulation, when uncertainty is high). Planning-without-Bayesian should excel when the parameters are roughly known and the main challenge is sequential dependencies (inventory lag, seasonality). Planning-with-Bayesian is expected to outperform both.

Folders and files (42 commits)

inference
planning
proposals
tests
PROJECT_SUMMARY_2026-04-24.md
README.md
business_environment.py
conftest.py
experiment_runner.py
forward_model.py
libraries.md
phase5_experiments_results.md
plan_2026-04-16_2303.md
plan_phase5_2026-04-24.md
requirements.txt
run_experiment.py
run_h1_evaluation.py
run_h2_evaluation.py
run_tests.sh
smoke_test_phase5.py
