A Monte Carlo simulation study exploring whether "boom-or-bust" or consistent batsmen are more valuable in Test cricket.
If two batsmen have the same batting average, but one is high-variance (more ducks AND more centuries) and one is consistent (steady scores), which is more valuable to a Test cricket team?
| Strategy | Win Rate | Draw Rate | Loss Rate |
|---|---|---|---|
| All Consistent | 53.8% | ~8% | 38.2% |
| High Variance Openers Only | ~52% | ~8% | ~40% |
| All High Variance | 44.8% | ~7% | 48.2% |
Bottom line: Consistency wins. High-variance batting creates collapse risk that isn't compensated by occasional big scores.
Every delivery is simulated individually. The base scoring distribution (before normalization):
| Outcome | Base Weight |
|---|---|
| Dot ball | 0.650 |
| Single | 0.220 |
| Two | 0.070 |
| Four | 0.040 |
| Six | 0.020 |
| Wicket | Dynamic |
For high-variance batsmen, the wicket probability changes with balls faced - this is the core innovation.
After calculating the dynamic wicket probability, all outcomes are normalized to sum to 100%. This means when wicket probability is higher (early in an innings for high-variance batsmen), scoring shot probabilities are slightly reduced proportionally.
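The normalization step can be sketched as follows. This is a minimal illustration, not the repository's code: the names `BASE_WEIGHTS`, `outcome_probabilities`, and `sample_ball` are hypothetical, and the sketch assumes the scoring outcomes are scaled to fill the `1 - p_wicket` share of each ball.

```python
import random

# Base weights from the table above (they already sum to 1.0).
BASE_WEIGHTS = {
    "dot": 0.650,
    "single": 0.220,
    "two": 0.070,
    "four": 0.040,
    "six": 0.020,
}


def outcome_probabilities(p_wicket):
    """Return a full outcome distribution that sums to 1.0.

    Scoring outcomes are scaled proportionally so that together they
    take up the (1 - p_wicket) share left over after the wicket.
    """
    total = sum(BASE_WEIGHTS.values())
    scale = (1.0 - p_wicket) / total
    probs = {name: weight * scale for name, weight in BASE_WEIGHTS.items()}
    probs["wicket"] = p_wicket
    return probs


def sample_ball(p_wicket, rng=random):
    """Draw one delivery outcome from the normalized distribution."""
    probs = outcome_probabilities(p_wicket)
    names = list(probs)
    return rng.choices(names, weights=[probs[n] for n in names], k=1)[0]
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible.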
Both batsman types produce identical expected runs per dismissal (~38 runs for openers/middle order):
LOW VARIANCE (Consistent):
p(wicket) = 1.68% constant per ball
Result: Steady scores, few ducks, few centuries
HIGH VARIANCE (Boom-or-Bust):
p(wicket) = 0.75% + (3.4% - 0.75%) * e^(-0.025 * balls_faced)
At ball 0: 3.4% (very risky - often out early)
At ball 50: 1.5% (roughly average risk)
At ball 100: 1.0% (safe when "set")
Result: More ducks AND more centuries
Both are calibrated using survival analysis to have the same expected batting average.
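One way to check this calibration is to sum expected runs over each batsman's survival curve. The sketch below assumes 0.64 expected runs per survived ball (derived from the base weights) and the proportional scaling described earlier; the function names are hypothetical and the repository may calibrate differently. Under these assumptions both averages should land near the ~38 quoted above, without being exactly equal.

```python
import math

# Expected runs on a ball that is not a wicket, from the base weights.
RUNS_PER_SURVIVED_BALL = 0.22 * 1 + 0.07 * 2 + 0.04 * 4 + 0.02 * 6


def p_wicket_low(balls_faced):
    """Consistent batsman: constant 1.68% dismissal chance per ball."""
    return 0.0168


def p_wicket_high(balls_faced):
    """Boom-or-bust batsman: 3.4% at ball 0, decaying towards 0.75%."""
    return 0.0075 + (0.034 - 0.0075) * math.exp(-0.025 * balls_faced)


def expected_average(p_wicket, max_balls=5000):
    """Expected runs per dismissal, summed over the survival curve."""
    survival = 1.0  # probability the batsman is still in to face this ball
    runs = 0.0
    for ball in range(max_balls):
        p_out = p_wicket(ball)
        # Scoring outcomes occupy the (1 - p_out) share of this ball.
        runs += survival * (1.0 - p_out) * RUNS_PER_SURVIVED_BALL
        survival *= 1.0 - p_out
    return runs


def duck_probability(p_wicket, max_balls=5000):
    """Probability of being dismissed before scoring a run."""
    scoreless = 1.0  # probability of still being in, on 0, before this ball
    duck = 0.0
    for ball in range(max_balls):
        p_out = p_wicket(ball)
        duck += scoreless * p_out
        scoreless *= (1.0 - p_out) * 0.650  # survived and played a dot ball
    return duck


if __name__ == "__main__":
    for label, fn in [("low", p_wicket_low), ("high", p_wicket_high)]:
        print(label, round(expected_average(fn), 1), round(duck_probability(fn), 3))
```

The duck probability gives a direct view of the "more ducks" claim: the high-variance hazard is roughly double the constant one over the first few balls, so its duck rate comes out higher.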
Example normalization:
- High-variance batsman at ball 0: wicket = 3.4%, so scoring outcomes scaled to 96.6%
- High-variance batsman at ball 100: wicket = 1.0%, so scoring outcomes scaled to 99.0%
- Low-variance batsman (any ball): wicket = 1.68%, so scoring outcomes scaled to 98.3%
A full Test match simulation includes:
- Four innings - Team 1 bats, Team 2 bats, Team 1 bats again, Team 2 chases
- Weather variation - 60% good (400-450 overs), 25% moderate (320-400), 15% poor (250-320)
- Pitch deterioration - Later innings are harder to bat:

  | Innings | Wicket Multiplier | Effect |
  |---|---|---|
  | 1st | 1.00x | Fresh pitch |
  | 2nd | 1.10x | Slightly worn |
  | 3rd | 1.22x | Worn |
  | 4th | 1.45x | Difficult chase |

- Declarations - Teams can declare when:
  - 1st innings: 350+ runs with 6+ wickets down, or 450+ runs
  - 3rd innings: Lead of 250+ with 120+ overs remaining
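The match conditions above can be sketched as three small helpers. The names are hypothetical, and two details are assumptions rather than documented behavior: match length is drawn uniformly within the chosen weather band, and the pitch multiplier is applied directly to the per-ball wicket probability.

```python
import random

# Wicket-probability multipliers by innings number (from the table above).
PITCH_MULTIPLIER = {1: 1.00, 2: 1.10, 3: 1.22, 4: 1.45}

# (probability, min overs, max overs) for each weather band.
WEATHER_BANDS = [
    (0.60, 400, 450),  # good
    (0.25, 320, 400),  # moderate
    (0.15, 250, 320),  # poor
]


def sample_match_overs(rng=random):
    """Pick a weather band by its probability, then a match length in overs."""
    weights = [band[0] for band in WEATHER_BANDS]
    _, low, high = rng.choices(WEATHER_BANDS, weights=weights, k=1)[0]
    return rng.randint(low, high)  # assumption: uniform within the band


def adjusted_wicket_probability(p_wicket, innings):
    """Scale the per-ball wicket probability for pitch wear."""
    return p_wicket * PITCH_MULTIPLIER[innings]


def should_declare(innings, runs, wickets, lead, overs_remaining):
    """Apply the two declaration rules listed above."""
    if innings == 1:
        return (runs >= 350 and wickets >= 6) or runs >= 450
    if innings == 3:
        return lead >= 250 and overs_remaining >= 120
    return False
```

For example, a consistent batsman's 1.68% per-ball wicket probability becomes `0.0168 * 1.45`, about 2.44%, in the fourth innings.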
| Outcome | Condition |
|---|---|
| Team 1 wins | Team 2 all out in 4th innings before reaching target |
| Team 2 wins | Team 2 reaches target in 4th innings |
| Draw | Time runs out (max overs reached) before result |
Draw rate is calibrated to ~8-10%, matching modern Test cricket.
Each team has 11 players with different base averages:
| Position | Players | Expected Average | Role |
|---|---|---|---|
| Openers | 2 | ~38 runs | Face new ball |
| Middle Order | 4 | ~38 runs | Main run scorers |
| All-rounders | 2 | ~25 runs | Balance bat/bowl |
| Tail | 3 | ~13 runs | Primarily bowlers |
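The lineup can be represented as a simple list of records. The sketch below also shows how a target average could be turned into a constant per-ball wicket probability for a consistent batsman. That closed form is an assumption derived from the normalization described earlier (0.64 expected runs per survived ball), not something taken from the repository; for an average of 38 it gives about 1.66%, close to the 1.68% quoted above. `Batsman`, `build_lineup`, and `constant_wicket_probability` are hypothetical names.

```python
from dataclasses import dataclass

RUNS_PER_SURVIVED_BALL = 0.64  # expected runs from the base weights, given no wicket


@dataclass(frozen=True)
class Batsman:
    position: int            # 1-11 in the batting order
    role: str
    expected_average: float


def build_lineup():
    """Return the 11-player batting order from the table above."""
    roles = (
        [("Opener", 38.0)] * 2
        + [("Middle Order", 38.0)] * 4
        + [("All-rounder", 25.0)] * 2
        + [("Tail", 13.0)] * 3
    )
    return [Batsman(i + 1, role, avg) for i, (role, avg) in enumerate(roles)]


def constant_wicket_probability(expected_average):
    """Per-ball wicket probability giving this average for a consistent batsman.

    Solves average = RUNS_PER_SURVIVED_BALL * (1 - p) / p for p.
    """
    return RUNS_PER_SURVIVED_BALL / (expected_average + RUNS_PER_SURVIVED_BALL)
```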
We tested 25 different variance allocation strategies:
Uniform strategies:
- All Very Low - Everyone consistent (decay=0.0)
- All High - Everyone boom-or-bust (decay=0.85)

Opener-focused:
- Both Openers High - Aggressive openers, consistent rest
- Aggro + Anchor Open - One aggressive, one steady opener

Position-based:
- Solid Middle - Consistent middle order, aggressive ends
- Swinging Tail - Aggressive tail-enders
- Top Heavy Gradient - Variance decreases down the order

Mixed:
- Alternating Pair - Alternate high/low through lineup
- Explosive Start - Top 3 aggressive, rest consistent
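A strategy can be thought of as one decay value per batting position. The sketch below encodes a few of the strategies listed above using only the two decay values the source quotes (0.0 and 0.85). Whether the real experiment uses intermediate values, which opener is the aggressor in "Aggro + Anchor Open", and whether "Alternating Pair" starts high are all assumptions here, and the names `STRATEGIES` and `check_strategies` are hypothetical.

```python
LOW, HIGH = 0.0, 0.85  # decay values quoted for "All Very Low" and "All High"
LINEUP_SIZE = 11

# One decay value per batting position, openers first.
STRATEGIES = {
    "All Very Low": [LOW] * 11,
    "All High": [HIGH] * 11,
    "Both Openers High": [HIGH] * 2 + [LOW] * 9,
    # Assumption: the first opener is the aggressive one.
    "Aggro + Anchor Open": [HIGH, LOW] + [LOW] * 9,
    # The tail is the last three positions.
    "Swinging Tail": [LOW] * 8 + [HIGH] * 3,
    # Assumption: the alternation starts with a high-variance opener.
    "Alternating Pair": [HIGH if i % 2 == 0 else LOW for i in range(11)],
    "Explosive Start": [HIGH] * 3 + [LOW] * 8,
}


def check_strategies(strategies):
    """Raise ValueError if a strategy is malformed."""
    for name, decays in strategies.items():
        if len(decays) != LINEUP_SIZE:
            raise ValueError(f"{name}: expected {LINEUP_SIZE} values, got {len(decays)}")
        if any(not LOW <= d <= HIGH for d in decays):
            raise ValueError(f"{name}: decay outside [{LOW}, {HIGH}]")


check_strategies(STRATEGIES)
```

Keeping strategies as plain lists makes it easy to add the remaining ones, such as a gradient that steps down from `HIGH` to `LOW` across the order.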
| Rank | Strategy | Win Rate | Key Insight |
|---|---|---|---|
| 1 | All Very Low | 51.7% | Consistency is optimal |
| 2 | Explosive Start | 50.5% | Top 3 aggressive works |
| 3 | Mixed Middle | 50.3% | Alternating in middle |
| ... | ... | ... | ... |
| 21 | All High | 45.0% | Too much variance hurts |
| 25 | Both Openers Low | 43.4% | Worst strategy |
| Matchup | Result | Z-score | Significant? |
|---|---|---|---|
| All Very Low vs Both Openers High | 142-143 | 0.12 | NO (tied) |
| All Very Low vs Explosive Start | 161-120 | 4.73 | YES |
| All Very Low vs All High | 157-132 | 2.89 | YES |
| Both Openers High vs All High | 154-125 | 3.35 | YES |
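The source does not state how these z-scores were computed. The sketch below shows one formula that reproduces all four values in the table, under the assumption (inferred from the numbers, not documented) that each matchup ran 300 matches and that the win difference is divided by `sqrt(matches * 0.25)`. A second, more conservative function that conditions on decisive matches only is included for comparison. Both function names and the 1.96 threshold are hypothetical choices.

```python
import math


def table_z_score(wins_a, wins_b, matches=300):
    """Win difference divided by sqrt(matches * 0.25).

    Assumption: formula and matches=300 are inferred from the table values.
    """
    return abs(wins_a - wins_b) / math.sqrt(matches * 0.25)


def decisive_z_score(wins_a, wins_b):
    """Sign-test style z-score using only matches that produced a winner.

    Under a 50/50 null, the win difference over n decisive matches has
    standard deviation sqrt(n).
    """
    return abs(wins_a - wins_b) / math.sqrt(wins_a + wins_b)


def is_significant(z, threshold=1.96):
    """Two-sided test at roughly the 5% level."""
    return abs(z) > threshold


if __name__ == "__main__":
    for a, b in [(142, 143), (161, 120), (157, 132), (154, 125)]:
        print(a, b, round(table_z_score(a, b), 2), round(decisive_z_score(a, b), 2))
```

The decisive-match version yields values roughly half as large, so the choice of test matters when a result sits near the significance threshold.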
High variance teams experience significantly more collapses:
| Team Type | Collapse Rate (<150 all out) |
|---|---|
| All Low Variance | ~7% |
| All High Variance | ~18% |
The centuries from high-variance batsmen don't compensate for the collapses.
- Collapse Risk: High variance means more early dismissals. When multiple batsmen fail early in the same innings, you get a collapse that's very hard to recover from.
- Test Cricket is Long: Unlike T20/ODI, there's no "required rate" pressure in most situations. Steady accumulation works.
- Compounding Effect: One collapse in a 4-innings match can lose you the game, even if other innings went well.
- "Getting Set" Comes Too Late: By the time a high-variance batsman becomes safe (50+ balls), a consistent batsman has scored similar runs with less risk.
Interestingly, having just the two openers be high-variance (with a consistent middle/lower order) performs about as well as an all-consistent team: their head-to-head result (142-143) is a statistical tie. This is because:
- Openers face the new ball anyway (inherently risky)
- If they survive, they score big; if not, consistent middle order rescues
- The consistent tail prevents collapses
```bash
# Clone the repository
git clone https://github.com/rbpilgrim/cricket_variance_simulation.git
cd cricket_variance_simulation

# Run the main experiment
python team_composition_experiment.py

# Run head-to-head analysis
python analyze_matchups.py

# Or open the Jupyter notebook
jupyter notebook cricket_variance_analysis.ipynb
```

| File | Description |
|---|---|
| `ball_by_ball_simulation.py` | Core simulation engine - ball outcomes, innings, matches |
| `team_composition_experiment.py` | 25-strategy round-robin tournament |
| `analyze_matchups.py` | Head-to-head statistical analysis |
| `analyze_match_details.py` | Detailed match statistics and collapse analysis |
| `variance_exchange_rate.py` | Quantifies variance value in runs |
| `cricket_variance_analysis.ipynb` | Self-contained Colab notebook |
| `RESEARCH_NOTE.md` | Full research writeup with methodology |
- Simplified bowling: All bowlers treated equally (no swing, spin, pace variation)
- No player matchups: Real cricket has bowler-batsman specific interactions
- Fixed conditions: Same pitch model for all matches (no spinning tracks, green seamers)
- Basic declarations: Real captains use more nuanced decision-making
- No follow-on: The follow-on rule is not implemented
- Model different pitch types (spinning, seaming, flat)
- Add bowler skill variation
- Implement follow-on rule
- Model batting partnerships (not just individuals)
- Analyze specific match situations (chasing 300+ in 4th innings)
MIT
Built with Claude Code