Summary
When entity history data is recorded at intervals greater than 1 minute (e.g. 5-minute intervals), fill_load_from_power in fetch.py double-counts energy, inflating load predictions by approximately 1.3-1.6x actual consumption.
This results in:
- predbat.load_energy predicting significantly more than actual consumption
- Inflated savings_yesterday_predbat values
- Cumulative savings_total_predbat growing too fast
Root Cause
Data Sparsity
When history data is recorded at 5-minute intervals (rather than HA's sub-second state change resolution), clean_incrementing_reverse produces a dict with entries at only ~9% of minute positions. The remaining ~91% of minutes return 0 from dict.get(minute, 0).
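The effect of the default value can be seen in a minimal sketch. The data below is hypothetical (one sample every 10 minutes over an hour), and the helper name coverage is illustrative, not part of PredBat:

```python
# Hypothetical cumulative readings (kWh), one sample every 10 minutes.
sparse = {minute: 5.0 + 0.01 * minute for minute in range(0, 60, 10)}


def coverage(data, total_minutes):
    """Return the fraction of minute indices that hold a real sample."""
    present = sum(1 for minute in range(total_minutes) if minute in data)
    return present / total_minutes


# Minutes that read as 0 only because of the dict.get default;
# no zero was ever recorded in the data.
apparent_zeros = [m for m in range(60) if sparse.get(m, 0) == 0]

print(f"coverage: {coverage(sparse, 60):.0%}")
print(f"minutes that look like zero: {len(apparent_zeros)} of 60")
```

None of the recorded values is zero, yet every unsampled minute is indistinguishable from a genuine zero reading once dict.get(minute, 0) is used.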
Double-Counting Mechanism
Phase 1 of fill_load_from_power (fetch.py:246-305):
- Iterates minute-by-minute looking for "zero periods" (consecutive minutes with value 0)
- The gap minutes between real data points appear as zero-value sequences
- Any gap ≥ gap_size (default 5 minutes) is detected as a "zero period"
- Phase 1 integrates power data for these gaps and ADDS it to the cumulative values (line 302)
- It then bumps up ALL more-recent minutes (lines 304-305), inflating the cumulative curve
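The detection step described above can be modelled with a short sketch. This is a simplified illustration of the behaviour as described in this issue, not the code in fetch.py; the function name find_zero_periods and the sample data are assumptions:

```python
def find_zero_periods(data, total_minutes, gap_size=5):
    """Return (start, length) runs where data.get(minute, 0) == 0
    for at least gap_size consecutive minutes."""
    periods = []
    run_start = None
    # Iterate one step past the end so a run reaching the end is closed.
    for minute in range(total_minutes + 1):
        is_zero = minute < total_minutes and data.get(minute, 0) == 0
        if is_zero:
            if run_start is None:
                run_start = minute
        elif run_start is not None:
            length = minute - run_start
            if length >= gap_size:
                periods.append((run_start, length))
            run_start = None
    return periods


# Hypothetical data: same non-zero readings, sampled sparsely and densely.
sparse = {m: 5.0 + 0.01 * m for m in range(0, 60, 10)}
dense = {m: 5.0 + 0.01 * m for m in range(60)}

print(find_zero_periods(sparse, 60))  # every inter-sample gap is reported
print(find_zero_periods(dense, 60))   # nothing is reported
```

With the sparse input, each 9-minute gap between samples exceeds gap_size and is reported as a zero period, even though no zero value was recorded.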
Phase 2 (fetch.py:307-357):
- Divides data into 30-minute windows
- Reads cumulative values at start/end: load_total = load_at_start - load_at_end
- load_at_start is already inflated by Phase 1
- Scales power data to match the inflated total → energy counted twice
Why Standard HA Installs Are Unaffected
HA records state changes at sub-second resolution, producing dense minute-level data. fill_load_from_power finds no significant "zero periods" to misdetect, so Phase 2 works correctly.
Who Is Affected
Any deployment where entity history is stored at intervals greater than ~5 minutes (e.g. external databases with downsampled data, custom history backends).
Evidence
Example user on March 10, 2026:
- consumption_today: 111 data points over 1200 minutes (9.25% coverage)
- consumption_power: 193 data points over 1200 minutes
- Yesterday's actual consumption: 19.7 kWh
- PredBat predicted load: 51.6 kWh / 48h = 25.8 kWh/day (1.31x actual)
- load_energy_actual tracking at 12.9 kWh at 19:50 (extrapolated ~15.6 kWh/day)
- Energy balance verified: GivEnergy API consumption field correctly excludes battery charging
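The derived figures in this list can be rechecked with a few lines of arithmetic. The inputs are the numbers reported above; the variable names are illustrative:

```python
# Inputs taken from the evidence list.
points, window_minutes = 111, 1200
predicted_48h_kwh = 51.6
actual_yesterday_kwh = 19.7
tracked_kwh, tracked_at_minute = 12.9, 19 * 60 + 50  # 19:50 as minutes since midnight

coverage_pct = 100 * points / window_minutes
predicted_per_day = predicted_48h_kwh / 2
overestimate = predicted_per_day / actual_yesterday_kwh
# Linear extrapolation of the partial-day total to a full 1440-minute day.
extrapolated_day = tracked_kwh * 1440 / tracked_at_minute

print(f"coverage: {coverage_pct:.2f}%")
print(f"predicted: {predicted_per_day:.1f} kWh/day ({overestimate:.2f}x actual)")
print(f"extrapolated actual: {extrapolated_day:.1f} kWh/day")
```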
Suggested Fix
After clean_incrementing_reverse produces sparse cumulative data, linearly interpolate between known data points to fill every minute index before passing to fill_load_from_power. This prevents Phase 1 from misdetecting inter-sample gaps as zero periods.
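A minimal sketch of such an interpolation step follows. It is not a patch against PredBat: the function name interpolate_sparse is hypothetical, and holding the first and last known values flat outside the sampled range is an assumption that would need checking against how fetch.py handles the window edges:

```python
def interpolate_sparse(sparse, total_minutes):
    """Fill every minute index in [0, total_minutes) by linear
    interpolation between the known samples in `sparse`."""
    known = sorted(m for m in sparse if 0 <= m < total_minutes)
    if not known:
        return {}
    filled = {}
    # Before the first sample: hold the first known value (assumption).
    for minute in range(0, known[0]):
        filled[minute] = sparse[known[0]]
    # Between consecutive samples: straight-line interpolation.
    for left, right in zip(known, known[1:]):
        span = right - left
        for minute in range(left, right):
            frac = (minute - left) / span
            filled[minute] = sparse[left] + frac * (sparse[right] - sparse[left])
    # From the last sample onward: hold the last known value (assumption).
    for minute in range(known[-1], total_minutes):
        filled[minute] = sparse[known[-1]]
    return filled


# Hypothetical cumulative readings (kWh) keyed by minute index.
example = {0: 10.0, 10: 8.0, 20: 7.0}
filled = interpolate_sparse(example, 25)
print(filled[5], filled[15], filled[24])
```

After this step every minute index holds a value, so a lookup with a default of 0 no longer manufactures zero readings between samples.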
Alternatively, fill_load_from_power Phase 1 could be made aware of data sparsity by only treating a period as "zero" if there are actual data points with value 0, rather than relying on dict.get(minute, 0) defaults.
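One possible reading of this alternative is sketched below: a run of minutes only counts as a zero period if it contains at least one recorded sample equal to 0, and missing minutes are treated as unknown rather than zero. The function name find_confirmed_zero_periods and the test data are hypothetical, and the real Phase 1 logic may need a different rule:

```python
def find_confirmed_zero_periods(data, total_minutes, gap_size=5):
    """Return (start, length) runs of at least gap_size minutes that contain
    no non-zero sample and at least one recorded sample equal to 0."""
    periods = []
    run_start = None
    zero_samples = 0
    for minute in range(total_minutes + 1):
        # A non-zero sentinel one step past the end closes any open run.
        value = data.get(minute) if minute < total_minutes else 1
        breaks_run = value is not None and value != 0
        if not breaks_run:
            if run_start is None:
                run_start = minute
                zero_samples = 0
            if value is not None:  # a recorded zero, not a missing minute
                zero_samples += 1
        elif run_start is not None:
            length = minute - run_start
            if length >= gap_size and zero_samples > 0:
                periods.append((run_start, length))
            run_start = None
    return periods


# Sparse but never zero: no period should be reported.
sparse_nonzero = {0: 5.0, 10: 5.1, 20: 5.2}
# Sparse with recorded zeros at minutes 10 and 20.
sparse_with_zeros = {0: 5.0, 10: 0, 20: 0, 30: 5.3}

print(find_confirmed_zero_periods(sparse_nonzero, 30))
print(find_confirmed_zero_periods(sparse_with_zeros, 40))
```

In this sketch the missing minutes on either side of a recorded zero are folded into the same run, which is a design choice rather than a requirement; a stricter variant could bound the run by the first and last recorded zero.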
Affected Files
- apps/predbat/fetch.py: fill_load_from_power(), minute_data_load()
- apps/predbat/utils.py: minute_data(), clean_incrementing_reverse()