From fc994148d129f6e8dce5ad399d9d5a03bbbb61f7 Mon Sep 17 00:00:00 2001 From: tiffanychu90 Date: Fri, 1 May 2026 17:06:11 +0000 Subject: [PATCH 1/5] expand readme with goals from report --- rt_predictions/README.md | 27 +++++++++++++++++++-- rt_predictions/chart_utils_for_operators.py | 16 +++++++----- 2 files changed, 35 insertions(+), 8 deletions(-) diff --git a/rt_predictions/README.md b/rt_predictions/README.md index 637f08944..26d12a875 100644 --- a/rt_predictions/README.md +++ b/rt_predictions/README.md @@ -11,7 +11,6 @@ Accurate and reliable information should be provided to transit users for journe * prediction inconsistency - how much are predictions changing from minute to minute as the bus approaches time of arrival? * prediction reliability and accuracy - are these predictions accurate (when compared to our estimated actual time of arrival)? - ## Reliable Prediction Accuracy The prediction is considered **accurate** if it falls within the bounds of this equation: `-60ln(Time to Prediction+1.3) < Prediction Error < 60ln(Time to Prediction+1.5)`. @@ -36,14 +35,31 @@ As the bus approaches each stop, the software is making predictions for when the * follow the prediction, you will **miss** the bus...this is very bad! * we want fewer of these kinds of predictions, and would much rather wait for the bus than to miss it +### Reliable Prediction Accuracy Metrics in Report + +| Goal | Metric Columns | +|---|---| +| Bus Catch Likelihood
75%+ of predictions result in catching the bus | Bus Catch Likelihood
% Early / On-Time / Late Predictions | +| Prediction Error
Closer to zero, or small positive values.
Late predictions = negative values = riders miss bus. | Average prediction error (minutes) | +| Prediction Error Variability
Variability is the interquartile range (IQR = 75th - 25th percentile).
Smaller values = better = more consistent experience for riders using app.

Ex1: 25th percentile = -5 minutes = a quarter of riders get predictions that are 5 or more minutes late.
Ex2: 75th percentile = 3 minutes = a quarter of riders get predictions that are 3 or more minutes early.
Ex3: half of riders get predictions between 5 minutes late and 3 minutes early. | 10th, 20th, ..., 90th percentiles
Variability = IQR = 75th percentile - 25th percentile
Accuracy Loss = 10th percentile / 50th percentile | + ## Availability and Completeness of Predictions * This metric is the easiest to achieve. For starters, having information is better than no information. -* For each instance of scheduled stop arrival, there is complete information if there are at least 2 predictions each minute. +* It measures the completeness _within_ the RT data we are capturing, regardless of coverage gaps _across_ dates. +* For each instance of scheduled stop arrival, RT information is complete with at least 2 predictions each minute (every 30 seconds). * For the 30 minute period before the bus arrives at each stop, each minute is an observation that goes into this calculation (up to 30 observations). * This ensures that we have fairly equal number of observations for each stop and can compare across stops. * We want to avoid having 30 minutes of predictions for the 1st stop and 60 minutes of predictions for the last stop and comparing metrics that have different denominators. +### Availability and Completeness Metrics in Report + +| Goal | Metric Columns | +|---|---| +| 2+ vehicle positions and trip updates messages per minute. | [Trip Updates / Vehicle Positions] Messages per Minute | +| 100% routes are covered by RT, and 75%+ of trips have RT.

Out of scheduled trips, how many trips have RT, regardless of completeness?
Out of scheduled routes, how many routes have at least 1 trip with RT? | [Trip Updates / Vehicle Positions] % Trips,
[Trip Updates / Vehicle Positions] % Routes | +| 90%+ of minutes have predicted arrival information.

How many minutes have at least 2 messages in the 30 minutes before the bus arrives? | % Minutes with 2+ Predictions | + ## Prediction Inconsistency * This metric (also called jitter or wobble) captures another aspect of transit user experience. Any change in prediction is counted, so this metric **only has positive values**, but smaller positive values are better. @@ -51,6 +67,13 @@ As the bus approaches each stop, the software is making predictions for when the * If the prediction is fairly consistent, we would see small spread. * There is [research](https://www.sciencedirect.com/science/article/abs/pii/S0965856416303494) around how transit users perceive wait time, and that users perceive longer wait times than what is actually experienced. Decreasing the perceived wait time by providing real-time information has positive benefits for user experience. +### Prediction Inconsistency Metrics in Report + +| Goal | Metric Columns | +|---|---| +| Less wobbly or jittery predictions, to a point.

Real-time predictions should reflect traffic conditions and convey
updated information to riders, so aiming for zero is not the goal.

Higher = predictions change more = worse rider experience.

Lower = predictions are not fluctuating minute to minute =
riders trust the real-time arrival information. | Prediction Spread (minutes) | +| Lower padding = riders add less time to prevent missing the bus.

Riders adjust their behavior to catch the bus, adding time to compensate
for late predictions.

Late predictions (negative prediction error values) become the
*time a rider adds to make sure they don't miss the bus next time*,
signaling a lack of trust in the information. | Prediction Padding (minutes)
Absolute value of the 5th percentile prediction error. | + ## Master Services Agreement Exhibit H definitions (pg 53 on pdf) diff --git a/rt_predictions/chart_utils_for_operators.py b/rt_predictions/chart_utils_for_operators.py index dc9a818a2..3a0e3cacb 100644 --- a/rt_predictions/chart_utils_for_operators.py +++ b/rt_predictions/chart_utils_for_operators.py @@ -2,10 +2,10 @@ Chart and map functions for operator report. """ -import _color_palette import altair as alt import pandas as pd from great_tables import GT +from gtfs_curator_utils import _color_palette def basic_percentiles_line_chart( @@ -20,7 +20,7 @@ def basic_percentiles_line_chart( """ chart = ( alt.Chart(df) - .mark_line(point=True) + .mark_line(point=True, interpolate="natural") # this one seems to smooth out the curves .encode( x=alt.X(x_col, title="Prediction Error (minutes)"), y=alt.Y("percentile", title="Percentiles", scale=alt.Scale(domain=[0, 100])), @@ -36,7 +36,7 @@ def basic_percentiles_line_chart( return chart -def fig5and6_prediction_error_plots(df: pd.DataFrame) -> alt.Chart: +def fig5and6_prediction_error_plots(df: pd.DataFrame, color_col: str = "day_type") -> alt.Chart: """ Negative and positive prediction error plots are combined side-by-side as 1 chart. @@ -47,14 +47,18 @@ def fig5and6_prediction_error_plots(df: pd.DataFrame) -> alt.Chart: Instead of [10, 20, ....90] for percentiles, it should show [90, 80, ...10]. 
""" # Make legend selectable - selection = alt.selection_point(fields=["day_type"], bind="legend") + selection = alt.selection_point(fields=[color_col], bind="legend") - neg_errors_chart = basic_percentiles_line_chart(df, x_col="neg_prediction_error_minutes").encode( + neg_errors_chart = basic_percentiles_line_chart( + df, x_col="neg_prediction_error_minutes", color_col=color_col + ).encode( opacity=alt.when(selection).then(alt.value(1)).otherwise(alt.value(0.2)), strokeWidth=alt.when(selection).then(alt.value(2)).otherwise(alt.value(1)), ) - pos_errors_chart = basic_percentiles_line_chart(df, x_col="pos_prediction_error_minutes").encode( + pos_errors_chart = basic_percentiles_line_chart( + df, x_col="pos_prediction_error_minutes", color_col=color_col + ).encode( opacity=alt.when(selection).then(alt.value(1)).otherwise(alt.value(0.2)), strokeWidth=alt.when(selection).then(alt.value(2)).otherwise(alt.value(1)), ) From 1cd279c033fcdbfa6eecc66f288a5821865dea64 Mon Sep 17 00:00:00 2001 From: tiffanychu90 Date: Fri, 1 May 2026 17:27:17 +0000 Subject: [PATCH 2/5] expand captions --- rt_predictions/operator_report.ipynb | 24 +- rt_predictions/operator_report.qmd | 541 --------------------------- 2 files changed, 20 insertions(+), 545 deletions(-) delete mode 100644 rt_predictions/operator_report.qmd diff --git a/rt_predictions/operator_report.ipynb b/rt_predictions/operator_report.ipynb index 934dcdfd1..54c3b469f 100644 --- a/rt_predictions/operator_report.ipynb +++ b/rt_predictions/operator_report.ipynb @@ -232,9 +232,19 @@ "source": [ "## General RT Metrics\n", "\n", + "Vehicle positions and trip updates are distinct RT data sources, and each can be paired with GTFS schedule data. \n", + "\n", + "The metrics from the schedule-RT pairing include % of scheduled trips with vehicle positions and % of scheduled trips with trip updates. 
These are calculated the same way across both RT data sources.\n", + "\n", "**Update Availability Goal 1:** 2+ vehicle positions or trip updates messages per minute.\n", "\n", - "**Update Availability Goal 2:** 100% routes are covered by RT, and 75%+ of trips have RT.\n" + "Vehicle positions or trip updates per minute is a measure of completeness *within* the RT data we are capturing, regardless of coverage gaps *across* dates.\n", + "\n", + "**Update Availability Goal 2:** 75%+ of trips have RT and 100% routes are covered by RT.\n", + "\n", + "Out of scheduled trips, how many trips have RT, regardless of completeness? If the trip appeared in RT trip updates with at least 1 message, the trip counts as having RT trip updates (similarly for vehicle positions).\n", + "\n", + "Out of scheduled routes, how many routes have at least 1 trip with RT? If at least 1 trip for that route had RT trip updates, that route counts as having RT trip updates (similarly for vehicle positions)." ] }, { @@ -289,10 +299,14 @@ "source": [ "## Prediction Accuracy Metrics\n", "\n", - "**Update Availability Goal:** 90%+ of minutes has predicted arrival information.\n", + "These metrics are derived entirely from RT trip updates (no comparison with schedule data).\n", + "\n", + "**Update Availability Goal 3:** 90%+ of minutes has predicted arrival information.\n", "\n", "**Bus Catch Likelihood Goal:** 75%+ of predictions result in catching the bus.\n", "\n", + "On-time predictions use [this definition](https://analysis.dds.dot.ca.gov/rt_operator_metrics/#reliable-prediction-accuracy), where predictions 5 minutes before must fall within narrower bounds to be considered on-time, compared to predictions 30 minutes out.\n", + "\n", "**Prediction Error Goal:** Closer to zero or smaller positive values (early predictions). Late predictions = negative values = riders miss bus\n" ] }, @@ -609,9 +623,11 @@ "source": [ "## Route Summary\n", "\n", - "Prediction accuracy varies by routes. 
The routes shown at the top have high variability (high IQRs).\n", + "Prediction accuracy varies by routes. The routes shown at the top have high variability. \n", + "\n", + "Variability is measured by the interquartile range (IQR), which is the difference between the 75th percentile and 25th percentile prediction errors.\n", "\n", - "* **High variability = high IQRs**: local traffic conditions mat confound the prediction algorithm. For these routes, a focus on improving service reliability through additional infrastructure (signal priority, bus lanes), or other transit planning and policies could be explored.\n", + "* **High variability = high IQRs**: local traffic conditions may confound the prediction algorithm. For these routes, a focus on improving service reliability through additional infrastructure (signal priority, bus lanes), or other transit planning and policies could be explored.\n", "\n", "* **Negative 25th percentiles**: riders miss the bus (late predictions). These routes may benefit from service reliability improvements for riders.\n", "\n", diff --git a/rt_predictions/operator_report.qmd b/rt_predictions/operator_report.qmd deleted file mode 100644 index 35e0e8d57..000000000 --- a/rt_predictions/operator_report.qmd +++ /dev/null @@ -1,541 +0,0 @@ ---- -title: '{one_month_formatted} Summary' -jupyter: python3 ---- - -```{python} -%%capture - -import warnings -warnings.filterwarnings("ignore") - -import altair as alt -import branca.colormap as cm -import folium -import pandas as pd - -import calitp_data_analysis.magics -import gt_extras as gte - -from great_tables import GT, md - -import chart_utils_for_operators as chart_utils -import prep_operator_data -import report_utils -import _color_palette -from rt_msa_utils import operator_report_month - -alt.data_transformers.enable("vegafusion") - -one_month = pd.to_datetime(operator_report_month) -``` - -```{python} -#| editable: true -#| slideshow: {slide_type: ''} -#| tags: [parameters] -# parameters 
cell -#name = "Redding Trip Updates" -``` - -```{python} -%%capture_parameters - -date_format = "%b %Y" # gtfs_digest/_new_operator_report_utils.py -one_month_formatted = one_month.strftime(date_format) - -name, one_month_formatted -``` - - -Generally, we want better transit user experience. Specifically, the performance metrics we can derive from GTFS RT Trip Updates distills into the following objectives: - -* Increase prediction reliability and accuracy -* Increase the availability and completeness of GTFS RT -* Decrease the inconsistency and fluctuations of predictions - - -```{python} -operator_cols = ["day_type"] - -schedule_cols = [ - "daily_trips", "daily_service_hours", "n_routes", "n_shapes", - "n_stops", "num_stop_times", "daily_arrivals", 'n_days_schedule_and_rt' -] - -vp_cols = ['vp_messages_per_minute', 'pct_vp_trips', 'pct_vp_routes'] #'daily_vp_trips' -tu_cols =['tu_messages_per_minute', 'pct_tu_trips', 'pct_tu_routes'] #'daily_tu_trips', - -tu_prediction_cols = [ - "bus_catch_likelihood", "pct_tu_complete_minutes", # both are percents - "p25", "p75", "iqr", "p50", - "n_predictions", - "prediction_padding_minutes", "avg_prediction_spread_minutes" -] -``` - -```{python} -def check_counts_across_quartet(df: pd.DataFrame): - url_cols = [f"{s}_base64_url" for s in ["schedule", "vp", "tu"]] - - counts_df = df.groupby(["schedule_name", "vp_name", "tu_name"]).agg({ - **{c: "nunique" for c in url_cols} - }).reset_index() - - # Need to do this for everyone - counts_df["max_count"] = counts_df[url_cols].max(axis=1) - - if counts_df.max_count.iloc[0] > 1: - print("There were multiple entries for each day_type:") - display(counts_df) - - urls_with_most_predictions = ( - df[url_cols + ["n_predictions"]] - .sort_values("n_predictions", ascending=False) - .reset_index(drop=True) - ).head(1)[url_cols] - - df2 = pd.merge( - df, - urls_with_most_predictions, - on = url_cols, - how = "inner" - ) - - return df2 -``` - -```{python} -df = report_utils.import_operator_df( 
- filters = [[ - ("month_first_day", "==", one_month), - ("tu_name", "==", name), - ]], -).pipe( - check_counts_across_quartet -).pipe( - prep_operator_data.merge_in_operator_percentiles -) - -schedule_name = df.schedule_name.iloc[0] -``` - -```{python} -# Set variables for color bars used across maps, route dropdown, and great tables -PREDICTION_ERROR_COLORS =list(_color_palette.PREDICTION_ERROR_COLOR_PALETTE.values()) -PREDICTION_ERROR_INDEX = [-5, -3, -1, 1, 3, 5] -PREDICTION_ERROR_LEGEND_CAPTION = "minutes (negative = late; positive = early)" - -POS_BAR_COLOR = _color_palette.get_color("blueberry") -NEG_BAR_COLOR = _color_palette.get_color("vivid_cerise") -``` - -## Schedule + RT Summary Stats - -```{python} -schedule_table = ( - GT(df[operator_cols + schedule_cols]) - .cols_label( - daily_trips = "Daily Trips", - daily_service_hours = "Daily Service Hours", - n_routes = "# Routes", - n_shapes = "# Shapes", - n_stops = "# Stops", - num_stop_times = "Total Scheduled Arrivals", - daily_arrivals = "Daily Scheduled Arrivals", - n_days_schedule_and_rt = "# days with both RT", - ).fmt_integer( - columns = [ - "daily_trips", "n_routes", "n_shapes", "n_stops", - "num_stop_times", "daily_arrivals", "n_days_schedule_and_rt"] - ).fmt_number( - columns = ["daily_service_hours"], decimals=1 - ).tab_spanner( - label="Schedule", - columns = schedule_cols - ).tab_header( - title = "Schedule + RT Summary Metrics", - subtitle = f"{one_month_formatted}" - ) -) - -chart_utils.format_great_table(schedule_table) -``` - -## General RT Metrics - -**Update Availability Goal 1:** 2+ vehicle positions or trip updates messages per minute. - -**Update Availability Goal 2:** 100% routes are covered by RT, and 75%+ of trips have RT. 
- -```{python} -rt_table = ( - GT(df[operator_cols + vp_cols + tu_cols]) - .cols_label( - tu_messages_per_minute = "Trip Updates per Minute", - pct_tu_trips = "% Trips", - pct_tu_routes = "% Routes", - vp_messages_per_minute = "Vehicle Positions per Minute", - pct_vp_trips = "% Trips", - pct_vp_routes = "% Routes", - ).fmt_number( - columns = ["tu_messages_per_minute", "vp_messages_per_minute"], - decimals=1 - ).fmt_percent( - columns=["pct_tu_trips", "pct_tu_routes", "pct_vp_trips", "pct_vp_routes"], - decimals=1 - ).tab_spanner( - label="Trip Updates", - columns=tu_cols - ).tab_spanner( - label="Vehicle Positions", - columns=vp_cols - ) -) - -chart_utils.format_great_table(rt_table, day_type_grouping = True).pipe( - gte.gt_color_box, - columns=["tu_messages_per_minute", "vp_messages_per_minute"], - palette="Blues", - domain=[1, 3] -).pipe( - gte.gt_hulk_col_numeric, - columns=["pct_tu_trips", "pct_tu_routes", "pct_vp_trips", "pct_vp_routes"], - palette=["#FFEC8B", "#E5F5E0"], #[light goldenrod1, white alyssum (from greens)] - domain=[0, 1], - alpha=0.1 -) -``` - -## Prediction Accuracy Metrics - -**Update Availability Goal:** 90%+ of minutes has predicted arrival information. - -**Bus Catch Likelihood Goal:** 75%+ of predictions result in catching the bus. - -**Prediction Error Goal:** Closer to zero or smaller positive values (early predictions). 
Late predictions = negative values = riders miss bus - -```{python} -table = ( - GT(df[operator_cols + tu_prediction_cols]) - .cols_label( - pct_tu_complete_minutes = "% Minutes with 2+ Predictions", - bus_catch_likelihood = "Bus Catch Likelihood (Early + On-time)", - p50 = "Prediction Error", - avg_prediction_spread_minutes = "Prediction Spread / Wobble", - prediction_padding_minutes = "Prediction Padding", - n_predictions = "# Predictions", - iqr = "Variability" - ).fmt_percent(columns=["bus_catch_likelihood", "pct_tu_complete_minutes"], decimals=1) - .fmt_number(columns=["p50", "avg_prediction_spread_minutes", "prediction_padding_minutes"], decimals=1) - .fmt_integer(columns=["n_predictions"]) - .tab_header(title = f"Trip Update Prediction Accuracy Metrics", - subtitle = "units are in minutes") -).pipe(chart_utils.format_great_table) - -table.pipe( - gte.gt_plt_dumbbell, - col1='p25', - col2='p75', - label = "IQR", - num_decimals=1, - col1_color=_color_palette.get_color("valentino"), - col2_color=_color_palette.get_color("lizard_green"), - width=100, height=50, # check these each time - font_size=8 -).pipe( - gte.gt_hulk_col_numeric, - columns=["bus_catch_likelihood", "pct_tu_complete_minutes"], - palette=[_color_palette.get_color("light_goldenrod"), - _color_palette.get_color("pastel_peppermint")], - domain=[0, 1], - alpha=0.1 -).pipe( - gte.gt_color_box, - columns=["p50"], - palette=PREDICTION_ERROR_COLORS, - domain=[-5, 5] -).cols_width( - cases={ - c: "10%" for c in [ - "bus_catch_likelihood", "pct_tu_complete_minutes", - "prediction_padding_minutes", "avg_prediction_spread_minutes", - "n_predictions", "p50" - ] - } -).cols_move_to_end(columns=["n_predictions"]) -#.pipe(gte.gt_color_box, columns=["iqr"], palette="YlOrRd"), -# maybe IQR doesn't make sense to color, it'll just be ranked by day_type -``` - -## Prediction Error Percentiles - -### Distribution of Prediction Errors - -The 50th percentile is the typical or median rider experience, and it can show 
that, on average, this transit agency is roughly on-time. -* If the 10th percentile is fairly close to the 50th percentile, it means that the transit agency is consistent and reliable in its predictions. -* Extreme values for the 10th percentile would indicate that predictions fluctuate, or, are somewhat unreliable. -* Steeper lines indicate fairly reliable predictions for the rider. - -```{python} -decile_cols = [ - "month_first_day", "day_type", - "schedule_name", "tu_name", - 'pos_prediction_error_sec_array', 'pos_prediction_error_sec_percentile_array', - 'neg_prediction_error_sec_array', 'prediction_error_sec_percentile_array' -] - -operator_deciles_df = report_utils.import_operator_df( - filters = [[ - ("month_first_day", "==", one_month), - ("tu_name", "==", name), - ]], - columns = decile_cols -).pipe(prep_operator_data.operator_deciles_for_chart) -``` - -```{python} -percentile_chart = chart_utils.fig5and6_prediction_error_plots(operator_deciles_df) -percentile_chart -``` - -### Accuracy Loss -Ratio of the 10th to 50th percentiles - -* Newmark's paper on a small sample of transit agencies suggests that the positive prediction errors typically have ratios of 4. -* Late predictions (negative prediction errors) have ratios around 3. -* Steeper lines = less accuracy loss = better - * y-axis is percentile (moving from 10th to 50th percentile is moving from upwards on y-axis) - * x-axis is error (smaller change along x-axis is less accuracy loss). - * less accuracy loss = less change along x-axis, since change along y-axis is constant (10 to 50) = steeper (unintuitive to the normal interpretation of slope!) 
- -```{python} -operator_cols = ["day_type"] -percentile_chart_cols = [ - "pos_p10", "pos_p50", "pos_error_ratio", - "neg_p10", "neg_p50", "neg_error_ratio" -] - -mini_df = df[df.month_first_day == one_month][ - operator_cols + percentile_chart_cols] - -# convert the 10th, 50th percentile columns to minutes -seconds_cols = ["pos_p10", "pos_p50", "neg_p10", "neg_p50"] -mini_df[seconds_cols] = mini_df[seconds_cols].divide(60).round(2) -``` - -```{python} -mini_p10_p50_table = ( - GT(mini_df) - .cols_label( - pos_p10 = "10th percentile ", - neg_p10 = "10th percentile", - pos_p50 = "50th percentile", - neg_p50 = "50th percentile", - pos_error_ratio = "Accuracy Loss", - neg_error_ratio = "Accuracy Loss", - ).fmt_number( - columns=["pos_error_ratio", "neg_error_ratio"], decimals=1 - ).tab_spanner( - label="Early Predictions (Positive Prediction Error)", - columns=["pos_p10", "pos_p50", "pos_error_ratio"] - ).tab_spanner( - label="Late Predictions (Negative Prediction Error)", - columns = ["neg_p10", "neg_p50", "neg_error_ratio"] - ) - .tab_header( - title = "Accuracy Loss = Ratio of 10th to 50th percentile error", - subtitle = "units are in minutes (lower = less accuracy loss)" - ) -).pipe( - gte.gt_color_box, - columns=["pos_p10", "pos_p50", "neg_p10", "neg_p50"], - palette=PREDICTION_ERROR_COLORS, - domain=[-5, 5] -) - -chart_utils.format_great_table(mini_p10_p50_table).pipe( - chart_utils.format_great_table, - day_type_grouping=False -) -``` - -## Route Map by Priority Criteria - -The following layers are available and selectable (if no routes match the criteria, the layer is excluded): - -1. **Average prediction error** (minutes) for all routes -2. Routes with **<90% update completeness** - Providing complete real-time information for all routes is the crucial foundation. -3. **Highly Variable Routes (IQR > 3)** that could benefit from transit-supportive policies (signal priority, bus lanes). 
- The variability in prediction accuracy here could be due to the local traffic conditions. -4. Routes with **Bus Catch Likelihood (early + on-time accuracy < 75%)**, or late predictions 25% of the time. - -```{python} -route_gdf = report_utils.import_route_df( - filters = [[ - ("month_first_day", "==", one_month), - ("schedule_name", "==", schedule_name), - ("day_type", "==", "Weekday") - ]], - columns = [ - "schedule_name", "tu_name", - "route_dir_name", - "avg_prediction_error_minutes", "prediction_error_label", - "pct_tu_complete_minutes", - "iqr", "bus_catch_likelihood", - "geometry" - ] -).drop_duplicates().reset_index(drop=True) -``` - -```{python} -# Set conditions for filtering to pick out priority criteria -condition_completeness = route_gdf.pct_tu_complete_minutes < 0.9 -condition_variability = route_gdf.iqr >= 3 -condition_likelihood = route_gdf.bus_catch_likelihood < 0.75 -``` - -```{python} -m = route_gdf.explore( - "avg_prediction_error_minutes", - tiles = "CartoDB Positron", - name = "All Routes", - cmap = cm.StepColormap( - colors=PREDICTION_ERROR_COLORS, index=PREDICTION_ERROR_INDEX, - vmin=-5, vmax=5, - tick_labels=PREDICTION_ERROR_INDEX, - caption=PREDICTION_ERROR_LEGEND_CAPTION - ), - marker_kwds={"fill": True}, - style_kwds={"opacity": 0.5, "fillOpacity": 0.3} -) -``` - -```{python} -if len(route_gdf[condition_completeness]) > 0: - m = route_gdf[condition_completeness].explore( - "route_dir_name", - m=m, - tiles = "CartoDB Positron", - name = "< 90% update completeness", # color by route-dir name, same as stop report - categorical = True, - legend = False, - ) - -if len(route_gdf[condition_variability]) > 0: - m = route_gdf[condition_variability].explore( - "iqr", - m=m, - tiles = "CartoDB Positron", - name = "High Variability (IQR 3+ minutes) Routes", - categorical = False, - legend = True, - cmap="YlOrRd", - ) - -if len(route_gdf[condition_likelihood]) > 0: - m = route_gdf[condition_likelihood].explore( - "bus_catch_likelihood", - m=m, - 
tiles = "CartoDB Positron", - name = "<75% Bus Catch Likelihood", - categorical = False, - legend = True, - cmap="cividis" - ) - -folium.LayerControl().add_to(m) -m -``` - -## Route Summary - -Prediction accuracy varies by routes. The routes shown at the top have high variability (high IQRs). - -* **High variability = high IQRs**: local traffic conditions mat confound the prediction algorithm. For these routes, a focus on improving service reliability through additional infrastructure (signal priority, bus lanes), or other transit planning and policies could be explored. - -* **Negative 25th percentiles**: riders miss the bus (late predictions). These routes may benefit from service reliability improvements for riders. - - *Interpretation*: A value of -5 means that one quarter of riders miss the bus by 5 minutes. - -* **scaled IQR**: IQR adjusted so predictions closer to the bus arrival are weighted more. Predictions 5 minutes out are more important than predictions 25 minutes out. - -```{python} -route_iqr_df = report_utils.import_route_df( - filters = [[ - ("tu_name", "==", name), - ("month_first_day", "==", one_month), - ("day_type", "==", "Weekday") - ]], - columns= [ - "route_dir_name", - "avg_prediction_error_minutes", - "n_predictions", - "p25", "p75", - "iqr", - "scaled_p25", "scaled_p75", - # does putting IQR help in interpretation? - ] -).sort_values("iqr", ascending=False) -``` - -```{python} -route_iqr_table = ( - GT(route_iqr_df) - .cols_label( - route_dir_name = "Route-Direction", - n_predictions = "# Predictions", - iqr = "Variability", - avg_prediction_error_minutes = "Prediction Error (minutes)", - ).fmt_integer(["n_predictions"]) - .tab_header( - title = "Route Summary Metrics", - subtitle = md( - """ - High IQR = variability -> focus on service reliability through transit planning and policies. - Variability could be due to local traffic conditions. 
- Negative 25th percentiles = riders miss bus.""" - ) - ).pipe(chart_utils.format_great_table, day_type_grouping=False) -) - -route_iqr_table.pipe( - gte.gt_plt_dumbbell, - col1='p25', - col2='p75', - label = "IQR (minutes)", - num_decimals=1, - width=200, height=50, - col1_color=_color_palette.get_color("valentino"), - col2_color=_color_palette.get_color("lizard_green"), - font_size=8 -).pipe( - gte.gt_plt_dumbbell, - col1="scaled_p25", - col2='scaled_p75', - label='scaled IQR', - num_decimals=3, - width=200, height=50, - col1_color=_color_palette.get_color("valentino"), - col2_color=_color_palette.get_color("lizard_green"), - font_size=8 -).pipe( - gte.gt_color_box, - columns=["avg_prediction_error_minutes"], - palette=PREDICTION_ERROR_COLORS, - domain=[-5, 5] -).pipe( - gte.gt_color_box, - columns=["iqr"], - palette="YlOrRd", -).cols_width( - cases={ - "avg_prediction_error_minutes": "10%", - "n_predictions": "10%", - "iqr": "10%" - } -).cols_align("center").cols_align( - "left", columns = "route_dir_name" -).cols_move_to_end(columns=["n_predictions"]) -``` From dde5372d9e3394ac382887401e00a1f637bae027 Mon Sep 17 00:00:00 2001 From: tiffanychu90 Date: Fri, 1 May 2026 18:07:54 +0000 Subject: [PATCH 3/5] add stacked bar for prediction category counts --- rt_predictions/chart_utils_for_operators.py | 2 +- rt_predictions/operator_report.ipynb | 39 +- rt_predictions/operator_report.qmd | 586 ++++++++++++++++++++ rt_predictions/prep_operator_data.py | 22 +- 4 files changed, 645 insertions(+), 4 deletions(-) create mode 100644 rt_predictions/operator_report.qmd diff --git a/rt_predictions/chart_utils_for_operators.py b/rt_predictions/chart_utils_for_operators.py index 3a0e3cacb..e4fd424b7 100644 --- a/rt_predictions/chart_utils_for_operators.py +++ b/rt_predictions/chart_utils_for_operators.py @@ -2,10 +2,10 @@ Chart and map functions for operator report. 
""" +import _color_palette import altair as alt import pandas as pd from great_tables import GT -from gtfs_curator_utils import _color_palette def basic_percentiles_line_chart( diff --git a/rt_predictions/operator_report.ipynb b/rt_predictions/operator_report.ipynb index 54c3b469f..aa2bfed69 100644 --- a/rt_predictions/operator_report.ipynb +++ b/rt_predictions/operator_report.ipynb @@ -96,8 +96,8 @@ " \"n_stops\", \"num_stop_times\", \"daily_arrivals\", 'n_days_schedule_and_rt'\n", "]\n", "\n", - "vp_cols = ['vp_messages_per_minute', 'pct_vp_trips', 'pct_vp_routes'] #'daily_vp_trips'\n", - "tu_cols =['tu_messages_per_minute', 'pct_tu_trips', 'pct_tu_routes'] #'daily_tu_trips',\n", + "vp_cols = ['vp_messages_per_minute', 'pct_vp_trips', 'pct_vp_routes'] \n", + "tu_cols =['tu_messages_per_minute', 'pct_tu_trips', 'pct_tu_routes']\n", "\n", "tu_prediction_cols = [\n", " \"bus_catch_likelihood\", \"pct_tu_complete_minutes\", # both are percents\n", @@ -369,6 +369,41 @@ "# maybe IQR doesn't make sense to color, it'll just be ranked by day_type" ] }, + { + "cell_type": "code", + "execution_count": null, + "id": "b3950f3e-2e99-4da9-bc2d-78f61ab2806b", + "metadata": {}, + "outputs": [], + "source": [ + "pct_category_df = prep_operator_data.reshape_prediction_category_counts_to_long(df)\n", + "\n", + "category_counts_stacked_bar = alt.Chart(pct_category_df).mark_bar().encode(\n", + " x=alt.X(\n", + " 'day_type',\n", + " title = \"\", sort=[\"Weekday\", \"Saturday\", \"Sunday\"]\n", + " ),\n", + " y=alt.Y('pct', title=\"Percent\"),\n", + " color=alt.Color(\n", + " 'prediction_category:N',\n", + " title=\"Prediction Category\", \n", + " sort=[\"early\", \"ontime\", \"late\"],\n", + " scale=alt.Scale(range=[\n", + " _color_palette.get_color(\"light_cadmium_yellow\"),\n", + " _color_palette.get_color(\"electric_orange\"),\n", + " _color_palette.get_color(\"aquatic\")\n", + " ])\n", + " ),\n", + " column=alt.Column(\"tu_name\", title = \"\"),\n", + " tooltip=[\"tu_name\", 
\"day_type\", \"prediction_category\", \"pct\"]\n", + ").interactive().properties(\n", + " title = \"Predictions by Category\",\n", + " width=150, height = 200\n", + ")\n", + "\n", + "category_counts_stacked_bar" + ] + }, { "cell_type": "markdown", "id": "3033b010-f091-4a32-8df1-38dc7d2fd3af", diff --git a/rt_predictions/operator_report.qmd b/rt_predictions/operator_report.qmd new file mode 100644 index 000000000..1229bb48f --- /dev/null +++ b/rt_predictions/operator_report.qmd @@ -0,0 +1,586 @@ +--- +title: '{one_month_formatted} Summary' +jupyter: python3 +--- + +```{python} +%%capture + +import warnings +warnings.filterwarnings("ignore") + +import altair as alt +import branca.colormap as cm +import folium +import pandas as pd + +import calitp_data_analysis.magics +import gt_extras as gte + +from great_tables import GT, md + +import chart_utils_for_operators as chart_utils +import prep_operator_data +import report_utils +import _color_palette +from rt_msa_utils import operator_report_month + +alt.data_transformers.enable("vegafusion") + +one_month = pd.to_datetime(operator_report_month) +``` + +```{python} +#| editable: true +#| slideshow: {slide_type: ''} +#| tags: [parameters] +# parameters cell +#name = "Redding Trip Updates" +``` + +```{python} +%%capture_parameters + +date_format = "%b %Y" # gtfs_digest/_new_operator_report_utils.py +one_month_formatted = one_month.strftime(date_format) + +name, one_month_formatted +``` + + +Generally, we want better transit user experience. 
Specifically, the performance metrics we can derive from GTFS RT Trip Updates distill into the following objectives: + +* Increase prediction reliability and accuracy +* Increase the availability and completeness of GTFS RT +* Decrease the inconsistency and fluctuations of predictions + + +```{python} +operator_cols = ["day_type"] + +schedule_cols = [ + "daily_trips", "daily_service_hours", "n_routes", "n_shapes", + "n_stops", "num_stop_times", "daily_arrivals", 'n_days_schedule_and_rt' +] + +vp_cols = ['vp_messages_per_minute', 'pct_vp_trips', 'pct_vp_routes'] +tu_cols =['tu_messages_per_minute', 'pct_tu_trips', 'pct_tu_routes'] + +tu_prediction_cols = [ + "bus_catch_likelihood", "pct_tu_complete_minutes", # both are percents + "p25", "p75", "iqr", "p50", + "n_predictions", + "prediction_padding_minutes", "avg_prediction_spread_minutes" +] +``` + +```{python} +def check_counts_across_quartet(df: pd.DataFrame): + url_cols = [f"{s}_base64_url" for s in ["schedule", "vp", "tu"]] + + counts_df = df.groupby(["schedule_name", "vp_name", "tu_name"]).agg({ + **{c: "nunique" for c in url_cols} + }).reset_index() + + # Need to do this for everyone + counts_df["max_count"] = counts_df[url_cols].max(axis=1) + + if counts_df.max_count.iloc[0] > 1: + print("There were multiple entries for each day_type:") + display(counts_df) + + urls_with_most_predictions = ( + df[url_cols + ["n_predictions"]] + .sort_values("n_predictions", ascending=False) + .reset_index(drop=True) + ).head(1)[url_cols] + + df2 = pd.merge( + df, + urls_with_most_predictions, + on = url_cols, + how = "inner" + ) + + return df2 +``` + +```{python} +df = report_utils.import_operator_df( + filters = [[ + ("month_first_day", "==", one_month), + ("tu_name", "==", name), + ]], +).pipe( + check_counts_across_quartet +).pipe( + prep_operator_data.merge_in_operator_percentiles +) + +schedule_name = df.schedule_name.iloc[0] +``` + +```{python} +# Set variables for color bars used across maps, route dropdown, and great 
tables +PREDICTION_ERROR_COLORS =list(_color_palette.PREDICTION_ERROR_COLOR_PALETTE.values()) +PREDICTION_ERROR_INDEX = [-5, -3, -1, 1, 3, 5] +PREDICTION_ERROR_LEGEND_CAPTION = "minutes (negative = late; positive = early)" + +POS_BAR_COLOR = _color_palette.get_color("blueberry") +NEG_BAR_COLOR = _color_palette.get_color("vivid_cerise") +``` + +## Schedule + RT Summary Stats + +```{python} +schedule_table = ( + GT(df[operator_cols + schedule_cols]) + .cols_label( + daily_trips = "Daily Trips", + daily_service_hours = "Daily Service Hours", + n_routes = "# Routes", + n_shapes = "# Shapes", + n_stops = "# Stops", + num_stop_times = "Total Scheduled Arrivals", + daily_arrivals = "Daily Scheduled Arrivals", + n_days_schedule_and_rt = "# days with both RT", + ).fmt_integer( + columns = [ + "daily_trips", "n_routes", "n_shapes", "n_stops", + "num_stop_times", "daily_arrivals", "n_days_schedule_and_rt"] + ).fmt_number( + columns = ["daily_service_hours"], decimals=1 + ).tab_spanner( + label="Schedule", + columns = schedule_cols + ).tab_header( + title = "Schedule + RT Summary Metrics", + subtitle = f"{one_month_formatted}" + ) +) + +chart_utils.format_great_table(schedule_table) +``` + +## General RT Metrics + +Vehicle positions and trip updates are distinct RT data sources, and each can be paired with GTFS schedule data. + +The metrics from the schedule-RT pairing include the % of scheduled trips with vehicle positions and the % of scheduled trips with trip updates. These are calculated the same way across both RT data sources. + +**Update Availability Goal 1:** 2+ vehicle position or trip update messages per minute. + +The number of vehicle position or trip update messages per minute is a measure of completeness *within* the RT data we are capturing, regardless of coverage gaps *across* dates. + +**Update Availability Goal 2:** 75%+ of trips have RT and 100% of routes are covered by RT. + +Out of scheduled trips, how many trips have RT, regardless of completeness? 
If the trip appeared in RT trip updates with at least 1 message, the trip counts as having RT trip updates (similarly for vehicle positions). + +Out of scheduled routes, how many routes have at least 1 trip with RT? If at least 1 trip for that route had RT trip updates, that route counts as having RT trip updates (similarly for vehicle positions). + +```{python} +rt_table = ( + GT(df[operator_cols + vp_cols + tu_cols]) + .cols_label( + tu_messages_per_minute = "Trip Updates per Minute", + pct_tu_trips = "% Trips", + pct_tu_routes = "% Routes", + vp_messages_per_minute = "Vehicle Positions per Minute", + pct_vp_trips = "% Trips", + pct_vp_routes = "% Routes", + ).fmt_number( + columns = ["tu_messages_per_minute", "vp_messages_per_minute"], + decimals=1 + ).fmt_percent( + columns=["pct_tu_trips", "pct_tu_routes", "pct_vp_trips", "pct_vp_routes"], + decimals=1 + ).tab_spanner( + label="Trip Updates", + columns=tu_cols + ).tab_spanner( + label="Vehicle Positions", + columns=vp_cols + ) +) + +chart_utils.format_great_table(rt_table, day_type_grouping = True).pipe( + gte.gt_color_box, + columns=["tu_messages_per_minute", "vp_messages_per_minute"], + palette="Blues", + domain=[1, 3] +).pipe( + gte.gt_hulk_col_numeric, + columns=["pct_tu_trips", "pct_tu_routes", "pct_vp_trips", "pct_vp_routes"], + palette=["#FFEC8B", "#E5F5E0"], #[light goldenrod1, white alyssum (from greens)] + domain=[0, 1], + alpha=0.1 +) +``` + +## Prediction Accuracy Metrics + +These metrics are derived entirely from RT trip updates (no comparison with schedule data). + +**Update Availability Goal 3:** 90%+ of minutes have predicted arrival information. + +**Bus Catch Likelihood Goal:** 75%+ of predictions result in catching the bus. + +On-time predictions use [this definition](https://analysis.dds.dot.ca.gov/rt_operator_metrics/#reliable-prediction-accuracy), where predictions made 5 minutes before arrival must fall within narrower bounds to be considered on-time than predictions made 30 minutes out. 
+ +**Prediction Error Goal:** Closer to zero or smaller positive values (early predictions). Late predictions = negative values = riders miss bus + +```{python} +table = ( + GT(df[operator_cols + tu_prediction_cols]) + .cols_label( + pct_tu_complete_minutes = "% Minutes with 2+ Predictions", + bus_catch_likelihood = "Bus Catch Likelihood (Early + On-time)", + p50 = "Prediction Error", + avg_prediction_spread_minutes = "Prediction Spread / Wobble", + prediction_padding_minutes = "Prediction Padding", + n_predictions = "# Predictions", + iqr = "Variability" + ).fmt_percent(columns=["bus_catch_likelihood", "pct_tu_complete_minutes"], decimals=1) + .fmt_number(columns=["p50", "avg_prediction_spread_minutes", "prediction_padding_minutes"], decimals=1) + .fmt_integer(columns=["n_predictions"]) + .tab_header(title = f"Trip Update Prediction Accuracy Metrics", + subtitle = "units are in minutes") +).pipe(chart_utils.format_great_table) + +table.pipe( + gte.gt_plt_dumbbell, + col1='p25', + col2='p75', + label = "IQR", + num_decimals=1, + col1_color=_color_palette.get_color("valentino"), + col2_color=_color_palette.get_color("lizard_green"), + width=100, height=50, # check these each time + font_size=8 +).pipe( + gte.gt_hulk_col_numeric, + columns=["bus_catch_likelihood", "pct_tu_complete_minutes"], + palette=[_color_palette.get_color("light_goldenrod"), + _color_palette.get_color("pastel_peppermint")], + domain=[0, 1], + alpha=0.1 +).pipe( + gte.gt_color_box, + columns=["p50"], + palette=PREDICTION_ERROR_COLORS, + domain=[-5, 5] +).cols_width( + cases={ + c: "10%" for c in [ + "bus_catch_likelihood", "pct_tu_complete_minutes", + "prediction_padding_minutes", "avg_prediction_spread_minutes", + "n_predictions", "p50" + ] + } +).cols_move_to_end(columns=["n_predictions"]) +#.pipe(gte.gt_color_box, columns=["iqr"], palette="YlOrRd"), +# maybe IQR doesn't make sense to color, it'll just be ranked by day_type +``` + +```{python} +pct_category_df = 
prep_operator_data.reshape_prediction_category_counts_to_long(df) + +category_counts_stacked_bar = alt.Chart(pct_category_df).mark_bar().encode( + x=alt.X( + 'day_type', + title = "", sort=["Weekday", "Saturday", "Sunday"] + ), + y=alt.Y('pct', title="Percent"), + color=alt.Color( + 'prediction_category:N', + title="Prediction Category", + sort=["early", "ontime", "late"], + scale=alt.Scale(range=[ + _color_palette.get_color("light_cadmium_yellow"), + _color_palette.get_color("electric_orange"), + _color_palette.get_color("aquatic") + ]) + ), + column=alt.Column("tu_name", title = ""), + tooltip=["tu_name", "day_type", "prediction_category", "pct"] +).interactive().properties( + title = "Predictions by Category", + width=150, height = 200 +) + +category_counts_stacked_bar +``` + +## Prediction Error Percentiles + +### Distribution of Prediction Errors + +The 50th percentile is the typical or median rider experience, and it can show that, on average, this transit agency is roughly on-time. +* If the 10th percentile is fairly close to the 50th percentile, it means that the transit agency is consistent and reliable in its predictions. +* Extreme values for the 10th percentile would indicate that predictions fluctuate, or, are somewhat unreliable. +* Steeper lines indicate fairly reliable predictions for the rider. 
+ ```{python} +decile_cols = [ + "month_first_day", "day_type", + "schedule_name", "tu_name", + 'pos_prediction_error_sec_array', 'pos_prediction_error_sec_percentile_array', + 'neg_prediction_error_sec_array', 'prediction_error_sec_percentile_array' +] + +operator_deciles_df = report_utils.import_operator_df( + filters = [[ + ("month_first_day", "==", one_month), + ("tu_name", "==", name), + ]], + columns = decile_cols +).pipe(prep_operator_data.operator_deciles_for_chart) +``` + +```{python} +percentile_chart = chart_utils.fig5and6_prediction_error_plots(operator_deciles_df) +percentile_chart +``` + +### Accuracy Loss +Ratio of the 10th to 50th percentiles + +* Newmark's paper on a small sample of transit agencies suggests that the positive prediction errors typically have ratios of 4. +* Late predictions (negative prediction errors) have ratios around 3. +* Steeper lines = less accuracy loss = better + * y-axis is percentile (moving from 10th to 50th percentile is moving upwards on y-axis) + * x-axis is error (smaller change along x-axis is less accuracy loss). + * less accuracy loss = less change along x-axis, since change along y-axis is constant (10 to 50) = steeper (counterintuitive compared to the usual interpretation of slope!) 
+ +```{python} +operator_cols = ["day_type"] +percentile_chart_cols = [ + "pos_p10", "pos_p50", "pos_error_ratio", + "neg_p10", "neg_p50", "neg_error_ratio" +] + +mini_df = df[df.month_first_day == one_month][ + operator_cols + percentile_chart_cols] + +# convert the 10th, 50th percentile columns to minutes +seconds_cols = ["pos_p10", "pos_p50", "neg_p10", "neg_p50"] +mini_df[seconds_cols] = mini_df[seconds_cols].divide(60).round(2) +``` + +```{python} +mini_p10_p50_table = ( + GT(mini_df) + .cols_label( + pos_p10 = "10th percentile ", + neg_p10 = "10th percentile", + pos_p50 = "50th percentile", + neg_p50 = "50th percentile", + pos_error_ratio = "Accuracy Loss", + neg_error_ratio = "Accuracy Loss", + ).fmt_number( + columns=["pos_error_ratio", "neg_error_ratio"], decimals=1 + ).tab_spanner( + label="Early Predictions (Positive Prediction Error)", + columns=["pos_p10", "pos_p50", "pos_error_ratio"] + ).tab_spanner( + label="Late Predictions (Negative Prediction Error)", + columns = ["neg_p10", "neg_p50", "neg_error_ratio"] + ) + .tab_header( + title = "Accuracy Loss = Ratio of 10th to 50th percentile error", + subtitle = "units are in minutes (lower = less accuracy loss)" + ) +).pipe( + gte.gt_color_box, + columns=["pos_p10", "pos_p50", "neg_p10", "neg_p50"], + palette=PREDICTION_ERROR_COLORS, + domain=[-5, 5] +) + +chart_utils.format_great_table(mini_p10_p50_table).pipe( + chart_utils.format_great_table, + day_type_grouping=False +) +``` + +## Route Map by Priority Criteria + +The following layers are available and selectable (if no routes match the criteria, the layer is excluded): + +1. **Average prediction error** (minutes) for all routes +2. Routes with **<90% update completeness** + Providing complete real-time information for all routes is the crucial foundation. +3. **Highly Variable Routes (IQR > 3)** that could benefit from transit-supportive policies (signal priority, bus lanes). 
+ The variability in prediction accuracy here could be due to the local traffic conditions. +4. Routes with **Bus Catch Likelihood (early + on-time accuracy < 75%)**, or late predictions 25% of the time. + +```{python} +route_gdf = report_utils.import_route_df( + filters = [[ + ("month_first_day", "==", one_month), + ("schedule_name", "==", schedule_name), + ("day_type", "==", "Weekday") + ]], + columns = [ + "schedule_name", "tu_name", + "route_dir_name", + "avg_prediction_error_minutes", "prediction_error_label", + "pct_tu_complete_minutes", + "iqr", "bus_catch_likelihood", + "geometry" + ] +).drop_duplicates().reset_index(drop=True) +``` + +```{python} +# Set conditions for filtering to pick out priority criteria +condition_completeness = route_gdf.pct_tu_complete_minutes < 0.9 +condition_variability = route_gdf.iqr >= 3 +condition_likelihood = route_gdf.bus_catch_likelihood < 0.75 +``` + +```{python} +m = route_gdf.explore( + "avg_prediction_error_minutes", + tiles = "CartoDB Positron", + name = "All Routes", + cmap = cm.StepColormap( + colors=PREDICTION_ERROR_COLORS, index=PREDICTION_ERROR_INDEX, + vmin=-5, vmax=5, + tick_labels=PREDICTION_ERROR_INDEX, + caption=PREDICTION_ERROR_LEGEND_CAPTION + ), + marker_kwds={"fill": True}, + style_kwds={"opacity": 0.5, "fillOpacity": 0.3} +) +``` + +```{python} +if len(route_gdf[condition_completeness]) > 0: + m = route_gdf[condition_completeness].explore( + "route_dir_name", + m=m, + tiles = "CartoDB Positron", + name = "< 90% update completeness", # color by route-dir name, same as stop report + categorical = True, + legend = False, + ) + +if len(route_gdf[condition_variability]) > 0: + m = route_gdf[condition_variability].explore( + "iqr", + m=m, + tiles = "CartoDB Positron", + name = "High Variability (IQR 3+ minutes) Routes", + categorical = False, + legend = True, + cmap="YlOrRd", + ) + +if len(route_gdf[condition_likelihood]) > 0: + m = route_gdf[condition_likelihood].explore( + "bus_catch_likelihood", + m=m, + 
tiles = "CartoDB Positron", + name = "<75% Bus Catch Likelihood", + categorical = False, + legend = True, + cmap="cividis" + ) + +folium.LayerControl().add_to(m) +m +``` + +## Route Summary + +Prediction accuracy varies by routes. The routes shown at the top have high variability. + +Variability is measured by the interquartile range (IQR), which is the difference between the 75th percentile and 25th percentile prediction errors. + +* **High variability = high IQRs**: local traffic conditions may confound the prediction algorithm. For these routes, a focus on improving service reliability through additional infrastructure (signal priority, bus lanes), or other transit planning and policies could be explored. + +* **Negative 25th percentiles**: riders miss the bus (late predictions). These routes may benefit from service reliability improvements for riders. + + *Interpretation*: A value of -5 means that one quarter of riders miss the bus by 5 minutes. + +* **scaled IQR**: IQR adjusted so predictions closer to the bus arrival are weighted more. Predictions 5 minutes out are more important than predictions 25 minutes out. + +```{python} +route_iqr_df = report_utils.import_route_df( + filters = [[ + ("tu_name", "==", name), + ("month_first_day", "==", one_month), + ("day_type", "==", "Weekday") + ]], + columns= [ + "route_dir_name", + "avg_prediction_error_minutes", + "n_predictions", + "p25", "p75", + "iqr", + "scaled_p25", "scaled_p75", + # does putting IQR help in interpretation? + ] +).sort_values("iqr", ascending=False) +``` + +```{python} +route_iqr_table = ( + GT(route_iqr_df) + .cols_label( + route_dir_name = "Route-Direction", + n_predictions = "# Predictions", + iqr = "Variability", + avg_prediction_error_minutes = "Prediction Error (minutes)", + ).fmt_integer(["n_predictions"]) + .tab_header( + title = "Route Summary Metrics", + subtitle = md( + """ + High IQR = variability -> focus on service reliability through transit planning and policies. 
+ Variability could be due to local traffic conditions. + Negative 25th percentiles = riders miss bus.""" + ) + ).pipe(chart_utils.format_great_table, day_type_grouping=False) +) + +route_iqr_table.pipe( + gte.gt_plt_dumbbell, + col1='p25', + col2='p75', + label = "IQR (minutes)", + num_decimals=1, + width=200, height=50, + col1_color=_color_palette.get_color("valentino"), + col2_color=_color_palette.get_color("lizard_green"), + font_size=8 +).pipe( + gte.gt_plt_dumbbell, + col1="scaled_p25", + col2='scaled_p75', + label='scaled IQR', + num_decimals=3, + width=200, height=50, + col1_color=_color_palette.get_color("valentino"), + col2_color=_color_palette.get_color("lizard_green"), + font_size=8 +).pipe( + gte.gt_color_box, + columns=["avg_prediction_error_minutes"], + palette=PREDICTION_ERROR_COLORS, + domain=[-5, 5] +).pipe( + gte.gt_color_box, + columns=["iqr"], + palette="YlOrRd", +).cols_width( + cases={ + "avg_prediction_error_minutes": "10%", + "n_predictions": "10%", + "iqr": "10%" + } +).cols_align("center").cols_align( + "left", columns = "route_dir_name" +).cols_move_to_end(columns=["n_predictions"]) +``` diff --git a/rt_predictions/prep_operator_data.py b/rt_predictions/prep_operator_data.py index b3ce1afdf..abb2ffd16 100644 --- a/rt_predictions/prep_operator_data.py +++ b/rt_predictions/prep_operator_data.py @@ -163,9 +163,29 @@ def merge_in_operator_percentiles(df: pd.DataFrame) -> pd.DataFrame: day_type_sorted=df1.day_type.map(report_utils.DAYTYPE_ORDER_DICT), ) .rename(columns={"prediction_padding": "prediction_padding_minutes"}) - .drop(columns=["pct_predictions_early", "pct_predictions_ontime"] + array_cols) + .drop(columns=array_cols) .sort_values(["month_first_day", "schedule_name", "tu_name", "day_type_sorted"]) .reset_index(drop=True) ) return df1 + + +def reshape_prediction_category_counts_to_long(df: pd.DataFrame) -> pd.DataFrame: + """ + Get distribution of early / on-time/ late, + turn columns into long df to plot with altair stacked bar 
chart. + """ + df2 = df.melt( + id_vars=["tu_name", "day_type"], + value_vars=["pct_predictions_early", "pct_predictions_ontime", "pct_predictions_late"], + var_name="prediction_category", + value_name="pct", + ) + + df2 = df2.assign( + prediction_category=df2.prediction_category.str.replace("pct_predictions_", ""), + pct=df2.pct * 100, # scale this to match the axis without adjusting it in chart + ) + + return df2 From d0576e7d501147d01f6c87a39d9ae022ec0b5732 Mon Sep 17 00:00:00 2001 From: tiffanychu90 Date: Fri, 8 May 2026 15:47:46 +0000 Subject: [PATCH 4/5] switch order of goals for bus catch & pct completeness --- rt_predictions/operator_report.ipynb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/rt_predictions/operator_report.ipynb b/rt_predictions/operator_report.ipynb index aa2bfed69..9dac4acc7 100644 --- a/rt_predictions/operator_report.ipynb +++ b/rt_predictions/operator_report.ipynb @@ -346,7 +346,7 @@ " font_size=8\n", ").pipe(\n", " gte.gt_hulk_col_numeric, \n", - " columns=[\"bus_catch_likelihood\", \"pct_tu_complete_minutes\"],\n", + " columns=[\"pct_tu_complete_minutes\", \"bus_catch_likelihood\"],\n", " palette=[_color_palette.get_color(\"light_goldenrod\"), \n", " _color_palette.get_color(\"pastel_peppermint\")],\n", " domain=[0, 1],\n", From c123396de578f3f9cebd3d1ddc334b4423a8eae9 Mon Sep 17 00:00:00 2001 From: tiffanychu90 Date: Fri, 8 May 2026 16:02:48 +0000 Subject: [PATCH 5/5] switch order using cols_move_to_start --- rt_predictions/operator_report.ipynb | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/rt_predictions/operator_report.ipynb b/rt_predictions/operator_report.ipynb index 9dac4acc7..a2337baf2 100644 --- a/rt_predictions/operator_report.ipynb +++ b/rt_predictions/operator_report.ipynb @@ -330,8 +330,10 @@ " ).fmt_percent(columns=[\"bus_catch_likelihood\", \"pct_tu_complete_minutes\"], decimals=1)\n", " .fmt_number(columns=[\"p50\", \"avg_prediction_spread_minutes\", 
\"prediction_padding_minutes\"], decimals=1)\n", " .fmt_integer(columns=[\"n_predictions\"])\n", - " .tab_header(title = f\"Trip Update Prediction Accuracy Metrics\", \n", - " subtitle = \"units are in minutes\")\n", + " .tab_header(\n", + " title = f\"Trip Update Prediction Accuracy Metrics\", \n", + " subtitle = \"units are in minutes\"\n", + " )\n", ").pipe(chart_utils.format_great_table)\n", "\n", "table.pipe(\n", @@ -364,7 +366,11 @@ " \"n_predictions\", \"p50\"\n", " ]\n", " }\n", - ").cols_move_to_end(columns=[\"n_predictions\"])\n", + ").cols_move_to_end(\n", + " columns=[\"n_predictions\"]\n", + ").cols_move_to_start(\n", + " columns=[\"pct_tu_complete_minutes\"]\n", + ")\n", "#.pipe(gte.gt_color_box, columns=[\"iqr\"], palette=\"YlOrRd\"), \n", "# maybe IQR doesn't make sense to color, it'll just be ranked by day_type" ] @@ -773,9 +779,9 @@ ], "metadata": { "kernelspec": { - "display_name": "Python 3 (ipykernel)", + "display_name": "Pyproject Local (use-venv)", "language": "python", - "name": "python3" + "name": "pyproject_local_kernel_use_venv" }, "language_info": { "codemirror_mode": {