
Graph provenance #841

Open
calvinp0 wants to merge 10 commits into main from graph_provenance

Conversation

@calvinp0
Member

This pull request introduces a provenance tracking and visualization system to the ARC workflow, enabling detailed recording and rendering of the sequence of computational events (such as job launches, completions, troubleshooting, and decision points) in each run. The provenance data is saved in YAML format and, if Graphviz is available, also rendered as a graph (DOT and SVG). The scheduler now records all relevant events and generates these artifacts at the end of a run. Comprehensive tests are included to validate the new functionality.

Key changes include:

Provenance tracking and event recording:

  • Added a provenance dictionary to the Scheduler class to track run metadata and a list of events, with initialization and persistence logic. Events such as species initialization, job start, job finish, troubleshooting, and TS guess selection are now recorded via the new record_provenance_event method. [1] [2] [3] [4]
  • On restart, previous provenance logs are loaded, and new events are appended, ensuring continuity across interrupted runs. [1] [2]
  • The scheduler finalizes provenance at the end of a run, generating all artifacts. [1] [2]
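The event-recording flow described above can be sketched as follows. This is a minimal illustration based only on this description, not ARC's actual implementation; the `record_provenance_event` signature and the event field names shown here are assumptions.

```python
import datetime

class SchedulerSketch:
    """Stand-in for ARC's Scheduler, showing only the provenance bookkeeping."""

    def __init__(self):
        # Run metadata plus an append-only list of events (persisted to YAML in ARC).
        self.provenance = {'run': {}, 'events': []}

    def record_provenance_event(self, event_type: str, **details) -> dict:
        """Append a timestamped event record to the in-memory provenance log."""
        event = {'event_id': len(self.provenance['events']) + 1,
                 'event_type': event_type,
                 'timestamp': datetime.datetime.now().isoformat(timespec='seconds')}
        event.update(details)
        self.provenance['events'].append(event)
        return event

sched = SchedulerSketch()
sched.record_provenance_event('species_initialized', label='CH4')
sched.record_provenance_event('job_started', label='CH4', job_type='conf_opt')
```

On restart, loading the previous `events` list before appending new records is what gives the continuity across interrupted runs mentioned above.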

Provenance artifact generation and visualization:

  • Implemented save_provenance_artifacts in arc/plotter.py to save the provenance event log as YAML and, if possible, render the event graph using Graphviz (DOT and SVG). The graph visualizes the relationships between species, jobs, troubleshooting decisions, and TS guess selections. Helper functions ensure graph labels are readable and node IDs are safe.
  • Added logic to handle missing Graphviz gracefully, falling back to YAML-only output.
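The YAML-plus-optional-Graphviz pattern described above can be sketched like this. The function name echoes the PR's `save_provenance_artifacts`, but the body is illustrative only: it assumes PyYAML and the optional `graphviz` Python bindings, and ARC's real node/edge logic (species, jobs, troubleshooting diamonds, TS selections) is much richer.

```python
import os
import re

import yaml  # PyYAML, assumed available since ARC already persists YAML

def save_provenance_artifacts_sketch(provenance: dict, out_dir: str) -> str:
    """Save the event log as YAML; render a DOT/SVG graph only if Graphviz works."""
    os.makedirs(out_dir, exist_ok=True)  # ensure the output directory exists
    yaml_path = os.path.join(out_dir, 'provenance.yml')
    with open(yaml_path, 'w') as f:
        yaml.dump(provenance, f)
    try:
        from graphviz import Digraph  # optional Python bindings
        dot = Digraph('provenance')
        for event in provenance.get('events', []):
            # Sanitize the node ID: Graphviz IDs are safest as word characters only.
            node_id = re.sub(r'\W+', '_', f"e{event['event_id']}_{event['event_type']}")
            dot.node(node_id, label=str(event['event_type']))
        dot.render(os.path.join(out_dir, 'provenance'), format='svg', cleanup=True)
    except Exception:
        pass  # missing package or missing `dot` binary: fall back to YAML-only output
    return yaml_path
```

The `try/except` around the Graphviz import and render is the graceful fallback: the YAML log is always written, and the DOT/SVG artifacts are a best-effort extra.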

Testing and validation:

  • Added comprehensive tests for label wrapping and for the full provenance artifact generation pipeline, verifying that all key node types and relationships are rendered in the output graph.

API and typing improvements:

  • Updated function signatures and docstrings to support provenance tracking, including new parameters for parent job and reason in run_job. [1] [2]
  • Minor typing and import cleanups in arc/scheduler.py.

Utility and robustness:

  • Ensured output directories are created as needed and that provenance logs are robust to parsing errors or missing files. [1] [2]

These changes lay the foundation for reproducible, auditable ARC runs and provide a clear visual summary of complex computational workflows.



- Improve provenance logging by avoiding duplicate initialization events and handling potentially corrupted provenance files.
- Ensure internal consistency on restart by verifying that species marked as converged have all required output paths, resetting their status otherwise.
- Fix job key generation for reactions (lists of labels) and improve tracking for running conformer jobs.
- Defer TS switching during conformer optimization batches to avoid unnecessary job deletions.
- Ensure that successful and unsuccessful transition state generation methods are listed uniquely and formatted using join to avoid trailing commas in the species report.
- Update graph logic to correctly link jobs to parent jobs, troubleshooting diamonds, or TS selection decisions instead of always defaulting to the last node.
- Preserve intentional newlines in wrapped labels to improve node readability.
- Ensure the provenance YAML file is saved with an updated timestamp even when the graphviz package is unavailable.
- Add support for visualizing TS guess selection failure events as decision nodes.
- Use stable indices for TS guesses to ensure correct mapping between jobs and guess objects during conformer optimization.
- Add unit tests for provenance deduplication, restart output sanitization, and multi-species label handling in the Scheduler.
- Correct "unsuccessfully" to "unsuccessful" in the transition state report string.
- Update unit tests to reflect the deduplication of generation methods and the removal of trailing commas in the report output.
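The "preserve intentional newlines in wrapped labels" commit can be illustrated with a small helper. This is a hypothetical sketch; ARC's actual wrapping helper and its name may differ.

```python
import textwrap

def wrap_label(label: str, width: int = 30) -> str:
    """Wrap a long node label to `width` while preserving intentional newlines.

    Plain textwrap.fill() would collapse embedded newlines; wrapping each
    pre-existing line separately keeps the author's line breaks intact.
    """
    wrapped = []
    for line in label.split('\n'):
        # textwrap.wrap('') returns [], so fall back to [''] to keep blank lines.
        wrapped.extend(textwrap.wrap(line, width=width) or [''])
    return '\n'.join(wrapped)
```

For example, `wrap_label('opt job 42\nreason: negative frequency', width=20)` keeps `opt job 42` on its own line and only wraps the second, longer line.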

Copilot AI left a comment


Pull request overview

Adds provenance tracking to ARC runs, persisting an event log to YAML and optionally rendering a Graphviz (DOT/SVG) visualization at the end of scheduling.

Changes:

  • Introduces scheduler-side provenance event recording (job start/finish, troubleshooting, TS guess selection) with persistence and restart behavior.
  • Adds plotter support to save provenance artifacts (YAML + Graphviz DOT/SVG) with label wrapping and safe node IDs.
  • Updates/extends unit tests to validate provenance logging/rendering and improves TS report formatting (deduped method lists).

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

File                        | Description
environment.yml             | Adds conda package for the Python Graphviz bindings used for rendering provenance graphs.
arc/species/species.py      | Deduplicates TS report method lists and fixes wording for unsuccessful methods.
arc/species/species_test.py | Updates expected TS report string to match new formatting.
arc/scheduler.py            | Implements provenance state/events, restart sanitization for missing paths, and records key scheduling events.
arc/scheduler_test.py       | Adds tests for provenance restart dedup, restart sanitization, delete-all-jobs reset behavior, and multi-label provenance.
arc/plotter.py              | Adds provenance artifact generation (YAML + optional DOT/SVG) and helper functions for Graphviz output.
arc/plotter_test.py         | Adds tests for graph label wrapping and provenance artifact generation/graph structure.


arc/scheduler.py Outdated
Comment on lines +1268 to +1272
     level_of_theory=self.ts_guess_level,
     job_type='conf_opt',
-    conformer=i,
+    conformer=tsg.index,
 )
-tsg.conformer_index = i  # Store the conformer index in the TSGuess object to match them later.
+tsg.conformer_index = tsg.index  # Use a stable identifier for mapping back to TSGuess.

Copilot AI Mar 28, 2026


tsg.conformer_index is assigned after run_job(), but run_job() immediately persists the restart file. If ARC is interrupted after spawning these conf_opt jobs but before another save_restart_dict() call, the restart.yml can contain running conf_opt jobs while the corresponding TSGuess objects still have conformer_index=None, and parse_conformer() will then be unable to map conformer results back to a TSGuess. Set tsg.conformer_index (and any fallback tsg.index) before calling run_job() (or explicitly save the restart dict after assignment) so restarts are always consistent.

Copilot uses AI. Check for mistakes.
arc/scheduler.py Outdated
logger.warning('Could not parse existing provenance.yml; starting a fresh provenance log.')
provenance = None
if isinstance(provenance, dict):
    self.provenance['events'] = provenance.get('events', list())

Copilot AI Mar 28, 2026


When loading an existing provenance.yml, provenance.get('events', ...) is assigned directly to self.provenance['events'] without validating type/shape. If the file is partially corrupted (e.g., events is not a list of dicts), the set comprehension on the next line can raise and break Scheduler initialization. Consider validating events (must be a list of dicts) and falling back to an empty list if it’s not.

Suggested change
-self.provenance['events'] = provenance.get('events', list())
+raw_events = provenance.get('events', list())
+if isinstance(raw_events, list) and all(isinstance(e, dict) for e in raw_events):
+    self.provenance['events'] = raw_events
+else:
+    logger.warning('Existing provenance.yml has an invalid "events" structure; '
+                   'starting with an empty event log.')
+    self.provenance['events'] = list()
+# Ensure we always have a list for provenance events
+self.provenance.setdefault('events', list())
+if not isinstance(self.provenance['events'], list):
+    logger.warning('Existing provenance events are not a list; resetting to empty list.')
+    self.provenance['events'] = list()

Comment on lines +556 to +557
self.provenance['events'].append(event)
self.save_provenance()

Copilot AI Mar 28, 2026


record_provenance_event() persists provenance.yml on every event. In real runs this could be thousands of events (job starts/finishes, troubleshooting, etc.) and may noticeably slow scheduling due to synchronous disk I/O. Consider buffering events in memory and flushing periodically (e.g., every N events / every M seconds) and/or only persisting on key milestones + finalize, while still ensuring durability on restart.
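The buffering strategy suggested here could look roughly like this. It is a hypothetical sketch, not ARC code; the class name, parameters, and thresholds are invented for illustration.

```python
import time

class BufferedProvenanceLog:
    """Buffer events in memory; flush to disk every N events or every M seconds."""

    def __init__(self, flush_every_n: int = 50, flush_every_s: float = 30.0):
        self.events = []
        self.flush_every_n = flush_every_n
        self.flush_every_s = flush_every_s
        self._unflushed = 0
        self._last_flush = time.monotonic()
        self.flush_count = 0  # for illustration: how many times we hit disk

    def record(self, event: dict) -> None:
        self.events.append(event)
        self._unflushed += 1
        if (self._unflushed >= self.flush_every_n
                or time.monotonic() - self._last_flush >= self.flush_every_s):
            self.flush()

    def flush(self) -> None:
        if self._unflushed:
            # In ARC this is where self.events would be dumped to provenance.yml.
            self.flush_count += 1
            self._unflushed = 0
            self._last_flush = time.monotonic()

log = BufferedProvenanceLog(flush_every_n=10)
for i in range(25):
    log.record({'event_id': i + 1})
log.flush()  # finalize: persist any remaining buffered events
```

With 25 events and a batch size of 10, this performs 3 disk writes instead of 25; the explicit `flush()` at finalize keeps the on-disk log complete at run end.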


Copilot AI left a comment


Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.



arc/scheduler.py Outdated
Comment on lines +551 to +553
event = {'event_id': len(self.provenance['events']) + 1,
         'event_type': event_type,
         'timestamp': datetime.datetime.now().isoformat(timespec='seconds'),

Copilot AI Mar 28, 2026


record_provenance_event() derives event_id from len(events) + 1. If an existing provenance log is loaded with non-contiguous or non-1-indexed event_ids (e.g., manual edits or future schema changes), this can generate duplicate IDs. Safer approach: compute the next ID from max(existing_event_id) + 1 (defaulting to 0 when absent).
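The suggested max-based ID computation, as a small sketch (illustrative; not the PR's actual code):

```python
def next_event_id(events: list) -> int:
    """Next event_id that stays unique even with gaps or non-1-indexed logs."""
    # max() over existing IDs, defaulting to 0 for an empty log, then + 1.
    return max((e.get('event_id', 0) for e in events), default=0) + 1

print(next_event_id([]))                                  # 1
print(next_event_id([{'event_id': 3}, {'event_id': 7}]))  # 8
print(next_event_id([{'event_id': 1}, {}]))               # 2
```

Unlike `len(events) + 1`, this never collides with an existing ID after manual edits or schema changes that leave holes in the sequence.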

arc/scheduler.py Outdated
Comment on lines +536 to +537
already_initialized = {e['label'] for e in self.provenance['events']
                       if e.get('event_type') == 'species_initialized' and 'label' in e}

Copilot AI Mar 28, 2026


The already_initialized = {e['label'] ...} comprehension assumes label is always hashable. If provenance.yml contains a parsed-but-invalid event where label is a list/dict, restart will raise TypeError here despite the earlier “robust to parsing errors” intent. Consider filtering to isinstance(label, str) (or coercing to str) before adding to the set.

Suggested change
-already_initialized = {e['label'] for e in self.provenance['events']
-                       if e.get('event_type') == 'species_initialized' and 'label' in e}
+already_initialized = set()
+for event in self.provenance['events']:
+    if event.get('event_type') == 'species_initialized':
+        label = event.get('label')
+        if isinstance(label, str):
+            already_initialized.add(label)
+        elif label is not None:
+            logger.debug(f"Ignoring provenance event with non-string label in provenance.yml: {label!r}")

Comment on lines 307 to +312
self.species_dict, self.rxn_dict = dict(), dict()
for species in self.species_list:
    self.species_dict[species.label] = species
for rxn in self.rxn_list:
    self.rxn_dict[rxn.index] = rxn
self._initialize_provenance()

Copilot AI Mar 28, 2026


_initialize_provenance() is called before TS species are created/added from rxn_list, so those TS labels never get a species_initialized event and the provenance graph/log will be incomplete for reaction runs. Consider moving _initialize_provenance() to after the reaction/TS-species construction block, or explicitly recording species_initialized when a TS species is created and appended to species_list.

@alongd
Member

alongd commented Mar 28, 2026

Thanks for this awesome addition! For a while we have wanted something to visualize ARC's progress. Is this meant to be live, or static at the end of the run? Eventually we want a live HTML portal to track ARC/T3 progress; it would be great to keep that in mind when developing the feature in the present PR so we can build on top of it.

@codecov

codecov bot commented Mar 28, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 58.81%. Comparing base (1a72830) to head (987037f).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #841      +/-   ##
==========================================
+ Coverage   58.58%   58.81%   +0.22%     
==========================================
  Files          97       97              
  Lines       29203    29409     +206     
  Branches     7752     7800      +48     
==========================================
+ Hits        17110    17297     +187     
- Misses       9889     9890       +1     
- Partials     2204     2222      +18     
Flag Coverage Δ
functionaltests 58.81% <ø> (+0.22%) ⬆️
unittests 58.81% <ø> (+0.22%) ⬆️

Flags with carried forward coverage won't be shown.

