AOP-Wiki CLI

Tools for analyzing AOP-Wiki content derived from XML data along with scripts to process AOP content from literature sources.

Overview

This repository provides Python functions and CLI commands for analyzing content from the AOP-Wiki XML data export. The CLI functions extract Adverse Outcome Pathways (AOPs), Key Events (KEs), and Key Event Relationships (KERs) from the XML, calculate completion metrics, and support various analytical workflows.

Key Features

XML Data Collection: Automated download and parsing of AOP-Wiki XML exports
Entity Extraction: Extract AOPs, Key Events, and KERs with full metadata
Completion Scoring: Automated calculation of data completeness metrics
Event Ranking: Scoring system for prioritizing Key Events based on multiple criteria
Evidence Harmonization: Tools for standardizing tabulated evidence across KERs
Text Search: Search AOP-Wiki entities for specific terms and patterns
Reference Analysis: Search and analyze citations in AOP-Wiki content
CLI Interface: Command-line tools for common analysis workflows
Tests: Limited test suite covering content search functions and CLI integration

Installation

# Optional: remove the existing virtual environment for a clean install
rm -rf .venv

# Install dependencies with uv
uv sync

# Run CLI with help flag to view available commands
uv run python cli.py --help

CLI Commands

# Collect all events and calculate integration rankings
uv run python cli.py collect-event-integration-rankings

# Collect KER analytics
uv run python cli.py collect-ker-analytics

# Search KERs for concordance evidence
uv run python cli.py search-kers-for-concordance-text

# Harmonize KER evidence tables
uv run python cli.py harmonize-ker-evidence

# Search entities using a config file
uv run python cli.py search-with-config <config_name>

# Collect and harmonize seizure AOP data (interactive review)
uv run python cli.py collect-harmonized-seizure-aops
uv run python -m cli collect-harmonized-seizure-aops --date 02-20-2026

# Manually review match results from a JSON file
uv run python cli.py manually-review-matches <input_file.json> [--threshold 0.9]

# Concrete example (the date shown is illustrative)
uv run python cli.py manually-review-matches outputs/seizure_aops/03-14-2026/mapping_ke_description_to_harmonized_ke_03-14-2026.json --threshold 0.9

# General form: replace {date} with the run date (MM-DD-YYYY)
uv run python cli.py manually-review-matches outputs/seizure_aops/{date}/mapping_ke_description_to_harmonized_ke_{date}.json --threshold 0.9

Seizure AOP Workflow

Use this workflow to generate seizure-specific outputs, then move the selected files to the target project input folder.

Two-Stage Human Review Process

The seizure workflow includes interactive human review for quality control:

Stage 1: KE Descriptions → Harmonized KEs - Review fuzzy matches between Key Event descriptions from the source workbook and harmonized KE titles
Stage 2: Target Families → Events - Review fuzzy matches between target family labels and AOP-Wiki event titles

During each stage, you'll be prompted to accept (y), reject (n), or quit (q) for each match below the confidence threshold. When you reject a match, you can suggest a better one.
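
The review loop described above can be sketched roughly as follows. This is an illustrative sketch only, not the repository's implementation: the `Match` class, the `review_matches` function, and the `ask` parameter are hypothetical names, and it assumes that matches at or above the threshold are accepted without prompting and that any answer other than y or q counts as a rejection.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple


@dataclass
class Match:
    """Hypothetical container for one fuzzy match result."""
    source: str      # e.g. a KE description from the workbook
    candidate: str   # e.g. a harmonized KE title
    score: float     # similarity score between 0 and 1


def review_matches(
    matches: Iterable[Match],
    threshold: float = 0.9,
    ask: Callable[[str], str] = input,
) -> Tuple[List[Match], List[Tuple[Match, Optional[str]]]]:
    """Return (accepted, rejected) after prompting for low-confidence matches."""
    accepted: List[Match] = []
    rejected: List[Tuple[Match, Optional[str]]] = []
    for match in matches:
        # Assumption: matches at or above the threshold need no review.
        if match.score >= threshold:
            accepted.append(match)
            continue
        prompt = f"{match.source!r} -> {match.candidate!r} ({match.score:.2f}) [y/n/q]: "
        answer = ask(prompt).strip().lower()
        if answer == "q":
            break  # stop reviewing; remaining matches stay unreviewed
        if answer == "y":
            accepted.append(match)
        else:
            # On rejection, the reviewer may suggest a better match.
            suggestion = ask("Suggest a better match (blank to skip): ").strip()
            rejected.append((match, suggestion or None))
    return accepted, rejected
```

Passing the prompt function as `ask` keeps the loop easy to exercise without a terminal, for example by feeding it scripted answers.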

Seizure AOP Workflow-Specific Caching Behavior

The workflow checks for previously curated input files in outputs_for_vc/:

KE description mappings: outputs_for_vc/reviewed_ke_description_to_harmonized_ke_mapping.json
Event-target family mappings: outputs_for_vc/curated_event-target_family_mappings.json

If these files exist, they are loaded and the interactive review is skipped. These files must be manually placed there after review (e.g., by copying from dated output folders).

To regenerate matches (bypass curated inputs), use --skip-curated.
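
As a rough illustration of this behavior, the check for a curated file might look like the sketch below. The function name `load_curated` and its `skip_curated` parameter are hypothetical and only mirror the --skip-curated flag; the actual logic in the repository may differ.

```python
import json
from pathlib import Path
from typing import Any, Optional

# Paths taken from the list above.
CURATED_DIR = Path("outputs_for_vc")
KE_MAPPING_FILE = "reviewed_ke_description_to_harmonized_ke_mapping.json"


def load_curated(path: Path, skip_curated: bool = False) -> Optional[Any]:
    """Return the curated mapping, or None if review should run instead."""
    # Assumption: a missing file, or an explicit skip, falls back to
    # fuzzy matching followed by interactive review.
    if skip_curated or not path.is_file():
        return None
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    curated = load_curated(CURATED_DIR / KE_MAPPING_FILE)
    print("interactive review needed" if curated is None else "using curated mapping")
```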

CLI Options

# Basic usage (uses cached curations if available)
uv run python cli.py collect-harmonized-seizure-aops

# Specify a cache date for AOP-Wiki data
uv run python cli.py collect-harmonized-seizure-aops --date MM-DD-YYYY

# Force refresh of AOP-Wiki XML data
uv run python cli.py collect-harmonized-seizure-aops --force-refresh

# Skip curated inputs and regenerate via fuzzy matching + review
uv run python cli.py collect-harmonized-seizure-aops --skip-curated

Export to Target Project

# Preview file moves (recommended)
./export_ready_for_emod_upload.sh --date MM-DD-YYYY --output /path/to/target/inputs/seizure_aops --dry-run

# Execute file moves
./export_ready_for_emod_upload.sh --date MM-DD-YYYY --output /path/to/target/inputs/seizure_aops
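
The dry-run pattern used above can be illustrated in Python as follows. This is a sketch of the general idea, not a port of export_ready_for_emod_upload.sh: the `export_files` function is hypothetical, and it assumes every regular file in the source folder is moved, whereas the real script selects specific files.

```python
import shutil
from pathlib import Path
from typing import List, Tuple


def export_files(src_dir: Path, dest_dir: Path, dry_run: bool = True) -> List[Tuple[Path, Path]]:
    """Plan (and optionally perform) moving files from src_dir to dest_dir."""
    planned: List[Tuple[Path, Path]] = []
    for src in sorted(src_dir.iterdir()):
        if not src.is_file():
            continue  # skip subdirectories
        dest = dest_dir / src.name
        planned.append((src, dest))
        if dry_run:
            print(f"[dry-run] would move {src} -> {dest}")
        else:
            dest_dir.mkdir(parents=True, exist_ok=True)
            shutil.move(str(src), str(dest))
    return planned
```

Previewing first makes it easy to confirm the date folder and target path before any files leave the outputs directory.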

Output Files

Outputs are written to outputs/seizure_aops/{date}/:

File	Description
`harmonized_events_{date}.csv`	Harmonized key events ready for analysis
`harmonized_events_with_wiki_content_{date}.json`	Events enriched with AOP-Wiki metadata
`assays_{date}.csv`	Assay data mapped to events
`seizure_aop_events_{date}.xlsx`	Combined workbook with all seizure AOP data
`mapping_ke_description_to_harmonized_ke_{date}.json`	KE description to harmonized KE mappings
`post_analysis_event_to_assays_{date}.json`	Event-to-assay mappings via target families
`biological_target_families_{date}.json`	Target family definitions
`aop_to_harmonized_events_validation_{date}.json`	Validation results comparing with AOP-Wiki
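
To show how the {date} placeholder expands, here is a small sketch that builds the expected output paths for a given run date. The `expected_outputs` helper is hypothetical and not part of the CLI; the file name templates are copied from the table above.

```python
from datetime import datetime
from pathlib import Path
from typing import List

OUTPUT_TEMPLATES = [
    "harmonized_events_{date}.csv",
    "harmonized_events_with_wiki_content_{date}.json",
    "assays_{date}.csv",
    "seizure_aop_events_{date}.xlsx",
    "mapping_ke_description_to_harmonized_ke_{date}.json",
    "post_analysis_event_to_assays_{date}.json",
    "biological_target_families_{date}.json",
    "aop_to_harmonized_events_validation_{date}.json",
]


def expected_outputs(run_date: str, root: Path = Path("outputs/seizure_aops")) -> List[Path]:
    """Return the output paths for run_date, given in MM-DD-YYYY form."""
    # Raises ValueError if run_date does not parse as month-day-year.
    datetime.strptime(run_date, "%m-%d-%Y")
    folder = root / run_date
    return [folder / template.format(date=run_date) for template in OUTPUT_TEMPLATES]


if __name__ == "__main__":
    for path in expected_outputs("03-14-2026"):
        print(path)
```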

Development

Project Organization

Entry point: cli.py provides the main CLI interface
Source code: All production code is in src/ organized by functional domain
Configuration: Analysis configurations are in configs/
Tests:
- Unit and integration tests are in tests/ at project root
- One test has been created as a shell script at project root
Scripts: Catch-all space for miscellaneous scripts

Testing

# Run tests on content search functions
uv run python -m tests.test_search_text_by_field

# Check that all CLI commands run with standard parameters
bash test_cli_integration.sh

# Alternative: show only the result lines, filtered with grep
bash test_cli_integration.sh 2>&1 | grep -E "(Testing:|PASSED|FAILED|Test Summary)"

Project Structure

aop_wiki_cli/
├── cli.py                              # Main CLI entry point
├── pyproject.toml                      # Project dependencies
├── test_cli_integration.sh             # CLI integration tests
├── export_ready_for_emod_upload.sh     # Seizure output export script
│
├── src/                          # Source code modules
│   ├── analysis/                 # Post-extraction analytics
│   ├── collection/               # Needs refactoring 
│   ├── parsers/                  # Parser for the XML and other sources
│   ├── search/                   # Text and reference searching
│   ├── harmonization/            # KER Evidence table standardization
│   ├── data_export/              # File generation (CSV, Excel, JSON)
│   ├── visualization/            # Graphical outputs
│   ├── utilities/                # Shared helper functions
│   └── standalone/               # Needs refactoring 
│
├── configs/                      # Analysis configuration files
├── tests/                        # Unit and integration tests
├── docs/                         # Project documentation
│
├── inputs/                       # Input data
│   ├── seizure_aops/             # Seizure AOP workbook inputs
│   ├── annotated_manually/       # Manual annotations
│   └── from_emod_prototypes/     # From earlier EMOD prototypes
│
├── outputs/                      # Generated outputs (git-ignored)
│   ├── seizure_aops/             # Seizure workflow outputs
│   ├── event_rankings/           # Event ranking results
│   ├── ker_evidence/             # KER evidence data
│   ├── ker_analytics/            # KER analytics
│   └── cache/                    # Cached XML/JSON data
│
├── outputs_for_vc/               # Curated outputs for version control
│
├── xml_inputs/                   # Downloaded AOP-Wiki XML files
├── logs/                         # Log files
├── archived/                     # Deprecated scripts
└── scratch/                      # Experimental code

License

This project is licensed under the MIT License.
