Enterprise knowledge graph system that ingests organizational documents, extracts entities/concepts/relationships, and supports secure retrieval through full-text, semantic, and graph-based search. An agentic retrieval-augmented generation (RAG) workflow built with LangGraph sits on top of these primitives as the core query interface.
Omni-Graph/
├── exec.py
├── requirements.txt
├── README.md
├── database-schema.jpeg
├── sql/
│   ├── schema.sql
│   ├── sample_data.sql
│   ├── procedures_triggers.sql
│   └── queries.sql
└── omnigraph/
    ├── __init__.py
    ├── ingestion_pipeline.py
    ├── entity_relation_extractor.py
    ├── graph_builder.py
    ├── semantic_query_engine.py
    ├── access_control_audit.py
    ├── console_app.py
    └── agentic_rag.py
- Python 3.14+
- PostgreSQL 14+
- pip
- Create and initialize the database.
createdb omnigraph
psql -d omnigraph -f sql/schema.sql
psql -d omnigraph -f sql/sample_data.sql
psql -d omnigraph -f sql/procedures_triggers.sql
- Install Python dependencies.
python -m pip install -r requirements.txt
- Run the console app.
python exec.py
At startup, the console prompts for DB connection values and a username.
- DB defaults used by the app: localhost:5432, database omnigraph, user postgres
- DB password behavior:
  - If entered at the prompt, that value is used
  - If left blank, the app falls back to OMNIGRAPH_DB_PASSWORD (or postgres)
- Supported env vars for DatabaseConnection: OMNIGRAPH_DB_USER, OMNIGRAPH_DB_PASSWORD
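The password fallback described above can be sketched as follows. This is only an illustration: the helper name `resolve_db_password` is hypothetical and not part of the package, and it assumes the fallback order is prompt value, then `OMNIGRAPH_DB_PASSWORD`, then the literal default `postgres`.

```python
import os


def resolve_db_password(entered, env=None):
    """Pick the DB password: prompt value first, then env var, then 'postgres'.

    `resolve_db_password` is a hypothetical helper used for illustration only.
    """
    env = os.environ if env is None else env
    if entered:
        # A non-empty value typed at the prompt always wins.
        return entered
    # A blank prompt falls back to the environment, then to the default.
    return env.get("OMNIGRAPH_DB_PASSWORD", "postgres")
```

Passing `env` explicitly keeps the function easy to exercise without touching the real process environment.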
Sample usernames (from sql/sample_data.sql):
- albert.cheng
- chen.wei
- johnson.mark
- martinez.sofia
- okafor.emeka
- tanaka.yuki
- williams.alex
- kumar.rahul
- fischer.anna
- brown.david
Note: console authentication currently validates active username only (no password verification in app logic).
All tables live in the PostgreSQL schema omnigraph. The full DDL (constraints, indexes, CHECK enums) is in sql/schema.sql.
The diagram file must sit next to this README at the repository root as database-schema.jpeg so GitHub, GitLab, and local Markdown previews can resolve the path.
| Table | Purpose |
|---|---|
| roles | Role definitions and permission arrays (TEXT[]). |
| users | User accounts and profile fields. |
| user_roles | Many-to-many assignment of roles to users. |
| access_policies | Per-role rules on resource_type × sensitivity_level (can_read / can_write / can_delete). |
| taxonomy | Hierarchical taxonomy nodes (parent_id self-reference, level, domain). |
| documents | Core documents: content, content_hash, source_type, sensitivity_level, taxonomy_id, uploaded_by, is_archived, FTS via GIN on title+content. |
| document_versions | Version history per document (version_number, content snapshot, changed_by). |
| entities | Graph nodes: name, entity_type, optional description/canonical metadata, confidence. |
| concepts | Topic nodes: name, domain, optional taxonomy_id, relevance_score. |
| tags | Tag dictionary for document classification. |
| relations | Directed edges between entities (relation_type, strength, optional source_document_id). |
| document_entities | Links documents to entities (relevance, mention_count, first_occurrence). |
| document_tags | Links documents to tags (tagged_by, tagged_at). |
| concept_hierarchy | Parent/child edges between concepts (relationship_type). |
| entity_concepts | Links entities to concepts (relevance_score). |
| document_concepts | Links documents to concepts (relevance_score, extracted_by: system/manual/ai). |
| embeddings | Vector storage per source_type (document/entity/concept) + source_id and model_name (FLOAT[], unique per triple). |
| query_logs | Search/query telemetry (query_type, results_count, execution_ms). |
| audit_logs | Security audit events (action, resource_type, optional resource_id, details). |
- Users ↔ roles: user_roles.
- Documents: belong to taxonomy, uploaded by users; versions in document_versions; linked to entities, tags, and concepts via junction tables.
- Graph: entities connected by relations; concepts structured by concept_hierarchy; entity_concepts and document_concepts attach concepts to entities and documents.
- Semantic search: embeddings rows reference logical sources by (source_type, source_id).
- Governance: access_policies drives RBAC checks; query_logs and audit_logs support observability.
omnigraph/ingestion_pipeline.py
- Text normalization
- SHA-256 deduplication via content_hash
- Document insert + metadata handling
- Version creation for duplicate content or updates
- Batch ingestion helpers
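The normalization and SHA-256 deduplication steps listed above could look roughly like this. It is a minimal sketch under stated assumptions: the function names are hypothetical, the exact normalization rules used by the pipeline are not documented here, and hashing the normalized text (rather than the raw text) is an assumption.

```python
import hashlib
import re
import unicodedata


def normalize_text(text):
    """Assumed normalization: Unicode NFKC, collapse whitespace, trim ends."""
    text = unicodedata.normalize("NFKC", text)
    return re.sub(r"\s+", " ", text).strip()


def content_hash(text):
    """SHA-256 hex digest of the normalized text (64 hex characters)."""
    return hashlib.sha256(normalize_text(text).encode("utf-8")).hexdigest()


def register(index, document_id, text):
    """Record a document in an in-memory hash index.

    Returns the id of an earlier document with identical normalized content,
    or None if this content has not been seen. In the real system the lookup
    would be a query on documents.content_hash instead of a dict.
    """
    digest = content_hash(text)
    existing = index.get(digest)
    if existing is None:
        index[digest] = document_id
    return existing
```

When `register` returns an existing id, a pipeline like the one described would presumably create a new version row rather than a new document.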
omnigraph/entity_relation_extractor.py
- Pattern/keyword-based NER
- Concept extraction with domain tagging and relevance scores
- Regex relationship extraction (e.g., works_for, depends_on, uses)
- Persistence into entity/concept/relation mapping tables
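To make the regex relationship extraction concrete, here is a small sketch. The relation names come from the list above; the trigger phrases, the capitalized-phrase entity pattern, and the function name `extract_relations` are assumptions for illustration and may differ from the patterns the extractor actually uses.

```python
import re

# Assumed entity shape: one or more capitalized words, e.g. "Acme Corp".
ENTITY = r"[A-Z]\w*(?: [A-Z]\w*)*"

# Relation type -> pattern with two capture groups (source, target).
RELATION_PATTERNS = {
    "works_for": re.compile(rf"\b({ENTITY}) works (?:for|at) ({ENTITY})"),
    "depends_on": re.compile(rf"\b({ENTITY}) depends on ({ENTITY})"),
    "uses": re.compile(rf"\b({ENTITY}) uses ({ENTITY})"),
}


def extract_relations(text):
    """Return (source, relation_type, target) triples found in the text."""
    triples = []
    for relation_type, pattern in RELATION_PATTERNS.items():
        for match in pattern.finditer(text):
            triples.append((match.group(1), relation_type, match.group(2)))
    return triples


sample = (
    "Alice Smith works for Acme Corp. "
    "Billing Service depends on Postgres. "
    "Kubernetes uses Docker."
)
triples = extract_relations(sample)
```

Each triple maps naturally onto a row in the relations table (source entity, relation_type, target entity), though lowercase mentions such as "containers" would be missed by this simplified entity pattern.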
omnigraph/graph_builder.py
- Entity node and relation creation/removal
- Taxonomy tree operations
- Concept hierarchy operations
- Neighborhood exploration and graph statistics
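Neighborhood exploration can be illustrated with a breadth-first search over relation edges. This sketch works on an in-memory edge list rather than the database, the function name `neighborhood` is hypothetical, and treating the directed edges as traversable in both directions is an assumption.

```python
from collections import deque


def neighborhood(edges, start, max_depth=1):
    """Return {entity: hop_distance} for entities within max_depth of start.

    `edges` is an iterable of (source, relation_type, target) triples.
    """
    adjacency = {}
    for source, _relation_type, target in edges:
        # Assumption: explore edges in both directions.
        adjacency.setdefault(source, []).append(target)
        adjacency.setdefault(target, []).append(source)

    depths = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depths[node] == max_depth:
            continue  # do not expand beyond the requested depth
        for neighbor in adjacency.get(node, []):
            if neighbor not in depths:
                depths[neighbor] = depths[node] + 1
                queue.append(neighbor)

    depths.pop(start)  # the start node is not its own neighbor
    return depths


edges = [
    ("A", "uses", "B"),
    ("B", "depends_on", "C"),
    ("D", "works_for", "A"),
]
one_hop = neighborhood(edges, "A", max_depth=1)
two_hops = neighborhood(edges, "A", max_depth=2)
```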
omnigraph/semantic_query_engine.py
- Full-text search (PostgreSQL tsvector/tsquery)
- Vector similarity search over stored embeddings (FLOAT[])
- Graph traversal search through entity links/relations
- Hybrid ranking across retrieval modes
- Expert lookup and related-concept discovery
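The vector similarity and hybrid ranking ideas above can be sketched in plain Python. Cosine similarity is a common choice for comparing embeddings but is an assumption here, as are the weighted-sum fusion, the mode names, and the function names; the engine's actual scoring may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length float vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def hybrid_rank(scores_by_mode, weights):
    """Combine per-mode scores into one ranking via a weighted sum.

    scores_by_mode: {mode: {document_id: score}}
    weights:        {mode: weight}; modes without a weight contribute 0.
    Returns [(document_id, combined_score)] sorted best-first.
    """
    combined = {}
    for mode, scores in scores_by_mode.items():
        weight = weights.get(mode, 0.0)
        for document_id, score in scores.items():
            combined[document_id] = combined.get(document_id, 0.0) + weight * score
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


ranked = hybrid_rank(
    {"fulltext": {1: 0.5, 2: 1.0}, "vector": {1: 1.0}},
    {"fulltext": 0.5, "vector": 0.5},
)
```

A weighted sum assumes the per-mode scores are on comparable scales, so in practice each mode's scores would likely need normalizing first.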
omnigraph/access_control_audit.py
- Role-based access control (RBAC)
- Sensitivity-aware permission checks
- Query logging (query_logs)
- Audit logging (audit_logs)
- Reports for sensitive access and query analytics
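A sensitivity-aware permission check in the spirit of the access_policies table might look like the sketch below. The field names mirror the schema summary (resource_type, sensitivity_level, can_read / can_write / can_delete), but the `AccessPolicy` class, the `is_allowed` function, the role names, and the "confidential" level are illustrative assumptions, not the module's real API or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessPolicy:
    """One per-role rule for a (resource_type, sensitivity_level) pair."""
    role: str
    resource_type: str
    sensitivity_level: str
    can_read: bool = False
    can_write: bool = False
    can_delete: bool = False


def is_allowed(policies, user_roles, resource_type, sensitivity_level, action):
    """True if any of the user's roles grants `action` on the resource."""
    flag = {"read": "can_read", "write": "can_write", "delete": "can_delete"}[action]
    return any(
        policy.role in user_roles
        and policy.resource_type == resource_type
        and policy.sensitivity_level == sensitivity_level
        and getattr(policy, flag)
        for policy in policies
    )


# Illustrative policies only; real rows live in omnigraph.access_policies.
policies = [
    AccessPolicy("viewer", "document", "internal", can_read=True),
    AccessPolicy("admin", "document", "confidential", can_read=True, can_write=True),
]
```

Denying by default when no policy matches keeps the check conservative; whether the real module does the same is not stated here.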
omnigraph/agentic_rag.py
- ReAct agent (LangGraph create_react_agent) with OmniGraph tools as the primary way to query the knowledge graph.
- Tools: hybrid_search, find_experts, get_entity_documents, find_related_concepts, get_document_content (all respect RBAC).
- Console Search & Discover leads with Ask (Agent); set GROQ_API_KEY (and install langchain-groq) to use.
Main menus in omnigraph/console_app.py:
Search & Discover
- Ask (Agent) — natural-language question over the graph (agentic RAG)
- Full-text search
- Hybrid/semantic search
- Find experts
- Explore related concepts
- Entity-based document lookup
- Entity neighborhood view
Manage Documents
- Add document
- Update document metadata
- Tag document
- View document detail
- List recent documents
- Run extraction on a document
Administration & Audit
- Graph stats and structure views
- Audit trail and sensitive access report
- Query analytics
- Role assignment/revocation
- Custom read-only SQL (SELECT/CTE with safety checks)
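The safety checks behind the custom read-only SQL option are not spelled out here. One possible shape, shown purely as an assumption, is a conservative validator that accepts a single SELECT or WITH statement and rejects anything containing a write keyword. The function name and keyword list are hypothetical.

```python
import re

# Keywords that indicate a write or DDL operation (assumed, not exhaustive).
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|truncate|create|grant|revoke|copy|call|execute)\b",
    re.IGNORECASE,
)


def is_read_only_sql(sql):
    """Conservatively decide whether `sql` is a single read-only statement."""
    statement = sql.strip().rstrip(";").strip()
    if not statement or ";" in statement:
        return False  # empty input or multiple statements
    if not re.match(r"(?i)(select|with)\b", statement):
        return False  # must start with SELECT or a CTE
    # Reject data-modifying CTEs and similar by scanning for write keywords.
    return FORBIDDEN.search(statement) is None
```

A keyword scan like this can reject legitimate queries, for example one with the word "update" inside a string literal, so it works best alongside database-level protection such as a read-only role or transaction.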
- sql/schema.sql: full schema and indexes
- sql/sample_data.sql: demo roles/users/documents/entities/concepts/etc.
- sql/procedures_triggers.sql: 6 stored procedures + 5 triggers
- sql/queries.sql: advanced SQL examples (joins, recursive CTEs, window functions, full-text)
- Run order matters: schema.sql -> sample_data.sql -> procedures_triggers.sql.
- The README in older revisions referenced run.py; the current entrypoint is exec.py.
- Some admin console features require permissions like view_graph, but sample role permission arrays do not include view_graph by default. If needed, update role permissions in omnigraph.roles.permissions.
Example fix:
UPDATE omnigraph.roles
SET permissions = array_append(permissions, 'view_graph')
WHERE role_name = 'admin'
AND NOT ('view_graph' = ANY(permissions));
Example usage from Python:
from omnigraph import DatabaseConnection, DocumentIngester, SemanticQueryEngine
# Connect
db = DatabaseConnection(host="localhost", port=5432, dbname="omnigraph", user="postgres", password="postgres")
db.connect()
# Ingest
ingester = DocumentIngester(db)
document_id = ingester.ingest_document(
title="My Document",
source_type="technical_doc",
content="Kubernetes uses containers and works with Docker.",
uploaded_by=1,
sensitivity_level="internal",
)
# Search
engine = SemanticQueryEngine(db, user_id=1)
results = engine.search("Kubernetes Docker", strategy="hybrid", limit=5)
print(document_id, len(results))
db.disconnect()
MIT License. See LICENSE.
