Detects structural prompt weaknesses before they affect production outputs — improving the quality, consistency, and reliability of every future generation across the pipeline.
Updated May 22, 2026 · Python
Lightweight Agentic Communication Protocol (LACP) & Tribunal Consensus Specification. A Layer 7 application protocol and governance standard for secure, type-safe, and consensus-driven coordination between autonomous AI agents. Includes formal JSON schemas for Context and Vote objects used in the Tribunal sycophancy detection system.
Responsible AI auditing for LLMs and SLMs
A prototype for detecting sycophancy and power dynamics in educational platforms using cross-modal sentiment analysis.
Critical LLM evaluation across 3 structured experiments · Sycophancy detection · 4Ds AI governance framework · Python evaluation pipeline · UCC MSc Business Analytics
Open-source epistemic audit for your Claude conversation history. Applies eight published AI-safety research frameworks (including Spiral-Bench, Sharma sycophancy, SycEval, BeliefShift, ITP, and MedTrust-RAG) and produces citation-validated HTML reports.
Evaluation harness for detecting reward hacking, sycophancy, verbosity bias, and false-confidence failures in post-trained language models