petri

We have hosted the application petri in order to run this application in our online workstations with Wine or directly.

Run petri online

Quick description about petri:

Petri is an open-source alignment auditing agent that lets researchers rapidly test concrete safety hypotheses against target models using realistic, multi-turn scenarios. Instead of building bespoke evals, Petri automatically generates audit environments from seed �special instructions,� orchestrates an auditor model to probe a target model, and simulates tool use and rollbacks to surface risky behaviors. Each interaction transcript is then scored by a judge model using a consistent rubric so results are comparable across runs and models. The system supports major model APIs and comes with starter seeds and judge dimensions, enabling minutes-to-insight workflows for questions like reward hacking, self-preservation, or eval awareness. Petri is designed for parallel exploration: it spins many audits in flight, aggregates findings, and highlights transcripts that deserve human review.

Features:

Scenario generator that turns seed instructions into realistic audit setups
Multi-turn auditor orchestration with simulated tool use and rollbacks
Judge model that scores transcripts via a consistent safety rubric
Parallel execution to explore many hypotheses and surface the riskiest traces first
Built-in starters for seeds and judge dimensions plus guidance for customization
API support for popular model providers with reproducible runs and reports

Programming Language: Python.
Categories:

AI Agents

Page navigation:

By OD Group OU – Registry code: 1609791 -VAT number: EE102345621.