LLM-as-judge — the shared async evaluation primitive behind the LLM-backed
G:: guards and E:: criteria.
Mirrors ADK Python’s final_response_match_v2 / safety-evaluator approach: a
judge model is prompted to render a structured verdict, and the verdict label
is parsed back out of the model’s reply (robust to surrounding prose).
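As a rough illustration of the "robust to surrounding prose" parsing described above, the following std-only sketch scans a reply for a violation boolean and a reason string, then falls back to label matching. The names VerdictSketch and parse_verdict_sketch are hypothetical, the field names are taken from the parse_verdict summary on this page, and the crate's actual struct and scanning logic may differ.

```rust
/// Hypothetical verdict shape for illustration only; the crate's real
/// struct is not shown on this page.
#[derive(Debug, PartialEq)]
pub struct VerdictSketch {
    pub violation: bool,
    pub reason: Option<String>,
}

/// Return the text that follows `"key"` and its colon, if both are present.
fn value_after_key<'a>(reply: &'a str, key: &str) -> Option<&'a str> {
    let quoted = format!("\"{key}\"");
    let start = reply.find(&quoted)? + quoted.len();
    let rest = reply[start..].trim_start();
    Some(rest.strip_prefix(':')?.trim_start())
}

/// Read a bare `true` / `false` value for `key`.
fn scan_bool(reply: &str, key: &str) -> Option<bool> {
    let value = value_after_key(reply, key)?;
    if value.starts_with("true") {
        Some(true)
    } else if value.starts_with("false") {
        Some(false)
    } else {
        None
    }
}

/// Read a double-quoted string value for `key`.
/// Simplification: an escaped character is kept literally (no \n or \u decoding).
fn scan_string(reply: &str, key: &str) -> Option<String> {
    let value = value_after_key(reply, key)?.strip_prefix('"')?;
    let mut out = String::new();
    let mut chars = value.chars();
    while let Some(c) = chars.next() {
        match c {
            '"' => return Some(out),
            '\\' => out.push(chars.next()?),
            other => out.push(other),
        }
    }
    None // unterminated string
}

/// Assumed behavior: prefer the structured fields, then fall back to
/// the labels "invalid" / "unsafe" anywhere in the reply.
pub fn parse_verdict_sketch(reply: &str) -> VerdictSketch {
    let reason = scan_string(reply, "reason");
    if let Some(violation) = scan_bool(reply, "violation") {
        return VerdictSketch { violation, reason };
    }
    let lower = reply.to_lowercase();
    let violation = lower.contains("invalid") || lower.contains("unsafe");
    VerdictSketch { violation, reason }
}
```

Scanning for the two fields instead of deserializing the whole reply lets the sketch tolerate a judge that wraps its JSON in commentary. The substring fallback is deliberately crude: a reply such as "not unsafe" would still be flagged, so a real implementation would likely want stricter label matching.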
Structs§
Functions§
- parse_verdict - Parse a verdict from the judge model’s reply. Tolerant of extra prose around the JSON: it scans for the violation field’s boolean and the reason string, falling back to common labels (invalid, unsafe).
- render_contents - Render conversation history into a plain-text block for a judge prompt, keeping role labels so the judge can reason about grounding.
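To make the rendering step concrete, here is a minimal sketch of flattening a conversation into a role-labelled text block. TurnSketch, render_contents_sketch, and the "[role]: text" layout are all assumptions for illustration. The real render_contents signature, input type, and output format are not shown on this page.

```rust
/// Hypothetical conversation turn; the crate's real content type is not
/// shown on this page.
pub struct TurnSketch {
    pub role: String,
    pub text: String,
}

/// Render each turn as "[role]: text", one turn per line, so the judge
/// can tell who said what.
pub fn render_contents_sketch(turns: &[TurnSketch]) -> String {
    turns
        .iter()
        .map(|t| format!("[{}]: {}", t.role, t.text.trim()))
        .collect::<Vec<_>>()
        .join("\n")
}
```

Keeping the role label on every line is what allows a judge prompt to ask whether a model turn is grounded in an earlier user or tool turn.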