Manual regression doesn't scale. Every ticket on staging needed human eyes for API contracts, UI flows, or both — hours of repetitive click-through that crowded out test strategy, edge-case hunting, and the work that actually moves quality forward. Meanwhile, AI agents were getting good enough to matter.
A multi-headed QA agent built to meet the AI-agent era head-on: Jira ticket in, validated result out.
A multi-headed QA agent, triggered by a ticket.
Hydra is a QA agent I proposed and the QA team I lead built on Claude and Codex, our answer to the AI-agent era reshaping our craft. Give it a Jira ticket key and it routes the ticket to the right validation head (backend, frontend, or hybrid), runs the checks against staging, captures evidence, and publishes the bundle back to reviewers. It's live, it's evolving, and it's freeing the team for the work only humans can do.
Nobody assigned this. I pitched and designed the approach, and the team built a Jira-driven agent that takes over the repetitive half of validation so humans can focus on the judgment half. Trigger it with a ticket key; it decides what to test, runs the checks, captures the evidence, and hands back a clean report. It's built on Claude Code's skills model, so its behavior is readable, testable, and improves every sprint.
Three heads, one body.
The agent routes every ticket to the right verification head. Each head has its own playbook, its own tooling, and its own idea of what 'done' looks like — but they all share the same spine.
Backend Head
API, auth, DB, contracts.
Endpoint verification, validation rules, authentication flows, database assertions, and contract testing. Runs against staging with the ticket's acceptance criteria as a spec.
Frontend Head
Components, interaction, permissions, responsive.
UI component verification, interaction flows, permission boundaries, and responsive behavior. Drives the app the way a user would — but with perfect memory of every ticket that came before.
Hybrid Head
Full-stack tickets end-to-end.
For tickets that cross the stack, the orchestrator routes to both heads and stitches the evidence together. One command, two verification paths, one report.
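The routing idea can be sketched in a few lines. This is a minimal illustration, not Hydra's implementation: the keyword heuristics, the `Head` enum, and the `route` function are all hypothetical stand-ins, and the real orchestrator would read the full ticket (labels, components, acceptance criteria) rather than raw text.

```python
from enum import Enum, auto

class Head(Enum):
    BACKEND = auto()
    FRONTEND = auto()
    HYBRID = auto()

# Hypothetical keyword heuristics; a real router would use richer
# signals than a bag of words from the ticket summary.
BACKEND_HINTS = {"api", "endpoint", "auth", "db", "contract"}
FRONTEND_HINTS = {"ui", "component", "screen", "responsive"}

def route(ticket_text: str) -> Head:
    """Pick the verification head for a ticket summary."""
    words = set(ticket_text.lower().split())
    backend = bool(words & BACKEND_HINTS)
    frontend = bool(words & FRONTEND_HINTS)
    if backend and frontend:
        return Head.HYBRID
    if backend:
        return Head.BACKEND
    if frontend:
        return Head.FRONTEND
    return Head.HYBRID  # when in doubt, run both verification paths
```

The useful property is the fallback: an ambiguous ticket gets both verification paths rather than a guess, which is cheaper than a missed defect.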
From a ticket key to a reviewed result, in six steps.
Every run starts with a Jira ticket and ends with a structured report. In between, Hydra handles the orchestration — so humans only show up for the parts that need judgment.
Jira ticket → entry point
Trigger with a ticket key. Hydra fetches the ticket, reads the acceptance criteria, and decides what kind of validation it needs.
Route to the right head
Backend, frontend, or hybrid — each with its own verification playbook and tooling.
Validate against staging
Exercises the ticket: hits APIs with the right payloads, clicks through UI flows, checks DB state, verifies permissions.
Capture evidence
Discovery notes, Postman collections, screenshots, and structured test reports — everything a reviewer needs to trust the result.
Publish to GCS
The evidence bundle ships to Google Cloud Storage under a per-ticket folder. Reviewers get a single link with everything attached.
Comment back on Jira
A structured comment lands on the ticket: verdict, links to evidence, and any follow-ups the team needs to handle.
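The six steps above compose naturally as a pipeline where each stage enriches a shared run context. The sketch below is illustrative only: every step function is a stub, and the bucket name `qa-evidence` is an invented placeholder, since the real steps call Jira, staging, and GCS clients.

```python
# Hypothetical step functions; the real agent calls Jira, staging,
# and Google Cloud Storage here. Each step reads and extends `ctx`.
def fetch_ticket(ctx):
    ctx["criteria"] = f"acceptance criteria for {ctx['ticket']}"
    return ctx

def route_to_head(ctx):
    ctx["head"] = "hybrid"  # stub: routing logic lives here
    return ctx

def validate_on_staging(ctx):
    ctx["verdict"] = "pass"  # stub: API/UI/DB checks live here
    return ctx

def capture_evidence(ctx):
    ctx["evidence"].append(f"{ctx['ticket']}/report.json")
    return ctx

def publish_to_gcs(ctx):
    ctx["evidence_url"] = f"gs://qa-evidence/{ctx['ticket']}/"
    return ctx

def comment_on_jira(ctx):
    ctx["comment"] = f"{ctx['verdict']}: {ctx['evidence_url']}"
    return ctx

PIPELINE = [fetch_ticket, route_to_head, validate_on_staging,
            capture_evidence, publish_to_gcs, comment_on_jira]

def run(ticket_key: str) -> dict:
    """Run every stage in order over a shared context."""
    ctx = {"ticket": ticket_key, "evidence": []}
    for step in PIPELINE:
        ctx = step(ctx)
    return ctx
```

Keeping the pipeline a flat list of stages is what makes the run auditable: each stage's contribution to the context is visible, and a reviewer can see exactly where a verdict came from.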
Why it works the way it does.
The architecture didn't appear by accident. These are the six principles I held the build to — the decisions that made Hydra dependable instead of clever.
Meet the team where they already are
Hydra's entry point is a Jira ticket — the artifact every engineer, PM, and QA already interacts with. Zero new UX for the team to adopt.
Route, then validate
A thin orchestrator decides which head to call; each head specializes in its own verification discipline. Changing how we test the backend never touches frontend logic.
Evidence-first
Every run ships a structured artifact bundle — not just pass/fail. Reviewers trust what they can audit.
Human-in-the-loop
Hydra proposes verdicts and surfaces findings. Humans approve, override, or escalate. The agent never ships to production on its own.
Capabilities as skills, not scripts
Every capability is a composable skill with a contract — so the playbook improves every sprint without a rewrite.
Portable by design
What-to-test is decoupled from how-to-test. The architecture travels to whatever product the team works on next.
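The "capabilities as skills" principle can be sketched as a small contract that every capability satisfies. This is a hypothetical shape, not Claude Code's actual skills API: the `Skill` protocol, the `VerifyEndpoint` example, and the `plan` helper are all invented for illustration.

```python
from typing import Protocol

class Skill(Protocol):
    """Hypothetical contract every capability conforms to."""
    name: str
    def applies_to(self, ticket: dict) -> bool: ...
    def run(self, ticket: dict) -> dict: ...

class VerifyEndpoint:
    """Example skill: verifies an API endpoint named in the ticket."""
    name = "verify-endpoint"

    def applies_to(self, ticket: dict) -> bool:
        return "endpoint" in ticket.get("labels", [])

    def run(self, ticket: dict) -> dict:
        # A real skill would exercise staging here; stubbed for the sketch.
        return {"skill": self.name, "status": "pass"}

def plan(ticket: dict, skills: list) -> list:
    """Compose a run plan from whichever skills declare they apply."""
    return [s for s in skills if s.applies_to(ticket)]
```

Because each skill declares when it applies and what it produces, adding coverage means adding a class that honors the contract; the planner and the rest of the playbook never need a rewrite.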
What Hydra is made of.
Why Hydra is the thing I'm most excited about right now.
Took the initiative — nobody assigned Hydra. I saw the AI-agent moment and rallied the team to build our answer.
Live on staging — not a demo, not a slide deck. Real traffic is running through it.
Growing every sprint — new heads, new skills, new coverage.
Frees the team for higher-value work — the thinking humans are uniquely strong at.