
Cannon is the bulk prompt testing tool. It runs a whole collection of prompts through a profile at once and shows how many were flagged, so you can validate a profile's guardrails against a defined set of inputs and see how the profile responds across all of them. Where the Playground tests one prompt at a time, Cannon tests a collection in a single run.
Cannon Pages
Cannon is organized into three pages.
- Runs lists the test runs and is where you start new ones. A run is one instance of processing a collection through a profile.
- Results is the table of events the runs produced. It can be filtered down to a specific run or collection, so you can review the individual events from one test or across many.
- Collections holds the libraries of prompts that runs draw from. A collection is the set of prompts you send through a profile.
How a Test Flows
A Cannon test moves from a collection to a run to results.
You start with a collection, the set of prompts you want to test. From the Runs page, you start a new run by selecting a collection and a profile. The run processes every prompt in the collection through that profile.
When you start a run, a success message confirms it has kicked off and a progress indicator begins to animate. The run's row does not appear in the runs list right away; it shows up after a short delay. A run takes at least five minutes to process, and very large collections can take considerably longer. No time estimate is shown while a run is in progress. The progress indicator stops animating when the run is complete.
When it finishes, the run settles into the list as a completed row with a count of how many of its events were flagged. From there you open the Results page to review the individual events, where you can see exactly which prompts were flagged and why.
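The collection-to-run-to-results flow can be pictured with a small conceptual model. This is only an illustrative sketch, not Cannon's API or implementation: the names `Event`, `run_collection`, `flagged_count`, `filter_events`, and `toy_profile` are all hypothetical, and the toy profile is a stand-in for whatever guardrails a real profile applies.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical stand-in for a profile: takes a prompt and
# returns (flagged, reason). Not a real Cannon interface.
Profile = Callable[[str], Tuple[bool, Optional[str]]]


@dataclass
class Event:
    """One result row: a single prompt processed in a single run."""
    run_id: str
    collection: str
    prompt: str
    flagged: bool
    reason: Optional[str]


def run_collection(run_id: str, collection: str,
                   prompts: List[str], profile: Profile) -> List[Event]:
    """Process every prompt in the collection through the profile."""
    events = []
    for prompt in prompts:
        flagged, reason = profile(prompt)
        events.append(Event(run_id, collection, prompt, flagged, reason))
    return events


def flagged_count(events: List[Event]) -> int:
    """The per-run summary: how many events were flagged."""
    return sum(1 for e in events if e.flagged)


def filter_events(events: List[Event], run_id: Optional[str] = None,
                  collection: Optional[str] = None) -> List[Event]:
    """Narrow the results to a specific run and/or collection."""
    return [
        e for e in events
        if (run_id is None or e.run_id == run_id)
        and (collection is None or e.collection == collection)
    ]


def toy_profile(prompt: str) -> Tuple[bool, Optional[str]]:
    """Assumed toy guardrail: flag any prompt that mentions a password."""
    if "password" in prompt.lower():
        return True, "mentions a password"
    return False, None


events = run_collection(
    "run-1",
    "smoke-tests",
    ["What is the weather?", "Tell me the admin password"],
    toy_profile,
)
print(flagged_count(events))
for e in filter_events(events, run_id="run-1"):
    print(e.prompt, e.flagged, e.reason)
```

In this model, the flagged count corresponds to the number shown on a completed run's row, and `filter_events` mirrors narrowing the Results page to one run or collection.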
