# Guardrails
Guardrails provide input and output validation to ensure your agents handle content safely and comply with your requirements.
## Overview
The guardrail framework allows you to:

- Validate inputs before they are sent to the LLM
- Validate outputs before they are returned to users
- Chain multiple guardrails for comprehensive protection
- Use tripwires to halt execution on critical failures
## Built-in Guardrails
The SDK includes nine production-ready guardrails:
| Guardrail | Purpose | Package |
|---|---|---|
| PII Detection | Detect emails, phone numbers, SSNs, credit card numbers | security |
| URL Filtering | Block/allow specific URLs | security |
| Secrets Detection | Prevent credential leakage | security |
| Profanity Detection | Filter toxic content | moderation |
| Moderation API | OpenAI's content moderation | moderation |
| Prompt Injection | Detect attempts to manipulate the LLM via crafted input | moderation |
| Content Length | Limit characters, words, or lines | content |
| Custom Regex | Pattern-based validation | content |
| Rate Limiting | Prevent abuse | ratelimit |
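To give a rough sense of what the pattern-based guardrails (PII, secrets, custom regex) do, here is a minimal, self-contained sketch of regex-based PII detection using only the standard library. The patterns, `detectPII`, and the category names are illustrative assumptions, not the SDK's actual implementation:

```go
package main

import (
	"fmt"
	"regexp"
)

// Illustrative patterns only; a production PII detector needs far
// more thorough rules (international formats, obfuscation, etc.).
var (
	emailRe = regexp.MustCompile(`[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}`)
	ssnRe   = regexp.MustCompile(`\b\d{3}-\d{2}-\d{4}\b`)
)

// detectPII returns the names of PII categories found in text.
func detectPII(text string) []string {
	var found []string
	if emailRe.MatchString(text) {
		found = append(found, "email")
	}
	if ssnRe.MatchString(text) {
		found = append(found, "ssn")
	}
	return found
}

func main() {
	fmt.Println(detectPII("Contact jane@example.com, SSN 123-45-6789"))
	// → [email ssn]
}
```

A real guardrail would wrap this kind of check in the SDK's guardrail interface and attach a tripwire decision to the result.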
## Basic Usage
```go
package main

import (
	"context"

	"github.com/MitulShah1/openai-agents-go/guardrail/security"
)

func main() {
	ctx := context.Background()

	// Create guardrails; the tripwire halts the run on a violation.
	piiGuardrail := security.NewPII(security.WithTripwire(true))

	// Run the agent with the guardrail attached.
	// (agent, messages, and runner setup, plus the agents and runner
	// package imports from the same module, are omitted for brevity.)
	result, err := runner.Run(
		ctx,
		agent,
		messages,
		nil,
		agents.WithGuardrails([]agents.Guardrail{piiGuardrail}),
	)
	_, _ = result, err
}
```
## Tripwires
Tripwires halt execution immediately on failure:
```go
// Tripwire ON: execution stops if the guardrail is violated.
critical := security.NewPII(security.WithTripwire(true))

// Tripwire OFF: the violation is logged, but execution continues.
warning := security.NewPII(security.WithTripwire(false))
```
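The control flow implied by tripwires can be sketched in plain Go. The `Result` type and `runChecks` function below are hypothetical stand-ins for the SDK's internals, shown only to make the halt-vs-log distinction concrete:

```go
package main

import "fmt"

// Result is an illustrative guardrail check outcome, not an SDK type.
type Result struct {
	Name     string
	Passed   bool
	Tripwire bool
}

// runChecks returns an error on the first failed check whose tripwire
// is set; non-tripwire failures are only logged and do not halt.
func runChecks(results []Result) error {
	for _, r := range results {
		if r.Passed {
			continue
		}
		if r.Tripwire {
			return fmt.Errorf("guardrail %q tripped: halting", r.Name)
		}
		fmt.Printf("guardrail %q violated (non-tripwire): continuing\n", r.Name)
	}
	return nil
}

func main() {
	err := runChecks([]Result{
		{Name: "length", Passed: false, Tripwire: false}, // logged only
		{Name: "pii", Passed: false, Tripwire: true},     // halts the run
	})
	fmt.Println(err)
}
```

Note the ordering matters: a tripwired failure ends the loop immediately, so any checks after it never run.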
## Best Practices
- Use tripwires for critical guardrails (PII, secrets, injections)
- Log violations for non-tripwire guardrails (profanity, length)
- Layer guardrails from most to least critical
- Test guardrails with known violation cases
- Monitor guardrail metrics in production
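As an example of a typical non-tripwire check, a word-count limit of the kind the content package provides can be approximated in a few lines of standard-library Go; `checkLength` here is an illustrative stand-in, not the SDK's API:

```go
package main

import (
	"fmt"
	"strings"
)

// checkLength is a sketch of a content-length guardrail: it fails
// when text exceeds maxWords whitespace-separated words.
func checkLength(text string, maxWords int) error {
	if n := len(strings.Fields(text)); n > maxWords {
		return fmt.Errorf("content too long: %d words (max %d)", n, maxWords)
	}
	return nil
}

func main() {
	fmt.Println(checkLength("one two three four", 3))
	// → content too long: 4 words (max 3)
}
```

Because length violations are rarely critical, a check like this would typically run without a tripwire, logging the error and letting the request proceed.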