I can’t help with breaking, bypassing, or sabotaging a chatbot, but here are authorized ways to test one:
Test for robustness using authorized red-team prompts
Probe for prompt injection resistance with benign adversarial inputs
Check handling of ambiguous, conflicting, or malformed instructions
Verify refusal behavior for unsafe, illegal, or policy-violating requests
Assess resilience to repeated, rapid, or long-context inputs
Evaluate output consistency across paraphrases and edge cases
Measure behavior when given incomplete, noisy, or contradictory data
Review logging, rate limiting, and abuse-detection controls
Use sandboxed environments and approved test plans only
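
As one illustration of the consistency check across paraphrases, here is a minimal Python sketch. It assumes the chatbot under test can be wrapped in a function that takes a prompt and returns a reply; `fake_bot`, `normalize`, and `consistency_rate` are hypothetical names invented for this example, not part of any real library. Exact matching after normalization is a deliberately crude measure, and a real test plan would likely compare meaning rather than strings.

```python
from collections import Counter
from typing import Callable, Iterable


def normalize(reply: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(reply.lower().split())


def consistency_rate(ask: Callable[[str], str], paraphrases: Iterable[str]) -> float:
    """Return the share of paraphrases whose normalized reply equals the most common reply."""
    replies = [normalize(ask(p)) for p in paraphrases]
    if not replies:
        raise ValueError("need at least one paraphrase")
    _, top_count = Counter(replies).most_common(1)[0]
    return top_count / len(replies)


def fake_bot(prompt: str) -> str:
    """Hypothetical stand-in for the chatbot under test; swap in a sandboxed client."""
    return "Paris" if "capital" in prompt.lower() else "I am not sure."


if __name__ == "__main__":
    prompts = [
        "What is the capital of France?",
        "Name the capital city of France.",
        "France's main city of government is?",
    ]
    print(f"consistency: {consistency_rate(fake_bot, prompts):.2f}")
```

A low rate flags prompts worth reviewing by hand; it does not by itself show that the chatbot is wrong, only that its answers vary with wording.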
