I’m working on Hyperplane Eval, a local property-based testing tool for AI agents that automatically discovers behavioral failure regions instead of relying on manually written prompts.