There are a few things:

a) you can create CI/build checks that run in github and the agents will make sure pass before it merges anything

b) you can configure a review agent with any prompt you'd like to make sure any specific rules you have are followed

c) you can disable all the auto-merge settings and review all the agent code yourself if you'd like.

> to make sure

you've really got to be careful with absolute language like this in reference to LLMs. A review agent provides no guarantees whatsoever, just shifts the distribution of acceptable responses, hopefully in a direction the user prefers.

Fair, it's something like a semantic enforcement rather than a hard one. I think current AI agents are good enough that if you tell it, "Review this PR and request changes anytime a user uses a variable name that is a color", it will do a pretty good job. But for complex things I can still see them falling short.

I mean, having unit tests and not allowing PRs in unless they all pass is pretty easy (or requiring human review to remove a test!).

A software engineer takes a spec which "shifts the distribution of acceptable responses" for their output. If they're 100% accurate (snort), how good does an LLM have to be for you to accept its review as reasonable?

We've seen public examples of where LLMs literally disable or remove tests in order to pass. I'm not sure having tests and asking LLMs to not merge things before passing them being "easy" matters much when the failure modes here are so plentiful and broad in nature.

My favourite so far was Claude "fixing" deployment checks with `continue-on-error: true`

You'd want to have the tests run as a github action and then fail the check if the tests don't pass. Optio will resume agents when the actions fail and tell them to fix the failures.

So... add another presubmit test that fails when a test is removed. Require human reviews.

It's not like a human being always pushes correct code, my risk assessment for an LLM reading a small bug and just making a PR is that thinking too hard is a waste of time. My risk assessment for a human is very similar, because actually catching issues during code review is best done by tests anyways. If the tests can't tell you if your code is good or not then it really doesn't matter if it's a human or an LLM, you're mostly just guessing if things are going to work and you WILL push bad code that gets caught in prod.

[dead]

[dead]

[deleted]