We're dealing with CI logs, produced by a variety of frameworks, languages, etc... And the tough ones to look into are e2e tests, with outputs from infrastructure. I wish a re.match() would be enough, but we often don't even know what to match in the first place.

We started to add deterministic matching on the patterns that the agent sees the most so we don't have to go through the whole thing (for example a flake on PostHog can occurs 100+ times during a day, you don't need to reinvestigate every time). But for new errors, it's tricky.