Instead of "surgically adjusting" logits within an existing model, couldn't you just build the slop detector into the loss function during the initial training stage?