I've naturally done this a lot and suggested that other people prompt this way. I can see how a "ready-made" solution with this behaviour could be interesting.
The compliance parts are good to make clear, considering one segment of the target user audience.
May I ask what techniques you use to test for regressions or verify correct behaviour in your product's multi-turn conversations? What are the biggest lessons you've learned in that space?
Great question. Testing multi-turn Socratic logic is much harder than testing standard RAG. We currently use a 'Shadow Evaluator'—a separate LLM instance that reviews session logs to flag cases where the tutor 'collapsed' and gave a direct answer.
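Roughly, the evaluator pass looks like this (a simplified sketch, not our production code: the judge model, prompt wording, JSON schema, and the `flag_collapses` helper are all placeholders, and it assumes an OpenAI-style chat API):

```python
# Sketch of a "Shadow Evaluator": a second LLM pass over a finished
# session log that flags tutor turns which state the answer outright
# instead of guiding with a question. All names here are illustrative.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

EVALUATOR_PROMPT = """You audit tutoring transcripts. For each tutor turn,
decide whether the tutor 'collapsed' -- i.e. stated the answer outright
instead of guiding the student with a question. Reply with JSON:
{"collapses": [{"turn": <int>, "quote": "<offending text>"}]}"""

def flag_collapses(transcript: list[dict]) -> list[dict]:
    """transcript: [{"role": "tutor"|"student", "turn": int, "text": str}, ...]"""
    response = client.chat.completions.create(
        model="gpt-4o-mini",                      # any capable judge model
        response_format={"type": "json_object"},  # force parseable output
        messages=[
            {"role": "system", "content": EVALUATOR_PROMPT},
            {"role": "user", "content": json.dumps(transcript)},
        ],
    )
    return json.loads(response.choices[0].message.content)["collapses"]

if __name__ == "__main__":
    session = [
        {"role": "student", "turn": 1, "text": "Why does ice float?"},
        {"role": "tutor",   "turn": 2, "text": "Because ice is less dense than water."},
    ]
    print(flag_collapses(session))  # turn 2 should be flagged as a collapse
```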
The biggest learning so far: 'Instruction Drift' is real. You can't just give one long prompt. You have to break the reasoning into smaller 'Cognitive Process Capsules' (CPCs) to keep the model from losing the Socratic thread during long sessions.
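We haven't published the CPC mechanics, but one way to picture the pattern: instead of relying on a single monolithic system prompt, keep small stage-specific instruction capsules and re-inject only the relevant one on each turn. The capsule texts and the toy stage heuristic below are illustrative assumptions, not our real ones:

```python
# One plausible shape for "Cognitive Process Capsules": small
# per-stage instruction snippets appended to the context each turn,
# rather than one long system prompt at the start of the session.
CAPSULES = {
    "probe":   "Ask one open question that surfaces the student's current model. Never state the answer.",
    "nudge":   "The student is close. Ask a narrowing question that targets their specific gap. Never state the answer.",
    "reflect": "The student answered correctly. Ask them to explain why it works in their own words.",
}

def pick_stage(history: list[dict]) -> str:
    """Toy stage heuristic; a real system would use a classifier here."""
    last = history[-1]["content"].lower() if history else ""
    if "i think" in last:
        return "nudge"
    if "because" in last:
        return "reflect"
    return "probe"

def build_messages(history: list[dict]) -> list[dict]:
    """Assemble the next request: base persona plus one fresh capsule."""
    stage = pick_stage(history)
    return (
        [{"role": "system", "content": "You are a Socratic tutor."}]
        + history
        + [{"role": "system", "content": CAPSULES[stage]}]  # re-injected every turn
    )
```

The important part is that the instruction re-enters the context on every turn instead of scrolling further and further away inside one long prompt, which is where we see the Socratic thread get lost.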