I'm a little more on the fence about work-sample interviews, having both designed them and interviewed through them. I've also done my fair share of "traditional" tech interviews, all at startups, never FAANG.

As an interviewer, I much prefer the signals generated through a work-sample interview. I'm much more confident in the resulting hiring recommendation than in one I get from a 1-hour Zoom session. However, if I compare teams that were built through work-sample interviews with teams built through Zoom interviews, I'm not sure the outcomes were that noticeably better.

As an interviewee, I think I understand the frustration of being on the other side. With an in-person interview, I often have a good sense of whether I bombed, or at least something to improve on or replay in my head, so the outcomes are less surprising. On work samples it's harder to know whether you're making mistakes or being out-competed by someone putting in 4 times the effort to polish the solution beyond what their regular work product would be. That said, I had one really good outcome where the work-sample interview clearly flagged the internal dysfunction of a company.

And with both interview processes, I still think there's a really big unknown in what the false no-hire rate is: how much effort is getting wasted rejecting candidates who would actually fit the team.

So if I have to choose a process as an interviewer, I'm with you and would always choose a work-sample interview. On whether it should be considered the "gold standard", I'm much more hesitant. I think there are some limitations that are still hard to control for.

I do wish the Starfighter/Stockfighter model had gained more traction. It would've been interesting to see a recruiting company specialize in this and then seed the interview results to multiple companies, and to see whether that model worked out.

Going through the interview process now for the first time in half a decade, and while I already would have said five years ago that I preferred work samples, that opinion is only growing stronger as I go through the process again.

Leetcode-style interviews feel so stupid and divorced from the reality of the job, especially the “one weird trick” kind where you’re expected to discern the best possible solution to a problem on the spot and in a high-pressure situation.

The reality of the job is usually that when you are under time pressure, a suboptimal solution that does the job is fine (to be fixed later), while if you’re working on something you know is important (hot loop code, core data structures) you have time to think about it and get it right. A leetcode interview doesn’t select for either of those things: it selects for people who have time to grind leetcode problems.

On the other hand, a work sample is realistic: a timeboxed task that you can approach in a familiar environment, without an interviewer breathing down your neck, expecting you to think out loud, railroading you to their preferred solution, etc.

As an interviewer, I always pushed for either work samples, which I quite liked, or coding interviews where we very explicitly said that we just want a working solution, which we could then talk through and look for potential improvements. We also explicitly viewed the coding interviews as being low signal, and tried to make the bar for passing low, so we could get candidates to higher signal conversations.

I do think the work sample route is a little more difficult in the LLM era, in that you are more likely to get a decent performance from a candidate who doesn’t actually know the domain, but a subsequent discussion asking them to explain their approach seems like it would be enough to ferret that out.

I shouldn't keep leaving this comment because it's not that useful, but: AI-era work samples are a fun problem (you definitely can't just use your pre-agent default work samples!), and we've come up with a couple of useful solutions.

If you're in an environment that is open to agents to begin with, one simple thing to do is just tell people to use agents, and ask for the prompts. Prompts are a high-signal artifact, and you can construct rubrics to evaluate them objectively.

I'll get around to writing a piece about this within the next couple months.

Ultimately, work samples aren't a technique specific to technology; they're a longstanding concept in management science. Long before the first coding take-home was ever given, firms were using work samples to qualify salespeople, support professionals, factory workers, whatever. AI agents are a part of what the work of software development involves, and there's obviously no fundamental reason why you can't work-sample them like anything else.

As a recent interviewee, I much prefer work samples. They're less stressful, more in my control, and less dependent on whether I got lucky and clicked with the problem in a live interview. They're also just much more akin to what work is like, and therefore require far less studying. The fact that live interviewing is a completely different skill from actual work is a really bad smell.

> I'm not sure the outcomes were that noticeably better.

It's not just you. At the end of the day, interviewing has been demonstrated to be close to a crapshoot in the best of circumstances, and very few interview schemes operate in the best of circumstances. Work samples are part of the optimal strategy [1], but even then the signal is quite low.

[1] https://psycnet.apa.org/record/1998-10661-006

Work-sample interviews don't have to be take-home. We ran our technical interviews in person and as close to work samples as possible.