>Sooner or later investors will see the "$2M spend" and demand "$4M net profit", and that's not going to materialize.
I think this is probably going to happen at the same time that the providers start really jacking up token prices to extract all the value they can.
I'm a manager and the VPs are starting to ask: how many story points are we getting with AI now? For us, story points = number of days to implement. (I know this is not real agile, but just assume you are in the same position.)
I can't answer that question but plenty of other managers are fully ready to just give bogus numbers.
For my team, use of AI has indeed lowered the story point cost. The coding part of the story takes less work so we have started to lower the story point cost for stories that would previously cost more. Think of a 5SP to 3SP reduction.
We have increased the number of features being delivered but our number of story points delivered has remained static.
When management starts tracking improvements in story point productivity then the agile teams inflate their story point estimates. Sometimes this involves splitting user stories in ways that don't really make sense from a customer perspective just for the sake of having a place to tack on more points.
And I'm not opposed to using story points, they have some utility within an agile team or program. They just aren't a valid way of quantifying productivity changes.
A few years ago I was on a massive project to rebuild and redesign a major public-facing portal. Our dev team was nails, cranking out features and components on a very tight timetable. Several other teams were inflating their story point estimates, so when the higher-ups got their weekly reports, our team was always in dead last for completed story points.
Our manager brought us into an all-hands meeting and kind of read us the riot act, because now we were on the radar of "Bob the executive": it looked like we really weren't delivering much week by week. Had anybody actually looked at the amount of work we were doing and what we were shipping, it wouldn't have been close.
Exactly as you predicted, we started overinflating our stories, creating Epics when they weren't needed, breaking out a single feature into a dozen or more stories. Over the next few weeks, we were all getting pats on the back for "really picking up the pace", when in reality we were just doing the same thing we always did.
It just reinforced the idea that Agile had turned into a system that was easy to manipulate to create the illusion you were doing more than you really were. I imagine we're going to see a lot more of this as C-Suite folks start clamoring for ROI on the millions they're spending on tokens.
> story points = number of days to implement
Some variant of this has been the case in every agile team I've ever worked on.
No one's ever been able to explain story points to me without saying something like "Story points are kind of like how long it will take to implement, except it's not that".
So what is it then? All the explanations and examples are in units of time, but with a disclaimer saying that the true nature of story points is not time-based, except for the fact that they can only be explained in terms of time.
IANAScrumlord, but ideally story points are like a foreign currency: It's both normal and healthy for exchange rates to constantly fluctuate, and every country (team) has its own units for capturing guesswork and confidence and quality/speed mix.
The managerial goal is to take near-past moving average rates (from completed tickets) and use them to forecast near-future expectations. 1.0 of Team Alpha's points might mean 4 hours this week... but anybody who shows up six months later expecting exactly the same rate is foolish, doubly-so if they expect it to be the same across teams, or after a big change in staff or tooling or project.
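To make the "exchange rate" idea concrete, here's a minimal sketch of a trailing-average forecast. The ticket history, the window size, and the function name are all made up for illustration; nothing here is a standard Scrum formula.

```python
def hours_per_point(completed, window=5):
    """Trailing exchange rate: hours spent per point over the last `window` tickets."""
    recent = completed[-window:]
    total_points = sum(points for points, _ in recent)
    total_hours = sum(hours for _, hours in recent)
    if total_points == 0:
        raise ValueError("no completed points in the window")
    return total_hours / total_points


# Hypothetical history for one team: (points estimated, hours actually spent).
team_alpha = [(1, 4), (2, 9), (3, 11), (1, 5), (2, 7), (3, 15), (1, 3)]

# Rate from the five most recent tickets only; older tickets drop out.
rate = hours_per_point(team_alpha, window=5)

# Forecast for 8 points of upcoming work at the current rate.
forecast_hours = 8 * rate
```

The rate is only meaningful for this team and this window. Recompute it as tickets close instead of carrying it forward as a constant, and don't apply it to another team's points.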
______
Other musings: Whenever a manager says "my current estimate of the rate is X pts/hr, use that when sizing", I feel it's a mistake. It kills off the intuition you really want to capture. Team members ought to be comparing expected tasks to past tasks.
Also, the goal of "accurate scheduling predictions" exists in conflict with "measure employee output". Trying to use your point-system for one generally harms the other.
I once met someone who refused to engage with leadership using his team's story points as a direct measure of productivity. To make it harder to extract the data and compare against other teams, they moved to using names of animals to represent types of task associated with differing amounts of uncertainty.
I've also seen a supplier who was asked to provide some kind of tracking, where literally nothing existed. Their delivery team produced reports with story points per person, per task, per sprint. Every sprint, every person hit their target month after month after month. They were asked to stop.
I always see SP as a combination of time and risk. I think a lot of people do not include risk in the estimate.
So a story might be estimated at 3SP to implement but there's a high risk that it would blow out (e.g. idea was not fully proven in a PoC, work is in an area that is historically underestimated, reliance on a different team, etc.), so we set it to 5SP to include that risk. Maybe 50% of the time it does get finished in what a normal 3SP would finish in, but at least we've covered the 50% of time it blows out.
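A rough way to see where the 5 comes from is a probability-weighted estimate. The 3SP base and the 50% odds are from the example above; the size of the blow-out (7SP) is my own assumption, picked so the numbers line up.

```python
def expected_points(base, blowout, p_blowout):
    """Probability-weighted cost of a story, in points."""
    return (1 - p_blowout) * base + p_blowout * blowout


base = 3         # the "everything goes well" estimate from the example
p_blowout = 0.5  # it blows out about half the time, per the example
blowout = 7      # ASSUMPTION: hypothetical cost when it does blow out

# 0.5 * 3 + 0.5 * 7 = 5.0
fair_estimate = expected_points(base, blowout, p_blowout)
```

In practice nobody computes this explicitly; the point is only that a 5SP estimate on "3SP of work" can be a reasonable average rather than padding.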
IMO it's best to welcome the addition of a risk premium, because if you don't, it'll creep in anyway, just in a patchy and inconsistent manner.
Over time it becomes "priced in" to the moving average, which is good, assuming that employee instincts are generally valid.
Of course, if someone makes the mistake of trying to peg points to time, they're indirectly creating a kind of inflation: Yesterday's "just in case" premium should not become today's "everything goes well" baseline.
It's a way to say how long it will take without saying how long it will take.
I've always asked for the conversion when joining a team, so I can calibrate my story point estimates. On some teams 1 point is about a half-hour task and on others it's a full day.
If you earnestly believe story points are a good measure of productivity then I'm afraid you have a lot to learn.
> use of AI has indeed lowered the story point cost
It should not have. At least not significantly. Points should represent complexity, risk and overall effort (review burden, testing burden, dependencies, etc.), and so AI should increase velocity before it decreases story point estimates.
Over time, if a team's baseline delivery model genuinely changes, then reference stories can be recalibrated, but casually saying "AI lowered the point cost" is usually a smell that points are being treated as time estimates.
This is the same reason points should not = days even without AI. Velocity is what tells you if a team is getting better through training, tooling, hiring/firing, or process improvements. Re-pointing the same class of work downward hides the gain.
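A toy calculation of how re-pointing hides the gain, using the 5SP to 3SP reduction mentioned upthread. The story counts per sprint are invented for illustration.

```python
def velocity(stories_done, points_per_story):
    """Points delivered in a sprint for a batch of same-sized stories."""
    return stories_done * points_per_story


# Hypothetical sprint before AI tooling: 6 stories of a given class at 5SP each.
before = velocity(6, 5)

# Hypothetical sprint after AI tooling: 10 stories of the same class get done.
after_repointed = velocity(10, 3)  # team re-pointed the class from 5SP to 3SP
after_stable = velocity(10, 5)     # team kept the original 5SP reference

# Relative improvement that only shows up if the points stay put.
gain = after_stable / before
```

With re-pointing, velocity stays at 30 and the improvement is invisible in the numbers ("more features, static points"). With stable points it goes from 30 to 50, which is the signal velocity is supposed to carry.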
Almost certainly. Software firms are pretty bad at self-evaluation and they're profitable enough that Capitalism won't force them to do it either.
Right now the subscriptions are still in the range of reasonable business expenses, but pretty soon they'll have to jump, and $200/month/seat subscriptions turning into $2000/month/seat subscriptions is going to get even very badly run companies to re-evaluate.
It's worse than that. Developers themselves are drunk. They'll be cut off from tools right when they no longer understand the underlying code they're responsible for.
We're already here even. I know of a company that was doubling their Codex spend and hitting the cap week over week and finally they had enough and stopped increasing. Then they maxed out on credits and had a week of no Codex. A large percentage of the engineers loudly refused to work for the rest of the week. They were managing the Codex managing the codebase and were totally incapable of dealing with its output without it.
Amen. We are still running highly unoptimized workflows in AWS and nobody reviews why we spend so much $ on that now while it was peanuts when we did it all ourselves.