YAML is okay for writing structured prose for humans. It’s terrible for anything consumed by programs because even that single microservice has a high likelihood of some problem caused by YAML’s magic typing, silent data loss due to indentation, etc. unless you pair it with a separate validation tool chain, making the argument for simplicity increasingly dubious.
Validation is required, yes.
Sure, so at that point how much are we really saving versus using a better alternative? Using YAML correctly is harder because you need to do not only the validation that every format needs, but also YAML-specific work to avoid problems created by YAML rather than by the problem domain. For example, if typing less is my goal, isn’t it easier to, say, always quote country_name than to run a separate validator that catches the Norway problem?
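For anyone who hasn't hit the Norway problem, here is a minimal sketch of it. It imitates YAML 1.1-style implicit typing of plain scalars using only the Python standard library; `resolve_scalar` is a hypothetical helper written for this illustration, not the API of any real YAML parser, and it ignores escapes and every type other than booleans.

```python
# Sketch of YAML 1.1-style implicit typing for plain (unquoted) scalars.
# resolve_scalar is a hypothetical helper, not a real YAML library API.
YAML11_TRUE = {"y", "Y", "yes", "Yes", "YES", "true", "True", "TRUE", "on", "On", "ON"}
YAML11_FALSE = {"n", "N", "no", "No", "NO", "false", "False", "FALSE", "off", "Off", "OFF"}


def resolve_scalar(token):
    """Return the value a YAML 1.1-style resolver would assign to one scalar token."""
    # Quoted scalars are always strings (escape handling is omitted here).
    if len(token) >= 2 and token[0] == token[-1] and token[0] in "'\"":
        return token[1:-1]
    # Plain scalars that match the boolean word lists silently become booleans.
    if token in YAML11_TRUE:
        return True
    if token in YAML11_FALSE:
        return False
    return token


plain = ["SE", "NO", "DK"]
quoted = ['"SE"', '"NO"', '"DK"']

print([resolve_scalar(t) for t in plain])   # ['SE', False, 'DK']
print([resolve_scalar(t) for t in quoted])  # ['SE', 'NO', 'DK']
```

Quoting fixes it, but only if every author remembers to quote every time, which is exactly the YAML-specific chore being described.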
Why not pick a config language that works with our current config formats, looks like our current config files, and addresses many of the dumb problems that arise only in current config choices?
It doesn't have schemas, nor does it scale. It has no valid place because invisibly scoped languages are a terrible idea.
It's certainly insufficient; look at what happened to Helm.
I’m with you that it’s terrible, but it very much does have schemas! The vast vast majority of YAML-based big APIs (k8s, helm, compose, and so on) all absolutely do check documents against schemas (not just ad-hoc validation rules) internally.
The real issue is two things: the smaller one is that there’s no single or self-describing schema system (like XML supports); the larger thing is that most YAML schema validations prioritize supporting extremely permissive and complex input documents over being predictable and appropriately restrictive. And that’s a harder problem to fix, because it has more to do with priorities and community conventions.
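To make the permissive-versus-restrictive distinction concrete, here is a rough sketch over an already-parsed document. The `SCHEMA` layout, the `validate` function, and the field names are all made up for illustration; real validators are far richer, but the trade-off has the same shape.

```python
# Hypothetical schema: field name -> required Python type.
SCHEMA = {"name": str, "replicas": int}


def validate(doc, schema, strict):
    """Return a list of error strings; an empty list means the document passed."""
    errors = []
    for key, expected in schema.items():
        if key not in doc:
            errors.append(f"missing key: {key}")
            continue
        value = doc[key]
        if strict:
            # Strict: the parsed type must match exactly (so True is not an int).
            if type(value) is not expected:
                errors.append(f"{key}: expected {expected.__name__}, got {type(value).__name__}")
        else:
            # Permissive: accept anything that can be coerced to the expected type.
            try:
                expected(value)
            except (TypeError, ValueError):
                errors.append(f"{key}: cannot coerce {value!r} to {expected.__name__}")
    if strict:
        # Strict: unknown keys are errors, which catches typos such as "replcas".
        errors.extend(f"unknown key: {key}" for key in doc if key not in schema)
    return errors


doc = {"name": "web", "replicas": "3", "replcas": 5}

print(validate(doc, SCHEMA, strict=False))  # []
print(validate(doc, SCHEMA, strict=True))
# ['replicas: expected int, got str', 'unknown key: replcas']
```

The permissive mode happily accepts a document containing a misspelled key and a stringly-typed number; the strict mode rejects both, at the cost of rejecting input some users would like to be accepted.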
If people wanted strict schemaful YAML to be the norm, they would have consolidated on one of the many tools that do that by now. The issue is, people don’t want that; they want extremely flexible and open-ended APIs. YAML as currently practiced is conducive to that goal, but it’s the goal that leads to issues, not the choice of (bad, I agree) data language.
No YAML schema will save you when your HelmRelease arbitrarily merges your YAML files together on top of kustomize on top of whatever else.
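A small sketch of why layering is the hard part. This is a generic recursive merge of parsed documents, written only for illustration; `deep_merge` and the sample values are hypothetical, and the exact merge rules of Helm, kustomize, or any other tool may differ from what is shown.

```python
def deep_merge(base, overlay):
    """Recursively merge overlay into a copy of base: maps merge, everything else is replaced."""
    merged = dict(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists in the overlay replace the base value wholesale.
            merged[key] = value
    return merged


base = {"image": {"repo": "example/web", "tag": "1.0"}, "ports": [80, 443]}
overlay = {"image": {"tag": "2.0"}, "ports": [8080]}

result = deep_merge(base, overlay)
print(result)
# {'image': {'repo': 'example/web', 'tag': '2.0'}, 'ports': [8080]}
```

Both layers look reasonable on their own, yet the merged result has quietly dropped ports 80 and 443. Validating each file separately says little about what actually gets deployed; only the merged output is worth checking, and that is often the one document nobody looks at.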
In practice, schemas are mostly useless in my experience, because people bend YAML as if they really, really want a programming language instead.