dub: JSON, SDL, YAML, TOML, what color should we paint it ?
Witold
witold.baryluk+dlang at gmail.com
Wed Mar 8 00:11:30 UTC 2023
On Tuesday, 28 February 2023 at 14:29:28 UTC, Mathias LANG wrote:
> Obviously such a change would not happen overnight, and would
> need broad support from the community. Opinions ?
Pure JSON. And only it.
Reason? It is rather simple, and super compatible. Easy to parse
from anywhere: JavaScript, Python, Ruby, PHP, C++, C, syntax
highlighters on websites, web frameworks, editors, formatters,
command line tools (jq), etc. Do not assume dub files will only
be consumed by dub.
YAML is horrible in my opinion. I use it a lot, and from a
distance it looks nicer than JSON (comments, less verbosity, less
quoting, etc), but it is not good in the long run. 1) YAML parsers
are super complex. 2) Backreferences are complicated. 3) Multiple
ways of doing the same thing (strings, arrays, dicts), so you
cannot easily read a file and write it back programmatically
without likely messing up diffs. 4) Too many damn ways to write
strings. 5) No quoting on values causes issues when a value
accidentally is integer/float-like or boolean-like (including the
word `no`). I hate this, and I just quote absolutely everything
because of it, which defeats a big part of YAML. 6) The
multi-document feature is an anti-feature. 7) Slow. 8) Slow.
9) There are a lot of implementations, and in reality they all
differ a bit in minor details. 10) Slow. 11) Some extra features
are just broken (like date parsing).
I use YAML a lot, in Ansible, Prometheus, Kubernetes, GitHub,
Docker, and a few other frequently used projects. I never use it
for projects I build personally, because I do not like its
complexity.
The only things that would be nice to have in JSON: comments and
trailing commas. JSON5, you say. I say no. Why? Compatibility. One
can live without trailing commas. Comments can be emulated using
object keys, e.g. keys starting with an underscore, which are
ignored during processing. Multi-document can be easily done by
just having JSON after JSON in one file (many parsers can parse
one value at a time, and allow you to continue with the next
object). Not that dub needs this feature anyway.
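As a rough illustration of the two workarounds above, here is a minimal Python sketch using only the standard `json` module. The recipe contents, the `_comment` / `_why` key names, and the helper names `strip_comment_keys` and `iter_json_documents` are all made up for this example; nothing here is dub behaviour.

```python
import json


def strip_comment_keys(value):
    """Recursively drop object keys that start with an underscore."""
    if isinstance(value, dict):
        return {k: strip_comment_keys(v) for k, v in value.items()
                if not k.startswith("_")}
    if isinstance(value, list):
        return [strip_comment_keys(v) for v in value]
    return value


def iter_json_documents(text):
    """Yield each top-level JSON value from a string of concatenated values."""
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            return
        value, pos = decoder.raw_decode(text, pos)
        yield value


# Hypothetical file content: two JSON objects back to back,
# with underscore-prefixed keys acting as comments.
RECIPE = """
{
    "_comment": "Hypothetical recipe, for illustration only",
    "name": "example",
    "dependencies": {
        "_why": "pinned until the next release",
        "vibe-d": "~>0.9.5"
    }
}
{ "name": "second-document" }
"""

docs = [strip_comment_keys(d) for d in iter_json_documents(RECIPE)]
print(docs)
```

Note that a plain `json.loads` call on the whole string would reject the second object as extra data; the one-at-a-time behaviour needs a decoder that reports where it stopped, which is what `raw_decode` does.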
I do not like JSON either, but I dislike YAML, JSON5 and TOML
even more. SDL is too XML-like (with attributes), but does not map
nicely to processing in most programming languages (i.e. it is
not just a dict / aa), and often requires awkward XML-like /
DOM-like parsing, which is also more complex than it needs to be.
I would say YAML is okish if the files are not too big and you
edit them literally every day. But the ones in dub, you edit only
a very few times. So its human friendliness isn't really a good
selling point to me.
But the YAML spec is so big, and has (or had) so many bugs and
issues, that I consider it a horrible language.
Every few weeks I have some YAML issues, be it in Python, Go, or
Ansible. We even had a few production outages caused by YAML's
idiotic parsing rules.
Simple, fast, and universal, is better than complex, slow and
niche.
> But JSON is a terrible format to write configurations in, given
> how verbose it is, and it lacking support for comments.
I do not agree with this statement.
You want comments. Just add them as underscore-prefixed keys. Or
use `//`, which is rather easy to strip before passing to other
tools.
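To show what "easy to strip" could look like, here is a small Python sketch that removes `//` comments before handing the text to a standard JSON parser. The function name `strip_line_comments` and the sample recipe are invented for illustration. It only handles `//` line comments, and it tracks string literals so that a `//` inside a value (such as a URL) is left alone.

```python
import json


def strip_line_comments(text):
    """Remove // comments that appear outside of JSON string literals."""
    out = []
    in_string = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < len(text):
                # Copy the escaped character verbatim (handles \" and \\).
                out.append(text[i + 1])
                i += 2
                continue
            if ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip to the end of the line, keeping the newline itself.
            while i < len(text) and text[i] != "\n":
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# Hypothetical recipe with comments, for illustration only.
COMMENTED = """
{
    // build settings for the example package
    "name": "example",
    "homepage": "https://example.org"  // the slashes in the URL stay
}
"""

recipe = json.loads(strip_line_comments(COMMENTED))
print(recipe)
```

The string tracking is the part that a naive regex-based stripper tends to get wrong: without it, the URL above would be cut off at `https:`.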
If you want to comment some part of the config temporarily, then
just remove it. Most people use version control. It will be in
their history.
I would not consider JSON really a configuration language. It is
more of a storage and data transfer format. Configuration
languages are a different thing; there are a few decent ones out
there, like jsonnet, HashiCorp's HCL, Dhall, and a few more.
Interoperability is not an issue for them, as: 1) there are
actually few implementations, 2) they are not used directly by
any system; rather, they are passed through processing, and a
simple (flat and dumb) format is used as output (usually JSON) to
be consumed by programs. I like proper configuration languages
like jsonnet, because otherwise you end up with some horrible
templating like Jinja in Ansible, or the craziness of Helm charts
and K8s Kustomize, which are all just horrible hacks with poor
usability. But for dub, using a proper configuration language
would be overkill in my opinion.
Personally I would use text-encoded protocol buffers. Schema
based validation and typing out of the box. I use protocol
buffers and text-encoded ones for configs in most of my personal
projects (Go, D, C++, Python), but it does come with some other
tradeoffs (proto buffer definition files, an extra compilation
step, which is easy to automate, but a barrier for some).
But adding anything new is just not a good idea long term. You
will need to support all the formats for years.
I think better documentation and tooling are way more important
than the format of the config. Node.js and TypeScript / JavaScript
people use huge JSON files as config for their build systems, and
it all just works fine.
Cheers.