Hacker News
Dhall: A Non-Repetitive Alternative to YAML (dhall-lang.org)
193 points by ff_ on July 4, 2019 | 177 comments



Dhall is an awesome tool to have in your DevOps tool belt - we're heavy dhall users at meshcloud [0] and couldn't be happier about it. We picked it after evaluating a long list of contenders (yaml madness with anchors, jsonnet, ksonnet, j2/jinja, a hacked ejs compiler [1] and some more I forgot). It's so good we're looking into how we can give back/donate to the project.

Dhall elegantly solves a major challenge: configuration management at scale. We build a multi-cloud management platform, which serves DevOps teams, IT Governance, Controlling and IT Management in large enterprises. That means we're an integration solution for a lot of things, so we need to be highly configurable. Because we also manage private clouds (a la OpenStack, Cloud Foundry, OpenShift etc.), we often run on-premises and operate our software as a managed service. Using dhall allows us to _compile and type check_ all our configuration for all our customers before rolling things out. We use dhall to compile everything from terraform/ansible, kubernetes templates, spring config, to concourse ci pipelines and customer-specific reference data to load into our product. Since adopting dhall earlier this year, we measurably reduced our deployment defect rate and regained the ability to safely refactor configuration.

It takes a little time to get used to, but we appreciate that it's highly opinionated around formatting and "how to do things" - somewhat in the same way as golang is. It has certainly helped that we had a member with haskell experience on the team, as dhall is built in haskell and the syntax feels familiar.

Plug: if you're looking for a job working with dhall, reach out :-)

- 0: https://meshcloud.io
- 1: https://github.com/Meshcloud/ejs-compiler


We're also heavy Dhall users in production. Functional, strongly typed configuration is such a powerful concept that I struggle to understand how the language isn't more popular yet.

Common example: let's say I want to set up a PostgreSQL database for a service running in Kubernetes in AWS. How best to get it done?

Well, it turns out there's a number of different options: you can set up a DB through RDS, and a service in Kubernetes which directs to it through an externalName, which is probably what you want in production; you can set up Postgres as a StatefulSet, which is probably what you want in an ephemeral testing environment; or maybe you have a customer with a full-time DBA who will create the database for you and give you a connection string.

With Dhall, you set up a union type with each of these scenarios as options, and then you have a Dhall function for your Terraform and Kubernetes configurations. In your Terraform configuration, you have an RDS module where the count is set to 1 for the RDS/production scenario and 0 otherwise. In your Kubernetes configurations, you set up a service with an external name appropriately when you need to, set up a StatefulSet when that's relevant, etc.

Because they all use the same type in their function's parameter, they're guaranteed to stay consistent. You're guaranteed to never have an RDS instance setup alongside a Postgres StatefulSet. If you need to make changes (add options, change options, etc.) then you will get type errors in each and every place which forces you to address them, including in places you forgot about.
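
As a rough illustration of the pattern described above, here is a minimal Python sketch (an analogy, not Dhall). The three scenario classes, their fields, and both function names are hypothetical. A single `Database` value drives both the Terraform count and the Kubernetes objects, so the two outputs cannot disagree about which scenario is active.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Union


@dataclass(frozen=True)
class Rds:
    instance_class: str


@dataclass(frozen=True)
class StatefulSet:
    storage_gb: int


@dataclass(frozen=True)
class External:
    connection_string: str


# The "union type": exactly one scenario is chosen per environment.
Database = Union[Rds, StatefulSet, External]


def terraform_rds_count(db: Database) -> int:
    """Count for a hypothetical RDS Terraform module: 1 only for the RDS case."""
    return 1 if isinstance(db, Rds) else 0


def kubernetes_resources(db: Database) -> List[Dict[str, Any]]:
    """Kubernetes objects derived from the same value."""
    if isinstance(db, Rds):
        return [{"kind": "Service", "spec": {"type": "ExternalName"}}]
    if isinstance(db, StatefulSet):
        return [{"kind": "StatefulSet", "storageGb": db.storage_gb}]
    if isinstance(db, External):
        return []  # nothing to deploy; the app just receives the connection string
    raise TypeError(f"unhandled database scenario: {db!r}")
```

In Python a forgotten branch only surfaces at runtime (or through an external checker such as mypy), which is weaker than the type errors the comment describes.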

We started to adopt Dhall more than half a year ago now and we've barely scratched the surface of what the language makes possible. Purity in infrastructure and operations is a powerful drug.


Heads up - Your naked subdomain redirect to www doesn't seem to be working. If I go to www directly, I don't get the timeout.


thanks! turns out it wasn't working for https, should be fixed now.


working for me now as well


I'm trying to use it for Kubernetes since it can both work like helm (parameterizing functions) and kustomize (using the merge // operator). Moreover it has (safe) imports which make defining constants quite easy.

There are already kubernetes bindings available https://github.com/dhall-lang/dhall-kubernetes .

The syntax in the examples looks a bit more verbose and less readable than yaml but I think building sensible abstractions on top of it will alleviate the pain (abstractions here are innocuous since you can 'normalize' the code and they disappear)

I'm not too happy with the default formatting though. I think if the formatter indented nested values similar to yaml that would look better to the human eye.
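
The two styles mentioned in this comment can be sketched in Python (an analogy only; `deployment` and `merge` are hypothetical names). A function from parameters to a manifest plays the helm role, and a shallow merge in which the right-hand side wins plays the kustomize role.

```python
from typing import Any, Dict


def deployment(name: str, replicas: int) -> Dict[str, Any]:
    """Helm-style: a function from parameters to a manifest."""
    return {"kind": "Deployment", "name": name, "replicas": replicas}


def merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Kustomize-style: shallow merge where keys in `override` win."""
    return {**base, **override}


base = deployment("echo", replicas=1)
production = merge(base, {"replicas": 3})
```

Note that `merge` builds a new dictionary, so `base` is left untouched and can be reused for other environments.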


> Moreover it has (safe) imports which make defining constants quite easy.

I read about dhall’s imports, and I don’t think I like it. If I add a text configuration mechanism to software, I do not want it accessing the network by default, full stop. To me, a “safe” configuration language means that parsing terminates, does not have side effects, does not touch the network, and that parsing the same file twice gives the same output unless I explicitly change an input. Pulling a prelude off of github misses several of these requirements.

(Having your config file fail to parse if your network is down is bad, bad news if that config is needed to bring your network up. It’s also bad news if a parsing failure due to a transient network issue leaves your system in a state where it won’t quickly recover if the network comes back.)


You can still do that, though. In Dhall you may import things remotely as you develop and then tell Dhall to pre-fetch the result, you can commit that and it will not access anything.

You may also just download any imports yourself and source them locally.

Additionally, Dhall supports import fallbacks: for example, you may first try a remote import, and if it fails it will look in another place, which could be remote or local. This is a good strategy for developing locally and then committing imports for production use.

You can also, of course, host the files in your local network.
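
The fallback idea from this comment, sketched in Python under stated assumptions: `resolve`, `remote`, and `vendored` are hypothetical names, and the outage is simulated. Each source is tried in order and the first one that loads is used.

```python
from typing import Callable, Sequence


def resolve(loaders: Sequence[Callable[[], str]]) -> str:
    """Return the result of the first loader that succeeds."""
    errors = []
    for load in loaders:
        try:
            return load()
        except OSError as exc:  # network and file errors both derive from OSError
            errors.append(exc)
    raise RuntimeError(f"all import sources failed: {errors}")


def remote() -> str:
    raise ConnectionError("network is down")  # simulated outage


def vendored() -> str:
    return "{ replicas = 1 }"  # stands in for a committed local copy
```

Calling `resolve([remote, vendored])` falls through to the vendored copy, which is the "develop remotely, commit locally" strategy described above.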


I get your point, but you don't need to run imports over the network (local imports are fine).

Also, if you were to import over the network, by running `dhall freeze` a semantic hash of the content is computed so you are 100% sure that what you are importing is not going to change. Moreover, files that have a hash value will be cached by dhall.

If you don't want to bother with copying over Prelude and you don't trust the cache, you can also normalize the code before pushing it to the network. This will flatten all your imports and reduce your file to normal form.

You might be interested in what they say about imports here: https://github.com/dhall-lang/dhall-lang/blob/master/standar...
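
To make the hash-pinning idea in this comment concrete, here is a minimal Python sketch. It is an analogy rather than Dhall's actual algorithm: the real hash is computed over a normalized Dhall expression, and the canonical JSON encoding below is only a stand-in for that normalization. `semantic_hash` and `checked_import` are hypothetical names.

```python
import hashlib
import json


def semantic_hash(value) -> str:
    """Hash a canonical encoding, so formatting and key order do not matter."""
    canonical = json.dumps(value, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def checked_import(fetched, expected_hash: str):
    """Accept fetched content only if it matches the pinned hash."""
    actual = semantic_hash(fetched)
    if actual != expected_hash:
        raise ValueError(f"integrity check failed: {actual} != {expected_hash}")
    return fetched
```

Because the hash covers meaning rather than bytes, a reformatted copy of the same content still passes, while any change in content is rejected.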


Yeah, the formatting the `dhall` CLI tool uses isn't my favourite. Though luckily I don't usually have to look at it much :)



This isn’t an alternative to yaml. It’s a yaml generator. To me it’s not competing with yaml, it’s competing with python or Haskell, and i’d argue that putting yet another language in your stack just for generating config files is added unneeded complexity. And sure while both python and Haskell are Turing complete, how often do we actually run into issues when generating flat config files? I mean I’ve never had that issue, and I’ve never caught myself thinking “if only there was a nice way to limit myself to a non Turing complete subset of python/Haskell”...


> i’d argue that putting yet another language in your stack just for generating config files is added unneeded complexity.

Well... I'd argue that when using Python I don't feel the need for a config file language in the first place... it's human friendly enough and I don't have to learn another syntax, use another parser, etc. I had to work on a Symfony project recently and I wish it wasn't sprinkled with all those yaml files.

> I’ve never caught myself thinking “if only there was a nice way to limit myself to a non Turing complete subset of python/Haskell”

Ditto... seems like bloat to me. There may be some use cases I don't know about but these config file languages tend to repel me.


That's kind of like saying C isn't an alternative to assembly, it's an assembly generator.


On a practical level, C is an alternative to assembly because the appropriate black-box tools and combinations of tools allow the user to easily turn source code in either language or a combination of both into an executable.

Generating assembly from C is an implementation detail, and many C compilers don't do that.

On the other hand, Dhall really is a YAML generator: the available tools allow only one-way conversion (in particular, there is no interpreter/library to ingest Dhall from the configured application itself).


> in particular, there is no interpreter/library to ingest Dhall from the configured application itself

Not true. Perhaps this needs to be highlighted more prominently on the homepage. There are direct language bindings for Ruby, Haskell, and the JVM with more on the way.


Dhall is also a JSON generator. And you could write your own Dhall implementation that lets you consume it directly from an application if you want to, it's just nobody's considered that worth doing.


Which is odd because tons of applications in the wild already have this amazing capability of directly consuming python and Haskell without first converting to yaml. But of course you still have the ability to both produce and consume yaml if that’s needed for some reason.


What applications directly consume Haskell code? The only examples I can think of off the top of my head are apps written in Haskell that actually get recompiled any time you change the "configuration" Haskell code.

As for Python, that's because the Python interpreter can be embedded in an app.

That said, Dhall is exposed as a Haskell library, so if you're writing a Haskell app you could consume Dhall directly and skip the YAML. https://hackage.haskell.org/package/dhall-1.24.0/docs/Dhall-... shows examples of this.


There is a tool to generate YAML from Dhall, but there are also language bindings for Ruby, JVM, and Haskell with more on the way. I don't think generating YAML will be a main use case for long.


If you have functions that can call functions, you'd better not have recursion if you want to not be Turing-complete.

Non-Turing-completeness is certainly very important in many cases (e.g., in DTrace and eBPF), but I'm not sure that it's so important for configuration. Assuming for a moment that I don't need non-Turing-completeness for configuration, my choice of DSL would be jq[0]! Using jq for configuration means that I can use JSON, TOML-style, and other ways of expressing complex data, including combinations of them, all with "interpolation" (not quite) and complex computation being available.

  [0] https://stedolan.github.io/jq/


One valuable point of Dhall is that it is programmable (yet not TC) in such a way that you can e.g. describe a whole system entirely in Dhall and then (in Dhall!) derive whatever further configurations (plural!) you need from that. This is much more feasible than in e.g. YAML because Dhall is strongly typed.

So you could describe e.g. a cluster of machines entirely in Dhall and derive Ansible YAML scripts (with all their boilerplate), derive DNS config files, etc. etc. all from a single strongly typed description.


It goes even further: because Dhall supports functions one can write a function that will migrate old configs to new format. Migration function can also be type checked (it must accept old config format and emit new one). Of course all of this without TC.
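
A typed migration of the kind described here might look like the following Python sketch. The two config shapes and the default timeout are invented for illustration; the point is only that the function's signature pins down both the old and the new format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfigV1:
    host: str
    port: int


@dataclass(frozen=True)
class ConfigV2:
    url: str
    timeout_seconds: int


def migrate(old: ConfigV1, default_timeout: int = 30) -> ConfigV2:
    """Map the old shape to the new one; new fields get explicit defaults."""
    return ConfigV2(
        url=f"http://{old.host}:{old.port}",
        timeout_seconds=default_timeout,
    )
```

If `ConfigV2` later gains a required field, the constructor call in `migrate` fails until the migration is updated, which is the safety property the comment is pointing at.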


I mean, jq is a powerful programming language. Did you look at the link I posted?


Sure, I'm familiar with it. It's nothing compared to what Dhall can do.

... unless of course it's grown to a similar point. As of about a year ago, for some complex transformations I had to stitch together multiple invocations of jq using pipes, etc. etc. That may have been my inexperience with using it, though.

Regardless, the point about typing stands. Dhall uses structural typing/subtyping which turns out to be hugely useful for config transformation.


I never have to "stitch together multiple invocations of jq using pipes". There was a time when I did have to, but that was years ago, and it now has the features needed to avoid that. Not sure what the point about typing was -- you can do the same derivation of data from a master entity with or without strong typing (though my preference is always for strong typing, which jq lacks).


> you can do the same derivation of data from a master entity with or without strong typing (though my preference is always for strong typing, which jq lacks).

Yes, of course you can. The point is that the strong typing helps prevent inadvertent and hard-to-spot mistakes. In jq even simple typos of a field name can lead to "empty result set" silent failures instead of "what do you mean!" loud failures. That's a big thing.
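
The silent-versus-loud distinction can be shown with a small Python sketch (an analogy; `Deployment`, `load`, and the misspelled key are hypothetical). An untyped lookup of a misspelled field quietly returns nothing, while loading into a declared shape rejects the same typo.

```python
from dataclasses import dataclass

raw = {"replicas": 3}

# Untyped lookup: the misspelled key silently yields None.
silent = raw.get("replcas")


@dataclass(frozen=True)
class Deployment:
    replicas: int


def load(data: dict) -> Deployment:
    """Typed load: unknown or missing field names raise TypeError."""
    return Deployment(**data)
```

Here `load({"replcas": 3})` raises immediately, whereas the `.get` lookup lets the mistake travel further downstream before anyone notices.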


Strong typing generally doesn't speak to "how many values this expression should produce", only their types. Not saying it couldn't, but that even in Haskell you don't quite get that.


I was talking about a mismatched field name. What case are you talking about?

EDIT: Actually, responding to

> Strong typing generally doesn't speak to "how many values this expression should produce"

Have you heard of affine/linear types?


Well, you did format it so that it's not clickable...


Recursion is fine, as long as an argument gets smaller at every iteration, since that guarantees termination.


Oh, like Ackermann function[0]? Because I wouldn't want my server try to evaluate that.

[0] https://en.wikipedia.org/wiki/Ackermann_function


Exactly.

And heck, even with loops that are guaranteed to terminate, you can loop long enough to have that be a problem.

You really want to have no loops at all, like DTrace and eBPF, or else you want to give up on this idea of not being Turing-complete. You have to decide how critical it is to be able to put a reasonable bound on the run-time of a function.


Since there are no side effects in Dhall you can always just impose an arbitrary timeout if you really want to.

IME, it's really rare to stumble upon pathological complexity like the Ackermann function. You basically have to go out of your way to do 'absurd' things.

(The point is of course taken, but personally I don't really care much about the TC-ness or otherwise of Dhall, though it is kind of a nice-to-have to ensure that you don't accidentally introduce infinite loops.)
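
For readers following this sub-thread, a small Python sketch of both points: the Ackermann function always terminates, yet its cost explodes, and because the evaluation is pure it can be abandoned safely once a step budget runs out. The `budget` parameter and `BudgetExceeded` are illustrative assumptions, not a Dhall feature.

```python
class BudgetExceeded(Exception):
    """Raised when evaluation uses more steps than allowed."""


def ackermann(m: int, n: int, budget: int = 100_000) -> int:
    """Ackermann function using an explicit stack and a step budget."""
    stack = [m]  # pending values of m; n carries the running result
    steps = 0
    while stack:
        steps += 1
        if steps > budget:
            raise BudgetExceeded(f"gave up after {budget} steps")
        m = stack.pop()
        if m == 0:
            n += 1                 # A(0, n) = n + 1
        elif n == 0:
            stack.append(m - 1)    # A(m, 0) = A(m - 1, 1)
            n = 1
        else:
            stack.append(m - 1)    # A(m, n) = A(m - 1, A(m, n - 1))
            stack.append(m)
            n -= 1
    return n
```

Small inputs such as `ackermann(2, 3)` finish quickly, while `ackermann(4, 1)` hits the budget and raises instead of tying up the machine.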


Dhall keeps popping up on HN. Here's what I don't like about it:

- Why use '=' instead of ':' for attributes? If you used ':', then '=' could be variable assignment and eliminate the need for 'let'.

- Why is there a need for commas?

- Why quote via ticks?! Gee!

- What's with the '{-' and '-}' for comments?! It's like its author decided to differ at any price!

In general, good ideas, but it's too weird and unnecessarily deviates from common syntax.


Dhall heavily borrows both ideas and syntax from the ML family of languages. E.g. Haskell, OCaml, Elm, Purescript

Colons are used for type signatures.

Commas are presumably required because you can have multi line and nested records. (don't quote me on this, not a parser expert)

The comment syntax is from Haskell.

Not saying this syntax is familiar to everyone, but it is familiar to some. The lineage of the syntax might help you understand where the language is coming from


Weird that it's billed as an alternative to YAML, but effectively has zero roots or influence from YAML. Looks more like an alternative to... whatever configuration language is popular in ML language projects?


There aren't too many standard ones. Most people seem to either write a custom (embedded) DSL for their tool or go with something more widely known.

If people are curious about examples of DSLs in Haskell projects, they can look at cabal files, persistent's entity syntax, and servant's type level DSL for API definitions. These go on a scale from "fully separate" to "embedded in the language". (Persistent is in between, it uses something called Template Haskell)

The stack Haskell build/project management tool uses YAML files.

- - -

I think Dhall's power is not that it is an alternative syntax to YAML. It's more about the typesystem than anything else. If you're sold on types, then Dhall is definitely worth a look


So? They live in the same solution domain for the same problem. Moreover, Dhall can generate (typechecked) YAML and JSON.


It's not even in the same solution space. I can't replace my YAML with Dhall and consume it directly. I have to now depend on the converter to go from Dhall -> YAML/JSON. All I did was add another layer of complexity into my config.

Maybe you'll benefit from the added abstraction; I see the value in having "typed" configs that are semi-scriptable but not turing complete. But it's in no way a "replacement".


> All I did was add another layer of complexity

If that was in fact all you had done, it would not be worth it. But it might remove a ton of boilerplate from your YAML.


or, one can eliminate boilerplate in code so that configuration stays simple and the logic in the application

I mean, what's the role of configuration? It's so that an installer can tune or match environments. If configuration becomes as hard to understand as your program, then installation won't ever scale.


>> I can't replace my YAML with Dhall and consume it directly.

Sorry I don't get it. Why not? Because your favorite languages do not support it?


How many languages have a Dhall implementation? One?


As far as I know Clojure, Haskell, Ruby, Rust in various stages of completion..


Both Clojure and Eta bindings should allow really anything on the JVM to consume directly.


Six or seven, or thereabouts.


I disagree that it has zero roots. It's not whitespace sensitive, so it does throw out a ton of YAML's distinctive syntax.

What you're left with is YAML's core expressiveness, which is that within a text file you can express the basic structures of maps, sequences, some atoms and user-typed [1] nodes.

YAML also has some limited facilities for avoiding repetition, such as aliases and anchors[2]. And a YAML parser can use tags to simulate functions[3].

Dhall is preserving all those capabilities, and then adds real functions and type declarations to further expand the ability to avoid repetition and to avoid mistakes. It's also using a proper type system rather than basically returning a parse tree back.

And whereas these advanced features in YAML are rarely useful because the parser has to go out of its way to support them, it seems like Dhall may be able to make them work properly. So I'd say seeing another project attempt something and then trying to do it right is definitely being influenced by it.

[1]: https://yaml.org/spec/1.2/spec.html#id2761292
[2]: https://yaml.org/spec/1.2/spec.html#id2786196
[3]: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGui...


It's more like an alternative to scripts generating configuration from templates. They claim that this is better because you can't, for example, write an infinite loop ruining the script.


Is it a real problem though? If your config generator is complex enough to hide an infinite loop, you probably should be relieved every time it fails obviously and doesn't generate wrong config instead.


Yeah, but how is this relevant? YAML, for example, makes JSON a subset, which lets it consume existing JSON without any need for conversion. It's easy to learn, and readable. Leading commas are eyesores. Plus, not everybody is a fan of ML! These are not mass market languages.


Not everybody’s a fan of C++, yet still...

YAML has its detractors and many of the criticisms against it are valid. It’s ambiguous and confusing, and deeply nested significant white space, with odd indentation rules and optional syntax, is incredibly hard to follow as a human.


Honestly, having wrestled with YAML, I find it's easy to write YAML and a bear to process it, in that it seems to be a "read-only" format for machines.

Given some YAML file, making any kind of automated change generally nukes the formatting entirely, let alone the comments. I'm sure there are some libraries that do it right, but there seem to be far more that are just awful.

One nice thing about Dhall is that there's a reference implementation written in Haskell where they have very good pretty printing libraries so it's likely that programs would be able to do automated updates to configuration.


Yep, we use the Haskell implementation as a library in Spago [0] and automatically migrating the configuration from old versions is a breeze as we can just manipulate the AST by pattern matching on it, e.g. see [1]

The above is quite standard in JSON/YAML too though, but an awesome thing you can do in Dhall is that - since you have functions - you can write migrations for Dhall data in Dhall itself, e.g. see [2]

[0]: https://github.com/spacchetti/spago
[1]: https://github.com/spacchetti/spago/blob/a77b869edcfddd592f4...
[2]: http://www.haskellforall.com/2017/11/semantic-integrity-chec...


Seems like you're criticizing it just because it's different from what you're used to.


> Seems like you're criticizing it just because it's different from what you're used to.

This is a valid criticism. There are already widely-used data formats; if you do something different it should be justified.


I don't really see why the author needs to justify it. New languages with new syntax come out all the time, some become popular some do not. Sometimes a new language comes out with a different syntax and becomes popular, creating a new norm.

But in any case Dhall does actually follow a norm rather closely, just a different norm than the grandparent is used to. Namely, ML and other functional languages/research. Which makes sense since Dhall is a product of that community.


Yeah, it deviates from the norm and the expectations of most people, most of whom are not even developers - lots of BAs today configure apps via such files.


Sorry, "BAs"?


Business Analysts


None of the examples show quoting via ticks, that seems to be a pretty obscure feature.


Actually, at least one does and for the wrong reason (an attribute named True, for example, needs to be quoted):

  { -- Unlike YAML, Dhall does not accept YES|NO|ON|OFF
    validDhallBools = [ True, False ]
      , someNumbers = [ 1
    ,
  -- Dhall is not indentation-sensitive
  2, 3 ]
    -- Field names that conflict with reserved identifiers must be quoted
  , `True` = True
  , version = "9.3"  {- Strings must be quoted

                        All Dhall literals have unambiguous types -}
  }


Thanks, I'm surprised I missed that!


Couple of contenders:

- Jsonnet (https://jsonnet.org/) - simpler syntax and fewer concepts to learn, just an extension of JSON. But no type checking. An open source offspring of Google's internal config language (GCL/BCL)

- Cue (https://github.com/cuelang/cue) - a more ambitious attempt to fix GCL/BCL by replacing inheritance as the fundamental compositional primitive with constraint unification.

Great thread comparing them against each other by the authors of both: https://github.com/cuelang/cue/issues/33

Cue seems kind of similar to Dhall on first sight, but I haven't used either enough for an informed opinion yet.


We make heavy use of Jsonnet at work (https://databricks.com/blog/2017/06/26/declarative-infrastru...). It's worked great. Having a hermetic, pure templating system whose only output is a set of JSON/YAML files means that you can refactor fearlessly: as long as the materialized JSON/YAML doesn't change, you are 100% sure your refactor is safe. Bazel's StarLark dialect of Python (https://github.com/bazelbuild/starlark) has similar benefits.

The language is simple and remarkably well specified, enough that we implemented our own intellij plugin (https://plugins.jetbrains.com/plugin/10852-jsonnet) and even our own faster compiler (https://github.com/databricks/sjsonnet) without much effort at all.

There are odd corners in the language, but not something that most people will end up bumping into in typical usage. The templates certainly get messy in large configurations, but no more messy than any other code, and the hermeticity/purity greatly helps in managing the messiness. It's certainly less messy/odd than the copy-paste configs or be-spoke JSON/YAML templating systems that inevitably appear in messy deployment environments!

The last thing of note is the lack of static types: this definitely affects usability to some extent, and especially hinders IDE support from being as useful as it is in e.g. Java. But having a useful/ergonomic type system that fits this specific problem space is probably still an unsolved research question.
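
The "refactor fearlessly" check described in this comment amounts to comparing materialized output before and after a change. A minimal Python sketch follows (not Jsonnet; the two templates and all names are made up for illustration):

```python
import json


def render_v1(env: str) -> dict:
    """Original template: everything spelled out."""
    return {
        "name": f"api-{env}",
        "labels": {"app": "api", "env": env},
        "selector": {"app": "api", "env": env},
    }


def render_v2(env: str) -> dict:
    """Refactored template: the repeated labels are factored out."""
    labels = {"app": "api", "env": env}
    return {"name": f"api-{env}", "labels": labels, "selector": dict(labels)}


def materialize(config: dict) -> str:
    """Canonical JSON text, so key order cannot cause spurious diffs."""
    return json.dumps(config, sort_keys=True, indent=2)


def refactor_is_safe(environments) -> bool:
    """True when both templates materialize identically for every environment."""
    return all(
        materialize(render_v1(env)) == materialize(render_v2(env))
        for env in environments
    )
```

This only works because the templates are pure: with no hidden inputs, identical output for every environment means the refactor changed nothing observable.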


We tried to introduce Jsonnet at our org. It failed miserably because ops kept mistaking the name for JSON which they hated. (International multilingual team).

It was a real shame because ops then implemented some features of Jsonnet via scripts to parse and merge YAML. What was 0 LOC in Jsonnet is now about 300 LOC plus custom CI checkers, all because of a marketing problem.


If I go to the mentioned Jsonnet homepage it says "A simple extension of JSON". The graphic explains its relation to JSON. The example looks awfully similar to JSON.

What I don't understand is the following: config files are read by text editors, and in the end, by human beings. Because of the latter they should have certain traits. We must agree on the importance of these traits before we can settle on a standard.

For me, important features are that they must be readable, and easily editable. They must be readable with a certain text editor (vi) for backwards compatibility. So that means it shouldn't require syntax highlighting or schema. Well, these 2 simple requirements of mine rule out anything remotely resembling JSON.

It just appears to me that JSON is for JavaScript developers, YAML for Python developers, and Dhall for ML (the whole family I suppose, not just Haskell) developers.

Well then if we're going that route then perhaps all we need is some kind of glue between text config and binary config (which reminds me of Systemd...). Ie. that it accepts multiple config file formats.


I'm a little confused by one part of what you wrote. You say that your config files must be readable by vi, and that in turn adds a no-highlighting-required constraint and a no-schema-required constraint, and that in turn adds a no-JSON-nor-anything-like-it constraint.

I deal with JSON all the time in vim, effortlessly. I'd be willing to deal with it in Notepad if necessary, and certainly in non-vim vi. Pipe it through a prettifier (lately I use `jq . file.json`) if necessary.

I don't need syntax highlighting, and I don't need a formalized schema (although I certainly appreciate an informal one, interpreted by the 1.0 Human Meatbrain I carry around). Also, if by "vi" you meant "vim", this is EVEN MORE confusing, because vim syntax-highlights JSON.


With vi I mean vi, not vim. If I meant vim, I'd have written vim. I use vim if it's available (with my own configuration which includes syntax highlighting), but it isn't always available.

With my Human Meatbrain syntax highlighter I have far more issues with JSON than with say YAML or any other markup language.

Consider, for example, how easy the syntax is of a Wireguard configuration file. It is basically akin to a shell script or ini configuration file. And these have a proven track record. Why is that way of configuration broken in the first place? You could do things such as variables in shell scripts as well.

I also believe that the whole Systemd drama is basically because of moving away from such a proven track record. And it might very well be true that shell scripts are slow. That is why I argue for backwards compatibility and converting to/from formats. Which is something Dhall is able to do (it can convert to/from YAML and JSON).


i'll throw in one of my projects as a contender: ytt - YAML templating tool - https://get-ytt.io (check out live playground!).

it works with yaml structures (hence avoids text templating problems) and uses familiar python-like language, starlark, making it quite easy to get started. it makes use of yaml comments to assign metadata/templating directives to yaml nodes, so it looks something like this:

  #@ load("@ytt:data", "data")

  #@ def labels():
  app: echo
  org: test
  #@ end

  kind: Pod
  apiVersion: v1
  metadata:
    name: echo-app
    labels: #@ labels()
  spec:
    containers:
    #@ for/end echo in data.values.echos:
    - name: #@ echo.name
      image: hashicorp/http-echo
      args:
      - #@ "-text=" + echo.text

it doesn't include type checking, however, it does have a system to "overlay" structures on top of each other via overlay feature -- https://github.com/k14s/ytt/blob/master/docs/lang-ref-ytt-ov.... merge/replace/remove operations expect to find one node by default so map key typos or wrong structural nesting problems are caught easily in common cases.


My favorite is still the nginx config-like libucl: https://github.com/vstakhov/libucl


Cue is a bit too cute trying to combine the subtyping and inhabitance relations into one.


What problems do you see?


Immediate response?

I hate commas at the start of lines and I would prefer not to have curly braces in a human editable/readable format.

Neither reason is terribly rational but my first impressions weren't great.


It looks like Dhall was inspired by the Elm language [0] and its formatter.

[0]: https://guide.elm-lang.org/


It's a common practice in the Haskell community. Knowing where the creator of dhall comes from I would say that that's the source of inspiration


Author here: the syntax is inspired by all three of Haskell/PureScript/Elm


I get that this makes diffs a tiny bit nicer when adding new lines, if you don't use trailing commas, but christ it's ugly!


> commas at the start of lines

are not a requirement.


OTOH I would guess it doesn't allow a trailing comma (same problem as JSON...) so you end up with weird ugly formatting conventions


Final trailing comma is not in the standard yet, but will be soon


No, you can use trailing commas if you want.


So far, only comments complaining about syntax. You can do better, HN!


Yeah, because that's what most configuration languages differ in.


And yet, in focusing on the syntax, you missed the biggest difference between Dhall and other configuration languages: safe, termination-guaranteed non-Turing-complete computation.


How is it better than Jsonnet or Starlark?


static type system


True. But isn't this also accomplished when pairing JSON with JSON Schema?


The schema applies to the generated values, but what’s the equivalent of a schema for a programming language? Types.

Jsonnet does not have a way to validate its functions without running them first.


I clearly get that, but many IDEs use the schema to help you produce valid JSON.


Why are you implying that kccqzy thinks Dhall is better than Jsonnet or Starlark?


A new project should be better than the old ones. Otherwise, what's the point?


And yet, HN is commenting about a language people use to describe entire networks of computers on a few non-redundant centralized files... And all the comments are about syntax.



I'm afraid years of n-gate have shown that HN cannot do better.


n-gategate strikes again


For a pragmatic, really readable configuration file format, TOML never disappointed me ( https://github.com/toml-lang/toml#user-content-local-date ).

- This is human readable contrary to the JSON family and its {} abuses.

- It is not space/indent based, contrary to YAML, which very quickly becomes a mess to write and a mess to parse.


As a fan of TOML, I want to be clear on the downsides.

TOML is good for data laid out with TOML. Representing arbitrary nested arrays and tables gets messy.

Also, the constraint on homogeneous shallow types has impacted me in some cases. Originally, I was all on board. Arrays should be homogeneous. The problem is logically homogeneous vs syntactically homogeneous.

Cargo uses tables to declare dependencies. The values are logically homogeneous: they are declarations. Syntactically, some values are strings while the rest are sub-tables. The string is just shorthand for a table though.

This feature can't be implemented in arrays like it can with tables.
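
To make the shorthand concrete, here is a minimal Python sketch. It is not Cargo's implementation: `normalize_dependency` is a hypothetical helper and the crate versions are only illustrative. The one fact borrowed from Cargo is that a bare version string means the same as a table with a `version` key.

```python
from typing import Union

DependencySpec = Union[str, dict]


def normalize_dependency(spec: DependencySpec) -> dict:
    """Expand the string shorthand into the full table form (hypothetical helper)."""
    if isinstance(spec, str):
        # "1.0" is shorthand for {"version": "1.0"}
        return {"version": spec}
    if isinstance(spec, dict):
        return dict(spec)
    raise TypeError(f"unsupported dependency spec: {spec!r}")


# A table can mix both forms because every key carries its own value.
dependencies = {
    "serde": "1.0",
    "rand": {"version": "0.8", "features": ["small_rng"]},
}

normalized = {name: normalize_dependency(spec) for name, spec in dependencies.items()}
```

After normalization every value is a table, which is the "logically homogeneous" view; the same mix of strings and tables inside a single array is what the homogeneity rule rejects.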


TOML is almost perfect. The only things I don't like are the need for commas and the double brackets.


Perfection is a bugaboo. Give me 95%, minor inconveniences, and declare vict'ry, say I.


Still much prefer HJSON (http://hjson.org/) for stuff that people might need to touch.

If it's truly for end-users (read: non-admin/dev types), you probably shouldn't have them touching configuration files _at all_.


Don't fully remember why I prefer json5 to hjson but at a quick glance, bare values is one. Bare values are ripe for someone entering in a string and accidentally getting a bool or number instead.
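
A rough Python sketch of the failure mode I mean. The coercion rules here are hypothetical and deliberately simple; they are not the actual rules of hjson or json5.

```python
def parse_bare_value(text: str):
    """Guess a type for an unquoted value (hypothetical rules, for illustration only)."""
    stripped = text.strip()
    lowered = stripped.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    if lowered == "null":
        return None
    try:
        return int(stripped)
    except ValueError:
        pass
    try:
        return float(stripped)
    except ValueError:
        # Nothing else matched, so keep it as a string.
        return stripped


# Values a person most likely meant as strings:
version = parse_bare_value("1.10")   # becomes the float 1.1, losing the trailing zero
answer = parse_bare_value("true")    # becomes the bool True
name = parse_bare_value("staging")   # stays a string
```

Requiring quotes for strings removes the guessing step entirely.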


How is this at all related to Dhall? It looks like a completely different thing with a completely different purpose.


They are both text based configuration file formats made to be easier for humans to interact with, so I'm not sure what you're confused about?


That’s about the least interesting thing about Dhall. (It’s weird that they tout it prominently on the homepage.) There are so many flavors of syntax sugar for JSON, Dhall is a completely different beast.


You might want to ask the HN admins to rename the title for this post, then.


It looks fine to me. Reducing repetition is precisely the type of thing Dhall’s features should be useful for.


Also Relaxed JSON. http://www.relaxedjson.org/


My comment isn't specifically about Dhall, but about the note on Turing completeness. I often read comments about how YAML/JSON is not Turing complete. These comments normally frame the lack of Turing completeness as being a shortcoming of the format(s). I find this interesting because one of the reasons that the industry moved away from XML was to have cleaner separation between data and logic. I generally tend to think that it is cleaner to separate logic and data, instead of creating a tight coupling. I don't read many people making comments from this perspective though. I am not trying to say we can't do better than YAML/JSON, I am just trying to offer some food for thought. I tend to view JSON/YAML as a data exchange format, and not a programming language, so I am not bothered by the lack of Turing completeness.


Not being Turing-complete is a feature.

A Turing-complete language allows you to write programs that never terminate. This is not what a config file should be capable of.


Previously in my career I've abused Jinja templating to build "scripts" out of Ansible and SaltStack YAML. It solved business problems effectively but I'm sure when I left that role I passed on a big plate of spaghetti to my successor with minimal automated tests.

If it depends on conditional logic or iteration, it probably belongs in a proper programming language with a linter, type checkers, debugger and unit test framework.


This is one of the reasons why, coming from a background of writing Chef and writing exhaustive testing around my configuration management, I've never been able to deal with Ansible without grinding my teeth. Or, for that matter, Terraform; HCL is godawful and trying to write it in JSON is a sucker's bet, too. With the move towards more containerized systems I don't write Chef much anymore (thankfully), but Terraform has become ever more of a boil on my rear end. Fortunately there's also Pulumi now and writing this stuff in TypeScript is a lot faster and feels really good.


The thing is, although Ansible is not perfect, it is python. When you are writing playbooks you should be defining the state of the world, if you are writing custom logic it is probably best to write a custom module, which are commonly unit tested. I haven't tried Pulumi yet, thanks for pointing it out.


I'm not quite ready yet (esp. documentation is lacking), but I'm working on something to get the best of both worlds: https://freckles.io

Basically, you write your state-defining code in Ansible, using either modules, tasklists, roles, or a combination thereof. 'freckles' lets you wrap those up in re-usable, distinct, atomic units which you can combine for more complex tasks. Then you can use those directly via the command-line, or you can auto-generate (wrapper) Python (will do other languages later) classes from them (e.g. https://freckles.io/doc/interfaces/python#code ), for when there is more 'logic' to be implemented.

I reckon it's a bit like Pulumi, but it is less opinionated. Actually, it's probably more like it lets you create your own little domain-specific Pulumi, if that makes any sense. It also works really well as a wrapper for Packer and Terraform.

As I said, documentation is not quite there yet, but most of the important stuff works. Starting to look for people to try it out, if anybody on here is interested.


First off: congratulations on shipping something! Shipping is hard.

But this misses the mark. With respect, and I'm trying not to dunk on your project, to me Ansible is the worst of every world and is emblematic of configuration management retreating from the realm of "infrastructure as code" to "infrastructure as magic notepad files because incurious sysadmins won't write code." I don't really care that Ansible is being wrapped so long as, inevitably, I'm going to have to go deal with that when it breaks. Eventually Pulumi/Terraform will break too, and there is no programming language I have more eyerolls for than Golang, but at least it's a programming language, you know? (And I really don't relish the thought of ping-ponging from the Python wrapper to YAML hell to a Python module, tbh.)

The biggest thing that a project like Freckles fails to capture, and where Pulumi shines, is that it's all code and you just treat it as code. They're compiler wonks, or at least the guy I know there is one, and they leverage that--when dealing with stuff like Lambdas/GCP Cloud Functions, you just write them inline and they're hoisted into deployable packages without comment or incident. There's no zipping of files, there's no messing around--you wrote a function to do a thing and it gets run. Done-and-done.

It's got a lot of other nice features, but that it is transparently code, and that they're investing most of their effort (it seems) into hiding the unfortunate fact that eventually some Terraform providers get run, makes Pulumi a really hard one to top.


Fair enough, I can see some of your points, but (obviously) I don't think they are entirely relevant, and my experience and view of Ansible is a different one.

Also, Ansible is only one backend for freckles, it's just the first one I implemented because of all the roles and modules already available for it. There will be a Terraform backend, and a Packer one, and a Kubernetes one too, probably. And I'm actually much more excited about the shell backend I have in mind, but that is much more work to create a repository of tasks for. You'll be able to just select one of those backends, or combine all of them.

The goal is for 'end-users' never to have to deal with underlying backends and Ansible code and the like, if they don't want to. Those are just there to provide the building blocks end users would use to compose their provisioning/orchestration -- which they can do entirely in code. I hope to have a stable 'stdlib' of such tasks at some point, comparable to what is available in Pulumi, which people can just use without any other concern.

I don't know how Pulumi works under the hood, are you saying they basically wrap Terraform modules? If that's the case, that's not that different to how freckles works. With the exception that freckles can also be used for non-cloud things.


Ansible is emphatically not Python. Ansible is YAML with Python extensions.

The model you're describing better fits Chef Zero. (Which is excellent, and is the tool I would go to were I still in a shop that needed instance-level CM.)


Ansible is written in Python, and all the core modules are written in Python. Using YAML you are passing arguments to the Python modules. This is also why Jinja is used.

https://docs.ansible.com/ansible/latest/dev_guide/index.html


Did it sound like I was unaware of this?

The user interface is YAML. It is an awful user interface and it adds nothing except complexity to the effort of making systems work. Expect people to write well-factored code and to treat the entirety of their CM as code and they will do so.


Of all the bugs you've ever dealt with in programs in general, what fraction of them were infinite loops?

For me I'd guess maybe... 0.1%? It's definitely under 1%.

Given that, it makes no sense to me that I'd want to make myself jump through hoops to express some basic coding patterns [1], just to rule out that single class of bugs. It seems like a solution in search of a problem.

[1] https://github.com/dhall-lang/dhall-lang/wiki/How-to-transla...


I think the Non-Turing completeness is misunderstood as "no infinite loops!". Of course, that's the obvious effect of not having it, but it has more important effects:

1. Any expression can be normalised at compile time. This means that you can remove all abstractions in your code and audit the result before running it. This is great for debugging and understanding a codebase you may not be familiar with.

2. You can be sure that the language cannot do a jailbreak and potentially read files or make network calls it was not supposed to. Not because the Dhall developers are security geniuses but because it is literally impossible without Turing completeness.

An example of a non-Turing-complete language that you would hate to see reading a file on your system and sending it to attackers is regular expressions.

3. It helps the language stay focused. This limitation is the biggest motivator to keep the language being focused on a single domain, and not try to be everything. This also helps execution performance of the language.
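
To illustrate point 1, here is a toy normaliser in Python. It is only a sketch under strong assumptions: the four node types (`Num`, `Var`, `Add`, `Let`) are invented for this example and have nothing to do with Dhall's actual syntax tree. The point is that every recursive call works on a strictly smaller sub-expression, so normalisation always finishes and leaves something you can read and audit.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Let:
    name: str
    bound: "Expr"
    body: "Expr"


Expr = Union[Num, Var, Add, Let]


def normalize(expr: Expr, env: Optional[dict] = None) -> Expr:
    """Reduce an expression as far as possible; unknown variables are left in place."""
    env = env or {}
    if isinstance(expr, Num):
        return expr
    if isinstance(expr, Var):
        return env.get(expr.name, expr)
    if isinstance(expr, Add):
        left = normalize(expr.left, env)
        right = normalize(expr.right, env)
        if isinstance(left, Num) and isinstance(right, Num):
            return Num(left.value + right.value)
        return Add(left, right)
    if isinstance(expr, Let):
        bound = normalize(expr.bound, env)
        return normalize(expr.body, {**env, expr.name: bound})
    raise TypeError(f"unknown expression: {expr!r}")


# let base = 8000 in base + 80  ==>  8080
program = Let("base", Num(8000), Add(Var("base"), Num(80)))
result = normalize(program)

# A free variable survives normalisation instead of raising an error.
partial = normalize(Add(Var("offset"), Add(Num(1), Num(2))))
```

There is no loop or general recursion a config author could write in this toy language, which is why the evaluator itself needs no timeout.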


I feel like you've made an "inverse error" here repeatedly.

If a language L1 is Turing complete then you have some potential liabilities: there's no general normalization plan, you cannot in general ensure security guarantees, etc. But another language, L2, might be not-Turing-complete and still have all the same problems. It may especially be the case that L2 has no known normalization algorithm, or no known general security audit.

For example, many interesting properties are undecidable for the general case of context-free languages (a class that I'm sure you know is much more restricted than the class for Turing-complete languages). Universality, for instance, is undecidable in this class, as is language equivalence. As for being sure that the language cannot do a jailbreak, you can't be sure until you write a proof. It's nice that Gödel isn't telling you that no such proof can possibly be written, but that's still a long way from having a proof.

I agree with you that this sort of thinking seems to be why people say that they do not want certain languages to be Turing-complete, but I'm not at all convinced that those people have correctly named the property that they want.


> one [of] the reasons that the industry moved away from XML was to have cleaner separation between data and logic

How is XML coupling data and logic? The only kind of "processing" it does by itself I can think of is composing documents from pieces and "processing instructions" as a generic extension mechanism. That is, features to support its original use case of authoring and capturing structured text. Now SGML has more processing features (tag inference, stylesheets/link processes, notations), but is still far away from Turing-completeness.


I am not sure if this completely answers your question, but I am talking about XML in a SOAP based architecture.


I 100% agree. It should be considered an important feature that my configuration files (and transfer data) don't suffer from the halting problem.


> I find this interesting because one of the reasons that the industry moved away from XML was to have cleaner separation between data and logic.

That's not really true or even sensible, since XML doesn't combine data and logic. Sure, there were XML-based logic languages (most notably XSLT) as well as XML-based data languages, but while all were applications of XML they were separate languages.

XML lost ground to JSON, etc., as the fashion pendulum swung away from heavyweight tooling and detailed specs for most things (though it's swinging back again), and to some closer-to-memory-layout binary formats as efficiency became a concern in some of the places where rigid specs remained important.


Hello, just to clarify I am specifically talking about how XML was used in the 90's for web development, i.e. SOAP.


Dhall is fantastic and I try to encourage everyone in tech I meet to try it.


OK, why?


Because it is a good mix of features, syntax, execution speed and correctness.

Of course. Didn't you read the article?


I did, and it's bad form to suggest otherwise.

I was asking about your personal opinion, since you said it was great but didn't give any info as to why.


> I did, and it's bad form to suggest otherwise.

If you say so. But... I said I think the features are good and you ask "why?"

I could relist the features, but that's a waste of time. As I said, I think the combination is a good combination. In a world where people try to encode loops in YAML or JSON, even small improvements are better and Dhall is a large improvement.

> I was asking about your personal opinion, since you said it was great but didn't give any info as to why.

I don't really get what you're asking. I said I think the combination of features is good. I don't need to describe the features. The reason I like them is probably related to neurochemistry?

Are you asking for a deep dive on how Dhall compares to alternatives?


Then I apologise; because you posted I assumed you were happy to or wanted to engage in discourse, maybe highlight your favourite features, possibly debate the value it brings, that kind of thing, but I see my mistake now.


Hmm...

So the authors claim that their language is guaranteed to terminate for all well-typed programs. That is actually a nice spot for configuration languages. Yet, I wonder how

a) they guarantee it, as I have seen no obvious link to the language's semantics

b) useful this is in practice.

Nevertheless, very nice approach, indeed.


There is no support for recursion and the usual workarounds don't apply, so the language is not Turing complete: https://github.com/dhall-lang/dhall-lang/wiki/Safety-guarant...


Build systems tend to get very complex Turing complete scripts, such as gradle for Java or Make for C. Having something almost as powerful but reducible to a normal form is very helpful for CASE tools.


Banning Turing completeness doesn't give you the property you want, though. Knowing that reducing to a normal form eventually terminates if you wait a million years may be something mathematicians care about, but isn't of practical use.

What matters is that you can analyze the code quickly. To find that out, one way is to try it and kill the process if it takes too long.

Or perhaps better would be to come up with a portable definition of what "takes too long" means that you can put in a presubmit check. Something like "running out of gas" in Ethereum.
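
A minimal Python sketch of what such a budget could look like. Everything here is hypothetical (`Meter`, `OutOfGas`, `repeat_value` and `check_config` are names made up for this comment), and the cost model is the crudest one possible: one unit per list element produced.

```python
class OutOfGas(Exception):
    """Raised when an evaluation would exceed its budget."""


class Meter:
    def __init__(self, budget: int):
        self.remaining = budget

    def charge(self, cost: int = 1) -> None:
        if cost > self.remaining:
            raise OutOfGas(f"needed {cost} units, only {self.remaining} left")
        self.remaining -= cost


def repeat_value(count: int, value, meter: Meter) -> list:
    # Pay for the whole result up front, before allocating anything.
    meter.charge(count)
    return [value] * count


def check_config(build, budget: int) -> bool:
    """Presubmit-style check: does building the config fit in the budget?"""
    try:
        build(Meter(budget))
        return True
    except OutOfGas:
        return False


small_ok = check_config(lambda meter: repeat_value(3, 1, meter), budget=1000)
huge_ok = check_config(lambda meter: repeat_value(999_999_999_999, 1, meter), budget=1000)
```

Because the budget counts abstract units rather than wall-clock time, the same check gives the same answer on a laptop and on a CI machine.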



I'm sure people will be happy to crucify me for throwing this out there but I don't see a big risk in just using JavaScript in most cases if you want something like that. You could use template literals to replicate the example.


How far does "non turing completeness" really get you in this context? It looks easy to write a program in this language that will take longer than the age of the universe to evaluate and whose result can't be represented explicitly without collapsing the galaxy into a black hole. How much comfort can you take in the fact that you know it doesn't diverge?


How would you write such a program?


    let replicate = http://prelude.dhall-lang.org/List/replicate
    in replicate 999999999999 Natural 1
(add additional nines if necessary)


I love this!

How small is a static binary to run this in my containers?

What are some ways to integrate the typed config in a language?


The static binaries for the various interpreters and conversion utilities (i.e. `dhall`/`dhall-to-yaml`/`yaml-to-dhall`) are all roughly 10 MB each

The following languages natively bind to Dhall:

* Haskell
* Clojure
* Ruby

... and the following language bindings are in progress:

* Rust
* Go
* Python
* PureScript

In the absence of a native language binding, you can convert Dhall to YAML or JSON and read that in.


   let input =
      { relative = "daughter"
      , movies   = [ "Boss Baby", "Frozen", "Moana" ]
      }
We don't frequent the same kind of "non-technical users" I guess.


I thought the typo in the challenge was that the keys were in the root of the user's home directory, instead of the `.ssh` directory. So, I added `.ssh/` between the key and user home directory.


This looks very useful.


looking further, it seems that aside from repetitiveness, safety is the main focus:

https://github.com/dhall-lang/dhall-lang/wiki/Safety-guarant...

which, in Rust, we're solving via SANE and SCL:

https://gitlab.com/bloom42/sane-rs

https://github.com/keats/scl

I'm not sure how much need there is for an additional programming layer, especially within config (the part of a program with the simplest syntactic requirements).

for my projects where "ahead-of-time validation" is needed, we're currently using SCL's parser for safety guarantees:

https://github.com/foundpatterns/contentdb

https://github.com/foundpatterns/lighttouch/blob/d7ada4576a6...

https://github.com/foundpatterns/torchbear/blob/4dd2b9ea76ba...


From a cursory look at SANE and SCL, it looks like Dhall still offers some more:

- functions

- a powerful typesystem

- remote (HTTP) imports with sha256 checksums
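
For the last point, the general idea of a pinned import can be sketched in Python as below. This is an assumption-laden simplification: `verify_import` and `IntegrityError` are hypothetical names, and the sketch hashes the raw bytes, whereas Dhall's own check is, as far as I understand it, computed over the normalised expression.

```python
import hashlib


class IntegrityError(Exception):
    """Raised when fetched content does not match its pinned hash."""


def verify_import(content: bytes, expected_sha256: str) -> bytes:
    """Return the content only if its SHA-256 matches the pinned value."""
    actual = hashlib.sha256(content).hexdigest()
    if actual != expected_sha256.lower():
        raise IntegrityError(f"expected sha256:{expected_sha256}, got sha256:{actual}")
    return content


# Stand-in for bytes fetched over HTTP; no network access in this sketch.
fetched = b"{ replicas = 3 }"
# In practice the pin is written next to the import, not computed on the fly.
pinned = hashlib.sha256(fetched).hexdigest()

trusted = verify_import(fetched, pinned)
```

If the remote file changes in any way, the hash no longer matches and the import is rejected instead of silently changing your configuration.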


Programmable configuration is always and without exception a monumentally stupid idea.

Programmatic generation of static configuration files can be very useful.

Sufficiently complex examples of the latter might as well be the former as far as maintenance is concerned.

If you need to write a program to configure your program, you're probably doing it wrong.


Configs allow you to add flexibility past compile time, often dynamically at runtime.


Yes, that's the problem. I'd like to be able to look at a config file, on disk, loaded at startup, which defines the initial state of the server without having to think through how it was evaluated.

Generating the config during deployment, eh... often necessary. Best done with transforms and templates because they're simple.

Executable config, run during startup or, worse, on each request? NO.

[edit] I think that's the main disconnect here: 'past compile time'. The whole point of testing, strong type systems, etc is to lock down the set of states the system can be in. If your configuration is so 'dynamic' you are essentially abandoning all those benefits and saying 'yeah, do what you like to our live servers'.

In short, configuration which is that powerful is indistinguishable from running untested code in production.


Dhall gives you exactly that, since you can store the normalised version of any configuration. Or inspect it at will by running it.


Yes, I'm not criticising Dhall so much as the behaviours it permits. IMO it should be difficult to 'program' the generation of a config file, because the application should be designed such that that degree of flexibility is not required at the level of configuration.

We've done things with generated config before. Looked necessary. We took a few steps back and realised that it was never necessary, only permitted, so it got fudged in deployment instead of being fixed in application design.


If you operated services that should run at scale, or 24/7, or both, you must have noticed that the ability to fix an urgent issue (which may be totally not a fault of the deployed code) by changing just configs is valuable.


Of the failure modes I've had to deal with, even when we did use a lot of interpreted scripts which could be live-patched, very few of the important issues were fixable that way. Because the stuff which broke was rare and deep.

One could argue that everything would be config-related if all the code were interpreted and considered to be configuration. So absolutely everything could be fixed in Production.

I'd argue that absolutely everything could be broken in Production too, in that case. The point is that you test the invariant stuff, and have a small surface area of variables which define the permutations of Live behaviours.

Configuration is not code. Code gets tested and doesn't vary between Canary, Nightly, Stable Test, Customer Test and Customer Live for a given build.

Because some of us build stuff for customers who won't justify the 'risk' of updating every day, and have upgrade cycles measured in months.


To me, worrying about config files seems like the ultimate exercise in bikeshedding.

You either need a simple list of items (eg. dependencies) or key/value pairs. Use a text file or yml or json or whatever.

Or you need templating, the use of functions, etc, like dhall provides. But then, why not use the language you're already using for the rest of your project, or a bash script to export some variables?

Might sound like I'm throwing sourness around, but I just don't see the niche for this, except inventing a new thing for the joy of it?


>But then, why not use the language you're already using for the rest of your project, or a bash script to export some variables?

Well, for the same reason that, say, people would use a JavaScript framework to build a webapp over vanilla JS. Both could do the job, and for simple cases there's little reason to go with a framework or, respectively, a specialized config language.

But as your app/config gets larger and more complex, using a framework (or config language) would tend to get the job done more efficiently by providing you with structure and a toolbox of solutions to common pain points.

Config generators themselves tend to be a rather heavyweight all-or-nothing solution which leads people to compromise on some adhoc middle-ground solutions like YAML with jinja templates with unclear evaluation semantics. A good config language designed from the ground up can be so much better than this unholy yaml/jinja mess!

Finally, one of the key selling points of specifically Dhall is type checking. Implementing that in config generators in a generic untyped scripting language would be a nontrivial amount of boilerplate, and boilerplate elimination is what config languages are all about.
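
To give a feel for that boilerplate, here is roughly what a hand-rolled check looks like in Python. It is a sketch only: `type_errors` and `SCHEMA` are hypothetical, it handles flat records with exact types, and it says nothing about nested records, optional fields or functions, all of which a real type checker gives you for free.

```python
def type_errors(config: dict, schema: dict) -> list:
    """Compare a flat config against a {field: type} schema and list the problems."""
    errors = []
    for key, expected in schema.items():
        if key not in config:
            errors.append(f"missing field: {key}")
        elif type(config[key]) is not expected:
            # Exact type match, so True is not accepted where an int is expected.
            actual = type(config[key]).__name__
            errors.append(f"{key}: expected {expected.__name__}, got {actual}")
    for key in config:
        if key not in schema:
            errors.append(f"unexpected field: {key}")
    return errors


SCHEMA = {"host": str, "port": int, "debug": bool}

good = {"host": "localhost", "port": 8080, "debug": False}
bad = {"host": "localhost", "port": "8080", "verbose": True}

good_errors = type_errors(good, SCHEMA)
bad_errors = type_errors(bad, SCHEMA)
```

And this only validates the generated values; it cannot tell you that a helper function in your generator is wrong until that function actually runs.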


> You either need a simple list of items (eg. dependencies) or key/value pairs. Use a text file or yml or json or whatever.

You clearly haven't wrestled with kubernetes configuration files.

Reams and reams of YAML, heavily indented to represent umpteen nested objects.

Templating, bash scripts, using your favourite language to roll your own config generator - these are all well trodden approaches that fail to scale.

Official shitshows like Helm have come along with nothing more innovative to offer than templated YAML. The next version uses lua to generate config, but I remain skeptical given previous design choices.

We can question whether kubernetes has pushed the dial too far towards necessitating mountains of config, but for now it is most definitely a problem for us users.


My favorite is when people write scripts that concatenate YAML files in such a way that you have to be really careful about how this one file always needs to be indented by 12 spaces otherwise everything breaks in horrible ways.
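
The way out is to compose data structures and serialise once at the end, instead of gluing text together. A small Python sketch, with the caveats that `container` and `deployment` are hypothetical helpers, the manifest is heavily abbreviated and not a complete Kubernetes Deployment, and the output is JSON (which YAML 1.2 parsers generally accept):

```python
import json


def container(name: str, image: str) -> dict:
    return {"name": name, "image": image}


def deployment(name: str, containers: list) -> dict:
    # Nesting is expressed by structure, so no fragment needs to know its indent.
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {"template": {"spec": {"containers": containers}}},
    }


manifest = deployment("web", [container("app", "example/app:1.0")])
rendered = json.dumps(manifest, indent=2)
```

The serialiser decides the indentation, so the "this file must be indented by 12 spaces" class of bug cannot occur.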


One reason is that "configuration language" is an extremely wide topic. Some configs use yaml to encode bash scripts for example.

Overall a reason enough is that human-friendly languages have different priority than parser-friendly ones.

Honestly it is the reason I like TOML: with the exception of the date data type, it cleanly maps to JSON (which everyone agrees is a good enough serialization format), and it is specifically focused on human friendliness (except uniform lists) and readability.

As an underappreciated feature, the ability to have scoped key/values allows you to define nested tables with flat statements.
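
What I mean by flat statements, sketched in Python. `set_dotted` is a hypothetical helper that mimics the idea behind dotted keys and table headers; it does not handle quoted keys or any other real TOML parsing rule.

```python
def set_dotted(table: dict, dotted_key: str, value) -> None:
    """Assign value at a dotted path, creating intermediate tables as needed."""
    *parents, leaf = dotted_key.split(".")
    current = table
    for part in parents:
        current = current.setdefault(part, {})
        if not isinstance(current, dict):
            raise ValueError(f"{part!r} is already a plain value, not a table")
    current[leaf] = value


# Three flat statements...
flat_statements = [
    ("server.host", "localhost"),
    ("server.port", 8080),
    ("server.tls.enabled", True),
]

config = {}
for key, value in flat_statements:
    set_dotted(config, key, value)
# ...describe one nested structure, with no braces or indentation to balance.
```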


> Overall a reason enough is that human-friendly languages have different priority than parser-friendly ones.

Okay thanks, that's a good one. Readability might be a big deal.


I kind of agree, to a point (e.g. Erlang config is, usually, a complete shitshow).

I wonder if JSON had allowed for comments if we'd see such a proliferation of config systems? At least IME, that seems to be the biggest pain point with JSON.


My thoughts, exactly. Especially with dynamic scripting languages with no need for compilation and simple enough syntax (i.e. PHP, Python, Ruby, etc.). Why would I want to add another layer of complexity to an application with an unknown language when I already have a capable one? Why add more dependencies to your project? Why add more cognitive load on your brain?


What's with the commas at the start of lines?


This is a common convention in some languages, most often functional languages in my experience. I associate it most with OCaml.

So it's not so surprising to see it here, seeing as Dhall is written in Haskell.


Better Syntax for Lists, Records (and Unions) #66

https://github.com/dhall-lang/dhall-lang/issues/66


Makes it easier when commenting out. Use this trick for SQL and JS too (to prevent trailing comma issue)


I don't get why you'd build a language in 2019 which disallows a trailing comma in lists.


Yep. The solution to comma issues isn't to put every single one in a place nobody usually puts them - it's to be more forgiving about allowing the occasional trailing one.


The solution is to get rid of them. They are not necessary.


Author here: commas are necessary to separate list elements in any language where function application uses whitespace (i.e. Haskell-style function application). If you're not familiar with the syntax, an expression like `f x y z` is a function `f` applied to three arguments (`x`, `y`, and `z`), analogous to `f(x, y, z)` in a more traditional language.

Nix made this mistake of using whitespace to separate list elements AND using whitespace to separate a function from its argument and it is a very common pain point for new users, because they will write something like this:

    [ theFunction theFunction'sArgument ]
... thinking it will be parsed as:

    [ (theFunction theFunction'sArgument) ]
... but it actually gets parsed as a list with two elements, leading to a bizarre type error.
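
The ambiguity can be reproduced with two toy parsers in Python. These are hypothetical illustrations, not how Nix or Dhall actually parse: they only handle a single flat list with no nesting.

```python
def parse_whitespace_list(source: str) -> list:
    """Whitespace separates elements, so every token is its own element."""
    inner = source.strip()[1:-1]
    return inner.split()


def parse_comma_list(source: str) -> list:
    """Commas separate elements; whitespace inside an element is function application."""
    inner = source.strip()[1:-1]
    return [tuple(chunk.split()) for chunk in inner.split(",") if chunk.strip()]


source = "[ theFunction theFunction'sArgument ]"

as_whitespace = parse_whitespace_list(source)  # two separate elements
as_commas = parse_comma_list(source)           # one element: a function applied to an argument
two_calls = parse_comma_list("[ f x, g y ]")   # two elements, each an application
```

With commas as the only element separator, the grouping inside each element is never in doubt.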


There's absolutely no need for commas in a configuration language, in my opinion.


Dhall does not disallow trailing commas, nor require leading ones.


Alternatively it could allow leading commas.


On the other hand - why would you build a language where commas aren’t treated as whitespace?


You could absolutely do this as well. I'm just taking issue with a language that constrains you in such a way that you have to put the commas at the beginning of the list elements, so it plays nicely with {comments, source control blame}.



