Hacker News
The obvious final step (akrzemi1.wordpress.com)
51 points by codewiz on May 27, 2023 | 46 comments


After many years in Ruby, I feel strongly that the idea of a block is necessary for optimal expressiveness in a high-level language. Destructors are too implicit: fine for resources, but not as great when you are trying to understand code. And Python’s context managers trying to help with exception safety is a nice touch.

The Lisp crowd was right, as usual.


> The Lisp crowd was right, as usual.

40 years ago.

Fortunately we’re getting there, feature by feature. Maybe the 2030s will be the decade YAML goes away.


What would be the way to do yaml?



A general concept here is of a "context", which both Ruby blocks and Python context managers try to provide. Something happens when you enter the context, your code runs, and then something happens when you exit the context, maybe varying depending on how you exited.
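A minimal Python sketch of that shape, assuming a made-up `Context` class (not from any library): `__enter__` runs on the way in, and `__exit__` receives the exception type, so it can tell a normal exit from an exceptional one.

```python
class Context:
    """Hypothetical context that records how it was entered and exited."""

    def __init__(self):
        self.log = []

    def __enter__(self):
        self.log.append("enter")
        return self

    def __exit__(self, exc_type, exc, tb):
        # exc_type is None on a normal exit and the exception class otherwise.
        self.log.append("exit-ok" if exc_type is None else "exit-error")
        return False  # do not swallow the exception


ok = Context()
with ok:
    ok.log.append("body")

failed = Context()
try:
    with failed:
        raise ValueError("boom")
except ValueError:
    pass
```

Both objects record an "enter" step, but only the first records "body", and the exit entry differs depending on how the block was left.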

Such "contexts" are somewhat ad hoc in Lisp, whereas Ruby and Python have built-in language support and special syntax for them. The traditional Lisp idea is that you don't need special language support when you have macros and higher-order functions, but the special language support also means that all context managers follow the same protocols, whereas ad hoc contexts introduced by CALL-WITH-FOO/WITH-FOO are more like a common loose convention than a standardized system.

Functional programmers have also understood this idea of structured contexts for a very long time, giving us Functor, Applicative, and Monad.


One other way to deal with 'context' in Lisp is for example by the ADVICE mechanism or the method combinations (:around, :before, :after, ...) of CLOS.

https://en.wikipedia.org/wiki/Advice_(programming)

> all context managers follow the same protocols

Lisp is an old language with lots of dialects and systems. Thus people had different needs for context management and different facilities available to implement it.


> Lisp is an old language with lots of dialects and systems.

Absolutely, my first thought was `dynamic-wind`.

https://www.gnu.org/software/guile/manual/html_node/Dynamic-...


I agree. The most important feature of a high-level programming language is first-class closures with lexical scoping. Almost every design pattern, including the IoC in the original article, can be implemented with them (along with a few other core data structures and primitive operators). By contrast, there are many patterns that are very difficult to implement by hand in C because of the lack of first-class closure support (C++ of course is a bit better since it has lambdas, but then you have to opt in to everything else in C++).


Yup, came to say this same thing! I read this whole article going “this is easily solved in Ruby/Lisp”


Anecdotal, of course, but: often the most elegant solutions I see in my job are from programmers coming from Ruby. It is interesting to see how different programming languages can shape our mindset.

A big part of those good solutions is the little lambda-receiving functions that do all the heavy lifting without imposing a lot of boilerplate. Not too many of them, of course.

IMO this kind of thing is where the “getting things done” and “good architecture” meet.


Lisp? You don't need to go all the way to Lisp.

Declaring data imperatively is quite a contradiction. You need a lot of effort and misguided design to create it. You don't need first class code blocks to avoid it.

(But, of course, first class code blocks help in many other ways.)


What do blocks give you that good old-fashioned first-class functions don't?


For example, here's how you'd read a file in Python and operate on its contents.

  with open('file') as myfile:
      process(myfile.read())
When control leaves that context, `myfile.close()` is always called, so you never have to remember to clean up the file yourself. Even if `process()` raises an exception, the `with open` context closes the file before passing the exception upward.

It's not that you can't do all that without context managers, obviously. People have done it for a long time. It's just that the `with open...` pattern is effectively bulletproof and removes just one more potential footgun.
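Roughly what the `with` statement saves you from writing, as a sketch. The `process` function and the temporary file are placeholders I made up so the example runs on its own.

```python
import os
import tempfile


def process(text):
    # Placeholder for whatever work is done on the contents.
    return text.upper()


# Create a throwaway file so the sketch is self-contained.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "w") as tmp:
    tmp.write("hello")

# Hand-written equivalent of `with open(path) as myfile: ...`
myfile = open(path)
try:
    result = process(myfile.read())
finally:
    myfile.close()  # runs whether or not process() raised

os.remove(path)
```

The try/finally version works, but every call site has to repeat it correctly, which is the footgun the `with` form removes.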


In JS, we could do:

    open('file', (myfile) => process(myfile.read()))
It would be on the implementer of `open` to call `myfile.close()`, but presumably that's true of Python as well?


I think so, depending on how I'm reading your definition of "the implementer". If you meant "the code that implements the open() context manager", then yes. It looks something like:

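  from contextlib import contextmanager

  @contextmanager
  def open_context_manager(filename):
      f = open(filename)  # open before the try: if this fails there is nothing to close
      try:
          yield f
      finally:
          f.close()  # runs on normal exit and when the with-body raises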
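For comparison, here is a sketch of the callback-style version from the JS comment, written in Python. `with_open` is a hypothetical helper name, not a standard-library function, and the temporary file only exists to make the example self-contained.

```python
import os
import tempfile


def with_open(filename, body):
    """Hypothetical helper: open the file, pass it to body, always close it."""
    f = open(filename)
    try:
        return body(f)
    finally:
        f.close()


fd, path = tempfile.mkstemp()
with os.fdopen(fd, "w") as tmp:
    tmp.write("abc")

handles = []


def body(f):
    handles.append(f)  # keep a reference so we can check it was closed
    return len(f.read())


length = with_open(path, body)
os.remove(path)
```

Either way, the cleanup lives in one place (the helper) rather than at every call site.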


This is literally one of the oldest problems in programming: returning the stack to its previous state before returning from a subroutine, or jumping back to the start of a loop.

Specifically, the XML element example involves ‘pushing’ the context of a new element into the document - a context which MUST be ‘popped’ before continuing.

This problem is literally why GOTO was considered harmful, why old school programmers insist that subroutines should have one return statement, why functional programmers are right to despise mutation, and why iterators are preferable to loops.

So as a programmer, you shouldn’t be surprised that your language of choice has clever ways to solve this problem, or that you can think of a dozen ways of avoiding this. It’s like the ur-problem of writing code.


Knowing that Ruby libraries used block (lambda) pattern for the past 25-ish years, it was very painful to read through the first 90% of the article explaining all the issues with abusing destructors.

Maybe because first-class support for lambdas was added too late in C++ and Java, or maybe because it was always PITA to type.

Ruby syntax is the lightest there could be (and it supports both {braces} and do/end keywords)

    transaction do
      foo()
      bar()
    end
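A comparable block in Python, as a sketch only: the `transaction` helper is my own name, and it assumes SQLite through the standard `sqlite3` module with explicit BEGIN/COMMIT/ROLLBACK statements.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def transaction(conn):
    """Commit if the block finishes, roll back if it raises."""
    conn.execute("BEGIN")
    try:
        yield conn
    except BaseException:
        conn.execute("ROLLBACK")
        raise
    else:
        conn.execute("COMMIT")


# isolation_level=None stops sqlite3 from opening transactions implicitly,
# so the explicit BEGIN above is the only one.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE counter (n INTEGER)")

with transaction(conn):
    conn.execute("INSERT INTO counter VALUES (1)")

try:
    with transaction(conn):
        conn.execute("INSERT INTO counter VALUES (2)")
        raise RuntimeError("abort")
except RuntimeError:
    pass

rows = conn.execute("SELECT n FROM counter").fetchall()
```

Only the first insert survives; the second block raised, so its work was rolled back.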


> Ruby syntax is the lightest there could be

Kotlin can have exactly the same code with lambdas, using either a receiver type, or a context receiver. And it's type safe.

    transaction {
      foo()
      bar()
    }
Type-safe DSLs become a pleasure to build and use.


How does ruby prevent you from doing non-transactional things between 'do' and 'end'?


The same way you would be prevented from doing non-transactional things between calls to transaction.start() and transaction.end()


When I write this:

    someTransaction = do
      incrementCounter
      writeFile "foo.txt" "bar"
      decrementCounter
I get this:

  src/Lib.hs:11:5: error:
      • Couldn't match type ‘IO’ with ‘STM’
        Expected: STM ()
          Actual: IO ()
      • In a stmt of a 'do' block: writeFile "foo.txt" "bar"
        In the expression:
          do incrementCounter
            writeFile "foo.txt" "bar"
            decrementCounter
        In an equation for ‘someTransaction’:
            someTransaction
              = do incrementCounter
                  writeFile "foo.txt" "bar"
                  decrementCounter
     |
  11 |     writeFile "foo.txt" "bar"
     |     ^^^^^^^^^^^^^^^^^^^^^^^^^


Yup, came to say this same thing! I read this whole article going “this is easily solved in Ruby”


> The problem with this is that it is easy for the programmer to forget to write the obvious final step. It wouldn’t be a problem in a different, imaginary language or IDE where the machine would recognize a library with an obvious final step, and signal a compiler/IDE error when the programmer forgets it, or if it put the obvious final step into the code for the programmer. But this is fantasy.

It’s something you’d get with a linear type system (use values exactly once), but they’re not present in any particularly popular languages. I suspect Haskell might be able to achieve roughly this, but I’m not certain.

The closest I know definitely about is Rust, which has an affine type system (use values at most once). You can mark types and functions with the #[must_use] attribute, which says that you must do something with values of the type or returned from the function at least once, or it’ll produce a compiler warning. But what’s needed in these cases is that you must consume the value. I have often wished for this in Rust, but I doubt it would be accepted into the language (and so haven’t tried proposing it formally) because of its comparatively niche usefulness due to the interactions with destructors and panic unwinding. (Basically, you couldn’t actually rely on it for resource management: if you wanted it for something like “end database transaction, but actually do something with an error code rather than silently dropping it like a destructor”, you’d still actually need to implement a regular no-return-value destructor, so is requiring that the user write `tx.commit()?;` actually useful, given that a panic before that line would lead to effectively `drop(tx);` happening instead?) For most purposes, #[must_use] is good enough. But it still disappoints me, because I think most places that #[must_use] gets used, they’d actually be better suited by a #[must_consume]. Hmm… maybe I should contemplate proposing it after all.

But yeah, the lambda approach is often a good solution.


I was thinking about the Typestate Pattern because it would also take care of the order of the intermediate steps. Is it possible to mark the last type with #[must_use] and get a compile time error if I do not go through all steps?

EDIT: I found this:

"Basically, the typestate pattern in Rust today provides a way to ensure that state machines are used correctly, but does not fully defend against them being unused, or discarded before completion." [1]

So I guess #[must_use] is not enough and that's what you were saying in your comment already.

EDIT 2: I found another good write-up of this topic:

"The Pain Of Real Linear Types in Rust" [2]

[1] https://users.rust-lang.org/t/write-up-on-using-typestates-i...

[2] https://faultlore.com/blah/linear-rust


You don’t need anything along the lines of typestate, though it can fit in fine as well: having all your functions consuming self and returning a new Self (except the consuming “destructor”) is sufficient. It’s just an ergonomics pain since you need to assign it to something every time:

  #[must_use]
  struct Transaction;

  impl Transaction {
      fn new() -> Self { … }
      fn action(self, …) -> Self { … }
      fn commit(self) -> Result<(), …> { … }
  }

  let mut tx = Transaction::new();
  tx = tx.action(…);  // ← `tx =` is the difference.
  // (Any panic at this point will roll back the transaction via Drop,
  // but any errors in the process will be swallowed or abort or who knows what, there’s no standard.)
  tx.commit()?;  // unused-must-use warning on the last assignment of tx (`tx = tx.action(…);`) if this line is absent
A hypothetical #[must_consume] would let it be like this:

  #[must_consume]
  struct Transaction;

  impl Transaction {
      fn new() -> Self { … }
      fn action(&mut self, …) { … }
      fn commit(self) -> Result<(), …> { …; drop(self); Ok(()) }
  }

  let mut tx = Transaction::new();
  tx.action(…);
  // (Same panic behaviour.)
  tx.commit()?;  // unused-must-consume warning on the last assignment of tx (`let mut tx = Transaction::new();`) if this line is absent
With putting the state into the type, you’d be limited to annotating your entire generic type with #[must_use], not individual states. (#[must_use] won’t propagate through generic parameters, nor should it.)


That makes it clearer. Thanks!


Personally I would call this “execute within context”. The context could be a database transaction, or saving/restoring graphics state, or swapping out some global state to control instrumentation, or whatever.

The destructor approach has one advantage for semantic clarity: it’s guaranteed that your execute-within-context block really executes immediately and just once. Passing a lambda to a function means the function could call the lambda now, or later, or ten times, or maybe never. Still, I think it’s a more readable approach for contexts that are not holding resources.
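To make the "now, or later, or ten times, or maybe never" point concrete, here is a small Python sketch. All the `run_*` functions are invented for illustration; the call sites look identical even though the body runs a different number of times.

```python
calls = []


def body():
    calls.append("ran")


def run_now(fn):
    fn()


def run_twice(fn):
    fn()
    fn()


def run_never(fn):
    pass


deferred = []


def run_later(fn):
    deferred.append(fn)  # stored, not called


for runner in (run_now, run_twice, run_never, run_later):
    runner(body)

count_before = len(calls)  # one call from run_now, two from run_twice
deferred[0]()              # the "later" call finally happens
count_after = len(calls)
```

Nothing at the call site tells you which of these behaviours you get; you have to read the wrapper or trust its documentation.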


> Personally I would call this “execute within context”.

Oh, a monad.

Yeah, a monad solves it. But it's quite overkill.


It would only be overkill if you had to invent it yourself. But since there's an existing solution, use that.

The article went from destructors to the with-pattern. Once you start thinking about exceptions, you might reach for try-with-resources. With Async exceptions you might look at bracket. If you want composability, go for ResourceT.


Basically everything in the article is overkill. You just need to compose the XML contents declaratively, which is the most natural way to compose data.

The original API goes out of its way to be dangerous and hard to use.


With an expressive enough type system you could also leverage the Typestate Pattern. This would have the further advantage of not only taking care of the final step but also the intermediate ones.

For example in the first example from the article calling xml.attribute doesn't make sense without a prior xml.begin_element.

By using the Typestate Pattern the compiler would ensure that you can only call xml.attribute after a xml.begin_element. Of course all of this only works in cases where the order of steps is fixed at compile time.
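A rough sketch of the idea in Python, with the caveat that Python only enforces it at runtime (a static checker such as mypy would flag the bad call earlier). The class names `Document` and `OpenElement` are invented here, and the sketch does no escaping of attribute values.

```python
class Document:
    """State: no element open. Only begin_element is available."""

    def __init__(self, parts=None):
        self.parts = parts if parts is not None else []

    def begin_element(self, name):
        return OpenElement(self.parts + ["<" + name], name)


class OpenElement:
    """State: start tag open. Attributes may be added, then the element ended."""

    def __init__(self, parts, name):
        self.parts = parts
        self.name = name

    def attribute(self, key, value):
        # No escaping here; this is only about which calls are allowed.
        return OpenElement(self.parts + [' %s="%s"' % (key, value)], self.name)

    def end_element(self):
        return Document(self.parts + ["/>"])


doc = Document().begin_element("port").attribute("name", "eth0").end_element()
xml_text = "".join(doc.parts)
```

Because `Document` has no `attribute` method at all, calling it before `begin_element` is a type error rather than a logic error.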

EDIT: Please read chrismorgan's comment why, at least in Rust, this does unfortunately not really solve the "Obvious Final Step" problem.

https://news.ycombinator.com/item?id=36095908


The database transaction example has another good reason to use a closure or callback: many databases can fail a transaction with an error saying it should be retried (due to a deadlock being detected). If the body is a closure or other callback, the wrapper can call it again.
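A sketch of such a wrapper in Python. `RetryableError` and `run_transaction` are hypothetical names standing in for whatever a real driver provides, and the body here only simulates two deadlocks before succeeding.

```python
class RetryableError(Exception):
    """Hypothetical stand-in for a driver's 'deadlock, please retry' error."""


def run_transaction(body, max_attempts=3):
    """Call body(attempt) until it succeeds or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        try:
            return body(attempt)
        except RetryableError:
            if attempt == max_attempts:
                raise
            # A real wrapper would roll back (and probably back off) here.


attempts = []


def transfer(attempt):
    attempts.append(attempt)
    if attempt < 3:
        raise RetryableError("deadlock detected")
    return "committed"


outcome = run_transaction(transfer)
```

Note that the body ran three times, so anything it does outside the transaction has to be safe to repeat.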


Sounds like a cause for impossible to find bugs if the wrapper decides to reuse the callback in rare cases likely only existing in production. Might be better to let the end user handle this scenario in case they have side effects in their callback, like std::move.


This type of use case is why I wish more mainstream languages would have stricter concepts of purity and side effects.

Conceptually, a function implementing the body of a database transaction should be idempotent. If a language had a function type that did not mutate global or closed-over variables and only affected the world by calling methods on its parameter, it would be difficult to mess up.

Haskell can do this.

Rust’s Fn(&mut txn) -> Ret is kind of close, but it doesn’t express that the function may not have mutable captures, and it does nothing about interior mutability or calling impure functions.


A pseudocode python variant could look like

   with xml.element("port") as e:
       e.attribute("name", name)
       e.attribute("location", location)
And that's it
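One way that pseudocode could be made runnable, as a sketch: the `Xml` and `Element` classes are invented for this example, and unlike the article's streaming writer this version buffers the attributes and emits the element when the block ends.

```python
from contextlib import contextmanager
from xml.sax.saxutils import quoteattr


class Element:
    def __init__(self, name):
        self.name = name
        self.attrs = []

    def attribute(self, key, value):
        self.attrs.append((key, value))


class Xml:
    """Hypothetical writer; collects output in self.out."""

    def __init__(self):
        self.out = []

    @contextmanager
    def element(self, name):
        e = Element(name)
        yield e
        # Only reached if the block did not raise: emit the finished element.
        attrs = "".join(" %s=%s" % (k, quoteattr(v)) for k, v in e.attrs)
        self.out.append("<%s%s/>" % (name, attrs))


xml = Xml()
with xml.element("port") as e:
    e.attribute("name", "eth0")
    e.attribute("location", "rack 4")

document = "".join(xml.out)
```

There is no closing call to forget: leaving the `with` block is the final step.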


Pretty much how https://github.com/Knio/dominate does it, but no context variable binding (otherwise gets messy with lots of nesting)


> Any use of a lambda, as well as any inverison[sic] of control, makes code more difficult to read or debug

Did anyone else immediately think about lambdas, and then get surprised by this zinger at the end? I'll grant that in some languages lambdas are harder to debug, but I don't find the example more difficult to read or reason about _at all_.


The "Inversion of Control" section shows the solution. Using destructors is wrong on so many levels.


As far as I understand, Kotlin doesn't actually have first class blocks like Ruby, despite the appearance that it does. Instead, blocks are syntactic sugar for anonymous functions, so

  y = foo() {
    bar(it)
  }
is just pretty syntax for

  y = foo({ it -> bar(it) })


What's the difference?


Functionally, not that much. But in Ruby it requires a special language construct, whereas in Kotlin it's just pretty syntax for something else that's already in the language.



Even if there’s only one thing that could possibly go there, you shouldn’t omit the final


> you shouldn't omit the end.

Sometimes there's a different way to express the same thing without losing any meaning, but which might be more terse or robust.


In C, seems like a pretty easy problem for a macro, without complicating stack traces or control flow.


I make pretty heavy use of Java's AutoCloseable interface for this purpose. The lambda approach is an interesting one that could be applied in most languages though.



