More

raju · 2025-06-22T20:25:50 1750623950

pickaxe isn't a Git command, rather, it's a set of flags that `git log` offers (specifically `-S` and `-G`).

Shameless plug: I recently did a webinar on how the pickaxe options are better than `git-blame`) that you can find here: https://nofluffjuststuff.com/webinar/142/level_up_your_git_g... (Note: It requires you to provide an email address).

chrishill89 · 2025-06-23T15:28:55 1750692535

"Pickaxe" most often means those options. But it's also a (apparently legacy) command alias for git-blame.

steve_adams_86 · 2025-06-23T22:06:53 1750716413

Yes, this makes more sense now. Unfortunately I was already a heavy user of log, so then only new part is the term used for the flags. Somehow I've never seen it before

raju · 2025-02-07T11:26:36 1738927596

Similar HN thread: https://news.ycombinator.com/item?id=42967173

raju · 2025-02-06T11:56:26 1738842986

Let me start by saying this is wonderful work. Thank you for creating such a comprehensive resource. I haven't read through it all, but one thing did catch my eye.

Section 5.1 (https://beej.us/guide/bggit/html/split/branches-and-fast-for...)

> The default branch is called main.

> The default branch used to be called master, and still is called that in some older repos.

This is not true. Git still defaults to `master`, but allows you to change the default (for future `git init` invocations via `git config --global init.defaultBranch <name>`)

See https://github.com/git/git/blob/bc204b742735ae06f65bb20291c9...

Again, thank you. If I find anything else, I will be sure to post here.

*Update*: I also feel that referring to "older repos" sends the wrong message. *GitHub* decided to make this change, causing others to follow, and finally Git itself allows for the aforementioned configuration, but it has little to do with _newer_ or _older_, but rather preference.

beej71 · 2025-02-06T22:09:11 1738879751

Hrm. I didn't realize. I unset `init.defaultbranch` and it uses `master` and prints a block of hints.

> hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and 'development'. The just-created branch can be renamed via this command:

That's going to make it more "interesting" to write the fix, that's for sure.

Thanks!

raju · 2025-02-06T11:47:46 1738842466

Posted this comment https://news.ycombinator.com/item?id=42961468

Happy to get any feedback.

raju · 2025-02-06T11:47:12 1738842432

Not the OP, but I've been teaching along similar lines. I've done it a couple of times at conferences—here's one of the versions: https://www.youtube.com/watch?v=8Q-frNO-yps

I've also blogged about it

- https://looselytyped.com/blog/2014/08/31/gits-guts-part-i/

- https://looselytyped.com/blog/2014/10/31/gits-guts-part-ii/

raju · 2025-01-01T18:56:56 1735757816

If you really get into them here’s a few series that are along similar lines

- Prey series by John Sandford

- Bosch series by Michael Connelly (He’s the “Lincoln Lawyer’s half-brother). Also a series on Amazon Prime

- Orphan X series by Mark Greany

I’ve followed all of these (as well as Jack Reacher and Gray man) series for years and have read all of them. Great brain candy.

lawgimenez · 2025-01-08T14:14:10 1736345650

Wow thanks! Unfortunately I stopped at Reacher’s third book, there’s something about Jack’s sleeping with all the women so far that is making me uncomfortable.

raju · 2024-09-23T19:05:47 1727118347

Let me start by saying (as someone who has written a few technical books of his own)—Congratulations!

I am sure you (assuming this is your first book) are learning that this is a labor of love, and I wish you the very best in this endeavor. You should be proud!

I was exposed to "data oriented programming" thanks to Clojure—wherein maps/sets are the constructs used to pass data (as plain data) around, with simple functions that work with the data, as opposed to the traditional OO (hello ORM) that mangles data to fit some weird hierarchy.

Java's recent innovations certainly make this a lot easier, and I am glad someone is looking at propagating a much needed message.

I will take a look at the book, but I wish you the very best.

mrbonner · 2024-09-23T19:28:06 1727119686

I am also very interested in how this work in practice. With OOP at least you know the shape of your data structure as opposed to the hash map as a mere container type.

geophile · 2024-09-23T19:53:01 1727121181

I am an OOP programmer going back to the late 80s (including the cfront days of C++), and a serious user of Python since 2007.

In Python, I sometimes try data-oriented programming, using lists and dicts to structure data. And I find that it does not work well. Once I get two or more levels of nesting, I find it far too easy to get confused about which level I'm on, which is not helped by Python's lack of strong typing. In these situations, I often introduce objects that wrap the map or dict, and have methods that make sense for that level. In other words, the objects can be viewed as providing clear documentation for the whole nested structure, and how it can be navigated.

goostavos · 2024-09-23T20:06:51 1727122011

>Once I get two or more levels of nesting, I find it far too easy to get confused about which level I'm on

Author here, I agree with you. I have the working memory of a small pigeon.

The flavor of data orientation we cover in the book leverages strongly typed representations of data (as opposed to using hash maps everywhere). So you'll always know what's shape it's in (and the compiler enforces it!). We spend a lot of time exploring the role that the type system can play in our programming and how we represent data.

joshlemer · 2024-09-23T20:29:52 1727123392

Given the strongly typed flavour of data oriented programming, I wonder if you have any thoughts on the "proliferation of types" problem. How to avoid, especially in a nominally typed language like Java, an explosion of aggregate types for every context where there may be a slight change in what fields are present, what their types are, and which ones are optional. Basically, Rich Hickey's Maybe Not talk.

    record Make(makeId, name)
    record Model(modelId, name)
    
    record Car(make, model, year)
    record Car(makeId, modelId, year)
    record Car(make, model)
    record Car(makeId, modelId)
    record Car(make, year)
    record Car(makeId, year)
    record Car(make, model, year, colour)
    record Car(makeId, modelId, year, colour)
    record Car(year, colour)
    
    ....

PaulHoule · 2024-09-24T00:46:22 1727138782

Hickey is great at trash-talking other languages. In the case of Car you might build a set of builders where you write

   Car.builder().make(“Buick”).model(“LeSabre”).build()

Or in a sane world code generate a bunch of constructors.

In the field of ontology (say OWL and RDF) there is a very different viewpoint about ‘Classes’ in the objects gain classes as they gain attributes. :Taylor_Swift is a :Person because she has a :birthDate, :birthPlace and such but was not initially a :Musician until she :playsInstrument, :recordedTrack, :performedConcert and such. Most languages have object systems like Java or C++ where a Person can’t start out as not a Musician but become one later like the way they can in real life.

Notably in a system like the the terrible asymmetry of where does an attribute really belong is resolved, as in real life you don’t have to say it is primary that Taylor Swift recorded the Album Fearless or that Fearless was recorded by Taylor Swift.

It’s a really fascinating question in my mind how you create a ‘meta object facility’ that puts a more powerful object system on your fingers in a language like Java or Python, for instance you can have something like

   taylorSwift.as(Musician.class)

which returns something that implements the Musician.class interface if

   taylorSwift.isA(Musician.class)

where

   TaylorSwift instanceof MetaObject.class

romwell · 2024-09-24T12:36:59 1727181419

Well, that's what C++ templates were made for.

White your code to work on Musicians, pass Taylor Swift in.

If she's not a musician, your code won't compile.

PaulHoule · 2024-09-24T14:12:21 1727187141

What I am talking about is more dynamic, although meta-objects could be made more static too.

Particularly, I am not a Musician now but if I learned to play an instrument or performed at a concert I could become a Musician. This could be implemented as

   paulHoule.isA(Musician.class)                                  # false
   paulHoule.as(Musician.class).playsInstruments()                # an empty Set<Instrument>
   paulHoule.as(Musician.class).playsInstruments().add(trumpet)
   paulHoule.isA(Musician.class)                                  # now true

I really did build a very meta object facility that represented objects from this system

https://en.wikipedia.org/wiki/Meta-Object_Facility

in an RDF graph and provided an API in Python that made those objects look mostly Pythonic. Inheritance in MOF is like Java so I didn't need to use any tricks to make dynamic classes (possible in RDF) available.

mhuffman · 2024-09-24T20:11:26 1727208686

This is interesting. It seems like a logic language (like Prolog) would work more naturally.

cutler · 2024-09-24T13:15:04 1727183704

builder() .... build() Rich Hickey got something right. This is about as far from the idea behind DOP as it gets.

Tainnor · 2024-09-24T21:48:26 1727214506

That's on Java, though. Many other languages such as Kotlin, Swift, etc. have better ways of dealing with this, e.g. in Kotlin

  Car(make = "Buick", model = "LeSabre")

kentosi · 2024-09-24T00:50:46 1727139046

I haven't yet had the luxury to experiment with the latest version of Java, but this is one of the reasons why I wish Java introduced named parameters the say way kotlin and scala do.

Eg:

  data class Make(makeId: String, name: String)
  data class Model(modelId: String, name: String)

  data class Car(make: Make, model: Model, year: String, ...)

Now you can go ahead and order the params whichever way you wish so long as you're explicitly naming them:

  val v1 = Car(make = myMake1, model = myModel1, year = "2023", ...)
  val v1 = Car(model = myModel1, make = myMake1, year = "2023", ...)

Ironlink · 2024-09-24T09:29:01 1727170141

Once withers land, I think you could approximate this by letting your record class have a zero argument constructor which sets every field to some blank value, and then fill the fields using `with`.

  var x = new Car() with { make = "Volvo"; year = "2023"; };

If you want the Car constructor to enforce constraints, you could use this pattern in a separate Builder record:

  record Car(String make, String year) {
    Car {
      Objects.requireNonNull(make);
      Objects.requireNonNull(year);
    }

    record Builder(String make, String year) {
      Builder() {
        this(null, null);
      }
      Car build() {
        return new Car(make, year);
      }
    }
  }

  var x = new Car.Builder() with { make = "Volvo"; year = "2023"; }.build();

Obviously syntax TBD.

Tainnor · 2024-09-24T21:50:17 1727214617

So much syntax to enable something that other languages have had for 10+ years. That's why I can't take the "Java is as good as Kotlin now" arguments seriously.

javanonymous · 2024-09-24T11:06:50 1727176010

I think named parameters would be a great addition

For now, I use Lombok's @Builder annotation. It makes it much easier to create and copy a record, where non-assigned attributes are set to default.

Example:

   var bmw = Car.builder().make("BMW").build()

It also has a practical toBuilder() syntax that creates a copy of the original record, with some attributes changed

   var other = bmw.toBuilder().year(2024).build()

goostavos · 2024-09-23T20:55:36 1727124936

I have a long convoluted answer to this.

I love that talk (and most of Rich's stuff). I consider myself a Clojure fanboy that got converted to the dark side of strong static typing.

I think, to some degree, he actually answers that question as part of his talk (in between beating up nominal types). Optionality often pops up in place of understanding (or representing) that data has a context. If you model your program so that it has "15 maybe sheep," then... you'll have 15 "maybe sheep" you've got to deal with.

The possible combinations of all data types that could be made is very different from the subset that actually express themselves in our programs. Meaning, the actual "explosion" is fairly constrained in practice because (most) businesses can't function under combinatorial pressures. There's some stuff that matters, and some stuff that doesn't. We only have to apply typing rigor to the stuff that matters.

Where I do find type explosions tedious and annoying is not in expressing every possible combination, but in trying to express the slow accretion of information. (I think he talks about this in one of his talks, too). Invoice, then InvoiceWithCustomer, then InvoiceWithCustomerAndId, etc... the world that microservices have doomed us to representing.

I don't know a good way to model that without intersection types or something like Rows in purescript. In Java, it's a pain point for sure.

1propionyl · 2024-09-23T21:23:41 1727126621

My sense is that what's needed is a generalization of the kinds of features offered by TypeScript for mapping types to new types (e.g. Partial<T>) "arithmetically".

For example I often really directly want to express is "T but minus/plus this field" with the transformations that attach or detach fields automated.

In an ideal world I would like to define what a "base" domain object is shaped like, and then express the differences from it I care about (optionalizing, adding, removing, etc).

For example, I might have a Widget that must always have an ID but when I am creating a new Widget I could just write "Widget - {.id}" rather than have to define an entire WidgetCreateDTO or some such.

piva00 · 2024-09-23T21:47:35 1727128055

> For example, I might have a Widget that must always have an ID but when I am creating a new Widget I could just write "Widget - {.id}" rather than have to define an entire WidgetCreateDTO or some such.

In this case you're preferring terseness vs a true representation of the meaning of the type. Assuming that a Widget needs an ID, having another type to express a Widget creation data makes sense, it's more verbose but it does represent the actual functioning better, you pass data that will be used to create a valid Widget in its own type (your WidgetCreationDTO), getting a Widget as a result of the action.

1propionyl · 2024-09-24T00:13:41 1727136821

> Assuming that a Widget needs an ID, having another type to express a Widget creation data makes sense, it's more verbose but it does represent the actual functioning better

I agree with this logically. The problem is that the proliferation of such types for various use cases is extremely detrimental to the development process (many more places need to be updated) and it's all too easy for a change to be improperly propagated.

What you're saying is correct and appropriate I think for mature codebases with "settled" domains and projects with mature testing and QA processes that are well into maintenance over exploration/iteration. But on the way there, the overhead induced by a single domain object whose exact definition is unstable potentially proliferating a dozen types is developmentally/procedurally toxic.

To put a finer point on it: be fully explicit when rate of change is expected to be slow, but when rate of change is expected to be high favor making changes easy.

piva00 · 2024-09-24T07:52:37 1727164357

> What you're saying is correct and appropriate I think for mature codebases with "settled" domains and projects with mature testing and QA processes that are well into maintenance over exploration/iteration. But on the way there, the overhead induced by a single domain object whose exact definition is unstable potentially proliferating a dozen types is developmentally/procedurally toxic.

> To put a finer point on it: be fully explicit when rate of change is expected to be slow, but when rate of change is expected to be high favor making changes easy.

I agree with the gist of it, at the same time I've worked in many projects which did not care about defining a difference between those types of data in their beginning, and since they naturally change fast they accrued a large amount of technical debt quickly. Even more when those projects were in dynamically typed languages like Python or Ruby, relying just on test cases to do rather big refactorings to extrincate those logical parts are quite cumbersome, leading to avoidance to refactor into proper data structures afterwards.

Through experience I believe you need to strike a balance, if the project is in fluid motion you do need to care more about easiness of change until it settles but separating the actions (representation of a full fledged entity vs representation of a request/action to create the entity, etc.) is not a huge overhead given the benefits down the line (1-3 years) when the project matures. Balancing this is tricky though, and the main reason why any greenfield project requires experienced people to decide when flexibility should trump better representations or not.

1propionyl · 2024-09-24T15:57:18 1727193438

> Through experience I believe you need to strike a balance, if the project is in fluid motion you do need to care more about easiness of change until it settles but separating the actions (representation of a full fledged entity vs representation of a request/action to create the entity, etc.) is not a huge overhead given the benefits down the line (1-3 years) when the project matures. Balancing this is tricky though, and the main reason why any greenfield project requires experienced people to decide when flexibility should trump better representations or not.

I am in complete agreement, and this is why experienced architects and project managers are so key. Effective software architecture has a time dimension.

Someone needs to have the long term picture of how the architecture of the system with develop, enforce a plan so that the project doesn't get locked into or cut by early-stage decisions long term, but also doesn't suffer the costs of late-stage decisions early on, and manage the how/when of the transition process.

I think we could have better tools for this. Some of them in libraries, but others to be effective may need to be in the language itself.

glenjamin · 2024-09-23T22:44:42 1727131482

Do you mean in TypeScript or in another language?

In TS the `Omit<T, K>` type can be used to remove stuff, and intersection can be used to add stuff

jakjak123 · 2024-09-23T21:22:51 1727126571

Hopefully your domain is sane enough that you can read nearly all the data you are going to use up front, then pass it on to your pure functions. Speaking from a Java perspective.

chipdart · 2024-09-24T05:56:44 1727157404

> Given the strongly typed flavour of data oriented programming, I wonder if you have any thoughts on the "proliferation of types" problem.

Not a problem.

You're just making your life needlessly hard and blaming Java for the problems you're creating for yourself.

This represents, coincidentally, the bulk of the problems pinned on Java.

Everywhere else the problem you described is a variant of an anti-pattern and code smell widely known as telescoping constructor pattern.

The problems caused by telescoping constructors have a bunch of known cures:

- builder pattern (Lombok supports this, by the way),

- the parameter object pattern (builder pattern's poor cousin)

- semantically-appropriate factory methods.

The whole reason behind domain models taking the center stage when developing a software project is that you build your whole project around a small set of types with the necessary and sufficient expressiveness to represent your problem domain.

Also, "explosion of aggregate types" can only become a problem if for some reason you don't introduce support for type conversion when introducing specialized types.

kaba0 · 2024-09-24T07:32:49 1727163169

I have thoroughly enjoyed that Hickey talk, but I think he has a very system-oriented view/take - which is very important and shows his experience - but it is also common to have control over the whole universe for our program.

In the interconnected system view, data schemas can change without notice, and the program should be backwards and forwards compatible to a reasonable degree to avoid being brittle.

This is not a problem when we control the whole universe.

I find that Haskell-esque type systems (strongly typed with frequent use of algebraic data types to represent every possible state in _that_ universe) work better for the latter, but are not the best fit for the former, and they often have to add some escape hatches at the boundaries.

Java itself is in a weird cross of this two - it has a reasonably strong type system nowadays, but it’s also a very dynamic runtime where one can easily create their own class at runtime and load it, reflect on it, etc.

So all in all — are you making that Car as part of your universe where you control everything, and it won’t change in unexpected ways? Make a record, potentially with nullable/Optional/Maybe types for the fields, if that makes sense.

If it represents some outside data that you don’t control, then you might only care about a subset of the fields: create a record type for that subset and use a converter from e.g. json to that record type, and the converter will save you from new fields. If everything is important then your best bet is basically what Clojure/JSONObject/etc do, just have a String-keyed map.

(Note: structural types can help here, and I believe OCaml has row polymorphism?)

cutler · 2024-09-24T13:28:39 1727184519

There's always clojure.spec.

geophile · 2024-09-23T21:38:58 1727127538

This discussion sounds like there is confusion about the Car abstraction.

Make and model vs. makeId and modelId: Pick one. Are Make and Model referenced by Cars or not? There seems a slight risk of the Banana/Monkey/Jungle problem here, so maybe stick with ids, and then rely on functions that lookup makes and models given ids. I think it's workable either way.

As for all the optional stuff (color, year, ...): What exactly is the problem? If Cars don't always have all of these properties then it would be foolish of Car users to just do myCar.colour, for example. Test for presence of an optional property, or use something like Optional<T> (which amounts to a language supported testing for presence). Doesn't any solution work out pretty much the same? When I have had this problem, I have not done a proliferation of types (even in an inheritance hierarchy) -- that seems overly complicated and brittle.

nicoty · 2024-09-23T20:51:50 1727124710

I'm not familiar with Java. Does it have no notion of structural types at all? If it does, maybe you could wrap those fields in `Car` with `Maybe`/`Option` (I’m not sure what the equivalent is in Java) so you get something like `Car(Maybe Make, Maybe Model, Maybe Year, Maybe Colour)`?

vips7L · 2024-09-24T03:56:25 1727150185

Records are structural types. Null restricted types are in draft: https://openjdk.org/jeps/8303099

htuyfjsdkfw · 2024-09-25T03:12:19 1727233939

Records in Java are nominal. In fact, it is syntax sugar for a class.

spullara · 2024-09-23T21:31:53 1727127113

yes and it is called Optional (rather than Maybe)

cbsmith · 2024-09-24T01:03:28 1727139808

That one is pretty simple. You have a Car object with four fields. The types of the fields are, respectively Optional<Make>, Optional<Model>, Optional<Year>, and Optional<Colour>.

Hickey makes it sound worse than it is.

chii · 2024-09-24T05:26:18 1727155578

so now when you have a function that takes in a Car object, you have no idea what fields those objects might have, because it's all optional! Which means the checks for the validity of each field end up spreading out to every function.

chipdart · 2024-09-24T06:12:32 1727158352

> so now when you have a function that takes in a Car object, you have no idea what fields those objects might have, because it's all optional!

Your types are already optional if you're adding constructors for each permutation of all input parameters.

Tainnor · 2024-09-24T21:59:46 1727215186

Which is no worse than the situation in a dynamically typed language where every field in every object could be optional.

Dynamic typing advocates sometimes miss that statically typed languages don't force you to encode every invariant in the type system, just those that seem important enough.

Or, if you really want to go overboard, you could use a dependently typed language and write functions that only accept cars with a specific combination of fields not being empty. But that's typically not worth the complexity.

cbsmith · 2024-09-24T06:39:24 1727159964

Frankly, your contract was that you have no idea what fields those objects might have. I'm just fulfilling it. You won't have checks for validity of each field, as Optional is valid, but you will have to have code that handles Optional<> types (so things like foo.getModel().orElse()...), which is the requirement you described. That doesn't mean you'll be constantly checking the validity of each field.

js2 · 2024-09-23T21:21:02 1727126462

> Python's lack of strong typing

I see people conflate strong/weak and static/dynamic quite often. Python is strong[1]/dynamic, with optional static typing through annotations and a type checker (mypy, pyright, etc).

Perhaps the easiest way to add static types to data is with pydantic. Here's an example of using pydantic to type-check data provided via an external yaml configuration file:

https://news.ycombinator.com/item?id=41508243

[1] strong/weak are not strictly defined, as compared to dynamic/static, but Python is absolutely on the strong end of the scale. You'll get a runtime TypeError if you try to add a number to a string, for example, compared to say JavaScript which will happily provide a typically meaningless "wat?"-style result.

jstimpfle · 2024-09-23T23:36:58 1727134618

In some significant ways, it's not strong at all. It's stronger than Javascript but it's difficult not to be. Python is a duck typing language for the most part.

js2 · 2024-09-24T00:14:59 1727136899

Duck typing is an aspect of it being dynamically typed, not whether it is strong/weak. But strong/weak is not formally defined, so if duck typing disqualifies it for you, so be it.

https://langdev.stackexchange.com/questions/3741/how-are-str...

cbsmith · 2024-09-24T01:14:58 1727140498

I always think of Python as having "fairly strong" typing, because you can override the type of objects by just assigning to __class__.

htuyfjsdkfw · 2024-09-25T03:26:01 1727234761

Duck typing doesn't exist. What you refer to as duck typing is the inherit nature of dynamic typing.

fire_lake · 2024-09-24T12:18:35 1727180315

The Python ecosystem is not built around types.

For example, you will find functions where the runtime value of parameters will change the return type (e.g. you get a list of things instead of one thing). So unless we want to throw out huge amounts of Python libraries (and the libraries are absolutely the best thing Python has going for it) then we have to accept that it’s not a very good statically type language experience.

The JS community on the other hand had adopted TypeScript very widely. JS libraries are often designed with typing in mind, so despite being weakly typed, the static type experience is actually very good.

senand · 2024-09-24T15:54:11 1727193251

I don't disagree. However, often, when I use a library, I use it within a small function that I control, which I can then type again. Of course, if libraries change e.g. the type they return over time (which they shouldn't also according to Rich), you often only notice if you have a test (which you should have anyway).

Moreover, for many libraries there are types- libraries that add types to their interface, and more and more libraries have types to begin with.

Anyway just wanted to share that for me at least it's in practice not so bad as you make it sound if you follow some good processes.

js2 · 2024-09-24T19:24:15 1727205855

YMMV. I have over two decades of experience with Python and about a decade with JS though it's all backend work. I use both in my day job, but write in Python more frequently. I've found the transition to Python static typing much more seamless and easier to adopt than TS.

Amusingly, I can't call any time where I'd had to deal with differently typed return values in Python, but just recently had to fix some legacy JS code that was doing that (a function that was returning null, scalar, or array depending upon how many values it got in response to a SQL query).

$0.02.

zahlman · 2024-09-24T18:23:37 1727202217

>For example, you will find functions where the runtime value of parameters will change the return type (e.g. you get a list of things instead of one thing).

I have long argued that such interfaces are doing it wrong. That's what "Special cases aren't special enough to break the rules." in the Zen is supposed to warn about, to my understanding.

thelastparadise · 2024-09-23T21:43:00 1727127780

You're being pydantic =)

lelanthran · 2024-09-24T16:18:06 1727194686

> You'll get a runtime TypeError if you try to add a number to a string, for example,

A popular but nonsensical myth:

    $ cat t.py
    def foo(myval):
       return myval * 12
    
    print (foo("a"))
    $ python3 t.py
    aaaaaaaaaaaa

In real-world usage, Python's "typing" is about as helpful as Javascript's "typing". Plain old C has stronger typing guarantees than Python/PHP/etc.

zahlman · 2024-09-24T18:35:41 1727202941

Defining an operation between two different types is not at all the same thing as enabling implicit conversions. Notice for example that "1" * 2 gives "11", and not "2" nor 2. Interpreting multiplication of a string by an integer as "repeat the string that many times" doesn't require any kind of conversion (the integer is simply a counter for a repeated concatenation process). Interpreting addition as "append the base-10 representation of the integer" certainly does. (Consider: why base 10?)

You have a point that strong vs weak typing is not a binary and that different languages can enable a varying amount of implicit conversions in whatever context (not to mention reinterpretation of the underlying memory). But from ~20 years of experience, Python's type system is nothing like JavaScript's - and it's definitely helpful to those who understand it and don't fight against it.

In my experience it's typically people from languages like Haskell that can't see the difference.

Tainnor · 2024-09-24T22:02:10 1727215330

that's just operator overloading and it exists in many statically typed languages too

lelanthran · 2024-09-25T05:48:21 1727243301

> that's just operator overloading and it exists in many statically typed languages too

My point is that Python's "typing" guarantees allow a caller to call a function with the wrong type, and get back a wrong answer and/or silently lose data.

Strong typing is pointless if the language is unable to actually prevent common footguns, like passing in the incorrect type.

I'm moving more and more to the opinion that arguing about the spectrum of strong <-> weak typing is stupid, because type utility is on the spectrum of static <-> dynamic, with dynamic being full of footguns.

cbsmith · 2024-09-24T00:59:49 1727139589

Living this dream in Python right now (inherited a code base that used nasty nesting of lists & dicts). You don't strictly need to do OOP to solve the problem, but it really does help to have a data model. Using dataclasses to map out the data structures makes the code so much more readible, and the support for type hints in Python is good enough that you can even debug problems with the type system.

clepto · 2024-09-24T15:41:10 1727192470

I see a lot of people mentioning Pydantic here, but you should take a look into TypedDict. It provides a type structure ontop of a plain dictionary, and sounds like exactly what you’d want, and is a built-in that you don’t need a dependency for.

Mypy for example can also see the types of the dictionary are supposed to be when you use it just like a normal dictionary.

mejutoco · 2024-09-23T20:43:28 1727124208

I recommend you use pydantic for type annotations. Alternatively, dataclasses. Then you pair it with typeguards @typechecked annotation and the types will be checked at runtime for each method/function. You can use mypy to check it at "compile time".

Having clear data types without oop is possible, even in python.

sodapopcan · 2024-09-23T20:33:53 1727123633

Python's not really built for that AFAIK, though. In languages built for it, you can type your dicts/hashes/maps/whatever and its easier to see what they are/know where the functions that operate on them live. I'm most familiar with Elixir which has structs which are simply specialized map (analogous to dict in Python) where their "type" is the name of the module they belong to. There can only be one struct per module, In this sense is easy to know exactly where its functions live and is almost like a class with the very key difference that modules are not stateful.

cbsmith · 2024-09-24T01:16:11 1727140571

> In languages built for it, you can type your dicts/hashes/maps/whatever and its easier to see what they are/know where the functions that operate on them live.

I think I must be misunderstanding what you mean by that, because I can very much do that in Python.

sodapopcan · 2024-09-24T12:58:17 1727182697

I think I misunderstand OP's problem then.

cbsmith · 2024-09-24T13:34:44 1727184884

Their problem stems from the scenario where you don't type them. You just leave them as generic lists & dicts.

sodapopcan · 2024-09-24T14:49:33 1727189373

That's what I thought. I obviously don't know Python well enough and didn't know you can name dicts (like, beyond setting them to a variable). I guess you can export from a module so they are prefixed! Didn't think of that one earlier.

maleldil · 2024-09-24T16:03:02 1727193782

I'm not sure what you mean by naming dicts, but Python has TypedDict, where you can define the names and types of specific keys. They only exist for type checking and behave exactly as a normal dict at runtime.

In modern typed Python, you can instead use dataclasses, NamedTuples (both in the standard library), attrs or Pydantic (both third-party) to represent structs/records, the latter also providing validation. Still, TypedDicts are helpful when interfacing with older code that uses dicts for heterogeneous data.

My main gripe with them is that different TypedDicts are not compatible with each other. For example, it would be very helpful if a dict with x:str and y:str fields were considered superclasses of dicts with x:str, y:str and z:str like they are in TypeScript, but they aren't. They are considered different types, limiting their usability in some contexts.

When using homogenous dicts, you can still use dict[str, T], and T can be Any if you don't want to type the whole thing. You can use any hashable type instead of str for keys. I often do that when reading JSON from dynamically typed dict[str, Any] to dataclasses.

cbsmith · 2024-09-24T17:52:44 1727200364

    class X(TypedDict):
        x: str

    class Y(TypedDict):
        y: str

    class XYZ(X, Y):
        z: str

That should get you the class/superclass relationship that you want, no?

maleldil · 2024-09-24T20:39:12 1727210352

That needs to be explicit for any interacting types. You must define separate classes and explicitly define their hierarchy. This is fine if you control all the types, but it breaks down quickly. The best example is having two TypedDicts with the same members; in Python, you cannot use one instead of the other.

    from typing import TypedDict

    class A(TypedDict):
        a: int
        b: str
        

    class B(TypedDict):
        a: int
        b: str
        
    def f(a: A) -> None: pass

    b = B(a=1, b='b')
    f(B) # mypy error: Argument 1 to "f" has incompatible type "type[B]"; expected "A"  [arg-type]

On the other hand, this is legal in Typescript:

    interface A {
      a: number;
      b: string;
    }

    interface B {
      a: number;
      b: string;
    }  

    function f(a: A) {}

    const b: B = {a: 1, b: 'b'};
    f(b);

This is most useful when A has a subset of B's attributes, like this (which also doesn't work in Python):

    interface A {
      a: number;
    }

    interface B {
      a: number;
      b: string;
    }  

    function f(a: A) {}

    const b: B = {a: 1, b: 'b'};
    f(b);

cbsmith · 2024-09-24T22:36:17 1727217377

That seems a lot like duck typing to me.

maleldil · 2024-09-25T08:00:54 1727251254

Yes, it is. Typed Python supports duck typing to some extent; see typing.Protocol and stuff like Sequence, Iterable, Mapping, etc.

cbsmith · 2024-09-25T16:53:04 1727283184

I'd argue non-typed Python supports duck typing pretty well too, so you don't necessarily need Typed Python to support it.

sodapopcan · 2024-09-24T16:36:27 1727195787

Awesome, thanks for the clarification.

cbsmith · 2024-09-24T15:47:43 1727192863

Python classes are basically dictionaries that have a distinct type bound to them. Alternatively you can subclass from dictionary to give yourself a distinct type but still be a dictionary. Slotted classes are basically named tuples (and of course, Python has actual named tuples and dataclasses), so there's a lot of ways to "tag" a collection with a specific type in mind.

sodapopcan · 2024-09-24T16:35:53 1727195753

A typed dict is more like what I mean. Obviously I know about classes as I'm no stranger to OO.

chipdart · 2024-09-24T06:07:06 1727158026

> (...) by Python's lack of strong typing. In these situations, I often (...)

Python does support strong typing, albeit optional, with type annotations and tools like mypy.

If a problem is caused by the lack of strong typing, why not use strong typing?

baq · 2024-09-24T10:44:52 1727174692

static.

python was always strongly typed: you could not do 2 + '3', ever. nowadays mypy/pyright will tell you that before runtime, hence static.

chipdart · 2024-09-24T13:18:13 1727183893

I stand corrected.

FpUser · 2024-09-23T20:14:47 1727122487

>"In these situations, I often introduce objects that wrap the map or dict, and have methods that make sense for that level."

I've been doing the same thing since the end of the 80s as well starting with Turbo/Borland Pascal, C++, and later any other language that supports OOP.

ninetyninenine · 2024-09-24T08:30:29 1727166629

Python now has type hints which can be used with an external type checker in the IDE. You'd probably type it with a class.

ederamen · 2024-09-24T00:45:21 1727138721

Use Data Classes

kccqzy · 2024-09-23T20:28:09 1727123289

Clojure has spec. That allows you to know a specification of what the data structure contains.

akavi · 2024-09-23T19:51:55 1727121115

You can get strongly typed "shaped" data without objects[0], even in Java: Records[1].

~Unfortunately, I believe they're mutable (and cannot be made immutable).~ Edit: I was wrong, they're immutable.

[0]: I'm using "object" to mean "data bound to methods", since the concept of aggregate data in general long pre-date OOP (eg, C's structs)

[1]: https://docs.oracle.com/en/java/javase/17/language/records.h...

taftster · 2024-09-23T20:00:46 1727121646

Java Records are immutable (by the most common definition). They don't have any means to update the record (via setters, etc.) after construction. That doesn't mean, for example, you can't store a reference to a mutable type (for example, a List or Map) in your record.

The frustration I have with Records is there is no good way to prevent direct construction of them. That is, the constructor is public, which prevents an easy way of enforcing an invariant during construction.

For example, let's say that you have a record with a Date type. There's no good way to prevent a user from creating the record with an invalid date, one that is out of a needed date range. Or maybe enforcing a field cannot be null or some combination of fields must meet requirements as a group.

The benefit I get from the classic Builder pattern is defeated with Records. I can't enforce checking of my fields before the construction of the record object itself. Presumably I would need to verify the object after construction, which is unfortunate.

vips7L · 2024-09-23T20:06:40 1727122000

You can enforce some invariants during construction:

    record Point(int x, int y) {
        Point {
            if (x < 0) throw new IllegalArgumentException()
        }
    }

or if you want to assert something is not null:

    record Person(String name) {
        Person {
            requireNonNull(name);
        }
    }

nogridbag · 2024-09-24T19:41:42 1727206902

I think records will be much more useful if https://openjdk.org/jeps/468 gets out of preview.

elric · 2024-09-24T11:53:33 1727178813

> There's no good way to prevent a user from creating the record with an invalid date

That is factually incorrect.

You can do all of that validation in a record constructor, much like in a normal Java class constructor. There's a difference in syntax: you don't need to repeat the constructor arguments in parantheses, and don't have to perform the assignments yourself. These are tailored specifically for easy validation.

tpmoney · 2024-09-23T23:15:20 1727133320

As mentioned by the other commenters, you should be able to run any validations or transformations on the data that you want in the canonical constructor, including re-assigning values (for example we've done defaults with `foo != null ? foo : new DefaultFoo()`). The only thing I think you can't do with a record is make the canonical constructor private and force users of your type to call factory methods that can return null instead of throwing an exception. You can provide those factory methods, but anyone can still call the constructor, so you have to do your hard checks in the constructor. On the other hand, no matter how many alternate constructors or factory methods you make, you're also guaranteed that every one of them eventually has to call the canonical constructor, so you don't need to spread your validation around either.

akavi · 2024-09-23T20:04:02 1727121842

Can you make the Record class private to a module, and only export a static function that constructs them?

(I know very little about Java)

kaba0 · 2024-09-23T20:11:51 1727122311

To a degree, yes, that’s possible. But leaking a private type over module boundaries is bad form, so a better (though possibly over engineered solution) would be to have a separate public interface, implemented by the private record type, and the static function would have that interface as return type.

enugu · 2024-09-24T01:23:45 1727141025

Why is it bad form to expose a record type only via custom functions and not its field accessors? Isn't this just like exposing a more usual object with its public functions and private functions remain inaccessible?

snmx999 · 2024-09-23T21:25:26 1727126726

You can create dedicated, already verified objects to pass on to your record. E.g. AllowedDate (extends Date).

bedatadriven · 2024-09-23T19:59:10 1727121550

A record's fields are final, so records are immutable (though they can include immutable pointers to mutable objects)

davedx · 2024-09-24T10:47:13 1727174833

With TypeScript you have types to tell you the shape of your data.

goostavos · 2024-09-23T19:32:11 1727119931

Thanks for the kind words :)

>learning that this is a labor of love

I underestimated both the amount of labor and the amount of love that would be involved. There were more than a few "throw everything out and start over" events along the way to this milestone.

Clojure definitely had a huge impact on how I think about software. Similarly, Haskell and Idris have rearranged my brain. However, I still let Java be Java. The humble object is really tough to beat for managing many kinds of runtime concerns. The book advocates for strongly typed data and leveraging the type system as a tool for thinking.

>Java's recent innovations certainly make this a lot easier

Yeah, it's an exciting time! Java has evolved so much. Algebraic types, pattern matching, `with` expressions -- all kinds of goodies for dealing with data.

jwr · 2024-09-24T13:50:34 1727185834

> Clojure definitely had a huge impact on how I think about software

I could be called a "Clojure programmer", because I make a living from an app written Clojure and ClojureScript. While I always appreciated the incredible JVM, I always looked at Java the language with disgust and contempt, interfacing with it only as was necessary, but recent work on Java makes it much more attractive. I was impressed by the functional interfaces, modern design with mostly static methods, JSR-310 (date and time) is absolutely great — overall, Java has improved a lot over the years.

It has come to the point where I gasp might consider writing some Java code :-)

raju · on March 31, 2024

The minimum price of the book is free. You could grab a copy and determine if you like what you see and then pay the full price.

fock · on March 31, 2024

and that is the appropriate price for a collection of reddit(?)-messages grouped by topic.

DataDive · on March 31, 2024

What do you mean?

That there are tasks out there that some ought to do for free if someone else thinks these should be done for free?

The "pay what you think it is worth" model is not a scalable and viable approach. It most likely only works when everything costs money and "pay what you think" is a novelty that gains sympathy and attention.

But as soon as the novelty wears off it is not a sustainable approach.

raju · on Jan 29, 2024

Previous discussion on HN with 108 comments—https://news.ycombinator.com/item?id=25591492

actionfromafar · on Jan 29, 2024

Whoa, the top comment is really +5 insightful.

corford · on Jan 29, 2024

Yeah, it's a great comment. Since then things like Temporal have sprung up to try and address this state machine complexity of "accidental decoupling"

Terr_ · on Jan 29, 2024

> Accidental decoupling is where you have a complex state machine encapsulating a business procedure with multiple steps, and it's coordinated as messages between and actions in multiple services.

That might need emphasis on "in multiple services."

Within the same service, a granular set of messages (events) can still be useful for auditing or creating good read-model "projections" of what happened.

dagss · on Jan 30, 2024

I thought event sourcing as a data storage strategy and queues as an execution model were orthogonal concepts?

Terr_ · on Jan 30, 2024

It's true that the messages (state machine transitions) don't have to be a durable source of truth, but there are similar arguments to be made for granularity.

bvrmn · on Jan 29, 2024

I could add 2c. If you ever need to store some meta alongside the message in a DB. For example status, some execution history, etc. Then it's better to avoid MQ at all. Of course if you can scale DB access from workers and you can couple producers/consumers via the same DB. But it's the case for almost all applications TBH.

briHass · on Jan 30, 2024

I mostly agree, but from a devil's advocate position, the downside is the likelihood that you end up reimplementing queue basics like retries, delay/scheduling, and of course, the essential transactional state flips without locking or perf issues.

In my experience, the downside to the queue is losing all the historical statistics/state that you get for free with a database table. You have to instrument all that stuff manually, since most simple queues are designed to be transient once messages are confirmed.

I usually end up with a hybrid: store a copy of the state in the DB (along with all the job data), and essentially use the queue to hand off an ID or something pointing to the DB. You can then run queries against the best-effort state recorded in the DB, but the queue handles all the at-most and schedule/retry logic I don't want to handcraft.

throwawaaarrgh · on Jan 30, 2024

Fwiw, HTTP is a stateless synchronous at-least-once many-to-one request-reply message queue with metadata.

bvrmn · on Jan 30, 2024

HTTP is a transport. It doesn't have any other properties besides you could make a request and hopefully get a response. Semantic is defined by actual client/server implementations with corresponding backend.

throwawaaarrgh · on Jan 30, 2024

MQTT is a transport too, but its design facilitates a brokered pub-sub message queue. The transport implementation is effectively the queue, as far as the applications are concerned.

bvrmn · on Jan 30, 2024

MQTT as a transport is a part of elaborate spec describing client/server state machine. There is nothing similar for HTTP.

throwawaaarrgh · on Jan 30, 2024

The application doesn't care if it's "just a transport" or "part of an elaborate spec". Functionally there is no difference between an HTTP transaction and a message queue transaction, for a specific kind of message queue. You implement them literally the same way in an application; open socket, connect, send, receive, optional loop, close.

The point is that a message queue is just a concept. It can exist in many forms for many different use cases. You evaluate the type of queue and implement something that provides for its requirements. HTTP does that inherently for certain message queue types, and for others it requires more code.

Another example is the perennial meme of "PostgreSQL queue". A database isn't a message queue, it's a database! Yet people throw some crap into the DB and call it a queue. Because the application doesn't care how you implement the logical concept. HTTP, RDBMS, MQTT broker, whatever.

kapilvt · on Jan 29, 2024

thanks for noting, 2020 should be in the title, I was about to comment that event bridge (which linked article notes is new) is a rebrand on cloud watch events which have been around since January 2016.

raju · on Sept 21, 2023

Previous submission (with 67 comments): https://news.ycombinator.com/item?id=25834444