Doing anything like this in __init__ is crazy. Even `Path("config.json").read_text()` in a constructor isn't a good idea.
Friends don't let friends build complicated constructors that can fail; this is a huge violation of the Principle of Least Astonishment. If you require external resources like a zeromq socket, use connect/open/close/etc methods (and a contextmanager, probably). If you need configuration, create a separate function that parses the configuration, then returns the object.
I appreciate the author's circumstances may not have allowed them to make these changes, but it'd drive me nuts leaving this as-is.
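A minimal sketch of the connect/open/close-plus-context-manager shape described above. `Connection`, `endpoint`, and the list standing in for a real socket are all hypothetical names for illustration, not anything from the original code:

```python
class Connection:
    """Hypothetical resource wrapper; a plain list stands in for a real socket."""

    def __init__(self, endpoint: str):
        # Only stores values, so constructing the object cannot fail.
        self.endpoint = endpoint
        self._socket = None

    @property
    def is_open(self) -> bool:
        return self._socket is not None

    def open(self) -> None:
        # A real implementation would create and connect the socket here,
        # and this is the place where a connection error would surface.
        self._socket = []

    def send(self, message: str) -> None:
        if self._socket is None:
            raise RuntimeError("connection is not open")
        self._socket.append(message)

    def close(self) -> None:
        self._socket = None

    def __enter__(self) -> "Connection":
        self.open()
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # Runs even if the body of the with-block raised.
        self.close()
```

With this shape, `with Connection("tcp://localhost:5555") as conn:` is the only spot where a connection failure can show up, and cleanup is guaranteed when the block exits.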
Not just Python. Most languages with constructors behave badly if setup fails: C++ (especially), Java, JavaScript. Complicated constructors are a nuisance and a danger.
Rust is the one language I'm familiar with that does this exceptionally well (although I'm sure there are others). That's strictly because constructors are not a special language construct there: any associated function can act as one. So you pay attention to the function signature just like everywhere else: return Result<Self, T> explicitly, heed async/await, etc. A constructor is no different from a static helper method in other typical languages.
new Thing() with fragility is vastly inferior to new_thing() or Thing::static_method_constructor() without the submarine defects.
Enforced tree-style inheritance is also weird after experiencing a traits-based OO system where types don't have to fit onto a tree. You're free to pick behaviors at will. Multi-inheritance was a hack that wanted desperately to deliver what traits do, but it just made things more complicated and tightly coupled. I think that's what people hate most about "OOP", not the fact that data and methods are coupled. It's the weird shoehorning onto this inexplicable hierarchy requirement.
I hope more languages in the future abandon constructors and strict tree-style and/or multi-inheritance. It's something existing languages could bolt on as well. Loose coupling with the same behavior as ordinary functions is so much easier to reason about. These things are so dated now and are best left in the 60's and 70's from whence they came.
I don't have enough experience with traits, but they also sound like a recipe for creating a mess. I find anything more than like 1 level of inheritance starts to create trouble. But perhaps that's the magic of traits? Instead of creating deep stacks, you mix-and-match all your traits on your leaf class?
> I find anything more than like 1 level of inheritance starts to create trouble.
That's the beauty of traits (or "type classes"). They're behaviors and they don't require thinking in terms of inheritance. Think interfaces instead.
If you want your structure or object to print debug information when logged to the console, you custom implement or auto-derive a "Debug" trait.
If you want your structure or object to implement copies, you write your own or auto-derive a "Clone" trait. You can control whether they're shallow or deep if you want.
If you want your structure or object to be convertible to JSON or TOML or some other format, you implement or auto-derive "Serialize". Or to get the opposite behavior of hydrating from strings, the "Deserialize" trait.
If you're building a custom GUI application, and you want your widget to be a button, you implement or auto-derive something perhaps called "Button". You don't have to shoehorn this into some kind of "GObject > Widget > Button" kind of craziness.
You can take just what you need and keep everything flat.
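For readers coming from Python, the closest built-in approximation is probably `typing.Protocol`. It is structural typing rather than real traits (no auto-derive, no default method bodies in this sketch), and `Describe`, `Clickable`, and `Button` are made-up names, but it shows the "flat, pick your behaviors" idea:

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Describe(Protocol):
    def describe(self) -> str: ...


@runtime_checkable
class Clickable(Protocol):
    def click(self) -> str: ...


class Button:
    # No base class: Button satisfies both protocols just by having the methods.
    def __init__(self, label: str):
        self.label = label

    def describe(self) -> str:
        return f"Button({self.label!r})"

    def click(self) -> str:
        return f"{self.label} clicked"


def log(item: Describe) -> str:
    # Accepts anything with a describe() method, wherever it sits in a hierarchy.
    return item.describe()
```

Nothing here forces `Button` into a "GObject > Widget > Button" style tree; it just provides the behaviors that callers ask for.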
Yup, and IO being async usually creates impedance mismatch in constructors.
Had to refactor quite a few anti-pattern constructors like this into `async Task<Thing> Create(...)` back in the day. No idea what was the thought process of the original authors, if there was any...
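The Python equivalent of that `async Task<Thing> Create(...)` refactor would be an async classmethod, roughly like this. `Thing`, `create`, and `_fetch_payload` are placeholder names, and the `asyncio.sleep(0)` is only standing in for real I/O:

```python
import asyncio


class Thing:
    def __init__(self, payload: str):
        # Plain assignment only; nothing here can block or fail.
        self.payload = payload

    @classmethod
    async def create(cls, name: str) -> "Thing":
        # All awaiting happens before the instance exists, so callers never
        # see a half-initialised Thing.
        payload = await _fetch_payload(name)
        return cls(payload)


async def _fetch_payload(name: str) -> str:
    # Stand-in for real async I/O (a network call, a file read, ...).
    await asyncio.sleep(0)
    return f"payload for {name}"
```

Callers write `thing = await Thing.create("demo")`, and the `await` makes it obvious at the call site that I/O is involved.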
I'd really need to think long and hard about it, but my initial feeling is that we'd attach them to data classes or a similar new construct. I don't think you'd want to reason about the blast radius with ordinary classes. Granted, that's more language complexity, creates two unequal systems, and gives you much more to reason about. There's a lot to juggle.
As much fun as putting a PEP together might be, I don't think I have the bandwidth to do so. I would really like to see traits in Python, though.
Why? An object encapsulates some state. If it doesn't do anything unless you call some other method on it first, that work should just happen in the constructor. Otherwise you've got one object that's actually two types: the object half-initialised and the object fully initialised, and it's very easy to confuse the two. Especially in Python, there's basically no situation where you're going to need that half-state because of some language restriction.
It's a lot easier to reason about code if I don't need to worry that something as simple as
my_variable = MyType()
might be calling out to a database with invalid credentials, establishing a network connection which may fail to connect, or reading a file from the filesystem which might not exist.
You are correct that you don't want an object that can be half-initialized. In that case, any external resources necessary should be allocated before the constructor is called. This is particularly true if the external resources must be closed after use. You can see my other comment[0] in this thread for a full example, but the short answer is use context managers; this is the Pythonic way to do RAII.
Works in exactly the same way? Errors get reported the same, it has the same semantics, you have the same need to keep track of things, it's just two lines instead of one and you're more likely to have a partially initialised object floating around. I really don't see the advantage at all.
Context managers are handy if you need to guarantee cleanup in simple object lifetimes, and in that case __enter__ is probably the better place to do setup because it's closer to where you get guarantees about the cleanup happening, but context managers are not actually RAII, just a partial substitute.
> might be calling out to a database with invalid credentials, establishing a network connection which may fail to connect, or reading a file from the filesystem which might not exist.
In Python it's not unusual that
import some_third_partylib
will do exactly that. I've seen libraries that will load a half gigabyte machine learning model into memory on import and one that sends some event to sentry for telemetry.
> Otherwise you've got one object that's actually two types: the object half-initialised and the object fully initialised, and it's very easy to confuse the two.
You said it yourself -- if you feel like you have two objects, then literally use two objects if you need to split it up that way, FooBarBuilder and FooBar. That way FooBar is always fully built, and if FooBarBuilder needs to do funky black magic, it can do so in `build()` instead of `__init__`.
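Roughly what that split could look like. The "host:port" parsing is only a stand-in for whatever fallible work the builder really does, and the fields are invented for the example:

```python
class FooBar:
    """Always fully built: __init__ just stores already-validated values."""

    def __init__(self, host: str, port: int):
        self.host = host
        self.port = port


class FooBarBuilder:
    def __init__(self, address: str):
        # Nothing is validated yet; this object only collects inputs.
        self.address = address

    def build(self) -> FooBar:
        # All the fallible work lives here, not in FooBar.__init__.
        host, sep, port = self.address.rpartition(":")
        if not sep or not host:
            raise ValueError(f"expected 'host:port', got {self.address!r}")
        return FooBar(host, int(port))  # int() raises ValueError on a bad port
```

If `build()` raises, no `FooBar` ever exists, so there is no half-initialised instance to confuse with a real one.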
I mean, you can do that (and it might be useful if you want to separate where the construction arguments come from and when the initialisation happens), but I don't understand why you wouldn't just do it in init instead of build. It's not all that special, it's literally just a method that's called when you make an instance of the class. If you need to do it before you use the class, just make it happen when you make the instance.
No thanks, that's how we end up with FactoryFactory. If the work needs to be done upon startup, then it needs to be done. If it is done in response to a later event, then it can be done later.
For me the point is that __init__ is special: it's the language's way to construct an object. If we want side-effecting code, we can use a non-special static constructor like
    class Foo:
        @classmethod
        def connect(cls):
            # _connect() does the side-effecting work and returns self
            return cls()._connect()
The benefit is that we can choose when we're doing 1) object construction, 2) side effects, or 3) both. The downside is that clients might still try to call __init__() directly, so object creation might need more documentation than it otherwise would.
Init isn't really special (especially compared to e.g. __new__). It has no special privileges, it's just a method that is called when you construct an instance of the class.
This is the exact opposite? They explicitly encourage doing resource-opening in the __enter__ method call, and then returning the opened resource encapsulated inside an object.
Nothing about the contract encourages doing anything fallible in __init__
It is a tragedy that Python almost got some form of RAII but then figured out an object has 2 stages of usage.
I also strongly disagree that constructors cannot fail. An object that is not usable should fail fast and stop the code flow as early as possible. Failing early is a good thing.
There is no contradiction between “constructors cannot fail” and “fail early”, nobody is arguing the constructor should do fallible things and then hide the failure.
What you should do is the fallible operation outside the constructor, before you call __init__, then ask for the opened file, socket, lock, what-have-you as an argument to the constructor.
Fallible initialisation operations belong in factory functions.
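As a sketch of that split: the class below takes an already-open stream, and a separate factory does the `open()` that can fail. `LogWriter` and `open_log_writer` are hypothetical names, and writing the factory as a context manager is one design choice among several, made here so the file also gets closed:

```python
from contextlib import contextmanager
from typing import Iterator, TextIO


class LogWriter:
    def __init__(self, stream: TextIO):
        # Receives an opened resource; nothing in here can fail.
        self._stream = stream

    def write(self, message: str) -> None:
        self._stream.write(message + "\n")


@contextmanager
def open_log_writer(path: str) -> Iterator[LogWriter]:
    # open() may raise OSError. It happens here, before any LogWriter exists,
    # so the failure is still early and loud.
    with open(path, "a", encoding="utf-8") as stream:
        yield LogWriter(stream)
```

A side benefit is testability: `LogWriter(io.StringIO())` works in a unit test without touching the filesystem.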
The real problem is that constructors and factory functions are distinct in the first place. They aren't, in Rust, and it's much easier to reason about and requires far less verbiage to write.
> Even `Path("config.json").read_text()` in a constructor isn't a good idea.
If that call is necessary to ensure that the instance has a good/known internal state, I absolutely think it belongs in __init__ (whether called directly or indirectly via a method).
You're right that consistent internal state is important, but you can accomplish this with
    class MyClass:
        def __init__(self, config: str):
            self._config = config
And if your reaction is "that just means something else needs to call Path("config.json").read_text()", you're absolutely right! It's separation of concerns; let some other method deal with the possibility that `config.json` isn't accessible. In the real world, you'd presumably want even more specific checks that specific config values were present in the JSON file, and your constructor would look more like
def __init__(self, host: str, port: int):
and you'd validate that those values are present in the same place that loads the JSON.
This simple change makes code so much more readable and testable.
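Putting the pieces together, here is one way the loading side might look. `parse_settings` and `load_my_class` are names invented for this sketch, and the exact validation rules are assumptions rather than anything the original code requires:

```python
import json
from pathlib import Path


class MyClass:
    def __init__(self, host: str, port: int):
        # Only stores already-validated values.
        self._host = host
        self._port = port


def parse_settings(text: str) -> tuple[str, int]:
    """Validate JSON text and return (host, port); raises ValueError if invalid."""
    data = json.loads(text)  # JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("config must be a JSON object")
    host = data.get("host")
    port = data.get("port")
    if not isinstance(host, str) or not host:
        raise ValueError("config needs a non-empty string 'host'")
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(port, int) or isinstance(port, bool):
        raise ValueError("config needs an integer 'port'")
    return host, port


def load_my_class(path: str = "config.json") -> MyClass:
    # The only place that touches the filesystem; may raise OSError.
    host, port = parse_settings(Path(path).read_text())
    return MyClass(host, port)
```

Tests can call `MyClass("localhost", 8080)` or `parse_settings(...)` directly with no file on disk, which is most of the readability and testability win.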