I have a few internal services on which I like to crank transport security to 11: no port 80, only TLS 1.3, only modern ciphers. You'd be surprised how much confusion not opening port 80 caused among technical people. And I've learned a lot about the TLS versions and ciphers supported by the various Windows Server versions from this crusade.
And that's with experienced admins and developers. Doing this with our average B2B customer? Hah, oh dear.
I've written, tested and debugged low-level Java concurrency code involving atomics, the Java memory model and other nasty things. All the way down to considering whether data races are an actual problem or just redundant work, and similar questions. I also implemented coroutines in some compiler/language coursework at university.
This level is rocket science. If you can't tell why it is right, you fail. Such a failure, which turned out to be a single missing synchronized block, was the _worst_ 3-6 month debugging horror I've ever faced: individual data corruptions about once a week, on a system pushing millions upon millions of player interactions in that time frame.
We first designed it with many smart people playing adversary and trying to break it. Then one guy implemented it, 5-6 really talented Java devs reviewed it entirely destructively, and then all of us worked with hardware to build testing setups to break the thing. If there was doubt, it was wrong.
We then put that queue live: it sequentialized within a single partition (i.e. one user account) but parallelized across as many partitions as possible, and it just worked. It just worked.
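The core idea of such a queue can be sketched in a few lines. This is a minimal illustration, not the original system: names are mine, and per-key future chains are just one of several ways to serialize per-partition work (note the sketch never cleans up the per-key tails, which a real system would have to).

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch: tasks for the same partition key run strictly in submission order,
// while tasks for different keys run in parallel on the common pool.
// Per key we keep the tail of a chain of CompletableFutures; appending to
// that chain is what serializes the work.
public class PartitionedQueue {
    private final ConcurrentMap<String, CompletableFuture<Void>> tails = new ConcurrentHashMap<>();

    public CompletableFuture<Void> submit(String partitionKey, Runnable task) {
        return tails.compute(partitionKey, (key, tail) -> {
            CompletableFuture<Void> prev =
                (tail == null) ? CompletableFuture.completedFuture(null) : tail;
            // Runs only after the previous task for this key has completed.
            return prev.thenRunAsync(task);
        });
    }
}
```

Two tasks for "user-1" can never interleave, so the per-user state never needs locks; tasks for "user-1" and "user-2" still run concurrently.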
We did similar work on a caching trie later on with the same group of people. But during these two projects I very much realized: This kind of work just isn't feasible with the majority of developers. Out of hundreds of devs, I know 4-5 who can think this way.
Thus, most code should be structured by lower-level frameworks so that it is not concurrent on data. Once you're concurrent on individual pieces of data, the complexity explodes. Just don't be concurrent, unless it's trivial concurrency.
That’s why the actor model is so good: you have concurrent programs, but each actor has full ownership of its data and can access it as if it were single-threaded.
In my opinion, it’s the only way to get this right, and it should only be replaced with low-level atomics if performance proves to be much better and that impacts the business strongly, which I have never seen in practice.
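A toy actor makes the ownership point concrete. This is a hedged sketch (no real framework, names are mine): one thread drains a mailbox, so the actor's state is only ever touched single-threaded and needs no locks or atomics.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

// Minimal actor: messages are closures, processed one at a time by a single
// mailbox thread. `count` is owned exclusively by that thread.
public class CounterActor {
    private final BlockingQueue<Runnable> mailbox = new LinkedBlockingQueue<>();
    private long count = 0; // never touched outside the mailbox thread

    public CounterActor() {
        Thread t = new Thread(() -> {
            try {
                while (true) mailbox.take().run();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        t.setDaemon(true);
        t.start();
    }

    public void increment() { mailbox.add(() -> count++); }

    // Reads also go through the mailbox, so they see a consistent state.
    public CompletableFuture<Long> get() {
        CompletableFuture<Long> f = new CompletableFuture<>();
        mailbox.add(() -> f.complete(count));
        return f;
    }
}
```

Any number of threads can call `increment()` safely; the thread-safe queue is the only synchronization point, and `count++` stays plain single-threaded code.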
> A good example: Here in North America I'll jaywalk without a thought if there's no traffic. In Germany, you'll get grandmothers calling you a child-killer for setting a bad example if you did the same.
This varies wildly within Germany. In Hamburg at 7-9 in the morning, near schools or kindergartens with kids around, most people follow good traffic behavior. At 9 on a university campus, or at 9 at night, no one really cares.
> If this certificate is so critical, they should also have something that alerts if you’re still serving a certificate with less than 2 weeks validity - by that time you should have already obtained and rotated in a new certificate. This gives plenty of time for someone to manually inspect and fix.
This is also why you want a mix of alerts from the service user's point of view as well as internal troubleshooting alerts. The user's-point-of-view alerts usually give more value, and they can be surprisingly simple at times.
"Remaining validity of the certificates offered by the service" is a classic check from the user's point of view. It may not tell you why things are going wrong, but it tells you that something is going wrong. It captures a multitude of different possible errors: certs not reloading, the wrong certs being loaded, certs not being issued, DNS pointing to the wrong instance, new, shorter cert lifecycles, outages at the CA, and so on.
And then you can add further checks into the machinery to speed up finding out why: checks that the cert-creation jobs run properly, checks whether the certs on disk / in the secret store are actually loaded, ...
Good alerting solutions might also allow relationships between these alerts to simplify troubleshooting: don't alert on cert expiry if there is a failed cert-renewal cron job; alert on that instead.
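The alerting decision itself is tiny. A sketch under assumptions: in a real probe you would TLS-connect to the service and read the peer certificate's `notAfter` (e.g. via `SSLSocket`'s session and its peer certificates); here only the threshold logic is shown, with a hypothetical 14-day cutoff.

```java
import java.time.Duration;
import java.time.Instant;

// User's-point-of-view check: alert when the served certificate has less
// than ALERT_THRESHOLD of validity left, regardless of why renewal failed.
public class CertExpiryCheck {
    static final Duration ALERT_THRESHOLD = Duration.ofDays(14); // assumed policy

    public static boolean shouldAlert(Instant notAfter, Instant now) {
        return Duration.between(now, notAfter).compareTo(ALERT_THRESHOLD) < 0;
    }
}
```

Because the input is whatever certificate the service actually serves, this one boolean covers renewal bugs, reload bugs, wrong-cert bugs and CA outages alike.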
I listen to a lot of music on the side, but Chris Boltendahl of Grave Digger said something that stuck with me. Btw, Grave Digger are not making heavy metal inspired by heavy metal; they were there, making heavy metal, in the 80s :)
Paraphrasing: with all of the streaming and easy access to music, music has turned into fast food. Eaten on the side, but rarely fully appreciated these days.
And for new albums of bands I follow (or if I want to have a good time), I do exactly that: If the weather permits, get a hammock, a good drink, the good headphones (yes, I have several levels of quality of headphones), and just look at the sun, the trees and the magpies while listening to the music. Improving my own guitar skills has only deepened this appreciation.
> Sitting at a live concert (I am thinking classical) is up there too, because you've given yourself permission to not think of/work on anything else in that time
At least in metal, and to me, concerts are a different beast than the record. The record is usually the best and most perfect take of a song, often with additional effects and a better mix. If you want to hear the best version of a song, it's usually on the record.
Concerts are a party. It's always amusing how different concert cultures can be: I know people who complain that they "can't hear the singer over someone next to them shouting". In my world, that's kind of the point of a live celebration of the band and the music.
This is something that hit me for the first time during my master's thesis: I was entirely free to choose my work time and work mode. Since the prof was very hands-off too, only the result after 6 months mattered. That was quite the weird time. But it taught me a few things about work-life balance and choosing what is 'enough work', as well as about giving your brain time off, or time on something else, to work through more complex topics and to recharge.
This is also why I honestly enjoy being a salaried employee. My employer buys 40 hours a week from me. Sure, some weeks it's 50 and the next week only 30. Some weeks need a machine just executing, some weeks need more careful thought.
I could optimize for more monetary output, but at the moment it is a predictable, usually not-painful thing with decent monetary output, spent on personally more interesting subjects. I've come to appreciate this.
I totally failed on the balancing part and worked on the thesis like a runaway diesel engine. I could sleep and step away from the word processor, but it consumed my thoughts at all times. Towards the end, the stress started to spill over into the physical domain, and it took about 4 months after the submission and thesis defense (no snakes, fortunately) for the symptoms to subside.
Personally, I only trust an image manipulation tool to put down solid colored blocks, or something that does not involve the source pixels when deciding on the redacted pixels. Formats like PDF are just too complicated to trust.
One thing this is missing: standardization, and probably ECS's (Elastic Common Schema) idea of "related" fields.
A common problem in log aggregation is the question of whether you query for user.id, user_id, userID, buyer.user.id, buyer.id, buyer_user_id, buyer_id, ... Every log aggregation ends up plagued by this. You need standard field names, or it becomes a horrible mess.
And for centralized aggregation, I like ECS's idea of "related" fields. If you have a buyer and a seller, both with user IDs, you'd have a `related.user.id` with both IDs in there. This makes it very simple to say "hey, give me everything related to request X" or "give me everything involving user Y in this time frame" (as long as this is kept up to date, naturally).
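A small, hypothetical event shows the shape (field names follow the comment's `related.user.id` example, denormalized ECS-style; the event itself is made up):

```java
import java.util.List;
import java.util.Map;

// Hypothetical log event: buyer and seller IDs stay in their own namespaces,
// but are also copied into related.user.id, so "everything involving user Y"
// becomes a single-field query instead of an OR over every ID-bearing field.
public class RelatedFieldsExample {
    public static Map<String, Object> orderEvent(String buyerId, String sellerId) {
        return Map.of(
            "event.action", "order.created",
            "buyer.user.id", buyerId,
            "seller.user.id", sellerId,
            "related.user.id", List.of(buyerId, sellerId) // denormalized query index
        );
    }
}
```

The cost is the redundancy the comment mentions: every producer has to keep the `related` field in sync with the specific fields, or the cross-cutting query silently misses events.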
I actually wrote my bachelor's thesis on this topic, but instead of going the ECS route (which still has redundant fields in different components) I went in the RDF direction. That system has shifted towards more of a middleware/database hybrid over time (https://github.com/triblespace/triblespace-rs). I always wonder if we'd even need logging if we had more data-oriented stacks where the logs fall out as a natural byproduct of communication and storage.
I've always wondered why we don't have some kind of fuzzy English-word search regex/tool that is robust to keyboard typos, spelling mistakes, synonyms, plurals, conjugation, etc.
I have such fond memories of Graphite's simplicity. Simply hit the server with a metric and value and BOOM it's on the chart, no dependencies, no fuss.
Yeah, "haven't most of the users already moved to Prometheus?" was my first reaction.
Turns out that the name's been re-used by some sort of slop code review system. Smells like a feature rather than a product, so I guess they were lucky to be acquired while the market's still frothy.
I'm assuming you meant statically/dynamically type checked languages.
Generic functions do not ignore types. An `inThere :: a list -> a -> bool` very much enforces that the list passed in and the element have the same type. With a sufficiently powerful type system, this allows for statically checked code that's not much less flexible than dynamically checked code.
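The same shape can be written with Java generics (my sketch; Java's inference will widen the type parameter to a common supertype of the arguments, so the guarantee is weaker than the ML version, but the principle is the same):

```java
import java.util.List;

// Java rendering of `inThere :: a list -> a -> bool`: the compiler ties the
// list's element type and the probe to one type parameter A, while the body
// stays fully generic and works for any element type.
public class Generics {
    public static <A> boolean inThere(List<A> xs, A x) {
        for (A candidate : xs) {
            if (candidate.equals(x)) return true;
        }
        return false;
    }
}
```

The body never inspects what `A` is; the type system checks the relationship between the arguments, which is exactly the "statically checked but still flexible" point.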
Observing current developments in Python, and also in Rust, gives me the impression that dynamically typed languages were more a reaction to the very weak type systems that languages like C or Java provided back in the day. A lot of Python code has very concrete or rather simple generic types, for example: protocols, unions, first-class functions and type parameters handle a lot. The tools to express these types better existed in e.g. Caml or Haskell, but weren't mainstream yet.