Hacker News | twa927's comments

I think DSLs are about defining a 100% textual format. And spreadsheets are GUI apps. There's a DSL for defining a cell formula but in total it's a small part of the experience of using a spreadsheet app.


> "I think DSLs are about defining a 100% textual format."

That's an unjustified qualification. I could throw together a spreadsheet format that is all text. The spreadsheet GUI then becomes an advanced text editor that, when editing that particular format, exposes advanced content-aware controls, not at all unlike what advanced text editors like Emacs can do for s-expressions.

We can bridge the gap in other ways too, for instance the '2d' Racket language that lets you do control flow using a two-dimensional ASCII art grid: https://docs.racket-lang.org/2d/index.html It's not hard to see how this concept could be iterated on to become something quite like a spreadsheet, and with editor support the editor/language combination would begin to look a lot like a spreadsheet too.
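To make the "all-text spreadsheet format" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the pipe-separated layout, the `=A1+B1` formula syntax, and the `parse` / `evaluate` helper names are hypothetical and not taken from org-mode, Racket's 2d language, or any real spreadsheet.

```python
import re

# Hypothetical all-text sheet: one row per line, cells separated by "|".
# A cell starting with "=" is a formula built from + - * / and refs like A1.
SHEET = """\
10 | 20 | =A1+B1
3  | 4  | =A2*B2
=C1+C2 |  |
"""

CELL_REF = re.compile(r"\b([A-Z])([0-9]+)\b")


def parse(text):
    """Split the text into a grid of stripped cell strings."""
    return [[cell.strip() for cell in line.split("|")] for line in text.splitlines()]


def evaluate(grid, row, col, _seen=None):
    """Return the numeric value of one cell, following references recursively."""
    seen = _seen or set()
    if (row, col) in seen:
        raise ValueError("circular reference")
    raw = grid[row][col]
    if raw == "":
        return 0.0
    if not raw.startswith("="):
        return float(raw)

    def lookup(match):
        # Column letter -> index, 1-based row number -> index.
        ref_col = ord(match.group(1)) - ord("A")
        ref_row = int(match.group(2)) - 1
        return repr(evaluate(grid, ref_row, ref_col, seen | {(row, col)}))

    expr = CELL_REF.sub(lookup, raw[1:])
    # After substitution only numbers, operators and parentheses may remain.
    if not re.fullmatch(r"[0-9eE.+\-*/() \t]*", expr):
        raise ValueError(f"unsupported formula: {raw}")
    return eval(expr, {"__builtins__": {}})


grid = parse(SHEET)
total = evaluate(grid, 2, 0)  # cell A3, which sums C1 and C2
```

The file stays editable in any plain editor; a dedicated GUI would only add navigation and recalculation on top of the same text.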


> I could throw together a spreadsheet format that is all text. […] The spreadsheet GUI then becomes an advanced text editor that, when editing that particular format, exposes advanced content-aware controls, not at all unlike what advanced text editors like Emacs can do for s-expressions.

Emacs already has that, included in org-mode: https://orgmode.org/org.html#Tables

(Emacs also has an even more spreadsheet-like mode, ses-mode, but that saves its data in a less-than-purely-textual format. Or, at least, less textual than org-mode.)


> That's an unjustified qualification. I could throw together a spreadsheet format that is all text. The spreadsheet GUI then becomes an advanced text editor that, when editing that particular format, exposes advanced content-aware controls, not at all unlike what advanced text editors like Emacs can do for s-expressions.

If you could invent a text format that could be efficiently edited with a basic text editor then I would agree it's a DSL. But I feel like you would lose a lot by dropping a dedicated GUI, e.g.:

- horizontal scrolling of columns, adding, hiding columns

- "smart copying" a formula by scrolling down

- selection of rows/columns/cells

I don't think anybody would use such a DSL with a basic editor.

Overall it's a discussion about the definitions of terms, but I don't understand why people want to label anything that has some "editable format" a "DSL" when additional terms like "visual programming" allow more differentiation.


What about Unreal Blueprints or Dynamo BIM? A visual GUI doesn't make them any less DSL-ish.



Don't confuse a deficiency of a Wikipedia article with the definition of that thing.


So where should I look up the definition of a DSL?


Let's take the Wikipedia definition, not the examples list:

> A domain-specific language (DSL) is a computer language specialized to a particular application domain.

where computer language links to a list of things that includes "programming languages", which visual programming languages are a part of.

Other sections of the article also mention some graphical examples, e.g. UML. (which isn't a programming language, but a modeling language, also used sometimes as an input to software)


> I think DSLs are about defining a 100% textual format.

Spreadsheets are an argument for thinking outside of that particular box.


Spreadsheets are about thinking inside of a lot of little general boxes, instead of just one big particular box.


Can someone provide actual high-level use cases for using Kafka? Preferably use cases not handled by RabbitMQ.

I've seen a few talks about Kafka but they focused on the internals. My guess is that Kafka is for large systems for which managing a multi-node RabbitMQ cluster is too much trouble.


I’ve long had the inverse view - I’m not sure what good use cases there are for RabbitMQ that couldn’t be handled better by a Kafka cluster.

One company I worked with used Kafka as their central source of truth across the organisation. All events generated by users were thrown into a massive Kafka cluster. Each team in the organisation cared about a different view into that data (financials, marketing, fraud, what we display to that user on the website, etc). Each team would ingest the same Kafka queue and do different things with it - often consuming certain events into their own Postgres instance, or other things like that.

I used Kafka when I made my reddit r/place clone a few years ago because it gives great read and write amplification. With Postgres as a central source of truth, you can only handle thousands of writes per second. And reads will slow down the instance. With Kafka you can handle about 2M/sec. And reads can really easily be serviced from other machines - you can just have a bunch of downstream Kafka instances consuming from the root, and serving your readers in turn.

It may be that you can also solve all these problems with a well-configured RabbitMQ cluster. But coming from a database world I find it more comfortable to reason about architecture, performance and correctness with Kafka.


Size? If you’re getting less than a few hundred events a minute is it worth setting up Kafka?


This is the main reason I don’t use much Kafka in my own projects. I hope at some point someone makes a redis equivalent of Kafka for small projects.

Is Rabbit much easier to set up for small projects? I haven’t used it much.


You might be interested in Redis Streams[1], it's basically Kafka in Redis.

[1] https://redis.io/topics/streams-intro


If you're in AWS, you can use Kinesis, which is similar to Kafka. It also ties into a lot of their other offerings, such as:

* S3 - use Kinesis Firehose to take the contents of your Kinesis stream and time-partition it into files, either for ingestion into Redshift, Elasticsearch, etc., or for later batch analysis for ML, or just to treat as cold searchable storage with something like Athena

* DynamoDB - spit out the data into Kinesis from DynamoDB as it changes to create a change stream used elsewhere in your platform. (DynamoDB Streams)

* real-time analysis - perform real-time SQL analysis (Kinesis Analytics) on what's in your stream over a given window of time or data, and react as events/situations occur.

Looking at all the services that Amazon has built around Kinesis might help you understand some of the differences between Kafka and something like RMQ.


Sounds like your org used Kafka for event sourcing. This is almost always a bad idea; event sourcing and aggregate reconstruction are a nightmare IMO.

Kafka used as a pure FIFO cache for regular CRUD endpoints works fine


Event sourcing was one of LinkedIn's use cases when they created it. Kafka is fine for all logging needs.


Yes; they did. It worked pretty well actually.

Why do you think it’s a bad idea? Most of the arguments against event sourcing that I’ve read seem to be “yes but the tooling isn’t very good”. That might be true, but maybe we solve that problem with more investment into event sourcing; not less of it.


TLDR the tooling is so bad it's basically impossible to run at scale. I worked for a company that tried. Maybe on a small scale it's fine, but replays and storage of past events take insane amounts of space at high event rates. To the point that storage costs and replay times became a real problem. (Many terabytes and days)

I also don't think it's a great idea in general. The event stream directly replicates a DB commit log, and the aggregates replicate your tables. It's building your own database.

We had to throw a year's worth of work away at the end so I'm fairly biased against trying it until the ecosystem is better.


Kafka is a high-throughput, horizontally-scalable blob data store for data streams. The data store part of that is my favorite part.

You can use it as a simple message broker, but since it keeps the message history as a time series, you can also do things like run batch analysis jobs on the day's messages or replay the last X hours of messages because your DB died and your backup is old.
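The retained-history point can be sketched without any Kafka client at all. The toy below is only an in-memory model under stated assumptions: `RetainedLog` and `Consumer` are hypothetical names, and real Kafka adds partitions, retention limits and persistence on top of this idea.

```python
from dataclasses import dataclass, field


@dataclass
class RetainedLog:
    """Append-only list of messages; reading never removes anything."""
    messages: list = field(default_factory=list)

    def append(self, message):
        self.messages.append(message)
        return len(self.messages) - 1  # offset of the message just written

    def read(self, offset):
        return self.messages[offset:]


@dataclass
class Consumer:
    """Each consumer tracks its own offset, so consumers do not affect each other."""
    log: RetainedLog
    offset: int = 0

    def poll(self):
        batch = self.log.read(self.offset)
        self.offset += len(batch)
        return batch

    def seek(self, offset):
        # Rewind (or skip ahead) by moving only this consumer's position.
        self.offset = offset


log = RetainedLog()
for event in ["signup", "purchase", "refund", "purchase"]:
    log.append(event)

realtime = Consumer(log)   # e.g. an operations team following the stream
batch_job = Consumer(log)  # e.g. a BI job that reads everything later

first_pass = realtime.poll()
log.append("logout")
second_pass = realtime.poll()  # only what arrived since the last poll
batch_view = batch_job.poll()  # the full history is still there
realtime.seek(1)
replayed = realtime.poll()     # replay everything from offset 1 onwards
```

In a broker that deletes messages once they are delivered, neither the late batch read nor the replay would have anything left to read.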

It is a good way to decouple data producers and data consumers, particularly in an enterprise context - producers push to Kafka and anyone can consume that data, whether they are an operations team that wants a realtime data stream, a BI team that needs periodic data dumps, or a team that wants a long-term audit trail (the duration of the history is going to depend on your scale, but for many users a long history is realistic).

Kafka also has a nice ecosystem including streaming analytics (KSQL), clients that make reading from Kafka easily horizontally scalable (have many machines acting as a single client, automatically rebalancing if one of those machines dies), exactly once processing and probably more since I last worked with it.

I'm not familiar enough with RabbitMQ to say how it compares to Kafka, but I haven't found a use case yet where Kafka isn't a good choice (except for the 'I need to set up a message broker quick and painlessly' use case because it is not a particularly fun technology to manage yourself)



I skimmed over it but it's again mostly about internal design. The high-level use case I see is "publish-subscribe", which is handled by RabbitMQ and a dozen other solutions.

One use case I see is that the events published into Kafka are persisted, so e.g. some component can see a history of some events (so this is something not handled by RabbitMQ). Is that right?


Event streaming between decoupled systems is kind of the sweet spot for Kafka. It's extremely easy to scale horizontally, and it handles distributed work and network partitions in an easy-to-reason-about way. I've also seen Rabbit be the bottleneck before, whereas I've never really seen Kafka be the bottleneck in an architecture - it's very analogous to a firehose. For organizations shuttling messages and events between teams, it makes a very convenient lingua franca.


I often do the same thing... skim through articles and papers to get the gist. Trust me, this is one to read all the way through.


That's correct - being able to "rewind" a history of a topic (queue) is a powerful concept. But, Kafka is a bit harder to operate than RabbitMQ in my experience. (Somewhat related, one of our subsystems originally was built around Kafka, but later was migrated to RabbitMQ and Postgres)


A financial exchange - order messages are routed to Kafka and partitioned by the instrument's symbol; match engines associated with a given set of symbols consume from their assigned partition. When a match engine goes down it can reconstruct the order book by replaying from a given offset.
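A rough sketch of that pattern, again as a pure-Python model rather than real Kafka code. The assumptions here are mine: `zlib.crc32` is just a stand-in for whatever partitioner a real client uses, the message fields are made up, and "order book" is reduced to a dict of open orders.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical partition count


def partition_for(symbol, num_partitions=NUM_PARTITIONS):
    """Stable key -> partition mapping, so one symbol always lands in one partition."""
    return zlib.crc32(symbol.encode("utf-8")) % num_partitions


def apply_order(book, order):
    """Apply one order message to an in-memory book keyed by order id."""
    if order["type"] == "new":
        book[order["id"]] = (order["symbol"], order["side"], order["price"], order["qty"])
    elif order["type"] == "cancel":
        book.pop(order["id"], None)


def rebuild(partition_log, from_offset=0, snapshot=None):
    """Replay messages from from_offset on top of a snapshot taken at that offset."""
    book = dict(snapshot or {})
    for order in partition_log[from_offset:]:
        apply_order(book, order)
    return book


orders = [
    {"type": "new", "id": 1, "symbol": "ACME", "side": "buy", "price": 10.0, "qty": 100},
    {"type": "new", "id": 2, "symbol": "ACME", "side": "sell", "price": 10.5, "qty": 50},
    {"type": "new", "id": 3, "symbol": "INIT", "side": "buy", "price": 5.0, "qty": 10},
    {"type": "cancel", "id": 1, "symbol": "ACME"},
    {"type": "new", "id": 4, "symbol": "ACME", "side": "buy", "price": 9.9, "qty": 20},
]

# Producer side: route every message by its symbol.
partitions = defaultdict(list)
for order in orders:
    partitions[partition_for(order["symbol"])].append(order)

# Match-engine side: after a crash, rebuild state from the assigned partition.
acme_log = partitions[partition_for("ACME")]
recovered_book = rebuild(acme_log)

# With a snapshot taken at offset 2, only the tail needs to be replayed.
snapshot_at_2 = rebuild(acme_log[:2])
recovered_from_snapshot = rebuild(acme_log, from_offset=2, snapshot=snapshot_at_2)
```

Because every message for a symbol goes through the same partition in order, replaying that partition is enough to get back to the same state.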


It's basically a high-speed transaction log, persistent, distributed, easily scalable, that happens to store messages and so does message brokering very well.


For some reason many "enterprise" software products are clunky and buggy. I guess this is because the decision to use the software is made by managers who don't ever use it.


I think it is more to do with the fact that "enterprise" software vendors have done the boring legwork of ticking all of the necessary multi-jurisdiction legal tick-boxes, implementing that one insane export format, providing reports required just-so by a given regulator, and so on. The value they provide to the business is outside of the software in that sense.


It's good that the managers care for the higher-level "business" goals, but shouldn't they care for their subordinates more? In the end it also contributes to a business metric - employee happiness. When I was switching a corporate job to a startup job the fact that I switched from a corporate Outlook and a proprietary issue software to GMail and Github made me happier.


I suppose my point is ultimately that they're not goals, they're requirements, and they are often imposed from the outside by legal or other official bodies. In that context caring doesn't really come into it. In terms of the vendors of shitty enterprise software (and attempting to answer your original question), if you've spent a bunch of time, money and effort making the Venn diagram of certifications and industry accreditations and implementing niche proprietary licensed file formats converge into your product, you may not have the will, cash, or capability to produce quality software with flavour-of-the-month UX too.


> It's good that the managers care for the higher-level "business" goals, but shouldn't they care for their subordinates more?

Ideally, yes; rationally, given the incentives of the concrete economic system they exist in, no. Managers work for higher level managers, or at the highest level for capital owners of the business, not for their employees except to the extent that the employees are also capital owners of the business.

Managers aren't union reps for their subordinates, they are agents of capital. That's, literally, their job.


This sounds like early 20th century capitalism... I don't think it's very applicable to the current programming industry, where companies have to fight to retain good employees. Also, the top-down structure is no longer preserved within Agile's "self-organizing teams". So in this landscape I think it's very reasonable to take seriously what programmers think about the given internal software.


> This sounds like early 20th century capitalism... I don't think it's very applicable to the current programming industry, where companies have to fight to retain good employees.

It absolutely does; just because capitalists are competing for labor doesn't mean that management switches from being agents of capital to agents of labor. Valuable, contested human resources are still resources, not owners.

> Also, the top-down structure is no longer preserved within the Agile's "self-organizing teams".

One of the most frequently reported problems with Agile in practice is that the idea of empowered, self-organizing teams is, even in organizations that give lip service to Agile development, given limited effect by management. In any case, that concept applies mainly to how teams deliver on business goals, not to setting business goals, so even ideally it would not prioritize staff opinions over business goals.


> just because capitalists are competing for labor doesn't mean that management switches from being agents of capital to agents of labor. Valuable, contested human resources are still resources, not owners.

Yes, I generally agree, but it goes as follows:

1. The business owners care only about achieving their business goals.

2. To achieve the business goals the owners need to have good employees.

3. To hire and retain good employees, some business decisions should take into consideration the needs of the employees.

So even from the purely economic point of view there should be some balance between 1. and 3.

And overall your purely economic point of view seems too rigid to me. One example that comes to my mind is the cultural shift happening at Microsoft. The company made many decisions with little business sense, like open sourcing many projects and providing free developer tools. But the effect is that developers like the company more. This has measurable effects like Azure's success or hiring better employees. I think this is an example of a bottom-up-built success which doesn't fit your top-down viewpoint.


There’s some truth in this, but my overwhelming sense is that previous poster is correct too. Enterprise software vendors almost never sell to their actual end users. While this is true, no one should be surprised that the UX on most enterprise software products is truly awful.

Also, <mild sarcasm warning> if you ship high quality working enterprise software on day one that actually delivers everything your customer needs, how will you bill them for a large “services” team engagement? It’s not unusual for this service money to make up a substantial portion of a mid to large enterprise software company’s revenues. All too often they are building features or fixing bugs that should have been in the original product as sold.

I’ve often sadly joked at my own work that if we shipped working software we’d probably make less money...


> For some reason many "enterprise" software products are clunky and buggy.

That's because “selling to enterprises” is a distinct competency from “making a working product”.

> I guess this is because the decision to use the software is made by managers who don't ever use it.

That's, often, an important part of the problem, but another, perhaps more significant, part of the problem is that enterprise level constraints on software purchasing decisions are often made by managers (or management workgroups with no single responsible party, or even directly by state or federal legislation and/or regulation from outside the purchasing agency [0]) who are (or, in the case of groups, consist of people who mainly are) remote in time and org-chart distance from both the decision to purchase software and the actual use of the software. This is perhaps most notoriously true in the public sector, where often the most critical competency for selling to an agency is the ability to navigate the acquisition policies applicable to that agency, which in the most complex of cases involve policies driven by both state and federal legislation and regulation by multiple state and federal control agencies as well as the internal policies of the agency actually doing the purchasing, but any large organization tends to have bureaucratic purchasing rules which, while usually well-intentioned and sometimes legitimately necessary to avoid even worse adverse effects, inadvertently reward vendors with competence in navigating the particular bureaucratic maze over those that lack such competence in a way that can at times outweigh competence in delivery.


The company I work for provides software to a very specific enterprise market. One of the most insightful moves the owner/CEO made was starting a business that uses the software right alongside the software company itself (in the building next door).

Members of the software company staff do week-long rotations working at that business just so they can get a real-life feel for the actual needs of the market, which has been really beneficial.


Also execs (both buyers and sellers) choose to underinvest over time.


It works well for tight loops processing a lot of data, or heavy object-orientation (multiple levels of class hierarchies). It probably won't work well for regular Django webapps or scripts. Also, real-world Python numerical/AI code uses numpy/ML libs so there's not much to optimize in Python...


CPython is mostly reference-counted.


With a synchronous garbage collector for cycles. Which is like the worst of both worlds, since you get the constant overhead of refcounting, plus unpredictable interruptions of unspecified duration that can happen every time a new object that might contain references to other objects is created.

To be fair, the GC can be disabled. But it's only safe to do so when you know there are no cycles, and even when such guarantee can be had for your own code, I've never seen a library guarantee that to API clients.
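The split between refcounting and the cycle collector can be observed directly with the standard `gc` and `weakref` modules. This is a small sketch that assumes CPython (other implementations such as PyPy free objects differently); the `Node` class and `make_cycle` helper are made up for the demonstration.

```python
import gc
import weakref


class Node:
    def __init__(self):
        self.other = None


def make_cycle():
    """Create two objects that reference each other and return a weak probe to one."""
    a, b = Node(), Node()
    a.other, b.other = b, a
    return weakref.ref(a)


gc.disable()  # switch off the automatic cycle collector only; refcounting stays on
try:
    # No cycle: the refcount drops to zero and the object is freed immediately.
    n = Node()
    probe = weakref.ref(n)
    del n
    freed_by_refcount = probe() is None

    # Cycle: refcounts never reach zero, so the pair lingers with the GC disabled.
    cycle_probe = make_cycle()
    survives_without_gc = cycle_probe() is not None

    # An explicit collection still works while automatic collection is disabled.
    gc.collect()
    freed_by_collector = cycle_probe() is None
finally:
    gc.enable()
```

This is why disabling the collector is only safe when neither your code nor the libraries you call create reference cycles.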


And reference counting is a form of GC


I was trying to speed up a log processing service running on PyPy by rewriting it in Java. I was surprised that the result was about twice as slow (I know Java quite well and I didn't see obvious optimizations; most of the time was spent in GC). So it can be quite fast even in more absolute terms (VM languages), at least for some types of code.


If more than 1% of your time is in GC you are doing something very wrong.


The fact that a singular implementation was better than java says less about the languages and more about the particular software.


I know that it’s asking a lot, but any chance that you can post a minimum reproducible sample? From what I know it is quite smelly...


I don't have access to this codebase now but I'll try to write some benchmark.


If you manage to do it, that would be awesome; otherwise thanks a lot anyway for the effort :) I’m just curious to understand why it happens, because it’s exactly the opposite of what I would expect. The only explanation that comes to my mind is excessive GC, as someone else already mentioned, but it would be interesting to see the original code.


I started doing it but yes, it's too much effort to get two full benchmarks, I'm sorry :). But I think it came down to the inefficiency of String.split(): https://stackoverflow.com/questions/37007189/string-split-te... and generally Java's built-in String methods not being GC-friendly: https://stackoverflow.com/questions/20336459/garbage-friendl.... I'm guessing that when such parts can be coded in a non-VM environment (CPython/PyPy runtime) they can be made much faster, and Java to this day has its standard library coded in pure Java?


Yes, Java has nothing like C# Span to avoid this kind of problem, but I thought that Python would also be affected in a similar way... Anyway, thanks for sharing.


CPython and PyPy are both VMs, similarly to the JVM.


I meant the inside of a VM (its implementation), which is coded in C/RPython. Java's VM is coded in C++, but I don't think C++ is used for any regular library functions, while C is used heavily for Python's stdlib.


Not all JVMs are implemented in C++.

It is a specification with multiple implementations, some of them are even bootstrapped in Java.



I had a binary parser written in Python that took around 30 seconds on typical input on CPython. PyPy took that down to about 10 seconds. Rewriting it in C# took it down to 200 ms.


If this was using a loop processing a single byte in an iteration I would expect a greater speedup on PyPy. I've seen 100x speedup in such cases.


Not single byte, but individual fields (float32/int32/string etc). Yes, I expected a much more significant speed-up as well. It's probably because a lot of that code was driven by reflection-type techniques.

Curiously, IronPython did better than anything (but still slow). Haven't tried Jython.

Compiling the whole thing with Cython was less effective than PyPy.


Yes, reflection voids many JIT paths in PyPy, AFAIK. Maybe it was worth rewriting the Python code to get rid of the reflection?


> The only reason for a KV cache is if you're running a slow language like Ruby, PHP, Python, where the memory overhead for objects is large and the speed of a remote cache is roughly the same as one in memory. If you find yourself needing a cache for good enough performance you shouldn't be using these languages in the first place. In memory caches don't work well when you need 100+ server instances to handle the load anyways

You're saying reading a serialized Python object from Redis and deserializing it is comparable to accessing a ready in-memory object? No, it's an order of magnitude slower. I think the reason in-memory caches are not used much in Python/Ruby is the concurrency model, i.e. in Python/Ruby you usually have to run multiple processes, whereas in Java you run multiple threads, so the in-memory cache can be just a regular data structure accessed normally.

Also, I think you are overestimating Python's memory overhead compared to Java - Python's int is 24 bytes and Java's Integer is 16 bytes (and using PyPy is another story - it can use several times less memory compared to Java).
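Both points are easy to check on your own machine. The sketch below is not a benchmark of Redis: it skips the network entirely and only compares a plain dict lookup with `pickle.loads` on already-fetched bytes, using a made-up `record` as the cached value. It also prints what `sys.getsizeof` reports for an int, which varies with the interpreter and version, so no numbers are claimed here.

```python
import pickle
import sys
import timeit

# Hypothetical cached value standing in for an application object.
record = {"id": 42, "name": "example", "tags": ["a", "b", "c"], "scores": list(range(50))}

local_cache = {"user:42": record}   # in-process cache: a regular dict
wire_bytes = pickle.dumps(record)   # what a remote cache would hand back


def read_local():
    """In-memory hit: returns the very same object, no copying."""
    return local_cache["user:42"]


def read_remote_style():
    """Remote-style hit, minus the network: rebuild the object from bytes."""
    return pickle.loads(wire_bytes)


lookup_seconds = timeit.timeit(read_local, number=20_000)
decode_seconds = timeit.timeit(read_remote_style, number=20_000)
int_size = sys.getsizeof(12345)  # bytes for one int object on this interpreter

print(f"dict lookup:  {lookup_seconds:.4f}s for 20000 reads")
print(f"pickle.loads: {decode_seconds:.4f}s for 20000 reads")
print(f"sys.getsizeof(12345) = {int_size} bytes")
```

Note that the deserialized value is a fresh copy on every read, which is a cost on top of the network round trip that an in-process cache never pays.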


Maybe it's time to start hosting the Olympics in the same place every time? And make all countries share the costs? There are multiple arguments for this: https://www.google.com/search?q=hosting+olympics+in+the+same...


At the very least the host should be chosen based on its current ability to host the games, not rely on tens of billions of dollars worth of construction projects to get there. LA is probably one of the few places with most of the venues already in place.

The other option would be to host different sports in different countries, so the burden gets spread out more. It's kind of ridiculous that countries have to build artificial rivers now and all kinds of ridiculous crap. Why not host the kayaking somewhere there is a natural river they can use?


That's one of the advantages of doing it in Los Angeles. There are already stadiums here, several of which have been used for previous Olympic games.


Maybe it’s time to forbid them from receiving any taxpayer funds and require them to take donations or sell tickets/merchandise/media rights only.


> "Previously, those were kind of off the books, under the table. You know, 'Don't ask don't tell deployments.' "But now suddenly, it's the AI team and they've got to be supported."

Are AI devs (Data Scientists?) really regarded that much higher by the management compared to the regular devs?

Do you think this is special or is it just the current fad that is large enough to reach the management?


In my experience, Developers who deviate from the norms are able to support themselves. Data Scientists are generally more dependent on support and infra from the rest of the org.


Yes I think so. At our place all data scientist/big data types are given high end MBPs and >2 good quality 4K monitors. All other typical IT devs are given a low-end 15W HP laptop and 2 HD monitors again from HP. Low quality Windows machines have been mandated right from top IT leadership. So I think data scientists are important enough if they can override or get exception to standard issue hardware/software.


That's interesting, I'm wondering if this happens also to e.g. "Cloud Specialists" or happened to "Object Orientation Specialists" 20 years ago...


Maybe. But I think it is mainly due to expensive-ass consultants selling management the idea of deriving deep insights out of terabytes of application logs. And since it is about big data, big hardware seems natural to management. Regarding OO Specialists, I remember that in the J2EE heyday stodgy Weblogic/Websphere tools would take 2GB RAM when 256MB was more of a standard. So management would approve this configuration for important enterprise projects done in J2EE.


Regular devs tend to have to target users. You can love X as much as you like, but if your users are on Y you will find it easier to be on Y as well. Devs that have a business case to not be on Y have been using X for years. The only thing different about AI devs is that there are enough of them to reach critical mass in some areas, as opposed to being the one weird guy.


Can someone say if the font rendering is improved compared to previous releases? This might sound like nitpicking but it's the thing that kept me from switching from Ubuntu, which generally has superb font rendering.


It's not nitpicking at all. Readability of text is hugely important, especially if your job is to read text on computer monitors all day.

Also, I saw this in the news a while ago: https://www.phoronix.com/scan.php?page=news_item&px=Fedora-C...

... so it looks like it's enabled, but not configured? If Fedora's freetype contains all the patches, perhaps the Reddit thread above is obsolete now?


> Also, I saw this in the news a while ago: https://www.phoronix.com/scan.php?page=news_item&px=Fedora-C....

Hmm, that's interesting, so this is the same ClearType algorithm as the one currently used in Windows 10? I actually hope Linux distros won't switch to Windows-like antialiasing, I definitely prefer Ubuntu's antialiasing (Windows' fonts are too thin and pixelated for me).


Have you tried using the built-in ClearType Tuner to tweak the rendering to your liking?


Yes, I ran it several times, but each time the effect wasn't noticeably better. And I would prefer to have an option for explicit configuration of the rendering properties (like "hinting: slight/medium/none") instead of going through a series of sample screens, because that gets really tiresome after a few passes...

I recently switched to a 4K/24" monitor and I actually cannot say that the Windows fonts are pixelated, but they are still too thin. And the larger fonts don't look good - as if the hinting was completely disabled for them.


It's the same, and it won't change due to licensing. You can change it on your own, however.

https://www.reddit.com/r/Fedora/comments/8bh7av/quick_font_r...


Didn't the MS patent PR puff piece solve shit like this?


What was the last version you used?

It's enormously improved over, say, Fedora 23. If you had issues with a newer version like, say, 28, I don't know. I don't personally have an issue with it.


We would have known if they didn't hate screenshots.

