I definitely get your point, but over-generalized comments like these are also dangerous.
Just as there are many MBAs who were or are veteran software developers, the HN community is large enough that there are many members who are professional investors.
You are correct that some know what they are talking about. But there are also many people who think that being a smart person in one field makes them a smart person in every field. It is disrespectful: it implies that they think their field is harder than yours and that yours must not be that hard to figure out. It is also very easy to spot.
I did not mean to imply that everyone is one-dimensional; I personally have professional experience in the finance and software industries and have respect for the people in them. But when some finance expert suddenly becomes an opinionated epidemiologist I call bullshit (a random example that has happened far too often over the past few years).
Sorry if it's a bit of an enterprisey response, but please reach out! We are supporting some non-cloud commercial customers on a case-by-case basis. The reason for this is that we find the support and maintenance burden to be much higher with a non-cloud delivery model, which isn't always a great experience for either party. We also have a managed product (where the data plane resides in your environment) that may work, depending on your infra and security requirements.
Re: custom code -- our codebase is fully source-available and open to contributions, but the source+sink code is going through some refactoring to make it more beginner-friendly. Depending on your consistency requirements, we also support Debezium and our own CDC format (https://materialize.com/docs/connect/materialize-cdc/) for folks who want to bring in their own data sources. (For quick prototypes, we also support csv/json/plaintext source types, as well as SQL INSERTs!)
Exactly! Since most ELT tools (including Airbyte) support json + csv output formats, those work perfectly well with Materialize out-of-the-box. I'm playing around with Slack+Stripe Airbyte sources to try and come up with some fun dashboards to show off in Materialize as we speak.
You are correct! Updates are expressed as a retraction and an insert that happen within the same timestamp.
An example may not be necessary but it might also help clarify. Assuming you're using the psql client to run "TAIL WITH (PROGRESS)", the logical grouping for a single update will be a set of rows like the following:
...
1608081358001 f -1 ['Lockal', '4590']
1608081358001 f 1 ['Epidosis', '4595']
1608081358001 f -1 ['Matlin', '5220']
1608081358001 f 1 ['Matlin', '5221']
1608081359001 t \N ['\\N', '\\N']
...
The first four rows all occur at the same timestamp, meaning that they should be applied atomically to maintain consistency of your dataset. In this case, my query is a top-10 query and Epidosis has now entered the top 10 while Lockal has dropped out of the top 10. Matlin remains in the top 10 but their total has gone from 5220 to 5221. The final example record is produced when you run with PROGRESS enabled and serves as an indicator that every timestamp before 1608081359001 is now closed, so no further updates will ever happen at timestamp 1608081358001.
I find that this stream of rows is very easy to convert to a data structure "{timestamp, inserts[], deletes[]}" and this, in turn, maps naturally onto reactive APIs, such as React or D3. My blog post, linked above, delves into this in more detail. Hope this explanation helps!
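As a rough sketch of that conversion (not the code from the blog post), here is one way the rows above could be grouped in Python. It assumes each row has already been parsed into a (timestamp, progressed, diff, data) tuple and that a progress row closes every strictly earlier timestamp; the function name group_updates and the tuple layout are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def group_updates(rows):
    """Group (timestamp, progressed, diff, data) tuples into per-timestamp batches.

    Yields {"timestamp", "inserts", "deletes"} dicts, but only after a
    progress row shows that the timestamp can no longer receive updates.
    """
    pending = defaultdict(lambda: {"inserts": [], "deletes": []})
    for timestamp, progressed, diff, data in rows:
        if progressed:
            # Assumption: a progress row closes every strictly earlier timestamp.
            for ts in sorted(t for t in pending if t < timestamp):
                batch = pending.pop(ts)
                yield {"timestamp": ts, **batch}
        elif diff > 0:
            # A positive diff is an insert, repeated diff times.
            pending[timestamp]["inserts"].extend([data] * diff)
        else:
            # A negative diff is a retraction, repeated -diff times.
            pending[timestamp]["deletes"].extend([data] * -diff)


# The example rows from above, parsed by hand.
rows = [
    (1608081358001, False, -1, ["Lockal", "4590"]),
    (1608081358001, False, 1, ["Epidosis", "4595"]),
    (1608081358001, False, -1, ["Matlin", "5220"]),
    (1608081358001, False, 1, ["Matlin", "5221"]),
    (1608081359001, True, None, [None, None]),
]
batches = list(group_updates(rows))
```

Holding each batch back until the progress row arrives is what lets a client apply the retraction and the insert for Matlin in one step, rather than briefly rendering a top 10 with a missing entry.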
Nothing we can share publicly at the moment, but if you reach out and chat, we're more than happy to give you some numbers that I think will address what you're looking for!
Do you happen to have any examples of real-time queries or apps you would be interested in?
Re: the second point — you’re right, Materialize has historically leveraged existing upstream systems (like Kafka) for things like persistence. But we also hear you loud and clear that not everyone wants to stand up Kafka :)
Yeah, I think there are a tremendous number of use cases for Materialize at companies that already have data store infra and want real-time analytics or similar.
However, I also think differential dataflow solves a big problem for smaller companies building out their MVP or in their early stages. Firebase is popular because it's easy to set up, and its real-time functionality on the client side means you don't need to write a client-side data management layer; you can just use Firebase's real-time functionality.
The issue is that Firebase is completely untyped, isn't relational, and has limited queries. So you end up writing gnarly non-transactional code that makes many round-trip requests to query basic stuff.
I think there may be an opportunity for a product that combines the performance and client-side tools of Firestore, the ease of use of Airtable, and the real-time query and materialized view functionality of Materialize into a database platform for businesses that want to scale their product.
Big ask obviously, but I know that a product like that would help me launch products much faster, and I'd pay a lot for it.