I wouldn't say HTAP is an across-the-board trend. The need is mostly in the enterprise, where we see high-value mixed workloads.
Neon will take a Postgres-centric POV. I'm considering a few approaches, but not committing to any particular one at this point. The reason not to commit is that there is a lot more opportunity for Neon in serverless, devtools, and edge, and it makes a ton more sense to partner for OLAP workloads. So what is a Postgres-centric POV?
- Extensions like Timescale and Citus
- Seamless CDC integration for sending changes to Snowflake, SingleStore, Databricks, ClickHouse, and MotherDuck
- FDWs (foreign data wrappers) for integration with Snowflake, SingleStore, ClickHouse, Databricks, and MotherDuck
- We looked at putting DuckDB on Neon storage with Jordan Tigani, but it was too big a lift, as DuckDB doesn't push all data through the WAL. Maybe in the future.
The reason we run OLAP benchmarks is that when you separate storage, full table scans get impacted: you fetch pages from remote storage one by one, and the mitigation is prefetching. So our goal is to be on par with vanilla Postgres. We don't care about winning OLAP benchmarks.
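To make the round-trip argument concrete, here's a toy cost model (not Neon's actual code; `round_trip_ms` and `window` are made-up parameters): fetching pages one by one pays a full round trip per page, while a prefetch window amortizes one round trip across many in-flight requests.

```python
# Toy sketch: why full table scans suffer on separated storage,
# and how prefetching mitigates it. Purely illustrative numbers.

def fetch_pages_one_by_one(page_ids, round_trip_ms=1.0):
    """Each page costs a full round trip: total = n * round_trip."""
    return len(page_ids) * round_trip_ms

def fetch_pages_with_prefetch(page_ids, round_trip_ms=1.0, window=32):
    """Keep `window` requests in flight; cost scales with ceil(n / window)."""
    batches = -(-len(page_ids) // window)  # ceiling division
    return batches * round_trip_ms

seq = fetch_pages_one_by_one(range(1024))
pre = fetch_pages_with_prefetch(range(1024), window=32)
print(seq, pre)  # 1024.0 vs 32.0: the window cuts round-trip cost 32x
```

Local Postgres never pays this penalty because pages come from local disk or shared buffers, which is why "on par with vanilla Postgres" is the bar here rather than any absolute number.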
ClickBench is still OK for comparing non-OLAP databases on OLAP workloads: filter away the column-oriented DBMSs, and you'll be left with a comparison of the non-OLAP databases.
I think it makes sense to add NeonDB for the sake of completeness.
I think lots and lots of benchmarks are useful internally. They highlight where the soft spots are. This is, however, very different from publishing benchmark results.
As a DBaaS vendor you should publish benchmarks your user base really cares about, ones that demonstrate the system is mature and can handle the core workload your users expect you to handle. That is why publishing OLAP benchmark results for an OLTP system makes very little sense.
(BTW, OLAP systems like ClickHouse should really publish TPC-DS SF1000+ results.)
For OLTP systems there is TPC-C, but it's actually not that telling, as it is so simple. I'm starting to see that people in OLTP care more and more about latencies, hence all the edge-related work and the optimization of the drivers.
So Neon will be publishing end-to-end latency benchmarks and also making sure it is on par with vanilla Postgres on a broad set of benchmarks. Those we will run internally, and the results will simply be commits in our GitHub.
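A minimal sketch of what "end-to-end latency benchmark" means here: measure the full client-side round trip many times and report percentiles, not averages. `run_query` below is a stand-in for a real driver call (e.g. a Postgres client executing `SELECT 1`); here it just sleeps.

```python
# Minimal end-to-end latency harness sketch. The stand-in run_query()
# sleeps instead of hitting a real database over the network.
import time

def run_query():
    time.sleep(0.001)  # placeholder for connect + network + query time

def measure_latencies(n=100):
    """Collect n end-to-end samples and report p50/p99 in milliseconds."""
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        run_query()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    return {"p50": samples[len(samples) // 2],
            "p99": samples[min(int(n * 0.99), n - 1)]}

print(measure_latencies(50))
```

The tail percentiles are the point: for interactive and edge workloads, a p99 spike from connection setup or a cold cache hurts users even when the average looks fine.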
(just want to say i am extremely impressed by your answers in this thread - very precise and clear)
> there is a lot more opportunity for Neon in serverless, devtools, and edge
serverless i get, but am not sure about the other two. i guess my conception of the database business is that it's a volume biz, i.e. mainly the terabyte-petabyte scale workloads are the valuable enterprise business. high-value mixed workloads are a subset of that, but i'd imagine "postgres++" would be a good value proposition as well (easy migration with no replatforming).
my impression is that devtools/edge would be relatively low volume. perhaps high margin but not enough to make up for low volume. do i have a misconception here?
- Tier 1: unique large-scale workloads: think moving money around. This is a ~$10-20 Bln market mostly dominated by Oracle, Microsoft, and IBM. All next-gen scale systems are kind of forced to play in this market. Deals are large, but sales cycles are long.
- Fleets of tier-2 apps. Each enterprise now has a fleet of every major database offering to power lots of apps. To win this market you need to be a top-3 database in the enterprise. This market is driven by developers; SQL Server takes most of the $$, closely followed by AWS, but Postgres has a real shot here. This is Neon's bet: after getting a wedge in the low-$$ SMB/hobbyist market, Neon takes the mid market and then moves to the enterprise. The size of this market is also $10-20 Bln.
- OEMs. This is another $5-10 Bln. Neon only cares about DBaaS but will happily embed in everything cloud.
trying to see how much you meant it that the industry is "adopting (HTAP) across the board"