andrea_s · on June 28, 2020

> Back to topic. In Russia it is believed that after taking Berlin soldiers should have fought Allies. That's liberation war for you.

And they were not the only ones: https://en.wikipedia.org/wiki/Operation_Unthinkable

sergeykish · on June 28, 2020

Sure, there were war plans back then, on both sides. But those in command understood what could and could not be done. Soviet soldiers knew about the Allies, both as a fighting force and as a source of support (food, trucks, etc.). It was a war of liberation; it is not clear whether they would have turned their weapons on the Allies or on their own stupid rulers.

I mean today, when hardly any soldier is left alive.

andrea_s · on April 10, 2020

Yandex ClickHouse also should be on the list!

andrea_s · on April 9, 2020

No pricing and no self-hosting option (either/or would be fine of course). A pretty neat tool for hobbies, maybe...

dylburger · on April 9, 2020

Hi there, Pipedream co-founder and engineer here. Paid plans are coming soon. We launched with a free tier during our beta to let developers experiment and to solicit feedback that improves the product for the longer term.

Would love if you had a chance to give it a spin. We're always eager to hear what can be improved.

andrea_s · on June 19, 2019

MongoDB is not well suited for OLAP-style workloads - have you considered Yandex ClickHouse?

andrea_s · on Feb 9, 2018

404 not found at the moment - does anyone have a snapshot?

edit: nevermind, it is available again - weird

andrea_s · on July 9, 2017

To be fair, Spark is way too slow to be used as a back-end database system (there's a big gap between "lightning speed computation", as they put it, and a common workload for a database serving data to a user-facing application).

Now of course Spark makes up for it with its great flexibility and scalability, but I do not really see the two technologies as competing ones.

And that's even without getting into the other parts of the data model (insert, update, delete) that do not exist in Spark (or "kind of" exist), by design.

andrea_s · on June 2, 2017

Kind of goes in the direction of what Thoughtspot (https://www.thoughtspot.com/) is doing (https://www.youtube.com/watch?v=D-y_EjFsDuk)

fudged71 · on June 2, 2017

As far as I care, Google Sheets has beaten several BI companies towards "Natural Language Querying" by being free and accessible to everyone.

amelius · on June 2, 2017

Looks like a good idea. But, where does it get its data from?

Does it perform NLP on company documents?

andrea_s · on June 2, 2017

Honestly I don't know, I do not work for Thoughtspot.

My assumption is that it works on metadata coming from a relational schema with a rule parser on top.

Honestly I don't see how it would work well enough if it was based on NLP of unstructured data.

andrea_s · on Nov 8, 2016

I tried it ~1 month ago and was quite underwhelmed - direct connectivity with Google products is nice, but the data manipulation capabilities themselves lag well behind Tableau (especially when it comes to clicking around in charts and tables to slice your dataset, which is admittedly something at which Tableau excels).

I do believe it is a valuable addition to the GA enterprise tier (especially for customers savvy enough to use BigQuery too), but at the moment I don't quite see it as a serious competitor to other off-the-shelf BI tools - would be very happy to be proved wrong in a few months, though.

andrea_s · on Oct 21, 2016

Perhaps that's closer to ClearScript (https://clearscript.codeplex.com/) in the .NET environment - although ClearScript has a broader area of application and supports multiple engines.

andrea_s · on Oct 18, 2016

It's a bit odd there's no mention of the PG columnar store in this article (https://www.citusdata.com/blog/2014/04/03/columnar-store-for...) - especially since it's from the same company.

It would be interesting to see how much the performance improves once you use cstore_fdw (especially since 1M records is quite small when talking about OLAP workloads).

disclaimer: I've never used cstore_fdw, but I have evaluated a number of columnar databases in the past.

ozgune · on Oct 18, 2016

(Ozgun from Citus Data)

We find that the primary motivation for using cstore is reducing disk I/O / storage footprint. cstore_fdw keeps a columnar layout on disk in compressed form and reads only relevant columns. For example, it's commonly used for data archival purposes.

That said, cstore_fdw doesn't yet make optimizations related to query planning and execution. We made experiments in that direction (https://news.ycombinator.com/item?id=8423825), but making those changes production ready is no small effort.

Since all benchmarks in this blog post are for in-memory data, I don't know how much they would benefit from cstore. If I have the time, I'll give it a try and update this comment with the results.
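To illustrate the disk I/O point about columnar layout, here is a toy Python sketch. It is not how cstore_fdw is implemented; the table, the function names, and the use of zlib are all assumptions made for illustration. It only compares how many bytes a single-column scan would touch in a row layout versus a per-column compressed layout.

```python
import zlib

# Hypothetical toy table: 1,000 rows with 3 columns (id, user, amount).
rows = [(i, f"user_{i % 10}", i * 0.5) for i in range(1000)]


def row_store_bytes_scanned(rows):
    """Row layout: reading one column still touches every full row."""
    return sum(len(repr(r).encode()) for r in rows)


def build_column_store(rows):
    """Column layout: each column is serialized and compressed on its own."""
    columns = list(zip(*rows))
    return [zlib.compress(repr(col).encode()) for col in columns]


def column_store_bytes_scanned(store, wanted):
    """Only the requested column indexes are read."""
    return sum(len(store[i]) for i in wanted)


store = build_column_store(rows)
row_bytes = row_store_bytes_scanned(rows)
col_bytes = column_store_bytes_scanned(store, wanted=[2])  # amount only

print(f"row layout scan:    {row_bytes} bytes")
print(f"column layout scan: {col_bytes} bytes")
```

The sketch says nothing about query planning or execution, and once the data is already in memory, as in the blog post's benchmarks, the saving in bytes read may matter much less.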

buremba · on Oct 18, 2016

I think cstore_fdw is not popular enough among Citus users. Only a few of their customers use it, since it's not trivial to use cstore_fdw in real-time workloads. Given that its use case is mainly analytics, that seems a bit odd though.