>well I can do the same thing with some scripts or spin up a web server.
That's a massive waste of your time and effort, and the maintenance costs balloon should you choose to create your own system. At the VERY least you should leverage existing free open-source tools such as Metabase, Superset, or Dash, or free tools such as Google Data Studio or Mode Analytics, if you're not going to spend cash on a tool like Periscope Data/Looker/Tableau. I mean this gently, but you likely underestimate the complexity of a reliable reporting/analytics infrastructure. Think about it this way - these tools are either collaborated on by a large open-source talent pool or built by teams of dedicated software engineers just as talented as you.
I've worked with quite a few companies in an analytics consulting role, and your "I can do the same thing with scripts" statement is one I've heard countless times. The long-term maintenance costs and technical debt (and "rigidity cost") of rolling your own analytics far outweigh the cost of a true analytics platform.
If you decide to roll your own anyway, look at tools like dbt and Airflow to reduce long-term maintenance costs.
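To give a flavor of what Airflow buys you over a pile of cron scripts, here's a minimal sketch of a two-step DAG (Airflow 2.x-style imports; the dag_id and the extract/load callables are made-up placeholders). Retries, logging, backfills, and alerting come from the scheduler instead of being your problem:

    # Minimal Airflow DAG sketch. extract_orders/load_warehouse are
    # hypothetical stand-ins for your own ETL steps.
    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract_orders():
        ...  # pull from your source system

    def load_warehouse():
        ...  # write into your warehouse

    with DAG(
        dag_id="nightly_etl",
        start_date=datetime(2019, 1, 1),
        schedule_interval="@daily",
        default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
    ) as dag:
        extract = PythonOperator(task_id="extract", python_callable=extract_orders)
        load = PythonOperator(task_id="load", python_callable=load_warehouse)
        extract >> load  # retries, logs, and backfills come for free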
Yeah, I work at a company in the analytics space and see that all the time. It piques the curiosity of people who are software developers (even though their core competency at their job is something else). They think it's a fun side project at work and go after it: write some Python scripts to do ETL and process the data, stand up a Postgres backend and a web server, then do charts in d3.js.
A year later they have a bunch of nice demos for their bosses but nothing they can actually use in production, because it crashes, there's no UI for interactive queries, no reports for people in the business groups, no user management, etc. Then they drop it because they're busy with their actual job. So the cost of that engineer's time spent on something that didn't work was about $20-30k over a year, while the product they could actually have used in production was around the same price.
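For the curious, the kind of script I'm talking about usually looks something like this (the endpoint, DSN, and table here are hypothetical). It works great in the demo - and it has no retries, no alerting, no backfill, no access control:

    # The classic DIY ETL script: fine in a demo, falls over in production.
    import json
    import urllib.request

    import psycopg2  # pip install psycopg2-binary

    def run_once():
        # One HTTP call, no retry/backoff - a transient 500 kills the nightly load
        with urllib.request.urlopen("https://api.example.com/events") as resp:
            events = json.loads(resp.read())
        conn = psycopg2.connect("dbname=analytics")
        with conn, conn.cursor() as cur:
            for e in events:
                # No schema migrations, no dedup, no late-data handling
                cur.execute(
                    "INSERT INTO events (id, payload) VALUES (%s, %s)",
                    (e["id"], json.dumps(e)),
                )

    if __name__ == "__main__":
        run_once()  # ...wired to a cron entry nobody monitors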
Out of genuine interest, and perhaps even a need for such a solution: is there anything that does what Metabase/Superset/Dash/GDS/Mode Analytics does, but for realtime data streams? For instance, parsing, recording, and visualizing the events coming from a websocket or some event queue/bus?
Yes, Perspective is built for streaming data - https://perspective.finos.org. It's an open-source streaming pivoting engine that operates in the browser (using WASM in a web worker, with Apache Arrow integration for ingesting binary data streams off websockets).
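If it helps make the idea concrete, here's a toy Python sketch of the same concept: consuming a (hypothetical) websocket feed and maintaining a running group-by. Perspective does the equivalent incrementally in WASM in the browser; this just illustrates the shape of the problem:

    # Toy streaming pivot: running per-symbol sums off a websocket feed.
    # The feed URL and message shape are hypothetical.
    import asyncio
    import json
    from collections import defaultdict

    import websockets  # pip install websockets

    async def stream_pivot(url="wss://feed.example.com/trades"):
        totals = defaultdict(float)  # symbol -> running notional
        async with websockets.connect(url) as ws:
            async for raw in ws:
                tick = json.loads(raw)  # e.g. {"symbol": "X", "price": 1.0, "size": 100}
                totals[tick["symbol"]] += tick["price"] * tick["size"]
                print(dict(totals))  # a real dashboard would re-render here

    if __name__ == "__main__":
        asyncio.run(stream_pivot())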
I believe you'd want a tool such as Grafana (which I believe is free/open source), which I have seen engineering teams use for realtime streaming. There is also Kibana, which I am less familiar with.
Those all work for that already, because they're just frontends to whatever backend system you're using. You use a database like BigQuery with real-time streaming inserts and rerun your queries to get fresh results.
If you want analytics directly on the stream, then there are plugins available to support reading the query results of something like Kinesis Analytics or Confluent's KSQL.
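A rough sketch of the BigQuery route, since it's surprisingly little code (the project/dataset/table names here are hypothetical):

    # Stream rows into BigQuery, then re-run a query for fresh results.
    from google.cloud import bigquery  # pip install google-cloud-bigquery

    client = bigquery.Client()
    table_id = "my-project.analytics.events"

    # Streaming insert: rows are queryable within seconds
    errors = client.insert_rows_json(table_id, [{"user_id": 1, "action": "click"}])
    assert not errors, errors

    # A dashboard frontend just re-runs this on a refresh interval
    query = f"SELECT action, COUNT(*) AS n FROM `{table_id}` GROUP BY action"
    for row in client.query(query).result():
        print(row["action"], row["n"])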
You might want to check out Striim - it wraps Kafka plus streaming transformers with a scripting language and a visual pipeline design tool. It then offers some really nice dashboards with a real-time “feel”.
Not affiliated in any way, but I took it for a spin and was happy to see how much I could get done with it in just a few hours, without prior knowledge.
How does Tableau make your reporting more reliable? My understanding is that Tableau can hook into different data sources -- like your warehouse and Salesforce, for example. Then you can write some kind of SQL to generate charts.
The auto-chart generation is nice. But what about Tableau makes it more likely to be accurate? Aren't you just as likely to make an error in the SQL in Tableau as you would be without it?
I never said that about Tableau in particular, which is why I listed half a dozen analytics platform solutions. Using any of those makes for a more powerful, flexible, shareable, usable tool than one engineer's self-rolled internal webpage with "analytics". There's nothing special about Tableau in particular; in fact, I categorically prefer SQL-based visualization tools over Tableau.
The only exception to "using any of these is better than creating your own" is large companies like Google and Facebook, where entire teams of engineers are dedicated to creating in-house SQL+visualization tools. It is absolute hubris for one engineer to think they can make a robust analytics platform!
Tableau is a GUI first. It's not designed for SQL-based access and barely supports it, other than as a type of custom data source.
Since people aren't typing code, it can be less error-prone to use, and it provides visual results beyond just a table, which can be useful for detecting anomalies in your data.