Hacker News | ClaireGz's comments

We've heard of Julius a lot, but didn't know about Cipher42; there are a few other folks around. We feel there's real pain here, and data teams are a bit abandoned at the moment when it comes to working with AI, so it makes sense. Curious to hear feedback about your journey building Cipher42: did you stop working on it?


Oh yes! That should be fairly easy: DuckDB is coming in the next release, and we can add SQLite too. I'm guessing you use SQLite to develop locally?


I can't really tell which databases are coming soon; a hover tooltip over the icons would be nice. Is SQL Server coming anytime soon? My coworkers are doing some data integrity work right now, and this might be a nice tool for them.


It's Databricks, Iceberg, and Redshift, which were the most requested in the first survey we did. But judging by this post and a broader audience, it seems SQLite wins! We'll also add SQL Server to the list.


Yes, I second this. I use SQLite locally and for prototyping data designs, so SQLite support is very useful indeed - not a deal breaker, but definitely a tick item.


Will also give nao a shot as soon as this is shipped. A LOT of non-corp data work happens in SQLite and DuckDB.
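The kind of local, non-corp work the comment describes is typically a few ad-hoc queries against a SQLite file. A minimal sketch with Python's stdlib `sqlite3` (the table and column names are made up for illustration):

```python
import sqlite3

# In-memory database standing in for a local SQLite file.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
con.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "EU", 120.0), (2, "US", 80.0), (3, "EU", 40.0)],
)

# A quick aggregate, the kind of ad-hoc local analysis in question.
rows = con.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('EU', 160.0), ('US', 80.0)]
```

DuckDB covers the same workflow for analytical queries over Parquet/CSV files, which is why the two so often appear together in local data work.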


Yes, just local, but I'd love to use Nao to quickly analyze datasets.


We support Postgres (and DuckDB is coming very soon), so yes, probably, since Hydra is a mix of both - but I'll have to try it.


Sweeeet. Let's give it a go!


Thank you!


We say "data vibing" so it feels unique to the data community! But in all seriousness, this is already an issue, people are already asking ChatGPT (or Cursor/whatever else) to generate SQL for them, but the next steps do not exist, if you "vibe code" for data you want to have the easiest feedback loop you can get to check if the output is good, and that's what we are working on: identifying the downstream impacts in the IDE and proposing fixes, a table diff view, new UI/UX to test your outputs.

The goal for us is to be the best way to do data with AI.


Ok, but how do you know it’s good?

With data I think that is very hard. I once wrote a SQL query (without AI) which ran and showed what looked like correct numbers, only to realise years later that it was incorrect.

When doing more complex calculations, I am not clear how to check if the output is correct.


Usually what we've seen is data people keeping notebooks/worksheets on the side with a bunch of manual SQL queries that they run to validate data consistency. The process is highly manual and time consuming. Most of the time, teams know what kind of checks they want to run to validate the data; our goal is to give them the best toolbox to do it, in the IDE.

That said, I'd say this is like writing tests in software: you can't catch everything the first time (even at 100% code coverage), especially in data, where most of the time things break because of upstream producers.

In the near future it will still require observability tools monitoring the data live.
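The "worksheet of manual checks" pattern described above can be sketched as a handful of SQL queries, each expected to return zero violating rows. The table, column, and check names here are illustrative, not from any real project:

```python
import sqlite3

# Sample data containing two deliberate problems: a NULL email
# and a duplicated id.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users (id INTEGER, email TEXT, signup_date TEXT)")
con.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        (1, "a@x.com", "2024-01-01"),
        (2, None, "2024-01-02"),
        (2, "b@x.com", "2024-01-03"),
    ],
)

# Each consistency check counts violating rows; 0 means it passes.
checks = {
    "no_null_emails": "SELECT COUNT(*) FROM users WHERE email IS NULL",
    "unique_ids": (
        "SELECT COUNT(*) FROM "
        "(SELECT id FROM users GROUP BY id HAVING COUNT(*) > 1)"
    ),
}

failures = {
    name: count
    for name, sql in checks.items()
    if (count := con.execute(sql).fetchone()[0]) > 0
}
print(failures)  # {'no_null_emails': 1, 'unique_ids': 1}
```

Running such a dictionary of checks after every model change is exactly the loop that stays manual today; tooling can run it automatically but, as the comment notes, can't catch breakage originating upstream.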


Thank you!


Great! Let us know if it's how you imagined it when you try it.


Thanks for the kind comments, he's surely a great guy :)


When it comes to SQL writing, we are more relevant. When it comes to speed, it's hard to benchmark exactly against Cursor and Windsurf, but we are obviously a bit slower (around 600ms on average), and we know what we have to improve to speed it up.

Next on the list is a next-edit suggestion dedicated to data work, especially with dbt (or SQL transformations), where changing one query means you have to change the downstream queries as well.
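The downstream-impact problem above can be shown in miniature: an upstream table changes, and a query written against the old schema breaks at run time. A minimal sketch using SQLite (3.25+ for `RENAME COLUMN`); the staging-table and column names are illustrative, not dbt-specific:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE stg_orders (order_id INTEGER, amount REAL)")

# Upstream change: a column is renamed.
con.execute("ALTER TABLE stg_orders RENAME COLUMN amount TO amount_usd")

# Downstream query, still referencing the old column name.
downstream_sql = "SELECT SUM(amount) FROM stg_orders"
try:
    con.execute(downstream_sql)
except sqlite3.OperationalError as e:
    print("downstream query broke:", e)  # no such column: amount
```

Surfacing this breakage at edit time, rather than at run time, is the point of a data-aware next-edit suggestion.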


Thank you! Actually, this is exactly what we target: we've seen that data teams often have a longer feedback loop than software engineers. Our goal is to shorten it and bring data as close as possible to your dev flow.

