Try RethinkDB in your browser (rethinkdb.com)
122 points by coffeemug on Feb 22, 2013 | 51 comments



Alright, I tried it. Frankly, I'm confused as to why this is getting so many up-votes. If someone could elaborate what they enjoyed so much, I'd be interested to hear.

I didn't enjoy this experience one bit. It didn't feel interactive to me. Each step is basically "Run this query that is already typed out for you" without any explanation, whatsoever, about what each part of what I was writing was doing. This was followed by an invitation to click a link to documentation to try something else on the line of code you don't understand.

It begins with:

>r.db('test').tableCreate('tv_shows').run()

My first questions: what is r? What does db() do? Is there a test database somewhere? Why do I have to call run() when there's a button saying "Run this query"? Do you sometimes not run() things?

The tutorial doesn't remember your position if you go back to it so it's a good thing, I guess, that it opens a new tab for you when you access the documentation to list the tables.

Let's move to the second step:

>r.table('tv_shows').insert([{ name: 'Star Trek TNG', episodes: 178 }, { name: 'Battlestar Galactica', episodes: 75 }]).run()

Even more questions: What happened to r.db('test') that I needed to use? Is r now that database? Why didn't I just do r.tableCreate('tv_shows') before?

I did this with basically every step of the process but stopped just short of finishing because I wasn't learning much and wasn't sure where it was going. I found myself simply wondering what the heck RethinkDB was.

This tutorial could really use some more explanation, interaction, flow and direction. Tell me what I'm about to do. Tell me why what I'm about to do is worth my time and cool. Start with r, go from there. At least I won't be lost from the get-go.

Let me write stuff out as you're telling me what I'm doing. That's interacting.


This feedback is wonderful, thank you. It's often hard to do a good job when you're neck deep in your own concepts because you lose perspective. (I'm responsible for the tutorial, btw, and it's my screwup). I'll make everything work better.

To answer your questions here:

> what is r?

It's a module where all the RethinkDB operations are defined. The tutorial is running the same JavaScript driver code as you'd run if you connected to the C++ RethinkDB server from node.js, which is why you have to start with `r`.

> What does db() do?

It refers to a database named 'test'. This is where you're creating the table.

> Is there a test database somewhere?

Yes. It's set up by default at the beginning.

> Why do I have to call run() when there's a button saying "Run this query" do you sometimes not run() things?

Because you can type multiple queries in the text box. Every time you end a query with `.run()`, that's when it goes to the server and gets executed there. We've been talking about not having people type `.run` for a single query, but it's a bit difficult to solve.

> What happened to r.db('test') that I needed to use? Is r now that database?

If you omit r.db() at the beginning of the query, it picks a default one (which is 'test'). It's the same as you'd do in MySQL.

> Why didn't I just do r.tableCreate('tv_shows') before?

We thought allowing tableCreate in a default database would let people make mistakes too easily, so we disallowed it. It confuses people, so we'll add it back. Sorry for the confusion.


Just a note for everyone on HN who is starting a business or working on any project: THIS IS HOW YOU SHOULD RESPOND TO CRITICISM! Classy, doesn't expect the user to research beyond the page, etc.

I have to confess: my initial reaction was "isn't this kind of self-explanatory" and "dude, if you want to know what rethinkdb is, look at their damned website" or even the dreaded, and admittedly elitist, "if you have to ask, it's not targeted at you." However, reading your response, I am reminded that I can be a smug asshole, and I'm going to remember how you responded for the day I release something to the public.

Your response, on top of the fact that I've already examined, installed, and liked your product, adds even more to my confidence that you guys are creating a great longer term product.


I once read a quote by Bertrand Russell that really helped me respond to people better. He was talking about studying philosophy, but it applies to understanding points of view in general. Here it is:

In studying a philosopher, the right attitude is neither reverence nor contempt, but first a kind of hypothetical sympathy, until it is possible to know what it feels like to believe in his theories, and only then a revival of the critical attitude, which should resemble, as far as possible, the state of mind of a person abandoning opinions which he hitherto held. Contempt interferes with the first part of this process, and reverence with the second.

This helped me a lot, hope it helps you too!


I don't have a story as stylish and philosophical as coffeemug's, but the thing that changed the way I perceive criticism is something I read a while back: before coming up with any answers or remarks, ask 3 questions about the original thing. (Agreed, these 3 questions shouldn't be: 1) WTF? 2) Who is he to criticize? 3) Should I bother?)


Thanks for the response! I completely understand how easy it is to lose perspective when you're the one behind the concepts.

I'd recommend checking out TryRuby from Code School to get a better feel for interactivity in a tutorial. It takes you all the way from "That thing over there is an interactive prompt you are going to type things into."

Writing the code out yourself is extremely helpful for learning. Folks learn better by writing down what they are trying to understand while absorbing the information; the same is true for programming.


Would you mind if I ping you to try things again after I make some changes? Your feedback is really wonderful, it would be immensely helpful to hear what you think after we fix things up. (I know it takes time, so if you're busy, no worries!)


I for one see no reason why randomdrake is requesting such extensive pedantry: I thought your API did a wonderful job of speaking for itself, and adding more ceremony to talk about it would only have gotten in my way.

Edit: I totally see the value of the "hack it out yourself" flow he discusses though, and agree that making people do it themselves would more likely be a useful learning experience.


Sure! I hopped in your IRC channel.


I've tried RethinkDB (back when it wouldn't balance shards correctly!), and I liked it a lot. The automatic sharding especially is magical; I will definitely try it out again when it's stable.

That said, it is also my opinion that the ORM syntax needs a bit of love, for all the reasons stated above. Especially .run() feels a bit weird, can you make queries lazy (like Django's QuerySets, which only run when evaluated)?

P.S. If the rest of the RethinkDB team are as good at DOTA 2 as Marc is, I want to play with you too!


> Especially .run() feels a bit weird, can you make queries lazy (like Django's QuerySets, which only run when evaluated)?

In a way queries are lazy. You can write a query, and nothing happens until you call `run` on it. All the chained ops are executed at once when `run` is called.

> That said, it is also my opinion that the ORM syntax needs a bit of love, for all the reasons stated above.

This is not an "ORM" per se. It's a query API or data manipulation language. The next version will bring a few improvements to it.


> In a way queries are lazy. You can write a query and without calling `run` on it nothing happens.

That's not lazy, though, that's just doing nothing :P Lazy would be not running anything until you evaluated the last link in the chain, so, in Django:

    model = User.objects
    model = model.filter(name="Alex")
    model = model.filter(hero="Dazzle")
    model = model.filter(abandons__lt=2)  # "less than" is spelled as a __lt field lookup
    list(model)
and the query would only be executed when wanting to turn the QuerySet into a list.

I think your examples would also be a bit clearer if you forewent the r.db("test") step and just did:

    db = r.db("test")
    db.query("foo").bar().run()
which is more explicit.


Probably this is only "semantics", but I find these two quite similar :). I like the idea of having the query being executed only when needed, but I'm not sure that's always possible (or optimal with large result sets).

thanks,

alex @ rethinkdb


Laziness typically implies something isn't evaluated until you need the data (e.g. you coerce it to a native language data structure, or you need to print it, or write it to a file or network). So saying list(query) lazily evaluates the query, while saying query.run() is a more strict (explicit) evaluation model.

I actually think strict is better here, but that's a different discussion altogether :)


Sure, they are almost identical, but one doesn't need .run() :)

I don't see how it wouldn't be optimal, since lazy evaluation is a superset of eager evaluation (you can invoke it whenever you want).


Both solutions are lazy. While ReQL requires .run(), the Django ORM requires you to do 'list()' (or probably some other form of iteration over it). I'd say the only difference is that one is explicit and the other one is implicit.
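
To make the explicit/implicit distinction concrete, here is a minimal Python sketch. `ToyQuery` and its in-memory rows are hypothetical; this is neither the RethinkDB driver nor the Django ORM. It only illustrates that chaining does no work by itself, and that evaluation can be triggered either by an explicit `run()` or implicitly by iteration.

```python
class ToyQuery:
    """Hypothetical lazy query over an in-memory list of dicts."""

    def __init__(self, rows, predicates=()):
        self._rows = rows
        self._predicates = tuple(predicates)
        self.executions = 0  # counts how many times this query actually ran

    def filter(self, predicate):
        # Chaining only records the step; nothing is evaluated here.
        return ToyQuery(self._rows, self._predicates + (predicate,))

    def run(self):
        # Explicit evaluation, in the spirit of ReQL's .run().
        self.executions += 1
        return [row for row in self._rows
                if all(p(row) for p in self._predicates)]

    def __iter__(self):
        # Implicit evaluation, in the spirit of list(queryset) in Django.
        return iter(self.run())


rows = [
    {"name": "Alex", "hero": "Dazzle", "abandons": 0},
    {"name": "Alex", "hero": "Pudge", "abandons": 3},
]
query = ToyQuery(rows).filter(lambda r: r["name"] == "Alex")
query = query.filter(lambda r: r["abandons"] < 2)

explicit = query.run()   # evaluated because we asked for it
implicit = list(query)   # evaluated because list() iterated over it
```

Both calls return the same rows; the only difference is which line of user code counts as the trigger.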


.run() is also bad. What connection is it being run on? The most recently opened connection? In a global variable somewhere? For anything larger than a short script, you should use query.run(conn), or conn.run(query).
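
A rough sketch of the `query.run(conn)` shape suggested above, in Python. `ToyConnection` and `ExplicitQuery` are made-up names for illustration, not part of any RethinkDB driver; the point is only that the caller names the connection, so there is no hidden global to guess about.

```python
class ToyConnection:
    """Hypothetical stand-in for a database connection."""

    def __init__(self, name):
        self.name = name
        self.log = []  # descriptions of queries executed on this connection

    def execute(self, description):
        self.log.append(description)
        return f"{description} ran on {self.name}"


class ExplicitQuery:
    """Hypothetical query that never guesses which connection to use."""

    def __init__(self, description):
        self.description = description

    def run(self, conn):
        # The caller must say where the query runs; there is no global default.
        return conn.execute(self.description)


primary = ToyConnection("primary")
replica = ToyConnection("replica")
query = ExplicitQuery("table('tv_shows').count()")

result = query.run(replica)  # unambiguous: this ran on the replica only
```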


Yes, especially in web applications. Found that out the hard way.


Awesome feedback!

Let me try to address some of these questions here, before figuring out how to improve the tutorial.

`r`: is the top level ReQL namespace that gives you access to the functions it defines.

r.db('test'): is accessing the 'test' database. RethinkDB supports multiple databases. You can specify which one to use on a per connection basis or for each query.

The 'test' database is a default database (in principle it's similar to MySQL's test database). Being the default also means that you don't need to use `.db('test')` in each query.

`run`: ReQL allows chaining multiple functions together. Basically creating a query is a bit like using a builder pattern. These chained ops are all sent to the server and executed on the database. There is a single roundtrip. Basically `run` tells the query builder: "now it's your time to do something for me".
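
As a rough illustration of the builder idea, here is a toy in Python. `ToyTerm`, `fake_send`, and the JSON layout are all assumptions made up for this sketch; the real ReQL wire format is different. It shows the chain being collected client-side and handed over in a single message when `run` is called.

```python
import json


class ToyTerm:
    """Hypothetical query builder; each call wraps the previous term."""

    def __init__(self, op, args=(), parent=None):
        self.op = op
        self.args = list(args)
        self.parent = parent

    def table(self, name):
        return ToyTerm("table", [name], self)

    def filter(self, fields):
        return ToyTerm("filter", [fields], self)

    def count(self):
        return ToyTerm("count", [], self)

    def to_steps(self):
        # Walk back to the root to recover the chain in call order.
        steps, term = [], self
        while term is not None:
            steps.append({"op": term.op, "args": term.args})
            term = term.parent
        return list(reversed(steps))

    def run(self, send):
        # The whole chain is serialized and handed over in one call.
        return send(json.dumps(self.to_steps()))


sent = []  # records every message that would go over the wire


def fake_send(payload):
    sent.append(payload)
    return "ok"


query = ToyTerm("db", ["test"]).table("tv_shows").filter({"episodes": 178}).count()
status = query.run(fake_send)  # one message, however long the chain is
```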

As side comments

1. initially we added many more details about each query, but we ended up with quite a bit of text compared to short queries. We thought it would be more intuitive to reduce the description part and provide links to the API. It looks like that wasn't the best solution.

2. we chose to have the query pre-typed instead of having it part of the description only as we assumed mostly everyone would just copy paste it. Assumption proved incorrect!

Thanks a lot!

alex @ rethinkdb


Thanks Alex! Appreciate the response. I'm always learning but I was a special education teacher for a while too so I've got some perspective on getting people to understand things.

In regards to your side comments:

1) If you feel like there's too much text compared to short queries, that's fine. I'm not there to write gigantic amounts of code. I'm there to learn and write a few little snippets to understand what I'm doing.

2) As I mentioned in my response to coffee, it's much better to actually type out the stuff as you go. It sinks in, and you immediately start to develop a feel for writing in the language. Especially considering the non-traditional querying that you've come up with, I think it's very important to take baby steps and treat it as a new language. To continue along the language analogy: teach us each word so we know what the sentence actually says.


I appreciate your feedback. Really really helpful to us (you should see what it triggered on #irc ;-)


Indeed I remember doing MongoDB and Redis online tutorials and typing everything out myself and they felt great. This tutorial was too annoying to finish. Took a big break from coding, thought maybe it's because I'm rusty. Apparently not!

So take a careful look at Mongo and Redis tutorials. They made it feel easy, maybe why they got so popular.

Could also be that RethinkDB has fundamentally more complex syntax, so it's not as suited to quickstart tutorials. When I did Mongo and Redis I didn't know SQL and didn't have to. With innerJoins everywhere you need SQL as a prerequisite.


I couldn't agree more - it's less a tutorial and more a slideshow of 10 queries and the results.

Not only that, but I found the documentation extremely lacking (almost nonexistent) when I was attempting to do the additional queries. When I was confused, there was nowhere to see the 'right answer' for the query and an explanation of why, I just had to guess at the syntax until I got it right.


Same. A hint that appears after a wrong attempt at the "before you move on" steps and a full explanation that appears after two would be very helpful, especially with steps 7+. Without those it feels like a puzzle instead of a tutorial.
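
The escalation described above could be sketched in a few lines of Python. `feedback_for` and the placeholder strings are hypothetical, not how the tutorial is implemented: no help before the first miss, a hint after one wrong attempt, the full explanation after two.

```python
def feedback_for(wrong_attempts, hint, explanation):
    """Return the help text to show after a number of wrong attempts.

    Hypothetical policy: nothing before the first miss, a hint after one
    wrong attempt, and the full explanation after two or more.
    """
    if wrong_attempts <= 0:
        return None
    if wrong_attempts == 1:
        return hint
    return explanation


# Placeholder texts; a real tutorial step would supply its own.
hint = "Hint for this step goes here."
explanation = "Full explanation of the expected query goes here."

first_miss = feedback_for(1, hint, explanation)
second_miss = feedback_for(2, hint, explanation)
```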


Thanks. I'm going to implement exactly this.


I don't understand how this is giving any advantages over the standard object interface.

How is:

> r.table('tv_shows').insert([{ name: 'Star Trek TNG', episodes: 178 }, { name: 'Battlestar Galactica', episodes: 75 }]).run()

Easier than:

> db['tv_shows'].push([{ name: 'Star Trek TNG', episodes: 178 }, { name: 'Battlestar Galactica', episodes: 75 }]);

Seems to me like a totally unnecessary abstraction which only adds complication and potential points of failure and bugs.

EDIT:

Ok never mind this does actually create a real database. From the demo it seemed like it was just creating a JSON object.


In the former case you're creating a query (similar to SQL insert) that puts data into the database. You can then take advantage of indexing, sharding, durability guarantees, caching, a specialized query language, and all the other benefits of a database system.

In the latter case you're creating a data structure in the host language (which of course is immensely useful, but completely different).


My mistake, from the demo I thought this was just creating a JSON object, and there was no actual backend database.


Right. The demo implements the backend database in JavaScript, so you can run real queries against it. (Still, the whole thing is confusing and I'll make it much, much better.)

slava @ rethink -- I was responsible for the tutorial


The main difference is what happens if you chain together multiple operations. Are these executed one by one (each resulting in a roundtrip to the database) or executed all at once (single roundtrip to the db).

As a side note, each programming language has its own idioms. We tried to bring the ReQL query to each language in a way that felt as close as possible to the host PL (as opposed to, say, SQL, which you need to use as it is, and that led to the hundreds or thousands of wrapper libraries, ORMs, etc.).

alex @ rethinkdb


One thing I can't understand: if "RethinkDB is built to store JSON documents", why are there tables?

Also, I should probably explore the website more, but is it transactional? In other words, is it possible to save multiple documents and have the RethinkDB server reject all of them in case of a conflict, like the all-or-nothing semantics in CouchDB before version 0.9?


> if "RethinkDB is built to store JSON documents", why then there are tables?

JSON docs are stored in tables -- a table is just a collection of JSON documents. We of course also support arrays, but the reason why we start with tables as a primitive is that it allows doing significant optimizations that otherwise would be very hard/impossible. If the entire DB was one huge JSON document, optimizations would be much more difficult to do. Grouping docs into tables essentially gives the system a hint as to the usage intent.

> is it transactional?

Only for changes on a single document. There are no transactional guarantees on queries that touch more than one document.


> There are no transactional guarantees on queries that touch more than one document.

Any plans to change it in future?


Maybe. It's a bit far off (there is more low-hanging fruit), but I'd never say never.


It doesn't look like this site was built with Weblocks :-)


Still no PHP support?


Currently RethinkDB has official drivers for three languages: Python, JavaScript, and Ruby. I hope you understand why it would be difficult for us to support many more than this. Eventually we're counting on support for 3rd party drivers from the community. In fact, there are already a few early community drivers from some intrepid contributors, including for Haskell and Go.

We've had lots of requests for more drivers and lots of offers to build them, but we've asked our volunteers to hold off while we revamp the driver interface to make it significantly easier to build drivers for RethinkDB. Having written both the first JS driver and the new version myself, I can attest to the much greater ease of doing so with the new API.

The next release (1.4), due very soon, will include these changes and a driver development kit to support 3rd party efforts. After that I'm sure you'll see a PHP driver emerge very quickly.


Should one use this instead of MySQL, and why? I couldn't find this question in the FAQ.


They have a page comparing themselves to MongoDB http://www.rethinkdb.com/docs/comparisons/mongodb/



"scale to multiple machines with very little effort"


Any info on how it handles resharding?


You set the number of shards you want and click "rebalance" (CLI tools are available too, of course). The right data is replicated to the right nodes without any additional effort. It's demoed in the screencast video if you'd like to see (http://www.rethinkdb.com/screencast/).


I was actually a little worried when I read that everything can be handled from the Admin GUI interface. Knowing that there are CLI tools available makes me a little more comfortable.

For some reason reading through steps like "click here" makes me less comfortable than "type this".


RethinkDB allows users to reshard with a single click in the WebUI. All you need to provide is the number of shards you want and it will partition the data evenly into that many shards and lay them out on machines. You can see this in action in the screencast video (http://www.rethinkdb.com/screencast/) or you can install and see for yourself.

Joe @ RethinkDB
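
For readers wondering what "partition the data evenly" can look like, here is a minimal Python sketch of range-based splitting. It is an assumption for illustration only, not a description of RethinkDB's actual resharding logic; `split_points` and `shard_for` are hypothetical helpers. Given the sorted keys and a shard count, it picks boundary keys so each range holds roughly the same number of documents.

```python
def split_points(sorted_keys, n_shards):
    """Pick n_shards - 1 boundary keys that cut sorted_keys into even ranges."""
    if n_shards < 1:
        raise ValueError("n_shards must be at least 1")
    total = len(sorted_keys)
    if total == 0:
        return []  # nothing to split yet
    # Boundary i is the first key of shard i (for i = 1 .. n_shards - 1).
    return [sorted_keys[(i * total) // n_shards] for i in range(1, n_shards)]


def shard_for(key, boundaries):
    """Return the index of the shard whose key range contains key."""
    for index, boundary in enumerate(boundaries):
        if key < boundary:
            return index
    return len(boundaries)


keys = sorted(f"user{n:03d}" for n in range(12))
boundaries = split_points(keys, 3)

# Count how many keys land in each of the three shards.
counts = [0, 0, 0]
for key in keys:
    counts[shard_for(key, boundaries)] += 1
```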


Thanks. So in theory something like this would not have happened with Rethink? https://groups.google.com/forum/?fromgroups=#!topic/mongodb-...


Yes, we put an enormous amount of effort into building a rock-solid architecture. However, any software has bugs, especially early on. We're working hard to make RethinkDB production ready, but it will take a few more months until it gets sufficiently battle tested that we can recommend running it in prod. environments.


AFAIK RethinkDB is (yet another?) non-relational database, so it's not a drop-in MySQL replacement.


Check out their main webpage. They are better than most of the other "NoSQL" dbs out there in explaining the strengths/weaknesses of their product compared to a typical relational DB.


I have the same question, but for MongoDB. What's the difference between Mongo and Rethink? Edit: Okay, I found the charts.


I'm a ReQL guru!



