Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Optimizer nuances aside, I agree with this and think it's helpful for anyone writing SQL. I've trained a lot of non-developers (and new-developers) to use SQL, and one of my favorite tips is, "Start with FROM."

Even sooner than "start with FROM" is "What does each row of the result mean?" Is it a sale? A month of sales? A customer? If you can answer what you want, then it's usually easy to start there. Especially if your answer corresponds to an existing table, put that in the FROM, and then add whatever joins you need. You can even think about the joins "imperatively".

Thinking imperatively about JOINs is also a helpful way of understanding a mistake I call "too many outer joins", which I wrote about here: https://illuminatedcomputing.com/posts/2015/02/too_many_oute... If you don't know about this mistake, it's easy to get the wrong answer from aggregate functions like AVG or COUNT. (EDIT: I should probably call this "too many joins" instead, since it's not really specific to outer joins. But in practice I find outer joins are more often used in these sort of situations.)




For updates, and deletes on the other hand I try to start with the where clause. Learnt the lesson after running a update query once before i typed the where to cause something like 40k worth of damage


Good advice. I also break up the statement so that an incomplete highlight and execute will fail, e.g.

    DELETE FROM   
      Customers WHERE    
      Name='Gude' AND   
      Area='Tama'   
      ;  
Much harder to leave off half the clause doing this kind of thing.


I've forced the habit upon myself to first compose a SELECT statement and verify that it will give me exactly the dataset that I want to run my operation on. Then I wrap that statement into a CTE and run the DELETE / UPDATE on the CTE.

It doesn't save me time, but a lot of nerves.


I'm used to writing the SELECT query and adding the commented-out DELETE just below the SELECT line.

  SELECT *
  -- DELETE
  FROM Sales
  WHERE Customer = 1
Next, selecting the whole query and executing it. If results are satisfactory - selecting only the DELETE part.


This is the one I use too. Tiny difference - I write

   select * -- delete
   from sales
   where customer = 1
so it's even harder to accidentally highlight the 'delete' part.

Although I've switched to DBeaver for a year now, and it automatically pops up a warning when it detects a DELETE query without a WHERE, which is very nice.


DBeaver has something even better, albeit sometimes annoying: you can define a connection as "Production" (highlighted red) and "Test" (green). On Production systems, DBeaver will ask you for confirmation for every write statement.


Start all your sessions with "BEGIN TRANSACTION".

Then if you make an error you can always ROLLBACK. If it looks OK -- COMMIT.


And then you use MySQL Workbench and forget to turn off autocommmit....


Similar tip: don't start a command from `rm -rf ...`, type `rm /foo/bar/` first only when finished append ` -rf` at the end.

Similarly `git push ... -f` etc.


This saves you only if you delete directories. Better start with `ls`. The one with git force push is spot on though. This is one of the reasons to prefer tools that let you write options at the end of the argument list instead of after the command name.


Start every update and delete with BEGIN; and never fret again.


Yeah i've made a whoops like this before. I usually start a n update with a select statement to make sure my assumptions about the intent of the change is on point.


For destructive statements I almost always do a select with the WHERE statement that I will use in the delete first so I can see I am not doing anything stupid.


Auto-rollback transactions is helpful.

Everything is non-destructive without an explicit COMMIT.


For MySQL, run the command with

--safe-updates

to avoid queries running without WHERE clause.


I've called it the "GROUPing pitfall" since those extra JOINs only really come back to bite you when you try to aggregate the results: https://wikido.isoftdata.com/index.php?title=The_GROUPing_pi...


Nice writeup! The nasty part is that you don't get an error, you just get wrong answers. You might not even notice for a while. Such bad news!


Thanks to both of you for these articles. I'm very fond of outer joins as in most cases (in my day to day work) they're exactly what I need. While I don't use them often with aggregate queries, I didn't realize that multiple OUTER JOINS will break aggregates. I'll need to review my query logs to make sure I'm not doing this anywhere.


Was Emily Moore's child ever born?!?!


Unless I have the whole query in my head by the time I've typed "$db->query(", I will indeed start with the FROM by just typing the placeholder SELECT * and then revisiting the asterisk when I'm done with (most of) the query. Somehow that makes it easier, especially with complex queries.


I do this too, so the autocomplete in SeekWell automatically does this, check it out: https://www.loom.com/share/9c7979a163eb4513b3320e6e90c66079

If you type just a table name into an empty cell it will autocomplete with a full SELECT statement. e.g. if you typed "pubuser" and selected "public.users" it would autocomplete with:

  SELECT pu.*
  FROM public.users AS pu
  WHERE 1=1
Disclaimer: I built this.


I do this too! I thought I was the only one


> Especially if your answer corresponds to an existing table, put that in the FROM, and then add whatever joins you need.

This is absolutely right. In fact, my ideal query language would require the programmer to be explicit if they want to increase the cardinality of the result set. Meaning, if I start a query with "from Employee ..." the query language should not let me accidentally add a join that will cause the result set to contain the same Employee more than once. (It should be possible, but only if I make it clear that it is intentional and not an accident.)


> If you can answer what you want, then it's usually easy to start there.

Yep. As developers/designers we need to understand this as much as we wished our clients did.


> "Start with FROM."

Interesting because LINQ starts the same way, or at least that's how I remember it.




Consider applying for YC's Fall 2025 batch! Applications are open till Aug 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: