Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

This sounds like you're opening yourself up to situations where you have rows in data_journal that have status='Loading' but which have started_loading_at=NULL, in violation of your data model.

Your premise is slightly flawed in that your second example isn't actually difficult to write or understand as you claim. However, it could be argued that if the logic for selecting "loading" rows is repeated in multiple places then a layer of abstraction over it would be useful. This could be achieved in SQL by e.g.

    CREATE VIEW loading_entries AS
    SELECT * FROM data_journal 
    WHERE started_loading_at IS NOT NULL AND finished_loading_at IS NULL
or, if you have several such statuses you need to define,

    CREATE VIEW data_journal_view AS
    SELECT
    data_journal.*,
    CASE
      WHEN started_loading_at IS NOT NULL AND 
    finished_loading_at IS NULL THEN 'Loading'
      WHEN ... THEN ...
      ELSE 'Some other status'
    END AS status
    FROM data_journal


that have status='Loading' but which have started_loading_at=NULL, in violation of your data model

Why not CHECK() it then? Or make ‘status’ GENERATED STORED or do a similar thing on triggers. The way that you suggested is also good, but it creates two names, one for update, another for select, which may confuse orms or developers.


The iron law of data is that there should be one source of truth.

It’s always better to enforce constraints statically (in this case by the schema) than dynamically at runtime. Because avoiding inconsistencies is the human’s job, the computer sure as hell won’t know what to do about it.


Not sure I understand a context for this, since checks and triggers are as static as views in a schema sense, and all of them are dynamic in a sense of computation (though get updated at different times).

If you mean that columns/fields should not share parts of the same “fact” even if one of them is computed or both are constrained accordingly, then I disagree.


Another benefit of having a separate status field is that you can index it easily. You could technically index the timestamps, but it's a lot of busywork for the RDBMS, as the values are all distinct.


This is orthogonal, because expressions can be indexed too. Of course if you see the same expression three+ times in your code, it’s begging for a name.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: