I'm a Go developer and I notice a lot of other Go developers instantly suggest things like GORM to noobs writing applications. Whereas I always suggest against it.
I'm a big advocate of understanding your data model at the database level. Joining on too many tables is far too easy to do with an ORM.
My go-to strategy for SQL is simple. Abstract your SQL as far away from your application as possible. Not just by using interfaces in your application, but by using stored procedures in your database.
In my mind SQL is just another service your application depends on. And you want that interface to be as simple as possible for your application. If you need to tweak SQL performance it should not need any input from your application. I could completely migrate to a new table relationship structure without the application realising if that was what was needed. You could even go as far as to unit test just your SQL code if you wanted, something you can't do too easily when it's in your application.
Yes, if you need to return new data then you need to update your stored proc and code. But that's so worth it in my opinion for that extra layer of abstraction.
My opinion is slightly skewed from a decently sized business perspective, but I do still follow this pattern in personal projects. When migrating applications to different tech stacks (like Java to Go, or C# to Go) this abstraction has meant the world to us.
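To make the shape of this concrete, here's a minimal sketch of what I mean, assuming Postgres and database/sql; the procedure and schema are invented for illustration. The application only knows the procedure's name and signature, never the tables behind it:

    package store

    import (
        "context"
        "database/sql"
    )

    // Lives in a migration, not in the application (hypothetical schema):
    //
    //   CREATE FUNCTION get_customer_orders(_customer_id bigint)
    //   RETURNS TABLE (order_id bigint, total numeric) AS $$
    //     SELECT o.id, o.total FROM orders o WHERE o.customer_id = _customer_id;
    //   $$ LANGUAGE sql STABLE;

    type Order struct {
        ID    int64
        Total float64
    }

    // CustomerOrders is the whole database "interface" from the app's point
    // of view: one call, no knowledge of the underlying table layout. The
    // tables behind get_customer_orders can be restructured freely.
    func CustomerOrders(ctx context.Context, db *sql.DB, customerID int64) ([]Order, error) {
        rows, err := db.QueryContext(ctx, `SELECT * FROM get_customer_orders($1)`, customerID)
        if err != nil {
            return nil, err
        }
        defer rows.Close()

        var orders []Order
        for rows.Next() {
            var o Order
            if err := rows.Scan(&o.ID, &o.Total); err != nil {
                return nil, err
            }
            orders = append(orders, o)
        }
        return orders, rows.Err()
    }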
I am against using stored procedures. In the projects I saw where SQL code was encapsulated in stored procedures, the business logic was inconsistently split between the application and the database, making maintenance much more complicated than it should have been. I am in favour of stored procedures to avoid several round trips to the database, but only when performance really matters. The argument about tweaking SQL performance does not convince me. Where I work, we execute performance and load tests before every release of the application, and we are expected to meet performance targets that are documented in the release notes.
I’m interested in this because I have the opposite experience. Not saying either choice is wrong, but in my experience, moving that logic into the database has made all our codebases easier to reason about. The APIs don’t have to botch error handling due to “open transaction needs to be ROLLBACKed”, I don’t have to worry about bad devs doing stuff non-atomically and putting the db in a bad state, and the code is easier to read because it’s just a single query that does “what it’s supposed to do”. The API usually ends up just being a small adapter for DB functions. It also makes sense to me because the “database API” (e.g. the database schema with some methods in it - the tables themselves are usually not in the same schema) describes the valid operations on the data, and can make sure the data is consistent. It also makes adding new APIs easier.
>I don’t have to worry about bad devs doing stuff non-atomically and putting the db in a bad state
Premise: I assume we are speaking of relational databases.
The fact that a developer can "corrupt" the database with non-atomic stuff is a hint to me that the database probably doesn't have the right referential integrity constraints in place and probably is not normalized either. The relational model is built out of the box to keep a consistent state of the data and this is one of its main value propositions.
Another danger that I didn't mention is that junior developers will be tempted to prefer "Turing-like code" inside stored procedures (for loops, cursors, conditionals) over relational constructs (joins, subqueries, grouping, table variables). I have seen that several times, and it is a real killer for database performance.
> The fact that a developer can "corrupt" the database with non-atomic stuff is a hint to me that the database probably doesn't have the right referential integrity constraints in place and probably is not normalized either. The relational model is built out of the box to keep a consistent state of the data and this is one of its main value propositions.
This is true when interpreted the right way, but I don't think real world problems are always so tidy.
Let's consider Standard Toy Example #17: The Bank Account Transfer. In a simplistic model, we might have a table of bank accounts in our database, and a second table of deposits. Constraints will guarantee that we can't do things like creating a deposit for an account that doesn't exist, but they won't ensure that if we create a deposit of X in one account, we must also create a corresponding deposit of -X for some other account so we aren't inventing or destroying money.
Of course, in a more realistic implementation, you'd never represent transfers between accounts as two separate stages like that, and if there is any single place to look up deposits into specific accounts it's probably some sort of view or cache rather than the authoritative data. But rather like going from single-entry to double-entry bookkeeping, to avoid relying on something higher up our system to ensure consistency, we've had to restructure our whole data model and the schema implementing it. In a sense, this is still just normalising the data, but that's a bit like saying implementing a word processor is, in a sense, just refactoring and extending a text editor.
A related principle is to make illegal states unrepresentable. If you can possibly insert a debit and fail to insert the corresponding credit, thereby making money magically appear, the schema is wrong. This isn't just a matter of insufficient normalization; the schema simply doesn't represent what you claim it represents.
The database shouldn't be viewed as just some kind of scratch pad where you write stuff down so the application server doesn't have to remember it. If the database is the source of truth, then it is the model, and the application is simply a view.
This is a good principle to aim for but it is impossible to fulfill in many cases. There are always going to be invariants that the application may wish to enforce which cannot be enforced in the database (or, if there was some ideal schema to enforce them, the migration from the existing schema would be enormously costly). This is the whole reason we have ACID transactions in relational databases - so the application can enforce its invariants!
(Exercise for the readers: construct a set of invariants that cannot be enforced within a database schema.)
> This is a good principle to aim for but it is impossible to fulfill in many cases. There are always going to be invariants that the application may wish to enforce which cannot be enforced in the database
As HN doesn't show moderations, I will just say that this is exactly the point I would have made if I'd been replying first. Relational databases are good at enforcing the kinds of relations provided by basic set manipulations. If you try really hard, you can encode some more complicated constraints, but as the complexity increases it becomes unwieldy and eventually impractical to enforce everything at schema level.
Edit: Changed "database level" to "schema level" to clarify that I'm not talking about stored procedures here.
I'm a little confused. A 'toy' bank app would have individual accounts with some amount of money in each. The problem arises when you subtract the debit from one account and error out before crediting the amount in other account, or vice versa. Now you're in an inconsistent state.
How would you typically design a relational schema that could avoid this scenario?
The usual for this would be to have a table of transactions instead of balances. The balances would be computed dynamically by summing the full transaction history for a particular account.
Of course this gets a bit unwieldy with large transaction histories, and has trouble with things like enforcing that you can't enter a transaction that draws a balance below zero.
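In query form, that derived balance is just an aggregate over the history; a sketch, schema invented:

    package ledger

    // The balance is never stored; it's derived by summing the account's full
    // transaction history, so a half-applied "update" can't exist.
    const balanceQuery = `
        SELECT COALESCE(SUM(amount), 0) AS balance
        FROM transactions
        WHERE account_id = $1;`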
Well yeah, but that just makes the business rules even more complex. That makes it more like - reject the transaction if it would move the account balance below zero, unless the account has a flag to allow overdrafts (which is over on the accounts table), in which case we allow the transaction and also enter another transaction for the overdraft fee.
Probably need some more rules like max overdraft amount, max number of transactions etc. I've heard some places may also have the "let's be jerks" rule to reorder transactions to hit overdraft as soon as possible to maximize fees.
That sort of problem is actually relatively easy to solve in itself. In essence, you don't record the transactions on each account separately, you record the transfers between them (just as double-entry bookkeeping does). Since any transfer inherently balances, you can't be inconsistent in crediting money somewhere without debiting it somewhere else or vice versa. If you want to know what happened to an individual account, you can still construct that information by looking at the relevant sides of all the transfers into or out of that account, but your single source of truth is always the transfer records.
Of course, it's still not that simple in any realistic system, because maybe you have other important constraints such as ensuring that accounts don't go into negative balances. In that case, before you can add any new transfer, you need to calculate the existing balance on the account you're taking money from, which means replaying the entire transaction log selecting for that account if you just have the naive schema described above, and then you have to create the new transfer record if and only if the constraint is satisfied. All of this has to be done atomically to make sure you can't have multiple transfers being recorded concurrently that would individually be valid but collectively leave an account overdrawn.
That takes you into the technicalities of how to process DB transactions with acceptable performance. Hopefully your database will help you if it's ACID compliant, but it still can't work miracles. For the simple two-sided transfers in this example, you can probably do a lot just using a database's built-in transaction support, but for more complicated constraints applied to more complicated schemas, at some point you have to take the practicalities of your database engine into account when designing that schema and deciding what constraints to enforce at this level.
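A sketch of that atomic check-then-insert from the application side, using database/sql; the schema, the row lock, and all names are illustrative assumptions:

    package bank

    import (
        "context"
        "database/sql"
        "errors"
    )

    var ErrInsufficientFunds = errors.New("insufficient funds")

    // Transfer inserts one balanced transfer row, but only after checking the
    // source account's balance inside the same transaction. The FOR UPDATE row
    // lock serialises concurrent transfers out of the same account, so two
    // transfers can't both pass the check and jointly overdraw it.
    func Transfer(ctx context.Context, db *sql.DB, from, to, amount int64) error {
        tx, err := db.BeginTx(ctx, nil)
        if err != nil {
            return err
        }
        defer tx.Rollback() // no-op once Commit succeeds

        if _, err := tx.ExecContext(ctx,
            `SELECT 1 FROM accounts WHERE id = $1 FOR UPDATE`, from); err != nil {
            return err
        }

        // Replay the log for this account: incoming transfers add, outgoing subtract.
        var balance int64
        err = tx.QueryRowContext(ctx, `
            SELECT COALESCE(SUM(CASE WHEN to_account = $1 THEN amount ELSE -amount END), 0)
            FROM transfers
            WHERE from_account = $1 OR to_account = $1`, from).Scan(&balance)
        if err != nil {
            return err
        }
        if balance < amount {
            return ErrInsufficientFunds
        }

        if _, err := tx.ExecContext(ctx,
            `INSERT INTO transfers (from_account, to_account, amount)
             VALUES ($1, $2, $3)`, from, to, amount); err != nil {
            return err
        }
        return tx.Commit()
    }

(A SERIALIZABLE isolation level plus retry-on-conflict is a reasonable alternative to the explicit row lock; which one performs better is exactly the kind of engine-specific question the comment above is pointing at.)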
I think as long as you use stored procedures as atomic operations/transactions, they both simplify code and improve performance. In your example, a transfer-money procedure should handle both the balance increment and decrement for all accounts involved in a transaction before returning. Things start to get hard to maintain if you have stored procedures that only complete a portion of the business logic and rely on application code to “understand” what still needs to be done.
I think what he has experienced is something I have also seen many times (and am still seeing at my current company):
Some developers put their application into SQL. That means the whole business logic of the app is in SQL, and the app just executes SQL statements in order and does some front-ending.
But I don't suspect that is what the original author meant.
There are security matters too though, which no-one ever seems to get right.
Stored procedures, correctly managed, are a more secure option than an ORM executing arbitrary SQL code, but the reason why is consistently misunderstood (I've had experienced DBAs get this wrong and insist to me that stored procedures prevent SQL injection attacks, which is completely false).
To pick an arbitrary example out of thin air:
If my application requires 10 different calls to the database with variables, then I can create 10 stored procedures. I can set the permissions for the account that the calling application is using to only have EXEC privileges and only on those 10 procedures. This means that if the credentials were to leak the damage is limited to tasks the application could conceivably have performed anyway, albeit without any application-enforced validations on the variables passed to the calls.
If I use an ORM, I have to give the SQL account the client application is using more-or-less carte-blanche access to the database, as it could conceivably read from or write to anywhere in the database through arbitrary SQL. Sure, maybe I scope the ORM to a schema and restrict on that, but it's not nearly as granular, and it fundamentally misses the point that I've opened a massive attack surface unnecessarily by creating a SQL account, with generally free rein over the database, that exists in the application layer.
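Concretely, that lockdown looks something like this in Postgres terms (SQL Server's EXEC grants are analogous; the schema and function names are invented):

    package migrations

    // Hypothetical lockdown for the app's database account. app_user can call
    // exactly these functions and nothing else; a leaked credential can't
    // SELECT or mutate tables directly.
    var lockdown = []string{
        // Remove the blanket EXECUTE that PUBLIC gets by default on new functions.
        `ALTER DEFAULT PRIVILEGES REVOKE EXECUTE ON FUNCTIONS FROM PUBLIC;`,
        `REVOKE ALL ON ALL TABLES IN SCHEMA app FROM app_user;`,

        // Grant back only the known entry points, one per application call.
        `GRANT EXECUTE ON FUNCTION app.create_order(bigint, int) TO app_user;`,
        `GRANT EXECUTE ON FUNCTION app.cancel_order(bigint) TO app_user;`,
        // ...and so on for the remaining procedures.
    }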
On a recent project, mostly to see how practical it was, I built an application that used stored procedures with all of the validation being done directly inside the procedures in SQL. This had quite a few benefits:
- I only had to maintain validation in one place.
- I could rely on the formatting constraints I already needed in the database anyway.
- The validation rules were beside the data and were much faster to edit and maintain over time.
- The validation messages had to be pulled as keys, making globalisation of the application much simpler later within the client application.
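A rough sketch of the message-key idea, assuming Postgres plpgsql and invented names; the client maps the raised keys to localised strings:

    package migrations

    // Hypothetical example of validation living next to the data: the
    // procedure raises message *keys*, not human-readable text.
    const createUserProc = `
    CREATE OR REPLACE FUNCTION app.create_user(_email text) RETURNS bigint AS $$
    DECLARE
        _id bigint;
    BEGIN
        IF _email IS NULL OR _email = '' THEN
            RAISE EXCEPTION 'validation.email.required';
        END IF;
        IF position('@' in _email) = 0 THEN
            RAISE EXCEPTION 'validation.email.format';
        END IF;
        INSERT INTO app.users (email) VALUES (_email) RETURNING id INTO _id;
        RETURN _id;
    END;
    $$ LANGUAGE plpgsql;`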
Different strokes for different folks, but there are literally decades of reasoning behind why RDBMS systems are the way they are, and ORMs throw all of it out with the bathwater.
"If I use an ORM, I have to give the SQL account the client application is using more-or-less carte-blanche access to the database as it could conceivable read from or write to anywhere in the database through arbitrary SQL"
You can give table level permissions (and even specific columns if you want) to the ORM db account. From a security point of view there should be no difference.
I agree with both of you, although have never set up a DB in either style :)
One small point: the proposed stored procedure approach seems, qualitatively at least, less error-prone to me. The consumer in the sproc approach either does or doesn’t have access to specific sprocs, while managing fine-grained per-column permissions seems easy to screw up in either a too-liberal or too-restrictive way.
The project is end-of-life now so most of this is moot but to answer these queries:
1) SQL calls in a separate command-line application.
2) Git. It’s in a .NET Database project.
3) See 2.
4) Not overly to be honest - the organisation uses SQL procs pretty extensively as is, so this isn’t anything particularly new or out of the ordinary other than the validations are held in a different part of the stack.
Not having consistent rules about what code belongs where is a people problem, not a technology problem. In order to write maintainable software, you always need proper code classification, regardless if you use stored procedures or not.
Again, this has nothing to do with technology. Why do we assume that you only have to follow software engineering principles when writing Java, but not when writing SQL?
I’m not sure what you’re arguing. You should have rules for what code goes where, and stored procedures are also considered an anti-pattern by most developers because they tend to split business logic between the application code and the database.
Let’s say you decide to use stored procedures, with rules about what belongs in app code and what can go in a stored procedure. The “discipline doesn’t scale” perspective would say that as your company and app code get larger, your rules get harder to enforce. So devs tend to fall back to the simplest, easiest-to-follow version of the rule: stored procedures are an anti-pattern.
So what would be a solution then? I imagine, creating an RDBMS where SQL can be executed only by DBA root accounts, and normal (app) accounts can only execute stored procedures? That would enforce a pattern without requiring discipline.
Impossible to generalize, would depend entirely on what specifically you want the stored procedures to do. Though I can’t think of a good reason to abstract your queries at the database level. Even ORMs typically allow execution of raw SQL, so you could still store a straight SQL query in your app code (perhaps in a constant) instead of using a stored procedure.
I've been thinking about this a bit before seeing this discussion, and I'm starting to feel that abstracting at database level makes sense, because it turns your RDBMS from being just a dumb SQL execution engine into an application-specific API.
If you consider restricting operations on your data close to the storage layer, you can imagine wrapping DB access with an app exposing business-specific API, and route all your actual code through that API. Doesn't seem like an unreasonable design to me - particularly, if the same database is used by multiple applications. But if you do that, it may make sense to just put the API inside the RDBMS - giving you one networked software component less to manage.
(I've done work on a project using lots of stored procedures only once, and my tasks were unrelated - so I have no practical experience here. But through this discussion, I think I'm starting to understand why enterprise projects are anecdotally so in love with stored procedures.)
If you abstract at the db level then your db reading code is either unlikely to be in version control, or you need a kludge to get version control for stored procedures.
I suppose it is in theory but how many of us work in theory rather than with a set of tools that are commonly available?
Or maybe it’s just better to roll your own version control system to support your really amazing new app architecture that stores lots of logic in database functions, what do I know? ¯\_(ツ)_/¯
Same here.
Earlier this year I consulted for a company evaluating whether to modernize their current ERP or buy a new one. What I found was a large clusterfun of an application inside an MSSQL database, with Microsoft Access frontends and some glue code moulded into a .NET application on Windows 10 in between, accessing third-party APIs.
Barely anything documented, thousands of stored procedures. When I asked how they roll out changes and perform tests, I got this answer: “As few deployments as possible. Every time we deployed new features, there were always issues for a couple of days.”
That's an application-centric view of development, as indicated by the phrase "the application". If an application is successful, its data will likely end up being used by other new applications, and the data will survive the lifetime of the original application that generated the data. The applications themselves will be rewritten many times as technologies evolve. Think mainframe to client/server to EJB to lightweight J2EE to Spring Boot microservices.
If there is logic or metadata that is common across all those applications, there are a couple choices. You could duplicate that logic across all the applications, and rewrite that logic every time you do a rewrite of the application, or you could keep it one place.
If you keep it one place, one way to do that is to have a service in front of the database that every application that uses the data calls instead of hitting the database directly. That has some disadvantages in that it requires upfront design and planning, the service in front of the database will likely be rewritten in new technologies over time, and it has performance implications. And for the developer, instead of having to deal with logic in an application and a database, now they have to deal with logic in two applications.
Another way to do it is to have common logic for the data implemented in key constraints, check constraints, triggers, stored procedures and other similar tools in the database so that any application that uses the data doesn't need to rewrite that logic, and can't intentionally or unintentionally violate the rules of the common domain. That does have its own disadvantages, and it makes things more complicated for developers who will need to be familiar with an additional set of technologies, but it is a valid use case for stored procedures.
I was once dead set against stored procs (having been previously in favour of stored procs) for the same don’t-split-business-logic reasons. However I now have a more nuanced approach - stored procs shouldn’t have business logic, but can be used to insulate sensible code data models from whatever random bozo (like me last year, or some legacy decision from 15 years ago) decided how the database schema should be.
The other way I’ve seen data-transport stored procs used is to allow multiple code versions to work on a single database - where you need to support version N and N+1 on the same data (I once worked for a SaaS company that provided a Preview version on live data before going to prod). Before version N+1 code goes live, deploy the N+1 database schema, updated procs, new data, etc., and put in place versioned stored procs which let a version N application run on a version N+1 database, setting default values for new fields, etc. It did require some extra thought to make it work smoothly, but it very rarely caused problems for customers.
Having worked as a DBA in numerous large financial institutions, and also as a SWE in a lot of node/python/MEAN-style startups, I can assure you that there are some very significant trade-offs made with stored procs (or any business logic that is compiled into your database). It takes your one codebase and turns it into at least two separate codebases. If you have a monolith, now you’ve got a distributed system. Any stored proc codebase is guaranteed over time to become a maze of complexity and footguns. The more db features you use in this area, the worse it gets. If you’re using stored procs, then somebody is eventually going to create a table trigger. If a table trigger exists, then somebody will eventually write to that table not knowing it’s there.
The complexity it adds to release management alone makes it not worthwhile in a lot of situations.
It’s like doing full blown OOP design. Your codebase has to pass a certain (rather large) size before the complexity trade offs can even begin to make sense, and even then they often don’t.
>> Any stored proc codebase is guaranteed over time to become a maze of complexity and footguns.
I think this is a very valuable and important point and wanted to highlight it.
>> Any stored proc codebase is guaranteed over time to become a maze of complexity and footguns.
There's no absolute law that states that SP using environments will always end up as a mess, but there should be such a law, because they do!
I've worked across dozens to hundreds of enterprise environments, and the pattern is clear.
I suppose it comes from personality and culture, with the people working on db coding more often at the 'get it done' end of the spectrum, versus programmers, where you find a significant number interested in 'get it right'.
An even stronger reason is probably that db coders are more likely to be autonomous instead of working in large, centrally driven teams. That central hierarchical organisation is a necessary but not sufficient condition for taming the entropy of large complex systems, without which they evolve into "a maze of complexity and footguns".
It's certainly a valid observation that anything that large enterprise does, will end up being more complex than it necessarily needs to be.
But if I look at a lot of the SP anti-patterns I've seen over the years, some common themes start to arise. I'd say the biggest issue is that it is more complex to maintain interfaces between two discrete systems than it is to write monolithic code paths. Seeing SP_ProcedureName accompanied by SP_ProcedureName_New is very common. It's easy to get into that situation, and tedious to get out of it.
Storing business logic in the DB also presents a lot of very tempting, but ultimately costly, shortcuts to developers. Using a non-temporary table when you should have used a temporary table, because you wanted debugging to be easier, but now you've got a concurrency issue, but it'll probably be OK, so you leave it there for 15 years. I have also never seen what I would consider to be a sensible use of a trigger. You also end up with SPs invoking SPs invoking SPs... and figuring out the side effects of calling something becomes a lot more complicated than it would be in a monolith.
I don't think they're always bad, but I do think that reasonable use cases for them are rather uncommon. When you had a lot of services directly consuming one database, SP interfaces used to make a lot more sense. But that architecture is not very common any more. Even in large enterprises, you'll typically see new projects being developed as web applications, even if they're only served over the corporate network.
yeah, I don't think this is a problem with sprocs, per se.
I think this is a problem because the database code is treated differently from the application code, because so few coders understand SQL well enough to be comfortable in it.
So, as you say, the DBAs tend to be autonomous, and held to different standards. And with less ability to say "we have tech debt, we need to refactor our sprocs" because of the knock-on effect on the application code.
It's a management problem rather than a technical certainty, I think.
If it happens consistently across "dozens to hundreds" of enterprise environments as stated by the GP and with apparently zero exceptions in which they remained simple then, while maybe not a technical certainty, it should be treated as a practical certainty.
Pointing out that it's probably caused by management is not really very useful since there are so few means available for line workers to fix their managers.
This is true, and I agree, but mis-attributing the problem is still not good.
After all, if (for instance) a new IDE plugin appeared that managed sprocs consistently with code and solved this problem, then we'd still face "sprocs are bad" criticism because the problem was mis-attributed.
> If you have a monolith, now you’ve got a distributed system.
You already did, you just chose to ignore that the DB is actually a separate service. Ignoring this has meant that I've spent a not-insignificant amount of time in my last few jobs cleaning up bad ORM code because the queries performed terribly. I've cleaned such messes as pulling in all rows and then looping/filtering on them client-side, just generally querying for too many things/too many columns, querying for data and then looping over it to do more queries (instead of figuring out how to do it in a join), etc. When people treat the database as if its part of your monolith, because all the query logic is locally part of your codebase in your ORM, there's a temptation to be lazy and treat it that way. And it works fine in development because the database is small with only test data, but can be painfully slow in production.
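The cleanup usually amounts to replacing the client-side loop with one join; a sketch with database/sql, schema invented:

    package report

    import (
        "context"
        "database/sql"
    )

    // Instead of fetching users and then issuing one orders query per user
    // (N+1 round trips that only hurt at production scale), push the loop
    // into the database as a join and make one round trip.
    func OrderCounts(ctx context.Context, db *sql.DB) (map[int64]int, error) {
        rows, err := db.QueryContext(ctx, `
            SELECT u.id, COUNT(o.id)
            FROM users u
            LEFT JOIN orders o ON o.user_id = u.id
            GROUP BY u.id`)
        if err != nil {
            return nil, err
        }
        defer rows.Close()

        counts := make(map[int64]int)
        for rows.Next() {
            var id int64
            var n int
            if err := rows.Scan(&id, &n); err != nil {
                return nil, err
            }
            counts[id] = n
        }
        return counts, rows.Err()
    }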
I'm sure plenty of people do it properly, even with an ORM, but I've had to clean up the mess too many times now, across multiple companies and dev teams.
I'm not advocating for putting all of this logic into stored procedures; I think that's going too far in the other direction. Only that you already have a distributed system, if you have a database, where the boundaries do matter.
Not to nearly the same degree. Start transaction > mutate data > end transaction is not nearly as distributed as invoke SP > let the database do something (?...) > wait until it's done. The first is a distributed system in the same way that an application writing to a filesystem is a distributed system.
I'd also point out that ORM is not the only alternative to stored procedures. In addition to that, there's nothing unique about ORMs that makes them susceptible to bad code, or poor SQL understanding. Somebody who doesn't understand data structures and how the cost-based optimizer works is going to write equally poor stored procedures. Removing the ORM is not the solution to your engineer not knowing how RDBMSes work. ORMs also don't suit every use case, but again, having a bias against ORMs is not the solution to failing to understand how your use case fits in with your RDBMS.
> I'd also point out that ORM is not the only alternative to stored procedures.
Absolutely. I personally use HugSQL[1], so no ORM, but I also don't use stored procedures. I guess my main point is that the boundary between application and database matters, and not putting logic in the database doesn't make that boundary go away (putting logic in the database doesn't make it go away either). But I take your point, it's not to the same degree.
> there's nothing unique about ORMs that makes them susceptible to bad code, or poor SQL understanding.
I think that there is: it makes database code look like client-side application code, so there's some temptation to just blur the boundaries and mix them together. This is how you end up with code that queries for data, then does looping and filtering outside of the query, then queries some more. Sure, you can write equally bad code in stored procedures, but the boundaries are more obvious. At the very least, it's likely a different programming language from the rest of your application.
> Removing the ORM is not the solution to your engineer not knowing how RDBMSes work.
Sure; however, in practice, at least in my experience across multiple teams and companies, education hasn't been a reliable solution. People still abuse the ORM and treat the database as if it's magic. Removing the ORM doesn't automatically fix that, of course, but it does force people to consider the database as a separate service whose access crosses a boundary. In an ideal world, everyone already keeps this in mind and ORMs can be used as they were intended, but, at least in my experience, this just hasn't been reality. Have I just been unlucky?
If different people and teams keep making the same mistakes with the tools, maybe it's time to re-evaluate the tools and what they encourage people to do. It's not that the tools are bad in and of themselves, but that they encourage bad usage. (EDIT: I just saw another article on the front page, and the title seems apt: "Discipline Doesn’t Scale")
Your typical micro-service architecture will run into a lot of the same types of problems. The cost of maintaining those inter-service dependencies easily gets quite high, so if you're going to commit to that, you really should first understand whether you're going to be deriving any benefit from it.
I have been working with Java for the past 20 years.
I have noticed that over the years, knowledge of SQL has eroded a lot among Java developers. Not only that, but application design seems to be lacking too. Where problems could easily be solved with a little bit of efficient SQL, people mindlessly accept huge performance losses due to the ORM, as if they did not see any other possibility.
I've noticed this about a few things. Regular expressions are another: they have this bad reputation as "write-only, magic one-liner hacks" that should absolutely be avoided. The proposed alternative is usually dozens of lines of manual string operations in a loop. It feels like it doesn't enter the minds of people in some companies I've worked in that regular expressions are a separate language, one that you should learn.
And layers of abstraction make it worse. Devs don't know SQL because of Hibernate; now new devs rely on the "auto-magic" of Spring Data JPA and don't know Hibernate...
We do this in work and it's not great in my opinion.
There can be issues with package versions compiled into the db. It makes continuous integration harder. We do automated builds and testing in the application layer, but we don't have anything in the db layer. It's harder to create a new environment (for a developer who does not have full db permissions). Business logic ends up in the stored procedure layer AND the application layer.
Including a real DB in continuous integration is a must for any development, IMO. (And these days it's usually very easy to throw up a database using Docker -- each of our full CI runs finishes in 2 minutes and does hundreds of "create database" calls, populating real SQL databases from scratch.)
For instance:
1) You want to implement idempotency in your backend by using uniqueness constraints in your DB
2) That's important to test
3) That's not really tested unless you include a real DB
After including a real DB in our CI setup I will never move back.
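For example, the uniqueness-based idempotency from point 1 above can be tested directly; a sketch where the payments table, the env var, and the choice of driver are all assumptions:

    package store_test

    import (
        "database/sql"
        "os"
        "testing"

        _ "github.com/lib/pq" // Postgres driver; any real driver works
    )

    // Verifies the UNIQUE constraint on payments.request_id actually enforces
    // idempotency -- exactly the behaviour a mock or a different engine might
    // not reproduce. Assumes CI exports TEST_DATABASE_URL for a throwaway DB.
    func TestIdempotentInsert(t *testing.T) {
        db, err := sql.Open("postgres", os.Getenv("TEST_DATABASE_URL"))
        if err != nil {
            t.Fatal(err)
        }
        defer db.Close()

        const q = `INSERT INTO payments (request_id, amount) VALUES ($1, $2)`
        if _, err := db.Exec(q, "req-123", 100); err != nil {
            t.Fatal(err)
        }
        // The retry must fail; the constraint itself is what's under test.
        if _, err := db.Exec(q, "req-123", 100); err == nil {
            t.Fatal("expected unique violation on duplicate request_id")
        }
    }

(If the backend instead swallows retries with ON CONFLICT DO NOTHING, the assertion flips, but the point stands: only a real DB exercises the constraint.)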
> for a developer who does not have full db permissions
Are you using a commercial database (SQL Server, Oracle, or similar)? With open source databases this is a non-issue, as developers can run a complete instance locally with full permissions, completely isolated from prod/staging servers.
> but by using stored procedures in your database.
This used to be the standard when clients used to connect directly to the database. Now that the world depends on web services, things are a little different and there's less incentive to maintain stored procs as an interface to your database.
It's still a good idea though, with unparalleled performance. I suppose developers don't do it for a variety of reasons: They may lack the SQL knowledge, they may not want to maintain the extra code, they may fantasize about database portability, etc.
Well, as someone with 25 years of SQL experience, I will generally still use LINQ for 90% of db access code in business-style applications, for very important reasons beyond those mentioned, like:
- They may realise value in having a compiler provide guarantees, as a bonus immediately in a tight feedback loop while editing code, as well as on compilation
- They may favour expressive languages with robust error handling, integrated IDE and VC, etc.
- They may want to reduce the number of moving parts
These are all along the themes of a) delivering results faster, and b) improving the maintenance lifecycle of systems - unlike those parts of the industry that don't really plan to maintain their systems and instead just wanna do rewrites in the latest hotness which is usually a hot mess from a long term maintainability perspective.
Generally, use of LINQ in the context of DB access is done hand in hand with an ORM, such as Entity Framework, NHibernate etc, and this is how I meant it.
I'm not a fan of stored procedures because: 1) it's more maintainable to keep all the logic unified in one language and system with which developers are more familiar; 2) the programming tools are normally much better for app code; 3) you can write unit tests for app code (does anybody write unit tests for SQL?); 4) you can usually avoid transferring too much unnecessary data or doing too many unnecessary network requests. Remember to avoid premature performance optimization.
On the other hand, one of the highly specialized stored procedures I wrote in PostgreSQL has been successfully and flawlessly re-used/tested by multiple off-the-shelf client apps (all BI-related, developed in different languages), simply because they all support SQL.
I agree this only applies to data-model-related functions. But it has proven really low-maintenance, and improvements to this "API" are automatically available in all clients.
I'd bet that accessing this stored procedure through an ORM's raw SQL query would provide more benefit than rewriting it from scratch in the ORM's language.
PS: Also, yeah, people do write unit tests for SQL (never enough, but that's another story).
https://pgtap.org/
1) One language and one system: SQL on a database that will last much more than the applications
2) SQL has a limited scope, and so do SQL tools. You need advanced tools if you have "app code": it's a necessary evil.
3) At worst, you can write unit tests for app code that calls SQL queries and stored procedures.
4) "Usually" isn't enough, moving logic to the client requires moving data to the client.
The argument against it I usually see is "you can't version control SQL!"
Where I work we have a custom system which manages releasing immutable SQL snapshots between environments and they get merged to master once in prod. The only thing needed is a process, and then version control is easy.
Editing stored procedures and deploying them to a database is the same as editing code and deploying it to a server. There is no difficulty in versioning it.
Yeah, you can just use a migration system and store your functions in text files checked into version control. Nothing but migrations is allowed in staging and production, so everyone is forced to use it.
As a front-end dev, I've been interested in learning about such a process but don't know where to look. What kind of tooling would you use to manage your migrations in a CI environment?
I'm not sure about other DBs, but SQL Server has (really good) tools for this (DACPACs). As an added bonus you can then use tSQLt to unit test the SQL.
On high volume applications I would avoid it since it's usually easier to horizontally scale web servers than sql servers, and it makes cache strategies more difficult depending on how and what you're querying.
It is a common misconception that more data processing in SQL puts a higher load on the database. A typical database spends 96% of its CPU time on logging, locking, latching and marshalling [1][2] rather than processing SQL. By sending less data to the middle tier and performing fewer round trips, the use of stored procedures means that the database can actually do more real work.
It depends. Regardless there is limited CPU, and so any scenario in which a stored procedure uses more CPU than a simple query will cause you to hit that saturation point sooner.
I generally have layers of caching on top of the sql server so the majority of queries will be integer equivalency or range checks, if not get by primary key queries. So I am not generally operating in a scenario where a stored procedure would reduce record scans, etc.
I also generally don't use transactions outside of limited scenarios, since throughput is usually more important to me than data consistency.
Imagine, for example, that you have per-record caches. You can run a query to get IDs without joining to the table, and then simply fill in any gaps in your cache with a follow-up query.
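That gap-filling pattern, sketched in Go (the cache shape and schema are invented):

    package catalog

    import (
        "context"
        "database/sql"
        "fmt"
        "strings"
    )

    type Product struct {
        ID    int64
        Name  string
        Price int64
    }

    // FetchProducts serves hits from the per-record cache and fills only the
    // gaps with a single follow-up query by primary key.
    func FetchProducts(ctx context.Context, db *sql.DB, cache map[int64]Product, ids []int64) ([]Product, error) {
        var missing []int64
        for _, id := range ids {
            if _, ok := cache[id]; !ok {
                missing = append(missing, id)
            }
        }

        if len(missing) > 0 {
            // Build $1,$2,... placeholders for the gap IDs only.
            ph := make([]string, len(missing))
            args := make([]any, len(missing))
            for i, id := range missing {
                ph[i] = fmt.Sprintf("$%d", i+1)
                args[i] = id
            }
            rows, err := db.QueryContext(ctx,
                `SELECT id, name, price FROM products WHERE id IN (`+strings.Join(ph, ",")+`)`,
                args...)
            if err != nil {
                return nil, err
            }
            defer rows.Close()
            for rows.Next() {
                var p Product
                if err := rows.Scan(&p.ID, &p.Name, &p.Price); err != nil {
                    return nil, err
                }
                cache[p.ID] = p
            }
            if err := rows.Err(); err != nil {
                return nil, err
            }
        }

        out := make([]Product, 0, len(ids))
        for _, id := range ids {
            if p, ok := cache[id]; ok {
                out = append(out, p)
            }
        }
        return out, nil
    }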
In most cases, it is sufficient if your application runs against the real DB in production and SQLite for local development. If the data is too big to run on the dev computer, then it should use one DB across the board.
Portability across databases rarely buys you anything but hard to reproduce bugs.
Note that there is some work for v2 to allow users to create JOIN queries, but without JOIN support it's effectively unusable for anything other than very small databases.
wait, what? doesn't support joins? Things like this make me doubt my fundamentals. 'Are joins really necessary?' is the question now running through my mind ;-)
It does not support joins. The expectation is that you "Preload" your data, which is a fancy way of saying load it all into memory. The performance of this is trash, so don't use that ORM; there are a few better ORMs for Go mentioned in the issue.
Yes, joins really are necessary in SQL; they are a foundational part of using any relational database.
Yes, they're literally the second thing one learns about SQL, really. I'm not being too serious about rethinking whether they're necessary, to be clear!
What a strange design decision though. I don't really understand why it does this. Why would anyone think it's a good idea?
Thankfully my team is making the switch from GORM to vanilla SQL. I joined the company a little over a year ago on another team that also uses it, and GORM turned out to be a real pain. It was the first time I had to use an ORM, and I really disliked the experience; it was a constant source of issues.
You're sadly fighting against the "but... but... it's how people DO it" mentality here.
Not to mention, as computing expands, the average capability of people is dropping. Seats need to be filled, and those seats don't all need to be high quality devs.
So ORMs are there to reduce the workload, in terms of what needs to be learned to be productive. Database tuning is becoming a lost art.
(For devs not getting this: on a largish DB, I can literally write a single query which can take 10 seconds in MySQL's console, or hours. Yes, hours.
This isn't a flaw, any more than doing a tree search wrong is a flaw in <insert language here> when it is slow.
And most ORMs don't come close to properly handling all edge cases here.)
Beyond all of that, there is another bonus to using stored procedures or functions: security.
There are many tables I want to give read only access to, yet want to allow writing in very limited scenarios. A stored procedure can have different user permissions (at least under most DBMS), which means you can:
- make a table read only for a web application user
- then, allow a stored procedure to perform INSERT/UPDATES, via a different user
- however, let the web user execute the stored procedure
The stored procedure thereby limits the INSERT/UPDATE to a very specific action.
Mostly, I've used this when dealing with tables/databases which have a financial component. Said stored procedures can even perform additional checks on other tables, whatever is required to ensure that the update == OK.
Beyond that, it also allows for tiered security, in case you have someone in house, maybe a new dev, who decides they'd like to modify that financial table a bit.
And beyond that, it also allows logging to be done at the DB level.
While you are more limited here, in that you really do not want to be burdening the DB with more writes, you can definitely log all modifications to said financial table.
(You could use triggers for this last bit too, of course, but still...)
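In Postgres terms, that tiered setup looks roughly like this (names invented; SQL Server has equivalents such as EXECUTE AS):

    package migrations

    // Hypothetical sketch of the read-only-table-with-one-write-path pattern.
    // SECURITY DEFINER makes the function run with its owner's privileges, so
    // web_user can modify ledger only through this single vetted path.
    const ledgerSecurity = `
    REVOKE INSERT, UPDATE, DELETE ON ledger FROM web_user;
    GRANT SELECT ON ledger TO web_user;

    CREATE OR REPLACE FUNCTION post_ledger_entry(_account bigint, _amount numeric)
    RETURNS void AS $$
    BEGIN
        IF _amount = 0 THEN
            RAISE EXCEPTION 'ledger.amount.zero';
        END IF;
        -- Additional checks against other tables would go here.
        INSERT INTO ledger (account_id, amount, posted_at)
        VALUES (_account, _amount, now());
    END;
    $$ LANGUAGE plpgsql SECURITY DEFINER;

    GRANT EXECUTE ON FUNCTION post_ledger_entry(bigint, numeric) TO web_user;`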
> Not to mention, as computing expands, the average capability of people is dropping. Seats need to be filled, and those seats don't all need to be high quality devs.
Also, a lot of businesses have analysts/data scientists who know SQL and have access to the DBs: stored procedures make business domain-level functionality more available to them, because they don’t have to guess what the application code is actually doing.
> ORMs are there to reduce the workload, in terms of what needs to be learned to be productive
The problem (which you seem to recognize but most seem to miss) is that the opposite happens: the ORM lets you shoot yourself in the foot quickly, but that's about it. In the end, you lose more time using an ORM than you actually gain.
> I'm a Go developer and I notice a lot of other Go developers instantly suggest things like GORM to noobs writing applications.
And there is nothing wrong with that. It's like recommending against the use of HTTP before people learn TCP/IP, or recommending against PHP/Apache just because people aren't really writing servers.
Learning how to write a web app and learning database management are 2 separate topics. Furthermore, GORM is rather barebones compared to Entity Framework or Hibernate, due to Go type limitations...
The thing is, RDBMS are set oriented. Manipulating sets of tuples is a very high level kind of thinking compared to manipulating individual objects and their relationships in a procedural manner.
Of course sometimes the task is to write load/edit/store logic for individual records and their relationships. There's nothing wrong with using an ORM for that.
But extending this sort of thinking to all data processing is extremely detrimental to both productivity and performance. If you're using an RDBMS you're well advised to apply a sets first approach to data handling.
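As a tiny illustration of sets-first versus row-at-a-time (schema invented): instead of loading every pending order and updating each row in an application loop, state the change once over the whole set and let the database apply it:

    package pricing

    import (
        "context"
        "database/sql"
    )

    // Row-at-a-time thinking would fetch all pending orders, loop, and issue
    // one UPDATE per row: N+1 statements with every row crossing the wire.
    // Sets-first: one statement over the whole set.
    func ApplyDiscount(ctx context.Context, db *sql.DB, pct int64) error {
        _, err := db.ExecContext(ctx, `
            UPDATE orders
            SET total = total * (100 - $1) / 100.0
            WHERE status = 'pending'`, pct)
        return err
    }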
That's the danger of ORMs. They encourage people to do the wrong thing.
They also add considerable complexity to a project. So after having written my own ORMs many years ago and using some of the popular ones I would stay away. The cases where they are the right tool for the job are just not worth the friction they create.
Not only teaching people to do the wrong thing, but also teaching that RDBMSs are scary, deep-level things that developers don't need to understand in order to do their job properly. Which isn't true.
There's a big difference between HTTP, Apache, and ORMs. People not understanding why they're different is what sinks companies all the time.
Using any abstraction is a gamble that it doesn't leak before you have the resources to fix it.
HTTP is a terrible comparison because in almost every domain you will either make millions or (far more likely) fail for other reasons before you need to understand how TCP/IP works.
I bet there's loads of companies that have died a slow death, releasing everything a month later than they want to, because the back-end developers aren't comfortable with whatever the stack of the day is and they were too busy learning yesterday's stack of the day instead of the intricacies of the language they're working with, or the standard library.
> It's like recommending against the use of HTTP before people learn TCI/IP...
This is a false dichotomy. SQL is not TCP/IP, because it's nearly on the same level as ORMs. Unless you normalize and understand your entities properly, it'll be difficult to write performant systems. ORMs are rightly called the Vietnam of computer science¹ for this reason.
This is the approach I like as well, with the caveat that the stored procedures themselves should be kept as dumb, and as late in the process, as possible.
I have made many mistakes in my career, in terms of systems that were either overly complicated or inefficient or both, and it has almost always boiled down to either trusting the SQL that an ORM produces, or putting too much of the business logic in hand-crafted stored procedures. I have done both, regretted both, and nowadays advocate for sticking to simple, predictable, and dumb stored procedures, so that you get the benefit of knowing/controlling exactly what will execute in the database while having things like validation occur code-side whenever possible.
In the real world, that simplification of stored procedures isn't always possible. But if it's your starting assumption, then you can fight for it until you are certain that there isn't a practical alternative to pushing the logic into the stored procedure. And if a compromise of these principles needs to happen, I would push complexity into a stored procedure before trusting the SQL produced by an ORM every time.
But where I fall on that spectrum may just be due to the severity and types of problems I have seen happen with each approach, so YMMV. And I wouldn't go so far as to call myself an expert, so YMMV on that front too. I am just a guy who has made a lot of mistakes. But if it'll help others, this is my takeaway.
Came here to say exactly this. Also a gopher, who is also vaguely horrified at all the "use GORM, and a framework" advice to newbie gophers.
I also use stored procedures, and treat them as function calls from my application code. So I effectively build a "data storage API" in the database, that I then call (as you say, like any other service) from my code.
I also have views for all the data sets that will be returned to the application, so I can use them easily as a type for function returns, and enforce consistency.
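A sketch of that arrangement, with an invented view and function:

    package store

    import (
        "context"
        "database/sql"
    )

    // Hypothetical: order_summary is a view maintained alongside the procs,
    // so its column list is the single definition of this result shape:
    //
    //   CREATE VIEW order_summary AS
    //     SELECT o.id, c.name AS customer, o.total
    //     FROM orders o JOIN customers c ON c.id = o.customer_id;
    //
    // and recent_orders() is declared as RETURNS SETOF order_summary.

    type OrderSummary struct {
        ID       int64
        Customer string
        Total    float64
    }

    func RecentOrders(ctx context.Context, db *sql.DB) ([]OrderSummary, error) {
        rows, err := db.QueryContext(ctx, `SELECT * FROM recent_orders()`)
        if err != nil {
            return nil, err
        }
        defer rows.Close()

        var out []OrderSummary
        for rows.Next() {
            var s OrderSummary
            if err := rows.Scan(&s.ID, &s.Customer, &s.Total); err != nil {
                return nil, err
            }
            out = append(out, s)
        }
        return out, rows.Err()
    }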
I'm trying Gorm right now, and I'm running into some serious issues where it's not as straightforward as an ORM should be (mostly having to do with transparently saving / updating relationships).
I'm reluctant to go to straight SQL because then I have to:
- Write a ton of 'mapping' code between resultsets and data model
- Do a lot of diffing to determine which relations to add, update or remove
- Manage my own database migrations - challenging because I'm stuck with using sqlite
- etc
I want my tools to make things easier for me, but gorm seems to be the "now you have two problems" direction - for now. Maybe I just don't get it yet. Maybe I just don't get databases anymore either.
Depending on which DB you're using, I'd suggest trying out https://sqlc.dev instead of GORM if you're on PG. MySQL support was a bit flaky last time I checked.
> You could even go as far as to unit test just your SQL code if you wanted, something you can't do too easily when it's in your application.
How do you unit test SQL code? You could integration test against something like sqlite but unit testing stored procs sounds very difficult.
I worked with a big C# .NET Core ecommerce platform that had a big stored proc used for ingest. It was an absolute pain in the ass to debug and fix issues, because testing it was so difficult. Eventually we replaced it with C# code. Access to full software dev tools (debugger, automated tests) is really something you don't want to be without on large projects...
Things like search interfaces might offer filters over multiple columns... your choices here:
1. Use an ORM
2. Write SQL for each permutation of filter
3. Write a helper that will compose the SQL based on the filters needed
And that 3rd one is the only time I stray from plain SQL towards something that looks like a lightweight abstraction over the SQL.
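That third option can stay tiny; a sketch, with the filter fields invented:

    package search

    import (
        "fmt"
        "strings"
    )

    type Filters struct {
        Status   string
        Assignee string
        Since    string
    }

    // BuildQuery composes WHERE clauses for whichever filters are set, keeping
    // placeholders numbered so the result is still plain, parameterised SQL.
    func BuildQuery(f Filters) (string, []any) {
        q := `SELECT id, title, status FROM tickets`
        var where []string
        var args []any

        add := func(clause, value string) {
            if value == "" {
                return
            }
            args = append(args, value)
            where = append(where, fmt.Sprintf(clause, len(args)))
        }
        add("status = $%d", f.Status)
        add("assignee = $%d", f.Assignee)
        add("created_at >= $%d", f.Since)

        if len(where) > 0 {
            q += " WHERE " + strings.Join(where, " AND ")
        }
        return q, args
    }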
Similarly... stored procs: I avoid these, but there are times when they are necessary. If you need the boundary of a transaction to span multiple SQL calls, then it is safer to do so within a stored proc. Yes, you can use a transaction and rollback, but from a maintenance perspective I've seen refactors break transactions that were not fully encapsulated in a single Go (insert other language) function.
Yes, keep things simple with SQL and do everything in the application... but also break those rules rather than doing something daft or with side effects.
Your view is the original design idea behind stored procedures.
Stored procedures as far as I know were first implemented in Sybase SQL Server. We tended to think of them as enabling RPC (remote procedure call) access to the DBMS. Sybase was also the first major relational database to offer client access through the network. Our connectivity was influenced by distributed system concepts like RPCs that were top-of-mind in the 1980s.
I've recently come up with the following approach: make it easy to generate CRUD applications without ORMs, and then modify them any way you like without extra dependencies.
I salute your level of abstraction, but I personally draw my line in the sand a bit higher. A library project for the data access layer, with a stable app-facing code API, gives me the best bang for my buck in terms of flexibility and abstraction.
That said, it's easy for me to push out new code versions. I realize it doesn't work for everyone.
>> I'm a Go developer and I notice a lot of other Go developers instantly suggest things like GORM to noobs writing applications
I've made the opposite observation. Compared to Python, C#, Java and NodeJS, ORMs are actually not a big thing in Go. Which is good - I agree with you.
Pageable also adds paging (limit, offset). If you are going to compare, please be courteous and do it right.
When the app starts up, Java compiles these declarations and will abort if a declared function does not match the entities. This is a huge safety net which you would not have when you write SQL. Also, when you write SQL you have to take care to unpack the "select *" into your fields, etc. I suspect this is a much more important downside than you might realize.
How do you get results of your query to SomeObject then?
I think it depends on the use case, as always. I use Hibernate for our platform, which is used by at most 10-20 people at once. There the ORM is a huge help, because I can just easily view and edit my model and all its relations; I accept the performance hit in this case, and if something is too slow, I rewrite it in SQL.
Then we have an API that uses the platform's data, and there I drop down to plain handcrafted SQL with a little help from JDBI, and so far it has been working great and responses are fast.
So I think it is just about stepping back a bit to think and not to blindly accept either side of the argument. Each use case is different.
That's not Hibernate, that's Spring Data or some such, which is indeed cursed. This bit of Spring was absolutely designed by and for people with no critical thinking ability.
Meanwhile, JPA, the standard which Hibernate implements, has a textual query language [1] which is basically SQL but a little more objecty. For example, you can express joins through properties concisely:
    SELECT c.capital.name FROM Country AS c WHERE c.name = :name
And use constructor expressions [2] to map results into Java objects:
    SELECT NEW example.CountryAndCapital(c.name, c.capital.name)
    FROM Country AS c
And nothing is going to stop you from doing this: you can pack your query (native SQL, JPQL, or HQL) into a @Query annotation and be done. The good thing is that it will do the mapping for you.
The advantage of using findEntitiesByIdAndCreatedAtAfterOrderByCreatedAtDesc is that you get paging support for free (which is not trivial and has to be done properly for each database engine; the simple approach using limit is typically very inefficient).
Also, most of the queries will be something simple, fetching some entity by a business identifier or primary key; not being forced to write down all those simple queries is a productivity boost.
For more complicated queries you can fallback to writing query manually.
There is also maturing Spring Data JDBC project that kind of has "best of both worlds".
Debatable. Method syntax vs query syntax is available. I mostly see method syntax in my codebase. I also see terribly performing linq queries all over the place.
I think it's neither good nor bad, it's just a tool. Tools are useful for making things, but if wielded improperly they will hurt you.
I'm sure the same arguments can be made against raw SQL. It's possible to have good codebases in any paradigm, and bad codebases in any paradigm. It's really all down to the quality of the implementation.
As a side note, if you want a creative way to back up a database, just try logging an EntityFramework-tracked object in Serilog with the destructuring operator. Watch as the logging framework inadvertently traverses the entire object graph of navigation properties, endlessly making infinite SQL queries, while your web server slowly runs out of RAM. Fun times!
You can provide the query; Hibernate/JPA does not stop you from doing that. Autogenerated queries for simple CRUD ops, with queries supplied by the developer for more complex logic, is the ideal.
Everyone here is arguing for their favorite side. As often is the case, the best solution is often somewhere in the middle, using the best tools for the job.
I authored a complete rewrite of an ancient and rotting PHP+MySQL web ticket reservation app in Python+SQLAlchemy+PostgreSQL. I use an ORM - except where it doesn't make sense because a SQL query expresses what I need to do more concisely and effectively. I don't use stored procedures - except where I do because I need one specific atomic DB operation to be performant and not bottlenecked on the app. I use relational storage - except almost every table in the database has a big JSON column for everything I don't need to ever join, filter on, or index in production codepaths (though I can still do that with PG's native json support, which is great for the rare case I have to move something to a real column). And I use triggers, stored procedures, and notifies to implement live change notifications for a table, that eventually get fed via WebSockets server to users.
This hybrid approach has served me extremely well, resulting in very readable and maintainable code, minimal DB schema migration pain (most upgrades only touch JSON fields and thus require no migration), and much better performance than the old app, especially in that hot path using the SP, while keeping table column bloat down, and avoiding the join spam that results from keeping everything religiously normalized even in cases where that doesn't buy you anything.
Of course, that does mean you need to know all the relevant technologies involved, SQL, ORMs, etc.
YMMV, but consider that if you think a single solution is the right solution in all cases, you're most likely wrong.
> YMMV, but consider that if you think a single solution is the right solution in all cases, you're most likely wrong.
This is sort of the meta-principle that I apply to all of software development: no principle or methodology (DRY, YAGNI, SOLID, TDD, etc...) applies 100% of the time (except for this principle. ;))
I suggest adding `details_schema JSONB` and validating `details` against it. Now you can still add fields to `details` without deploying code changes, but every dev in the company won't throw random garbage in there – or expose that field as raw JSON in an admin panel somewhere.
The whole point of having a JSON type is to do away with needing a fixed schema, no? If you need to constrain the format of the JSON object, why not just represent it with more DB-friendly types in the first place?
In an enterprise production environment, it's much easier to update a JSON schema, which is stored as data, than to alter the DB schema. The point of having a JSON field isn't to let everyone with write access to the API stick whatever they want in there; it's to let the administrators of the service change what it can store without engineers deploying code changes and DB migrations.
The thing I struggle with is knowing whether the solution I designed was biased towards SQL (because that's what I learned years ago) and therefore would be better suited for SQL. Or maybe if I learned more about ORMs I could craft a solution that isn't SQL-biased. Or maybe SQL "just works" for most of what we need to do with applications right now, and that's where the SQL bias comes in?
I do not agree, and this is coming from someone who loves SQL. I have only experienced the problems this article describes with people who do not know how the ORM works.
You can use an ORM and it works flawlessly for most cases. There might be some cases where I need to write SQL for a custom function or some weird edge case, but that is why the ORM gives you the possibility to write your own SQL if you want.
I forgot to mention that having to write raw SQL queries and maintain them, doing migrations and keeping up with the changes, is kind of a pain when most of the time the ORM takes care of everything.
ORMs are good for quickly building something. But if you're building something that is expected to last and that will be passed on to other developers... ORMs will likely derail or put severe limitations on your application.
Disagree completely. It depends, very much, on context.
There are contexts where good knowledge of the ORM, and of SQL, is part of the bar for working in the environment. This often happens in stacks where there aren't a zillion different choices but rather curated default choices for things like ORMs.
Combined with a large and sophisticated system (or many similar small ones), being able to use a good ORM (and LINQ) provides significant savings in time and complexity.
This helps the systems to last, and be passed on to other developers.
Counterpoint: a lot of Ruby devs are very familiar and capable with ActiveRecord. A lot of Python devs are familiar and capable with the Django ORM. Why are these ORMs more likely to limit your application? You provide no proof or explanation for your statement.
I second this. For example, Django's ORM is actually very advanced these days, allowing you to squeeze out the performance in many, many cases. While I agree that ORMs should not always be used (sometimes you just need raw SQL, especially on very large databases), I think the ORM hate is a bit overstated on HN.
The Django ORM is pretty good, and I like it. But, to squeeze out performance, wouldn't you say you have to actually understand what is going on in the database?
Agree! I like Django's ORM and I love SQL databases. It is nice that your in-code model changes are reflected as schema changes in the database. And migrations are a blessing. (Of course it has some foot guns.)
For optimizations:
my_queryset.explain(verbose=True, analyze=True) is very nice for understanding what is going on. If you cannot get it done with Django's database abstractions, there is always:
my_queryset.extra() or even my_model.objects.raw(), where the last one maps the result to the model for free.
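To make those escape hatches concrete, a quick sketch (Book is a hypothetical model here; explain() with these options needs a backend that supports them, like Postgres):

    # Inspect the plan the database will use for the generated SQL.
    qs = Book.objects.filter(published=True)
    print(qs.explain(verbose=True, analyze=True))

    # Raw SQL that still maps each row onto a Book instance for free.
    books = Book.objects.raw("SELECT * FROM myapp_book WHERE published = true")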
To others responding that ORMs are bad:
IMHO saying that raw SQL is better than ORMs is short-sighted. Many simple things are nice in ORMs. Queries are often simple. Today, in Django, complex queries are also nice. For edge cases raw SQL can be better.
Extending queries is also super easy with a good ORM, whereas with raw SQL you would have to maintain two almost-identical queries.
Just remember to run and inspect queryset.explain().
Fully agree with this. These discussions are always this or the other, but you can easily have both in the same project. Django's ORM and raw SQL functions prove that.
And more importantly, you'd want to design your database to be optimal for complex SQL queries, not whatever generic structure matches your programming objects.
The only limiting thing I've experienced with ActiveRecord is when the database becomes larger and more complex: the performance takes a bit of a hit. Which isn't ActiveRecord's fault; it's the developer's lack of understanding of how the database works.
Most of the issues I see come from a lack of understanding of things like N+1 queries. I see these mistakes from junior to senior level. ActiveRecord has built-in solutions to fix them too; some developers just don't know about them.
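For readers from other stacks, here is the Django spelling of those built-in fixes (ActiveRecord's #includes plays the same role; Book/Author are hypothetical models with related_name="books"):

    # N+1: one query for the books, then one more per book.author access.
    for book in Book.objects.all():
        print(book.author.name)

    # Fix for to-one relations: a single JOINed query.
    for book in Book.objects.select_related("author"):
        print(book.author.name)

    # Fix for to-many relations: one extra batched query, cached on the objects.
    for author in Author.objects.prefetch_related("books"):
        print(author.name, len(author.books.all()))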
There are built-in solutions but also quite severe limitations when you get to advanced usage of activerecord. Especially with regards to composability. Of course, raw SQL is also really bad in that scenario.
Working on a Ruby application that uses a database that predates the application (it was originally written in another language), there are simply some things that can't be written in AR, or when they are, the performance just isn't there.
Additionally, we often have to create ad-hoc queries to answer questions from stakeholders. It's easy enough to just pop up DataGrip, write the query, and copy the tabular results into a doc in Google Sheets for sharing.
I think the best approach is ORM-first, not ORM-only.
An alternative viewpoint on that situation might be that the database is in need of some serious refactoring/migrating if the app can't deal with its structure. Trying to bend Rails into non-AR database usage is going to lead to tears sooner rather than later IMO. It's just so very integrated.
Having separate applications with (write) access to the same database is similarly going to end in trouble. Suddenly there are all sorts of problems when the database needs a migration and the applications are expecting different things from the same table. This is not really ORM-related btw.
That said, ActiveRecord is not perfect and sometimes a single handcoded SQL query can deliver a ton of value. Profile carefully.
> Trying to bend Rails into non-AR database usage is going to lead to tears sooner rather than later IMO. It's just so very integrated.
Is that really true? While I've barely used Rails in anger, and not at all recently, I was under the impression that all the major Ruby ORMs and high-level database libraries (e.g., DataMapper, Sequel) are heavily focussed on Rails usage, usually growing out of AR pain points like working with DBs in Rails that don't fit AR's opinionated patterns.
> Having separate applications with (write) access to the same database is similarly going to end in trouble.
It shouldn't, if the DB is designed for that, which usually means each application sees the DB exclusively through its own set of views (probably in a distinct schema) and keeps its filthy little hands off base tables. That's a best practice for multiple-application access to an RDBMS that is older than the Web.
I won't argue that an ideal situation is to stop all feature development and refactor into an ideal data model. That's a technical debt ROI discussion, the kind that we always have, and sometimes the business value just doesn't support the rewrite.
However, even in an ideal data model, I don't always agree that multiple applications accessing the application's database is bad. That's a matter of being thoughtful and communicative, instead of just wantonly running "rails g migration" every time you need to implement a feature (that's true of Monorails even). FWIW, we've never had any real issues with multiple applications accessing our single database.
The ORM actually might reduce the risk of derailing over time, as it will probably handle any database layer upgrade and will perform checks to prevent common issues, while your own SQL might not (think of utf8/utf8mb4 on MySQL 8, escaping newly reserved words, escaping special characters, ...).
Our views on this tend to be influenced by the sorts of data applications we've written in the past. I can imagine that people who've done a lot of corporate/enterprise development with large teams or straightforward requirements would see immense value in an ORM. I'm sure that doesn't even begin to describe the breadth of projects where an ORM is a good idea. I've never built applications in this space so I can't speak to it.
With most of the stuff I do, an ORM is a non-starter for my databases. The data would make no sense as objects and I tend to avoid using data objects in programming anyway. It's probably impractical or impossible to build what I build using one and even if you could it would be more difficult to write and performance would certainly be atrocious.
There are many reasons why ORMs work better than they used to. For one, people prefer simpler database schemas these days - and what used to be larger monolithic apps are often broken into microservices. Earlier, you'd have one big database - today you'll have many. And ORMs are great with simple queries and joins.
The language plays a big role in whether an ORM is actually useful or not. In the .Net world, ORMs work quite well because querying capabilities are integrated into the language. Queries and schema are verified at compile-time, and I wouldn't trade it for slightly better performance or control. On the other hand, with Python (or say Node), or with Java and Hibernate the wins are smaller.
Of course, there will be some queries which are just better written in plain SQL. If you're willing to accept that, ORMs are a good tool in your toolbox.
You can get some benefits you mention by using query builders (like jOOQ). They can get unwieldy at times (not part of the language), but you can drop directly to SQL if you need to, and they tend to be simpler than ORMs in various use cases.
jOOQ needs to be mentioned in this context. With jOOQ you basically write Java-SQL-DSL, and with some code generation from your schema you have really neat type safety (and code completion).
String SQL queries are not the answer to ORMs, imho, we need a level in between. jOOQ provides a really cool set of features in that space.
Counterpoint: By migrating to microservices, you're moving the problem, not solving it. Sure, your individual services and tables are simpler, but when you need to combine the data, update it, etc you run into big issues that aren't trivial to solve.
EF (Entity Framework) exposes an IQueryable which you can query with LINQ, meaning you can use your ORM's entity model and query your database in a type-safe, compile-time-checked manner.
The last bit of the article reminds me of an experience I had recently working w/ another group of developers. They write in C# w/ Entity Framework, but they just use it to map stored procedures to C# data-types. All queries to their database go through a stored procedure w/o exception.
They push as much of the business logic as possible to the database. Their reasoning being that if the client insists on a separation between the DB & application servers: you should do as much computation as close to the data as possible. Then just send the end result over the wire.
Due to my own ORM-induced brain damage I found it hard to wrap my head around this at first: a data type in the application no longer represented a table, but the result of a query. Once you realize the database is just another API though it clicks really nicely into your architecture.
I think I still prefer things like Linq, jOOq, Arel, Ecto, etc. where you can write the query in your programming language and have it translated to SQL. It's just nice to see your query right next to the code when debugging. The author is absolutely right though you still have to know SQL to use tools like that effectively, so you might as well just learn it early instead of wasting effort learning the quirks of a specific ORM.
Most programmers have problems with SQL because it is a declarative language and doesn't fit into the C or Java mindset very well. However, if you just spend some time getting used to declarative languages, you will find SQL is quite nice: you just ask for the stuff you are looking for and the computer gives it to you. And SQL servers like Postgres have gotten so optimised at returning search results that they really outperform MapReduce.
Yup that has been my experience as well - if you read someone's stored procedure it's pretty easy to tell if they're a "relational thinker" or an "imperative thinker." People who embrace an RDBMS will tend to use constructs like CTEs, windows/partitions, set operations, they'll do their conditionals in the projection (i.e w/ CASE statements), they might use a CROSS APPLY to do a transform over a "collection", etc.
Whereas people who are "imperative thinkers," who just treat the database like a giant Excel spreadsheet, will use flow control for branches, cursors for iteration, tons of temporary variables, etc. Without fail the imperative thinkers wonder why they need monster CPUs for their DB server and why it's always pegging one core. Well of course the query planner can't optimize your code: you told it how to get the data, rather than asking for what data you wanted.
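The same divide is visible even from application code. A toy sketch of the two mindsets, with sqlite3 standing in for any RDBMS and an invented orders table:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL, discount REAL)")
    conn.executemany("INSERT INTO orders (total, discount) VALUES (?, ?)",
                     [(30.0, 0.0), (80.0, 0.0), (120.0, 0.0)])

    # Imperative thinking: pull the rows out, then issue one UPDATE per row.
    ids = [row[0] for row in conn.execute("SELECT id FROM orders WHERE total > 50")]
    for order_id in ids:
        conn.execute("UPDATE orders SET discount = 0.1 WHERE id = ?", (order_id,))

    # Relational thinking: one set-based statement the planner can optimise.
    conn.execute("UPDATE orders SET discount = 0.1 WHERE total > 50")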
This was a fairly common approach about 10-15 years back.
It has its charms, but it mostly fell out of favor because of versioning. It's very hard to manage multiple versions and migrations for the database, and it had a negative effect on development velocity.
IMO people discussing this topic really need to clarify what type of applications they are working on because the cost/benefit analysis changes quite significantly based on use case. If you're building a web-based reporting tool where you simply query records out of a database to dump to HTML and your objects are short lived, you might not get as much value out of an ORM compared to a client-side app where objects stick around for the entire lifetime of the application and you have to worry about things like object identity and staleness.
I'd also add that if it's a community-based open source project, there can be great benefit in more safely supporting a bigger set of database engines (e.g. sqlite for the simple entry-level, postgres et al for the more serious) at the cost of some performance overhead.
If it's a performance-critical in-house service where you have full control of the deployment and can tailor it to your use-case, it's a completely different story.
We have to support both MySQL and MSSQL so Hibernate has been mostly great for that. For really simple queries it's also nice to be able to write "findAllBySomethings_IdAndPropertyIsNull" etc instead of writing a SQL query. Been there, done that.
Though I have to admit I do check what queries Hibernate produces to make sure it's not doing some funky stuff that's not really needed (as a result of a mistake of my own, mostly)
Edit: It has come to my attention that it is of course Spring Data that offers those named queries. I still see nothing wrong with Hibernate, since I only have to write the query once and I'll be relatively safe supporting both DB systems.
I personally have found that how to approach data access has been a real bone of contention on projects, and quite damaging. It seems half the team want to use an ORM and have a long list of reasons not to use SQL/stored procs (slower to develop, outdated, not testable, business logic in the wrong place). The other half of the team wants to avoid ORMs and would rather use SQL/stored procs (performance, ORMs start off okay but soon aren't up to the job, more control and power with direct SQL). In fact, you can see these two standpoints in this very discussion.
I find both have valid points and there isn't really a compromise. Whichever approach is taken you end up with half the team feeling not listened to and disenfranchised.
I have found few things to be more divisive than the ORM vs No ORM debate and I am not sure what the answer is.
I have had that kind of divisive discussion, but we did compromise. What we ended up doing was using an ORM for CRUD operations, which were mostly user-facing ('row' operations), and SQL for reports / data monitoring ('column', bulky operations). And we did our utmost to avoid stored procs (limiting them to executing REFRESH MATERIALIZED VIEW and COPY operations to the best of our ability) and trigger processing.
We manage the SQL code (view definitions, mviews, and the remaining stored procs) right next to the application code with Flyway [1]. We also have SQL-ORM integration tests, spinning up a database container with testcontainers [2].
We've had some issues with business rule duplication (updating the ORM and forgetting to do the same in SQL), but so far I'd say it's successful.
That being said, I'm on the SQL boat, and I remain convinced that the CRUD could be done comfortably with a SQL query builder such as jOOQ [3], and that it would help solve the business rule duplication issue. But hey, it's working right now and everybody is happy about it, so why change it?
Agreed. As ever, it depends a lot on the type of work and how big the system is. We use a lot of SQL (not sprocs), but are moving back to LINQ for reasons of composition and maintainability. I love writing SQL, but in our system, which is north of 300 tables on a project over 10 years old now, the lack of composability and general lack of static dependencies in SQL is hurting us more and more.
But if your answer to all those questions is yes, why are you using an RDBMS to begin with? Wouldn't some other kind of data store map better to those use cases, and then you wouldn't need a big abstraction to make it look like something it isn't?
ORMs completely fail for many complex domains. If your data represents an arbitrary tree hierarchy, a graph, or the like, relational databases offer nice ways to model it, but those completely misalign with how ORMs handle data.
You don't even need to go that complex. Even moderately complex JOINs start to look better in SQL than in ORMs.
If you have an employees database, an inventory database, and a clients database, an ORM is perfect. If you have a fixed hierarchy, ORM works great too. It maps objects to the database. That's 90% of web apps.
If you're building e.g. an online CAD system with a complex data model for storing hierarchical layers of objects with complex relations, and where you need to perform complex operations on that data for e.g. simulations and optimizations, SQL will actually handle that just fine. You'll just be doing SQL beyond what fits into an ORM. At that point, if you use an ORM, you'll be doing complex contortions. Think back to your data structures class. Then to the grad-level data structures class. Most of those structures, SQL will handle fine, but ORMs won't.
SQL has also turned out to be surprisingly resilient to different programming paradigms. SQL came out just around a half-century ago, before OO was common, and did fine with OO, functional, structured, and a whole range of other paradigms which have moved into and out of vogue over that time. ORMs, as the name implies, are specific to OO.
A few posts up, the poster is correct that there are design patterns around ORMs which work well, and design patterns around SQL which work well. You want to pick one and stick to it. The mess comes in when you mix layers of abstractions. One or two SQL procedures might not kill you, but when you have big chunks of code using ORM and big chunks NOT using ORM, you'll crash-and-burn.
Footnote: "Complex domains" is also about the database layer, and the type of complexity. Right now, I'm working on a very algorithmically complex system, but that complexity isn't in the data layer. Most of the data engineering is about moving GB of data around in realtime. All the database needs to handle are simple things like auth/auth. That's an ideal use-case for an ORM. I've also built systems with ORMs where the database layer was complex, but complex in ways which aligned well with ORMs. So this shouldn't be read as "dumb programmers who can't handle functional use ORMS."
Footnote to posterity: Should someone stumble upon this in a web search: If you grew up in OO and Java, and never ventured beyond, none of the above will make sense to you. You only learned the programming paradigm ORMs were designed for.
The intent of my third point is that you should expect a decent proportion of the operations to be handled well by the ORM such that you can reap efficiency benefits from the abstraction. If you're spending a significant amount of time jumping out of the ORM's sandpit and into custom queries, the benefit of the ORM may be limited. And to the extent that it dictates/influences your table structures, it might be a hindrance.
What does "business logic" mean to you? According to the Wikipedia definition [1], I find it very difficult to imagine SQL that does not contain business logic.
[1] "business logic or domain logic is the part of the program that encodes the real-world business rules that determine how data can be created, displayed, stored, and changed.", https://en.wikipedia.org/wiki/Business_logic
Business logic is calculating tax, working out a price, adding something to a bag, populating an order object with information according to some rules.
Business logic isn't pulling data from the database, how to map an object to a table, or the specific SQL query to grab some data. Those are technical details. SQL should just be pulling, saving and querying from the store without much manipulation, not changing the data.
It's the rules that a business stakeholder might be interested in creating.
When you start putting the tax calculation rules into the stored proc, that's where the problems begin.
If you create a CTE to pull the information required to perform the calculation, then return that info without doing the calculation, that's fine.
Let's assume there is a business requirement that an order belongs to a customer. By definition, this is business logic. If you implement this requirement by creating a foreign key constraint, then you have added business logic to the database. If there is a business requirement to calculate the sum of all wages in a department, and you do this via an aggregation in SQL, then you have processed business logic in the database.
If you are not allowed to do this kind of things, then what is left? Do you interact with your database as if it were just a bunch of VSAM files?
Tbh a lot of NoSQL databases don't use foreign key constraints. If you have a SQL system with messaging and eventual consistency you have to turn them off anyway.
So in a lot of ways, yes people use databases like they're just persistent memory with indexes.
Stored procedures are a really bad hack—in any practical sense they function as a mediocre API server which comes bundled "for free" with your SQL database server. If you need an API layer at all, it's much better to write this in a real programming language.
Just about anything you can do with stored procedures, you can do within an SQL query block in your favourite language. And when all your queries are fully parameterised, your database server can even use the same execution plan cache as "real" stored procedures.
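As a sketch of what that looks like in practice (psycopg2-flavoured; the money-transfer logic and accounts table are invented for illustration):

    def transfer(conn, src_id, dst_id, amount):
        # `with conn:` wraps the block in a single transaction, so the pair
        # of statements is just as atomic as a stored procedure body would be.
        with conn:
            with conn.cursor() as cur:
                cur.execute("UPDATE accounts SET balance = balance - %s WHERE id = %s",
                            (amount, src_id))
                cur.execute("UPDATE accounts SET balance = balance + %s WHERE id = %s",
                            (amount, dst_id))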
This is an extremely niche view; ORMs save you a huge amount of time, and the days before ORMs were a real pain in the arse.
As a noob you are much more likely to introduce a massive security hole by rolling your own solution. Don't do it!
So this article is sorta true, and you should learn SQL, but for trivial queries ORMs are actually a massive time saver and really useful in maintaining a system, especially in strongly typed languages (Java/C#/etc.).
I'm currently in the process of porting API code from .NET 3.2 SOAP, where the code is a mixture of string-based SQL queries and stored procedures, to a .NET Core 3.1 Web API (MVC + EF).
Let me tell you... readability and type safety are boons, not a curse.
Things still get convoluted, but boy, the errors that can sneak into a complex SQL statement are painful to debug.
As always the pendulum in this "article/rant" is swinging in the other direction ("everything was better in the past"). Also it's fricking (almost) 7 years old. That's a lifetime in software development.
> [7 years] is a lifetime in software development.
If we were talking about something like Kubernetes best practices, I'd definitely agree with this point. In this case, though, I don't see what substantially changed about ORMs or SQL in the last 7 years.
The two are not mutually exclusive. Dapper provides type safety cleanly and IMO is more readable than EF. EF is heavy, requires training to understand, and is a giant pain if you need to implement performant SQL.
Not all ORMs are like hibernate and gorm.
I'm author of a lightweight ORM for Go and Scylla / Cassandra [1] so I may be biased.
In my view use of a good ORM makes your code more resilient to change, refactorings are simple and stupid mistakes largely eliminated.
A benchmark of an ORM's usefulness can be how many lines need to be changed to add a new field to a struct / class.
In a good ORM this should be about 1, and it should still be ~1 if you use hand-crafted SQL / CQL.
The worst thing ORMs try to do is handle relations (eager / lazy loading) and sessions / object lifetime. This is mainly because it's impossible to have a one-size-fits-all solution for all use cases, even within one application.
I use a lot of raw SQL and I also use ORMs. I don't see ORMs as being the issue the author thinks they are, but rather that developers suck at designing the data layer. A half-decent design will do as you say: it will allow the use of an ORM or hand-written SQL. ORMs make magic easier, and pulling that magic apart is a real pain, but so is working with triggers and stored procs. The problem exists on both sides. The solution isn't to get rid of ORMs, but to train developers in better system design.
When working directly with SQL instead of an ORM, how do you elegantly handle things like parameterizing table names or having flags that turns on or off filters?
In non-ORM codebases I am seeing patterns where SQL queries are "copy-pasted" together, and I am not sure I like it.
Just imagine the following example had hundreds of lines of SQL and several optional filters, some of which could themselves be several lines long:
    def get_data(table1, table2, use_subset=True):
        if use_subset:
            filter_query = "AND column2 = 'subset_value'"
        else:
            filter_query = ""
        query = f"""
            SELECT
                *
            FROM
                {table1}
            JOIN
                {table2} USING (id)
            WHERE
                column1 = 'value'
                {filter_query}
        """
        return query
You could use a query builder for this, rather than a full ORM. Programmatically generating SQL is a subset of what an ORM does, and it's one of the less contentious parts.
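For example, the dynamic-filter problem above could look like this with SQLAlchemy Core, which is a query builder rather than a full ORM (the table definition is invented for illustration):

    from sqlalchemy import Column, Integer, MetaData, String, Table, select

    metadata = MetaData()
    events = Table("events", metadata,
                   Column("id", Integer, primary_key=True),
                   Column("kind", String),
                   Column("value", Integer))

    def get_data(kind=None, min_value=None):
        query = select(events)
        # Optional filters compose as expressions instead of pasted strings.
        if kind is not None:
            query = query.where(events.c.kind == kind)
        if min_value is not None:
            query = query.where(events.c.value >= min_value)
        return query

Table names still can't be bound parameters, but the builder at least keeps the composition out of string land.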
EDIT: Removed the reference to SimpleJdbcTemplate.
A good alternative (if you are using Java) is to use something like Spring's JDBCTemplate. This provides the following benefits over using Hibernate/JPA:
1. You craft the queries yourself and so there is no intermediate ORM language which you need to use.
2. It parameterises the queries so that you don't have to append strings to get to the final query.
@Override
public void insertForum(Forum forum) {
    String query = "INSERT INTO FORUMS (FORUM_ID, FORUM_NAME, FORUM_DESC) VALUES (?,?,?)";
    jdbcTemplate.update(query, forum.getForumId(), forum.getForumName(), forum.getForumDesc());
}

@Override
public Forum selectForum(int forumId) {
    String query = "SELECT * FROM FORUMS WHERE FORUM_ID=?";
    return jdbcTemplate.queryForObject(query, new ParameterizedRowMapper<Forum>() {
        @Override
        public Forum mapRow(ResultSet resultSet, int rowNum) throws SQLException {
            return new Forum(resultSet.getInt("FORUM_ID"), resultSet.getString("FORUM_NAME"),
                    resultSet.getString("FORUM_DESC"));
        }
    }, forumId);
}
3. It fetches only the columns that you need, across multiple tables if need be, without fetching the "relationships" between tables unnecessarily.
The example given below shows how to use RowMappers and to fetch only the columns you need:
String query = "SELECT FORUM_ID, FORUM_NAME, FORUM_DESC FROM FORUMS WHERE FORUM_ID=?";
return (Forum) jdbcTemplate.queryForObject(query, new Object[] { Integer.valueOf(forumId) },
new RowMapper() {
public Object mapRow(ResultSet resultSet, int rowNum) throws SQLException {
return new Forum(resultSet.getInt("FORUM_ID"), resultSet.getString("FORUM_NAME"),
resultSet.getString("FORUM_DESC"));
}
});
This is basically how I use ORMs. Only as a query builder. Writing plain SQLs in code can be very messy and hard to debug, and this works similar to an API the author mentioned in his article. IMO ORMs can spare some duplication, but relying too much on that is quite painful.
The point of ORMs is not avoiding SQL, it is avoiding concatenating strings in order to build complex queries. In our app we have very complex query logics depending on the request params, and I can hardly imagine how dirty the code would be without the relational algebra abstraction.
> The point of ORMs is not avoiding SQL, it is avoiding concatenating strings in order to build complex queries. In our app we have very complex query logics depending on the request params, and I can hardly imagine how dirty the code would be without the relational algebra abstraction.
You're talking about query builders. ORMs often come with a query builder, but the goal is really about mapping object relations (1:n, 1:1, n:n) to SQL queries in an automatic fashion, nothing more. Today, most of the time it can easily be done with JSON queries on most RDBMSes. But advanced ORMs also come with a bunch of useful stuff like unit of work, caching and co...
I like ORMs for routine queries, there's a lot of stuff that is just much nicer and less tedious to write that way. It's also generally much less annoying to have some dynamic aspects in your queries with an ORM, e.g. adding different WHERE clauses depending on some parameters.
But even with relatively small and simple things you can run into issues very quickly if you don't know what kind of queries the ORM will create. ORMs are very leaky abstractions, they're useful but unless you understand SQL and understand their quirks you're likely to create some weird and monstrous queries at times.
You should know how your ORM handles related entities if you query them, there are some big footguns there with some strategies. And of course you should know how to use the ORM so that it doesn't do a "SELECT *" everywhere (which can be trickier than I'd like in some cases).
I would also not hesitate to drop to plain SQL for some cases, if your query doesn't fit neatly into the capabilities of your ORM.
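In Django, for instance, trimming the default select-every-mapped-column behaviour looks like this (the Book model is hypothetical):

    titles = Book.objects.values_list("title", flat=True)  # SELECT title FROM ...
    slim = Book.objects.only("id", "title")  # other columns deferred until touched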
I think this is where a lot of the anti-ORM crowd trip up. They start with the idea that an ORM completely abstracts away a database, show that isn't true, and then condemn ORMs. But ORMs don't do that, aren't advertised as doing that, and aren't useless because they don't do that.
Yes and no. I find that using leaky abstractions is often harmful as a general principle.
So it doesn't matter that ORMs admit that they are a leaky abstraction. I still pretty much hate them for it.
It just feels pointless when I know I'm going to have to understand and use the underlying tech anyway. For ORMs specifically, I much rather use a query builder that's basically just a type-checked SQL DSL. Some of those aren't complete either, though, which also enrages me.
I would like developers to learn SQL first, before picking up an ORM, to actually understand what the ORM is doing for them. Sure, they'll do their share of sub-optimal subqueries and head-bumping trying to write GROUP BY queries. But in the end I think this would make them better developers who know the underlying software and application logic, not just the API of the ORM.
Yet we don't really live in that kind of ideal world and there are many factors that people have to consider. If everybody else in the company is using ORM, why shouldn't you. Or, if you just need to ship products and don't necessarily enjoy learning SQL. Or, if you don't know how to properly abstract the structured SQL queries without making it a huge mess. Saying that ORM is better/worse is like saying everybody likes music.
If I'm building a Nodejs app I would choose not to use ORM. For other languages and frameworks I'd probably reconsider and evaluate my options.
To me however, the biggest benefit of using no ORM is the learning of SQL. Time and time again that proves itself so invaluable, and I'm immensely happy I've learnt the quirks of SQL instead of some language specific API. If I can code faster with ORM that's great, but in ideal world I would much rather learn SQL and become a master of it rather than of some ORM.
Everything else that does more than mapping SQL results 1-1 to a query result and maybeeeee helps you with atomic actions like INSERT, UPDATE and DELETE ends up being a chore. They all have their own quirks like around threading and usually end up forcing you to design your application around it unless you take some measures to isolate it from your main application, which means you're now writing code while you wanted to type less code.
To add insult to injury, a ton of them seem really resistant to mapping the results of hand-written queries to an object. Also, SQL dependency trees and object dependency trees never really seem to map that well. And don't try to do more complex queries with the built-in query language in Django or .NET. I have sometimes spent hours trying to get an SQL query working in an optimal way that would not have cost me more than 5 minutes to write and map with simpler ORMs.
The other day, in a discussion about why I always use SQLite directly with a simple wrapper, a dev who was otherwise a very strong iOS developer said: "I really should learn SQL sometime".
Apart from HTTP, there probably hasn't been a skill more universally useful in my career as a developer than SQL. But it also made me think about why I never really learned to write C (I can read and futz with it).
For anybody new in the trade reading this, the following things have been or could have been useful at any step in my career:
* HTTP
* HTML
* SQL
* JavaScript (blegh but true)
* C
* Unix
The rest were just passing by for a while, like Windows, Flash, PHP or any god damn JS framework out there.
My project has no ORM, and now that other developers are coming on it is hard to work with - the readability of an ORM far surpasses writing your own queries and I can see that the benefits far outweigh the cons.
And who's to say you can't write your own lightweight ORM to abstract the database if you can't find a tool that suits your goals? As always, it all depends on your use case.
Even with documentation, it is a lot to consume. Someone on the project for a longer time can definitely get more done, but bringing on more people will lead to wildly different quality control. I find it to be a scaling problem if anything.
Perfect example of the trough of disillusionment. The ORM slope of enlightenment comes when you realise the power of the unit of work, not just a single query. But that was then, and microservices mean we no longer use large units of work. So ORMs without complex UoW make less sense.
This is I think primarily related to transaction boundaries and tracking changes. You can read https://www.martinfowler.com/eaaCatalog/unitOfWork.html as a starting reference. I haven't heard this phrase much lately to be honest. ;)
In a 'typical' Java application (since many comments here, as well as the original article, mention Hibernate...) you will likely use annotations like @Transactional to mark your transaction boundaries (likely with default propagation and isolation levels...), and then Hibernate will track any changes ('dirty checking') to objects you asked it to fetch; at the end of the transaction, Hibernate will issue whatever DML commands (INSERT, UPDATE, DELETE) need to be issued, in an appropriate order.
In a galaxy far, far away, i.e. before Java 1.5, instead of using @Transactional you would maybe use (write) some object like TransactionManager which provides an execute() method that receives a block of code in the form of an interface implementation (no closures for you!). This part is relatively straightforward. Tracking changes in any semi-automatic way was always messy...
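For readers who'd rather see it in Python, SQLAlchemy's Session implements the same unit-of-work pattern; a minimal sketch, assuming an existing engine and a mapped User model:

    from sqlalchemy.orm import Session

    with Session(engine) as session, session.begin():
        user = session.get(User, 42)  # the fetched object is now tracked
        user.name = "New Name"        # no explicit UPDATE statement anywhere
    # on commit, dirty checking flushes exactly one UPDATE for the changed column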
Instead of polarizing between massive native queries in ORM strings vs DB-specific imperative stored procedures I would suggest a middle ground which are table VIEWs [0]. They are trivial to design (just SELECTs), stay declarative like tables, can be materialized for efficiency [1]. Finally they can be easily handled in ORMs as both static and dynamic entities, with all read-only benefits, with no need of any non-standard mapping or native queries.
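In Django terms, mapping an existing view is a few lines; a sketch, with the view name and columns invented:

    from django.db import models

    class OrderSummary(models.Model):
        customer_id = models.IntegerField(primary_key=True)  # views need an explicit pk
        total = models.DecimalField(max_digits=10, decimal_places=2)

        class Meta:
            managed = False                   # Django will not create or migrate it
            db_table = "order_summary_view"   # points at an existing CREATE VIEW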
This is an excellent point. I don't get the polarised nature of the debate to be honest - there is room for all approaches in a "best tool for the job" way. I use EF Core a lot in my work, and it's the first tool I reach for, and if I need a view, then I use a view (EF Core has fantastic first-class support for these nowadays). For everything else there is Dapper.
I think a large part of the problem here is that developers who learn to leverage a database only through an ORM are missing out, and really they should also learn SQL (literally the only part of the article that is still objectively correct is the author's advice to learn SQL) to gain a better understanding of when (and when not) to use an ORM. Every other complaint in the article is either classic misuse of the ORM, or else a shortcoming of the ORM in question.
So, I'm typically not a lover of ORMs as they're typically implemented, and tend to also favor doing something close to the sproc-based method that Philip-J-Fry proposes in another comment.
It's always struck me as odd that, as a profession, we generally agree with, and can be quite fanatical about, the idea that different modules and services should try to hide their implementation details as much as possible, and instead speak over well-defined, constrained protocols such as APIs or interfaces; but as soon as an SQL database comes into the mix, we happily throw all that hard-earned discipline out the window and go back to directly swizzling the internal state of external collaborators.
That said, there are some use cases where I'm not sure how you get around something like an ORM. One is when you need to allow users to execute arbitrary searches against the data. If that's your situation, then, any way you cut it, you're going to end up with some system that takes an abstract representation of a query and compiles it to SQL. The only question is if you want to use something off-the-shelf to do it, or if you'd rather hack it together yourself. In my experience, there are few things on this earth that present a greater maintenance burden than a homegrown ORM-type library does once the original author has moved on to other things.
And there are others where avoiding an ORM is over-engineering. If your database is just an honest entity store, and you're just doing fairly straightforward CRUD operations against it, and it belongs to a single application, go ahead and punch the easy button and have a happy life.
> different modules and services should try to hide their implementation details as much as possible
> but as soon as an SQL database comes into the mix, we happily throw all that hard-earned discipline out the window
The data belongs to the database, not to whatever program you happen to work on at the moment.
IMO, the impedance mismatch situation is a result of poor programming language design. Saying that the database abstraction "leaks" into your code is backwards; the shortcomings of your programming language "leak" into how you're handling your data.
I tend to blame the impedance mismatch more on the popularity of "entities" as a way of doing domain modeling in object-oriented programming.
It's far from the only way to do domain modeling in an object-oriented language. It's not necessarily even a particularly good way, and it creates problems even if your app comes nowhere near a database.
For me the usefulness of an ORM depends a lot on the language I'm working with. In the .NET world the choice is easy: LINQ and Entity Framework is a killer combination. It's so much faster and better than writing your own queries. I don't feel any of the normal arguments I hear against ORMs apply to the LINQ + EF combo. For Python and JS, I can kinda relate to some of the common arguments, but I still feel that ORMs have a use.
I'd really like some sort of macro that takes my random SQL query (with arguments), looks at the names and types of the returned columns, constructs some sort of struct/pojo/whatever to match, and gives me a function(my, args) -> array<that>.
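Tools like pgtyped (linked elsewhere in the thread) do this statically; a runtime approximation in Python is just a few lines, deriving a record type from the cursor metadata (typed_query is an invented helper):

    from collections import namedtuple

    def typed_query(conn, sql, args=()):
        cur = conn.cursor()
        cur.execute(sql, args)
        # DB-API exposes the column names via cursor.description;
        # rename=True tolerates names that aren't valid identifiers.
        Row = namedtuple("Row", [col[0] for col in cur.description], rename=True)
        return [Row(*values) for values in cur.fetchall()]

The compile-time version the parent asks for needs language support: macros, or code generation against the live schema.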
Application code has one way of describing data and its relationships. RDBMSes have a very different way of describing data and their relationships. At some point, you are going to have to reconcile the differences between the two worlds - this is the so-called "object-relational impedance mismatch".
Unfortunately, even if you choose to reject ORMs and go SQL, you're still going to have to solve this problem at some point, and it will not be pleasant.
If you half-ass using an ORM then you get the worst of both worlds. To get value out of it you have to embrace it completely: write all the queries in the ORM (so that the ORM's caching functionality etc. can actually work for you), define the schema in the ORM and generate the tables and migrations from that. Working with an ORM and avoiding SQL is not just doable, it's easy, but you have to actually try.
I think the most important part is to create your database schema through your ORM. I used to do that ages ago with Ruby on Rails, and that worked very well. We still had our complex reporting queries in SQL, and that was no problem. The point is that if you use an ORM, your database needs to be optimised for that ORM. Bolting an ORM onto an existing database is a recipe for disaster.
I learned to use the Django ORM in depth, but there were some things it couldn't do at the time (it may have improved): extra conditions on joins, subqueries, conditional aggregates (it does those now). Once you get to a certain level of complexity you can shoehorn the query into the ORM, but I am not sure it makes sense. The next developer is more likely to know SQL well than the Django ORM.
It depends. ORMs are for dealing with the objects and schema, not with aggregate reports. I still use raw SQL for that. But more often you can just project your data into some BI system and let the users click together their reports themselves.
Either figure out how to express the queries you need within your ORM, or create a place where you can run ad-hoc SQL reporting queries that's completely segregated from your live data - e.g. a read-only replica, or regular database dumps.
The biggest “aha” moment for me was when I started working with graphql.
The tooling would statically analyze your queries, and produce types for exactly what you were requesting.
Then you would just have a bunch of raw queries lying around, but you'd be confident the data returned is the right data at compile time.
That made just writing raw queries a lot more simple and feasible than building abstractions on top of them.
Now, TS and SQL don't really have a robust lib for that I think; last time I checked it was only https://github.com/adelsz/pgtyped
But haven’t looked at how far it has progressed.
it's probably a thousand pages. likely more. i understand it isn't the most thrilling reading of your life. it might just be once you have an issue in prod, though.
if you really don't have time for that, i also understand. i was you. but really please read the table of contents in that case.
yeah, no... ORMs are A LOT more than just mashing strings together to perform a query.
they are there to take care of all of the crap that you don't want or need to worry about: for instance, parametrizing strings against SQL injection, handling transactions properly, concatenating joins, proper pagination and SQL syntax discrepancies, to name a few. and let's not forget to mention things that go BEYOND the database interactions that are built on top of the ORM, like data validations, custom properties, callbacks and so on.
there are way more benefits to using an ORM than not using one. a good ORM (like ActiveRecord) will let you break out of it and write raw sql when you need the performance boost, while preventing you from shooting yourself in the foot.
* i've written and contributed to a couple of ORMs in my lifetime
Like everything else, use the tool but understand the inner workings too. E.g. Hibernate allows logging of the SQLs being executed so when I make changes, I review that the SQL being executed is as expected + performance is acceptable. From that point of view, understanding SQL performance is important. However, I don't really want to write boilerplate SQLs when the ORMs can basically form the query for me using query method name - that is super convenient and I don't want to give that up.
Selecting * from a table with 200 columns isn't performant, but if the table has 200 columns, I won't be blindly relying on the ORM-generated query anyway, since most provide the escape hatch of executing direct SQL.
I'll use an ORM until I start running into performance bottlenecks. Then I'll write raw SQL for those specific instances, and continue to use the ORM elsewhere.
The "Vietnam" article referenced in the post makes some very valid points about ORMs and their marriage to Relational data.
When using object-oriented languages like C# / Java / C++ etc., the programmer is forced to represent data as objects. True, representing tables and relations between tables as objects is difficult, but do OO programmers have any choice at all?
After all, the issue is not a single table (a row can very well be represented in OO constructs); it's the relations between tables that cause headaches.
Even when using languages like Python, I inevitably have to gather related pieces of data together, in a dict say, and it still feels like mapping related tables to objects.
Another point of contention with the "Vietnam" article is that the author assumes OO programmers will want to use inheritance to model relations between tables. This seems outdated in my view. Composition is more suitable and powerful and is almost always used in ORMs, for example in Entity Framework.
The original post also claims that ORMs tend to gravitate towards "select *" queries. This is true, but has many solutions. Many ORMs optimize queries based on required columns. But there is also another issue with this. Say you have a table with 100 columns, but are only querying 5 at a time, isn't it prudent to separate out those 5 columns into another table? Or perhaps create a view? If the use case demands just 5 columns, but the table has 100, isn't there a problem with the data modeling rather than the ORM?
ORMs taught me to build my own ORM; I even ported the thing to MySQL, Oracle and Postgres! It's available to try if you use my own "wordpress": https://github.com/tinspin/sprout
All of this is very complex for something I later solved with JSON files over HTTP instead: http://root.rupy.se. But parts of it can be reused for other useful things; the logic was used to build game dialogue trees, for example.
ORMs have their place, but they are IMHO overused and frequently fail for reasons that have to do with the good old object impedance mismatch, a pitfall that lots of junior developers fall into where they over-engineer their database schema to make it resemble some platonic ideal of some class hierarchy. When that hierarchy inevitably starts changing, the schema erodes along with it.
Some symptoms that I've seen in multiple projects:
- requests are slow because every request triggers dozens to hundreds of joins.
- overuse of the @Transactional annotation (Spring/Hibernate) because developers slap it on anything that looks like it might be doing anything with a database, while understanding neither transactional semantics nor aspect-oriented programming (which causes some funny behavior)
- Attempts to implement class inheritance via database tables and corresponding hacks and complexity to query because of that.
- Lack of a coherent database design. IMHO, a good use of ORM should start with a good old database design. A 1 to 1 mapping of your domain to tables is typically not it.
- Over and under use of database constraints, indices, etc. because of a lack of knowledge of how databases actually work resulting in unenforced referential integrity constraints, poor performance, and weak transactional semantics.
I've used lots of different styles of databases over the years. I generally break things down into:
- simple key value stores like redis, memcached, etc.
- document databases like couchdb, elasticsearch. These tend to have schemas and fields
- NoSQL object databases (mongodb, firestore, etc.)
I tend to mix these styles and would happily use postgres as a document store. IMHO if I'm not going to query on it, putting it in a separate column adds little or no value. If I am going to query on it, it should have an index.
Years of using document stores have taught me the value of de-normalizing stuff like user names and other things that rarely change but are a part of pretty much everything. I've ripped out hibernate in favor of JDBCTemplate + TransactionTemplate on several projects where hibernate was causing more problems than it solved. Hand crafted joins are pretty easy to do.
I've seen this post and similar ones many times. ORMs don't fit all scenarios, but most of the time it's a question of when you fall back to something more flexible/powerful. In Rails, writing the crud part of your app with SQL is pointless. Trying to get AR to do dashboard gymnastics is equally pointless. Pick the best tool for the job.
I'm not an expert in databases, but I am starting to think that building an interface primarily designed for humans (that's what SQL is) as the main medium for interacting with the database was, in retrospect, maybe not a winning idea.
I'm not criticising anybody here (I am literally not in a position to critique anything); SQL was created probably 40 years ago and it probably made sense back then.
My point is that you can always build a human interface from a machine one; the reverse is not that easy. ORMs are a (failed?) attempt to do so.
I would very much push for computer-language integrated query support — (such as Datalog or Linq), and that is what the database server should be accepting as input, instead of a raw string.
Although a lot of people hate MongoDB, I've personally felt way more productive expressing queries with their query document system rather than using SQL.
Fixing design flaws with other software ain't gonna bring us that far maybe.
Re: identities. Use of DB-generated identity has its downsides, but N/Hibernate has a bunch of other id generators to mitigate the problems. You can use sequential GUID, hilo, GUID, or whatever. I use sequential GUID because it helps with a bunch of other things. So it's not really a leaky abstraction. It's not really an ORM problem; you need to deal with that regardless.
Re: queries. I think LINQ showed the true power of ORMs in some sense: query your database as if you are querying your objects. It has problems like N+1 or exploding cartesian products, but Postgres and the like fix that nicely with json_agg. I have to agree, though, that I like the GraphQL way too. For Node.js, compile-time LINQ is not an option, so GraphQL it is.
Transactions: I don't know how this is related to orms?
Disclaimer: former nhibernate developer that was around when ayende was building linq for nhibernate and my view is probably dated and biased.
I think you misunderstand the identity issue. I think it's more a complaint about mutating objects not cleanly mapping to running UPDATE commands. Live objects and partial updates are messy, and dirty writes become an issue. I often see a pattern of a Save() method on an ORM type, and it's ambiguous as to what exactly that will write to the DB at any time. Maybe there are better ORM patterns, but the problem is it's a common anti-pattern nonetheless.
If you use a more RPC-like approach in your data layer, partial updates based on an entity id are easier to grok.
"When you have foreign keys, you refer to related identities with an identifier."
"What this results in is having to manipulate the ORM to get a database identifier by manually flushing the cache or doing a partial commit to get the actual database identifier."
This (and the sentences before it) is basically saying that if you have a foreign key, you have to first save the main object to get its id. The id doesn't have to be the responsibility of the database. In fact, I'd argue an identifier is an application layer concern. The ORM could still solve this problem by simply cascading, or you can use one of the identity strategies that don't rely on the database assigning it. You can generate identifiers in either the application layer or the ORM layer through identity generators.
Let me be more clear. Partial objects, such as a book with an author created from a user request may or may not have DB ids that may or may not exist. If you want to add a Book object with an Author you must first upsert the Author to possibly retrieve the DB id which already exists and cannot simply be generated locally. This can often be a bit confusing or cumbersome compared to the SQL.
Why are you caring about DB ids if the ORM is supposed to handle it, asks OP.
My main reason to use an ORM, and why I recommend others use one (especially if they are juniors), is that it usually (depending on the language/framework) comes with great support for protecting against SQL injection.
Of course not all ORMs protect against SQL injection completely, and some of them only implement partial protection. But it is a good start.
What I recommend strongly is not to mix ORM with plain SQL in the same line of code.
The code should either be full ORM or plain SQL so that it is clear where the responsibility of protecting against SQL injection should be.
Edit:
Also, it is easier to add extra layers of protection after a while when using an ORM than when using plain SQL, because with an ORM I can redefine, let's say, the Repo.all or Query.all method to add extra layers of protection. But with plain SQL I need to edit every single line where the SQL statement is present.
I understand the usefulness of an ORM and how much its use can clean up a codebase, but... truly performant queries (ones that take advantage of every aspect of your DB, not just some vanilla lowest common denominator provided by an ORM abstraction) are done in SQL. In the end one has to make the trade-off between development speed (something an ORM has going for it) and performance, plus the eventual issue of handling the object impedance mismatch. For me it's just so much clearer to read a handful of queries and see how things work. You can even create user-defined functions and call them like an API from your codebase (a sort of ORM) and get the best of both worlds. It's not an easy choice whether to ORM or not, but SQL should be a skill everyone learns.
well, most ORMs give you the ability to drill down into native queries if you want, and often even will help you with the conversion into native objects if what you are pulling back is still a native object. So it's not a binary thing, you can use ORMs to reduce the amount of boilerplate for simple queries and then where it makes sense you can drill down into native SQL.
(a lot of what ORMs give you is just a reduction in boilerplate code, manually populating 30 different fields on an object and so on.)
Most simple queries won't benefit much from hand-coded SQL and it's always there for the minority of the ones that do.
And again, there is the middle ground of writing something like custom HQL that returns an object(s) of interest.
Again, there is a huge amount of error-prone boilerplate avoided simply by doing that; the next time you add a column, you don't have to chase down 27 different hand-coded functions manually populating one field at a time.
> I work on the principle that the database’s data definitions aren’t things you should manipulate in the application. Instead, manipulate the results of queries.
Following this principle will definitely make your life easier. I have been using Slick[1] and the Play Framework[2] for my backends for a good while and having the freedom to define my migration scripts in pure SQL is a big plus.
My application models are very lean and don't get mixed up with DDL. I can also fully understand what goes in the db migrations, zero magic there.
My preference is currently ORM + drop back to SQL when you need something that is complicated or inefficient using the ORM. I have been writing SQL for 20 years.
I currently use .NET Core and Entity Framework + Dapper.
It has these benefits:
- Strong types!
- My schema+migrations are part of the source code and I never have to worry about if my schema is in sync - with migrations it just is
- Simple, straightforward tasks are super easy (insert, update, delete, simple select)
- I can fall back to Dapper (straight SQL) when I need to do something that doesn't fit well with Entity Framework.
I recently worked on a project where EF was not allowed - and we spent so much time doing all the simple stuff, especially when the data model was still not completely locked down.
To be fair, I have never really done a full blown project with entity framework.. but I seem to always need a bit more control than it's offering. So I just created a simple repository framework (inspired by https://www.youtube.com/watch?v=rtXpYpZdOzM ) overlay in which the functions end up calling dapper.
The beauty of this is that it's pretty simple to set up the overlay, and it allows complete control of what is going on. I have a need for "dynamic" table names. I don't think there is even a way to do this in EF (at least not a simple way), but with my own repo overlay it's really a piece of cake: I just tack on whatever specific function input I need to help define the resulting table name, and it works great.
This is probably not the most common use, but a simple repo overlay to a micro orm is not much work and gives full control.
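A rough Go equivalent of that overlay idea (the original is C# over Dapper; everything here is invented for illustration): because table names can't be bound as query parameters, the overlay validates the caller's choice against a fixed allow-list before splicing it into the SQL.

    import (
        "database/sql"
        "fmt"
    )

    // Table names are not bindable parameters, so they are checked
    // against an allow-list instead of being interpolated blindly.
    var allowedTables = map[string]bool{
        "events_2023": true,
        "events_2024": true,
    }

    func countRows(db *sql.DB, table string) (int64, error) {
        if !allowedTables[table] {
            return 0, fmt.Errorf("unknown table %q", table)
        }
        var n int64
        q := fmt.Sprintf(`SELECT count(*) FROM %s`, table)
        err := db.QueryRow(q).Scan(&n)
        return n, err
    }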
I'll put it in the mix of ideas that the problem isn't ORMs, it is trying to build ORMs on top of SQL.
Languages could be placed on a spectrum of how easy it is to build a DSL over the top of the language. Lisp is on one extreme, where it is easy to write a new language over the top of it.
SQL is pretty close to the other end of the spectrum. I've never seen a DSL that can compile to the full breadth of SQL dialects out there. DSLs generally handle the trivial well then fall apart when they hit complicated SQL statements.
If an ORM just had to fit over the top of a relational model it would probably work fine. The problem is, in the middle of all this, something has to be constructed in performant & parseable SQL.
Whether or not you use an ORM, you ideally have your own abstraction on top of it, defining a set of known queries. This avoids the classic Rails problem where the User model has a million methods, and you don't know which of them might make a SQL query.
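In Go terms, that set of known queries can be as small as an interface (all names here are made up): the rest of the app can only reach the database through these methods, so there is never a question of which call might fire a query.

    import "context"

    type User struct {
        ID    int64
        Email string
    }

    // The only database operations the rest of the codebase may use.
    // Anything not listed here simply cannot generate SQL.
    type UserStore interface {
        ByID(ctx context.Context, id int64) (*User, error)
        ByEmail(ctx context.Context, email string) (*User, error)
        Create(ctx context.Context, u *User) error
    }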
Also, ORM caching always finds a way to become a huge pain. It's easy to wind up with multiple objects representing the same underlying DB row, but updating one in memory won't update the others, so you update them all from DB just to be safe... You might say "find a better ORM" but I'm pretty sure they all run into this problem at some point. IMO you're better off doing it yourself and making it explicit.
I have been using Ecto for some time now, and its approach is very good. It is just some syntax sugar to help write safe SQL from Elixir. There is no magic like caching or anything else from complex ORMs, and it has been the best experience in many years.
What I found to work really well (esp. with GraphQL / dataloaders) is using something like postloader by gajus [1]. It generates a slim interface from the database schema. For the simple run-of-the-mill things you get an easy interface to fetch certain rows of a table or load related data, which is a huge part of what you will need when writing GraphQL resolvers. We extended the idea to generate simple wrappers for creating and updating tables as well; if anyone is interested in that I may dump the code for it in some gist.
You start using an ORM when you realise you're basically rewriting a crappier version of one.
You stop using an ORM when you're wasting time going through its documentation trying to figure out how to do something tricky for some obscure ad-hoc execution.
The title suggests that you would be better off with no ORM, just SQL. I have not been coding for years, but I still remember why ORMs came into being: before them there was lots and lots of repetitive code for query preparation and execution, with long, ugly constant strings and string manipulation. With ORMs (and query builders) it all became much more compact (and better looking, without the uppercase SQL). Like every programming abstraction it is not perfect, and you still probably need to learn SQL if you use an ORM - so you need to learn both.
SQLAlchemy is pretty sweet in the sense that you don't have to use the ORM; it has a non-ORM part that lets you write almost arbitrary SQL statements programmatically. This often results in cleaner code than writing those queries by hand (which you can also do in SQLAlchemy), so I prefer it to raw SQL. For things like database migrations I prefer raw SQL files though, as it's a bit painful to get edge cases right with tools like Alembic (I haven't used it for two years though, so maybe things got better).
My experience is that most developers are really bad with SQL. Because of that, they would rather use an ORM because they think they can avoid internalising the relational paradigm. But, guess what - if you don't speak relational then your table design will suck, which in turn will impact DB performance.
Given a long enough time, you'll either switch jobs before this becomes a problem or you'll be forced to learn SQL properly. Most developers just switch jobs.
I'm an advocate of lightweight ORMs, in C# world, I've used Dapper a lot, and now using RepoDB. Lightweight means I can easily craft SQL and quickly map it to types, and things like CRUD on simple entities is very straightforward. But I generally agree with learning SQL, and avoid mapping it to types if you are just going to pipe into json for the front end. SQL is good for query and projecting data into new forms, complete waste going through a type if not needed
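The "skip the type, pipe straight into JSON" idea looks roughly like this in Go against Postgres (a sketch; the books table is invented, and it assumes the driver hands json back as bytes): the database builds the payload and the application never materialises a struct.

    import (
        "database/sql"
        "net/http"
    )

    // The query itself produces the JSON array; no Book type exists.
    func listBooks(db *sql.DB) http.HandlerFunc {
        return func(w http.ResponseWriter, r *http.Request) {
            var payload []byte
            err := db.QueryRow(
                `SELECT coalesce(json_agg(b), '[]'::json) FROM books b`,
            ).Scan(&payload)
            if err != nil {
                http.Error(w, err.Error(), http.StatusInternalServerError)
                return
            }
            w.Header().Set("Content-Type", "application/json")
            w.Write(payload)
        }
    }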
What I don't see a lot of mention here is database selection. If you really need an Object->? mapping, why not just use a document store or the like? I don't know how many remember but in the earlier days of Java everyone was talking about OO-databases. I think they didn't work out because they tended to be language specific but now you could serialise your object tree to JSON and store it in a document store if you wanted to.
I’ve found a lot of programmer friends are apprehensive about SQL. I’m not sure why, but they see it as a wizard language instead of a friendly DSL like I do.
The development speed with ORMs is ridiculously fast and easy on the mind.
I do agree that for large, complex performant queries, ORMs might not work out, but I would say that those queries are better written as SQL procedures / functions because they are bound to be modified or changed anyway.
Having said that, most LOB apps, small apps, apps that have infrequent access to the DB, etc., can benefit from using ORMs.
I can never really decide the answer to this question and I have been programming a long time. I end up just not thinking about it too much and doing what I feel like because getting things done is more important than standing around thinking about the millions of different options.
There are various arguments that seem to have flaws of their own. For example, "use ORMs for all the simple things, and raw SQL for the complex things". The problem with this is that even simple things like "insert this into the database" often require checking that the foreign keys you insert are within the domain of the user trying to insert them, and various other constraints like that. So with SQL, you can do this all in one query, but with an ORM, you often create a query for each check.
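That one-query version can look like this (a sketch; the posts and memberships tables and their columns are invented): the INSERT only happens when the ownership check matches, and zero rows affected signals that the check failed.

    import (
        "database/sql"
        "errors"
    )

    func addPost(db *sql.DB, userID, orgID int64, body string) error {
        // Insert and authorisation check in a single statement.
        res, err := db.Exec(`
            INSERT INTO posts (org_id, author_id, body)
            SELECT $1, $2, $3
            WHERE EXISTS (
                SELECT 1 FROM memberships
                WHERE user_id = $2 AND org_id = $1
            )`, orgID, userID, body)
        if err != nil {
            return err
        }
        n, err := res.RowsAffected()
        if err != nil {
            return err
        }
        if n == 0 {
            return errors.New("user is not a member of that org")
        }
        return nil
    }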
Related to that is the idea of "premature optimisation". The problem with this concept is that all queries add up together to determine how much hardware you need, which determines your costs. You can argue the opposite of premature optimisation, depending on your case. Why not spend 5 minutes extra writing a manual query that will be run hundreds of thousands of times for years?
Are your goals about reducing infrastructure costs as much as possible? Is "developer time" really a thing, or are you doing this in your own time for a startup or something?
Then there is the fact that doing simple things faster isn't much of a selling point because they are already simple. It is very obvious though that ORMs are much, much more readable than raw SQL.
Then another question starts to come up, if you are using this abstraction like an ORM or GraphQL as an ORM, why are you even using a SQL database when none of the features are really available to you?
SQL has interesting new features in that you can do the object mapping inside the queries now. For example (with the joins and group by filled in; comments.post_id and posts.author_id are assumed join keys):

    select
        post.id,
        post.content,
        json_agg(c) as "comments",
        json_build_object('id', a.id, 'name', a.name) as "author"
    from
        posts post
        join comments c on c.post_id = post.id
        join authors a on a.id = post.author_id
    group by
        post.id, post.content, a.id, a.name;
Still, it is nowhere near as easy as using an ORM. The other thing is that when you start using an ORM, you really do end up having a different approach to querying your data in every situation. You don't use all of the various features of SQL. It ends up being a lot slower, but maybe in some cases that is worth it.
Yes ORMs are terrible but there is a middle ground required. Something to nicely pool connections and handle all the boilerplate involved in binding parameters to prepared statements. I normally roll my own but would love to find a good library that does it all.
One problem I've yet to find a good solution to is automatic reconnect on database errors with proper transaction support.
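In Go at least, the standard library's database/sql already covers the pooling and parameter-binding half of that wish list; a sketch (the driver and settings are just examples). Automatic reconnect inside an open transaction is genuinely harder, because a half-done transaction can't be transparently replayed after the connection dies.

    import (
        "database/sql"
        "time"

        _ "github.com/lib/pq" // any database/sql driver works here
    )

    func open(dsn string) (*sql.DB, error) {
        db, err := sql.Open("postgres", dsn)
        if err != nil {
            return nil, err
        }
        // *sql.DB is a connection pool, not a single connection.
        db.SetMaxOpenConns(20)
        db.SetMaxIdleConns(10)
        db.SetConnMaxLifetime(30 * time.Minute)
        return db, nil
    }

    // Prepared statements are reused across pooled connections:
    // stmt, err := db.Prepare(`SELECT id FROM users WHERE email = $1`)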
Whilst I have long railed against ORMs, I do appreciate `pg-orm` (and for my own stuff, `crud`) in Go because it means I don't have to write Nx100 (177 distinct structs at $WORK) versions of "scan the results into this struct type-safely".
But that's less of an ORM and more of a data mapper?
The sqlc lib "generates fully type-safe idiomatic Go code from SQL". It makes sense to generate code from SQL and not the other way around.
https://github.com/kyleconroy/sqlc
Just use a mature ORM where you can easily combine ORM capabilities and raw SQL, like Hibernate / Spring Data. There you can do raw-raw calls to the SQL driver, map your optimized by hand query to an object or let Hibernate fetch a whole graph of stuff by itself.
Rails with Arel is really special, the queries it creates are really impressive coming out of an ORM. I would be curious to hear an updated version of this critique which focuses on Arel.
Now of course learning SQL is still a good thing to do, but for some, a good ORM can be a way of doing that.
ORMs should not be sold as "Now you don't need to learn SQL". Like any abstraction, it pays to know about the layer below even if you're being saved from directly using it. In the case of SQL and ORMs, this is particularly true.
I couldn't agree more. We should all just learn and use SQL. If it really, really, really doesn't fit a significant portion of real-life tasks (and only then, because we should avoid multiplying standards), we should just design a new SQL.
Yeah, the thing is that at the beginning you just don't know what you'll need later, so sticking to anything strict will become a limitation sooner rather than later. It is a pain to do the manual work in every project over and over again, but "this is the way", unfortunately.
I have done two event-sourced projects in a row and I am trying to make a reusable library that I can use from now on, which will let me write projections so I am not tied to any schema and can build any data structures I'll need.
This is such a non-discussion, and articles from 2014 don't have any valid arguments anymore.
I put it alongside the discussions about which OS is best, which does not matter at all. Please stop wasting everyone's time with these arguments. Do what is best for your project and shut the f up.
The reason this is a good discussion to have is because, at a superficial level, it looks like ORMs should be faster than writing SQL, but when you get down into the details you find that the complexities are a lot deeper than a superficial view suggests. "Do what is best for your project" is good advice, and the author states a good case for ORMs not being "what's best for your project".