> In any sufficiently complex application, something like a query builder that allows different parts of the application to work together to lazily compose queries (vs combining/filtering/sorting the data in memory) will be created at some point.
Welcome to LINQ - introduced in 2007 (?).
LINQ isn’t exactly an ORM. You create strongly typed LINQ statements that are turned into expression trees that are then turned into a query by a query provider.
Being able to run queries over data without having to implement the various boilerplate is excellent.
Unfortunately basic mistakes can result in so many of those N+1 type queries if you're not careful. It can also result in reading entire table(s), possibly inside those nested N+1s.
It gets expensive when it's a remote DB server and/or dealing with a lot of records.
Profiling tools are essential.
It's why I'm not a fan of LINQ (and Entity Framework) for talking to DB severs without very strict controls.
Many times I've been called in to deal with a "slow" server, only to find out the issue is someone chained together a bunch of calls through EF and we have hundreds of thousands of queries that can be replaced by one or two.
> Unfortunately basic mistakes can result in so many of those N+1 type queries if you're not careful. It can also result in reading entire table(s), possibly inside those nested N+1s.
I don't know which "basic mistake" would do this. Maybe when using Lazy loading Proxies?
We use EF Core and it is quite easy to control eager loading or load-on-demand on a per-case basis.
Would you consider SQL an abstraction over DynamoDB? ElasticSearch’s native query language? Apache Presto to query files? Mongo?
All of those can use SQL as a worse query language than their native counterparts.
You can also create very bad SQL if you don’t know the underlying engine. For instance if you try to write SQL for a columnar database like you would for a traditional database, you are in for a world of hurt.
You seem to be talking about databases who's native language is not SQL. In those cases, yes, SQL is definitely an abstraction. There's some layer that's transforming that into (say) Elastic query or Dynamo queries.
I would argue that when talking about writing SQL, most folks are talking about applications who's query planner talks some dialect of SQL. MS SQL Server, MariaDB, MySQL, Oracle, SQLite.
In those cases, it's not really an abstraction any more than writing assembly is an abstraction over (say) Intel's CPU microcode is.
The query planner takes your SQL and turns it into something else, sure, but you can't generally do that yourself. It's the lowest layer of abstraction that's reasonably available.
Iirc, Linq is exactly a query builder and not an orm. IMO what truly makes an orm is when object fields have data binding against the database (updating a field triggers an update in the db)
It's kinda gross because the developer needs to make a decision about what is authoritative source of truth, the programming model makes you feel like you can trust your code (probably the wrong choice), and all the footguns around distributed state kick in (possibly even worse if you have a frontend with two way data binding and data structures that last longer than an http request in your backend)
LINQ is a query builder and the .Net runtime creates expression trees. Entity Framework translates the expression tree to SQL and maps data back to objects.
Linq is really just a high level abstraction for querying any data that can be queried. EF adds some more abstractions mostly related to mapping C# objects to database tables. Then it adds database specific query providers (implementations) that combine the (abstract) linq query and the mapping to produce sql.
You can really use linq to query anything (APIs, for example), including databases without using EF. It's just not very common because building a custom query provider is a lot of work.
LINQ is just an interface, you need some sort of an implementation below it (be it the unmaintained LINQ-to-SQL that AFAIK nobody should be using anymore, or the latest EF rewrite).
It's also the first thing I recommend to anyone claiming that concatenating SQL strings together is a good use of your time in the 21st century. Makes writing things like complex HTML tables with configurable columns and tons of optional filters so much faster and more maintainable.
In my experience, across a wide variety of applications built wide a number of different teams, LINQ has almost always been an anti-pattern. Whenever you inject a leaky abstraction (as Spolsky would say), things start to go awry. The crimes of grotesque inefficiency I've seen because the magical LINQ is there to shield devs from proper data tiers.
Absolutely, there are times where programmers try to do way too much in LINQ and don't realize that the query being generated to back up their chain of LINQ operations is a monstrosity of thousands of lines of SQL that will grind the server to a hault. Or the abstraction leaks badly due to the programmer using something EF doesn't know how to translate to SQL, so it ends up loading a much larger dataset and trying to complete the rest of the filtering in memory.
As developers get more acclimated to the .NET ecosystem they usually pick up a good spidey sense for what query "shapes" are appropriate to express in LINQ and which are asking for trouble.
However, for every one of those awful LINQ queries in a codebase, I think there are 40 to 50 run-of-the-mill queries that make it worthwhile. The biggest win that it delivers over a simpler, "stringier" ORM is the ability to return structured data. I have no problem writing SQL, but processing the results back out of flat rows into objects gets extremely annoying.
Say I want to load some Foos with their Bars. If I use straight SQL with a row mapper, I define a type to represent each FooBar row, load a List<FooBarRow> into memory from my simple left join, then I'll have to do a GroupBy in memory on that List to get the actual structure that I want: a list of Foos where each one has its own list of Bars. Notice that I also have to handle the empty-list case specially.
var fooBarRows = await connection.QueryAsync<FooBarRow>("select f.FooId, f.Name as FooName, b.BarId, b.Name as BarName from Foos f left join Bars b on b.FooId = f.FooId where whatever");
// re-shape result set manually from flat query results
var foos = fooBarRows
.GroupBy(fbr => fbr.FooId)
.Select(g => new Foo
{
FooId = g.Key,
FooName = g.First().Name,
Bars = g.First().BarId == null ? new List<Bar>() : g.Select(b => new Bar { BarId = b.BarId.Value, BarName = b.BarName }).ToList()
});
In EF it's just:
var foos = await db.Foos.Where(whatever).Include(f => f.Bars).ToListAsync();
Saving that clutter -- both the code and the otherwise useless intermediate FooBarRow type -- really adds up because the apps I write have a zillion queries very much like this.
So I'll take the occasional shitty query that has to be tracked down and optimized in exchange.
>I have no problem writing SQL, but processing the results back out of flat rows into objects gets extremely annoying.
I'm pretty sure Dapper lets you do this fairly easily. You've got a couple options - you can either provide a mapping function, or you can have your query/sprocs return multiple result sets. Then you can assemble the result sets into the object structure.
It's not quite as magical as LINQ, but it's also not quite as annoying and fraught with so much marshalling code as your example.
When they say it forces client evaluation, what they mean is that any LINQ operations you run after the ToListAsync() will run on the client side - since it's just a normal list in memory at that point and no longer has any knowledge of the database. So if you want to make sure some code will execute client-side, you should ToList or AsEnumerable your query first, then do your additional client-side operations on that list.
The IQueryable operations composed before to the ToListAsync() will run as SQL on the server. In the case of my example it will look pretty similar in structure to the string SQL I wrote before it: a straightforward left join with a where clause. Performance will be similar too. It will not load the entire universe of Foos and Bars and then filter them down on the client.
One of the habits I have developed from working with EF for the past 10 years is to be explicit and as local as possible about forcing the query to evaluate. It lets me pretty reliably predict what the SQL is going to look like.
You can return IQueryables from methods and pass them around through many layers of your program, adding complexity to them as you go, and lazily getting the results at the last possible moment, five layers up the call stack from where the query first started.
It seems at first like that would be good, because that way the maximum amount of logic will run on the DB server, and letting the server do stuff is better, right? But that's also where you open yourself up to being very surprised about what your SQL looks like by the time it executes, with multiple layers of your program each tacking complexity onto the query. It can get ugly. Also you can have problems if you're hanging onto IQueryables that you still haven't executed after you've Dispose()d your DbContext. For this reason I try not to let IQueryables travel very far through my program before forcing evaluation.
It's sad that in programming world, something is either anti-pattern and you must not use it at all, or something is so awesome that you need to use it everywhere.
LINQ in general is a very bad indicator if it isn't very restricted. I don't think it's a crazy claim whatsoever: When LINQ appears littered through code, it usually mean there has been perilously little data access planning, so instead of planning out and centralizing concise, planned, secure, optimized data access avenues, just pretend that LINQ being available everywhere is a solution. Every LINQ-heavy solution I've had the misfortune of using has always, with zero exceptions, been a performance nightmare. And instead of one big function you can just optimize, it is death by a trillion cuts.
I get that there are big LINQ advocates. It's convenient. I'm sure there are projects where it is use spectacularly. I've never seen such a project.
First time I’ve heard that claim was when I had to start working with Go. I was telling someone how much I miss LINQ/functional programming, and he said exactly the same thing “ugh, LINQ is why .NET is so slow”. I was perplexed. Do you even know what LINQ is? How does it make .NET slow? And to the best of knowledge they were comparing JIT startup time to Go’s AOT time. Smh
This comment make me question if you really have any experience with LINQ. If you had, you would not make a comment where you seem to think that LINQ is used only for data access.
LINQ is (at the most basic level) list comprehensions done better. It is functional programming for the imperative C# programmers. It has the potential to remove most/all loops and make the code more readable in the process.
>This comment make me question if you really have any experience with LINQ
The classic fallback.
>If you had, you would not make a comment where you seem to think that LINQ is used only for data access.
List comprehension is data access. Accessing sets in memory is data access.
>If you had, you would not make a comment where you seem to think that LINQ is used only for data access.
It has the potential, and almost the certainty, of allowing one to thoroughly shoot themselves in the foot. See without the "magic" of LINQ the grotesqueness of many patterns of data access (which, as previously mentioned, includes in memory structures. Pretty bizarre that anyone actually in this field thinks this only applies to databases, or that only DBs are "data") would lead one to rethink.
LINQ is almost always a bad indicator. It is actually a fantastic thing in one off scripts and hack type code, but when it appears in production code, I would say 90%+ of the time it is absolutely gross, but it hides how gross it actually is.
> LINQ is almost always a bad indicator. It is actually a fantastic thing in one off scripts and hack type code, but when it appears in production code, I would say 90%+ of the time it is absolutely gross, but it hides how gross it actually is.
Yeah, you definitely have no idea what LINQ, or an list, even is.
It is just a slightly slower than e.g fors, so unless this is hot path, then it is basically not relevant meanwhile it improves readability of the code
Welcome to LINQ - introduced in 2007 (?).
LINQ isn’t exactly an ORM. You create strongly typed LINQ statements that are turned into expression trees that are then turned into a query by a query provider.