T
he excellent Joseph Albahari, author of LINQpad, wrote a blog post: Why LINQ beats SQL and while I really like LINQpad (how else would I debug LINQ?) I do think Joseph misses the point a little with SQL. He says: “Compared to SQL, LINQ is simpler, tidier, and higher-level” and my first reaction is that SQL and LINQ are bulldozers and Cadillacs here, not good subjects for direct comparison.

I think there is a place for LINQ to entities but I do not believe that all database functions and access can be replaced by LINQ queries. I’d propose that any query that involves more than a couple of joins probably (not always) should be a stored procedure or inline SQL rather than a LINQ query.

Why?

Transact-SQL is a language provided for manipulating data in the SQL Server environment. As Joseph points out, SQL was invented in 1974. That doesn’t make it bad, instead Transact SQL has had a long time to evolve, and there are is a lot of functionality in SQL Server that’s there because it makes the edge cases workable (table variables, common table expressions, on and on).

SQL is executed therefore in the context of the database engine, before any data is returned to the client PC, web server, or other data consumer. SQL is very efficient at retrieving data, and ideally you would want to have SQL transmit over the network only that specific data you need to work with.

LINQ can be used to create complex queries, but for complex queries Transact SQL requires less programmer effort and often runs more efficiently.

For example…

A temptation when trying to code a complex query in LINQ is to break it into multiple chunks, or to use loops. This will usually be slower and less efficient than asking for the correct data in the first query.

This is true in Transact SQL and it’s true in C#, and “cursors are bad” is usually one of the first refrains you encounter when starting to work with Transact SQL. LINQ obfuscates your application’s interaction with the database, so it’s a lot easier to accidentally write code that generates convoluted and inefficient queries, and a lot harder to detect when it’s a query that’s causing the performance issue. (Imagine a DBA looking for slow queries, and then imagine her trying to make sense of the SQL generated by LINQ to Entities).

When you’re working with data, it’s a good idea to be aware of the context that you’re in and SQL Server makes that a little easier by providing options like SET STATISTICS IO and by showing query plans. You can control aspects of your environment using options like SET TRANSACTION ISOLATION LEVEL, common table expressions and global and local temporary variables.

LINQ usually creates efficient queries, but when it doesn’t it can be really tough to to determine why. The generated sql that it creates is very difficult to parse for moderately complex queries, and there’s no way to control how that sql will be executed using, for example OPTION FORCE ORDER or TABLE HINT INDEX.

If you’re working on a complex query, and have to do so in LINQ, it can be a nightmare, even using the wonderful LINQpad. Think about it: Every LINQ command is being translated into SQL before being executed. It’s like programming using waldos.

For example, what is the below LINQ to entities error telling me?

mystery

Answer: That the model classes don’t match the schema of the database, a table in the database (which one?) has a column (again, which one?) that is defined in the C# model as having a length of 8000 but which in the database has a length of 4000. For a query, this error is irrelevant, and a Transact SQL would just retrieve the data here, no questions asked, no exceptions raised.

This does point out a situation where LINQ to Entities is very useful. That is: Inserts, updates and deletes. LINQ to entities protects you against situations where you are inserting data into a table, but the data type of the column is different than when you initially created the application. Instead of a runtime error when the insert takes place, the application will fail right away.

So, that being said, I would say: Sometimes LINQ beats SQL, and sometimes SQL beats LINQ:

Use LINQ for:

  • Simple queries (fewer than a couple of joins, no aggregates beyond COUNT)
  • Adds, inserts and deletes

Use SQL for:

  • Complex queries
  • Queries where performance is important
  • Cases where server-side data manipulation is necessary (temp tables, SQL to XML, etc.)

Debugging a complex LINQ statement that’s embedded inside of some giant and monolithic application can be a real exercise in stress management. In these cases a little pragmatism (and a stored procedure or inline SQL) can add years to your life.