Why Linq2Umbraco can be a trap

Some time ago I took over an Umbraco e-commerce website built with a custom web shop implementation. No fancy stuff, just categories, products, a basket and a checkout. The site owner complained that Umbraco was slow and that the loading times (sometimes 5-10 seconds for one page) were unacceptable. Since I have been working with Umbraco for years, I know that the system is superfast if used in the right way, so I defended Umbraco.

Why was the website slow?

When I was looking for the source of the “slowness” I learned something about Linq2Umbraco that I had never thought of as a problem.


Linq2Umbraco what?

If you have never heard of Linq2Umbraco, it’s a very handy piece of technology included in the Umbraco core. Aaron Powell wrote it, and I think it was first included in Umbraco v4.5.0.

It’s basically a way to query the published and cached Umbraco data from your code using the Linq-syntax. A query could look like this:

var products = from p in MyUmbracoContext.Products
               where p.Price > 100
               select p;


This approach is a lot cleaner and more readable than using, for example, XPath to perform the same query!


To use Linq2Umbraco you just right-click the “Document Types” folder in the settings section and export your document types as C# classes.


Umbraco will now generate a code file with your document types as POCO-classes and create an UmbracoContext that will be used to query the data.





This is great! What is your problem?

With great power comes great responsibility. Linq2Umbraco is great, but it’s not the solution to all our problems. As Aaron writes in his article on “Why no IQueryable in LINQ to Umbraco?”, Linq2Umbraco will take all the content of the XML cache, convert it to an in-memory collection and then run the query against that collection.

This means that if you have 20-40 nodes in your content section during development, your site will be superfast. BUT. When the site is in production and the client adds all their 5000 content nodes, we run into a problem.

The project that I was working with used Linq2Umbraco for all the interaction with the Umbraco cache. Even if they just needed one node from the cache they wrote something like:


var products = (from p in MyUmbracoContext.Products
                where p.Id == 2555
                select p).First();


Think about that query for a while. What will happen during execution?


Answer: Linq2Umbraco will grab the XML cache and parse it into an in-memory collection of objects. In this case, with over 5000 products, this step alone takes around 2-3 seconds. Only then will our code run the condition (where p.Id == 2555) on that in-memory collection and return the product with an id of 2555.
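To make the cost visible, here is a minimal, self-contained sketch of the "convert everything, then filter" pattern. It does not use Linq2Umbraco or its real internals: the `Product` class, the `MaterializeDemo` helper and the fake XML structure are all made up for illustration, and it only mimics the behaviour described above with plain LINQ to XML.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical stand-in for a generated document type class.
public class Product
{
    public int Id { get; set; }
    public decimal Price { get; set; }
}

public static class MaterializeDemo
{
    // Counts how many XML nodes were converted into objects.
    public static int ConversionCount;

    // Builds a fake cache with `count` product nodes, ids 1..count.
    // The element and attribute names are assumptions, not Umbraco's schema.
    public static XDocument BuildFakeCache(int count)
    {
        return new XDocument(
            new XElement("root",
                Enumerable.Range(1, count).Select(i =>
                    new XElement("Product",
                        new XAttribute("id", i),
                        new XAttribute("price", i % 500)))));
    }

    // Step 1: convert every node into an object, whatever the query is.
    public static List<Product> LoadAllProducts(XDocument cache)
    {
        return cache.Root.Elements("Product").Select(e =>
        {
            ConversionCount++;
            return new Product
            {
                Id = (int)e.Attribute("id"),
                Price = (decimal)e.Attribute("price")
            };
        }).ToList(); // ToList forces the whole collection to be built
    }

    // Step 2: only now is the condition applied to the in-memory list.
    public static Product FindById(XDocument cache, int id)
    {
        return LoadAllProducts(cache).First(p => p.Id == id);
    }
}
```

Calling `MaterializeDemo.FindById(cache, 2555)` on a cache built with `BuildFakeCache(5000)` converts all 5000 nodes just to hand back one of them, and it does so again on every call.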

Since Linq2Umbraco creates this in-memory collection, it’s very expensive to run queries on sites with a lot of content nodes. I ended up creating another repository implemented with XPath to get just one node, and for more advanced queries (not something I did in this project) I would look at Examine.
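A single-node XPath lookup like the one mentioned above could look roughly like this. It is a sketch and not the repository from the project: it runs against a fake XML document using the standard `System.Xml.XPath` extensions, and the `XPathLookupDemo` class, the `/root/Product` structure and the attribute names are assumptions made for the example.

```csharp
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

public static class XPathLookupDemo
{
    // Builds a fake cache with `count` product nodes, ids 1..count.
    // The element and attribute names are assumptions, not Umbraco's schema.
    public static XDocument BuildFakeCache(int count)
    {
        return new XDocument(
            new XElement("root",
                Enumerable.Range(1, count).Select(i =>
                    new XElement("Product",
                        new XAttribute("id", i),
                        new XAttribute("nodeName", "Product " + i)))));
    }

    // Selects one node by id straight from the XML.
    // Returns null when no node has that id.
    public static XElement FindProductNode(XDocument cache, int id)
    {
        return cache.XPathSelectElement("/root/Product[@id='" + id + "']");
    }
}
```

The XPath engine still has to walk the XML to find the match, but no object is created for the other 4999 nodes, and the caller only reads the attributes it needs from the one element that comes back, for example `(string)node.Attribute("nodeName")`.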

So, to wrap up: if you know that your site may grow beyond something like 100 nodes, do not use Linq2Umbraco unless you are 100% sure what you are doing.



PS. If you like the blog post - please share it =D