Sometimes we are tasked with writing an analytics query, and the first step is to reach for the ORM; the job is DONE.
We test it out locally and it's INSTANT. Of course it is... we have 5 rows in the table.
We go to staging and everything is FINE.
Production is also FINE.
Some time passes and the users fill up the database with their data.
Now we are starting to see some spikes...
And the day comes when our job fails and management is upset because the report no longer arrives.
Now that we know the pain, let's see how we optimized our monthly/weekly report query.
Simply put: this is a query that sums and counts values over the period of the last week or the last month.
DateTime currentUtcTime = DateTime.UtcNow;
DateTime oneWeekAgo = currentUtcTime.AddDays(-7);
DateTime oneMonthAgo = currentUtcTime.AddDays(-30);

return UnitOfWork.Entities.All()
    .Select(e => new EntityStatisticsDto
    {
        EntityName = e.Name,
        Attribute1LastWeek = e.RelatedEntities2
            .Where(re2 => DbFunctions.TruncateTime(re2.CreatedOn).Value >= DbFunctions.TruncateTime(oneWeekAgo).Value)
            .Select(x => x.Attribute1)
            .DefaultIfEmpty(0)
            .Sum() / CustomExtensions.OneUnitBytes,
        Attribute1LastMonth = e.RelatedEntities2
            .Where(re2 => DbFunctions.TruncateTime(re2.CreatedOn).Value >= DbFunctions.TruncateTime(oneMonthAgo).Value)
            .Select(x => x.Attribute1)
            .DefaultIfEmpty(0)
            .Sum() / CustomExtensions.OneUnitBytes,
        OptimizedAttributeLastWeek = e.RelatedEntities2
            .Where(re2 => DbFunctions.TruncateTime(re2.CreatedOn).Value >= DbFunctions.TruncateTime(oneWeekAgo).Value)
            .Select(x => x.EntityType == Models.RelatedEntity2.EntityType.Type1 && x.Type1 != null && x.Type1.OptimizedAttribute != 0
                ? x.Type1.OptimizedAttribute
                : x.Attribute1)
            .DefaultIfEmpty(0)
            .Sum() / CustomExtensions.OneUnitBytes,
        OptimizedAttributeLastMonth = e.RelatedEntities2
            .Where(re2 => DbFunctions.TruncateTime(re2.CreatedOn).Value >= DbFunctions.TruncateTime(oneMonthAgo).Value)
            .Select(x => x.EntityType == Models.RelatedEntity2.EntityType.Type1 && x.Type1 != null && x.Type1.OptimizedAttribute != 0
                ? x.Type1.OptimizedAttribute
                : x.Attribute1)
            .DefaultIfEmpty(0)
            .Sum() / CustomExtensions.OneUnitBytes
    })
    .AsEnumerable();
The main problem we faced with this query is that the SQL Entity Framework generated read the same tables many times over instead of joining them once.
We found this simply by executing the SQL this code produced and inspecting the Actual Execution Plan in SSMS.
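For those on Entity Framework 6 (an assumption; a SQL profiler session works just as well), the generated SQL can also be captured in code with the built-in `Database.Log` hook. `MyDbContext` below is a hypothetical context name standing in for whatever context the unit of work wraps:

```csharp
using System;
using System.Linq;

using (var context = new MyDbContext()) // hypothetical DbContext name
{
    // Every SQL command EF sends to the server is written out here;
    // point it at Console, Debug, or your logger of choice.
    context.Database.Log = Console.WriteLine;

    // Materializing any query now prints the exact SELECT that hits the
    // database, which you can paste into SSMS to get the execution plan.
    var names = context.Entities.Select(e => e.Name).ToList();
}
```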
CREATE PROCEDURE [dbo].[sp_GetEntityStatistics] (@unitSize INTEGER)
AS
BEGIN
    DECLARE @oneWeekAgo AS DATETIME = GETUTCDATE() - 7;
    DECLARE @oneMonthAgo AS DATETIME = GETUTCDATE() - 30;

    SELECT
        [databaseEntity].[Name] AS EntityName,
        SUM(CAST(CASE WHEN databaseRelatedEntity2.CreatedOn >= @oneWeekAgo
                      THEN databaseRelatedEntity2.Attribute1
                      ELSE 0 END AS DECIMAL(36, 2))) / @unitSize AS Attribute1LastWeek,
        SUM(CAST(CASE WHEN databaseRelatedEntity2.CreatedOn >= @oneMonthAgo
                      THEN databaseRelatedEntity2.Attribute1
                      ELSE 0 END AS DECIMAL(36, 2))) / @unitSize AS Attribute1LastMonth,
        SUM(CAST(CASE WHEN databaseRelatedEntity2.CreatedOn >= @oneWeekAgo
                      THEN CASE WHEN databaseRelatedEntity2.EntityType = 1
                                     AND databaseType1.ID IS NOT NULL
                                     AND databaseType1.OptimizedAttribute <> 0
                                THEN databaseType1.OptimizedAttribute
                                ELSE databaseRelatedEntity2.Attribute1 END
                      ELSE 0 END AS DECIMAL(36, 2))) / @unitSize AS OptimizedAttributeLastWeek,
        SUM(CAST(CASE WHEN databaseRelatedEntity2.CreatedOn >= @oneMonthAgo
                      THEN CASE WHEN databaseRelatedEntity2.EntityType = 1
                                     AND databaseType1.ID IS NOT NULL
                                     AND databaseType1.OptimizedAttribute <> 0
                                THEN databaseType1.OptimizedAttribute
                                ELSE databaseRelatedEntity2.Attribute1 END
                      ELSE 0 END AS DECIMAL(36, 2))) / @unitSize AS OptimizedAttributeLastMonth
    FROM dbo.[Entity] AS databaseEntity
        INNER JOIN dbo.[RelatedEntity2] AS databaseRelatedEntity2 ON databaseEntity.ID = databaseRelatedEntity2.EntityFk
        LEFT OUTER JOIN dbo.[Type1] AS databaseType1 ON databaseRelatedEntity2.ID = databaseType1.ID
    WHERE databaseEntity.DeletedOn IS NULL
    GROUP BY databaseEntity.Name, databaseEntity.ID
END
With this stored procedure in place, execution time dropped from around 5 minutes (with occasional outright failures) to around 20-30 seconds.
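To keep the rest of the application unchanged, the procedure can be called through EF6's raw SQL support, which maps each result row onto the existing DTO by column name. This is a sketch under the assumption that a `DbContext` instance (here `context`) is reachable from the unit of work:

```csharp
using System.Data.SqlClient;
using System.Linq;

// EXEC the stored procedure and hydrate the same DTO the LINQ query
// used to produce; column aliases in the SELECT must match the DTO
// property names (EntityName, Attribute1LastWeek, and so on).
var stats = context.Database
    .SqlQuery<EntityStatisticsDto>(
        "EXEC [dbo].[sp_GetEntityStatistics] @unitSize",
        new SqlParameter("@unitSize", CustomExtensions.OneUnitBytes))
    .ToList();
```

Because the shape of the result is identical, callers of the old method never notice that the aggregation moved into the database.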