Performance: SQL Server vs MongoDB (Reporting)
As a continuation of this performance comparison, I wanted to also compare performance in SQL Server's supposed strong point: reporting on very large relational data sets. Does it truly perform better than MongoDB? Again, this will attempt to be a mostly vanilla comparison, without the different optimizations that can be done in both platforms, for a couple of typical reporting queries.
Setup
The test box remains the same: Core i7 CPU, 16 GB RAM, 1 TB 7200 RPM data disk. SQL Server 2019, MongoDB 6.0.1.
Data was randomly generated using a console app and inserted into both platforms. For SQL Server, I created both a relational table structure, and a columnstore indexed table that is more typically seen in reporting databases. The columnstore table was created as a flat copy of the relational tables, and a SELECT FROM INSERT statement populated the data. MongoDB received the same data, with the Customer and Product data embedded in each document to avoid doing any lookups or joins.
Sum of Quantity Sold & Total Price per Customer
SQL Server (clustered index)
from FactTransaction fact
join DimCustomer cust on fact.CustomerId = cust.Id
join DimProduct prod on fact.ProductId = prod.Id
where cust.[Name] = 'The GlowUp'
group by cust.[Name], prod.[Name]
Results: 15 minutes
SQL Server (columnstore index)
from TransactionCCI fact
where CustomerName = 'The GlowUp'
group by CustomerName, ProductName
Results: 27 seconds
MongoDB
Results: 18 minutes
Highest Total Price per Customer, per Product
SQL Server (clustered index)
from FactTransaction fact
join DimCustomer cust on fact.CustomerId = cust.Id
join DimProduct prod on fact.ProductId = prod.Id
group by fact.ProductId, fact.CustomerId
order by CustomerName, ProductName
Results: 16.5 minutes
SQL Server (columnstore index)
from TransactionCCI fact
group by ProductName, CustomerName
order by CustomerName, ProductName
Results: 32 seconds
MongoDB
Results: 58 minutes
Conclusion
TL;DR: SQL Server is better at reporting against very large data sets.
"Normal" relational table structures perform better than MongoDB. Depending on the query, performance was either slightly better or much better. However, both structures were blown out of the water by a columnstore index. The performance difference is unquestionable. When dealing with reporting against very large data sets, if it's possible to use columnstore, do it.
A final note, in the future I would like to do a performance comparison of SQL Server's columnstore index versus a NoSQL columnstore platform, such as Cassandra.
Comments
Post a Comment