Performance of SQL Server Substring Searches
SQL Server offers several methods to find a substring in expressions. I extended a post found here by adding the PATINDEX method, and also checking against the legacy cardinality optimizer to see if there were any differences under that query hint.
5 methods were used:
4 situations were compared:
5 methods were used:
- LIKE
- SUBSTRING – can only use this method if the substring either starts or ends the string, and only available in SQL 2016+
- LEFT/RIGHT – can only use this method if the substring either starts or ends the string
- CHARINDEX
- PATINDEX
4 situations were compared:
- A clustered indexed column
- A non-clustered indexed column
- A non-indexed column
- A loop through a list of strings, using an “IF (<method>)” to search inside the string on each loop iteration (not a common usage)
Results under OPTION (USE HINT('FORCE_LEGACY_CARDINALITY_ESTIMATION'))
Results under SQL Server 2017
Summary of results:
- If you force the legacy cardinality:
- PATINDEX and CHARINDEX are the fastest methods in most situations
- LIKE is usually the slowest method
- If you use the newer cardinality:
- SUBSTRING and LEFT/RIGHT are the fastest methods, as long as your search pattern always starts or ends the string
- CHARINDEX and PATINDEX are the better options if the string is part of a clustered indexed column.
- Both operators’ performance are worse in the newer cardinality against all column types
- LIKE is a better option for pattern matching against non-clustered indexed columns and non-indexed columns.
- Microsoft apparently optimized this operator against non-indexed columns in the newer cardinality
Comments
Post a Comment