I have seen quite a few examples of paging data in SQL Server 2005, that use something like the following:
Now when I run a query such as
the CURRENT_TIMESTAMP seems to return entirely the same result. Given this, I don't see how an ORDER BY CURRENT_TIMESTAMP could possibly return a reliable paged dataset.
However, since I have seen this used in multiple places (and in the case of NHibernate, a very widely used library), I find it difficult to believe that this approach is fundamentally flawed and I am the first one to see it!
How is it that ORDER BY CURRENT_TIMESTAMP can produce reliable paging results? (or if it doesn't, under what scenarios does it give bad results?)
asked Aug 01, 2011 at 02:53 PM in Default
All that current_timestamp does is it returns the current date and time information, so it is pretty much identical to getDate(). The samples that you see in the referenced links use the current_timestamp in order to avoid the sorting, not to actually implement it. The sorting in the samples is performed by the sub-select which has both top(n) and order by simply because it is not possible to have order by without top(n) or top percent in sub-selects.
This technique makes the queries admittedly heavy, much heavier than they deserve to be but then again, one does not expect anything less than heavy out of such a heavy product like NHibernate. To apply the technique to your sample, you would have to use a select statement complemented with both top(n) and order by, make a sub-select out of it, select everything plus the row_number() ordered by essentially nothing (that is why the current_timestamp is used), and then finally make either a sub-select or the CTE out of that external select. This will make a query pretty heavy, but allow to to use a "generic template" which those links crave:
If you are to go this route, just ensure that your top(n) uses some ridiculously high value for that n in order to avoid undesired exclusion of some records in the useful, pre-sorted part. I have seen people choosing 2147483647 for that value. Since this is the highest integer value, it will ensure that top(n) returns all data that you have.
Here is the sample of such "generic technique" but with ntile used in place of the row_number. Suppose I want to return the third page of all records from sys.objects ordered by name and I want to have the records to be split into 10 pages (not that I will ever actually do it this way, there is no need to nest twice when once is already enough):
How I would actually do it is probably like this:
Generic? Sadly, it is not. Do I have to use the silly top(ridiculous_n) and order by in the sub-select and useless ordering by current_timestamp on the top? No. Does it make my query lighter? Absolutely.
Sorting the data by
Conor Cunningham (one of the authors of the SQL Server query optimizer) has a blog post about the run-time constant functions: Conor vs. Runtime Constant Functions
So in general ordering by a run-time constant function has no sense and will only add an overhead as all the record to which a run-time constant function value is added, have the same value.
answered Aug 02, 2011 at 12:58 AM