What is the widest you would want a clustered index to be?
Basically, I'm fully aware of best practice being to make sure clustered indexes are as narrow as possible. But how narrow is narrow? Does anyone have a decent rule of thumb about a tipping point that would mean you're better off using a surrogate rather than a compound key?
There is no rule other than "as narrow as practicable." In a perfect world I'd be able to design all databases so that the best candidate for the cluster was a unique int/bigint just because that meets all the criteria for the best possible clustered key in terms of page level storage within the B-Tree. But in the real world, to a degree, I don't care how wide the index is. I try hard to make it as small as I can, but I don't let that get in the way of good choices. If I have a compound key consisting of three columns that will be unique and will store the data in a perfect order for retrieval, it makes absolutely no sense to pop an identity column on to the table just to satisfy an arbitrary "keep the key narrow" rule. But, if those are three VARCHAR(300) columns, then judgement dictates, that even though they fit within the 900 byte limit on the index, in all likelihood, all the problems that are going to be associated with having that wide a key will outweigh that great storage mechanism. As a SWAG, I'd get concerned beyond a range somewhere between 32-64 bytes, depending on circumstances like data types, access patterns, whether this was a table at the top of a wide range of child tables or the table at the bottom (because frequently, I make the child tables FK with the parent a clustered key if I can, again, depending). But note, I said concerned. I didn't say I wouldn't do it. in the end, it's a judgement call dependent on testing and knowledge of the system.
I'm not aware of a rule of thumb about where a tipping point would occur, I just preempt it by always using a surrogate key. A wide key will work without issue for most smaller or low volume data sets, but the impact potential is always there just waiting for your transaction or storage volumes to increase. If you've gone wide with your keys then reach a 'tipping point,' you're now in a place where people are noticing the performance issues, and it's more difficult (read expensive) to fix because the system is written around the wider keys - join paths in the application code, reporting queries, etc.