question

Sarang avatar image
Sarang asked

Impact on performance when Used link server

Hi, Below is my requirement - I need to query and perform transactions on two different database servers (One database is on SQL Server 2000 and anothere is on SQL Server 2008 both resides on different servers) Data needs to be fetched for a web application. I am planning to use link server to perform this task at back-end. my concern is will using link server hit the performance of my application? what care should I take to avoid performance issues while using link server? Is there any othere options to connect two diferetn databases on different servers? Thanks in advance.
performancelinked-server
1 comment
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Sarang avatar image Sarang commented ·
Thanks a lot all of you and especially Kevin for the detailed explaination on this. sorry for the ambiguity with connect,Let me write the problem again - - We are reengineering some part of one of the older applications. - for this we need to move some tables from SQL 2000 to SQL 2008. - Now there are some SPs on SQL 2000 which uses tables on SQL 2000 as well as SQL 2008 as we have moved some tables. - I need some way to write the logic in these SPs where I need to join tables on SQL 2000 as well SQL 2008 and get the same results without any performance overhead. Thanks in advance
0 Likes 0 ·
WilliamD avatar image
WilliamD answered
It depends upon what you mean with "connect". If you are swapping some sort of read only data, you could use replication or SSIS/DTS or service broker. Linked Servers will work, you will probably be limited by Network bandwidth and I/O on each machine. These are the two bottlenecks on most systems. Tell us a little more about the connection you are wanting and what you want to pass between the servers, maybe we can come up with a few suggestions.
1 comment
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

TimothyAWiseman avatar image TimothyAWiseman commented ·
I have seen great overall performance gains by replacing linked servers with a replicated table, but of course that has some severe limitations and can only work in some use cases.
0 Likes 0 ·
Kevin Feasel avatar image
Kevin Feasel answered
In my experience with linked servers, they're useful for certain types of queries but can become major performance problems very quickly. Linchi Shea has a series of articles ([1]( http://sqlblog.com/blogs/linchi_shea/archive/2009/11/06/bad-database-practices-abusing-linked-servers.aspx), [2]( http://sqlblog.com/blogs/linchi_shea/archive/2010/11/23/linked-servers-permissions-and-distributed-query-performance.aspx), [3]( http://sqlblog.com/blogs/linchi_shea/archive/2010/12/01/linked-server-and-performance-impact-direction-matters.aspx)) on linked servers and performance that you should look at before beginning anything. My work with them pretty much corroborates his points, too. Here's a quick point list of things I've seen: 1. If you can, you may want to use an account with sysadmin, db*_*ownerfixed, or db*_*ddladmin rights across the linked server. Otherwise, your linked server queries will not get access to the remote SQL Server's statistics and this can lead to some very ugly plans. In our case, we can't use an account with that level of privilege (rather, we're not willing to accept the risk) and performance tuning can be a bear as a result. 2. Pull as little data as possible from the linked server. Ideally, you only want to get filtered records from a where clause (but if you're doing that, you probably should just do this at the business/data object level in your application). Inter-server joins can be a bit trickier because, in order to do the join, you need to send the entire data set across the wire to one of the servers--by default, the remote server sends data to the local server, though if you use an INNER REMOTE JOIN, you can reverse that scenario and have the local server send data to the remote server for processing. This isn't bad if you're pulling 3 records from the remote server, but if you're joining millions of rows on one server to millions of rows on another server, that's trouble. 3. Keep your queries as simple as possible. Try to avoid non-indexed views or anything which brings over more tables than you absolutely need. For example, we had a situation where we had two remote tables, Big and Little (not their real names, obviously). Little only had a couple thousand records, but Big had about 200,000 records. If you join your local table to Little and put a where clause in, it can be okay--even if you need to ship over the couple thousand records, that's not too bad if the two servers are on a fast enough connection. The problem was that the query we had involved a left outer join to Big, so in order to get the few rows from Big that we actually cared about, we needed to get all 200,000 rows across the wire before the joins and filters could actually work. You could imagine the kind of performance trouble this query had. 4. Following up on 3, I've also seen cases where the optimizer gets tricked with remote queries. If you join to Big 3 times, instead of using the same local temporary table, I've seen it get the table three separate times. 5. A simple __select * from dbo.Coordinator c INNER JOIN [Remote].[DB].dbo.User u ON c.UserId = u.Id__ can have much worse results than you would expect. In the case of one plan I ended up with, the remote server creates a temporary procedure (using sp*_*prepexec) with a parameter. Then, for each UserId the query passes along from the Coordinator table, it runs sp*_*execute with the temporary query and the current User ID. So with 60 distinct user IDs in the Coordinator table, it runs sp*_*execute on the server 60 times with 60 different IDs. With 500,000? Built-in coffee break. It's entirely possible to use linked servers in a production situation--we do with some level of success--but I'd recommend doing as little of it as you can get away with. You could try going down @WilliamD's route of moving the data across to the local server. Also, if you aren't doing intricate joins, there are cases where it could actually be faster to get the local data set, figure out the filters, and get the remote data set and perform the data merge in your data or business object layer. If you do use linked server connections, I highly recommend regularly reviewing the performance, especially if you don't use a linked server account which can access the remote server's statistics.
5 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

WilliamD avatar image WilliamD commented ·
Great answer Kevin (+1). We have problems with a linked server connection to a SAP system. We only read data from that system, so a replication of the tables would be better. Our SAP devs don't want to do it, so we have to wait for our data. The same people wonder why data loads are slow.....
0 Likes 0 ·
KenJ avatar image KenJ commented ·
Very thorough and informative answer!
0 Likes 0 ·
Oleg avatar image Oleg commented ·
This is a really good answer Kevin!!! Where did that +10 button go?There is one thing I found to be very interesting (and unexpected) when trying to insert some records into the database on the local server based on the join of the tables located on both local and remote servers: DTC usually ends up restating the set-based inserts by translating them into RBAR insert statements. It does not advertise the RBARing though.
0 Likes 0 ·
Kevin Feasel avatar image Kevin Feasel commented ·
Thanks for the kind words, all. Despite having no direct experience with SAP, I still cringe at the idea of doing that kind of work... I forgot about that part, Oleg. That's another nasty little thing that will hit you with linked queries. One more thing that I've found but forgot until just now: you can even end up RBAR-ing a select statement if you're going from local to remote. I've added it as point 5 above, but your RBAR mention reminded me of this.
0 Likes 0 ·
Fatherjack avatar image Fatherjack ♦♦ commented ·
wow, hats off to you, a great answer. +1
0 Likes 0 ·
KenJ avatar image
KenJ answered
As several other answers have indicated, there is a very real chance of a performance impact with linked servers. A non-linked server option is to modify the web application to make a separate connection to each SQL Server. I don't know how "related" the data on the two servers is, but if you weren't planning to join the data from the two servers, you'll just have two datasets to work with... same as with the linked server solution. If you *were* planning to join the data from the two servers, you can query the two servers into separate datasets, then join the two datasets by adding a relation between them ([super-quick .net overview][1]). If the data in one server is being used to filter the data from the other server, you can get the results from the first server and use some data from that dataset to filter the query on the second server. [1]: http://social.msdn.microsoft.com/Forums/en/adodotnetdataproviders/thread/f122d7f4-3b7d-4d93-bd0f-8bb57cd990a4
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Håkan Winther avatar image
Håkan Winther answered
Great answer Kevin! You should also try to use OPENQUERY to process your request on the linked server. Read more about linked server performance on [ http://www.sql-server-performance.com/tips/linked_server_p1.aspx][1] [1]: http://www.sql-server-performance.com/tips/linked_server_p1.aspx
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.