
Data warehousing latency minimization strategies

Hi All

We have a strategy here of reporting off a replicated copy of the live database. The idea is that when something changes on live, it is usually reflected in reporting within a few seconds. I'm not sure this is the best strategy, and it will become less appealing as time goes on. A few of our reports are starting to run very slowly because the schema is optimized for transactions, not analysis. I'd like to move a lot of our reporting to cubes, but there's a perception here that there is too much latency between the cubes and the source transactional data (say a day to a week).

The reality is that for 90% of reporting purposes here, the data doesn't need to be current to the minute, but up-to-the-minute reporting is treated like some sacred cow that no one will relinquish. I'm never going to win that argument. Where I might have some leverage, though, is if I could say that the cubes will be refreshed with minimal overhead every hour or two, say, instead of once a week or overnight. Where can I get some good doco on how to reduce the lag between the cube and live data? What are the tradeoffs associated with keeping cubes current? What are your experiences?
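For concreteness, the kind of thing I have in mind is a scheduled incremental refresh rather than a full weekly reprocess. A rough sketch of what I mean, as a SQL Agent job that runs a ProcessAdd against one cube partition every hour (all server, database, and cube names here are placeholders, not our real setup):

```sql
-- Hypothetical hourly refresh job; server/database/cube names are placeholders.
EXEC msdb.dbo.sp_add_job @job_name = N'Hourly cube refresh';

-- An Analysis Services command step carrying an XMLA ProcessAdd,
-- which appends new fact rows instead of reprocessing the whole cube.
EXEC msdb.dbo.sp_add_jobstep
     @job_name  = N'Hourly cube refresh',
     @step_name = N'ProcessAdd current partition',
     @subsystem = N'ANALYSISCOMMAND',
     @server    = N'SSAS01',
     @command   = N'<Process xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
       <Object>
         <DatabaseID>SalesDW</DatabaseID>
         <CubeID>Sales</CubeID>
         <MeasureGroupID>FactSales</MeasureGroupID>
         <PartitionID>Current</PartitionID>
       </Object>
       <Type>ProcessAdd</Type>
     </Process>';

-- Run every hour (freq_subday_type 8 = hours).
EXEC msdb.dbo.sp_add_jobschedule
     @job_name = N'Hourly cube refresh', @name = N'Hourly',
     @freq_type = 4, @freq_interval = 1,
     @freq_subday_type = 8, @freq_subday_interval = 1;
```

The other direction I keep reading about is SSAS proactive caching, which is supposed to refresh automatically as source data changes; I'd be interested in experiences with either approach.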


asked Dec 15, 2010 at 03:54 PM by GPO

1 answer

GPO, keeping cubes current is the bane of most DBAs/devs who have to look after a BI implementation.

You say that reporting currently runs against the OLTP system and is causing performance problems. What would speak against setting up transactional replication to replicate all required tables to another machine? This would allow you to offload the reporting, freeing up the OLTP system and letting you add reporting-specific indexes on the replication target.
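In outline, the publisher side looks something like this (object names are placeholders for illustration, not our actual setup):

```sql
-- On the publisher: mark the database for publication and publish a table.
USE SalesOLTP;
EXEC sp_replicationdboption @dbname = N'SalesOLTP',
     @optname = N'publish', @value = N'true';

EXEC sp_addpublication @publication = N'ReportingPub',
     @sync_method = N'concurrent', @repl_freq = N'continuous';

EXEC sp_addarticle @publication = N'ReportingPub',
     @article = N'Orders', @source_object = N'Orders';

-- On the subscriber: add reporting-only indexes the OLTP side never sees.
CREATE NONCLUSTERED INDEX IX_Orders_OrderDate
    ON dbo.Orders (OrderDate)
    INCLUDE (CustomerID, TotalDue);
```

(You would still need sp_addsubscription and the agents configured; this is just the shape of it.)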

We use this setup quite successfully, but are looking at the next release of SQL Server for further improvements through readable database mirrors.

answered Dec 15, 2010 at 10:38 PM by WilliamD

@WilliamD. Thanks for taking the time to respond. No, I said we're already reporting off replicated live dbs. Because the data isn't set up for reporting, the reports are generally a lot slower than they could be (and a lot slower to write for any newbies coming in, too). Incidentally (off topic, I know), we're told to put (nolock) hints on all queries that run off the replicated data because slow queries are (allegedly) a hindrance to replication. I have no idea whether this is correct, but forcing dirty reads on every query would seem to make a mockery of the conflicting desire to have data accurate up to the minute. Do you use (nolock) hints when reporting off your replicated data? Also incidentally, do large indexes slow down replication? How would I test this?

Dec 16, 2010 at 12:10 AM GPO

Writes to the replicated target can be slowed down by your read access of the data. Basically the replication is performing all write operations coming from the source (almost like a replay of the production system).

If you have additional indexes, these will change how the replication writes behave. By default, the indexes from the OLTP system will not be present on the replication target, so the replication writes will already be running differently from the prod system anyway.

Nolock will give dirty reads, but lower blocking. Depending on how heavy the load is on the replication target, this may benefit you. It comes down to how willing you are to accept dirty reads (a business decision, not personal opinion).

We do not use nolock, for the reason mentioned. We want clean data; that is a business decision: we'd rather be a little slower but right than quick and maybe wrong.
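If blocking on the target is the actual worry, one alternative to nolock worth weighing is turning on read committed snapshot for the subscriber database, so readers see a consistent, committed version of each row without blocking the replication writes (at the cost of version-store overhead in tempdb; the database name here is a placeholder):

```sql
-- Readers get row versions instead of taking shared locks.
ALTER DATABASE ReportingDB
    SET READ_COMMITTED_SNAPSHOT ON
    WITH ROLLBACK IMMEDIATE;
```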

I suggest looking at the queries you have, the indexes on the replication target, and the missing-index DMVs to speed those queries up. Your indexing strategy will probably differ from the OLTP system's, so don't necessarily take those indexes as a template.
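As a starting point, something like this run against the replication target surfaces the suggestions from the missing-index DMVs (treat the output as hints to evaluate, not indexes to create blindly):

```sql
-- Rank missing-index suggestions by a rough benefit estimate.
SELECT TOP (10)
       mid.statement AS table_name,
       mid.equality_columns,
       mid.inequality_columns,
       mid.included_columns,
       migs.user_seeks * migs.avg_user_impact AS rough_benefit
FROM sys.dm_db_missing_index_details AS mid
JOIN sys.dm_db_missing_index_groups AS mig
     ON mig.index_handle = mid.index_handle
JOIN sys.dm_db_missing_index_group_stats AS migs
     ON migs.group_handle = mig.index_group_handle
ORDER BY rough_benefit DESC;
```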

Dec 16, 2010 at 12:17 AM WilliamD


Last Updated: Sep 06, 2014 at 04:30 AM