We have a sizeable 70 GB production database, and we now need a copy of it for analytical processing. We already have replication set up for some tables to feed our web applications, but we wanted to know the best way to copy the entire database from one server to another.
We would like to refresh the data every 24 hours. I can manually back up the production database and restore a copy, but that isn't feasible as an everyday task.
Many thanks Dhar
Answer by Grant Fritchey ·
The best approach would be to build an SSIS package that only moves the data that has changed on any given day. It might take a while to build, but it will serve you best over the long term.
Other options include running a backup and restore, which, as you note, probably isn't sustainable. You could go with snapshot replication, but that would likely prove as problematic as moving the whole database, if not more so.
I'd suggest the SSIS solution.
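To make the "only moves the data that has changed" idea concrete, here is a rough sketch of the per-table incremental load an SSIS data flow would perform. All names are illustrative assumptions: a `ModifiedDate` column on the source table, a watermark table `dbo.ETL_Watermark`, and a linked server called `[PROD]` pointing at production.

```sql
-- Hypothetical incremental load for one table; every object name here is
-- an assumption, not part of the original poster's schema.
DECLARE @LastLoad datetime;

SELECT @LastLoad = LastLoadTime
FROM dbo.ETL_Watermark
WHERE TableName = 'Orders';

-- Pull only rows changed since the last load across the linked server.
INSERT INTO dbo.Orders (OrderID, CustomerID, OrderDate, ModifiedDate)
SELECT s.OrderID, s.CustomerID, s.OrderDate, s.ModifiedDate
FROM [PROD].Production.dbo.Orders AS s
WHERE s.ModifiedDate > @LastLoad
  AND NOT EXISTS (SELECT 1 FROM dbo.Orders AS t
                  WHERE t.OrderID = s.OrderID);

-- Advance the watermark so tomorrow's run starts where this one ended.
UPDATE dbo.ETL_Watermark
SET LastLoadTime = GETDATE()
WHERE TableName = 'Orders';
```

In a real SSIS package this would be one data-flow task per table (and you would also handle updates and deletes, not just inserts), but the watermark pattern is the core of it.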
Answer by TimothyAWiseman ·
I agree with Grant: an SSIS package that moves only the essential data is probably the way to go. But in SQL Server there are often many options, each with its own trade-offs, so here are a couple of others:
Use Red Gate SQL Backup to do your nightly backup and restore. I did this for a while, from production to a test database the developers were using, on a database over 100 GB, with great success. Its compression makes it very good at minimizing the network traffic involved, and it is quite easy to script a job that runs it automatically so you do not need to do it manually.
Also, you may want to look at transactional replication. I have used transactional replication to keep reporting servers in sync with production with great success. Remember that you do not need to set transactional replication to run constantly; you can schedule it to run once a day, and once an hour tends to be a good compromise for some organizations.
I should point out that, in some cases with lots of updates, transactional replication can be less bandwidth-efficient than backing up and restoring, but that should only happen on systems where the volume of changes is large compared with the total size of the data involved, which is relatively rare.
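For reference, the nightly refresh that SQL Backup automates boils down to the same pattern as a native T-SQL backup and restore, which you can schedule as a SQL Agent job. The database names, paths, and logical file names below are illustrative assumptions:

```sql
-- Nightly refresh sketch using native BACKUP/RESTORE; Red Gate SQL Backup
-- wraps the same pattern and adds stronger compression. All paths and the
-- logical file names in the MOVE clauses are assumptions for illustration.
BACKUP DATABASE Production
TO DISK = N'\\fileshare\backups\Production.bak'
WITH INIT;

RESTORE DATABASE Production_Reporting
FROM DISK = N'\\fileshare\backups\Production.bak'
WITH REPLACE,
     MOVE 'Production'     TO N'D:\Data\Production_Reporting.mdf',
     MOVE 'Production_log' TO N'E:\Logs\Production_Reporting.ldf';
```

Check the logical file names with `RESTORE FILELISTONLY` first; they rarely match the physical file names exactly.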
Answer by Jeff Moden ·
If it's that big a DB then, chances are, you have a decent SAN to go with it. Many SAN manufacturers have a utility to make "SAN snapshots", which are more like clones. They take almost nothing to set up, have very little effect on server performance, don't require any interruption on the prod server, and require very little "time out" on the target server.
We used such a thing at one of the companies I worked for with great success: almost a terabyte of data in about 2 minutes. It beats the hell out of replication, SSIS, and restores.
Answer by Fatherjack ·
I would say you have a few viable options - Log Shipping, Backup/Restore, Replication, SSIS - and each has its good and bad points.
Firstly, the ones I have used most:
Replication: we have merge and transactional replication in place here and it's pretty reliable, though when it goes wrong it is a pain to sort out. Thankfully that's not often. The benefit here is that the OLAP database is available 24/7; there is no downtime while updates are applied.
Backup/Restore: we use SQL Backup but have nothing the same size as you are describing, so I cannot comment on the performance or time required in your situation. SQL Backup is rock solid, though. I have never seen a backup or restore fail due to software issues; sometimes the network has failed, sometimes I have made a mess of the settings, but other than that, a 100% success rate.
Now the ones I haven't tried:
Log Shipping: this can be managed through SQL Backup or via SSMS and T-SQL scripts. I have not used it in production because our need for up-to-date data and "always on" reports means we can't cope with the downtime while the log files are applied. It is more flexible than replication and may handle system updates more easily; replication gets broken down and rebuilt roughly twice a year here as updates are brought online.
SSIS: I have never used it to move data like this, so that would be a real voyage of discovery for me. If you are confident with SSIS, I am sure there are people who have gone that way before who can advise on what is required.
Answer by Leo ·
If it were up to me, I would use log shipping. It has been the best approach so far in our environment (telecom). I log-ship every 2 hours, but you can set the interval to suit your case; for example, you could schedule it at 10 PM every day to avoid the busy traffic periods. You need to make sure your database is set to the FULL recovery model.
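A manual log-shipping cycle is only a few statements, which helps show what the scheduled jobs are doing under the covers. Database names and paths below are illustrative assumptions; `WITH STANDBY` keeps the secondary readable between restores, which matters for a reporting copy:

```sql
-- One log-shipping cycle, sketched by hand; names and paths are assumptions.

-- Prerequisite (once, on the primary): log backups require FULL recovery.
ALTER DATABASE Production SET RECOVERY FULL;

-- On the primary, on whatever schedule you choose (every 2 hours, nightly, ...):
BACKUP LOG Production
TO DISK = N'\\fileshare\logship\Production.trn'
WITH INIT;

-- On the secondary: apply the log, leaving the database in read-only
-- standby mode so reports can run between restores.
RESTORE LOG Production_Reporting
FROM DISK = N'\\fileshare\logship\Production.trn'
WITH STANDBY = N'E:\Logs\Production_Reporting_undo.dat';
```

Note the trade-off Fatherjack mentions: each `RESTORE LOG` kicks connected users off the secondary while it is applied.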
Answer by Blackhawk-17 ·
Another avenue that you could head down would be mirroring.
It's like log shipping without the restore outages per se.
It's like replication with fewer restrictions.
You can use it with a Witness server for HA.
And... you can take a snapshot of the mirror for reporting purposes: Database Mirroring and Database Snapshots
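The snapshot-of-the-mirror trick is a single statement. The mirror database itself sits in a restoring state and cannot be queried, but a database snapshot taken against it can be. The snapshot name, logical file name, and path here are illustrative assumptions:

```sql
-- Create a point-in-time, read-only snapshot of the mirror for reporting.
-- 'Production' must be the logical data-file name of the mirrored database
-- (check with sys.database_files); the snapshot file path is an assumption.
CREATE DATABASE Production_Snapshot
ON ( NAME = Production,
     FILENAME = N'D:\Snapshots\Production_Snapshot.ss' )
AS SNAPSHOT OF Production;
```

Reports then query `Production_Snapshot` directly. To refresh the reporting view, drop the snapshot and create a new one; each snapshot is frozen at the moment it was created. Database snapshots require the Enterprise edition of SQL Server.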