muthuveerappan2007 asked

Best solution to provide read copy for MIS environment in SQL Server 2008R2

Hi, we need to provide a read copy of our database for MIS (reporting) purposes. The domain is Telecom BSS and the database size is around 250-500 GB. The database has 800+ tables, but only 50+ of them are required for reporting; of those, 10+ tables have more than 20 million records each. The version is SQL Server 2008 R2 Standard Edition.

The approaches we are considering:

1. Log shipping
2. Mirroring
3. Replication
4. Incremental load (by LastModifiedDate)

The 4th approach requires us to modify the schema and procedures, which has a huge impact on development. The plan for it is to export all columns of these tables every 15 minutes and provide them to the other team based on the last modified date; for this to work, every operation on these tables must update the LastModifiedDate column.

For the 3rd approach: does transactional replication incur any I/O issues?

We would like to know the best solution for providing a read copy to other teams, and what the common practice is for this process.

Note: the dev team suggests implementing transactional replication; the DBA team suggests going with the 4th approach. Thanks!
Tags: replication · sql server 2008 r2 · etl
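For reference, option 4 (incremental load by LastModifiedDate) would look roughly like the sketch below. The table, column, and constraint names are hypothetical; this only illustrates the kind of schema change the question describes, not a tested implementation.

```sql
-- Hypothetical table and column names, for illustration only.
ALTER TABLE dbo.AccountProfile
    ADD LastModifiedDate DATETIME NOT NULL
        CONSTRAINT DF_AccountProfile_LMD DEFAULT (GETDATE());
GO

-- Every UPDATE must also touch the column; one option is a trigger
-- (recursive trigger firing is off by default in SQL Server).
CREATE TRIGGER trg_AccountProfile_LMD
ON dbo.AccountProfile
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    UPDATE ap
    SET LastModifiedDate = GETDATE()
    FROM dbo.AccountProfile AS ap
    JOIN inserted AS i ON i.AccountId = ap.AccountId;
END;
GO

-- The 15-minute ETL job then pulls only the delta:
DECLARE @LastExtractTime DATETIME; -- value persisted by the previous run
SELECT *
FROM dbo.AccountProfile
WHERE LastModifiedDate > @LastExtractTime;
```

This is what makes the approach invasive: every table in scope needs the new column, and every write path must keep it accurate.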
10 comments

anthony.green commented:
The only real options are Change Tracking or option 4. Log shipping, mirroring, and replication can certainly be done, but they give you a copy of the full database. How are you going to identify the changed (delta) records without option 4?
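Change Tracking, mentioned above, is available in Standard Edition (unlike CDC, which requires Enterprise) and is enabled per database and per table. A minimal sketch with hypothetical database and table names:

```sql
-- Hypothetical names; Change Tracking works on SQL Server 2008+
-- in all editions, including Standard.
ALTER DATABASE TelecomBSS
SET CHANGE_TRACKING = ON
(CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);

-- The table must have a primary key (the question says most do).
ALTER TABLE dbo.AccountProfile
ENABLE CHANGE_TRACKING
WITH (TRACK_COLUMNS_UPDATED = OFF);
```

Unlike option 4, this requires no new columns, triggers, or procedure changes on the source tables.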
muthuveerappan2007 commented:
Replication will not be a full copy of the database, since it can carry just a subset of the database. And most of the tables already have primary keys.
anthony.green commented:
OK, so you limit replication to the 50 tables you need; again, how are you going to know the delta? Say I modify a record twice: how are you going to know about the second change? Also, from the documentation, you will need the full data set for those tables:

> Initial Dataset
> Before a new transactional replication Subscriber can receive incremental changes from a Publisher, the Subscriber must contain tables with the same schema and data as the tables at the Publisher. The initial dataset is typically a snapshot that is created by the Snapshot Agent and distributed and applied by the Distribution Agent. The initial dataset can also be supplied through a backup or other means, such as SQL Server Integration Services.
> When snapshots are distributed and applied to Subscribers, only those Subscribers waiting for initial snapshots are affected. Other Subscribers to that publication (those that have already been initialized) are unaffected.
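For completeness: limiting a transactional publication to the required tables means adding each table as a separate article. A rough sketch with hypothetical names (this assumes the distributor is already configured and the database is enabled for publishing):

```sql
-- Hypothetical publication/table names; run at the publisher.
-- Prerequisite: sp_replicationdboption has enabled the database
-- for transactional publishing.
EXEC sp_addpublication
    @publication = N'MIS_Reporting',
    @repl_freq  = N'continuous',
    @status     = N'active';

-- One sp_addarticle call per table; repeat for each of the 50+ tables.
EXEC sp_addarticle
    @publication   = N'MIS_Reporting',
    @article       = N'AccountProfile',
    @source_owner  = N'dbo',
    @source_object = N'AccountProfile';
```

This addresses the "subset" point above, but, as the comments note, it does not by itself identify delta rows for a downstream ETL.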
muthuveerappan2007 commented:
Yes, it is not for audit purposes; this data is required only for reporting. With the 4th approach we would also update the same record in the MIS (read) environment (using an ETL process), and the initial data set would be prepared at that end. We are not going to maintain multiple records for the same information (as I said, it's not for audit purposes).
anthony.green commented:
Surely the idea of this is to get the delta, the changed records only? You can't do that with replication; you won't know what's changed, so you will need to pull the full table every time via ETL, and there is no benefit to replication over what you're doing now.

1 Answer

anthony.green answered
OK, not all the comments are showing up, so posting this way instead.

From the OP:

> Yes, we are going to start the process with all the initial required data, so replication won't fail. And it is going to apply all the changes at a predefined interval (using transactional replication, which uses the log data in the background). We are mostly preferring a pull subscription.
> Change Tracking and CDC are mainly used for audit purposes.

OK, so you start with the table of 20+ million rows and let replication update 1 row in those 20+ million: how are you going to find that one row that has been updated? Change Tracking, yes, is primarily an auditing feature, but what is it you're doing here? You're auditing what has changed because you want the delta. Again, to get only delta records, you need to implement a delta routine. Change Tracking or option 4 are your only options with Standard Edition.
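To illustrate the delta routine being described: once Change Tracking is enabled on a table, each extract stores the version it last read and asks only for rows changed since then. A sketch, again with hypothetical table and key names:

```sql
-- @last_sync is the Change Tracking version persisted by the
-- previous extract run (0 means "everything since CT was enabled").
DECLARE @last_sync BIGINT = 0;

SELECT ct.AccountId,
       ct.SYS_CHANGE_OPERATION,   -- I(nsert), U(pdate) or D(elete)
       ap.*                       -- NULL for deleted rows
FROM CHANGETABLE(CHANGES dbo.AccountProfile, @last_sync) AS ct
LEFT JOIN dbo.AccountProfile AS ap
       ON ap.AccountId = ct.AccountId;

-- Persist this value as @last_sync for the next run:
SELECT CHANGE_TRACKING_CURRENT_VERSION();
```

The query returns only the changed rows, so a 15-minute ETL cycle never has to scan the full 20+ million-row table.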
4 comments

muthuveerappan2007 commented:
We require the whole table itself; we don't need to find which data changed recently. For example, a bank account profile table is replicated at the MIS end; after the predefined interval, the replicated table has all the changes applied. We just require the account profile table for MIS analysis. In their reports, they will note that changes from the last hour may not yet be reflected.
anthony.green commented:
OK, so is the subscriber going to serve the data to the reports directly, or are you still going to run an ETL process on the subscriber to move the data into the MIS environment? If all you're doing is moving the data again, why go through the pain of setting up replication when you're going to pull 20+ million rows from either the publisher or the subscriber anyway? When you mentioned option 4, it sounded like you want the delta records identified so you can limit the load to just the changed rows; replication won't give you that, as you will still need to pull the full table into the MIS environment, so there is no point doing replication.
muthuveerappan2007 commented:
As said, the MIS requires complete production data (for those 50+ tables) for its day-to-day analysis. Our plan is to go with either replication or the ETL process, not both. We are trying to understand the pros and cons of these approaches on the production environment before implementation.
