Hi everyone,

I have SSIS packages that run nightly to move data from an AS400 to SQL Server 2005. For the last couple of months the packages have been taking much longer to complete when run as a scheduled job; if I run the same packages manually, they complete in the normal time. In all my SSIS packages I truncate the data in SQL Server and then bulk insert into those tables. I also have a lot of nonclustered indexes on those tables. Should I drop all the indexes on those tables before the bulk insert runs, and recreate the nonclustered indexes once the package succeeds, so that the performance problem is solved? I am not sure this is the cause. Please guide me.


asked Oct 03, 2010 at 05:04 AM in Default


4 answers

I have to say it depends. If you are only inserting 1,000 rows into a table with 30 million rows, then dropping the indexes, inserting, and recreating the indexes may take several times longer than just doing the insert. But if you are inserting 2 million rows into a table with 30 million rows, it might be a very different story. Always remember that each index means another insert: if you have 10 nonclustered indexes, every one of them has to be updated for each row of physical data that is inserted. That creates a lot of overhead.

What I am unsure of is that you say the package takes longer when it runs as a scheduled job than when you run it manually. The package should take roughly the same amount of time regardless of whether it is called by a scheduled job or executed manually. I would check to see what else is running when the job is scheduled. Are you rebuilding indexes, running backups, etc.? Something could be competing for resources. As @ThomasRushton stated, make a change and measure the difference.
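One way to see what else is competing for resources is to query the dynamic management views while the job is running. A rough sketch (the session filter and column choice are just an example, not a complete monitoring script):

```sql
-- Run this during the scheduled window to spot other active sessions.
SELECT r.session_id,
       r.status,
       r.command,
       r.wait_type,
       DB_NAME(r.database_id) AS database_name,
       t.text AS running_sql
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.session_id <> @@SPID;  -- exclude this monitoring query itself
```

Anything with a heavy wait type (backup, index rebuild, long scans) showing up only during the scheduled window would point to resource contention.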


answered Oct 03, 2010 at 06:43 AM


1. Usually I am moving 1.5 years of data from the AS400 to SQL Server daily, and it comes to more than 30 million rows for some master tables.

2. A backup used to run at that time, but it was stopped last month and I still have the same issue. I am not rebuilding any indexes; I am not sure, which is why I asked.

Oct 03, 2010 at 08:02 AM mushtaq

Thanks for the update. From your original post I gather you are just refreshing the data on the SQL side: each time this process runs, you truncate the data and then replace it with a fresh set. If it were me doing this, I would:
1) Truncate the data.
2) Drop my indexes.
3) Insert my data.
4) Create my indexes.
5) Then I would update stats.
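The five steps above could be sketched in T-SQL like this. All table, column, and index names are placeholders, and in this setup step 3 is actually performed by the SSIS package rather than by T-SQL:

```sql
-- 1) Truncate the target table
TRUNCATE TABLE dbo.MasterTable;

-- 2) Drop the nonclustered indexes (one DROP per index)
DROP INDEX IX_MasterTable_Col1 ON dbo.MasterTable;

-- 3) Bulk insert runs here (the SSIS data flow in this case)

-- 4) Recreate the nonclustered indexes
CREATE NONCLUSTERED INDEX IX_MasterTable_Col1
    ON dbo.MasterTable (Col1);

-- 5) Refresh statistics on the reloaded table
UPDATE STATISTICS dbo.MasterTable;
```

With many indexes, steps 2 and 4 would typically be driven from a list of index definitions rather than written out by hand.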

Oct 03, 2010 at 08:09 AM Tim

Thanks for your valuable comments. Can you tell me what you mean by "update statistics"? How do I perform it?

Oct 03, 2010 at 08:20 AM mushtaq

Sure thing. There is an article on MSDN that explains how updating statistics helps query performance. Updating statistics really should be part of the regular maintenance performed weekly on your databases.

The procedure is sp_updatestats
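For example (the table name in the second statement is a placeholder):

```sql
-- Update statistics for all user tables in the current database
EXEC sp_updatestats;

-- Or target a single table, sampling every row
UPDATE STATISTICS dbo.MasterTable WITH FULLSCAN;
```

`sp_updatestats` only refreshes statistics it considers out of date, so it is usually cheap enough to run after a large load.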

Oct 03, 2010 at 08:34 AM Tim

Mushtaq, you say that you load all 1.5 years' worth of data every day. I would suggest you look into loading only the data that changes and keeping the rest. This can greatly reduce the load time, since only the delta is actually inserted; the rest remains the same.

As you are using SSIS, you will be able to use the MERGE functionality that it offers.


answered Oct 04, 2010 at 12:12 AM


Mr. William, the problem I have with the existing data migration is that I don't have a modified-date column in the source data, so I cannot take only the modified data. For this reason I am transferring 1.5 years of data daily, because I don't know which of the records changed. Will you guide me through the MERGE functionality with a sample?

Oct 04, 2010 at 04:45 AM mushtaq

Mushtaq - there must be a primary key on your target table; I guess this is derived from your source data. You can use the primary key to compare the imported data against your target table in SSIS. There is a data transformation called Merge (http://msdn.microsoft.com/en-us/library/ms141703.aspx) that could be used to do the comparison and insert.

Oct 04, 2010 at 05:06 AM WilliamD

William - I do have a primary key, but how do I know which rows' data changed? I think there should be some kind of flag on the source table through which I could know which records have been modified, but I don't have that flag either.

Oct 04, 2010 at 05:50 AM mushtaq

Mushtaq - the data that is already old cannot be changed, right? Then the Merge in SSIS would take all 1.5 years of data, join it to the target table using the primary key, and insert only the data that is not yet in the target table. If you do have changing data, you can join on the PK, compare the columns from source and target, and also perform an in-place update. I don't see too much of a problem with that method.

I expect it would be faster than truncate and insert - but you'd have to test that first.
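Since the target here is SQL Server 2005, which does not have the T-SQL MERGE statement (that arrived in 2008), the same join-on-PK idea can be written as a set-based upsert from a staging table. A sketch, with all table and column names as placeholders:

```sql
-- Update rows that exist in both staging and target but differ
UPDATE t
SET    t.Col1 = s.Col1,
       t.Col2 = s.Col2
FROM   dbo.MasterTable AS t
JOIN   #Staging AS s
       ON s.PKCol = t.PKCol
WHERE  t.Col1 <> s.Col1
   OR  t.Col2 <> s.Col2;

-- Insert rows that are in staging but not yet in the target
INSERT INTO dbo.MasterTable (PKCol, Col1, Col2)
SELECT s.PKCol, s.Col1, s.Col2
FROM   #Staging AS s
WHERE  NOT EXISTS (SELECT 1
                   FROM dbo.MasterTable AS t
                   WHERE t.PKCol = s.PKCol);
```

The SSIS Merge Join transformation plus a Conditional Split achieves the same split between new and changed rows inside the data flow.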

Oct 04, 2010 at 05:55 AM WilliamD

OK William, I will give Merge a try. Thanks for your valuable comments.

Oct 04, 2010 at 06:02 AM mushtaq

I would suggest dropping the indexes before moving data in, and then rebuilding the indexes at the end of the job.

Also, look at the batch size - see if you can split the insert up into batches of, say, 10000 rows at a time.

Take measurements of how long it takes to run, implement a change, and remeasure. Let us know how you get on.
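If any part of the load were done with T-SQL rather than an SSIS destination, the same idea is exposed by BULK INSERT's BATCHSIZE option, which commits every N rows instead of treating the whole load as one transaction. The file path and table name here are placeholders:

```sql
-- Commit every 10,000 rows; TABLOCK requests a table-level lock,
-- which helps qualify the load for minimal logging.
BULK INSERT dbo.MasterTable
FROM 'C:\exports\as400_extract.dat'
WITH (BATCHSIZE = 10000, TABLOCK);
```

In an SSIS OLE DB Destination, the equivalent knob is the commit/batch size on the fast-load options.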


answered Oct 03, 2010 at 05:39 AM

ThomasRushton ♦♦

How do I split the data into batches? I am selecting all records from the AS400 for a year and a half of data, then inserting those records into SQL Server. Will you provide me an example?

Oct 03, 2010 at 08:23 AM mushtaq

On the SSIS task doing the insert, the "Data Destination" item's properties - there's an option to change the batch size.

I think. I'm working from memory here at the moment...

Oct 04, 2010 at 06:05 AM ThomasRushton ♦♦

Have you tried running the job manually, instead of scheduled, at the scheduled time? You may find the issue is on the AS400 or the network.

Monitor your network throughput and your disk throughput and latency during a manual run and during the automatic one. Perhaps there is a new process consuming the network - VM backups, perhaps? Sometimes changing your job's start time alone will be enough; it may no longer be in contention for resources.

Of course, you have also been given very good advice on how to create more efficient packages, and you should look into that as well.


answered Oct 05, 2010 at 05:59 AM


Seen: 2346 times

Last Updated: Oct 03, 2010 at 05:04 AM
