querying SSISDB - text search in the packet definitions
Hello, in 2005 I had a procedure for metadata-search. It queried sysobjects, syscolumns, syscomments and finally the column packagedata from MSDB..sysdtspackages90. With this procedure it was easy to tell, in which views, procedures, functions, triggers and even SSIS-packages we use for example a column named MaterialID. After migration to 2012 I deploy most SSIS-Packages in project-deployment-mode into the SSISDB. Some other packages into MSDB (like in 2005). MSDB can now be queried, using MSDB..sysssispackages and the column packagedata. But for SSISDB I have no idea. Is there a table where these XMLs are stored? TIA Dennis
Ran the following in SSIDB select * from information_schema.routines order by ROUTINE_NAME Routines that go anywhere the package such as validate_package_internal are external routines so I cannot see the code. I ran profiler with everything to the max and nothing is recorded anywhere. Next plan is to run some of the sysinternal traces to see if there are blobs hidden away on the server.
I have loads of articles including the ones here
http://www.ssistalk.com/ and here
http://sqlblog.com/blogs/jamie_thomson/archive/2011/07/16/ssis-logging-in-denali.aspx. I believe the package is physical contained within the MDF file in an encrypted form. One of the articles on SSISTalk shows how to export the package to an .ISPAC file which itself is just a zip archive containing the DTSX files, config and params. The DTSX file is just a lump of XML. So in theory you could automate the whole process and write an XPath query to look at the DTSX contents.
After the hints from tanglesoft I tried to find out, how sql server stores the projects in SSISDB. A trace showed me, that the deployment wizard calls a stored procedure *catalog.deploy_project* and streams the project data to the procedure. It has been found, that the data is the *ispac*-file (the zipped project files). The procedure calls some further procedures and functions which encode the data and store them as varbinary in the database. So I realized that I must create a kind of bypass to get the still unencoded data. As preparation I created one table for the project data, and one for the dtsx-files. I modified the procedure *catalog.deploy_project*. Before it called further procedures, I inserted a insert-statement, which stores the unencoded project data in the varbinary(max)-field of my project-table. Henceforth I have a copy of the deployment-data in my table. But the problem remains. I cannot search the data. To solve this I created a procedure (sp_create_metadata), which calls *bcp* to write the varbinary-data as a file to the hard disk. In a second step I use the *xp_cmdshell* procedure to call the command-line-version of *7Zip* to unzip the *ispac* (saved as zip-file). In the next step the system-procedure *xp_dirtree* lists the now uncompressed files for further processing. In a last step I use *OPENROWSET* to read the dtsx-files and store the xml-shaped data in my second table. Now I finally have the possibility, to query the data in my SSISDB. So far the theory. When I call my procedure sp_create_metadata from Management Studio, everything works fine. But when I let the modified procedure catalog.deploy_project call my procedure, it hangs somewhere while calling (I think) bcp . I hope I am close to breakthrough. If someone has an idea and needs further details, I would certainly post the modification, the table definitions and my procedure.
Hi, bensycamore: Sorry for the late answer. Here you are. USE [SSISDB] GO -- The table for the compressed filestream CREATE TABLE [dbo].[T_Package_Streams]( [folder_name] [nvarchar](128) NOT NULL, [project_name] [nvarchar](128) NOT NULL, [deploy_id] [bigint] NOT NULL, [version_id] [bigint] NULL, [project_id] [bigint] NULL, [project_stream] [varbinary](max) NULL, CONSTRAINT [PK_T_Package_Streams] PRIMARY KEY CLUSTERED ( [folder_name] ASC, [project_name] ASC, [deploy_id] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY] -- the table with the packet definitions in plain xml text CREATE TABLE [dbo].[T_Package_Content]( [folder_name] [nvarchar](128) NOT NULL, [project_name] [nvarchar](128) NOT NULL, [package_name] [nvarchar](128) NOT NULL, [filecontent] [nvarchar](max) NULL, CONSTRAINT [PK_T_Package_Content] PRIMARY KEY CLUSTERED ( [folder_name] ASC, [project_name] ASC, [package_name] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY] -- The system procedure to alter: -- Notice that I uncommented the call of my extraction-procedure because of the issues I described in my last post. alter PROCEDURE [catalog].[deploy_project] @folder_name nvarchar(128), @project_name nvarchar(128), @project_stream varbinary(MAX), @operation_id bigint = NULL output AS SET NOCOUNT ON DECLARE @deploy_id bigint DECLARE @version_id bigint DECLARE @project_id bigint DECLARE @retval int DECLARE @time datetimeoffset DECLARE @status int IF (@folder_name IS NULL OR @project_name IS NULL OR @project_stream IS NULL) BEGIN RAISERROR(27138, 16 , 6) WITH NOWAIT RETURN 1 END IF [internal].[is_valid_name](@project_name) = 0 BEGIN RAISERROR(27145, 16, 1, @project_name) WITH NOWAIT RETURN 1 END EXEC @retval = [internal].[create_deploy_operation] @folder_name, @project_name, @deploy_id output IF @retval <> 0 BEGIN RETURN 1 END SET @operation_id = @deploy_id SET TRANSACTION ISOLATION LEVEL SERIALIZABLE DECLARE @tran_count INT = @@TRANCOUNT; DECLARE @savepoint_name NCHAR(32); IF @tran_count > 0 BEGIN SET @savepoint_name = REPLACE(CONVERT(NCHAR(36), NEWID()), N'-', N''); SAVE TRANSACTION @savepoint_name; END ELSE BEGIN TRANSACTION; BEGIN TRY EXEC @retval = [internal].[prepare_deploy] @folder_name, @project_name, @project_stream, @deploy_id, @version_id output, @project_id output IF @retval <> 0 BEGIN RAISERROR(27118, 16, 1) WITH NOWAIT END --######### modification begin -- insert project data unencrypted into my project table insert into [dbo].[T_Package_Streams] ( [folder_name], [project_name], [deploy_id], [version_id], [project_id], [project_stream] ) Values ( @folder_name, @project_name, @deploy_id, @version_id, @project_id, @project_stream ) -- exec sp_create_metadata @folder_name, @project_name ,@deploy_id --########### modification end EXEC @retval = [internal].[deploy_project_internal] @deploy_id, @version_id, @project_id, @project_name IF @retval <> 0 BEGIN RAISERROR(27118,16,1) WITH NOWAIT END IF @tran_count = 0 COMMIT TRANSACTION; END TRY BEGIN CATCH IF @tran_count = 0 ROLLBACK TRANSACTION; ELSE IF XACT_STATE() <> -1 ROLLBACK TRANSACTION @savepoint_name; SET @time = SYSDATETIMEOFFSET() UPDATE [internal].[operations] SET [end_time] = SYSDATETIMEOFFSET(), [status] = 4 WHERE operation_id = @operation_id; THROW END CATCH DECLARE @process_id bigint SELECT @process_id = [process_id] FROM [catalog].[operations] WHERE operation_id = @deploy_id SET @status = NULL WHILE @status IS NULL BEGIN WAITFOR DELAY '00:00:01' SELECT @status = [status] FROM [catalog].[operations] WHERE operation_id = @deploy_id AND [status] <> 2 IF @status IS NULL BEGIN IF NOT EXISTS (SELECT [process_id] FROM internal.get_isserver_processes() WHERE [process_id]= @process_id) BEGIN DECLARE @end_time datetimeoffset(7) SET @status = 4 SET @end_time = SYSDATETIMEOFFSET() EXEC @retval = [internal].[update_project_deployment_status] @deploy_id, @version_id, @end_time, 4, '' END END END IF @status = 7 BEGIN RETURN 0 END ELSE BEGIN RAISERROR (27203, 16,1, @deploy_id) WITH NOWAIT RETURN 1 END -- My Extraction procedure Create procedure [dbo].[sp_Create_Metadata] ( @folder_name nvarchar(128), @project_name nvarchar(128), @deploy_id bigint ) as begin DECLARE @dir VARCHAR(100) DECLARE @formatFile VARCHAR(100) declare @pathtofile varchar(200) -- Directory on the filesystem of the sql server SELECT @dir = 'D:\ssis\metadatabuffer' SELECT @formatFile = @dir+'\image_export.fmt' -- I thank bpeikes from SQL Server Developer Center > SQL Server Forums > Transact-SQL -- for his answer to the question "Dumping varbinary(max) column values to files on harddisk using a SQL script" (Tuesday, July 24, 2012 3:13 PM) -- This here is slightly modified. DECLARE @command NVARCHAR(4000) -- First write a format file which allows us to dump the image column without column SELECT @command = 'echo 9.0 >> '+ @formatFile EXEC xp_cmdshell @command SELECT @command = 'echo 1 >> '+ @formatFile EXEC xp_cmdshell @command SELECT @command = 'echo 1 SQLBINARY 0 0 "" 1 blob_data "" >> '+ @formatFile EXEC xp_cmdshell @command SELECT @command = 'bcp '+ '"SELECT project_stream '+ 'FROM SSISDB.[dbo].[T_Package_Streams] '+ 'WHERE deploy_id = ' + cast(@deploy_id as varchar(5)) + ' and project_name = ''' + @project_name + ''' and folder_name = ''' + @folder_name+ '''" '+ 'queryout "'+@dir+'\' + @project_name + '.zip" -f "'+@formatFile+'" -T' EXEC xp_cmdshell @command SELECT @command = 'del '+@formatFile EXEC xp_cmdshell @command -- -- In the next step I use 7-Zip Command line version 4.57 to extract the .ispac (.zip) files -- Modify the program path according to your needs. -- 7-Zip Copyright (C) 1999-2007 Igor Pavlov. SELECT @command = 'd:\programme\7za457\7za e ' + @dir + '\' + @project_name + '.zip -o' + @dir + '\' + @project_name + ' -aoa' EXEC xp_cmdshell @command -- -- read the file names into a temp table as preparation for the following parsing process declare @project_dir varchar(4000) set @project_dir = @dir + '\' + @project_name create table #files (lfdnr int identity(1,1) primary key, subdir VARCHAR(1000), depth INT, isfile int); insert into #files (subdir, depth, isfile) exec xp_dirtree @project_dir, 1, 1 delete from #files where isfile = 0 or right(subdir,5) <> '.dtsx' -- Reading / Parsing Process declare @i int, @Cnt int DECLARE @xml XML DECLARE @character VARCHAR(MAX) declare @sql nvarchar(4000) declare @file nvarchar(128) select @Cnt = count(*) from #files set @i = 1 While @i <= @Cnt begin set @file = (select top 1 subdir from #files) set @sql = N' SELECT @character = x.y FROM OPENROWSET( BULK ''' + @project_dir + '\' + @file + ''', SINGLE_CLOB ) x(y)' exec sp_executesql @sql, N'@character varchar(max) OUTPUT', @character OUTPUT -- Fix up the ampersand SELECT @xml = REPLACE( @character, '&', '&' ) delete from [dbo].[T_Package_Content] where folder_name = @folder_name and project_name = @project_name and package_name = @file insert into [dbo].[T_Package_Content] (folder_name, project_name, package_name, filecontent) values (@folder_name, @project_name, @file, cast(@xml as nvarchar(max))) delete from #files where subdir = @file set @i = @i + 1 end -- cleanup drop table #files print @project_dir SELECT @command = 'rmdir ' + @project_dir + ' /S/Q' EXEC xp_cmdshell @command SELECT @command = 'del ' + @dir + '\' + @project_name + '.zip' EXEC xp_cmdshell @command end -- E n d ###################################################################################