question

David 2 1 avatar image
David 2 1 asked

Creating A Function To Return JSON Data

Hi there, So I am trying to create a function that will now return the JSON data from an external table using the second part of the article here [ https://www.red-gate.com/simple-talk/sql/t-sql-programming/consuming-json-strings-in-sql-server/ ] by embedding the ToJSON function within my own function getJSON. I am looking for something similar to the below: CREATE FUNCTION dbo.getJSON( @id int) RETURNS nvarchar(max) AS BEGIN DECLARE @json nvarchar(max) SELECT @json = (DECLARE @MyHierarchy Hierarchy INSERT INTO @myHierarchy SELECT element_id,sequenceNo,parent_ID,Object_ID,NAME,StringValue,ValueType FROM FormsParseJSON WHERE ID = @id SELECT dbo.ToJSON(@MyHierarchy) RETURN @json END What's the best method to complete the above statement? TIA TIA
sql-server-2012tsqlfunctionsjson
3 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

@David 2 1 In your case you already have the Forms table which stores the JSON data in the column named JSONDATA, so you can select it from the table for the specified ID value. Please let me know whether you still need the function.
0 Likes 0 ·
@Oleg thanks, that is correct I have the Forms table however I believe I still need to still declare the hierarchy table so that the ToJSON function can return the correct rows by ID?
0 Likes 0 ·
@David 2 1 I think I failed to get my point across. What I meant to say is this: You have records in the Forms table which includes the JSONDATA column which stores (in each row) the instance of the JSON object as string (nvarchar(max)). Then you executed a process to parse the values in that column for each row, which resulted in the insert of the multiple rows into FormsParseJSON table for each row in the Forms table. Assuming that the data for existing rows in both tables does not change, you now have 2 ways to retrieve the data for the specified id: -- method 1, without the use of the function: select JSONDATA from Forms where ID = @id; -- method 2, using the function: select dbo.getJSON(@id); Both methods will return the same value, but the first one is cheaper because it simply reads the value from the specified row/column while the second needs to do a bit of work to get the same value back out of the multiple rows of FormsParseJSON table for the same id.
0 Likes 0 ·
Oleg avatar image
Oleg answered
In case if selecting the data from Forms table is not an option, and assuming that the user defined type named Hierarchy, and the function named ToJSON are already available, the function in question might look like this: create function dbo.getJSON (@id int) returns nvarchar(max) as begin declare @MyHierarchy Hierarchy; insert into @MyHierarchy select element_id, sequenceNo, parent_ID, [Object_ID], [NAME], StringValue, ValueType from FormsParseJSON where ID = @id; return dbo.ToJSON(@MyHierarchy); end; go Optionally, the function definition could look like this (most closely resembling the original definition in question): create function dbo.getJSON (@id int) returns nvarchar(max) as begin declare @MyHierarchy Hierarchy, @json nvarchar(max); insert into @MyHierarchy select element_id, sequenceNo, parent_ID, [Object_ID], [NAME], StringValue, ValueType from FormsParseJSON where ID = @id; select @json = dbo.ToJSON(@MyHierarchy); return @json; end; go Oleg
8 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

@Oleg Many thanks for your help. It appears that my idea to speed up the querying of the JSON data has not worked unfortunately. Thanks anyway.
0 Likes 0 ·
@David 2 1 I am not sure what did not work out, care to elaborate? Usually, there are ways to speed up querying data, up to some point of course. Do you mean that selecting the JSONDATA values from the Forms table is slow? There are numerous reasons why this would be the case, but the details of the problem are important.
0 Likes 0 ·
@Oleg thanks but it's probably not for the question. However in summary, I have a Forms table of only ~25000 rows. It's indexed and very fast for query when you do not include the computed column called JSON, which is defined as (CONVERT([nvarchar](max),[JSONDATA])), and/or the JSONDATA column. Include those columns then the query slows to a stop. I put FILESTREAM on this Forms table and moved the JSONDATA out of the table on to the filesystem in an attempt to speed it up, but still not much improvement. I wanted to see if I could unstructure the JSON data into a separate table with a CLUSTERED INDEX then rebuild the strings from there. On the small sample I compared it produced faster results querying the original JSON string. No idea what to try next but I cannot believe that it's that slow to query JSON strings. Thanks anyway.
0 Likes 0 ·
@David 2 1 The table is tiny so there should be no problems with select statement unless the values are huge and latency simply becomes an issue. What is the data type of JSONDATA? Why do you need a computed column JSON when JSONDATA should be already defined as nvarchar(max)? Filestream might not help in speeding up the select exec times. One way to check whether latency is to blame is to select in such a way that the data is read but is not transferred over network to SSMS query results: declare @id int, @jsonData nvarchar(max)); select @id = ID, @jsonData = JsonData from Forms; When the above is executed, the variables are assigned and reassigned as many times as there are rows in the table but there is no any actual data traffic to SSMS window. Please try this to figure out whether latency is to blame. What is the average size of the JSONDATA value? What is the **large value types out of row** setting for this table? select name, large_value_types_out_of_row from sys.tables where name = 'Forms' If 1 then just the 16 byte pointer to the LOB pages are stored in rows. Otherwise, it is a part of the column value plus the pointer to the rest.
0 Likes 0 ·
@Oleg thank you for the advice so far and apologies for the slow reply, I had left work for the day. The datatype of the JSONDATA column is varbinary(max). I had asked about removing the computed column however the devs say that it is required by the app, hence I am trying to find a workaround removing it from this table into it's own index-able, searchable table. The average row size is ~4Mb. The large value type out of row returns a 0 result. I have run the declare @id int, @jsonData nvarchar(max)); select @id = ID, @jsonData = JsonData from Forms; statement which took 13min 49sec to complete.
0 Likes 0 ·
Show more comments
David 2 1 avatar image
David 2 1 answered
@Oleg the data stored is an assortment of images and documents that the user uploads from the app, and they usually want to see all the most recently uploaded documents using something along the lines of: select * from Forms where CustId = 5 order by SubmitDate We have 1 major customer and they have ~20000 of those forms. I have indexing on the table, but as I say the JSON part of the query is slow. Thanks.
8 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

@David 2 1 Usually, such data is best when is stored in the file system while the relational database table stores the URL of the image/document. Images (such as jpeg) are very poor candidates for compression (because the jpeg format is already compressed very efficiently). The documents (Word, Excel, etc) cannot be compressed at all because they are already zip archives (try to rename existing Excel file with zip extension, then extract the zip to see contents and it will show all files and folders which make up the Excel file; the files are human readable too). The SQL Server compression will not work either as it affects only in-row pages while your data is in LOB and/or filestream. It looks like the devs are going to have to change the way they display data. It is a very simple change (UX remains the same too): return all but JSON to the UI and display an icon in the "document" column of the grid. Clicking on it can invoke the code to return JSON for **just one doc**, deserialize it and display it. This way, there will be no need to needlessly drag around 100 GB of data just because the app designers did not consider the consequences of their rather poor (IMHO) decisions.
0 Likes 0 ·
@Oleg thank you for your invaluable insight. As I understand it the devs specified that the forms be loaded into the app in case it loses it's connection to the server. That's what they told me when I started anyway. Another headache is that the management are beginning to make noises that they want to be able to report on the JSON data elements. I'm now hearing that some of them may have been reused for recording different data.
0 Likes 0 ·
@David 2 1 The possibility of losing connection to the server is no excuse to drag 100 GB of data across network for every open instance of the application, so the devs should strongly consider making changes to their app design. With half a page worth of C# code they can make it load in few seconds versus 15 minutes or whatever it takes now, and considerably reduce the network traffic. Reporting on the JSON data elements can be done too (with very fast speeds) provided that the design is changed slightly to store everything but BLOBS in JSON and store just the links to the BLOBS in JSON instead. The way it stands now, there is way too much data in the table, and the problem is compounded by a computed column which converts the binary data (mostly from images) to string (varchar(max)), but why? It ain't readable anyway, so there is no good reason to even have it. It just adds to the size. Just my 2 cents.
0 Likes 0 ·
@Oleg thanks. I am not sure why certain decisions were made regarding the app as, as I said, I wasn't here at the time. I'm also told that none of the original devs are left and that there is almost no documentation. I was trying to extract the JSON data using the parseJSON function to also look into how I could begin to write SSRS reports on it. Having not dealt with JSON before, and on top of everything else, it's quite a challenge. Basically I need to extract the BLOBs in each JSON string and update those elements with links to, say a new FILESTREAM table just for these BLOBs?
0 Likes 0 ·
@David 2 1 This would be one way of doing it. The table may, but does not have to be with the column decorated with filestream attribute, it could be the table with varbinary(max) column and possibly created with the option "large value types out of row" = 1, it all depends on your storage and database backup times/size requirements. The filestream has benefits of drastically reduced database and its backups file sizes. If you haven't read the [filestream white paper][1] by @Paul Randal, please read it when you have time. The paper is 9 years old, but it is still the best resource on the subject. The changes to DB objects will also require changes to the app code of course, but the effort would be beneficial as it will allow very fast retrieval of all rows to display in the app, and the reports will not be too hard. The table you created to store the JSON data as adjacency list (courtesy of @Phil Factor parseJSON function) will come very handy (after you reload it without the blobs). [1]: http://download.microsoft.com/download/D/2/0/D20E1C5F-72EA-4505-9F26-FEF9550EFD44/FILESTREAMStorage.docx
0 Likes 0 ·
Show more comments

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.