x

Merge row values into a CSV (a.k.a GROUP_CONCAT for SQL Server) without Violating DRY Principle

I have a table like:

EntityID   AttributeID  OptionText
5016       20          Paintings
5044       18          Female
5060       48          M
5060       48           F
5060       49          Apple
5060       49          Banana
5060       49           Cat

I want to create a view that will show:

5016    20   Paintings
5044    18   Female
5060    48   M,F
5060    49   Apple, Banana, Cat

Means The attributes values on every entity should be displayed separated by a comma.

The number of options can be varied.

But the most important issue is the query should not violate DRY Priciple The query should be very fast because the table are going to have millions of records.

Any help is appreciated!

This is the actual query that i created using the answers it takes about 20 seconds to get few million records but if i just wan to get few records like 20-30 it takes 2-3 secinds. I want to reduce the time by less than a half second

I need to inject where conditions into it. Please specify what i am missing

select joined.* , eset.PageNo from (select * from
(


 select a.EntityTypeID ,  a.EntityID , a.AttributeValue ,  ea.AttributeName from 
(SELECT
M.EntityTypeID , M.EntityID,  M.AttributeID, 
SUBSTRING(CAST(foo.bar AS varchar(8000)), 2, 7999) AS AttributeValue

 FROM
 (
 SELECT DISTINCT EntityID, AttributeID , EntityTYPEID
 FROM [testview1]

  where ENtityID between 5060 and 56598
  and EntitytypeID = 9

  ) M
CROSS APPLY
(
SELECT
    ',' + OptionText
FROM
    [testview1] M2
WHERE
    M.EntityID = M2.EntityID AND M.AttributeID= M2.AttributeID and M.EntityTypeID = M2.ENtityTYpeID

    and  ENtityID between 5060 and 56598
     and EntitytypeID = 9

FOR XML PATH ('')
) foo(bar)) a
inner join EntityAttributes ea on ea.ID = a.AttributeID 



) options


 union all
 (select 
     EntityAttributes.EntityTypeID,
     EntityAttributeValues.EntityID as 'EntityID' , EntityAttributeValues.AttributeValue                 ,EntityAttributes.attributeName from EntityAttributeValues
     inner join EntityAttributes on EntityAttributes.ID = EntityAttributeValues.AttributeID


    where EntityAttributeValues.EntityID between 5060 and 56598
     and EntityAttributes.EntityTypeID = 9**93

       )
       ) joined

          inner join Entity_Paged_View_With_Set eset on eset.ID = joined.EntityID 
more ▼

asked Oct 19 '11 at 01:19 AM in Default

Kamal Deep Singh Saini gravatar image

Kamal Deep Singh Saini
25 3 3 4

How often will the code to create have to run, is it a one off?
Oct 19 '11 at 02:21 AM Mrs_Fatherjack
(comments are locked)
10|1200 characters needed characters left

1 answer: sort voted first

Try this, it is ONE way of doing it

declare @YourTable table (EntityID int, AttributeID int, OptionText varchar(50))
insert into @YourTable values(5016,20,'Paintings')
insert into @YourTable values(5044,18,'Female')
insert into @YourTable values(5060,48,'M')
insert into @YourTable values(5060,48,'F')
insert into @YourTable values(5060,49,'Apple')
insert into @YourTable values(5060,49,'Banana')
insert into @YourTable values(5060,49,'Cat')

select  distinct
    EntityID, AttributeID,
    stuff ( ( SELECT
                  ',' + OptionText 
              FROM
                  @YourTable TABLE1
              WHERE TABLE1.EntityID = TABLE2.EntityID 
              AND TABLE1.AttributeID = TABLE2.AttributeID
    FOR XML PATH ( '' ) ) , 1 , 1 , '' )
FROM
    @YourTable TABLE2

gives

EntityID    AttributeID 
----------- ----------- -------------------------
5016        20          Paintings
5044        18          Female
5060        48          M,F
5060        49          Apple,Banana,Cat
more ▼

answered Oct 19 '11 at 02:48 AM

Kev Riley gravatar image

Kev Riley ♦♦
50.7k 43 49 76

I post the same question on stackoverflow too. and get the same reply. This query is too slow and does not solve my purpose. Isnt there is some other way. I want fully optimized query
Oct 19 '11 at 02:54 AM Kamal Deep Singh Saini

How slow is slow, Jeff Moden talks about using this method in this post http://www.sqlservercentral.com/articles/Test+Data/61572/, and measures a 17 sec duration for a million row table.

If you need to keep doing this over and over in a view, then you need to rethink your design.

How about materializing this 'view' in a table?

How about changing the base table design to store the data as a CSV?

How about handling the string manipulation in the application layer/presentation layer?
Oct 19 '11 at 03:02 AM Kev Riley ♦♦
This is the best way I know of. If it's slow, look at the query plan and see how you can optimise it.
Oct 19 '11 at 03:32 AM David Wimbush

@Kamal Deep Singh Saini You can reduce the number of reads from the table a little bit by first grouping the records by 2 columns and then calling cross apply rather than calling it once for each row and then using distinct which does the grouping by 3 columns. This will not help a lot, but will be a bit cheaper to do because the grouping is easier and the number of calls to the inner query is smaller:

select
    t.EntityID, t.AttributeID, 
    stuff(x.Options, 1 , 1 , '') Options
    from
    (
        select
            EntityID, AttributeID
            from @YourTable
            group by EntityID, AttributeID
    ) t cross apply
    (
        select ',' + OptionText
            from @YourTable
            where 
                EntityID = t.EntityID 
                and AttributeID = t.AttributeID
            for xml path('')
    ) x(Options);
If this does not help much then you can try using the dynamic pivot first and then the normal concat to derive your delimited lists (one for each combination). This will make the query much more complex, but considerably cheaper and probably faster. Please let me know if you want it and I can post a sample.
Oct 19 '11 at 08:03 AM Oleg
@Oleg you can please post a sample related to dynamic pivot. What finally i need to do is to get the data in the required format as specified by the question and then pivot it. Actually i am trying to create a Model that resembles EAV. I will post the actual query that i created using the sample given above
Oct 21 '11 at 04:30 AM Kamal Deep Singh Saini
(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

New code box

There's a new way to format code on the site - the red speech bubble logo will automatically format T-SQL for you. The original code box is still there for XML, etc. More details here.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.

Topics:

x1933
x1816
x977

asked: Oct 19 '11 at 01:19 AM

Seen: 1791 times

Last Updated: Oct 21 '11 at 04:37 AM