SQL Groupby and Rownumber by sum

Hi All,

I need your suggestion with the following scenario , I am currently trying with temp tables and rownumbers to get the desired output .

     CREATE TABLE [dbo].[Batch](
         [BatchId] [int] NOT NULL,
         [DocumentId] [int] NOT NULL
     ) ON [PRIMARY]
     CREATE TABLE [dbo].[Document](
         [DocumentId] [int] NULL,
         [NumberOfPages] [int] NULL
     ) ON [PRIMARY]

     INSERT [dbo].[Document] ([DocumentId], [NumberOfPages]) VALUES (1, 32)
     INSERT [dbo].[Document] ([DocumentId], [NumberOfPages]) VALUES (2, 25)
     INSERT [dbo].[Document] ([DocumentId], [NumberOfPages]) VALUES (3, 41)
     INSERT [dbo].[Document] ([DocumentId], [NumberOfPages]) VALUES (4, 82)
     INSERT [dbo].[Document] ([DocumentId], [NumberOfPages]) VALUES (5, 19)
     INSERT [dbo].[Document] ([DocumentId], [NumberOfPages]) VALUES (6, 81)
     I need to group the documents into a batches so that theie total page count is less than 100. 
     Batch 1 will have DocumentId 1 , 2, 3 which has a total page count of 32+25+41 = 98 
     Batch 2 will have DocumentId 4 which has a page count of 82 
     Batch 3 will have Document 5 and 6 which has a page count of 19+81 = 100 
     expected result is in 
     Select * from Batch 
     Note Document Id 4 and 5 can not be grouped because their page count exceeds 100 

  My Solution below : 

 truncate table [dbo].[Batch] 
 Declare @TempSum int 
 Declare @DocumentId int 
 Declare @NumberOfPages int 
 Declare @BatchId int 
 Set @TempSum  = 0 ;
 Set @BatchId = 1;
  select DocumentId , NumberOfPages from [dbo].[Document]
 OPEN db_cursor
   FETCH NEXT FROM db_cursor INTO @DocumentId , @NumberOfPages 
              if ( (@TempSum +  @NumberOfPages)  <=  100 ) 
              set @TempSum = @TempSum +  @NumberOfPages 
              INSERT [dbo].[Batch] ([BatchId], [DocumentId]) VALUES (@BatchId,@DocumentId ) 
              SET @BatchId = @BatchId + 1;
              Set @TempSum  = @NumberOfPages ;
              INSERT [dbo].[Batch] ([BatchId], [DocumentId]) VALUES (@BatchId,@DocumentId ) 
              FETCH NEXT FROM db_cursor INTO @DocumentId , @NumberOfPages        
 -- Close Cursor
 CLOSE db_cursor
  DEALLOCATE db_cursor
  select * from [dbo].[Batch]

more ▼

asked Jan 09, 2017 at 07:33 PM in Default

avatar image

2.8k 56 65 71

Is it important to batch items contiguously? Also is it important to batch them the most efficient way (fewest batches). Could a possible solution be:

Batch 1 (Docs 1&2) = 32+25 = 57
Batch 2 (Docs 3&5) = 41+19 = 60
Batch 3 (Doc 4) = 82
Batch 4 (Doc 6) = 81

In your example it just so happens that when you keep the documents in the right order, you get an efficient batching.

Jan 10, 2017 at 12:19 PM Kev Riley ♦♦

Thats true.. unfortunately the batches i need should be contiguous. The method i used is not great ... i am looking for a better efficient way because I need to deal with a huge volume of data.

Jan 10, 2017 at 03:40 PM aRookieBIdev

Isn't this the bin packing problem? Hugo Kornelis did a series of posts about it... http://sqlblog.com/blogs/hugo_kornelis/archive/tags/Bin+Packing/default.aspx

Jan 10, 2017 at 03:46 PM ThomasRushton ♦♦

Similar, but not looking for the efficiency of the lowest number of bins. I suspect the answer will be some adaptation of one of the methods Hugo presents.

Jan 10, 2017 at 03:55 PM Kev Riley ♦♦

First bin first / first fit method? https://www.simple-talk.com/sql/t-sql-programming/bin-packing-problems-the-sql/ - there are a few typos in this article, though, so go carefully through it...

Jan 10, 2017 at 04:01 PM ThomasRushton ♦♦
show all comments (comments are locked)
10|1200 characters needed characters left

1 answer: sort voted first

Not quite sure if I did understand your question but from what I read

select ff.BatchId,count(ff.DocumentId) as Number_of_DocumentID,sum(gg.NumberOfPages) as NumberOfPages from [dbo].[Batch] as ff join [dbo].[Document] as gg on ff.DocumentId=gg.DocumentId group by ff.BatchId

more ▼

answered Jan 10, 2017 at 08:55 PM

avatar image


Batch table is only provided to show what the OP wants the result to be - you can't use it as part of the solution

Jan 11, 2017 at 10:26 AM Kev Riley ♦♦

Thats right ..I was trying to get that data in Batch table. To avoid any further confusion , I have removed those now.

Jan 11, 2017 at 03:10 PM aRookieBIdev
(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here



Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.



asked: Jan 09, 2017 at 07:33 PM

Seen: 90 times

Last Updated: Jan 11, 2017 at 03:10 PM

Copyright 2018 Redgate Software. Privacy Policy