x

SQL Groupby and Rownumber by sum

Hi All,

I need your suggestion with the following scenario , I am currently trying with temp tables and rownumbers to get the desired output .

     CREATE TABLE [dbo].[Batch](
         [BatchId] [int] NOT NULL,
         [DocumentId] [int] NOT NULL
     ) ON [PRIMARY]
     
     
     
     CREATE TABLE [dbo].[Document](
         [DocumentId] [int] NULL,
         [NumberOfPages] [int] NULL
     ) ON [PRIMARY]
     
     

     
     INSERT [dbo].[Document] ([DocumentId], [NumberOfPages]) VALUES (1, 32)
     INSERT [dbo].[Document] ([DocumentId], [NumberOfPages]) VALUES (2, 25)
     INSERT [dbo].[Document] ([DocumentId], [NumberOfPages]) VALUES (3, 41)
     INSERT [dbo].[Document] ([DocumentId], [NumberOfPages]) VALUES (4, 82)
     INSERT [dbo].[Document] ([DocumentId], [NumberOfPages]) VALUES (5, 19)
     INSERT [dbo].[Document] ([DocumentId], [NumberOfPages]) VALUES (6, 81)
     
     /*
     
     I need to group the documents into a batches so that theie total page count is less than 100. 
     
     example 
     
     Batch 1 will have DocumentId 1 , 2, 3 which has a total page count of 32+25+41 = 98 
     Batch 2 will have DocumentId 4 which has a page count of 82 
     Batch 3 will have Document 5 and 6 which has a page count of 19+81 = 100 
     
     expected result is in 
     Select * from Batch 
     
     Note Document Id 4 and 5 can not be grouped because their page count exceeds 100 
     
     
     */

  My Solution below : 

 
 truncate table [dbo].[Batch] 
 
 
 Declare @TempSum int 
 Declare @DocumentId int 
 Declare @NumberOfPages int 
 Declare @BatchId int 
 Set @TempSum  = 0 ;
 Set @BatchId = 1;
 DECLARE db_cursor CURSOR  FOR
  
  select DocumentId , NumberOfPages from [dbo].[Document]
  
 OPEN db_cursor
  
   FETCH NEXT FROM db_cursor INTO @DocumentId , @NumberOfPages 
 
    WHILE @@FETCH_STATUS = 0
    BEGIN               
             
              
              
              
              if ( (@TempSum +  @NumberOfPages)  <=  100 ) 
              begin
              
              set @TempSum = @TempSum +  @NumberOfPages 
              
              INSERT [dbo].[Batch] ([BatchId], [DocumentId]) VALUES (@BatchId,@DocumentId ) 
              end 
              else 
 
              begin
              
              SET @BatchId = @BatchId + 1;
              Set @TempSum  = @NumberOfPages ;
              
              INSERT [dbo].[Batch] ([BatchId], [DocumentId]) VALUES (@BatchId,@DocumentId ) 
               
              end 
 
              FETCH NEXT FROM db_cursor INTO @DocumentId , @NumberOfPages        
 
   END
 -- Close Cursor
   
 CLOSE db_cursor
  DEALLOCATE db_cursor
 
 
  select * from [dbo].[Batch]

   
more ▼

asked Jan 09 at 07:33 PM in Default

avatar image

aRookieBIdev
2.8k 56 65 71

Is it important to batch items contiguously? Also is it important to batch them the most efficient way (fewest batches). Could a possible solution be:

Batch 1 (Docs 1&2) = 32+25 = 57
Batch 2 (Docs 3&5) = 41+19 = 60
Batch 3 (Doc 4) = 82
Batch 4 (Doc 6) = 81

In your example it just so happens that when you keep the documents in the right order, you get an efficient batching.

Jan 10 at 12:19 PM Kev Riley ♦♦

Thats true.. unfortunately the batches i need should be contiguous. The method i used is not great ... i am looking for a better efficient way because I need to deal with a huge volume of data.

Jan 10 at 03:40 PM aRookieBIdev

Isn't this the bin packing problem? Hugo Kornelis did a series of posts about it... http://sqlblog.com/blogs/hugo_kornelis/archive/tags/Bin+Packing/default.aspx

Jan 10 at 03:46 PM ThomasRushton ♦♦

Similar, but not looking for the efficiency of the lowest number of bins. I suspect the answer will be some adaptation of one of the methods Hugo presents.

Jan 10 at 03:55 PM Kev Riley ♦♦

First bin first / first fit method? https://www.simple-talk.com/sql/t-sql-programming/bin-packing-problems-the-sql/ - there are a few typos in this article, though, so go carefully through it...

Jan 10 at 04:01 PM ThomasRushton ♦♦
show all comments (comments are locked)
10|1200 characters needed characters left

1 answer: sort voted first

Not quite sure if I did understand your question but from what I read

select ff.BatchId,count(ff.DocumentId) as Number_of_DocumentID,sum(gg.NumberOfPages) as NumberOfPages from [dbo].[Batch] as ff join [dbo].[Document] as gg on ff.DocumentId=gg.DocumentId group by ff.BatchId

more ▼

answered Jan 10 at 08:55 PM

avatar image

alemgeb
0

Batch table is only provided to show what the OP wants the result to be - you can't use it as part of the solution

Jan 11 at 10:26 AM Kev Riley ♦♦

Thats right ..I was trying to get that data in Batch table. To avoid any further confusion , I have removed those now.

Jan 11 at 03:10 PM aRookieBIdev
(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.

Topics:

x403
x73
x48

asked: Jan 09 at 07:33 PM

Seen: 78 times

Last Updated: Jan 11 at 03:10 PM

Copyright 2017 Redgate Software. Privacy Policy