question

Magnus Ahlkvist avatar image
Magnus Ahlkvist asked

How to find a bus-route

EDIT 2011-02-16, 08:16 I have done some changes: - Changed BusFair_BusStop so it contains bot StartBusStopID and EndBusStopID - Changed BusFair_BusStop so it contains absolute times instead of relative minutes - Removed StartsAt from BusFair - Set some boundaries to a search, like @MaxArrival, @StartTime, @StartBusStopID. - Created a PROC which actually solves the problem, but that's not as cheap as it probably can be.. This is what the proc looks like: CREATE PROC GetCheapestFair(@StartBusStopID int, @EndBusStopID int, @StartTime time, @MaxArrivalTime time) AS ;with CTE AS( SELECT 0 as RecLevel, bf.BusFairID , bf.LineNumber, bfbs.StartBusStopID , bsStart.BusStopName as StartBusStopName , bfbs.EndBusStopID , bsEnd.BusStopName as EndBusStopName, bfbs.StartsAt, bfbs.StopsAt FROM BusFair_BusStop bfbs INNER JOIN BusStop bsStart ON bfbs.StartBusStopID = bsStart.BusStopID INNER JOIN BusStop bsEnd ON bfbs.EndBusStopID = bsEnd.BusStopID INNER JOIN BusFair bf ON bfbs.BusFairID = bf.BusFairID WHERE bsStart.BusStopID = @startBusStopID AND bfbs.StartsAt >= @startTime AND bfbs.StopsAt = CTE.StopsAt ), CTE2 AS( select top 1 * from CTE where endbusstopid = @endBusStopID order by StopsAt UNION ALL select CTE.* from CTE INNER JOIN CTE2 ON CTE.EndBusStopID = CTE2.StartBusStopID AND CTE.recLevel = CTE2.recLevel -1 AND CTE.StopsAt =0 ), CTE3 AS( SELECT *, ROW_NUMBER() OVER (PARTITION BY RecLevel ORDER BY stopsAt ) AS RowNum FROM CTE2 ), CTE4 AS( SELECT ROW_NUMBER() OVER(PARTITION BY LineNumber ORDER BY StartsAt) as MinTimePerLine, ROW_NUMBER() OVER(PARTITION BY LineNumber ORDER BY StopsAt DESC) as MaxTimePerLine, * FROM CTE3 WHERE RowNum=1 ) select distinct (select startbusstopname from CTE4 c where MinTimePerLine=1 and c.linenumber=cte4.linenumber) as startstation, (select startbusstopid from CTE4 c where mintimeperline=1 and c.linenumber=cte4.linenumber) as startstationid, (select startsat from CTE4 c where mintimeperline=1 and c.linenumber=cte4.linenumber) as starttime, (select endbusstopname from CTE4 c where MaxTimePerLine=1 and c.linenumber=cte4.linenumber) as startstation, (select endbusstopid from CTE4 c where maxtimeperline=1 and c.linenumber=cte4.linenumber) as startstationid, (select stopsat from CTE4 c where maxtimeperline=1 and c.linenumber=cte4.linenumber) as starttime from CTE4 where mintimeperline = 1 or maxtimeperline=1 END EDIT I'm planning to implement a search engine for bus traffic in the town where I live. The datamodel I'm thinking of looks like the below. It's simplified - a "real" datamodel would consider weekdays, holidays etc. But this is enough detail for now... CREATE TABLE BusLine(LineNumber int PRIMARY KEY) CREATE TABLE BusStop(BusStopID int IDENTITY(1,1) PRIMARY KEY, BusStopName nvarchar(200)) CREATE TABLE BusFair(BusFairID int IDENTITY(1,1) PRIMARY KEY, LineNumber int, startsAt time CONSTRAINT FK_BusFair_BusLine FOREIGN KEY(LineNumber) REFERENCES BusLine(LineNumber)) CREATE TABLE BusFair_BusStop(BusFairID int, BusStopID int, MinutesFromStart int CONSTRAINT PK_BusFair_BusStop PRIMARY KEY(BusFairID, BusStopID), CONSTRAINT FK_BusFair_BusStop_BusFair FOREIGN KEY (BusFairID) REFERENCES BusFair(BusFairID), CONSTRAINT FK_BusFair_BusStop_BusStop FOREIGN KEY (BusStopID) REFERENCES BusStop(BusStopID)) INSERT INTO BusLine (LineNumber) Values(804) INSERT INTO BusLine (LineNumber) Values(803) INSERT INTO BusStop (BusStopName) VALUES ('Enköping'), ('Örsundsbro'), ('Uppsala'), ('Bålsta') INSERT INTO BusFair(LineNumber, startsAt) VALUES (804,'08:00'), (803,'10:00') INSERT INTO BusFair_BusStop(BusFairID, BusStopID, MinutesFromStart) VALUES (1, 1, 0), (1,2,30), (1,3,50), (2,3,0), (2,4,35) A time-table for a bus-company normally consists of a bus-line per page, a BusStop per column and a BusFair per row, with times for when the bus stops at each BusStop. So far so good, I believe. If I want to go from Enköping to Bålsta and be in Bålsta by 11:00 at the latest, I need to take bus 804 to Uppsala and then switch to bus 803 to Bålsta. I would like to get a push in the right direction for two queries: I'm first interested in a list of possible routes, when the route starts, when the route ends and how many times I have to switch buses. For that query, I'm thinking I need a parameter to tell when I want to be at a specific bus-stop and how much deviation is accepted. Next, If I have the result from the first query, I need to find out which BusFairs are involved, at which times and which busstops I have a transit for my route. Anyone with ideas to help me out?
sql-server-2008-r2
9 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

ThomasRushton avatar image ThomasRushton ♦♦ commented ·
Travelling salesman problem? Least cost routing? hmm...
0 Likes 0 ·
Magnus Ahlkvist avatar image Magnus Ahlkvist commented ·
I know there's always good old Dijkstra's algorithm, but it was 10-11 years since I last had to even consider graph algorithms. And then it was C or Java programming. But I'm thinking a relational database would be the best place to store a time-table...
0 Likes 0 ·
WilliamD avatar image WilliamD commented ·
Any takers? I have started messing around for 10 minutes and haven't made any sensible progress.
0 Likes 0 ·
Håkan Winther avatar image Håkan Winther commented ·
I will let one of my co-worker look at this, he used to work at swebus.
0 Likes 0 ·
Magnus Ahlkvist avatar image Magnus Ahlkvist commented ·
Thanks Håkan. I appreciate it. I have made some progress, found a way to get the route with the earliest stoptime. But it's nowhere as cheap as I want it to be, it includes four nested CTEs.
0 Likes 0 ·
Show more comments
Magnus Ahlkvist avatar image
Magnus Ahlkvist answered
USE [busline] GO /****** Object: Table [dbo].[BusStop] Script Date: 02/16/2011 09:19:39 ******/ SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO CREATE TABLE [dbo].[BusStop]( [BusStopID] [int] IDENTITY(1,1) NOT NULL, [BusStopName] [nvarchar](200) NULL, PRIMARY KEY CLUSTERED ( [BusStopID] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] GO SET IDENTITY_INSERT [dbo].[BusStop] ON INSERT [dbo].[BusStop] ([BusStopID], [BusStopName]) VALUES (1, N'Enköping') INSERT [dbo].[BusStop] ([BusStopID], [BusStopName]) VALUES (2, N'Örsundsbro') INSERT [dbo].[BusStop] ([BusStopID], [BusStopName]) VALUES (3, N'Uppsala') INSERT [dbo].[BusStop] ([BusStopID], [BusStopName]) VALUES (4, N'Bålsta') INSERT [dbo].[BusStop] ([BusStopID], [BusStopName]) VALUES (5, N'Arlanda') INSERT [dbo].[BusStop] ([BusStopID], [BusStopName]) VALUES (6, N'Strängnäs') SET IDENTITY_INSERT [dbo].[BusStop] OFF /****** Object: Table [dbo].[BusLine] Script Date: 02/16/2011 09:19:39 ******/ SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO CREATE TABLE [dbo].[BusLine]( [LineNumber] [int] NOT NULL, PRIMARY KEY CLUSTERED ( [LineNumber] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] GO INSERT [dbo].[BusLine] ([LineNumber]) VALUES (801) INSERT [dbo].[BusLine] ([LineNumber]) VALUES (803) INSERT [dbo].[BusLine] ([LineNumber]) VALUES (804) INSERT [dbo].[BusLine] ([LineNumber]) VALUES (877) /****** Object: Table [dbo].[BusFair_BusStop] Script Date: 02/16/2011 09:19:39 ******/ SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO CREATE TABLE [dbo].[BusFair_BusStop]( [BusFairID] [int] NOT NULL, [StartBusStopID] [int] NULL, [EndBusStopID] [int] NOT NULL, [startsat] [time](7) NULL, [stopsat] [time](7) NULL ) ON [PRIMARY] GO INSERT [dbo].[BusFair_BusStop] ([BusFairID], [StartBusStopID], [EndBusStopID], [startsat], [stopsat]) VALUES (1, 1, 2, CAST(0x07008482A8410000 AS Time), CAST(0x0700B864D9450000 AS Time)) INSERT [dbo].[BusFair_BusStop] ([BusFairID], [StartBusStopID], [EndBusStopID], [startsat], [stopsat]) VALUES (1, 2, 3, CAST(0x0700B864D9450000 AS Time), CAST(0x0700EC460A4A0000 AS Time)) INSERT [dbo].[BusFair_BusStop] ([BusFairID], [StartBusStopID], [EndBusStopID], [startsat], [stopsat]) VALUES (6, 1, 2, CAST(0x070040230E430000 AS Time), CAST(0x070074053F470000 AS Time)) INSERT [dbo].[BusFair_BusStop] ([BusFairID], [StartBusStopID], [EndBusStopID], [startsat], [stopsat]) VALUES (2, 3, 4, CAST(0x070010ACD1530000 AS Time), CAST(0x0700A25EB5580000 AS Time)) INSERT [dbo].[BusFair_BusStop] ([BusFairID], [StartBusStopID], [EndBusStopID], [startsat], [stopsat]) VALUES (6, 2, 3, CAST(0x070074053F470000 AS Time), CAST(0x0700A8E76F4B0000 AS Time)) INSERT [dbo].[BusFair_BusStop] ([BusFairID], [StartBusStopID], [EndBusStopID], [startsat], [stopsat]) VALUES (3, 3, 6, CAST(0x0700E03495640000 AS Time), CAST(0x0700C03AC26F0000 AS Time)) INSERT [dbo].[BusFair_BusStop] ([BusFairID], [StartBusStopID], [EndBusStopID], [startsat], [stopsat]) VALUES (7, 3, 6, CAST(0x07007870335C0000 AS Time), CAST(0x070048F9F66C0000 AS Time)) INSERT [dbo].[BusFair_BusStop] ([BusFairID], [StartBusStopID], [EndBusStopID], [startsat], [stopsat]) VALUES (4, 1, 3, CAST(0x0700B893419F0000 AS Time), CAST(0x070006E78AA50000 AS Time)) /****** Object: Table [dbo].[BusFair] Script Date: 02/16/2011 09:19:39 ******/ SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO CREATE TABLE [dbo].[BusFair]( [BusFairID] [int] IDENTITY(1,1) NOT NULL, [LineNumber] [int] NULL, PRIMARY KEY CLUSTERED ( [BusFairID] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] GO SET IDENTITY_INSERT [dbo].[BusFair] ON INSERT [dbo].[BusFair] ([BusFairID], [LineNumber]) VALUES (1, 804) INSERT [dbo].[BusFair] ([BusFairID], [LineNumber]) VALUES (2, 803) INSERT [dbo].[BusFair] ([BusFairID], [LineNumber]) VALUES (3, 877) INSERT [dbo].[BusFair] ([BusFairID], [LineNumber]) VALUES (4, 804) INSERT [dbo].[BusFair] ([BusFairID], [LineNumber]) VALUES (5, 801) INSERT [dbo].[BusFair] ([BusFairID], [LineNumber]) VALUES (6, 804) INSERT [dbo].[BusFair] ([BusFairID], [LineNumber]) VALUES (7, 877) SET IDENTITY_INSERT [dbo].[BusFair] OFF /****** Object: StoredProcedure [dbo].[GetCheapestFair] Script Date: 02/16/2011 09:19:45 ******/ SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO --EXEC GetCheapestFair 1, 6, '07:30','21:00' CREATE PROC [dbo].[GetCheapestFair](@StartBusStopID int, @EndBusStopID int, @StartTime time, @MaxArrivalTime time) AS ;with CTE AS( SELECT 0 as RecLevel, bf.BusFairID , bf.LineNumber, bfbs.StartBusStopID , bsStart.BusStopName as StartBusStopName , bfbs.EndBusStopID , bsEnd.BusStopName as EndBusStopName, bfbs.StartsAt, bfbs.StopsAt FROM BusFair_BusStop bfbs INNER JOIN BusStop bsStart ON bfbs.StartBusStopID = bsStart.BusStopID INNER JOIN BusStop bsEnd ON bfbs.EndBusStopID = bsEnd.BusStopID INNER JOIN BusFair bf ON bfbs.BusFairID = bf.BusFairID WHERE bsStart.BusStopID = @startBusStopID AND bfbs.StartsAt >= @startTime AND bfbs.StopsAt = CTE.StopsAt ), CTE2 AS( select top 1 * from CTE where endbusstopid = @endBusStopID order by StopsAt UNION ALL select CTE.* from CTE INNER JOIN CTE2 ON CTE.EndBusStopID = CTE2.StartBusStopID AND CTE.recLevel = CTE2.recLevel -1 AND CTE.StopsAt =0 ), CTE3 AS( SELECT *, ROW_NUMBER() OVER (PARTITION BY RecLevel ORDER BY stopsAt ) AS RowNum FROM CTE2 ), CTE4 AS( SELECT ROW_NUMBER() OVER(PARTITION BY LineNumber ORDER BY StartsAt) as MinTimePerLine, ROW_NUMBER() OVER(PARTITION BY LineNumber ORDER BY StopsAt DESC) as MaxTimePerLine, * FROM CTE3 WHERE RowNum=1 ) select distinct (select startbusstopname from CTE4 c where MinTimePerLine=1 and c.linenumber=cte4.linenumber) as startstation, (select startbusstopid from CTE4 c where mintimeperline=1 and c.linenumber=cte4.linenumber) as startstationid, (select startsat from CTE4 c where mintimeperline=1 and c.linenumber=cte4.linenumber) as starttime, (select endbusstopname from CTE4 c where MaxTimePerLine=1 and c.linenumber=cte4.linenumber) as startstation, (select endbusstopid from CTE4 c where maxtimeperline=1 and c.linenumber=cte4.linenumber) as startstationid, (select stopsat from CTE4 c where maxtimeperline=1 and c.linenumber=cte4.linenumber) as starttime from CTE4 where mintimeperline = 1 or maxtimeperline=1 GO /****** Object: Default [DF__BusFair_B__EndBu__117F9D94] Script Date: 02/16/2011 09:19:39 ******/ ALTER TABLE [dbo].[BusFair_BusStop] ADD DEFAULT ((0)) FOR [EndBusStopID] GO /****** Object: ForeignKey [FK_BusFair_BusLine] Script Date: 02/16/2011 09:19:39 ******/ ALTER TABLE [dbo].[BusFair] WITH CHECK ADD CONSTRAINT [FK_BusFair_BusLine] FOREIGN KEY([LineNumber]) REFERENCES [dbo].[BusLine] ([LineNumber]) GO ALTER TABLE [dbo].[BusFair] CHECK CONSTRAINT [FK_BusFair_BusLine] GO
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

WilliamD avatar image
WilliamD answered
OK, spent a further 20 minutes at lunchtime. I start with pretty much the same CTE as you to build the BusStop list, however, I make note of a route section that has either the start or destination in it. The fun starts in the next CTE "RouteGrouping" - I identify the connections that make sense (those with start or destination in them), then have the rownumber to identify the chronological ordering. I'm having difficulty explaining the `dense_rank()` in words..... I number the stations in descending chronological order so I can ignore the oldest ones - it just works! ;) Finally, in EligibleConnections I do all the min, max stuff to find the actual timings of the groups I just identified, filtering out all the crap. The only weakness in this, that I haven't spent time fixing, is if you try and find a route which cannot be fulfilled - e.g. DECLARE @startBusStopID int = 1, @EndBusStopID int = 6, @startTime time = '07:30', @maxArrivalTime time = '09:00' ; Your solution returns no rows, whilst mine will return the first leg of the journey (Enköping to Uppsala) although the remainder cannot be fulfilled. I'll leave you to come up with a solution for that. ;-p (maybe I'll have a Eureka moment later on) This solution beats your sproc at the moment though: --Using the folllowing Parameters: DECLARE @startBusStopID int = 1, @EndBusStopID int = 6, @startTime time = '07:30', @maxArrivalTime time = '21:00' ; -- Me (2 row(s) affected) Table 'BusStop'. Scan count 0, logical reads 8, physical reads 0 Table 'Worktable'. Scan count 5, logical reads 142, physical reads 0 Table 'BusFair'. Scan count 0, logical reads 30, physical reads 0 Table 'BusFair_BusStop'. Scan count 2, logical reads 16, physical reads 0 -- You (GetCheapestFair) (2 row(s) affected) Table 'Worktable'. Scan count 55, logical reads 2096, physical reads 0 Table 'BusStop'. Scan count 0, logical reads 868, physical reads 0 Table 'BusFair'. Scan count 0, logical reads 420, physical reads 0 Table 'BusFair_BusStop'. Scan count 28, logical reads 224, physical reads 0 ---- DECLARE @startBusStopID int = 1, @EndBusStopID int = 6, @startTime time = '07:30', @maxArrivalTime time = '21:00' ; WITH AllRoutes AS (SELECT BFBS.BusFairID, StartBusStopID, EndBusStopID, startsat, stopsat, CAST(CASE WHEN StartBusStopID = @StartBusStopID THEN 1 ELSE 0 END AS tinyint) IsStart, CAST(CASE WHEN EndBusStopID = @EndBusStopID THEN 1 ELSE 0 END AS tinyint) IsDestination FROM dbo.BusFair BF INNER JOIN dbo.BusFair_BusStop BFBS ON BF.BusFairID = BFBS.BusFairID WHERE StartBusStopID = @startBusStopID AND startsat >= @starttime AND startsat < @maxArrivalTime UNION ALL SELECT BFBS.BusFairID, BFBS.StartBusStopID, BFBS.EndBusStopID, BFBS.startsat, BFBS.stopsat, CAST(CASE WHEN BFBS.StartBusStopID = @StartBusStopID THEN 1 ELSE 0 END AS tinyint), CAST(CASE WHEN BFBS.EndBusStopID = @EndBusStopID THEN 1 ELSE 0 END AS tinyint) FROM dbo.BusFair BF INNER JOIN dbo.BusFair_BusStop BFBS ON BF.BusFairID = BFBS.BusFairID INNER JOIN AllRoutes anchor ON anchor.EndBusStopID = BFBS.StartBusStopID WHERE bfbs.stopsat = anchor.stopsat), RouteGrouping AS (SELECT DISTINCT BusFairId, StartBusStopId, EndBusStopId, StartsAt, StopsAt, MAX(IsStart) OVER (PARTITION BY BusFairID) HasStart, -- identify if a BusFair contains the starting point MAX(IsDestination) OVER (PARTITION BY BusFairID) HasDestination, -- identify if a BusFair contains the end point ROW_NUMBER() OVER (PARTITION BY StartBusStopId, EndBusStopId ORDER BY startsat) rn, -- identify the order of starters for a start and end busstop DENSE_RANK() OVER (PARTITION BY StartBusStopId ORDER BY startsat DESC) rn2 -- identify the reverse order of starters for a start busstop FROM AllRoutes), EligibleConnections AS (SELECT BusFairId, MIN(StartBusStopId) StartBusStopId, MAX(EndBusStopId) EndBusStopId, MIN(StartsAt) StartsAt, MAX(StopsAt) StopsAt, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) StopNo FROM RouteGrouping WHERE (HasStart = 1 OR HasDestination = 1) -- take the entries that have either the start of destiniation in their route AND rn = 1 -- take the earliest starter for the grouping start,end AND rn2 > 1 --take the earliest starter for the grouping start GROUP BY BusFairId) SELECT SS.BusStopName AS StartStation, StartBusStopId AS StartStationId, StartsAt AS DepartureTime, ES.BusStopName AS EndStation, EndBusStopId AS EndStation, StopsAt ArrivalTime FROM EligibleConnections EC INNER JOIN dbo.BusStop SS ON EC.StartBusStopId = SS.BusStopID INNER JOIN dbo.BusStop ES ON EC.EndBusStopId = ES.BusStopID ;
1 comment
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Magnus Ahlkvist avatar image Magnus Ahlkvist commented ·
Thanks a lot! I'm loading data (lots, lots of manual work) and will test both solutions with all the live data loaded. One thing that I will change in my datamodel is that StartsAt and StopsAt will be datetime-columns, and I will load the bus-stops for every day. I was thinking of a solution where each Fair had a WhichWeekDaysWillIRun column, but figured the solution will be faster loading all data. It's not that much data anyway, and it's easy to adjust if the city one day decides that a week in the middle of the summer will have a temporarily changed time-table, or a busstop can't be used certain days due to road-workers. I'll let you know when I put the website up!
0 Likes 0 ·
BrentShaub avatar image
BrentShaub answered
What is the logical intent behind the table BusFair? Is it to allocate a price for each journey? Is it to associate the reverse route? I ran your code, and it works. I am trying to get it to work for my dataset, and no dice yet. I have a table tRoute that has a route number, a table tCircuit which has a foreign key route number and some attributes about the bus making the trip, of course tStop with the stop name and geographic location, then tCircStop to join that stops to the circuit at start and end times. I don't see where my translation of this idea is off. Your query runs immediately. Mine is still resultless after 15 minutes with only 400 data rows, so something is up. Are you still active in this post? Say the word, I'm happy to have your help.
2 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

BrentShaub avatar image BrentShaub commented ·
That question is meant for Magnus Ahikvist.
0 Likes 0 ·
BrentShaub avatar image BrentShaub commented ·
I did note a very small thing to change: select distinct (select startbusstopname from CTE4 c where MinTimePerLine=1 and c.linenumber=cte4.linenumber) as startstation, (select startbusstopid from CTE4 c where mintimeperline=1 and c.linenumber=cte4.linenumber) as startstationid, (select startsat from CTE4 c where mintimeperline=1 and c.linenumber=cte4.linenumber) as starttime, (select endbusstopname from CTE4 c where MaxTimePerLine=1 and c.linenumber=cte4.linenumber) as endstation, (select endbusstopid from CTE4 c where maxtimeperline=1 and c.linenumber=cte4.linenumber) as endstationid, (select stopsat from CTE4 c where maxtimeperline=1 and c.linenumber=cte4.linenumber) as endtime
0 Likes 0 ·

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.