
Recommended schedule for running UpdateTravelTimes #51

Closed
nselikoff opened this issue May 9, 2018 · 11 comments

@nselikoff

From looking at open-austin/transitime-docker#3 I can see how to run UpdateTravelTimes, and that it can take either a single date (interpreted as both start and end) or separate start and end dates.

What's the recommended schedule for running UpdateTravelTimes, and do you typically run for one day or over a span of days?

@scrudden (Member) commented May 10, 2018

Hi Nathan,

This is a good question.

The result I believe you are looking for is the best quality predictions. The answer to what gives this is not so simple, and it can depend on the characteristics of the route you are working with.

If you do not run UpdateTravelTimes, the default prediction implementation (PredictionGeneratorDefaultImpl) will use a sum of scheduled travel times and scheduled dwell times to predict arrival times at upcoming stops. This can produce decent results.
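For illustration only, here is a minimal sketch of that idea: the predicted arrival is the current time plus the scheduled travel time for each remaining stop path plus the scheduled dwell at each intermediate stop. The class and field names are made up for this example and are not TheTransitClock's actual API.

import java.util.List;

public class ScheduleBasedPredictionSketch {

    /** Scheduled values for one stop path (segment ending at a stop), in msec. */
    record ScheduledStopPath(long travelTimeMsec, long dwellTimeMsec) {}

    /** Predicted arrival time (msec epoch) at the last stop of the remaining paths. */
    static long predictArrivalMsec(long nowMsec, List<ScheduledStopPath> remainingPaths) {
        long prediction = nowMsec;
        for (int i = 0; i < remainingPaths.size(); i++) {
            prediction += remainingPaths.get(i).travelTimeMsec();
            // Dwell time counts at intermediate stops, but not at the
            // stop whose arrival we are predicting.
            if (i < remainingPaths.size() - 1) {
                prediction += remainingPaths.get(i).dwellTimeMsec();
            }
        }
        return prediction;
    }

    public static void main(String[] args) {
        List<ScheduledStopPath> remaining = List.of(
                new ScheduledStopPath(90_000, 20_000),   // 90s travel, 20s dwell
                new ScheduledStopPath(120_000, 15_000),
                new ScheduledStopPath(60_000, 0));
        // 90 + 20 + 120 + 15 + 60 seconds = 305s after "now"
        System.out.println(predictArrivalMsec(System.currentTimeMillis(), remaining));
    }
}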

So, to get something better than the schedule, you will need enough sample data that the "average" travel times and dwell times used are statistically significant (it is not just a simple average; outlying values and dodgy data are removed). I would run this over the full sample set in one run of UpdateTravelTimes.
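To make "not just a simple average" concrete, here is a hypothetical sketch of an outlier-trimmed average for one stop path's observed travel times. The filtering rule used here (drop samples more than a chosen fraction away from the plain mean) is an assumption for illustration, not necessarily what UpdateTravelTimes does internally.

import java.util.List;

public class TrimmedAverageSketch {

    static double trimmedAverageMsec(List<Long> samplesMsec, double allowedFraction) {
        double mean = samplesMsec.stream().mapToLong(Long::longValue).average().orElse(0.0);
        return samplesMsec.stream()
                .mapToLong(Long::longValue)
                // Keep only samples within allowedFraction of the plain mean.
                .filter(t -> Math.abs(t - mean) <= allowedFraction * mean)
                .average()
                .orElse(mean);
    }

    public static void main(String[] args) {
        // Five clean samples around 120s plus one dodgy 600s outlier.
        List<Long> samples = List.of(118_000L, 122_000L, 119_000L, 121_000L, 120_000L, 600_000L);
        System.out.println(trimmedAverageMsec(samples, 0.5)); // prints 120000.0; the outlier is dropped
    }
}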

The fact that this update has to be done manually is one of the drawbacks of the default prediction implementation.

Cheers,

Sean.

@nselikoff (Author)

Thanks for the explanation @scrudden - that is indeed the result I'm looking for. Is there a rule of thumb in your experience when it comes to "enough sample data" to get to statistical significance? For example, I'm working with a route right now that is very sparse... there are only 23 trips a week.

@scrudden (Member) commented May 11, 2018

Without knowing the schedule, I would guess perhaps 20 independent sets of data for each trip_id. If you work up to this, you will see the values used for each stop path travel time level off, and at that point you should have enough.
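As a purely illustrative way to check the "level off" condition: recompute the average travel time for a stop path after each additional batch of data, and treat it as stable once the last few running averages agree within a tolerance. The helper below is a sketch of that idea, not part of TheTransitClock.

import java.util.List;

public class LevelOffCheckSketch {

    /** True if the last `window` running averages differ by at most `toleranceMsec`. */
    static boolean hasLeveledOff(List<Double> runningAveragesMsec, int window, double toleranceMsec) {
        if (runningAveragesMsec.size() < window) return false;
        List<Double> tail = runningAveragesMsec.subList(runningAveragesMsec.size() - window,
                                                        runningAveragesMsec.size());
        double min = tail.stream().mapToDouble(Double::doubleValue).min().orElse(0);
        double max = tail.stream().mapToDouble(Double::doubleValue).max().orElse(0);
        return max - min <= toleranceMsec;
    }

    public static void main(String[] args) {
        // Running averages after successive batches of data: still drifting, then stable.
        List<Double> averages = List.of(150_000.0, 138_000.0, 131_000.0, 129_500.0, 129_800.0, 129_600.0);
        System.out.println(hasLeveledOff(averages, 3, 1_000.0)); // prints true
    }
}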

@scrudden (Member)

@nselikoff Can this issue be closed?

@nselikoff (Author)

Yes, thanks for the additional guidance @scrudden

@nselikoff (Author)

Hi @scrudden, I'm reopening this issue to get a little more help on running UpdateTravelTimes. I have run the UpdateTravelTimes.jar on the command line, but still have some questions:

  1. Should UpdateTravelTimes be run for just one day? Or for a range of days? If a range, how long of a range?
  2. Besides the start and end date, are there other important config params to be aware of for UpdateTravelTimes?
  3. How often should UpdateTravelTimes be run?
  4. What should I look for in the database to see that it was run correctly and is doing what it is supposed to?

Thanks!

nselikoff reopened this Aug 17, 2018
@scrudden (Member)

To check that UpdateTravelTimes has run successfully, you can run this query. If the only result is SCHED, then it has not run correctly (or perhaps there was no data in the specified date range for it to process).

select howset, count(*) from traveltimesforstoppaths group by howset;
 howset |  count  
--------+---------
 AVL    |    1450
 SERVC  |    1399
 SCHED  | 1939017
 TRIP   |    1437
(4 rows)

The description of each of these values can be found in the code comments below.

/**
 * This enumeration is for keeping track of how the travel times were
 * determined. This way can tell if they should be overridden or not.
 */
public enum HowSet {
    // From when there are no schedule times so simply need to use a
    // default speed
    SPEED(0),

    // From interpolating data in GTFS stop_times.txt file
    SCHED(1),

    // No AVL data was available for the actual day so using data from
    // another day.
    SERVC(2),

    // No AVL data was available for the actual trip so using data from
    // a trip that is before or after the trip in question
    TRIP(3),

    // Based on actual running times as determined by AVL data
    AVL(4);

    @SuppressWarnings("unused")
    private int value;

    private HowSet(int value) {
        this.value = value;
    }

    public boolean isScheduleBased() {
        return this == SPEED ||
                this == SCHED;
    }
};

@scrudden (Member)

You need to restart Core for it to use the newly processed travel times for generating predictions.

@scrudden (Member)

If this is to be run regularly, I would run it once a week over a rolling period covering the last 28 days. On larger systems there may be performance issues with this, as the comments in the code note:

/**
 * Uses AVL based data of arrival/departure times and matches from the database
 * to update the expected travel and stop times.
 * <p>
 * NOTE: This could probably be made less resource/memory intensive by
 * processing a day's worth of data at a time. Another possibility would be to
 * try to process the data while it is being read in instead of reading it all
 * in at the beginning. But that would likely be quite difficult to implement.
 * Processing one day of data at a time would likely be far simpler and
 * therefore a better choice.
 */

This is one of the motivations for adding the Kalman Filter for travel times and RLS algorithm for dwell times to TheTransitClock. These both update as the system runs.
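For the weekly run over a rolling 28-day window suggested above, here is a minimal sketch of computing the window boundaries. The date format and the exact way the dates are passed to UpdateTravelTimes depend on your deployment, so treat those parts as assumptions.

import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class RollingWindowSketch {
    public static void main(String[] args) {
        DateTimeFormatter fmt = DateTimeFormatter.ofPattern("MM-dd-yyyy"); // assumed date format
        LocalDate end = LocalDate.now();
        LocalDate start = end.minusDays(28);
        // These two values would be supplied as the start and end dates when
        // UpdateTravelTimes is invoked, e.g. from a weekly cron job or wrapper script.
        System.out.println("start=" + fmt.format(start) + " end=" + fmt.format(end));
    }
}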

@scrudden (Member)

@nselikoff Can I close this issue again?

@nselikoff (Author)

Yes, thanks @scrudden
