Python, matrices, arrays, and the National Trip End Model

Travel demand in a matrix

Travel demand can be expressed using origin-destination (OD) matrices. In this example we have 4 zones:

     1   2   3   4
1   10   2   9  18
2    3   2   2   1
3    2   1  15  17
4    1   3   1   4

The origins are in rows and the destinations are in columns, so the number of trips from zone 1 to zone 2 is 2 and the number of trips from zone 3 to zone 4 is 17. Let’s say that this data is for weekday commuting trips by rail. To determine the total number of trips from zone 1 we simply sum across row 1: 10+2+9+18=39. To determine the number of trips to zone 1 we sum down column 1: 10+3+2+1=16. Since more trips originate in zone 1 than arrive there, we can assume that zone 1 is mostly residential. By the same reasoning zone 4, with 9 trips out (1+3+1+4) and 40 trips in (18+1+17+4), is mostly offices or factories.

Note that the column and row headings are not part of the matrix.

What we are interested in is forecasting growth in demand. Growth data can be obtained from the National Trip End Model (NTEM), more on which later. Forecasting involves a fair amount of matrix manipulation. This could be done by representing the matrix as a list of lists, but the Python NumPy package makes it relatively painless.
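For comparison, here is a minimal sketch of the list-of-lists approach in plain Python, reproducing the row and column sums worked out by hand above. The variable names are only illustrative.

```python
# The example OD matrix as a plain list of lists (each inner list is an origin row).
demand_rows = [[10, 2, 9, 18],
               [3, 2, 2, 1],
               [2, 1, 15, 17],
               [1, 3, 1, 4]]

# Origin totals: sum each row.
origin_totals = [sum(row) for row in demand_rows]

# Destination totals: zip(*rows) regroups the values by column.
destination_totals = [sum(col) for col in zip(*demand_rows)]

print(origin_totals)       # [39, 8, 35, 9]
print(destination_totals)  # [16, 8, 27, 40]
```

This works, but every new operation needs its own loop or comprehension.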

Matrices as arrays with NumPy

NumPy provides a specific matrix object but you are advised to use the array object to perform matrix-like operations. A lot of the methods that apply to an array object can also be applied to a matrix object. However, the NumPy matrix object is limited to 2 dimensions whereas the array object is N-dimensional, which could be useful if we want to separate our trips by travel mode, for example.

To create the example matrix (as an array):

import numpy
demand = numpy.array([[10,2,9,18],
 [3,2,2,1],
 [2,1,15,17],
 [1,3,1,4]])
print ( demand )

returns:

[[10  2  9 18]
 [ 3  2  2  1]
 [ 2  1 15 17]
 [ 1  3  1  4]]
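As mentioned, an array can have more than two dimensions. The sketch below stacks the rail matrix with a second mode to give an array indexed by mode, origin and destination. The second mode's figures are invented purely for illustration (simply twice the rail trips) and are not real data.

```python
import numpy

demand = numpy.array([[10, 2, 9, 18],
                      [3, 2, 2, 1],
                      [2, 1, 15, 17],
                      [1, 3, 1, 4]])

# Hypothetical second mode, made up for illustration only.
other_mode = demand * 2

# Stack the two 4x4 matrices into one array indexed [mode, origin, destination].
by_mode = numpy.stack([demand, other_mode])

print(by_mode.shape)             # (2, 4, 4)
print(by_mode.sum(axis=0))       # OD matrix for all modes combined
print(by_mode.sum(axis=(1, 2)))  # total trips per mode: [ 91 182]
```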

I’ll leave the basics of NumPy aside to focus on how it applies to travel demand matrices. First question: how many trips do we have in total? Sum all of the elements in the matrix (array):

>>> demand.sum()
91

One area of interest is trip end totals or trip ends. If we want to sum across rows:

>>> demand.sum(axis=1)
array([39,  8, 35,  9])

We can omit the parameter name to return the same result:

demand.sum(1)

To sum the columns:

>>> demand.sum(0)
array([16,  8, 27, 40])

We can find the most popular trip easily enough:

>>> demand.max()
18

But to find out which O-D pair this relates to, we can use NumPy's where function:

i, j = numpy.where(demand==demand.max())
print ( i, j )

returns:

[0] [3]

which is row 0, column 3 (NumPy counts from 0). To determine the actual zone numbers, add 1:

print ( i+1, j+1 )

returns:

[1] [4]
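An alternative, if we only want a single answer rather than arrays of matches, is to combine argmax with unravel_index. This is a sketch of a different technique from the one above: where returns every cell that ties for the maximum, whereas argmax returns only the first.

```python
import numpy

demand = numpy.array([[10, 2, 9, 18],
                      [3, 2, 2, 1],
                      [2, 1, 15, 17],
                      [1, 3, 1, 4]])

# argmax gives the position of the first maximum in the flattened array;
# unravel_index converts that flat position back into (row, column).
i, j = numpy.unravel_index(demand.argmax(), demand.shape)

print(i + 1, j + 1)  # 1 4
```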

Which is the most populous zone? Assuming that the most populous zone is the one with the most origins:

OTE = demand.sum(1)
print ( OTE )
a = numpy.where(OTE==OTE.max())
print ( a[0] + 1 )

returns:

[39  8 35  9]
[1]

We generate the row totals as before and again use where to find the position of the maximum. where returns a tuple containing one array of indices per dimension, so a[0] holds the index of the maximum along the first (and here only) dimension, and adding 1 to it gives the zone number.

We can do a similar thing to determine the most popular zone to travel to by summing on the other axis (i.e. the columns):

DTE = demand.sum(0)
print (DTE)
a = numpy.where(DTE==DTE.max())
print ( a[0] + 1 )

returns:

[16  8 27 40]
[4]

Which is the most diverse zone?

So far we have determined that zone 1 is where most trips originate and zone 4 is where the most trips end up.  There are relatively few trips to zone 1, so we can infer that it consists mostly of housing.  Similarly, there are few trips from zone 4 so we can infer that it is mostly workplaces.

If we substitute min for max we would see that zone 2 has the fewest origins and the fewest destinations; this could be a rural zone, perhaps?
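The same pattern with min in place of max looks like this:

```python
import numpy

demand = numpy.array([[10, 2, 9, 18],
                      [3, 2, 2, 1],
                      [2, 1, 15, 17],
                      [1, 3, 1, 4]])

OTE = demand.sum(1)  # origin trip ends (row totals)
DTE = demand.sum(0)  # destination trip ends (column totals)

# Zone(s) with the fewest origins and the fewest destinations.
fewest_origins = numpy.where(OTE == OTE.min())[0] + 1
fewest_destinations = numpy.where(DTE == DTE.min())[0] + 1

print(fewest_origins)       # [2]
print(fewest_destinations)  # [2]
```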

How do we determine the most diverse zone? What do we mean by this? It could be the zone with the most internal trips, for example; this implies that people are living and working in the same zone and are making shorter trips. The internal trips (i.e. zone 1 to zone 1, zone 2 to zone 2, and so on) lie on the main diagonal, so:

internal_trips = demand.diagonal()
print ( internal_trips )
a = numpy.where(internal_trips==internal_trips.max())
print ( a[0] + 1 )

returns:

[10  2 15  4]
[3]

Firstly, we return the main diagonal of the matrix, then use where to find the location of the maximum. Finally we take the index at which the maximum appears (again adding 1 to return the zone code rather than the array index).
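A slightly shorter sketch of the same idea uses argmax on the diagonal, which returns the first position of the maximum directly. The trace method is also handy here, as it sums the main diagonal to give the total number of internal trips.

```python
import numpy

demand = numpy.array([[10, 2, 9, 18],
                      [3, 2, 2, 1],
                      [2, 1, 15, 17],
                      [1, 3, 1, 4]])

internal_trips = demand.diagonal()

# Zone with the most internal trips (first one, if there is a tie).
zone = internal_trips.argmax() + 1
print(zone)  # 3

# Total internal trips across all zones: 10 + 2 + 15 + 4.
total_internal = demand.trace()
print(total_internal)  # 31
```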

Alternatively, we could say that the most diverse zone is the one with the smallest difference between its origin trip end total and its destination trip end total. If we look at the trip end totals (the sum of each row and the sum of each column) again:

OTE = demand.sum(1)
print ( OTE )
DTE = demand.sum(0)
print (DTE)

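returns:

[39  8 35  9]
[16  8 27 40]

Because demand is an array, both sets of totals come back as one-dimensional arrays of the same length, so they can be subtracted directly. (Had we used the NumPy matrix object, the row totals would come back as a column and the column totals as a row, and one of them would need transposing with .T first.)

We can then do the subtraction:

print ( abs(DTE - OTE) )

returns:

[23  0  8 31]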

We can see that zone 2 has the least difference between origin (8) and destination (8) trips. Zone 4 is the least diverse (40 as a destination and 9 as an origin). Again we could use where to return the actual zone.
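Using where on the differences, in the same way as before, might look like this:

```python
import numpy

demand = numpy.array([[10, 2, 9, 18],
                      [3, 2, 2, 1],
                      [2, 1, 15, 17],
                      [1, 3, 1, 4]])

OTE = demand.sum(1)  # origin trip ends
DTE = demand.sum(0)  # destination trip ends

# Absolute imbalance between trips in and trips out, per zone.
difference = abs(DTE - OTE)

most_diverse = numpy.where(difference == difference.min())[0] + 1
least_diverse = numpy.where(difference == difference.max())[0] + 1

print(most_diverse)   # [2]
print(least_diverse)  # [4]
```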

Applying growth factors

For each zone we can apply a set of growth factors to determine future demand. These growth factors can be obtained from sources such as the National Trip End Model (NTEM). The NTEM uses forecasts of factors such as population, car ownership and employment to determine possible growth in travel demand. As the name implies, the NTEM provides information on trip end growth, that is, the growth that applies to the row and column totals determined above, rather than the growth in trips between specific zones. The process for applying trip end growth factors will be the subject of another blog post.
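To give a flavour of the array operations involved, here is a very simple sketch that scales each row of the matrix by an origin growth factor. The factors are invented for illustration and are not NTEM values. This is also not the full process: it takes no account of destination growth.

```python
import numpy

demand = numpy.array([[10, 2, 9, 18],
                      [3, 2, 2, 1],
                      [2, 1, 15, 17],
                      [1, 3, 1, 4]])

# Hypothetical origin growth factors, one per zone (not taken from NTEM).
origin_growth = numpy.array([1.10, 1.00, 1.20, 1.05])

# Turn the factors into a column so that each factor multiplies a whole row.
future_demand = demand * origin_growth[:, numpy.newaxis]

# The new origin trip ends are the old ones multiplied by the factors.
print(future_demand.sum(1))
```

Scaling rows alone changes the column totals as a side effect, which is why matching origin and destination growth together needs more than a single multiplication.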