A mixed spatially correlated logit model: formulation and application to residential choice modeling

https://doi.org/10.1016/S0191-2615(03)00005-5

Abstract

In recent years, there have been important developments in the simulation analysis of the mixed multinomial logit model as well as in the formulation of increasingly flexible closed-form models belonging to the generalized extreme value class. In this paper, we bring these developments together to propose a mixed spatially correlated logit (MSCL) model for location-related choices. The MSCL model represents a powerful approach to capture both random taste variations and spatial correlation in location choice analysis. The MSCL model is applied to an analysis of residential location choice using data drawn from the 1996 Dallas–Fort Worth household survey. The empirical results underscore the need to capture unobserved taste variations and spatial correlation, both for improved data fit and for the realistic assessment of the effect of sociodemographic, transportation system, and land-use changes on residential location choice.

Introduction

Discrete choice models have a long history of application in the economic, transportation, marketing, and geography fields, among other areas. Most discrete choice models are based on the random utility maximization (RUM) hypothesis. Within the class of RUM-based models, the multinomial logit (MNL) model has been the most widely used structure. The random components of the utilities of the different alternatives in the MNL model are assumed to be independent and identically distributed (IID) with a type I extreme value (or Gumbel) distribution (Johnson and Kotz, 1970; Chapter 21). In addition, the responsiveness to attributes of alternatives across individuals is assumed to be homogeneous after controlling for observed individual characteristics (i.e., the MNL model maintains an assumption of unobserved response homogeneity). For example, in a mode choice model, the MNL model maintains the same utility parameters on the level-of-service (LOS) attributes across observationally identical individuals. These two assumptions together lead to the simple and elegant closed-form mathematical structure of the MNL. However, the assumptions also leave the MNL model saddled with the “independence of irrelevant alternatives” (IIA) property at the individual level (Luce and Suppes, 1965; Ben-Akiva and Lerman, 1985).
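To make the closed form and the IIA property concrete, the following minimal sketch computes MNL probabilities and checks that the ratio of two alternatives' probabilities does not change when a third alternative is removed. The utility values and the function name `mnl_probabilities` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def mnl_probabilities(v):
    """Closed-form MNL probabilities for a vector of systematic utilities v."""
    v = np.asarray(v, dtype=float)
    expv = np.exp(v - v.max())  # subtract the max for numerical stability
    return expv / expv.sum()

# Hypothetical systematic utilities for three alternatives.
v_full = np.array([1.0, 0.5, -0.2])
p_full = mnl_probabilities(v_full)

# Remove the third alternative and recompute.
p_reduced = mnl_probabilities(v_full[:2])

# IIA: the odds of alternative 0 versus alternative 1 depend only on
# their own utilities, so both ratios equal exp(1.0 - 0.5).
ratio_full = p_full[0] / p_full[1]
ratio_reduced = p_reduced[0] / p_reduced[1]
print(ratio_full, ratio_reduced)
```

The two printed ratios coincide, which is exactly the restriction that correlated error structures are meant to relax.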

There are several ways to relax the IID error structure and/or the unobserved response homogeneity assumption. The IID error structure assumption can be relaxed in one of three ways: (a) allowing the random components to be correlated while maintaining the assumption that they are identically distributed (identical, but non-independent, random components), (b) allowing the random components to be non-identically distributed, but maintaining the independence assumption (non-identical, but independent, random components), or (c) allowing the random components to be non-identical and non-independent (non-identical, non-independent, random components). Unobserved response homogeneity may be relaxed in one of two ways: (a) allowing the attribute coefficients to vary randomly due to unobserved factors using a continuous distribution across individuals (random-coefficients approach), or (b) allowing the attribute coefficients to vary randomly due to unobserved factors using a non-parametric discrete distribution across individuals (latent segmentation approach). Within each of the different approaches to relax the IID and unobserved response homogeneity assumptions, there are several different types of model structures that may be used (see Bhat, 2002a, for a detailed discussion). Of these model structures, two classes of models have received particular attention: the generalized extreme value (GEV) class of models and the mixed multinomial logit (MMNL) class of models. These two classes of models are discussed in turn in Section 1.1 (the GEV class of models) and Section 1.2 (the MMNL class of models). Section 1.3 discusses the motivation for combining the GEV class of models with the MMNL class of models in certain empirical circumstances.

The GEV class of models relaxes the IID assumption of the MNL by allowing the random components of alternatives to be correlated, while maintaining the assumption that they are identically distributed (i.e., identical, non-independent, random components). This class of models assumes a type I extreme value (or Gumbel) distribution for the error terms. All the models belonging to the GEV class nest the MNL and result in closed-form expressions for the choice probabilities. In fact, the MNL is also a member of the GEV class, though we will reserve the use of the term “GEV class” to those models that constitute generalizations of the MNL.

The general structure of the GEV class of models was derived by McFadden (1978) from the random utility maximization hypothesis, and generalized by Ben-Akiva and Francois (1983). Several specific GEV models have been formulated and applied, including the nested logit (NL) model (Williams, 1977; McFadden, 1978; Daly and Zachary, 1978), the paired combinatorial logit (PCL) model (Chu, 1989; Koppelman and Wen, 2000), the cross-nested logit (CNL) model (Vovsha, 1997), the ordered GEV (OGEV) model (Small, 1987), the multinomial logit-ordered GEV (MNL-OGEV) model (Bhat, 1998), and the product differentiation logit (PDL) model (Bresnahan et al., 1997). More recently, Wen and Koppelman (2001) proposed a general GEV model structure, which they referred to as the generalized nested logit (GNL) model. Swait (2001) independently proposed a similar structure, which he labeled the choice set generation logit (GenL) model. Wen and Koppelman’s derivation of the GNL model is motivated from the perspective of flexible substitution patterns across alternatives, while Swait’s derivation of the GenL model is motivated from the concept of latent choice sets of individuals. Wen and Koppelman (2001) illustrate the general nature of the GNL formulation by deriving the other GEV models mentioned earlier as special restrictive cases of the GNL model or as approximations to restricted versions of the GNL model. Swait (2001) presents a network representation for the GenL model, which also applies to the GNL model. Bierlaire (2002) has built on this concept and has proposed a very general network structure-based motivation and design of GEV models, which he refers to as the network GEV model.

The GNL model proposed by Wen and Koppelman (2001) is conceptually appealing from a formulation standpoint and allows substantial flexibility. However, in practice, the flexibility of the GNL model can be realized only if one is able and willing to estimate a large number of dissimilarity and allocation parameters. The net result is that the analyst will have to impose informed restrictions on the general GNL model formulation that are customized to the application context under investigation.
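For orientation, the GNL choice probability is commonly written in the form below, which shows where the dissimilarity and allocation parameters enter. The symbols are notation assumed here for illustration and do not appear in the text above: $V_i$ is the systematic utility of alternative $i$, $N_m$ is the set of alternatives in nest $m$, $\alpha_{im}$ is the allocation parameter (the share of alternative $i$ assigned to nest $m$), and $\mu_m$ is the dissimilarity parameter of nest $m$.

$$P_i = \sum_{m} \frac{\left(\alpha_{im} e^{V_i}\right)^{1/\mu_m}}{\sum_{j \in N_m} \left(\alpha_{jm} e^{V_j}\right)^{1/\mu_m}} \cdot \frac{\left[\sum_{j \in N_m} \left(\alpha_{jm} e^{V_j}\right)^{1/\mu_m}\right]^{\mu_m}}{\sum_{l} \left[\sum_{j \in N_l} \left(\alpha_{jl} e^{V_j}\right)^{1/\mu_l}\right]^{\mu_l}}, \qquad \alpha_{im} \ge 0, \quad \sum_{m} \alpha_{im} = 1, \quad 0 < \mu_m \le 1.$$

With one $\mu_m$ per nest and one $\alpha_{im}$ per alternative-nest pair, the number of free parameters grows quickly with the number of nests, which is why restrictions tailored to the application are needed in practice. Setting every $\mu_m = 1$ collapses the expression to the MNL.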

The advantage of all the GEV models discussed above is that they allow relaxations of the independence assumption among alternative error terms, while maintaining closed-form expressions for the choice probabilities.

The MMNL class of models is a generalization of the MNL model. It involves the integration of the MNL formula over the distribution of unobserved random parameters. It takes the structure shown below:

$$P_{qi}(\theta) = \int_{-\infty}^{+\infty} L_{qi}(\beta)\, f(\beta \mid \theta)\, d\beta, \qquad L_{qi}(\beta) = \frac{e^{\beta' x_{qi}}}{\sum_{j} e^{\beta' x_{qj}}}, \tag{1}$$

where $P_{qi}$ is the probability that individual $q$ chooses alternative $i$, $x_{qi}$ is a vector of observed variables specific to individual $q$ and alternative $i$, $\beta$ represents parameters which are random realizations from a density function $f(\cdot)$, and $\theta$ is a vector of underlying moment parameters characterizing $f(\cdot)$.
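Because the integral in the MMNL probability has no closed form, it is typically approximated by averaging the MNL formula over draws of the random parameters. The sketch below is a minimal illustration under stated assumptions: the parameters are independent normal, the draws are plain pseudo-random (a simple stand-in for quasi-random sequences), and the function names, attribute values, and moments are hypothetical.

```python
import numpy as np

def mnl_kernel(beta, x):
    """MNL probabilities for one individual given one realization of beta.

    x has shape (n_alternatives, n_attributes).
    """
    v = x @ beta
    expv = np.exp(v - v.max())  # subtract the max for numerical stability
    return expv / expv.sum()

def simulated_mmnl_probabilities(x, mean, std, n_draws=2000, seed=0):
    """Average the MNL kernel over draws beta ~ Normal(mean, diag(std**2))."""
    rng = np.random.default_rng(seed)
    draws = mean + std * rng.standard_normal((n_draws, mean.size))
    kernels = np.array([mnl_kernel(b, x) for b in draws])
    return kernels.mean(axis=0)

# Hypothetical data: three alternatives described by two attributes.
x = np.array([[1.0, 0.2],
              [0.5, 0.9],
              [0.1, 0.4]])
mean = np.array([0.8, -0.5])  # assumed means of the random coefficients
std = np.array([0.6, 0.3])    # assumed standard deviations

p_sim = simulated_mmnl_probabilities(x, mean, std)
print(p_sim)
```

When every standard deviation is zero, each draw equals the mean and the simulated probabilities reduce to the ordinary MNL probabilities, which is a convenient sanity check.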

The MMNL model structure of Eq. (1) can be motivated from two very different (but formally equivalent) perspectives (see Bhat, 2000). Specifically, a MMNL structure may be generated from an intrinsic motivation to allow flexible substitution patterns across alternatives (error-components structure) or from a need to accommodate unobserved heterogeneity across individuals in their sensitivity to observed exogenous variables (random-coefficients structure). Most importantly, the MMNL class of models can approximate any discrete choice model derived from RUM (including the multinomial probit) as closely as one pleases (see McFadden and Train, 2000). The MMNL model structure is also conceptually appealing and easy to understand since it is the familiar MNL model mixed with the multivariate distribution (generally multivariate normal) of the random parameters (see Hensher and Green, 2001). In the context of relaxing the IID error structure of the MNL, the MMNL model represents a computationally efficient structure when the number of error components (or factors) needed to generate the desired error covariance structure across alternatives is much smaller than the number of alternatives (see Bhat, 2002a).

The MMNL class of models is very general in structure and can accommodate relaxations of both the IID assumption and the unobserved response homogeneity assumption within a simple unifying framework. Consequently, considering a mixed GEV (MGEV) class may appear unnecessary. However, there are instances when substantial computational efficiency gains may be achieved using a MGEV structure. Consider, for instance, a model for household residential location choice. It is possible, if not very likely, that the utility of spatial units that are close to each other will be correlated due to common unobserved spatial elements. A common specification in the spatial analysis literature for capturing such spatial correlation is to allow alternatives that are contiguous to be correlated. In the MMNL structure, such a correlation structure will require the specification of as many error components as the number of pairs of spatially contiguous alternatives. On the other hand, a carefully specified GEV model can accommodate the spatial correlation structure within a closed-form formulation. However, the GEV model structure cannot accommodate unobserved random heterogeneity across individuals. One could superimpose a mixing distribution over the GEV model structure to accommodate such heterogeneity, leading to a parsimonious and powerful MGEV structure.
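A small count illustrates how quickly the number of required error components grows. The sketch below assumes a hypothetical rectangular grid of zones in which two zones are contiguous when they share an edge; the grid layout and the helper name `rook_contiguity` are illustrative assumptions and are unrelated to the zone system used in the paper.

```python
import numpy as np

def rook_contiguity(n_rows, n_cols):
    """Symmetric 0/1 contiguity matrix for zones on a grid sharing an edge."""
    n = n_rows * n_cols
    w = np.zeros((n, n), dtype=int)
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            if c + 1 < n_cols:       # neighbour to the right
                w[i, i + 1] = w[i + 1, i] = 1
            if r + 1 < n_rows:       # neighbour below
                w[i, i + n_cols] = w[i + n_cols, i] = 1
    return w

def count_contiguous_pairs(w):
    """Each unordered contiguous pair appears once above the diagonal."""
    return int(np.triu(w, k=1).sum())

pairs_small = count_contiguous_pairs(rook_contiguity(3, 3))    # 9 zones
pairs_large = count_contiguous_pairs(rook_contiguity(10, 10))  # 100 zones
print(pairs_small, pairs_large)  # 12 180
```

On this toy grid, 100 zones already imply 180 contiguous pairs, so an error-components specification would need 180 components, each adding a dimension to the simulated integral.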

This paper proposes a mixed spatially correlated logit (MSCL) model that uses a GEV-based structure to accommodate correlation in the utility of spatial units, and superimposes a mixing distribution over the GEV structure to capture unobserved response heterogeneity. The GEV structure used in the paper is a restricted version of the GNL model proposed by Wen and Koppelman. Specifically, the GEV structure takes the form of a paired GNL (PGNL) model with equal dissimilarity parameters across all paired nests (each paired nest includes a spatial unit and one of its adjacent spatial units). The MSCL model developed in this paper emphasizes the fact that closed-form GEV-based models and open-form mixed distribution models are not as mutually exclusive as may be the impression in the discrete choice field.
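The paired structure described above can be sketched numerically. The code below is a minimal illustration of a paired GNL with a single dissimilarity parameter, not the authors' implementation. It assumes that each spatial unit is allocated in equal shares to the nests it forms with its adjacent units; this allocation rule, the function name, and the example values are assumptions made here for illustration.

```python
import numpy as np

def paired_gnl_probabilities(v, w, rho):
    """Choice probabilities for a paired GNL with one dissimilarity parameter.

    v   : systematic utilities, shape (n,)
    w   : symmetric 0/1 contiguity matrix with zero diagonal, shape (n, n);
          every unit is assumed to have at least one adjacent unit
    rho : common dissimilarity parameter, 0 < rho <= 1
    """
    v = np.asarray(v, dtype=float)
    w = np.asarray(w, dtype=float)
    # Assumed allocation rule: equal shares across a unit's adjacent pairs.
    alpha = w / w.sum(axis=1, keepdims=True)
    # term[i, j] = (alpha[i, j] * exp(v_i)) ** (1 / rho); zero if not adjacent.
    term = (alpha * np.exp(v - v.max())[:, None]) ** (1.0 / rho)
    nest_sum = term + term.T              # inside-nest sum for the pair (i, j)
    nest_val = nest_sum ** rho            # nest "size" raised to rho
    denom = np.triu(nest_val, k=1).sum()  # sum over all paired nests
    with np.errstate(divide="ignore", invalid="ignore"):
        within = np.where(nest_sum > 0, term / nest_sum, 0.0)
    # P_i = sum over nests (i, j) of P(i | nest) * P(nest).
    return (within * nest_val).sum(axis=1) / denom

# Hypothetical example: three zones in a row (0-1 and 1-2 are adjacent).
w_line = np.array([[0, 1, 0],
                   [1, 0, 1],
                   [0, 1, 0]])
v = np.array([0.3, -0.1, 0.8])

p_corr = paired_gnl_probabilities(v, w_line, rho=0.5)
p_indep = paired_gnl_probabilities(v, w_line, rho=1.0)
print(p_corr, p_indep)
```

With `rho=1.0` the probabilities coincide with the MNL, while smaller values of `rho` induce correlation between adjacent units. Mixing this closed form over random coefficients then requires only as many integration dimensions as there are random coefficients.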

The rest of the paper is structured as follows. The next section discusses the structure, properties, and estimation of the MSCL model. Section 3 discusses an empirical application of the MSCL model to residential location choice. The final section summarizes the important findings from the study.

Model structure and properties

In this section, we first maintain the assumption of unobserved response homogeneity and propose the spatially correlated logit (SCL) model (Sections 2.1 and 2.2). Subsequently, we relax the assumption of unobserved response homogeneity to develop the MSCL model and present the estimation procedure for the MSCL model (Sections 2.3 and 2.4).

Background

The integrated analysis of land-use and transportation interactions has gained renewed interest and importance with the passage of the Inter-modal Surface Transportation Efficiency Act (ISTEA) and the Transportation Equity Act for the 21st Century (TEA-21). In this context, one of the most important household decisions is that of residential location, especially because residential land-use occupies about two-thirds of all urban land and home-based trips account for a large proportion of all

Summary and conclusions

This paper has proposed a MSCL model for the analysis of location-related decisions of individuals and households. The paper submits, and demonstrates, that while the MMNL class of models is very general in structure, there are substantial computational efficiency gains to be achieved by using MGEV structures in spatially correlated choice situations. This is because the number of error components that needs to be specified in the MMNL structure to generate the desired spatial correlation

Acknowledgements

The authors would like to thank the North Central Texas Council of Governments for providing the data used in the analysis. The authors are also grateful to Lisa Weyant for her help in typesetting and formatting this document. This research was partially funded by a Texas Department of Transportation project.

References

  • Ben-Akiva, M., et al., 1983. mu-Homogenous generalized extreme value model. Working paper, Department of Civil Engineering.
  • Ben-Akiva, M., et al., 1985. Discrete-Choice Analysis: Theory and Application to Travel Demand.
  • Bhat, C.R., 2000. Incorporating observed and unobserved heterogeneity in urban work mode choice modeling. Transportation Science.
  • Bhat, C.R. Recent methodological advances relevant to activity and travel behavior.
  • Bhat, C.R., 2002b. Simulation estimation of mixed discrete choice models using randomized and scrambled Halton...
  • Bierlaire, M., 2002. The network GEV model. In: Proceedings of the 2nd Swiss Transportation Research Conference. Monte...
  • Blumen, R., 1994. Gender differences in the journey to work. Urban Geography.
  • Bresnahan, T.F., et al., 1997. Market segmentation and the sources of rents from innovation: personal computers in the late 1980s. RAND Journal of Economics.
  • Chu, C., 1989. A paired combinatorial logit model for travel analysis. In: Proceedings of the Fifth World Conference on...
  • Daly, A.J., et al. Improved multiple choice models.
  • Fotheringham, A.S., 1986. Modeling hierarchical destination choice. Environment and Planning.
  • Gabriel, S.A., et al., 1989. Household location and race: estimates of a multinomial logit model. The Review of Economics and Statistics.
  • Guo, J.Y., Bhat, C.R., 2001. Residential location choice modeling: a multinomial logit approach. Technical paper,...
  • Harris, B., 1996. Land use models in transportation planning: a review of past developments and current practice....
  • Hensher, D.A., 1999. The valuation of travel time savings for urban car drivers: evaluating alternative model...