A high-dimensional multinomial logit model (replication data)

The number of parameters in a standard multinomial logit model increases linearly with the number of choice alternatives and number of explanatory variables. Since many modern applications involve large choice sets with categorical explanatory variables, which enter the model as large sets of binary dummies, the number of parameters in a multinomial logit model is often large. This paper proposes a new method for data-driven two-way parameter clustering over outcome categories and explanatory dummy categories in a multinomial logit model. A Bayesian Dirichlet process mixture model encourages parameters to cluster over the categories, which reduces the number of unique model parameters and provides interpretable clusters of categories. In an empirical application, we estimate the holiday preferences of 11 household types over 49 holiday destinations, and identify a small number of household segments with different preferences across clusters of holiday destinations.
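The parameter-count argument in the abstract can be illustrated with a minimal sketch. It assumes a standard multinomial logit with one base alternative normalized to zero, and it assumes the 11 household types enter as 11 regressors (for example, an intercept plus 10 dummies); the actual specification in the paper may differ. The function names and the toy coefficient matrix are hypothetical, and the sketch only shows how shared coefficient values reduce the number of unique parameters. It is not the Dirichlet process mixture sampler itself.

```python
import math


def mnl_param_count(n_alternatives, n_regressors):
    """Free coefficients in a standard multinomial logit with one base alternative."""
    return (n_alternatives - 1) * n_regressors


def mnl_probabilities(x, beta):
    """Choice probabilities for regressor vector x.

    beta[j] is the coefficient vector of alternative j; the base
    alternative's row is all zeros.
    """
    utilities = [sum(b * xi for b, xi in zip(row, x)) for row in beta]
    shift = max(utilities)  # subtract the maximum for numerical stability
    exp_u = [math.exp(u - shift) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]


def count_unique_parameters(beta, ndigits=8):
    """Number of distinct coefficient values after rounding."""
    return len({round(b, ndigits) for row in beta for b in row})


# Sizes from the empirical application: 49 destinations, 11 household types.
# Treating the household types as 11 regressors is an assumption of this sketch.
J, K = 49, 11
n_full = mnl_param_count(J, K)

# Hypothetical toy example: 4 alternatives, 3 dummy categories.
# Alternatives 1 and 2 share all coefficients (outcome clustering), and
# dummy categories 0 and 1 share coefficients (explanatory clustering).
toy_beta = [
    [0.0, 0.0, 0.0],   # base alternative, normalized to zero
    [0.5, 0.5, -1.0],
    [0.5, 0.5, -1.0],
    [1.2, 1.2, -1.0],
]
toy_probs = mnl_probabilities([1.0, 0.0, 0.0], toy_beta)
toy_free = mnl_param_count(len(toy_beta), len(toy_beta[0]))
toy_unique = count_unique_parameters(toy_beta[1:])  # exclude the base row
```

Under these assumptions, the unrestricted model has `n_full` free coefficients, while the toy matrix shows how two-way clustering leaves only `toy_unique` distinct values out of `toy_free` free coefficients.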

Suggested Citation

Nibbering, Didier (2023): A high-dimensional multinomial logit model (replication data). Version: 1. Journal of Applied Econometrics. Dataset. http://dx.doi.org/10.15456/jae.2023325.2253958331