Although discrete choice demand models are widely used, little is known about the predictive accuracy of different models. To evaluate their performance, we use a series of natural disasters that unexpectedly removed hospitals from consumers' choice sets. We compare each model's predictions of post-disaster behavior against the benchmark of actual post-disaster consumer behavior. Across our different settings, we find that models that allow for flexible interactions between patient characteristics and unobserved hospital quality perform best, and that it is important to use different classes of models. Further, the use of less accurate models could lead to more lax merger enforcement.