No numerical method, however sophisticated, can redeem a deficient design.
[ Continually refreshed experience ]
The need to estimate interaction terms in aggregate choice models has long been discussed in the literature. Those discussions faded in the late 1990s, when Hierarchical Bayes (HB) techniques allowing robust estimation of individual utilities were introduced. Most of the interaction terms found earlier were recognized as consequences of heterogeneity in the sample. It did not take long before an estimation method designed specifically for Choice-Based Conjoint (CBC) became commercially available from Sawtooth Software, Inc.
A choice is an event conditioned by the choice set, so the choice probability is conditional. At the same time, the behavior of each respondent is assumed to differ but to come from some distribution of behavior in the sample. In this view there are two hierarchical levels of conditioning: the upper level relates to the sample distribution, the lower level to the characteristics of the individual. Each of the two relations has a likelihood. The core of the estimation is the maximization of the likelihood obtained (usually) as the product of the two likelihoods.
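The two levels can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular package: it assumes a multinomial logit model for the lower level and independent normal distributions (a diagonal variance matrix) for the upper level, neither of which is stated in the text. All function and parameter names are hypothetical.

```python
import math

def choice_log_likelihood(beta, choice_sets, chosen):
    """Lower level: log-likelihood of one respondent's choices.

    Assumes a multinomial logit model. `choice_sets` is a list of choice
    sets, each a list of alternatives, each a list of attribute codes;
    `chosen` holds the index of the alternative picked in each set.
    """
    total = 0.0
    for alternatives, pick in zip(choice_sets, chosen):
        utilities = [sum(b * x for b, x in zip(beta, alt)) for alt in alternatives]
        top = max(utilities)  # subtract the maximum for numerical stability
        log_denominator = top + math.log(sum(math.exp(u - top) for u in utilities))
        total += utilities[pick] - log_denominator
    return total

def sample_log_density(beta, mean, variance):
    """Upper level: log-density of the part-worths in the sample.

    Assumes independent normal distributions, one per part-worth.
    """
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (b - m) ** 2 / v)
        for b, m, v in zip(beta, mean, variance)
    )

def joint_log_likelihood(beta, choice_sets, chosen, mean, variance):
    # The product of the two likelihoods becomes a sum on the log scale.
    return (choice_log_likelihood(beta, choice_sets, chosen)
            + sample_log_density(beta, mean, variance))
```

Working on the log scale avoids numerical underflow when many small probabilities are multiplied; maximizing the sum of the logs is equivalent to maximizing the product.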
If we drew a respondent from the sample at random and knew the distribution of behavior in the sample, we could predict the behavior of that individual with some probability. We would expect the highest probability at the maximum of the probability density of the sample behavior. The behavior probability density for an individual based on the sample distribution is called the prior (pre-experimental) likelihood.
At the beginning of an estimation we can start with some arbitrary but reasonable sample distribution. Using this distribution, we can modify the parameters of the model for an individual so that the posterior likelihood is maximized. The important property of this step is that the likelihood maximum for an individual is shifted away from the maximum for the sample. The more 'unique' the respondent is, the farther the shift from the sample mean. The density of the new distribution for an individual is termed the posterior (post-experimental) likelihood. The new maximum is adopted as the point estimate for the respondent. The new point estimates are then used to update the sample distribution, which in turn serves to estimate the prior likelihood for each individual. The process is repeated until the point estimates no longer change for any individual.
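The loop described above can be sketched as follows. This is a toy illustration under stated assumptions, not a production estimator: it assumes a multinomial logit choice model, independent normal sample distributions, plain gradient ascent for the individual step, and a variance floor (an added safeguard against a collapsing sample distribution). All names and the example data are hypothetical.

```python
import math

def choice_gradient(beta, choice_sets, chosen):
    """Gradient of the multinomial-logit log-likelihood for one respondent."""
    grad = [0.0] * len(beta)
    for alternatives, pick in zip(choice_sets, chosen):
        utilities = [sum(b * x for b, x in zip(beta, alt)) for alt in alternatives]
        top = max(utilities)
        weights = [math.exp(u - top) for u in utilities]
        total = sum(weights)
        for k in range(len(beta)):
            expected = sum(w * alt[k] for w, alt in zip(weights, alternatives)) / total
            grad[k] += alternatives[pick][k] - expected
    return grad

def posterior_mode(start, choice_sets, chosen, mean, variance, steps=200):
    """Maximize log-likelihood + log-prior for one respondent by gradient ascent."""
    # Step size from an upper bound on the curvature, so every step ascends.
    design_norm = sum(x * x for alts in choice_sets for alt in alts for x in alt)
    step = 1.0 / (0.5 * design_norm + 1.0 / min(variance))
    beta = list(start)
    for _ in range(steps):
        grad = choice_gradient(beta, choice_sets, chosen)
        # The prior pulls each part-worth back toward the sample mean.
        beta = [b + step * (g - (b - m) / v)
                for b, g, m, v in zip(beta, grad, mean, variance)]
    return beta

def estimate(respondents, n_params, max_rounds=200, tol=1e-6, variance_floor=1e-3):
    """Alternate between individual point estimates and the sample distribution.

    `respondents` is a list of (choice_sets, chosen) pairs.
    """
    mean = [0.0] * n_params       # arbitrary but reasonable starting distribution
    variance = [1.0] * n_params
    betas = [list(mean) for _ in respondents]
    for _ in range(max_rounds):
        new_betas = [posterior_mode(b, sets, picks, mean, variance)
                     for b, (sets, picks) in zip(betas, respondents)]
        change = max(abs(new - old)
                     for nb, ob in zip(new_betas, betas)
                     for new, old in zip(nb, ob))
        betas = new_betas
        count = len(betas)
        # Update the sample distribution from the new point estimates.
        mean = [sum(b[k] for b in betas) / count for k in range(n_params)]
        variance = [max(sum((b[k] - mean[k]) ** 2 for b in betas) / count,
                        variance_floor)
                    for k in range(n_params)]
        if change < tol:          # stop when no point estimate moves any more
            break
    return betas, mean, variance

# Hypothetical data: one attribute, six identical two-alternative choice sets.
design = [[[1.0], [0.0]]] * 6
respondents = [
    (design, [0] * 6),              # always picks the alternative with the attribute
    (design, [1] * 6),              # never picks it
    (design, [0, 1, 0, 1, 0, 1]),   # indifferent
]
betas, mean, variance = estimate(respondents, n_params=1)
```

In this toy data the first two respondents never deviate from their preferred alternative, so an individual maximum likelihood estimate would diverge. The sample distribution keeps their point estimates finite. Because the variance update here uses only the spread of the point estimates, it ignores the uncertainty of each individual estimate and can shrink toward zero when the data are weak, hence the floor.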
This general approach is known as the empirical Bayes method. The prior distribution in the HB method can be thought of as a flexible constraint that gets stronger the farther the estimated parameters for an individual move from the estimated parameters for the sample. The ability of HB techniques to use the sample properties, and to keep the individual estimates inside the credible interval of the sample, is sometimes described as "borrowing information from the sample".
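The "flexible constraint" can be made explicit under one common assumption, namely that the sample distribution of part-worths is multivariate normal. The symbols below are notation introduced here for illustration: $\beta_i$ is the part-worth vector of respondent $i$, $y_i$ are that respondent's choices, and $\mu$ and $\Sigma$ are the sample mean and variance matrix.

```latex
\log p(\beta_i \mid y_i, \mu, \Sigma)
  = \log L(y_i \mid \beta_i)
  - \tfrac{1}{2}\,(\beta_i - \mu)^{\top} \Sigma^{-1} (\beta_i - \mu)
  + \text{const}
```

The second term is a penalty that grows quadratically with the distance of $\beta_i$ from $\mu$, measured in units of the sample variance. A respondent's estimate moves far from the sample mean only when the choice data of that respondent support the move strongly enough to outweigh the penalty.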
HB-MCMC, a top-level variant of the empirical Bayes method, is the workhorse for estimating DCM models. It has many advantages over the traditional (non-Bayesian) aggregate maximum likelihood estimation methods.
The HB-MCMC estimation technique has only minor disadvantages relative to its advantages. Most of them stem from the data rather than from the technique itself.
A number of methods for maximizing the posterior choice likelihood have been suggested. Some variants are closely related to latent class or latent factor methods, e.g. mixture regression models, which have been shown to give results similar to the HB-MCMC technique. One objection is the lower certainty of reaching the global optimum, owing to the presence of local maxima or saddle points. Several restarts are usually required.
A simple variant of restricted HB-MLE, the restriction being a diagonal variance matrix of the sample part-worth estimates, has been found useful for nonlinear formulations of part-worths of value-based attributes. Because it is less robust than a method with a full sample variance matrix, it is not suitable for sparse data, e.g. from a short study with only a few choices. The estimation procedure can be implemented with the programming tools available in standard statistical packages, but it must be programmed for each study individually.