DCM assumptions in market research

The central notion in all DCM-related models and methods is choice probability and its relationship to utility.

A choice of an item from a set of items has some probability, and the actual choice is a realization of this probability. The probability model used to interpret the choices is RUT - Random Utility Theory (Louis Leon Thurstone, 1927). The particular behavioral models that form the essence of choice modeling differ only in the conditions and assumptions they consider. In market research (MR), a model is usually anticipated before the actual experiment is run. The same is true of the experimental design and the estimation method that correspond to the model. There is seldom enough time to test several models, designs, and methods, discriminate between them, and find the best one. A study is therefore planned to match the assumed model, which requires a thorough knowledge of the problem.
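The idea of a choice as a realization of a probability can be sketched in a few lines of Python. The sketch assumes that each item's utility is a fixed systematic part plus an independent normal disturbance; the utilities, the noise level, and the function name are illustrative assumptions, not taken from any particular study.

```python
import random


def simulate_choices(systematic_utilities, n_draws=20000, noise_sd=1.0, seed=0):
    """Estimate choice probabilities by repeatedly drawing random utilities."""
    rng = random.Random(seed)
    counts = [0] * len(systematic_utilities)
    for _ in range(n_draws):
        # Random utility = systematic part + random disturbance (assumed normal)
        drawn = [v + rng.gauss(0.0, noise_sd) for v in systematic_utilities]
        # The realized choice is the item with the highest drawn utility
        counts[drawn.index(max(drawn))] += 1
    return [c / n_draws for c in counts]


# Hypothetical systematic utilities of three items
probabilities = simulate_choices([1.0, 0.5, 0.0])
print(probabilities)
```

The relative frequencies approximate the choice probabilities implied by the assumed utilities; a different assumption about the disturbance leads to a different behavioral model.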

A well designed choice-based exercise complies with the four requirements of a good questionnaire:


Discrete choice experiments

Experimental methods of market research (MR) based on DCM rely on a choice from a set of items, i.e. a simulated action conditional on the set of stimuli presented. The stimuli may range from real to completely fictional. The most common sets are made of products, product profiles, attributes, aspects, properties, statements, etc.

A choice is a natural manifestation of human behavior. It is much less culturally or socially dependent than a metric (evaluation-based) statement expressing a preference, attitude, or value. Especially when differences between alternatives are small, choice methods are likely to be more sensitive and provide more reliable information than ratings. In addition, consistency checks are available that facilitate the identification of respondents who do not have well-defined preferences. Experimental conditions can be adjusted to mimic the real purchase conditions. The concepts of interest, their attributes, and the presence or absence of competing concepts can be considered jointly, and the preferences analyzed.

The use of choice experiments is an alternative to asking direct questions. In some cases, however, results obtained from a standard way of interviewing can be transformed to a format equivalent to that available from a choice experiment and analyzed accordingly. The reverse transformation, usually as part of the data interpretation in the form of a simulation of particular events or their equilibrium, is also possible.

A common problem in interpreting an experiment is that the experimental data do not reflect what they were expected to reflect. Psychology names, in the PEBSE model, five important traits influencing human behavior:

One can learn a lot about personality and expectations from a research interview, but only very little about the influence of the environment or situation. The behavior in an experiment and conclusions derived from it cannot reflect the market behavior in its completeness. The results are always subject to the experimental conditions and the range of collected pieces of information.


Modeling choice

A DCM behavioral model should satisfy constraints inherent to the given type of items and the conditions under which the choice is made so that the estimated model reflects the observed behavior. The model should allow for the following constraints common in MR:

Choice-based modeling in marketing research draws on the achievements in understanding human behavior and advances in numerical methods. A behavioral model lacking a feasible way to reliably estimate its parameters would be useless. The following theoretical foundations are essential.

The probabilistic approach provides the marketing researcher with a powerful tool for analyzing and predicting consumer choice behavior. Market (revealed) and/or experimental (stated) research data can be combined and analyzed. This does not mean that all problems can be solved with a single model, or with the same ease. Many acute problems still resist solution.


IIA - Independence from Irrelevant Alternatives axiom 

Choice postulates (Luce, 1958) assign a choice probability to each alternative in a choice set made of uncorrelated items. In the case of such an unstructured choice set, the postulates lead to the well-known logit formula, also known as the BTL - Bradley-Terry-Luce model. Another consequence of the postulates is the IIA - Independence from Irrelevant Alternatives axiom, which deserves a short note.
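For reference, the logit formula can be written as follows. The symbols are assumptions introduced here for illustration: C is the choice set, u_i is the utility of item i, and P(i | C) is the probability that item i is chosen from C.

```latex
P(i \mid C) = \frac{\exp(u_i)}{\sum_{j \in C} \exp(u_j)}
```

The ratio of two such probabilities, P(i | C) / P(k | C) = exp(u_i - u_k), does not contain any other item of the set.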

The outcome of IIA is that, for an individual, the ratio of the choice probabilities of any two items is constant in any choice set, regardless of the presence or absence of other items in the set. The other items are therefore called irrelevant. In a simulation, preference shares for an individual are usually set equal to the conditional choice probabilities given the choice set, independently of other individuals. This removes the part of the IIA irrelevancy that would act between respondents. However, the irrelevancy is still present within the simulation for each individual respondent. The false irrelevancy of the items is hidden behind the share averages, which do not have constant ratios because preferences vary between individuals.
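The constant ratio can be checked numerically. The following is a minimal sketch for a single individual, with hypothetical utilities and a helper name (logit_shares) chosen only for this illustration.

```python
import math


def logit_shares(utilities):
    """Logit choice probabilities for a dict mapping item -> utility."""
    denominator = sum(math.exp(u) for u in utilities.values())
    return {item: math.exp(u) / denominator for item, u in utilities.items()}


# Hypothetical utilities of one individual; item C is added to the second set
small_set = logit_shares({"A": 1.0, "B": 0.2})
large_set = logit_shares({"A": 1.0, "B": 0.2, "C": 0.7})

ratio_small = small_set["A"] / small_set["B"]
ratio_large = large_set["A"] / large_set["B"]
print(ratio_small, ratio_large)  # both equal exp(1.0 - 0.2)
```

Adding item C lowers the probabilities of both A and B, but their ratio stays the same.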

Some products, irrespective of their different brands or other extrinsic properties, are close substitutes. They are often found among CPG/FMCG products and are understood as indistinguishable commodities. When present in a real choice set, they are taken as a bulk alternative and the decision-maker selects one of them at random. In a simulation, if only one such product is present in the set, it gets some share. If an equivalent product is added, the sum of the simulated shares for the two products is nearly twice as high as for a single product. In reality, the sum might be only slightly higher than the share of a single product. This result is due to the IIA assumption that the two products are irrelevant alternatives to each other. This unfavorable property is often illustrated by the "red/blue bus" problem.
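The effect of adding an equivalent product can be reproduced with the logit formula for one individual. The product names and utilities below are hypothetical and serve only to show the mechanism.

```python
import math


def logit_shares(utilities):
    """Logit choice probabilities for a dict mapping item -> utility."""
    denominator = sum(math.exp(u) for u in utilities.values())
    return {item: math.exp(u) / denominator for item, u in utilities.items()}


# Hypothetical utilities: two branded products and one commodity
before = logit_shares({"brand X": 2.0, "brand Y": 1.5, "commodity 1": 0.0})
# The same set with an exact copy of the commodity added
after = logit_shares(
    {"brand X": 2.0, "brand Y": 1.5, "commodity 1": 0.0, "commodity 2": 0.0}
)

commodity_before = before["commodity 1"]
commodity_after = after["commodity 1"] + after["commodity 2"]
growth = commodity_after / commodity_before
print(commodity_before, commodity_after, growth)
```

The combined share of the two commodities comes out close to twice the share of the single one, although in reality the copy would mostly take buyers from the original.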

Nearly all analyses in MR are done using the logit model with its inherent IIA property. Its effect in simulations can be eliminated by using the first-choice method. However, for the result to be reliable, a sufficiently large data sample is required. For the smaller data sets encountered in MR, Sawtooth Software, Inc. developed the RFC - Randomized First Choice simulation method. Another approach might be based on finding a reasonable size of the consideration set of simulated products for each individual.
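A first-choice simulation can be sketched as follows: each respondent is assigned entirely to the product with the highest utility, and ties are split at random. The individual utilities are randomly generated stand-ins for estimated ones, and the function name is an assumption. The sketch shows only the plain first-choice rule, not the RFC method.

```python
import random


def first_choice_shares(respondent_utilities, rng):
    """Share of respondents whose highest utility falls on each product."""
    products = list(respondent_utilities[0])
    counts = dict.fromkeys(products, 0)
    for utilities in respondent_utilities:
        best = max(utilities.values())
        # Products with identical top utility are tied; pick one at random
        tied = [p for p in products if utilities[p] == best]
        counts[rng.choice(tied)] += 1
    n = len(respondent_utilities)
    return {p: counts[p] / n for p in products}


rng = random.Random(1)
# Hypothetical individual utilities for 1000 respondents
base = [
    {"brand X": rng.gauss(1.0, 1.0), "commodity 1": rng.gauss(0.0, 1.0)}
    for _ in range(1000)
]
# Add an exact copy of the commodity for every respondent
with_copy = [{**u, "commodity 2": u["commodity 1"]} for u in base]

shares_before = first_choice_shares(base, rng)
shares_after = first_choice_shares(with_copy, rng)
print(shares_before)
print(shares_after)
```

Here the two commodities together keep the share of the original one and the branded product loses nothing, but every share is built from whole respondents, which is why a large sample is needed.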

Discrete choice versus metric models

If the estimation of the behavioral model parameters is based on ratings of stimuli used in the experiment, the method is called metric. If we were able to obtain reliable (sufficiently accurate and precise) and unequivocally interpretable metric data, as is the case in some other experimental sciences, no doubt the metric approach would be unbeatable. As MR deals with a great deal of randomness and uncertainty, a probabilistic approach suggests itself. Each of the approaches has its caveats.

Metric approach
  • The most distinctive disadvantage is an uncontrollable bias. The main reason is that there is hardly any reasonable transformation of a value stated in an interview (a percentage, a point on a Likert scale, or a normalized centroid value) into the probability of a real event such as a purchase. The transformation function, if found, is nearly always nonlinear. Neglecting the nonlinearity, which is often associated with an asymmetrical distribution of the experimental values, biases the estimated model parameters. If no countermeasures are taken, the computed values are often unrealistic.
  • The values of selected answers are usually accumulated in some region of the scale. The repeated use of the same value on the scale removes discrimination in this region of values and leads to loss of information.
  • Combining results from several relatively independent metric data blocks into a common model is problematic. The metrics often have different scaling and unknown distributions.

Probabilistic approach
  • The most distinctive disadvantage is the implicit scaling factor of obtained parameters that is often ambiguous and virtually inestimable. There is only a partial remedy. If the permissible probabilities of simulated actions are known, the common scale of the parameters can be adjusted. This process is called calibration.
  • Since a choice is an event with no direct quantitative measure assigned to it, the amount of available information is usually much lower than from a comparable metric-based question. More choices per respondent, more respondents, or both are required to obtain a sufficient amount of information. This is more than outweighed by the generally lower bias of the estimated parameters.
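The calibration mentioned in the first point above can be illustrated by a simple search for a common scale factor. The sketch assumes that the share of one item (here the most attractive one) is known from outside the experiment; the utilities, the target share, and the function names are hypothetical, and bisection is just one convenient way to solve for the scale.

```python
import math


def logit_shares(utilities, scale=1.0):
    """Logit shares with all utilities multiplied by a common scale factor."""
    top = max(utilities)
    weights = [math.exp(scale * (u - top)) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def calibrate_scale(utilities, target_share_of_best, lo=0.0, hi=50.0, tol=1e-10):
    """Bisection for the scale at which the top item reaches the target share."""
    best = utilities.index(max(utilities))
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # The share of the top item grows monotonically with the scale
        if logit_shares(utilities, mid)[best] < target_share_of_best:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


utilities = [2.0, 1.0, 0.0]  # hypothetical estimated utilities
scale = calibrate_scale(utilities, target_share_of_best=0.5)
print(scale, logit_shares(utilities, scale))
```

A scale below 1 compresses the differences between shares and a scale above 1 expands them; the order of the items does not change.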

DCM-based conjoint

The term conjoint is believed to be an acronym for "considered jointly". It is assigned to a rich group of methods allowing extraction of quantitative effects of the stimuli aspects influencing the respondent's responses in the experiment. The sets of mutually exclusive aspects are known as factors in the standard analysis of variance and are called attributes in MR. A stimulus is composed of aspects each coming from a single attribute. The most frequent stimuli are products or services.

All conjoint methods have common features described in Conjoint Method Overview. The most straightforward application of DCM is CBC - Choice Based Conjoint.

Results from a conjoint exercise with more than about 6 attributes lose reliability. Various hybrid methods exist to increase the number of attributes, respect the natural correlations between attribute levels (typically overall quality vs. price), make the stimuli realistic, use optional attribute levels, assess acceptability of attribute levels, estimate willingness to spend, etc. The Concept and Package Tests method is available for a full product concept line prepared in advance. The CBCT - Choice Based Concept Test has been developed for concepts with a small number of attributes to be randomized. When the concepts cannot be provided in advance and only the attributes are known, CSDCA - Common Scale Discrete Choice Analysis is a suitable screening method. Calibration can be included in all the hybrid methods.

Maximum difference scaling

There is a special case of a DCM design with just a single attribute. In contrast to conjoint, the levels of the attribute can be mutually exclusive. The goal is to differentiate between the levels and determine their preferences. A choice-based version is known as the MXD - Maximum Difference Scaling method, or MaxDiff for short. An indisputable advantage of MaxDiff is the possibility of testing a large number of items. Various flavors of so-called Bandit MaxDiff have been suggested.

The SCE - Sequential Choice Exercise is a faster alternative to MaxDiff. It is based on the ranking of a full or reduced set of items and is suitable for about 16 items at most.

DCM-based concept test

A common task in MR is an overall evaluation of product concepts with fixed properties. The interest is often in the competitive potential of the concepts. As a standard method, the concepts are randomized and presented to respondents who rate them. The ratings averaged over respondents make the basis for a metric evaluation and interpretation.

Both the MXD and SCE methods mentioned above can be used in a concept test. The obtained preferences provide additional discrimination between concepts that are rated equally. To estimate the competitive potential of the concepts in terms of purchase intentions, calibration of the obtained preferences is applied.

Combining choice data blocks

A metric-based merging of conjoint utility blocks via several common attributes known as "bridging" often leads to disappointing results. The problem is in different scaling factors in the metric blocks that can be adjusted only with a serious distortion of preferences between the items. In contrast, a merging of choice data can profit from the inherent flexibility of the between-blocks scaling.

Several DCM blocks in a survey can be designed to share some features. The invaluable gain from using the DCM approach is the possibility of seamlessly merging the data from several DCM blocks in a common model.

Merging several blocks of choice data (rather than parameters estimated before the merging), each with a different measurement "sensitivity", is possible because the estimation procedure implicitly rescales (compresses or expands) the parameters of each block so that the parameters for all the blocks match. The scaling factors of the choice data blocks automatically adapt to one another. A disadvantage is a potential distortion of some estimates if the constraints between the blocks are not strong enough. As a precaution, both the design of the blocks and the estimation procedure require certain restrictions to be imposed. Typical examples are a merging of an SCE - Sequential Choice Exercise or an MXD - Maximum Difference Scaling of fixed product profiles designed in a managerial way, possibly used as a filter in CBS - Choice Based Sampling, with several class-based CBC - Choice-Based Conjoint blocks of pseudo-randomly modified products. These procedures are the essence of the CBCT - Choice-Based Concept Test and CSDCA - Common Scale Discrete Choice Analysis mentioned previously.

The problem of matching DCM outcomes with aggregated metric data, e.g. with market data (revealed preferences), does not have a straightforward solution. The same is true for optional features of a product. Nevertheless, if one can assume independence between the core product properties and those of the available options, the probabilistic character of choices allows a model founded on conditional probabilities (a path model) to be derived.
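The arithmetic of such a path model is the multiplication of a core choice probability by the probability of an option conditional on that core. All product names and probabilities in the following sketch are hypothetical.

```python
# Hypothetical probabilities of choosing each core product
core_probability = {"basic": 0.6, "premium": 0.4}

# Hypothetical probabilities of each option, conditional on the chosen core
option_given_core = {
    "basic": {"none": 0.7, "warranty": 0.3},
    "premium": {"none": 0.4, "warranty": 0.6},
}

# Joint probability of a path = P(core) * P(option | core)
joint = {
    (core, option): p_core * p_option
    for core, p_core in core_probability.items()
    for option, p_option in option_given_core[core].items()
}

# Overall probability that the warranty is taken, summed over core products
warranty_total = sum(p for (core, option), p in joint.items() if option == "warranty")
print(joint)
print(warranty_total)
```

The joint probabilities over all paths sum to one, so they can be read directly as simulated shares of the core and option combinations.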

Estimation of the competitive potential of a core product and its selectable options is possible with MBC - Menu Based Choice. The method relies on the choice of a core product in a standard CBC, and, conditionally, in a modified volumetric conjoint method that allows for modeling saturation of needs of the options.

Accuracy and precision

Choice-based methods are very efficient in the interpretation and prediction of high-involvement events. Results for low-involvement events are often less reliable but still clearly superior to those of metric methods. The reason is simple. In a classic metric-based method, an item is usually rated no more than once. In a choice-based method, it can be present in many sets and thus be evaluated many times under different conditions and contexts.

Individual-based estimates are very sensitive to the design of choice sets, in particular to the orthogonality and balance of the sets. The number of degrees of freedom of the estimated system should always be higher than the number of estimated parameters. Hierarchical Bayes and other methods that rely on aggregate sample properties, and so allow estimation of more individual parameters than there are degrees of freedom per individual, cannot correct a design deficiency. The bias of the estimates, in particular, may be unpredictable.
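A rough check of this condition for a single respondent might look as follows. The sketch assumes a main-effects model with one reference level per attribute and counts the degrees of freedom as the number of tasks times the number of alternatives minus one; both are simplifying assumptions made here, and the design figures are hypothetical.

```python
def main_effect_parameters(levels_per_attribute):
    """Number of part-worths in a main-effects model (one reference level each)."""
    return sum(levels - 1 for levels in levels_per_attribute)


def choice_degrees_of_freedom(n_tasks, n_alternatives):
    """Rough count: each task yields (alternatives - 1) independent comparisons."""
    return n_tasks * (n_alternatives - 1)


levels = [4, 3, 3, 5]  # hypothetical design with four attributes
parameters = main_effect_parameters(levels)
freedom = choice_degrees_of_freedom(n_tasks=12, n_alternatives=4)
print(parameters, freedom, freedom > parameters)
```

A count of this kind says nothing about orthogonality or balance, so it is a necessary check rather than a sufficient one.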

A frequent problem encountered in DCM is the varying and often unchecked (or even inestimable) precision of estimates. The levels that have been selected least often, e.g. the least attractive ones in CBC - Choice-Based Conjoint, have the lowest precision for a simple reason: little data is available. Fortunately, this is not a big problem in a simulation, since such a product will have only a small share. However, the simulation may be useless when a low part-worth level is combined with a high part-worth level. The problem lies directly in the additive kernel of conjoint. Due to error propagation, a sum of part-worths always has lower precision than any part-worth alone. When a low part-worth is "overcome" by a high one, the computed item utility may be completely wrong.
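The error propagation can be illustrated under the simplifying assumption that the errors of the part-worths are independent, in which case their variances add. The standard errors below are hypothetical.

```python
import math


def utility_standard_error(part_worth_errors):
    """Standard error of a sum of part-worths, assuming independent errors."""
    return math.sqrt(sum(se ** 2 for se in part_worth_errors))


# Hypothetical standard errors: a well-estimated level and a rarely chosen one
errors = [0.1, 0.6]
total_error = utility_standard_error(errors)
print(total_error)
```

The error of the sum is dominated by the least precise part-worth, so a single poorly estimated level degrades the utility of every product that contains it.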

An unduly attractive item always has an unwanted effect. When such an item is present in a choice task, the information is lost for all items in the given choice set. Inappropriately attractive items, or the combinations of levels that make them so, always have a strongly detrimental effect on the estimates and should be strictly avoided. The selection and format of items for a concept test or a MaxDiff study are fully in the hands of the project designer. In a conjoint study, the problems can be efficiently tackled using a design based on product classes. The use of prohibited combinations of attribute levels should be avoided whenever possible.

Deployment of robust design and estimation methods in conjoint studies is important. Commercially available programs (e.g. from Sawtooth Software, Inc.) have parametric options that help to keep tight reins on the design and computation processes.