DCM assumptions in market research

The central notion in all DCM-related models and methods is choice probability and its relationship to utility.

A choice of an item from a set of items has some probability. The actual choice is a realization of this probability. The probability model used to interpret the choices is RUT - Random Utility Theory (Louis Leon Thurstone, 1927). The particular behavioral models that are the essence of choice modeling differ only in the conditions and assumptions considered. In market research (MR), a model is usually chosen before the actual experiment is run, and the same is true for the experimental design and estimation method that correspond to the model. There is seldom enough time to test several models, designs and methods, discriminate between them and find the best one. A study is planned to match the assumed model, which requires a thorough knowledge of the problem.
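
As a minimal illustration of the random-utility view, the sketch below (with made-up utilities) draws Gumbel-distributed errors, takes each realized choice as the alternative with the highest total utility, and shows that the observed choice shares settle at stable probabilities.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative systematic utilities of three alternatives in one choice set.
V = np.array([1.2, 0.8, 0.3])

def simulate_choice_shares(V, n_draws=100_000):
    """Random Utility Theory: U_j = V_j + e_j; the chosen item maximizes U."""
    e = rng.gumbel(size=(n_draws, V.size))   # random part of the utility
    chosen = (V + e).argmax(axis=1)          # each draw is one realized choice
    return np.bincount(chosen, minlength=V.size) / n_draws

# Individual choices are random realizations, but the shares are stable
# probabilities determined by the utilities.
print(simulate_choice_shares(V))
```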

A well-designed choice-based exercise complies with the four requirements of a good questionnaire:

 

Discrete choice experiments

Experimental methods of market research (MR) based on DCM rely on choices from sets of items, i.e. simulated actions conditional on some set of stimuli being presented. The stimuli may range from real to completely fictional. The most common sets are made of products, product profiles, attributes, properties, statements, etc.

Choice is a natural manifestation of human behavior. It is much less culturally or socially dependent than a metric (evaluation-based) statement expressing a preference, attitude or value. Especially when differences between alternatives are small, choice methods are likely to be more sensitive and provide more reliable information than ratings. In addition, internal consistency checks are available that facilitate the identification of respondents who do not have well-defined preferences. Experimental conditions can be easily adjusted to mimic real purchase conditions. The concepts of interest, their attributes, and the presence or absence of competing concepts can be considered jointly and the preferences analyzed.

Use of choice experiments is an alternative to asking direct questions. In some cases, however, results obtained from standard interviewing can be transformed into a format equivalent to that available from a choice experiment and analyzed accordingly. The reverse transformation, usually as a part of data interpretation in the form of a simulation of particular events or their equilibrium, is also possible.

A common problem in interpreting an experiment is that the experimental data do not reflect what they were expected to reflect. Psychology, in the PEBSE model, identifies five important factors influencing human behavior:

One can learn a lot about personality and expectations from a research interview, but only very little about the influence of environment or situation. The behavior in an experiment, and the conclusions derived from it, cannot reflect market behavior in its entirety. The results are always subject to the experimental conditions and to the range of information collected.


Modeling choice

A DCM behavioral model should satisfy the constraints inherent to the given type of items and the conditions under which the choice is made, so that the estimated model parameters reflect the observed behavior. The model should allow for the following constraints common in MR:

Choice-based modeling in marketing research draws on the achievements in understanding human behavior and advances in numerical methods. A behavioral model without a feasible way to reliably estimate its parameters would be useless. The following theoretical foundations are essential.

The probabilistic approach provides the marketing researcher with a powerful tool for analyzing and predicting consumer choice behavior. Market (revealed) and/or experimental (stated) research data can be combined and analyzed. This does not mean all problems can be solved with a single model, or with the same ease. Many acute problems still show stubborn resistance.

 

IIA - Independence from Irrelevant Alternatives axiom 

The choice postulates (Luce, 1958) assign a choice probability to each alternative in a choice set made of uncorrelated items. In the case of such an unstructured choice set, the postulates lead to the well-known logit formula, also known as the BTL - Bradley-Terry-Luce model. Another consequence of the postulates is the IIA - Independence from Irrelevant Alternatives axiom, which deserves a short note.
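
A minimal sketch of the BTL/logit formula and of the IIA consequence it implies; the utility values are purely illustrative.

```python
import numpy as np

def btl_probabilities(utilities):
    """BTL / multinomial logit: P(i | S) = exp(u_i) / sum_j exp(u_j)."""
    expu = np.exp(np.asarray(utilities, dtype=float))
    return expu / expu.sum()

u = {"A": 1.0, "B": 0.5, "C": -0.2}                 # illustrative utilities
p_full = btl_probabilities(list(u.values()))
print(dict(zip(u, p_full.round(3))))

# IIA: the ratio P(A)/P(B) = exp(u_A - u_B) is the same in every choice set,
# regardless of which other ("irrelevant") items are present.
p_pair = btl_probabilities([u["A"], u["B"]])
print(p_full[0] / p_full[1], p_pair[0] / p_pair[1])
```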

The consequence of IIA is that, for an individual, the ratio of the choice probabilities of any two items is constant in any choice set, regardless of the presence or absence of other items in the set. The other items are therefore called irrelevant. In a simulation, preference shares for an individual are usually taken equal to the conditional choice probabilities given the choice set, independently of other individuals. This removes part of the IIA irrelevancy of items between respondents, but it is still present within the simulation for each individual respondent. The false irrelevancy of the items is hidden behind the share averages, which do not have constant ratios because the preferences of individuals vary.

Some products, irrespective of their different brands or other extrinsic properties, are close substitutes. They are often found among CPG/FMCG products and are perceived as a commodity rather than as products of different utilities. When present in a real choice set, they are treated as a single bulk alternative by the decision maker, who selects one of them at random. In a simulation, if only one such product is present in the set, it gets some choice probability. If an equivalent product is added to the choice set, the sum of the computed choice probabilities for the two products is always higher than for the single product in the previous case. To represent reality, it should be about the same. This result is due to the IIA assumption that both products are irrelevant alternatives. This unfavorable property is commonly illustrated by the "red bus/blue bus" problem.
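
The effect follows directly from the logit formula. A hypothetical sketch with a car and one or two practically identical buses (equal utilities) shows how the duplicated alternative inflates the combined share:

```python
import numpy as np

def logit_shares(utilities):
    expu = np.exp(np.asarray(utilities, dtype=float))
    return expu / expu.sum()

# A car (utility 0.5) against a single bus (utility 0.0)...
print(logit_shares([0.5, 0.0]))

# ...and against two practically identical buses with the same utility.
shares = logit_shares([0.5, 0.0, 0.0])
print(shares)
print(shares[1] + shares[2])   # the two buses together take far more share
                               # than the single bus did, although decision
                               # makers treat them as one bulk alternative
```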

The logit model gives the IIA property to all alternatives in a set, while a nested ("mother") logit gives it only to the alternatives within each hierarchically lowest nest. Of all share simulations, only the first-choice method completely avoids the influence of IIA. However, for the results to be representative, a sufficiently large sample of individuals is required. To alleviate this, Sawtooth Software, Inc. developed the RFC - Randomized First Choice simulation method. Another approach might be based on a reasonably sized consideration set for an individual.
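
The general idea behind a randomized-first-choice style simulator can be sketched as follows; this is only an illustration of repeatedly perturbing individual utilities and accumulating first choices, not Sawtooth Software's actual implementation, and all figures are made up.

```python
import numpy as np

rng = np.random.default_rng(1)

def first_choice_shares(utilities):
    """Pure first choice: each respondent's highest-utility item gets the vote."""
    winners = utilities.argmax(axis=1)
    return np.bincount(winners, minlength=utilities.shape[1]) / len(winners)

def randomized_first_choice_shares(utilities, n_draws=200, noise_scale=1.0):
    """Perturb the utilities with random error and average first choices."""
    n_resp, n_items = utilities.shape
    counts = np.zeros(n_items)
    for _ in range(n_draws):
        noisy = utilities + rng.gumbel(scale=noise_scale, size=(n_resp, n_items))
        counts += np.bincount(noisy.argmax(axis=1), minlength=n_items)
    return counts / counts.sum()

U = rng.normal(size=(500, 4))        # illustrative individual utilities
print(first_choice_shares(U))
print(randomized_first_choice_shares(U))
```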


Discrete choice versus metric models

If the estimation of the behavioral model parameters is based on quantitative evaluations (ratings) of the stimuli used in the experiment, the method is called metric. If we were able to obtain reliable (sufficiently accurate and precise) and unequivocally interpretable metric data, as is the case in some other experimental sciences, the metric approach would no doubt be unbeatable. As MR deals with a great deal of randomness and uncertainty, a probabilistic approach must be adopted. Each approach has its caveats.

Metric approach
  • The most distinctive disadvantage is an uncontrollable bias. The main reason is that there is hardly any reasonable transformation of a value stated in an interview (a percentage, a point on a Likert scale or a normalized centroid value) into the probability of a real event such as a purchase. The transformation function, if one can be found, is nearly always nonlinear. Neglecting the nonlinearity, often associated with an asymmetrical distribution of the experimental values, biases the estimated model parameters. If no precautions are taken, the computed values are often unrealistic.
  • The values related to selectable answer levels usually accumulate in some region of the scale. The repeated use of the same value on the scale removes discrimination in this region and leads to a loss of information.
  • Combining results from several relatively independent metric data blocks from an interview into a common model is problematic. The metrics often have different scaling and unknown distributions.
Probabilistic approach
  • The most distinctive disadvantage is the implicit scaling factor of the obtained parameters, which is often ambiguous and virtually inestimable. There is only a partial remedy: as the model parameters have to reflect differences within the permissible interval of probabilities, they can be constrained not to attain unreasonable values.
  • Since a choice is an event with no direct quantitative measure assigned to it, the amount of available information is usually much lower than from a comparable metric-based question. Accumulating a higher number of choices and/or responses from a higher number of respondents is required to obtain a sufficient amount of information. This is outweighed by the generally lower bias of the estimated parameters.

 

DCM-based conjoint

The term conjoint is believed to be a contraction of "considered jointly". It denotes a rich group of methods allowing the extraction of the quantitative effects of the stimuli constituents that influence the respondent's responses in the experiment. The constituents, known as factors in standard analysis of variance, are called attributes in MR. The most frequent stimuli are products or services.

All conjoint methods have common features described in Conjoint Method Overview. The most straightforward and well known application of DCM is CBC - Choice Based Conjoint.

There is a special case of conjoint design with just a single attribute whose levels represent products, statements, options, benefits, etc. The goal is to differentiate between the levels and determine their influence on a decision. A choice-based version is known as MXD - Maximum Difference Scaling, or MaxDiff for short. An alternative based on the order of choices from a full or reduced set of items is SCE - Sequential Choice Exercise, which is also suitable as an experimental frame for a DCM-based concept test.
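
As an illustration of how best/worst choices from small subsets can be summarized, the sketch below uses a simple best-minus-worst count; a proper MaxDiff analysis would estimate a logit model instead, and the responses shown are invented.

```python
from collections import Counter

# Hypothetical best/worst tasks from one respondent:
# (items shown, item picked as best, item picked as worst).
tasks = [
    (("A", "B", "C", "D"), "A", "D"),
    (("B", "C", "E", "F"), "E", "C"),
    (("A", "D", "E", "F"), "E", "D"),
]

best, worst, shown = Counter(), Counter(), Counter()
for items, b, w in tasks:
    shown.update(items)
    best[b] += 1
    worst[w] += 1

# Best-minus-worst count per exposure: a rough preference ordering that a
# proper logit-based MaxDiff estimation would refine.
scores = {item: (best[item] - worst[item]) / shown[item] for item in shown}
print(sorted(scores.items(), key=lambda kv: -kv[1]))
```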


DCM-based concept test

A common task in MR is an overall evaluation of product concepts with fixed properties. The interest is often in the competitive potential of the concepts. As a standard method, concepts are randomized and presented to respondents who rate them. The ratings, averaged over respondents or their segments, form the basis for a metric evaluation and interpretation.

Prior to the rating, the concepts can be sorted and ranked by respondents, thus switching from a metric to a discrete model. Ranks provide additional discrimination between concepts that are rated equally. The ranks can be translated into pair-wise choices that can be analyzed with an adequate method. The most frequently suggested are Thurstonian models, path log-linear models and structural equation models (SEM) with latent variables.

A more efficient alternative is a translation into a series of multinomial choices. This approach is used in the SCE - Sequential Choice Exercise method. The primary results are random utilities of the profiles, allowing a what-if simulation. In order to utilize the ratings and estimate the competitive potential of the concepts in terms of purchase intentions, calibration is applied.
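
The translation of a ranking into multinomial choices can be sketched as follows (the exact SCE procedure may differ): the top-ranked concept is treated as a choice from the full set, the next one as a choice from the remaining items, and so on.

```python
def ranking_to_choices(ranking):
    """Explode a ranking into (choice set, chosen item) observations."""
    remaining = list(ranking)
    observations = []
    while len(remaining) > 1:            # a one-item set carries no information
        observations.append((tuple(remaining), remaining[0]))
        remaining = remaining[1:]        # drop the chosen item and repeat
    return observations

# A respondent ranked four concepts from most to least preferred.
for choice_set, chosen in ranking_to_choices(["C2", "C4", "C1", "C3"]):
    print(f"chose {chosen} from {choice_set}")
```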

 

Combining choice data blocks

Several data blocks obtained independently in a survey often relate to some common features. The invaluable gain from using the DCM approach is the possibility of seamlessly merging the data into a common DCM model. This is seldom achievable with metric methods.

A metric-based merging of conjoint utility blocks via several common attributes, known as "bridging", often leads to disappointing results. The problem lies in the different scaling factors of the blocks, which can be readjusted only by severely distorting the original relative attribute importances. In contrast, a direct merging of choice data can profit from the inherent between-attribute scaling flexibility.

Merging several blocks of choice data (rather than parameters estimated prior to the merging) with different measurement "sensitivities" is possible through implicit rescaling (compression or expansion) of the parameters of each block by the estimation procedure, so that the parameters of all the blocks match. The (still unknown) scaling factors of the choice data blocks are automatically adapted to one another. A disadvantage is the potential distortion of some estimates. As a precaution, certain restrictions must be imposed on both the design of the blocks and the estimation procedure. A typical example is the merging of an SCE - Sequential Choice Exercise of fixed product profiles designed in a managerial way, possibly used as a filter in CBS - Choice Based Sampling, with several class-based CBC - Choice Based Conjoint blocks of pseudo-randomly modified products.
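
A simplified sketch of the underlying idea, assuming one common set of utilities and a free scale multiplier for the second block (estimated relative to the first); the data are simulated and this is not the procedure of any particular package.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)

# Two blocks of choices over the same three items; item 3 is the reference
# level with utility fixed at 0. Block B is "noisier": it behaves as if all
# utilities were multiplied by a scale factor of 0.5.
true_u = np.array([1.5, 0.8, 0.0])
true_scale_b = 0.5

def simulate_block(scale, n_tasks=3000):
    p = np.exp(scale * true_u)
    p /= p.sum()
    return rng.choice(3, size=n_tasks, p=p)

choices_a = simulate_block(1.0)
choices_b = simulate_block(true_scale_b)

def neg_loglik(params):
    u = np.array([params[0], params[1], 0.0])   # reference utility fixed at 0
    scale_b = np.exp(params[2])                 # keeps the block B scale positive
    nll = 0.0
    for choices, scale in ((choices_a, 1.0), (choices_b, scale_b)):
        logits = scale * u
        logp = logits - np.log(np.exp(logits).sum())
        nll -= logp[choices].sum()
    return nll

fit = minimize(neg_loglik, x0=np.zeros(3), method="BFGS")
print("common utilities:", fit.x[:2].round(2))
print("estimated relative scale of block B:", np.exp(fit.x[2]).round(2))
```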

The problem of matching DCM results with aggregated metric data, e.g. with market data (revealed preferences), does not have a straightforward solution. The same is true for optional features of a product. However, if one can assume independence between the core product properties and those of the available options, the probabilistic character of choices allows a model founded on conditional probabilities (a path model) to be derived.
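
If the take-up of an option can be assumed independent of the core product properties, the joint probabilities factorize as in the toy sketch below; all numbers are made up.

```python
# Hypothetical choice probabilities of the core products (e.g. from a DCM
# simulation), including a "none" alternative.
p_core = {"core A": 0.45, "core B": 0.35, "none": 0.20}

# Hypothetical take-up probability of an optional feature, assumed to be
# independent of which core product was chosen.
p_option_given_core = 0.30

# Path model: P(core & option) = P(core) * P(option | core).
joint = {
    core: {
        "with option": p * p_option_given_core,
        "without option": p * (1 - p_option_given_core),
    }
    for core, p in p_core.items()
    if core != "none"
}

for core, split in joint.items():
    print(core, {k: round(v, 3) for k, v in split.items()})
```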

Estimation of the competitive potential of a core product and its selectable options is possible with MBC - Menu Based Choice. The method relies on the choice of a core product in a standard CBC and, conditionally, on a modified volumetric conjoint method that allows the saturation of needs for the options to be modeled.


Accuracy and precision

Choice-based methods are very efficient in the interpretation and prediction of events made with high involvement of the decision maker. Results for low-involvement events are often less reliable but still clearly superior to metric methods. The reason is simple: in a classic metric-based method, an item is usually rated no more than once, whereas in a choice-based method it can be present in many sets and thus be evaluated many times under different conditions and contexts.

Individual-based estimates are extremely sensitive to the design of the choice sets, namely to their orthogonality and balance. The number of degrees of freedom of the estimated system should always be higher than the number of estimated parameters. Recent developments such as hierarchical Bayes, or other methods that bank on aggregate sample properties and allow the estimation of more individual parameters than there are degrees of freedom per individual, cannot correct design deficiencies. In particular, the bias of the estimates may be unpredictable.
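
A quick arithmetic check of this point, with purely illustrative design figures: each choice task with k alternatives contributes at most k - 1 degrees of freedom per respondent, which can be compared with the number of part-worths to be estimated.

```python
# Illustrative CBC design: 12 choice tasks per respondent, 4 alternatives each.
n_tasks, n_alternatives = 12, 4

# Each task with k alternatives contributes at most (k - 1) degrees of freedom.
dof_per_respondent = n_tasks * (n_alternatives - 1)

# Illustrative attribute structure: number of levels per attribute; with
# effects (or dummy) coding, each attribute needs (levels - 1) parameters.
levels = [4, 3, 3, 5, 2]
n_parameters = sum(l - 1 for l in levels)

print("degrees of freedom per respondent:", dof_per_respondent)   # 36
print("part-worth parameters to estimate:", n_parameters)         # 12
# Hierarchical Bayes can borrow strength across the sample when individual
# information runs short, but it cannot repair a deficient design.
```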

A frequent problem encountered in DCM is the varying and often unchecked (or even inestimable) precision of the estimates. The levels of items that have been selected least, e.g. those least attractive in CBC - Choice Based Conjoint, have the lowest precision for a simple reason: little data is available for them. Fortunately, this does not pose a big problem in a simulation, since such a product will have only a small share. However, the simulation may be useless when a low part-worth level is combined with a high part-worth level. The problem lies directly in the additive conjoint kernel. Due to error propagation, a sum of part-worths always has lower precision than any of the part-worths alone. When a low part-worth is "overcome" by a high one, the computed item utility may be completely wrong.
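
The precision loss follows from ordinary error propagation: assuming independent estimation errors, the variances of the part-worths add. A sketch with made-up standard errors:

```python
import numpy as np

# Hypothetical part-worth estimates (value, standard error) for the levels
# that make up one product profile; the rarely chosen levels are imprecise.
part_worths = {
    "brand X":    (0.9, 0.10),
    "price high": (-1.4, 0.35),
    "small pack": (-0.6, 0.30),
}

total_utility = sum(value for value, _ in part_worths.values())

# Error propagation with independent errors: variances add, so the standard
# error of the summed utility exceeds that of any single part-worth.
total_se = np.sqrt(sum(se ** 2 for _, se in part_worths.values()))

print(f"profile utility: {total_utility:.2f} +/- {total_se:.2f}")
print("largest single-level standard error:",
      max(se for _, se in part_worths.values()))
```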

An unduly attractive item can have an unwanted effect. When an item has been chosen nearly every time it was present in a choice task, little information is obtained about the other items in the given choice sets. Inappropriately attractive items, or the combinations of levels that make them so, always have a strongly detrimental effect on the estimates and should be strictly avoided. The design of items for a concept test or a MaxDiff study is fully in the hands of the project designer. In a conjoint study, the problems can be efficiently tackled using a design based on product classes.

Deploying robust estimation methods in conjoint studies is also important. Fortunately, commercially available programs (e.g. from Sawtooth Software, Inc.) have parametric options that allow one to keep a tight rein on both the design and the computation processes, so that satisfactorily reliable estimates can be obtained.