DCM Utility Calibration

Product utility values obtained from the analysis of a DCM experiment are interval-scaled "raw" utilities. Differences between utilities reflect the preferences among the products they represent, but the utility values provide no measure of the individual's willingness to purchase a product. The choice question is often understood as "if the products shown were the only ones available and you had to choose one", which leads to a conditional choice with only a loose link to the potential of the product on the market.

Raw utilities are useful for comparing competing products in a preference share simulation. However, share models do not account for the market acceptability of the products in the choice set. If no product is acceptable to an individual, the assumption of a 100% total share for that individual lets the unacceptable products influence the computed shares. The lowest utilities are always estimated with the highest error, and this error is carried into the simulation. Weighting individuals by their average consumption does not help in this case.
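
The point about the forced 100% total share can be illustrated with a minimal sketch. It assumes a plain multinomial logit share rule; the function name and the utility values are hypothetical and serve only as an illustration.

```python
import math

def preference_shares(utilities):
    """Multinomial logit shares of one individual; they always sum to 1."""
    top = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(u - top) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical raw utilities of three products for two individuals.
keen = [2.0, 1.0, 0.0]
reluctant = [-3.0, -4.0, -5.0]  # same differences, every value 5 units lower

shares_keen = preference_shares(keen)
shares_reluctant = preference_shares(reluctant)

print([round(s, 3) for s in shares_keen])
print([round(s, 3) for s in shares_reluctant])
```

Both printed lists are identical: only the differences between raw utilities enter the shares, so an individual who would accept none of the products still contributes a full 100% to the simulated total.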

Solution

If a utility is to reflect the probability of some event or intention, typically purchase, it must be calibrated. A calibration transforms the "raw" utility into a value that reflects the probability of an event, or an intention to act, as stated by the individual. This makes it possible to compute the "as if stated" acceptance of any simulated product and thus to estimate a more appropriate contribution of an individual to the simulated aggregate values. The most important aspect is that utilities acquire a reference (zero) value that reflects the probability of a reference event or intention. The transformation of raw utilities is usually linear, with two parameters determining the location and the scaling.

As a rule of thumb, a change in the location parameter of the transformation (a shift of values) is often much more important than a change in the scaling (which alters the differences between utility values, i.e. the sensitivity).
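
A minimal sketch of the two-parameter linear transformation follows. It assumes that the calibrated utility is read on a logit scale, so that zero corresponds to a 50% stated acceptance; the function names and all parameter values are hypothetical.

```python
import math

def calibrate(raw_utility, location, scale):
    """Linear calibration: shift by `location`, stretch by `scale`."""
    return location + scale * raw_utility

def acceptance(calibrated_utility):
    """Logistic link (an assumption): calibrated utility 0 maps to 50%."""
    return 1.0 / (1.0 + math.exp(-calibrated_utility))

raw = [1.0, 0.0, -1.0]  # hypothetical raw utilities of three products

base = [acceptance(calibrate(u, 0.0, 1.0)) for u in raw]
shifted = [acceptance(calibrate(u, -2.0, 1.0)) for u in raw]   # location change only
rescaled = [acceptance(calibrate(u, 0.0, 1.5)) for u in raw]   # scale change only

for name, values in (("base", base), ("shifted", shifted), ("rescaled", rescaled)):
    print(name, [round(v, 3) for v in values])
```

With these illustrative numbers the shift lowers the acceptance of every product (the middle one drops from 50% to about 12%), whereas the change of scale leaves the middle product at 50% and only spreads the other two further apart.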

A calibration is sometimes replaced by aggregate weights of the sample segments obtained from an external source such as market data. This is equivalent to shifting utilities by a value common to all individuals in the segment. However, if the acceptance of products differs unevenly between individuals within the segment, the results of this substitute for calibration may be misleading.

Calibration assumptions and execution

Calibration profiles should be selected with good knowledge of the current market and of future market expectations. In particular, excessively attractive profiles should be avoided, as they would make respondents reject all other profiles too readily. The way calibration questions are actually asked varies widely. The most common format is the one implemented, for example, in the Sawtooth conjoint module. Provided managerially designed profiles are available, an efficient calibration format is SCE - Sequential Choice Exercise, which relies on ranking. Ranking fully eliminates tied (equal) answer values without unduly prolonging the interview.

As an aside

Calibration procedures implemented in most commercial programs are based on linear regression of stated utilities using least squares (OLS - Ordinary Least Squares). The OLS method fails too often with calibration data, and experienced analysts avoid it. More reliable is a regression of stated utilities weighted by multinomial densities of the calibration profiles or, better, an equivalent Bayesian regression.
Non-linear calibration transformations of product utilities are possible but hardly ever used. On the other hand, non-linear models for part-worths of quantitative attribute levels are quite common.
No calibration method or technique can provide information about differences between levels of two different attributes. If such information is needed, the MXD - Maximum Difference Scaling method for product aspects should be used.

Calibration parameters

The calibration process introduces positional (location) and scaling factors into the multinomial logit model. If appropriate, additional components such as the outer goods may be added to the model.
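
One possible reading of such an additional component is an outside alternative whose utility is fixed at the reference (zero) value, so that product shares no longer have to sum to 100%. The sketch below rests on that assumption; the function name and the utility values are hypothetical.

```python
import math

def shares_with_outside_good(calibrated_utilities, outside_utility=0.0):
    """Logit shares of the products when an outside alternative
    (assumed here to sit at the reference utility 0) competes with them."""
    weights = [math.exp(u) for u in calibrated_utilities]
    denominator = sum(weights) + math.exp(outside_utility)
    return [w / denominator for w in weights]

# Hypothetical calibrated utilities of three products for two individuals.
keen = shares_with_outside_good([2.0, 1.0, 0.0])
reluctant = shares_with_outside_good([-3.0, -4.0, -5.0])

total_keen = sum(keen)            # most of this individual's share goes to the products
total_reluctant = sum(reluctant)  # most of this individual's share stays outside

print(round(total_keen, 3), round(total_reluctant, 3))
```

Under this assumption the location of the calibrated utilities matters: an individual whose utilities all lie well below the reference value contributes only a small total share to the simulated products.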

Whatever experimental method is used for a calibration, stated data are always censored. For example, in a purchase intention question using a 5-step Likert scale, all concepts below a certain utility value will get the answer "definitely no", and all concepts above some (high) utility value the answer "definitely yes". A traditional estimation technique (such as OLS - Ordinary Least Squares) cannot fit the data correctly even if the data were completely noiseless, because (1) the distribution of calibration answers is truncated and (2) ties between experimental values are common. Both issues can be rectified by carrying out the calibration as an SCE - Sequential Choice Exercise and using Bayesian regression. Bayesian priors of the Likert scale steps estimated from choice orders in SCE have proved especially useful in cases with many ties.
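
The effect of censoring on OLS can be reproduced with noiseless synthetic data. The sketch below assumes a hypothetical "true" linear calibration and a 5-step scale coded as -4, -2, 0, 2, 4; all names and numbers are illustrative only.

```python
def ols_fit(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

SCALE_STEPS = [-4.0, -2.0, 0.0, 2.0, 4.0]  # assumed coding of the 5 answers

def stated_answer(latent):
    """Nearest scale step; values beyond the ends are censored to -4 or 4."""
    return min(SCALE_STEPS, key=lambda step: abs(step - latent))

TRUE_LOCATION, TRUE_SCALE = 1.0, 2.0             # hypothetical true parameters
raw = [k / 2.0 for k in range(-10, 11)]          # raw utilities -5.0 ... 5.0
latent = [TRUE_LOCATION + TRUE_SCALE * u for u in raw]
answers = [stated_answer(v) for v in latent]     # noiseless, but censored and tied

location_hat, scale_hat = ols_fit(raw, answers)
print(round(location_hat, 3), round(scale_hat, 3))
print(len(set(answers)), "distinct answers for", len(answers), "profiles")
```

Although the synthetic answers contain no noise at all, the fitted scale comes out well below the true value and the fitted location is also too low, because many profiles pile up at the two ends of the scale and the remaining ones are heavily tied.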

As an aside

Experiments with extended Likert or "percent of likelihood" scales did not show any observable improvement in calibration fit. Respondents tended to concentrate their answers in a relatively narrow region of the broader scale, with no improvement in discrimination between the tested items. Nor has the "dual-response none" method proved useful, as it often leads to evasive choices such as the cheapest item in the choice set. A symmetrical 5-point Likert scale (with the alternatives "definitely yes", "rather yes", "neither-nor", "rather no" and "definitely no") bounded in the interval [-4, 4] (logit units) seems to be a fixed star. A non-response option is usually omitted, as it is assumed that anybody can have beliefs about the calibrated items.
The wording of a calibration question is crucial. The formulation is usually conditional as it should reflect some predetermined action, typically purchase.
The "none" alternative often used in CBC - Choice Based Conjoint is ignored in calibration as it has an undefined meaning.
The numerical procedure is based on the two-parameter logistic model of IRT - Item Response Theory, where the "ability" trait is the stated willingness of the person to make a decision (such as a purchase, a switch of brand or provider, churn, etc.) conditional on the item. To make up for the truncated data distribution, a version of weighted empirical Bayes regression is used to estimate the IRT parameters.
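
For orientation, the two-parameter logistic model is usually written in the textbook form below. The symbols are the conventional IRT notation and are an assumption here (a for discrimination, b for difficulty, θ for the latent trait); how exactly they map onto the calibration parameters in the procedure described above is not specified in this text.

```latex
P(y = 1 \mid \theta) = \frac{1}{1 + \exp\bigl(-a(\theta - b)\bigr)},
\qquad a(\theta - b) = a\,\theta - a\,b
```

On the logit scale the model is linear in θ, with a acting as a scaling factor and -ab as a location term, which parallels the two-parameter linear transformation used for calibration.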

Properties of calibrated utilities

A calibration makes it possible to reflect the conditions of external effects and may substantially change the results of a simulation compared with a simulation based on "raw" utilities. However, wrong assumptions, wrong question conditions, or a wrong numerical procedure may invalidate the results of a study.

An error implicit in any calibration procedure is the assumption that the stated purchase probability is proportional to the expected purchase probability. Both measures are conditional on the choice set, the question condition, the assumed situation, and the choice mode with respect to an assumed action (e.g. intentional, occasional or impulsive purchase, trial or repeat), all of which influence both the test and the market events. If possible, it is useful to correct the calibration bias at least in part using additional data for the subjects.

Future development

Calibration methods have evolved in response to the demand for more market-like numbers from a simulation. The inherent disadvantage of many calibration methods is the use of acceptance as the target variable. Acceptance is, in principle, not a measure directly related to expected sales. It is just a characteristic of a product as seen and stated by respondents in the interview. Reading the value of 50% as hesitation over whether to buy or not is fallacious. Usually, there are many other products on the market with much higher acceptance. A product with a stated acceptance of 50% or less has, in most real cases, nearly no chance of being successful.

In contrast, CSDCA - Common Scale Discrete Choice Analysis, which makes it possible to determine and include acceptability thresholds of product aspects, allows for non-compensatory estimation and simulation using a nested logit model. A perceptance, being 0% or negative if the aspect or product is unacceptable, can be estimated. It is believed that this approach can replace the standard calibration approach and give a more realistic view of the expected behavior of customers.
