Conjoint Method Overview

> Conjoint Method Overview

Conjoint experiment

A trade-off experiment has two sides: offers and responses evoked by the offers. Analysis of the responses (choices or evaluations, or both) given the offers, provides specific values called part-worth to the features buyers are considering when making purchase decisions. With this knowledge, marketers can focus on the most important features to address the targets.

People's real-life decisions are influenced by implicit aspects that cannot be foreseen by researcher. As no question can be asked about the unknown, such information is not available from the standard verbal interviewing. It is believed at least some implicit aspects are projected in evoked choices. The utilities obtained from a conjoint study are therefore often called "as if" utilities as they comprise effects related to unknown endogenous aspects such as experience, perceptions, attitudes, needs and expectations. As these effects vary among individuals, individual behavioral data are required.

A typical set of stimuli to be presented to a respondent is made of product concepts called profiles. Each profile is made of a set of features called attributes. Every attribute is set to one of its two or more mutually exclusive levels. The concepts must be comprehensible and distinguishable from one another. Their complexity and number should not exceed a threshold above which respondents tend to resort to simplified statements and decisions.

The preferences obtained in a research study are called "stated preferences" (SP) to distinguish them from the "revealed" preferences (RP) reflecting actual behavior, i.e. the market data. Conjoint analysis mostly relies on conversion of stated preferences to attribute level part-worths under the assumption of their additivity known as conjoint additive kernel. In plain words, the level part-worths, when summed up in an appropriate way, make up utilities of the concepts. Utilities can be interpreted independently as stand-alone values, but their use is namely in what-if simulations thus allowing an interpretation in context of the competing products.

As aside to preference

A preference related to DCM should be always understood as relative. It is most often a ratio of two choice probabilities. Natural logarithm of such a ratio is in logit units.
The additive kernel mentioned above has limited validity, possibly in a sufficiently narrow span of attribute levels. In non-compensatory simulation models, it can be assumed only in the range of attribute levels acceptable by the individual.

At this point it is important to remember that the "utility of the product" is not property of the product. It is a value quantifying respondent's view of the product as seen when "considered jointly" in the conjoint exercise.

As aside to design

All parts of a conjoint study, namely the design, conduction, analysis and interpretation, must be consistent and reflect pre-stated goals of the study.
The designer, analyst, interpreter or user of a conjoint study results should be aware that the behavioral models are quite good but still approximate descriptions. Veracity of any model of the real human behavior is function of time.

Design of a study

Success of a conjoint study starts with its design. The formal rules, ensuring estimability of part-worths as known from the applied statistics of DOE - Design Of Experiments, are common to all types of conjoint methods. Reliability of estimates related to the ultimate goals of the study has the highest priority. It is wise to rely on a commercial design software and comprehensive tests of the generated designs.

The actual sequence of steps should have approximately the following order.

Define the problem to be elucidated and the goals to be achieved.
Decide whether the study will be branded or non-branded.
Suggest the variables, presentation form and end user tools that allow the user to make marketing decisions.
Define the model to be estimated.
Define a set of additional parameters (scaling and location factors, attribute level acceptability thresholds, marketing coefficients, etc.) to be estimated experimentally or from external data.
Check feasibility of the study in respect to the expected credibility of the parameters to be estimated, namely their number an type.
Define product classes and specific alternatives.
Select an experimental method of interviewing.
Decide if CBS - Choice Based Sampling should be applied.
Define or modify attributes and their levels conditional to the methods used.
Program and test the conjoint exercise(s).

The design steps are interdependent and should match the predetermined goals of the study. A seemingly small change in any of them may lead to a complete redesign of the study and, unfortunately, will influence the time schedule and deadlines. A close and iterative cooperation with the client is essential.

An often encountered problem is a request for a too broad scope of a study. Limitations such as duration of the interview, the stimuli presentation formats and their comprehensibility, the ability, fatigue and willingness of respondents to cooperate, and other objective and subjective factors should be respected. There is always a complexity limit above which the parameters (part-worths) become biased or inestimable. It is responsibility of the study designer to find, suggest and guarantee an optimal solution.

Target groups and sampling

When a broad range of the products is to be tested the CBS - Choice Based Sampling allows for making the range of offered products narrower and closer to the respondent's consideration set.

Selection of interviewing method

There is a broad range of techniques for eliciting values people place on the attributes (features) that define products and services. The list of the most often used basic methods for estimation of preferences between profiles shown below is far from being exhaustive.

Ranking of profiles
- Classic sort-conjoint, e.g. by Q-Sort (obsolete, but usable as a construction block of a DCM-based exercise)
- VCT - Virtual Concept Test (Hauser & Dahan, MIT, 2002 and later)
- SCE - Sequential Choice Exercise (possibly completed with rating - calibration)
Rating of profiles one by one on a scale (e.g., 0-100% acceptable, n-level Likert scale, etc.)
- CVA - Conjoint Value Analysis (a traditional method also known as "card conjoint")
- VCT - Virtual Concept Test (Hauser & Dahan, MIT, 2002 and later) in addition to ranking
- Calibration of raw utilities (obtained from choices or ranks) by rating at least on an ordinal scale
Choosing from a pair of profiles
- Enhanced CVA (Sawtooth Inc., 1996)
- Pair-wise comparison method (extended by e.g. Pendergrass, Böckenholt, Louviere, e.a.)
Rating of the difference between a pair of profiles
- ACA - Adaptive Conjoint Analysis (pairwise comparisons, Green et. al., 1970)
- Pair-wise comparisons (BTL - Bradley-Terry-Luce, 1952)
Choosing a profile from a choice-set
- CBC - Choice Based Conjoint (single choice from a set)
- ACBC - Adaptive Choice Based Conjoint (Sawtooth Software, Inc.)
Choosing a number of profiles from a choice-set
- Best-worst CBC (both best and worst options are chosen; Louviere 1995)
- MBC - Menu Based Choice (multiple choices from a set)
- Volumetric variant of CBC (multiple quantified choices from a set)

Conjoint interviewing methods have many variants of many flavors. In general, the most reliable are methods based on choices from randomized sets. A sufficient number of choices is needed.

Ranking belongs to the earliest methods. Compared to a single-choice method, a lower total number of profiles is required to obtain the same amount of information. On the other hand, ranking suffers from the inherent correlation of choices which may cause bias of utility estimates.

Pure rating always brings in uncertainty due to frequent ties and the fact that actual meaning of the stated values is unknown. Estimation bias is hardly avoidable.

As aside to ACA

The first section of ACA - Adaptive Conjoint Analysis method, is SEC - Self Explicated Conjoint. Respondent is asked about his/her acceptance of all levels in an attribute one by one. There is an inherent problem. Levels that would be quite acceptable if outweighed with an attractive level of another attribute are often being refused. They are removed from all ACA pair-wise comparisons in the second section that makes the body of the method. In our experience, levels that remain are often too good and their span too narrow to reflect the full span of levels a respondent is able to accept under the tenet "considered jointly". The estimated perceptions of a product as a whole are distorted. This is why we prefer product classes based design that allows substantial narrowing of attribute ranges in a predetermined way.

A possible approach for products bought concurrently and in differing quantities is the Volumetric Conjoint method, a variant of the chip allocation method. Results from a volumetric exercise can be calibrated to the known market data which is typical for a category of FMCG/CPG (fast moving consumer goods, consumer packaged goods). Just like the classic approach using Fourt & Woodlock and Parfitt & Collins models, the method suffers from inability to separate the trial from the repeat probabilities.

The problem of products purchased concurrently in supposedly fixed quantities, usually by a single piece, can be solved using MBC - Menu based Conjoint. A satisfactory solution of a similar problem of products purchased concurrently in varying quantities is yet to be found. A simplification by splitting it into several smaller problems is usually inevitable.

Below are the findings from tracking the use of conjoint-related methods among users of Sawtooth Software, Inc. (cit.).

Use of conjoint methods

As aside

The picture suggests the choice methods (CBC, ACBC, MBC) most recently accounted for 96% of the projects completed, whereas ratings-based methods (ACA and CVA) accounted for 4%. CBC continues to be the most commonly used approach, accounting for 78% or more in last 6 years.

Attributes and their levels

Attributes (features) of conjoint profiles should ideally be

separable (defined so that a level of an attribute could be set independently from a level of any other attribute), and
uncorrelated (the design should be close to orthogonal).

Levels of attributes can be values of any standard type of MR variables. They should be

exhaustive (fully spanning properties of the investigated products),
exclusive (any ambiguity or overlap among levels should be avoided)
balanced (the number of levels shown for a class of profiles should be similar for all non-class attributes; expected influence differences between levels of all attributes should be about the same).

The above conditions cannot be always satisfied, often due to the natural restrictions in combinations of attribute levels and/or limitations in concurrent presence of some profiles in a choice set. The discrepancies can be resolved by partitioning a broad attribute range into sub-ranges corresponding to mutually exclusive or partially overlapping product classes, or by modifying attributes and their levels as described on page Attribute Properties and Models so that the desired properties are met.

Attribute level descriptions should be as concise as possible, resembling catch words or key phrases. Their explanations should be given before the exercise is started and, possibly, available as help during the tasks.

Number of attributes

Comparison of profiles with many attributes is very demanding. Annoyance from comparison or choice tasks with complicated profiles often yields evaluations made up on only a small number of the most important attributes, and sometimes just on a single one (e.g. price). Random choices or even refusal to continue in the exercise are not exceptional.

Some authors claim several tens of attributes are feasible to be interviewed and evaluated provided a "special" interviewing method is combined with a "special" method of estimation. In our view this is possible but the outcome is disputable. Probably the most popular is ACA - Adaptive Conjoint Analysis due to its inherent use of partial profiles. Usually only 5 randomly selected attributes are shown at a time. Comparison of a number of results from ACA studies showed that, in too many cases, only two or three attributes were identified as important while the importance of all the others were nearly indistinguishable.

Our experience with CBC - Choice Based Conjoint suggests that a product can be described with up to 6 or 7 attributes without a substantial loss of respondent's attention provided the profiles are easily understood. Increasing the number to 8 or 9 attributes and using partial profiles could not be termed as unsuccessful, but the part-worths of the alternately omitted attributes had to be scaled down. An attribute shown less often will gain on importance because when shown it will arouse more attention than an attribute shown more often.

As aside

It is supposed the number of visible attributes can be increased up to about 12 in a SCE - Sequential Choice Exercise as the total number of profiles seen by a respondent is much lower and the exercise is less demanding than any other type of exercise. Of course, the amount of obtained information is lower. In some special cases of well comprehensible profiles the number of attributes could be increased up to about 14, however, with clearly observable, and sometimes substantial decrease of reliability of estimates, especially for the least important attributes due to the high rate of "data borrowing" among respondents in the Bayesian utility estimation process.
As a matter of fact, there is only a very limited experience with SCE as a single source of conjoint data.

Number of levels

So that levels of all attributes are shown equally often requires attributes to have the same number of levels. Attributes with more levels have tendency to obtain higher importance simply due to the more frequent change of the shown value.

When an attribute has many more levels than other attributes it has proved effective to use only a subset of "representative" levels in the conjoint exercise. The preferences among the full set of levels can be estimated in a separate MXD - Maximum Difference Scaling or SCE - Sequential Choice Exercise, and then merged with the data from the conjoint exercise. Ordinal or quantitative levels can be often separated into groups that are assigned to product classes. This is especially useful in case of a broad range of prices.

Design of profiles

Profiles for a conjoint study may be either designed in advance or be generated (and possibly modified) during the interview. Each respondent can be shown either the same set of profiles, or one version of several versions of profiles, or each respondent is shown a different, randomized set of profiles generated by a design program.

There are two major aspects to be considered (see page Profile Properties):

Statistical properties allowing estimation of utilities (problem of identification).
Perceptional properties such as simplicity, lucidity and comprehensibility.

Branded vs. non-branded study

Brand is a vehicle carrying implicit properties and image of a product. There is a profound difference between branded and non-branded presentation of stimuli to respondents.

Non-branded approach
- A broader selection of combinations of attribute levels is possible, but the evidently implausible combinations should be avoided. Nevertheless, they may be tolerated by respondents and their detrimental influence might not be as bad as in a branded presentation.
- Choices reflect general needs and expectations related to the category of a product. Influence of the previous experience is suppressed.
Branded approach
- Profiles should contain only plausible combinations of levels for the brand. This can be established by creating proper classes of the products. Presence of implausible attribute level combinations might severely impair credibility of the whole exercise.
- Choices are influenced by respondent's previous experience, likes and dislikes, expectations an other brand image related factors.
- Use of incomplete profiles should be avoided else the choices would reflect the brand implicit values of attributes not explicitly shown in the concepts and profiles.

Conjoint tasks

The question common to all tasks in a conjoint block should be aimed at the action or event probability that is to be estimated and modeled. Improper wording may cause a misinterpretation of the task. The most common mistake is to ask a general question about "most attractive", "most advantageous", "best", etc., profile. The question should be formulated in the frame of PEBSE model (personality, expectations, behavior, situation and environment), and evoke the conditions close to those the outcome will be modeled for.

The feel and look of a task presentation plays an important role as well. Graphics, if used, should be of the same graphical design and quality for all equivalent items. Regularity and evenness is more important than the actual quality. Clients can usually provide good visual representations (pictures, videos) of their own products but often forget about the competing products included in the test. Such a situation is very difficult and should be remedied at the very start of the project design.

Holdout tasks

Some analysts use several "holdout" tasks, identical for all respondents and interspersed among randomized tasks. The purpose is to validate the current or select an alternative model of choice behavior by comparing the choice probabilities from the holdout tasks with those calculated from the current model, usually with the holdout tasks excluded from the estimation. The simplest approach, often used in CBC as a check of internal consistency, is to compute the percentage of correctly predicted choices. It relies on mutual independence of the choices in holdout and estimation tasks thus resembling the bootstrap methods for checking data validity.

Profiles in holdout tasks are usually designed in a managerial way and their attributes are correlated. This is why the answers to holdout tasks should not be included in the model as the risk of parameter estimation bias is quite high.

Holdout tasks may be useful in the following cases:

The profiles are existing products and at least one of them is assumed to be from the consideration set (scope) of the respondent.
In a partial-profile based conjoint exercise, the holdout full profiles can indicate if there is a need for rescaling of alternately omitted attributes.
The sample size is relatively large and the conjoint exercise simple so that it is affordable to remove some choices from the estimation.

We use holdout tasks only seldom for the following reasons.

Conjoint tasks are usually quite demanding for respondents. An excessive number of tasks prolongs the interview, increases fatigue and decreases reliability of later choices.
There is a danger of bias due to the fact that all respondents get the same tasks while their consideration sets differ. A universal choice set gratifying all respondents equally is not realistic.
Internal consistency of a respondent can be checked without holdout tasks using the fit of the observed to the predicted data.
As aside
- When a metric model is used, the correlation between stated and estimated utility can be computed for a person. As a rule of thumb, coefficient of determination (one minus ratio of the residual and stated variability) under 0.25 signals a severe problem.
- When a DCM model is used, a simple measure can be based on the predicted probability P for the chosen profile. The choice probability for the actually chosen profile is 1 by definition. With J being the number of items in a choice-set the computed choice probability can never be lower than 1/J, the value for a random choice. A ratio (odds) of predicted P to null-hypothesis probability 1/J can be used. A geometric mean value below 2 for a respondent signals a severe problem.
A test of the assumed model is hardly feasible in practice since a different model would require a different design.
Association of holdout tasks to the market is too loose to provide a reliable "external validation" of the conjoint exercise. Covering the whole market is simply impossible. Use of a simulation is a better approach.

Estimation of part-worths

The estimated utility of a product is a function of part-worths of the individual attributes and their interactions, in case of the additive model as their sum. It is of advantage to work with individual-based part-worths. This, in most practical cases, allows to neglect between-attribute interactions that are the most frequent reason for heterogeneity of the sample estimates. The following estimation methods deserve mentioning.

Metric methods
- OLS - Ordinary Least Squares regression
  OLS regression has been nearly abandoned as it brings in an uncontrollable bias of estimates, especially for the outer levels of attribute ranges.
- Robust Linear Regression
  Least median instead of mean, absolute deviation instead of its square, Winsorized LS, max deviation, monotonic regression and other methods have been attempted to eliminate the inherent bias of OLS. E.g., the popular monotonous regression is very sensitive to outliers.
- OLS of ranks with dummy and slack variables
  The authors of the method (Evgeniou, Boussios and Zacharia, 2005) claim that "While theoretically questionable ... it produces almost the same results as non-metric techniques".
Non-metric methods (generally applied to MNL - Multinomial Logit Model with additive kernel)
- Pair-wise switching
  The number of wrong predictions for pair-wise preferences is minimized.
- ML-MLE - Mixed Logit Maximum Likelihood Estimation
  It is essentially the standard logit model with coefficients that vary in the population according to the preselected type of distribution.
- HB-MCMC - Hierarchical Bayes - Markov Chain Monte Carlo estimation technique.
  As aside
  - Based on the author's experience since 2002, HB-MCMC technique, with no doubt, is the best option.

Calibration of utilities

Utilities estimated from conjoint exercise reflect the preferences that respondents revealed through their answers to the stimuli in a study. While metric methods based on ratings or stated purchase likelihoods lead to stated utility values, utilities obtained from a DCM-based method are only relative measures of preferences. These "raw" utilities can be interpreted only as differences, e.g. in preference share simulations with shares summing to 100%.

In order to estimate measures specific for any simulated product concept such as stated acceptance, projected sales, competitive potential or other "absolute" measures, calibration questions outside the framework of the conjoint method must be asked and applied. Calibration values are usually purchase likelihood statements on a set of product profiles. The calibration is a regression-based modification of raw utilities to reflect the calibration answers.

A calibration procedure provides expected stated values. To estimate market derived values, market data would have to be known.

Hybrid methods

Dimensionless nature of DCM allows for combining data from several sources and estimation of the common parameters. Some data may come from standard MR questions converted to a DCM format. This gives a way to various hybrid methods.

Additional data may come from questions designed to obtain preferences within separate groups of aspects such as levels of a single attribute. The data can be used as "soft constraints" for the data from a CBC exercise thus allowing for improving the estimates. When an exercise known as the Best-Worst Case 2 [Louviere et al., 1995] is added levels of all attributes can be put on the same scale. This approach is known as CSDCA - Common Scale Discrete Choice Analysis.

The questions on preferences of attribute levels may classify the levels as acceptable and unacceptable thus providing prior estimates of acceptability thresholds for the attributes. The thresholds can be used in a non-compensatory model for perceived acceptability of simulated products (the link is under preparation).

Interpretation of utilities

In the ideal case the computed part-worths contain the comprehensive information obtained from the DCM model. No wonder many conjoint analyses end with computation and presentation of part-worths or their simple transformations. There are still several more methods of utility interpretation and presentation.

A robust and attractive feature, and often the main goal of a conjoint study, is running and exploring what-if simulations with optional scenarios of interest.
Quantitative attributes allow for comparison of their elasticity of substitution. This is especially useful for prices for which elasticities and cross-elasticities can be obtained.
Products of interest do not live alone in the market. Competitive potential of a single product or a group of products can be estimated.
Similarly, an optimal competitive price of a product can be computed as a weighted WTP (Willingness To Pay) value for a given environment of competing products.
TURF (Total Unduplicated Reach and Frequency) analysis of binary data (buy/don't buy) is quite common. The DCM Portfolio Optimization method is an analog of TURF that uses quantitative utility values instead of binary values.
Conjoint data can be used in segmentation analysis of various kinds. From a behavioral view, clear and unobstructed results can be obtained using LCA - Latent Class Analysis of the choices obtained from a conjoint exercise. LCA provides very clean hierarchically partitioned classes of subjects.

As aside
Some authors recommend segmentation based on the estimated part-worths. It seems that segmentation based directly on the behavioral responses (choices) should be preferred because computed utilities carry over the influences from the assumed model and part-worths estimation method.
Perceptional acceptability (perceptance) of attribute levels ranging from -100% (repulsion) to 100% (attraction), and comparable between attributes, is available when acceptability thresholds for attributes are known.