It's better to be safe than sorry.



Estimating sales cannibalization is especially important when a new product or an upgrade is to be introduced while the current product is still marketed. Cannibalization is usually estimated from the results of a CBC study even though the interactions between attributes are neglected. The reasons for restricting the analysis to main effects are well founded:
  • In most practical cases, the number of possible interactions is too high for them to be estimated at the individual level. Capturing the heterogeneity in the sample by estimating individual preferences is more important, and the overall efficacy is better than when aggregate interactions are estimated.
  • Both the number of items in choice sets and the number of tasks are usually too small to yield a sufficient number of choices among the relevant pairs of items. A satisfactory number would lead to an unacceptably long CBC exercise. On the other hand, a small number of items in choice sets suppresses the effect of the IIA condition, so the utility estimates are less biased. IIA mostly affects the simulation.
  • Some dedicated methods, such as CEA (Cross Effects Analysis), require a special design and special estimation methods. As the interactions are estimated for just a few pairs of levels of one attribute (usually the product) and all other interactions are neglected, some bias in the utility estimates is inevitable.

In the simplest case of two products, cannibalization is computed from the decrease in the preference share of the old product when the new product is added to the choice set. Cannibalization can be expressed as the percentage decrease of the old product's share or, additionally, as the percentage of the new product's share gained at the expense of the old product. Cannibalization estimated from main effects alone is always underestimated, since the multinomial logit model assumes that all items in the choice set are uncorrelated and that share ratios are constant. The estimated substitution is therefore incomplete. In fact, items that cannibalize one another are correlated in most cases.
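The two percentages described above can be sketched in a few lines of Python. This is only an illustration under the multinomial logit assumption: the utilities are hypothetical values for a single respondent, and the function names `mnl_shares` and `cannibalization` are chosen here for the example.

```python
import math

def mnl_shares(utilities):
    """Multinomial logit preference shares for a dict {item: utility}."""
    top = max(utilities.values())  # subtract the maximum for numerical stability
    weights = {item: math.exp(u - top) for item, u in utilities.items()}
    total = sum(weights.values())
    return {item: w / total for item, w in weights.items()}

def cannibalization(old_before, old_after, new_share):
    """Return (loss as % of the old share, loss as % of the new product's share)."""
    loss = old_before - old_after
    return 100.0 * loss / old_before, 100.0 * loss / new_share

# Hypothetical utilities of one respondent (not taken from any study).
before = mnl_shares({"old": 1.0, "competitor": 1.5})
after = mnl_shares({"old": 1.0, "new": 0.2, "competitor": 1.5})

loss_pct, gain_pct = cannibalization(before["old"], after["old"], after["new"])
print(f"The old product loses {loss_pct:.2f} % of its share")
print(f"{gain_pct:.2f} % of the new product's share comes from the old product")
```

For a single utility vector, the relative loss of the old product equals the share of the new product, because the multinomial logit model draws share from all existing items in proportion to their shares. In a real study the shares are averaged over respondents, so the aggregate figures need not follow this proportionality exactly.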

With the current standard design and analysis tools for CBC, we can obtain good estimates of main effects, i.e. attribute level part-worths of the products, but not of the correlations among the products for each respondent. However, if we assume full correlation between a pair of products, we can estimate the highest possible cannibalization in the pair from the known utility values. Such an estimate could be helpful in managerial decisions, since the actual cannibalization will very likely not exceed the estimated value.


The nested logit model is an extension of the multinomial logit model that takes account of correlations between items. In a two-level nesting model, items are partitioned into K nests, each with the structural parameter λk of the nest k. If λk = 0, the items in the nest are perfectly correlated. If λk = 1, the items are uncorrelated.

Probability Pik (eq. 1) of choosing item i with utility Vi|k from nest k is the product of two probabilities: Pi|k (eq. 2), the conditional probability of choosing item i from the (isolated) nest k, and Pk (eq. 3), the probability of choosing nest k with expected utility Εk (eq. 4). The probabilities and expected utilities depend on the items occupying the nests.

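\[ P_{ik} = P_{i|k} \times P_k \tag{1} \]

\[ P_{i|k} = \frac{\exp(V_{i|k} / \lambda_k)}{\sum_{j \in k} \exp(V_{j|k} / \lambda_k)} \tag{2} \]

\[ P_k = \frac{\exp(E_k)}{\sum_{l=1}^{K} \exp(E_l)} \tag{3} \]

\[ E_k = \lambda_k \times \ln \sum_{j \in k} \exp(V_{j|k} / \lambda_k) \tag{4} \]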
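Equations 1 to 4 can be turned into a small function. The following is a minimal sketch, not the implementation of any particular package: the function name `nested_logit_shares`, the item names, and the utilities are hypothetical, and λk must be strictly positive here because the utilities are divided by it.

```python
import math

def nested_logit_shares(nests, lambdas):
    """Two-level nested logit shares.

    nests   -- list of dicts {item: utility}, one dict per nest
    lambdas -- list of structural parameters, one per nest, each in (0, 1]
    """
    conditional = []  # P(i|k) for every nest, eq. 2
    expected = []     # E_k for every nest, eq. 4
    for items, lam in zip(nests, lambdas):
        if lam <= 0.0:
            raise ValueError("lambda must be positive (eq. 2 divides by it)")
        scaled = {i: v / lam for i, v in items.items()}
        top = max(scaled.values())  # stabilizes the exponentials
        weights = {i: math.exp(s - top) for i, s in scaled.items()}
        total = sum(weights.values())
        conditional.append({i: w / total for i, w in weights.items()})
        expected.append(lam * (top + math.log(total)))

    top = max(expected)
    nest_weights = [math.exp(e - top) for e in expected]
    nest_total = sum(nest_weights)

    shares = {}
    for cond, w in zip(conditional, nest_weights):
        p_nest = w / nest_total          # eq. 3
        for item, p in cond.items():
            shares[item] = p * p_nest    # eq. 1
    return shares

# Hypothetical utilities: CAN and PET share a nest, two other items do not.
nests = [{"can": 1.0, "pet": 0.2}, {"other_1": 0.8, "other_2": 0.3}]
shares_uncorrelated = nested_logit_shares(nests, [1.0, 1.0])
shares_correlated = nested_logit_shares(nests, [0.5, 1.0])
print(shares_uncorrelated)
print(shares_correlated)
```

With all λk = 1 the result coincides with the plain multinomial logit shares. In this example, lowering λ1 to 0.5 reduces the combined share of the CAN and PET nest, which is the stronger mutual substitution the nesting is meant to capture.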

To estimate the highest possible cannibalization between a pair of products, only K = 2 nests need to be considered. One of them (k = 1, λ1 = 0) contains the pair of perfectly correlated items, and the other (k = 2, λ2 = 1) contains all other, supposedly uncorrelated, items. It can be shown that, as λ1→0, the expected utility of the nest approaches the von Neumann-Morgenstern utility:

\[ E_k \big|_{\lambda_k \to 0} = \sum_{j \in k} P_{j|k} \times V_{j|k} \tag{5} \]

The expression for Pi|k (eq. 2) is problematic because it involves division by zero when λk = 0. In line with the practice implemented in the major software packages Stata, NLOGIT and SAS, the lower-level model (eq. 2) is treated as multinomial. The respective values of λk are omitted (set to 1) when the Pi|k values are computed directly by eq. 2, but they are retained in eq. 4.

The five equations above allow an analyst to compute the fraction of a product's preference share that is lost to another product added to the choice set, for two special cases:
 - The considered products are uncorrelated (λk→1). The estimated cannibalization is the minimal possible.
 - The considered products are fully correlated (λk→0). The estimated cannibalization is the maximal possible.
The CBC study should cover a substantial portion of the possible substitutes. Choice sets should either be small or, better, should avoid showing products suspected of being correlated together.
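The two special cases can be sketched as follows. The code follows the convention described in the text: conditional shares inside the nest are computed with λ omitted (eq. 2 with λ = 1), and the utility of the fully correlated nest is taken from eq. 5. The three respondents, their utilities, and all function names are hypothetical and serve only to illustrate the calculation; shares are averaged over respondents.

```python
import math

def mnl_shares(utils):
    """Uncorrelated case (all lambda = 1): plain multinomial logit shares."""
    top = max(utils.values())
    weights = {k: math.exp(v - top) for k, v in utils.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

def full_correlation_shares(utils, pair):
    """Shares when the two items in `pair` form a nest with lambda -> 0."""
    a, b = pair
    cond = mnl_shares({a: utils[a], b: utils[b]})            # eq. 2, lambda omitted
    nest_utility = cond[a] * utils[a] + cond[b] * utils[b]   # eq. 5
    # The other nest has lambda = 1, so exp(E_2) is the sum of exp(V) of its
    # items; the pair's nest can therefore enter one logit alongside them.
    upper = {k: v for k, v in utils.items() if k not in pair}
    upper["_nest"] = nest_utility
    shares = mnl_shares(upper)                               # eq. 3
    p_nest = shares.pop("_nest")
    shares[a] = cond[a] * p_nest                             # eq. 1
    shares[b] = cond[b] * p_nest
    return shares

def mean_share(respondents, item, share_fn):
    """Average share of `item` over respondents."""
    return sum(share_fn(r)[item] for r in respondents) / len(respondents)

# Hypothetical individual utilities (not from any study).
respondents = [
    {"can": 1.0, "pet": 0.2, "other": 0.8},
    {"can": 0.1, "pet": 1.2, "other": 0.5},
    {"can": 0.6, "pet": 0.5, "other": 1.4},
]
without_pet = [{k: v for k, v in r.items() if k != "pet"} for r in respondents]

can_alone = mean_share(without_pet, "can", mnl_shares)
can_uncorr = mean_share(respondents, "can", mnl_shares)
can_fullcorr = mean_share(
    respondents, "can", lambda r: full_correlation_shares(r, ("can", "pet"))
)

loss_min = 100.0 * (can_alone - can_uncorr) / can_alone
loss_max = 100.0 * (can_alone - can_fullcorr) / can_alone
print(f"Cannibalization of CAN: between {loss_min:.2f} % and {loss_max:.2f} %")
```

Under this convention the nest utility is a share-weighted mean of the two utilities, which never exceeds the log-sum used in the uncorrelated case, so the full-correlation estimate of the loss is the larger of the two.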




Example: Non-alcoholic beverages

The client, a beverage company, produces a well-accepted drink delivered in 0.5L cans. The company decided to extend its portfolio with the same product in 0.5L PET bottles. A bottle can be resealed after partial consumption and may thus attract additional consumers. The client wanted to know, among other things, how much of the current can sales might be cannibalized.

The CBC study was carried out in two countries, A and B, with 600 users in each country. The direct competition was represented by 7 products assumed to have a similar purpose of consumption. The interviewing format was a subset of CSDCA (Common Scale Discrete Choice Analysis). Priors for the brands were obtained from an SCE (Sequential Choice Exercise) and used as soft constraints in the final utility estimation. A randomized Gabor-Granger elimination of 16 discrete prices was used to obtain priors for the price thresholds. The CBC section had 9 tasks with 5 brands at 6 price levels designed independently for each tested brand.

The results of cannibalization estimation are in the table below.

A flavored sparkling non-alcoholic beverage

                                        Country A               Country B
Correlation between CAN and PET         None       Full         None       Full
Share of CAN, PET absent                27.10 %    27.10 %      12.45 %    12.45 %
Share of CAN, both present              24.10 %    23.58 %      10.08 %     9.70 %
Share of PET, both present               7.29 %     6.87 %       7.27 %     6.90 %
Combined share                          31.40 %    30.45 %      17.35 %    16.60 %
Cannibalization of CAN                  11.06 %    12.98 %      19.00 %    22.04 %
Gain of PET by cannibalizing CAN        41.11 %    51.22 %      32.53 %    39.77 %

Estimated preference shares of CAN and PET packages with no or full correlation.
Cannibalization of CAN package due to introduction of PET bottle at market prices.
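The two cannibalization rows of the table can be recomputed from the share rows. The sketch below assumes that the loss of CAN is expressed relative to the CAN share with PET absent, and that the gain of PET is the same loss expressed relative to the PET share. The recomputed values agree with the table to within about 0.1 percentage point, presumably because the published shares are rounded.

```python
# (CAN share with PET absent, CAN share with both present, PET share), in %
table = {
    ("A", "none"): (27.10, 24.10, 7.29),
    ("A", "full"): (27.10, 23.58, 6.87),
    ("B", "none"): (12.45, 10.08, 7.27),
    ("B", "full"): (12.45, 9.70, 6.90),
}

def cannibalization(can_alone, can_both, pet):
    """Return (loss as % of the CAN share, loss as % of the PET share)."""
    loss = can_alone - can_both
    return 100.0 * loss / can_alone, 100.0 * loss / pet

results = {key: cannibalization(*row) for key, row in table.items()}

for (country, correlation), (loss_pct, gain_pct) in results.items():
    print(
        f"Country {country}, correlation {correlation}: "
        f"CAN loses {loss_pct:.2f} % of its share, "
        f"{gain_pct:.2f} % of the PET share comes from CAN"
    )
```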

In both countries, adding PET to the portfolio increases the combined share less when full correlation is assumed (30.45 % vs. 31.40 % in country A, 16.60 % vs. 17.35 % in country B). The difference is small and does not seem important. A different view is provided by the gain of the new product at the expense of the old one. Under full correlation, about 51.22 % of the PET share in country A and 39.77 % in country B comes from the share lost by the old product. The lower value in country B is probably due to PET taking share from a PET product of another brand.

A loss of CAN sales not exceeding 51 % of PET sales is good news for the producer. The expectation that PET will acquire new customers is promising.