CBC-based Cannibalization Estimation


It's better to be safe than sorry.

Problem

Estimating sales cannibalization is important, especially when a new product or an upgrade is to be introduced while the current product is still on the market. Cannibalization is usually estimated from the results of a CBC study even though the interactions between attributes are neglected. There are sound reasons for restricting the analysis to main effects:

-	The number of possible interactions is, in most practical cases, too high for them to be estimated at the individual level. Capturing the heterogeneity of the sample by estimating individual preferences is more important, and the overall efficacy is better than when aggregate interactions are estimated.
-	Both the number of items in a choice set and the number of tasks are usually too small to yield a sufficient number of choices among the relevant pairs of items. A satisfactory number would lead to an unacceptably long CBC exercise. On the other hand, a small number of items in choice sets suppresses the influence of the IIA (independence of irrelevant alternatives) condition, so that utility estimates are less biased. IIA mostly affects the simulation.
-	Dedicated methods such as CEA - Cross Effects Analysis require a special design and special estimation methods. As the interactions are estimated for just several pairs of levels of one attribute (usually the product) and all other interactions are neglected, some bias in the utility estimates is inevitable.

In the simplest case of two products, cannibalization is computed from the decrease in the preference share of the old product when another (the new) product is added to the choice set. Cannibalization can be expressed as the percentage decrease of the old product's share or, additionally, as the percentage of the new product's share gained at the expense of the old product. Cannibalization estimates based on main effects are always underestimated, since the multinomial logit model assumes that all items in the choice set are uncorrelated and that the share ratios are constant. The estimated substitution is therefore incomplete. In fact, items that cannibalize one another are correlated in most cases.
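The constant share ratios mentioned above can be illustrated with a minimal sketch for a single respondent. The utilities and item names below are hypothetical and not taken from any study. Under the multinomial logit model every existing item loses the same fraction of its share when a new item is added, whether or not it is a close substitute.

```python
import math

def mnl_shares(utilities):
    """Multinomial logit shares for a dict of item -> utility."""
    top = max(utilities.values())          # subtract the maximum for numerical stability
    expu = {k: math.exp(v - top) for k, v in utilities.items()}
    total = sum(expu.values())
    return {k: e / total for k, e in expu.items()}

# Hypothetical utilities of one respondent (illustration only).
before = mnl_shares({"old": 1.0, "rival_1": 0.5, "rival_2": 0.0})
after = mnl_shares({"old": 1.0, "rival_1": 0.5, "rival_2": 0.0, "new": 0.8})

lost_by_old = before["old"] - after["old"]
loss_old = lost_by_old / before["old"]        # cannibalization as a fraction of the old share
gain_from_old = lost_by_old / after["new"]    # fraction of the new share taken from the old product
loss_rival = (before["rival_1"] - after["rival_1"]) / before["rival_1"]

print(f"old product loses {loss_old:.1%}, rival_1 loses {loss_rival:.1%}")
print(f"{gain_from_old:.1%} of the new product's share comes from the old product")
```

In this sketch the relative loss of the old product equals that of an unrelated rival, which is why a main-effects simulation cannot show stronger substitution within a correlated pair.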

An aside

Correlation of items does not necessarily mean that the items have similar utilities. The items must be similar in sharing some properties that cause them to be selected over other items lacking those properties. For example, rice and beans are both cheap basic foods, so they are possible substitutes. As their usage is different, they are considered uncorrelated (in fact, they may be complements in poor societies). When either of them is available both in bulk and packed, the bulk and packed formats can be considered correlated, even if the utilities of the same quantities differ.
This view implies that the various methods of "correction for similarity" based on product utilities, found in some preference share simulators, will fail.

With the current state of standard design and analysis tools for CBC, we can obtain good estimates of main effects, i.e. attribute level part-worths of the products, but not the correlations among the products for each respondent. However, if we assume full correlation between a pair of products, we can estimate the highest possible cannibalization in the pair using the known utility values. Such an estimate can be helpful in managerial decisions, since the actual cannibalization is very unlikely to exceed the estimated value.

Solution

The nested logit model is an extension of the multinomial logit model that takes account of correlations between items. In a two-level nesting model, items are partitioned into K nests, each with the structural parameter λ_k of the nest k. If λ_k = 0, the items in the nest are perfectly correlated. If λ_k = 1, the items are uncorrelated.

An aside

The structural parameter λ_k of nest k is sometimes called the index of dissimilarity of the items contained in the nest.

The probability P_ik (eq. 1) of choosing item i with utility V_i|k from nest k is the product of two probabilities. P_i|k (eq. 2) is the conditional probability of choosing item i from the (isolated) nest k, and P_k (eq. 3) is the probability of choosing nest k with expected utility Ε_k (eq. 4). The probabilities and expected values depend on the items occupying the nests.

$$P_{ik} = P_{i|k} \times P_k \tag{1}$$

$$P_{i|k} = \frac{\exp(V_{i|k} / \lambda_k)}{\sum_{j=1}^{I_k} \exp(V_{j|k} / \lambda_k)} \tag{2}$$

$$P_k = \frac{\exp(E_k)}{\sum_{l=1}^{K} \exp(E_l)} \tag{3}$$

$$E_k = \lambda_k \times \ln\left( \sum_{i=1}^{I_k} \exp(V_{i|k} / \lambda_k) \right) \tag{4}$$
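Equations 1 to 4 can be sketched in a few lines of code. The function name, the data layout (one dict of utilities per nest) and the example values are assumptions made for illustration. The sketch requires every λ_k to be strictly positive, so it does not cover the limiting case λ_k = 0.

```python
import math

def nested_logit(nests, lambdas):
    """Two-level nested logit choice probabilities.

    nests   : list of dicts, item -> utility V_i|k (hypothetical layout)
    lambdas : list of structural parameters, 0 < lambda_k <= 1
    Returns a dict item -> P_ik.
    """
    conditional, expected = [], []
    for items, lam in zip(nests, lambdas):
        scaled = {i: v / lam for i, v in items.items()}
        top = max(scaled.values())                     # for numerical stability
        expv = {i: math.exp(s - top) for i, s in scaled.items()}
        total = sum(expv.values())
        conditional.append({i: e / total for i, e in expv.items()})   # eq. 2
        expected.append(lam * (top + math.log(total)))                # eq. 4
    top_e = max(expected)
    nest_exp = [math.exp(e - top_e) for e in expected]
    nest_total = sum(nest_exp)
    probs = {}
    for cond, ne in zip(conditional, nest_exp):
        p_nest = ne / nest_total                                      # eq. 3
        for item, p in cond.items():
            probs[item] = p * p_nest                                  # eq. 1
    return probs

# Hypothetical utilities: nest 1 holds two similar items, nest 2 the rest.
example = [{"a": 1.0, "b": 0.5}, {"c": 0.0, "d": -0.3}]
print(nested_logit(example, [1.0, 1.0]))   # identical to a multinomial logit
print(nested_logit(example, [0.5, 1.0]))   # items a and b partly correlated
```

Lowering λ of the first nest reduces its expected utility and therefore the combined share of items a and b, while the split between a and b inside the nest becomes sharper.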

An aside

The probabilities P_i|k and P_k have the same form as the choice probabilities of a multinomial logit model. The expression for P_i|k is known as the lower-level model, and the expression for P_k as the upper-level model.
When all items are uncorrelated, i.e. all λ_k = 1, the model collapses to a multinomial model. The expected value of a nest becomes the "inclusive value" or "log-sum value".
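The collapse to the multinomial model can be checked directly from equations 1 to 4. With λ_k = 1 the expected value of a nest is the log-sum of its utilities, so its exponential cancels the denominator of the lower-level model (the summation indices j and l are introduced here only for clarity):

$$E_k = \ln \sum_{j=1}^{I_k} \exp(V_{j|k}) \quad\Rightarrow\quad \exp(E_k) = \sum_{j=1}^{I_k} \exp(V_{j|k})$$

$$P_{ik} = \frac{\exp(V_{i|k})}{\sum_{j=1}^{I_k} \exp(V_{j|k})} \times \frac{\sum_{j=1}^{I_k} \exp(V_{j|k})}{\sum_{l=1}^{K} \sum_{j=1}^{I_l} \exp(V_{j|l})} = \frac{\exp(V_{i|k})}{\sum_{l=1}^{K} \sum_{j=1}^{I_l} \exp(V_{j|l})}$$

The result is the multinomial logit probability over all items, regardless of how they are partitioned into nests.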

To estimate the highest possible cannibalization between a pair of products, only K = 2 nests need to be considered. One of them (k = 1, λ₁ = 0) contains the pair of perfectly correlated items, and the other (k = 2, λ₂ = 1) contains all other items, which are assumed to be uncorrelated. It can be shown that, as λ₁→0, the expected value approaches the Von Neumann-Morgenstern utility:

$$E_k \Big|_{\lambda_k \to 0} = \sum_{i=1}^{I_k} P_{i|k} \times V_{i|k} \tag{5}$$

The expression for P_i|k (eq. 2) is problematic because it contains a division by zero when λ_k = 0. In agreement with the practice implemented in the major software packages Stata, NLOGIT and SAS, the lower-level model (eq. 2) is treated as multinomial. The respective values of λ_k are omitted (set to 1) in the direct computation of the P_i|k values by eq. 2, but not in eq. 4 in their estimation.

An aside

The likelihood function maximized in nested model estimation is not concave over the whole parameter space. Setting all λ_k = 1 in the lowest-level nests alleviates the problem and facilitates both the estimation and the interpretation of utilities, as they become comparable between nests. More information can be found in the documentation of the software packages mentioned above.
To understand the Von Neumann-Morgenstern utility, imagine a nest composed of two products that are identical from the user's point of view, with utilities V₁ = V₂ = V. Given that one of them is chosen, a person will choose either item with the same probability 1/2. The utility of the nest is 1/2×V₁ + 1/2×V₂ = V. Such an extension of a nest, e.g. of a product line, does not increase sales; this is known as the "red/blue bus" problem. However, if the products are dissimilar, that is, uncorrelated with λ_k→1, the expected utility of the nest approaches the inclusive value, and the joint probability of choosing from the nest is nearly doubled.
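The two-product illustration can be put into numbers. The utility values below are hypothetical, and the rest of the market is reduced to a single outside alternative to keep the sketch short.

```python
import math

def nest_share(nest_utility, outside_utility):
    """Upper-level probability (eq. 3) of a nest against one outside alternative."""
    a, b = math.exp(nest_utility), math.exp(outside_utility)
    return a / (a + b)

V = 0.0           # hypothetical common utility of the two products
V_OUTSIDE = 3.0   # hypothetical utility of the competing alternative

single = nest_share(V, V_OUTSIDE)                       # only one of the products offered
identical = nest_share(0.5 * V + 0.5 * V, V_OUTSIDE)    # full correlation: nest utility stays V
dissimilar = nest_share(math.log(2 * math.exp(V)), V_OUTSIDE)  # no correlation: V + ln 2

print(f"single product:         {single:.4f}")
print(f"two identical products: {identical:.4f}")
print(f"two dissimilar products:{dissimilar:.4f} ({dissimilar / single:.2f} times the single share)")
```

The factor stays below 2 because the outside alternative keeps part of the market; it gets closer to 2 as the share of the nest becomes smaller.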

The five equations above allow an analyst to compute the fraction of a product's preference share that is lost to another product added to the choice set, for two special cases:
-	The considered products are uncorrelated (`λ_k`→1). The estimated cannibalization is the minimal possible.
-	The considered products are fully correlated (`λ_k`→0). The estimated cannibalization is the maximal possible.
The CBC study should cover a substantial portion of the possible substitutes. Choice sets should either be small or, better, products suspected of being correlated should not appear together in a choice set.
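The two special cases can be sketched for one respondent as follows. The function names and utilities are hypothetical. The lower-level model is computed with λ set to 1, as described above, the pair nest uses eq. 5 for full correlation and eq. 4 otherwise, and all remaining items form a second nest with λ₂ = 1.

```python
import math

def pair_shares(v_old, v_new, v_others, fully_correlated):
    """Shares of the old and the new product for one respondent."""
    e_old, e_new = math.exp(v_old), math.exp(v_new)
    p_old_in_nest = e_old / (e_old + e_new)            # eq. 2 with lambda set to 1
    p_new_in_nest = 1.0 - p_old_in_nest
    if fully_correlated:
        e_pair = p_old_in_nest * v_old + p_new_in_nest * v_new   # eq. 5
    else:
        e_pair = math.log(e_old + e_new)                         # eq. 4, lambda_1 = 1
    e_rest = math.log(sum(math.exp(v) for v in v_others))        # eq. 4, lambda_2 = 1
    p_pair = math.exp(e_pair) / (math.exp(e_pair) + math.exp(e_rest))  # eq. 3
    return p_pair * p_old_in_nest, p_pair * p_new_in_nest        # eq. 1

def old_share_alone(v_old, v_others):
    """Share of the old product before the new one is introduced."""
    e_old = math.exp(v_old)
    return e_old / (e_old + sum(math.exp(v) for v in v_others))

# Hypothetical utilities of one respondent (illustration only).
v_old, v_new, v_others = 1.0, 0.6, [0.8, 0.2, -0.1]

base = old_share_alone(v_old, v_others)
old_min, new_min = pair_shares(v_old, v_new, v_others, fully_correlated=False)
old_max, new_max = pair_shares(v_old, v_new, v_others, fully_correlated=True)

cannib_min = (base - old_min) / base   # uncorrelated pair: lower bound
cannib_max = (base - old_max) / base   # fully correlated pair: upper bound
print(f"cannibalization between {cannib_min:.1%} and {cannib_max:.1%}")
```

With individual-level utilities, the shares would presumably be averaged over respondents before the cannibalization ratios are taken; that step is left out of the sketch.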

Properties

Strengths

The method relies on the results of a CBC study based on the multinomial model commonly used in market research.
The method has a sound theoretical justification.

Weaknesses

Only pairwise cannibalization estimates can be considered as reasonable. Fully correlated ternary (and larger) sets would be overly speculative.
Only minimal and maximal cannibalization estimates are obtained. The real values are assumed to lie between them.

Example: Non-alcoholic beverages

The client, a beverage company, produces a well-accepted drink sold in 0.5L cans. The company decided to extend its portfolio with the same product in 0.5L PET bottles. A bottle can be resealed after partial consumption and may thus attract some additional consumers. The client wanted to know, among other things, how much of the current can sales might be cannibalized.

The CBC study was carried out in two countries, A and B, with 600 users in each country. The direct competition was represented with 7 products supposed to have a similar purpose of consumption. The interviewing format was a subset of CSDCA - Common Scale Discrete Choice Analysis. Priors for the brands were obtained from an SCE - Sequential Choice Exercise and used as soft constraints in the final utility estimation. A randomized Gabor-Granger elimination of 16 discrete prices was used to get priors of price thresholds. The CBC section had 9 tasks with 5 brands on 6 price levels designed independently for each tested brand.

The results of cannibalization estimation are in the table below.

**A flavored sparkling non-alcoholic beverage**
| | Country A, no correlation between CAN and PET | Country A, full correlation between CAN and PET | Country B, no correlation between CAN and PET | Country B, full correlation between CAN and PET |
|---|---|---|---|---|
| Share of CAN, PET absent | 27.10 % | 27.10 % | 12.45 % | 12.45 % |
| Share of CAN, both present | 24.10 % | 23.58 % | 10.08 % | 9.70 % |
| Share of PET, both present | 7.29 % | 6.87 % | 7.27 % | 6.90 % |
| CAN + PET combined share | 31.40 % | 30.45 % | 17.35 % | 16.60 % |
| Cannibalization of CAN | 11.06 % | 12.98 % | 19.00 % | 22.04 % |
| Gain of PET by cannibalizing CAN | 41.11 % | 51.22 % | 32.53 % | 39.77 % |

Estimated preference shares of CAN and PET packages with no or full correlation.
Cannibalization of CAN package due to introduction of PET bottle at market prices.
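The last two rows of the table follow from the share rows. A short sketch of the arithmetic, using the rounded shares as printed, is given below; the function name and the dictionary layout are illustrative. Because the inputs are rounded, the recomputed percentages differ from the published ones by less than 0.1 percentage point.

```python
def cannibalization_rows(can_alone, can_with_pet, pet_share):
    """Return (cannibalization of CAN, gain of PET from CAN), both in percent."""
    lost = can_alone - can_with_pet
    return 100 * lost / can_alone, 100 * lost / pet_share

# Shares in percent, copied from the table: (CAN alone, CAN with PET, PET).
shares = {
    ("A", "no correlation"): (27.10, 24.10, 7.29),
    ("A", "full correlation"): (27.10, 23.58, 6.87),
    ("B", "no correlation"): (12.45, 10.08, 7.27),
    ("B", "full correlation"): (12.45, 9.70, 6.90),
}

results = {key: cannibalization_rows(*values) for key, values in shares.items()}
for (country, case), (cannib, gain) in results.items():
    print(f"Country {country}, {case}: cannibalization {cannib:.2f} %, gain of PET {gain:.2f} %")
```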

In both countries, when PET is added to the portfolio, the increase in the combined share is lower when full correlation is assumed (30.45 % vs. 31.40 % in country A, 16.60 % vs. 17.35 % in country B). The difference is small and does not seem important. A different view is provided by the gain of the new product at the expense of the old one. Under full correlation, about 51.22 % of the PET share in country A and 39.77 % in country B comes from the loss of the old product's share. The lower value in country B is probably due to PET gaining share from a PET product of another brand.

A loss of CAN sales not exceeding 51 % of PET sales is good news for the producer. The expectation that PET will acquire new customers is promising.