SCE - Sequential Choice Exercise

Problem

Concept testing in which concepts are presented in random order and evaluated one by one is not very efficient. Frequent ties and inconsistencies in evaluation are the major disadvantages. Evaluations often drift toward lower acceptance and weaker discrimination between the concepts offered later. The reliability of answers decreases as respondents' fatigue and annoyance with the interview increase.

A remedy has been sought in various flavors of concept sorting. "Sort conjoint", based on ranking of concepts, was popular some decades ago. The Q-Sort method is experiencing a renaissance. While efficiency could be improved, some problems appeared.

When sorting a number of items in place, respondents tend to give more weight to the general benefits and advantages they see than to their personal needs, expectations, affordability or other personal values. When the items remain visible during the sorting, the respondent may get the impression that the final decision on the actual choice can be made later.

Solution

The method was inspired by the VCT - Virtual Concept Test [Dahan, E., and Hauser, J.R. (2002). The Virtual Customer. Journal of Product Innovation Management, 19, 332-353]. Experimentally, VCT is a sequential choice method without replacement and therefore does not suffer from the drawbacks inherent to in-place sorting. Such an arrangement makes it possible to achieve an "out of sight, out of mind" state for previous choices from the set, and makes the choices more realistic.

In SCE, the original monotone OLS estimation of ranks has been replaced with hierarchical Bayesian estimation of part-worths from a number of custom-generated choice sets.

As an aside

A rigorous solution for obtaining aggregated utility estimates from sequential choice data was given by Louviere and Woodworth (1983) [Louviere, Jordan J., and Woodworth, G. (1983). Journal of Marketing Research, 20(11), 350-367]. A review of various approaches is given in Greene and Hensher (2008).
The individual-based sequential choice problem is usually solved using the rank-explosion rule [Chapman, R.G., and Staelin, R. (1982). Exploiting Rank Ordered Choice Set Data Within the Stochastic Utility Model. Journal of Marketing Research, 19, 288-301]. The rule requires that the error term in the random utility difference is constant for each subsequent choice. This is approximately true for sequential choices from a set whose cardinality is an order of magnitude higher than the number of choices. In a situation typical for interviewing, the number of items is relatively small and the requirement is not satisfied. Several points have to be considered.
- As the amount of information obtained from each consecutive choice decreases, so does the inherent scaling factor. The estimated differences between utilities obtained from the earlier choices are therefore larger than those from the later choices.
- The exploded choices are stochastically independent, but the design matrix for the combined choice tasks is not balanced with respect to pairwise combinations of items.
- The common consequence is the bias known from the "second choice" CBC method. E.g., Sawtooth Software, Inc. claims that "The utility of the best level within each attribute is biased downward for second choice utilities". Sawtooth has removed the option of multiple sequential choices that was available in the early version of its CBC software.
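
The rank-explosion rule discussed above can be illustrated with a minimal sketch. It only shows how a ranking is decomposed into successively smaller choice tasks; the function name `explode_ranking` and the item labels are hypothetical, and no estimation step is included.

```python
def explode_ranking(ranked, unranked=()):
    """Decompose a ranking (best item first) into (chosen, choice_set) tasks.

    Each chosen item is removed from the set before the next task,
    so every subsequent choice is made from a smaller set.
    """
    remaining = list(ranked) + list(unranked)
    tasks = []
    for chosen in ranked:
        if len(remaining) < 2:
            break  # a single remaining item carries no choice information
        tasks.append((chosen, tuple(remaining)))
        remaining.remove(chosen)
    return tasks


# Three items ranked, two items left unranked (illustrative labels).
tasks = explode_ranking(["A", "B", "C"], unranked=["D", "E"])
for chosen, choice_set in tasks:
    print(chosen, "chosen from", choice_set)
```

The shrinking choice sets in the output are the source of the decreasing scaling factor and of the unbalanced pairwise design mentioned in the points above.
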
Another solution is presented in the paper Common Scale Hybrid Discrete Choice Analysis [Orme, Bryan (2013). Common Scale Hybrid Discrete Choice Analysis. Sawtooth Software Technical Papers, Sawtooth Software, Inc.]. A set of choice sets, each consisting of two items, is constructed by combining the best choice with the second best one, the second best choice with the third best one, etc. The last remaining item (the worst one) is not combined with the first choice, as this combination would contribute only a negligible amount of information. If the ordering is not complete, the last chosen item is combined with each of the remaining (not chosen) items.
The method leads to fully orthogonal choice sets allowing highly reliable estimates. The number of choice sets entering estimation is given by the number of rankings, and cannot be modified to conform to the number of other choice sets in a study.
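
The pairing scheme described above can be sketched as follows. This is an illustration of the construction as worded in this text, not code from the cited paper; the function name `paired_choice_sets` and the item labels are hypothetical.

```python
def paired_choice_sets(ranked, unranked=()):
    """Build two-item choice sets as (winner, loser) pairs.

    Adjacent ranks are paired: 1st with 2nd, 2nd with 3rd, and so on.
    If the ordering is incomplete, the last chosen item is additionally
    paired with every item that was not chosen.
    """
    pairs = [(ranked[i], ranked[i + 1]) for i in range(len(ranked) - 1)]
    if ranked:
        pairs += [(ranked[-1], item) for item in unranked]
    return pairs


# Complete ordering of four items: three pairs.
print(paired_choice_sets(["A", "B", "C", "D"]))
# Incomplete ordering: two items chosen, two left unchosen.
print(paired_choice_sets(["A", "B"], unranked=["C", "D"]))
```

The number of pairs follows directly from the number of rankings made, which is why it cannot be adjusted to match other choice sets in a study.
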
The solution used in the SCE method is based on the properties of permutohedrons, which provide a natural representation of ranking probability [Zhang, Jun (2004). Binary choice, subset choice, random utility, and ranking: A unified perspective using the permutohedron. Journal of Mathematical Psychology, 48, 107-134]. The underlying idea is that each choice made in the sequence removes the same amount of information entropy from the initial entropy of a choice from the set of all items that entered the test.
The ranking approach has both advantages and caveats.
- Advantages
  - The main advantage lies in the amount of information obtained. For example, sorting a set of N = 8 items is equivalent to 28 choices from all pairwise combinations of the 8 items, as comb(2 from 8) = 8!/(2!×(8-2)!) = 28. Say there were 6 unacceptable items excluded from the initially offered set of 14 items. This is as if there were a further 8 × 6 = 48 choices from paired combinations.
  - Both Orme's method and the SCE method completely remove the bias originating from the decreasing size of the choice set inherent to the rank-explosion rule.
- Disadvantages
  - A sequential choice does not take account of evaluation ties among the items.
  - Since the number of rankings differs between respondents, a different number of choice tasks must be generated for each respondent to avoid correlations of the tasks between respondents.
  - Only Bayesian parameter estimation is possible. The generated individual design plans are always supersaturated in parameters.
  - Interaction terms are not estimable.
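
The arithmetic in the example on the amount of information can be checked with a few lines. The variable names are illustrative only; the numbers are those used in the text (8 ranked items, 6 excluded items out of 14 offered).

```python
from math import comb

n_ranked = 8    # items sorted by the respondent
n_excluded = 6  # unacceptable items excluded from the 14 offered

# Implied paired comparisons among the ranked items: 8!/(2! * 6!)
pairs_within_ranking = comb(n_ranked, 2)

# Every ranked item is implicitly preferred over every excluded item.
pairs_ranked_vs_excluded = n_ranked * n_excluded

print(pairs_within_ranking)      # 28
print(pairs_ranked_vs_excluded)  # 48
```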

Properties

Ranking without ties

A number of consecutive choices from a set can provide more information in less time and with less effort than a number of single choices from randomized choice sets. It is known that the ranking of low-priority items is generally subject to gross error. It is therefore often possible to break off the ranking process at some point, or to limit the maximum number of sequential choices, without an excessive loss of information.

The prospect of a gradually simplified task prods a respondent to more reasonable decisions and statements. Each of the following choices takes less time since the decision maker has already formed an opinion about the items remaining in the choice set. The influence of the previous choices is decreased since they are no longer visible and are (at least partially) forgotten. The interviewing time and the exercise-induced fatigue of respondents are substantially lowered, which contributes to higher reliability of the data.

As an aside

Grouping of items by their acceptability, followed by sorting the items within the preselected groups, is typical for the Q-Sort ranking method. It has been applied in some metric methods, e.g. the proprietary "build conjoint" developed at Sofres, France.

The main results of an SCE analysis are the part-worths of the items, which can enter what-if simulations of the share type. If calibrated to become item utilities, they can be used in estimation of the stated acceptance and competitive potential of the concepts.

A unique advantage of SCE lies in its general DCM principle, which allows SCE data to be combined with data from other types of discrete choice, such as CBC, MaxDiff, single or multiple-choice batteries (of the type "select any that apply"), best-worst choice exercises, etc.

The only, but crucial, disadvantage of SCE is the limited number of items that can be ranked. As in any sorting method, this is due to the requirement that a respondent must read and remember all the tested items before making the first choice. When the items are very complicated (such as banking or insurance products), SCE is appropriate for only a few profiles. The limit for statements not exceeding about 15 words (say, a short line) is about 15 items. Good results have been obtained for 20 items, each described by 4 words at most.

As an aside

The number of items a respondent is able to choose from is a subject of discussion. It is believed that the interest, involvement and knowledge of the respondent, and low complexity of the items, are the most important factors.
Consider a shopping mall where a selection of many tens of (quite similar) items is available. If the most favored item is not available, a customer will usually make a different choice. Repeating this step (and using the rule of induction) yields the SCE method.

If the number of items is too high to be tested in a single SCE exercise, subsets of concepts can be generated from the master set. In a computer-controlled interview, every respondent can be presented with a different orthogonal and balanced subset of items. Alternatively, a preliminary piling of items into groups, as suggested in the Q-Sort method, can be used, and the piles or a subset of them are amenable to SCE exercises.

The low number of concepts the SCE exercise allows may be prohibitive if relatively subtle differences in concepts are of importance. In such a case either CBC - Choice Based Conjoint or MaxDiff - Maximum Difference Scaling should be the preferred method. In contrast to SCE, these methods require much simpler descriptions of concepts as each concept has to be shown to and evaluated by a respondent many more times.

Ranking with ties

The Q-Sort method is a typical example of a method leading to ranks with ties. The same is true for batteries of evaluation questions on a Likert scale when converted to ranks. The current methods of transforming such data for use in DCM rely on specially constructed likelihood terms to be maximized. The principle is a merger of two or more tied choices in the same choice set. Unfortunately, the method is not available in commercial estimation programs.

Using the choice set generation method developed for SCE, tied ranks can be taken into account by omitting those choice sets in which the item that would be chosen is tied with another item in the set. As the set of choice sets is near-orthogonal, the loss of information is proportional to the number of tied choices in the data. In this way the actual data weight for the respondent is reflected. This approach allows for DCM-based processing of data from Likert scale batteries (see an example below) or Q-Sort exercises, and merging them with genuine choice-based data such as CBC, MaxDiff, MBC, SCE, etc.
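
The omission of tied choice sets can be sketched as below. This is one reading of the rule, stated as an assumption: a choice set is kept only if its best-ranked item is unique within the set. The choice sets themselves are given by hand here; the actual SCE choice set generation is not reproduced, and the function name `resolve_choice_sets` is hypothetical.

```python
def resolve_choice_sets(ranks, choice_sets):
    """Determine the chosen item in each choice set from tied ranks.

    ranks: dict mapping item -> rank (1 = best, tied items share a rank).
    Returns (chosen, choice_set) tuples for sets with a unique best item;
    sets whose best rank is shared by several items are omitted.
    """
    kept = []
    for choice_set in choice_sets:
        best_rank = min(ranks[item] for item in choice_set)
        winners = [item for item in choice_set if ranks[item] == best_rank]
        if len(winners) == 1:
            kept.append((winners[0], tuple(choice_set)))
        # otherwise the choice is tied and the whole set is dropped
    return kept


# Items A and B are tied at the top (illustrative data).
ranks = {"A": 1, "B": 1, "C": 2, "D": 3}
sets = [("A", "C", "D"), ("A", "B", "D"), ("B", "C"), ("C", "D")]
kept = resolve_choice_sets(ranks, sets)
print(kept)  # the set ("A", "B", "D") is omitted because A and B are tied
```

A respondent with many ties thus contributes fewer choice sets, which is how the lower information content of tied data is reflected.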

As an aside

The method has proved valid even for data obtained from batteries of the type "check at most N items that apply", provided the total number of items is greater than 2N. However, applying it to data with a small number of checks is of no advantage, as the result is equivalent to frequency counts. The method may be of some advantage when the discrimination power should be enhanced with the Bayesian influence amplifying common traits among respondents.
The SCE method is useful for transforming data for a segmentation procedure. The detrimental equalities or outliers are removed or suppressed by the hierarchical Bayes estimation.

SCE interviewing method

Usage

The original purpose of a standalone SCE was estimation of the competitive potential of products or services in the presence of competing products, typically in a pre-launch study. The discrimination power appears to be clearly better than that obtained from evaluation of concepts shown at random. The improvement is comparable to that of a switch from a standard battery-based evaluation of items to the MaxDiff - Maximum Difference Scaling method.

If the items to be sorted are product concepts, they should be of a managerial type. They should reflect the expected demand and utilize the known trade-offs of the product attributes, typically performance and quality vs. price. The concepts should be provided by the producer or vendor, who has the advantage of intimate knowledge and expertise that a research agency seldom has. Each concept that enters the test is considered an independent entity with its own utility. In contrast to a conjoint study, there are no limitations imposed on the properties (attributes and their values) of the concepts.

SCE is a very general concept and method. It can be deployed anywhere a ranking is desirable. It has proved to be an alternative to MaxDiff in case the number of tested items is small, or the items are complicated, i.e. not easily comprehended, assessed and evaluated in a brief (and superficial) judgment.

Example: External validation

The comparison led to a conclusion that SCT gave a worse agreement with the market and higher error than CBC. In addition, respondents claimed higher annoyance and a temptation to prematurely finish the interview with SCT. This could be attributed to the request for making all possible 15 choices. From today's view and knowledge, 5 to 8 choices would do. However, there were still only 16 product profiles shown to each respondent in SCT compared to 105 product profiles in CBC.

SCE is particularly useful for testing (a relatively low number of) complicated and hard to evaluate profiles rather than for simple brand-price ones where CBC is no doubt the preferred method.

Examples

Ranks without ties

A full-fledged statistical verification of the SCE method would require a subsidized study. To present a visual proof, SCE data were simulated from the results of a MaxDiff study on the importance of 36 items related to a banking service. Every respondent from a sample of 870 made the best choices from 14 randomized sets of 5 items each. SCE data were obtained by simply sorting the obtained importance part-worths. The first 18 simulated choices were used in the SCE estimation. Items are labeled B01 to B36 by decreasing aggregated influence obtained from the MaxDiff study. The influences were computed as choice likelihoods from the full set of items, i.e. they sum to 100%.
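
The simulation step can be sketched for a single respondent as follows. The part-worth values are made up for illustration, the function names are hypothetical, and the multinomial logit form of the choice likelihoods is an assumption that the text does not state explicitly.

```python
import math


def simulate_sce_choices(part_worths, n_choices):
    """Sort items by decreasing part-worth and keep the first n_choices."""
    order = sorted(part_worths, key=part_worths.get, reverse=True)
    return order[:n_choices]


def choice_shares(part_worths):
    """Choice likelihoods from the full item set, in percent.

    Assumes a multinomial logit rule: share_i = exp(u_i) / sum_j exp(u_j).
    """
    top = max(part_worths.values())  # subtracted for numerical stability
    expu = {item: math.exp(u - top) for item, u in part_worths.items()}
    total = sum(expu.values())
    return {item: 100.0 * e / total for item, e in expu.items()}


# Made-up part-worths of one respondent for four items.
pw = {"B01": 1.2, "B02": 0.4, "B03": -0.3, "B04": -1.3}
print(simulate_sce_choices(pw, 2))
print(choice_shares(pw))
```

In the study described above the same sorting would be applied to each of the 870 respondents, keeping the first 18 of 36 items.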

The results for SCE are similar to those for MaxDiff, with the first reversal for item B15. This may be due to non-existent (i.e. zero) within-respondent covariances between items for rankings. For the same reason the scaling factor (steepness) of the estimated part-worths is slightly higher than in MaxDiff. These differences should have no detrimental effect on interpretation of results in practice.

Since an SCE experiment is easily implemented, a simpler method than building a special design and using hierarchical Bayes estimation has been sought. Based on the "look & feel" of typical results from MaxDiff and some theoretical assumptions, a simple finite computational procedure has been developed. Results for reduced numbers of sequential choices simulated from the above MaxDiff are in the picture below.

The simplified computation procedure with only 6 (simulated) choices from 36 items seems to give a quite satisfying result.

Ranks with ties

Direct ranking in a questionnaire can be used only for sets with a limited number of items. With many tens of items, a battery of questions asking respondents to evaluate items on a Likert scale is more appropriate. A DCM approach with hierarchical Bayes estimation can improve the results and their interpretation.

An employee attitude study for 1000 respondents was composed as a battery of 60 items organized in 8 blocks. Both the blocks and the items in each block were randomized to compensate for possible drifts in answers. The symmetrical Likert scale had 6 levels from 1 - strongly disagree to 6 - strongly agree. Respondents largely used the positive side of the scale.

So that the direct answers could be compared with the SCE tied rank approach, the stated values were decreased by a "neutral level" value of 3.5 common to all respondents, and then averaged over the sample. The resulting pattern should be similar to perception values used e.g. in the OBIMA method. The required reference "zero" item (a threshold) common to all respondents was assigned a formal rank between Likert levels 3 and 4: the reference was preferred over level 3, and level 4 was preferred over the reference. The results are shown below.
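
Both transformations of the Likert answers can be sketched as follows. The data, item names and function names are hypothetical. The reference item is given the score 3.5 only as a convenient way of placing it between levels 3 and 4; this is an assumption of the sketch, not a statement about the actual implementation.

```python
NEUTRAL = 3.5  # "neutral level" of the 6-point scale


def centered_means(answers):
    """Average of (Likert value - neutral level) over respondents.

    answers: list of dicts, one per respondent, mapping item -> value 1..6.
    """
    items = answers[0].keys()
    n = len(answers)
    return {i: sum(a[i] - NEUTRAL for a in answers) / n for i in items}


def tied_ranks_with_reference(answer, reference="REF"):
    """Convert one respondent's answers to tied ranks (1 = best).

    A reference item is added below level 4 and above level 3.
    """
    scores = dict(answer)
    scores[reference] = NEUTRAL
    levels = sorted(set(scores.values()), reverse=True)
    rank_of = {level: rank for rank, level in enumerate(levels, start=1)}
    return {item: rank_of[value] for item, value in scores.items()}


sample = [{"q1": 6, "q2": 3}, {"q1": 5, "q2": 4}]
print(centered_means(sample))
print(tied_ranks_with_reference({"q1": 6, "q2": 4, "q3": 3, "q4": 4}))
```

Items sharing a Likert level receive the same rank, so the resulting ranks contain ties.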

Patterns of aggregated results from both methods of analysis are nearly identical, with no significant systematic bias in SCE. Some differences can be attributed to different definitions of reference items, i.e. a fixed value (the third one) in a Likert scale and an estimated threshold value (zero part-worth) in the SCE approach. While Likert values are arbitrary and have no direct interpretation, estimated SCE perceptions are closely related to the number of respondents who either agree or disagree with the statement.

As an aside

It should be stressed that the results shown above have been prepared just to demonstrate the validity of the SCE method for a large number of ties. The main purpose of SCE can be seen in the combination of battery-based or other tied rank data with virtually any other preference data that can be analyzed with tools appropriate to DCM.

Data from large Likert-scale batteries are often subjected to clustering. In this case, as in many other cases of direct clustering based on the answered values, respondents were categorized according to the interval of values they were selecting from. E.g., a 3-group solution consisted of groups of respondents selecting either mostly low, or medium, or high Likert values. Clustering based on relative values obtained by centering the answers for each respondent led to apparently reasonable groups; however, the partitioning of groups was not sufficiently clean. A new group in a (K+1)-group solution was too often created by taking respondents from several groups of the K-group solution. In contrast, SCE offers the possibility to base clustering on probability values in a way similar to LCA - Latent Class Analysis, known for clean partitioning. Use of hierarchical Bayes estimation can correct outliers or ties by sweeping them closer to the sample means, thus facilitating the clustering process. The result of clustering 977 respondents (23 gave identical answers for all 60 evaluated items) based on relative perceptions (rather than on the absolute perceptions shown above) is in the picture below.

It is important to remember that the groups are based on relative rather than absolute evaluation of items, i.e. on the differences between the evaluation levels selected by each respondent. An interesting finding is that all those happy with their job are critical of management and see future prospects for themselves even when they consider their current job dull. On the other hand, those dissatisfied and disillusioned do not see future prospects for their personal growth irrespective of working climate or management evaluation.
