Problem

Concept testing in which concepts are presented in random order and evaluated one by one is inefficient. Frequent ties and inconsistencies in evaluation are the major disadvantages. Evaluations often drift toward lower acceptance and weaker discrimination between the concepts offered later. Reliability of answers decreases as respondents' fatigue and annoyance with the interview increase.

A remedy has been sought in various flavors of concept sorting. "Sort conjoint", based on ranking of concepts, was popular some decades ago. The Q-Sort method is experiencing a renaissance. While efficiency could be improved, some problems appeared.

When sorting a number of items in place, respondents tend to give more weight to the general benefits and advantages they see than to their personal needs, expectations, affordability or other personal values. When the items are permanently visible during the sorting, the respondent may get the impression that the final decision on the actual choice can be made later.


Solution

The method was inspired by the VCT - Virtual Concept Test [Dahan, E., and Hauser, J.R. (2002). The Virtual Customer. Journal of Product Innovation Management, 19, 332-353]. In experimental terms, VCT is a sequential choice method without replacement and, therefore, does not suffer from the drawbacks inherent to in-place sorting. Such an arrangement makes it possible to achieve an "out of sight, out of mind" state for previous choices from the set, and makes the choices more realistic.

In SCE, the original monotonic OLS estimation of ranks has been replaced with hierarchical Bayesian estimation of part-worths from a number of custom-generated choice sets.


Properties

Ranking without ties

A number of consecutive choices from a set can provide more information in less time and with less effort than a number of single choices from randomized choice sets. It is known that ranking of low-priority items is generally subject to gross error. It is often possible to break off the ranking process at some point, or to limit the maximum number of sequential choices, without an excessive loss of information.

The prospect of a gradually simplified task encourages a respondent to make more reasonable decisions and statements. Each subsequent choice takes less time since the decision maker has already formed an opinion about the items remaining in the choice set. The influence of the previous choices is decreased since they are no longer visible and are (at least partially) forgotten. The interviewing time and the exercise-induced fatigue of respondents are substantially lowered, which contributes to higher reliability of the data.
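
As an illustration only, a sequence of choices without replacement can be written out as a series of shrinking choice sets. The sketch below assumes a plain rank-ordered ("exploded") multinomial logit model; this is a common textbook treatment and not necessarily the custom choice-set generation or the hierarchical Bayesian estimation used in SCE. The function names `explode_ranking` and `log_likelihood` are hypothetical.

```python
import math

def explode_ranking(ranked_items, all_items, max_choices=None):
    """Turn a (possibly truncated) ranking into sequential choice tasks.

    Each task is (chosen_item, available_items). The chosen item is removed
    before the next choice, i.e. choice without replacement.
    """
    remaining = list(all_items)
    tasks = []
    for chosen in ranked_items[:max_choices]:
        tasks.append((chosen, tuple(remaining)))
        remaining.remove(chosen)
    return tasks

def log_likelihood(tasks, part_worths):
    """Multinomial-logit log-likelihood of the sequential choices (assumed model)."""
    total = 0.0
    for chosen, available in tasks:
        denom = sum(math.exp(part_worths[item]) for item in available)
        total += part_worths[chosen] - math.log(denom)
    return total

# A respondent picks B first, then D, from four concepts and then stops.
tasks = explode_ranking(["B", "D"], ["A", "B", "C", "D"])
```

Truncating the ranking with `max_choices` corresponds to breaking off the ranking process early, as discussed above.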

As aside
A grouping of items by their acceptability followed by sorting the items in the preselected groups is typical for the Q-Sort ranking method. It has been applied in some metric methods, e.g. the proprietary "build conjoint" developed in Sofres, France.

The main results of an SCE analysis are part-worths of the items, which can enter what-if simulations of the share type. If calibrated to become item utilities, they can be used to estimate the stated acceptance and competitive potential of the concepts.
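
A minimal sketch of a share-type simulation, assuming the usual logit rule (shares proportional to the exponentials of the part-worths). The function name `preference_shares` and the example values are hypothetical and serve only as illustration.

```python
import math

def preference_shares(part_worths):
    """Logit shares of preference; the shares sum to 1 (assumed logit rule)."""
    top = max(part_worths.values())  # subtract the maximum for numerical stability
    expo = {item: math.exp(value - top) for item, value in part_worths.items()}
    total = sum(expo.values())
    return {item: e / total for item, e in expo.items()}

# Hypothetical part-worths: Z is twice as likely to be chosen as X or Y.
shares = preference_shares({"X": 1.0, "Y": 1.0, "Z": 1.0 + math.log(2)})
```

In a what-if simulation, the dictionary would hold only the concepts assumed to be present in the simulated market.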

A natural next step in interpreting the results is a search for the best subset of the tested concepts, i.e. the one that would gain the highest reach, using the DCM Portfolio Optimization method. Additional information can be obtained by grouping the interviewed subjects according to the appeal the concepts have for them using LCA - Latent Class Analysis.

A unique advantage of SCE lies in its general DCM principle, which allows SCE data to be combined with data from other types of discrete choice, such as CBC, MaxDiff, single or multiple-choice batteries (of the type "select any that apply"), best-worst choice exercises, etc.

The only, but crucial, disadvantage of SCE is the limited number of items that can be ranked. As in any sorting method, this is due to the requirement that a respondent read and remember all the tested items before making the first choice. When items are very complicated (such as banking or insurance products), SCE is appropriate for only a few profiles. For statements not exceeding about 15 words (say, a short line), the limit is about 15 items. Good results have been obtained for 20 items, each described by 4 words at most.

As aside

If the number of items is too high to be tested in a single SCE exercise, subsets of concepts can be generated from the master set. In a computer-controlled interview, every respondent can be presented with a different orthogonal and balanced subset of items. Alternatively, a preliminary piling of items into groups, as suggested in the Q-Sort method, can be used, and the piles, or a subset of them, can then be submitted to SCE exercises.
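
The following sketch shows one simple way to draw per-respondent subsets from a master set. It is an assumption for illustration: it balances only how often each item is shown and does not attempt the orthogonality mentioned above. The function name `balanced_subsets` and its parameters are hypothetical.

```python
import random

def balanced_subsets(items, subset_size, n_respondents, seed=0):
    """Give each respondent a subset, always preferring the least-shown items.

    Balances exposure frequency only; pairwise balance and orthogonality
    are NOT handled in this simplified sketch.
    """
    rng = random.Random(seed)
    shown = {item: 0 for item in items}
    subsets = []
    for _ in range(n_respondents):
        # Least-shown items come first; a random key breaks ties.
        order = sorted(items, key=lambda item: (shown[item], rng.random()))
        subset = order[:subset_size]
        rng.shuffle(subset)  # randomize presentation order
        for item in subset:
            shown[item] += 1
        subsets.append(subset)
    return subsets, shown

master = [f"C{i:02d}" for i in range(1, 31)]  # 30 hypothetical concepts
subsets, shown = balanced_subsets(master, subset_size=10, n_respondents=6)
```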

The low number of concepts the SCE exercise allows may be prohibitive if relatively subtle differences in concepts are of importance. In such a case either CBC - Choice Based Conjoint or MaxDiff - Maximum Difference Scaling should be the preferred method. In contrast to SCE, these methods require much simpler descriptions of concepts as each concept has to be shown to and evaluated by a respondent many more times.


Ranking with ties

The Q-Sort method is a typical example of a method leading to ranks with ties. The same is true for batteries of evaluation questions on a Likert scale when converted to ranks. The current methods of transforming such data for use in DCM rely on specially constructed likelihood terms to be maximized. The principle is a merger of two or more tied choices in the same choice set. Unfortunately, the method is not available in commercial estimation programs.

Using the choice set generation method developed for SCE, tied ranks can be taken into account by omitting those choice sets in which the chosen item is tied with another item in the set. As the set of choice sets is near-orthogonal, the loss of information is proportional to the number of tied choices in the data. In this way the actual data weight for the respondent is reflected. This approach allows for DCM-based processing of data from Likert scale batteries (see an example below) or Q-Sort exercises, and merging them with genuine choice-based data such as CBC, MaxDiff, MBC, SCE, etc.
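
The omission rule can be sketched as follows. This is one reading of the rule, with the choice sets supplied by the caller rather than produced by the SCE generation method, which is not described here. The function name `choices_from_tied_ranks` is hypothetical.

```python
def choices_from_tied_ranks(ranks, choice_sets):
    """Derive choices from ranks with ties (rank 1 = best).

    A choice set is dropped when its best rank is shared by two or more
    items, because no single chosen item can be identified.
    """
    tasks = []
    for choice_set in choice_sets:
        best = min(ranks[item] for item in choice_set)
        winners = [item for item in choice_set if ranks[item] == best]
        if len(winners) == 1:
            tasks.append((winners[0], tuple(choice_set)))
    return tasks

ranks = {"A": 1, "B": 1, "C": 2, "D": 3}  # A and B are tied
sets = [("A", "C", "D"), ("A", "B", "C"), ("C", "D")]
kept = choices_from_tied_ranks(ranks, sets)  # the second set is omitted
```

A respondent with many ties thus contributes fewer choice tasks, which is how the lower data weight mentioned above comes about.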



SCE interviewing method

The usual steps in an SCE - Sequential Choice Exercise are as follows. 

  1. Respondent is presented with an initial set of concepts. When the number of concepts is high, respondent may eliminate unacceptable concepts in one or more steps before entering SCE. The number of remaining concepts, which make up the choice set for the actual choices, should not fall below some pre-stated value.
  2. Respondent is given enough time to consider the pros and cons of all concepts in the choice set. Any type of manipulation with the concepts that might help respondent in making the decisions and statements should be allowed. This gives respondent a chance to apply their own way of sorting, perceiving and evaluating the information.
  3. Respondent chooses the concept with the highest evaluation (e.g. purchase intention or other measure of interest).
  4. Respondent states the evaluation of the chosen concept.
  5. If relevant, supplementary questions may be asked about the chosen concept.
  6. The chosen concept is removed from the choice set.
  7. If the stated evaluation has reached or exceeded some pre-stated value, or the number of remaining concepts is higher than a pre-stated number (usually less than or equal to half of the initial number of concepts), the interview continues with a new choice (step 3).
  8. The test is terminated when some minimal number of choices has been reached or the evaluation is lower than some pre-stated value. This condition depends on actual properties of the items and the purpose of the test.
Steps 4 and 5 can be omitted if only ranking is required. A purely technical demo of an SCE web questionnaire is available.
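
The loop in steps 3 to 8 can be sketched as below. The continuation rule follows one literal reading of step 7, and the respondent is replaced by a hypothetical scoring function, so this is an illustration rather than the actual questionnaire logic. The names `run_sce`, `min_evaluation` and `min_remaining` are assumptions.

```python
def run_sce(concepts, evaluate, min_evaluation, min_remaining):
    """Simulate the sequential choice loop for one respondent.

    `evaluate` stands in for the respondent: it returns the stated
    evaluation of a concept (higher is better).
    """
    remaining = list(concepts)
    record = []
    while remaining:
        chosen = max(remaining, key=evaluate)  # step 3: best remaining concept
        score = evaluate(chosen)               # step 4: stated evaluation
        record.append((chosen, score))
        remaining.remove(chosen)               # step 6: choice without replacement
        # Step 7: continue if the evaluation is high enough
        # or enough concepts are still left; otherwise stop (step 8).
        if not (score >= min_evaluation or len(remaining) > min_remaining):
            break
    return record

scores = {"A": 5, "B": 4, "C": 2, "D": 1, "E": 1, "F": 1}  # hypothetical answers
record = run_sce(list(scores), scores.get, min_evaluation=3, min_remaining=3)
```

With these values the interview stops after the third choice, since the evaluation has dropped below the threshold and only half of the concepts remain.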


Usage

The original purpose of a standalone SCE was to estimate the competitive potential of products or services in the presence of competing products, typically in a pre-launch study. The discrimination power appears to be clearly better than that obtained from evaluation of concepts shown at random. The improvement is comparable to that of a switch from a standard battery-based evaluation of items to the MaxDiff - Maximum Difference Scaling method.

If the items to be sorted are product concepts, they should be of a managerial type. They should reflect the expected demand and utilize the known trade-offs of the product attributes, typically performance and quality vs. price. The concepts should be provided by the producer or vendor, who has the advantage of intimate knowledge and expertise that a research agency seldom has. Each concept that enters the test is considered an independent entity with its own utility. In contrast to a conjoint study, there are no limitations imposed on the properties (attributes and their values) of the concepts.

SCE is a very general concept and method. It can be deployed anywhere a ranking is desirable. It has proved to be an alternative to MaxDiff in case the number of tested items is small, or the items are complicated, i.e. not easily comprehended, assessed and evaluated in a brief (and superficial) judgment.

SCE of concepts is advantageous in the following cases:
  • Attributes of the concepts are strongly correlated
  • Attribute level combinations are restricted
  • Concepts are complex and difficult to read or understand
  • Concepts are fixed and specific
  • Formats of concepts are dissimilar
  • A full-fledged conjoint study is not practicable

Typical use of SCE is in the following tasks:
  • CBS - Choice Based Sampling (a tool for sample splitting)
  • Calibration of DCM part-worths (coming from CBC, ACA, MaxDiff, etc.)
  • Nominal attribute levels screening in the module PRIORS of CSDCA - Common Scale Discrete Choice Analysis
  • Location of attribute level thresholds of perceived acceptance for use in non-compensatory choice simulation
  • OBIMA - Object Image Analysis as the base interviewing method
  • Product concept test, especially competitive potential estimation of (pre-launch) products in a competitive arrangement
  • Full-profile "card" conjoint (useful in pen-and-paper interviewing when computer control is not feasible)
  • Multiple-choice batteries of items or statements


Example: External validation

The SCE method, formerly named SCT and using monotonic regression in its early stage of development, was compared with CBC in a brand-price study of cellular phone handset demand in Germany in 2003. The comparison was carried out by what was then NFO-Infratest in Munich. Each brand was assigned 5 price levels. All 16 brands, each set at a certain price level, were presented as a single choice set for sequential choice. In the reference CBC, each respondent had to make choices from 15 (different) choice sets, each of them consisting of 7 brands and their prices, and the alternative "None of them". Final results are shown in the picture below.

Comparison of CBC and SCE methods

The comparison led to the conclusion that SCT gave worse agreement with the market and a higher error than CBC. In addition, respondents reported greater annoyance and a temptation to finish the interview prematurely with SCT. This could be attributed to the requirement to make all 15 possible choices. From today's view and knowledge, 5 to 8 choices would suffice. However, there were still only 16 product profiles shown to each respondent in SCT compared to 105 product profiles in CBC.

SCE is particularly useful for testing (a relatively low number of) complicated and hard to evaluate profiles rather than for simple brand-price ones where CBC is no doubt the preferred method.


Examples

Ranks without ties

A full-fledged statistical verification of the SCE method would require a subsidized study. To present a visual proof, SCE data were simulated from the results of a MaxDiff study on the importance of 36 items related to a banking service. Every respondent from a sample of 870 made the best choices from 14 randomized sets of 5 items each. SCE data were obtained by simply sorting the obtained importance part-worths. The first 18 simulated choices were used in the SCE estimation. Items are labeled B01 to B36 by decreasing aggregated influence obtained from the MaxDiff study. The influences were computed as choice likelihoods from the full set of items, i.e. they sum to 100%.
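
The simulation step described above, sorting a respondent's part-worths and keeping the first choices, can be sketched as follows. The function name `simulate_sce_ranking` and the example part-worths are hypothetical; the study data are not reproduced here.

```python
def simulate_sce_ranking(part_worths, n_choices):
    """Return the items a respondent would choose first, best item first."""
    ordered = sorted(part_worths, key=part_worths.get, reverse=True)
    return ordered[:n_choices]

# Hypothetical individual part-worths for four of the items.
respondent = {"B01": 2.0, "B02": 1.5, "B03": -0.3, "B04": 0.1}
first_two = simulate_sce_ranking(respondent, n_choices=2)
```

In the study described above, `n_choices` would be 18 and each respondent would have 36 part-worths.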

    
Figure: Comparison of computed influences from MaxDiff and simulated SCE (panels: MaxDiff, Simulated SCE)

The results for SCE are similar to those for MaxDiff, with the first reversal occurring at item B15. This may be due to non-existent (i.e. zero) within-respondent covariances between items for rankings. For the same reason, the scaling factor (steepness) of the estimated part-worths is slightly higher than in MaxDiff. These differences should have no detrimental effect on interpretation of results in practice.

Since an SCE experiment is easily implemented, a method simpler than building a special design and using hierarchical Bayes estimation was sought. Based on the "look & feel" of typical results from MaxDiff and some theoretical assumptions, a simple finite computational procedure has been developed. Results for reduced numbers of sequential choices simulated from the above MaxDiff study are in the picture below.

    
Figure: SCE influences computed with a simple finite method for 9 and 6 simulated choices (panels: SCE - 9 choices, SCE - 6 choices)

The simplified computation procedure with only 6 (simulated) choices from 36 items seems to give a quite satisfactory result.

Ranks with ties

Direct ranking in a questionnaire can be used only for sets with a limited number of items. With many tens of items, a battery of questions asking respondents to evaluate items on a Likert scale is more appropriate. A DCM approach with hierarchical Bayes estimation can improve results and their interpretation.

An employee attitude study for 1000 respondents was composed as a battery of 60 items organized in 8 blocks. Both the blocks and the items in each block were randomized to compensate for possible drifts in answers. The symmetrical Likert scale had 6 levels from 1 - strongly disagree to 6 - strongly agree. Respondents largely used the positive side of the scale.

So that the direct answers could be compared with an SCE tied rank approach, the stated values were decreased by a "neutral level" value of 3.5 common to all respondents, and then averaged over the sample. The resulting pattern should be similar to the perception values used e.g. in the OBIMA method. The required reference "zero" item (a threshold) common to all respondents was assigned a formal rank between Likert levels 3 and 4: it was preferred over level 3, while level 4 was preferred over it. The results are shown below.
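
As a sketch of the two transformations just described, the code below centers Likert answers by the neutral level and converts one respondent's answers into tied ranks with a reference item placed between levels 3 and 4. The function names and the label `REF` are hypothetical, and placing the reference at the value 3.5 is merely a convenient way to encode its position.

```python
def center_answers(answers, neutral=3.5):
    """Subtract the neutral level common to all respondents."""
    return {item: value - neutral for item, value in answers.items()}

def likert_to_tied_ranks(answers, reference="REF", neutral=3.5):
    """Convert Likert answers to ranks (1 = most preferred), keeping ties.

    The reference item is placed between Likert levels 3 and 4.
    """
    levels = dict(answers)
    levels[reference] = neutral
    distinct = sorted(set(levels.values()), reverse=True)
    rank_of = {level: rank for rank, level in enumerate(distinct, start=1)}
    return {item: rank_of[level] for item, level in levels.items()}

answers = {"q1": 6, "q2": 4, "q3": 4, "q4": 3}  # hypothetical respondent
centered = center_answers(answers)
ranks = likert_to_tied_ranks(answers)
```

Here `q2` and `q3` share a rank, the reference item ranks below them, and `q4` ranks below the reference.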

Figure: Comparison of direct answers (left: Centered answers) and SCE-derived perceptions (right: SCE Absolute Perceptions)

Patterns of aggregated results from both methods of analysis are nearly identical, with no significant systematic bias in SCE. Some differences can be attributed to different definitions of reference items, i.e. a fixed value (the third one) in a Likert scale and an estimated threshold value (zero part-worth) in the SCE approach. While Likert values are arbitrary and have no direct interpretation, estimated SCE perceptions are closely related to the number of respondents who either agree or disagree with the statement.

As aside
It is to be stressed that the results shown above have been prepared just to demonstrate validity of the SCE method for a large number of ties. The main purpose of SCE can be seen in combination of battery-based or other tied rank data with virtually any other preference data that can be analyzed with tools appropriate to DCM.

Data from large Likert-scale batteries are often subjected to clustering. In this case, as in many other cases of direct clustering based on the answered values, respondents were categorized according to the interval of values they were selecting from. E.g., a 3-group solution consisted of groups of respondents selecting either mostly low, or medium, or high Likert values. Clustering based on relative values obtained by centering the answers for each respondent led to apparently reasonable groups; however, the partitioning of groups was not sufficiently clean. A new group in a (K+1)-group solution was too often created by taking respondents from several groups of the K-group solution. In contrast, SCE offers the possibility of basing the clustering on probability values in a way similar to LCA - Latent Class Analysis, which is known for clean partitioning. Use of hierarchical Bayes estimation can correct outliers or ties by sweeping them closer to the sample means, thus facilitating the clustering process. The result of clustering 977 respondents (23 gave identical answers to all 60 evaluated items) based on relative perceptions (rather than on the absolute perceptions shown above) is in the picture below.

Clustering on SCE Relative Perceptions

It is important to remember that the groups are based on relative rather than absolute evaluation of items, i.e. on the differences between the evaluation levels selected by each respondent. An interesting finding is that all those happy with their job are critical of management and see future prospects for themselves even when they consider their current job dull. On the other hand, those dissatisfied and disillusioned do not see future prospects for their personal growth irrespective of working climate or management evaluation.