
Results of Combination

Equally weighted combination of classifiers  In the first experiment we compared combination strategies in which all experts are deemed to carry the same weight. Five commonly used multiple expert fusion schemes (the mean, max, median, min and vote combiners) described in [2] were investigated. The errors produced on the two independent test sets, A and B, are shown in Table 2. The mean combiner outperforms the others in three of the four cases (Error-1 and Error-2 on Set A, and Error-2 on Set B). The worst performance under both figures of merit is produced by the min combiner.
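
To make the five fixed rules concrete, the sketch below applies them to the soft outputs of a set of experts for a single pattern. It assumes each expert delivers a vector of class posteriors; the array shapes and the function name are illustrative only, and [2] gives the formal definitions of the combiners.

    import numpy as np

    def fixed_combiners(posteriors):
        # posteriors: shape (n_experts, n_classes); each row holds one
        # expert's estimated class posteriors for a single input pattern.
        mean_scores = posteriors.mean(axis=0)          # mean combiner
        max_scores = posteriors.max(axis=0)            # max combiner
        median_scores = np.median(posteriors, axis=0)  # median combiner
        min_scores = posteriors.min(axis=0)            # min combiner
        # Vote combiner: each expert votes for its most probable class.
        votes = np.bincount(posteriors.argmax(axis=1),
                            minlength=posteriors.shape[1])
        return {"mean": int(mean_scores.argmax()),
                "max": int(max_scores.argmax()),
                "median": int(median_scores.argmax()),
                "min": int(min_scores.argmax()),
                "vote": int(votes.argmax())}

Each rule reduces the experts' score vectors to a single score (or vote count) per class and predicts the class with the largest value.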

Comparing the results of multiple expert fusion in Table 2 with those of the individual experts in Table 1, we note that all strategies except the min combiner perform better than the worst individual expert; this holds for both errors considered here. The mean combiner performs better than, or as well as, the best individual expert in every case except Error-2 on Set B. Among the remaining combiners, only the median outperforms the best individual expert, and only in terms of Error-1. These comparisons illustrate that, although fusion generally offers higher performance than the worst expert, it may fail to perform better than the best individual expert.

Weighted average combiner  This section examines the benefit of incorporating weighting factors in multiple expert fusion. In each experiment, one independent data set is used to determine the best combination of weights for the weighted average combiner under each figure of merit, and the performance of the resulting combiner is tested on the other data set; the roles of Sets A and B are then interchanged. The best set of weights is obtained by an exhaustive search, changing the weights incrementally between zero and one. Table 3 shows the combination of weights for the weighted average combiner using Error-1 as the figure of merit. In both cases the results on the independent test set show an improvement over the equally weighted expert fusion. Note from the second row that, for Set A, the combination of just two experts gives better performance than the mean combination of all four experts. The results in both cases are not only better than those of the best individual classifier but also better than those produced by any of the equally weighted combiners. From the weights associated with the individual experts in Table 3 and the performance of each single classifier for Error-1 in Table 1, we find that the best-performing expert, the MLP, is always included in the set of selected experts, while the worst classifier, the K-NN, is always excluded.

Table 3: The performance of the weighted average combiner for Error-1. The first row shows the weights trained on Set A and the results obtained on Set B; the second row shows the weights trained on Set B and the results obtained on Set A.
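
A minimal sketch of the exhaustive weight search described above follows. The step size, the constraint that the weights sum to one (suggested by the weight vectors reported in Tables 3 and 4), and the use of the plain misclassification rate in place of Error-1 or Error-2 are assumptions made for illustration.

    import itertools
    import numpy as np

    def search_weights(posteriors, labels, step=0.1):
        # posteriors: shape (n_experts, n_samples, n_classes), the expert
        # outputs on the weight-selection set; labels: true class indices.
        n_experts = posteriors.shape[0]
        grid = np.arange(0.0, 1.0 + step, step)
        best_weights, best_error = None, np.inf
        for w in itertools.product(grid, repeat=n_experts):
            # Assumed constraint: keep the weights a convex combination.
            if not np.isclose(sum(w), 1.0):
                continue
            # Weighted average combiner: (n_samples, n_classes) scores.
            combined = np.tensordot(w, posteriors, axes=1)
            error = np.mean(combined.argmax(axis=1) != labels)
            if error < best_error:
                best_weights, best_error = np.asarray(w), error
        return best_weights, best_error

With four experts and a step of 0.1 the search visits 11^4 = 14641 candidate weight vectors, so the exhaustive method remains tractable here.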

Similar experiments were performed to find the best combination of weights when Error-2 is used as the performance measure. The results are shown in Table 4. The best performance on both Set A and Set B is achieved by combining the RBF, K-NN, MLP and Gaussian classifiers with weights of 0.5, 0.3, 0.2 and 0.0 respectively. The results for both data sets are much better than those obtained by the best individual classifier. Considering these weights alongside the performance of the individual classifiers in Table 1, we observe that the best individual expert (the RBF), which produces the minimum Error-2, receives the highest weight among the set of available experts, while the worst individual expert (the Gaussian) receives the lowest.

Table 4: The performance of the weighted average combiner for Error-2. The first row shows the weights trained on Set A and the results obtained on Set B; the second row shows the weights trained on Set B and the results obtained on Set A.
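
Applying the selected weights is then a single weighted average. The snippet below uses the Table 4 weights, assuming the experts are stacked in the order RBF, K-NN, MLP, Gaussian and that posteriors has the shape used in the search sketch above.

    # Weights from Table 4: RBF 0.5, K-NN 0.3, MLP 0.2, Gaussian 0.0.
    weights = np.array([0.5, 0.3, 0.2, 0.0])
    combined = np.tensordot(weights, posteriors, axes=1)
    predictions = combined.argmax(axis=1)

Note that the Gaussian classifier's zero weight effectively removes it from the ensemble, consistent with the observation that the worst individual expert receives the lowest weight.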

From these observations it is apparent that fusing the best subset of designs yields better performance than simply averaging a larger number of classifiers regardless of their individual performance.



