The labelling approach proposed in this paper has been implemented successfully as part of an experimental form reader system. The system contains a preprocessing stage, which performs word labelling, followed by a contextual postprocessing stage. The results presented below summarise the performance of the word-labelling stage of the form reader; some results for the contextual postprocessing stage may be found in [3].
A set of 46 binary "change of address" images has been used as an initial testset. An example image is shown in Figure 4.
Three labelling models of varying flexibility were produced so that the performance of the labelling sub-system could be evaluated. None of the labelling models makes use of any absolute geometric constraints.
The first labelling model is particularly restrictive and only models the simple address format shown in Figure 4. The second labelling model is a superset of the first which allows postcodes to be split into two words. Finally, the third labelling model is a superset of the second which allows the postcode to occur twice in the image (this occurs on several images) and the county to consist of two words.
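The nesting of the three models can be illustrated with a small sketch. It assumes, purely for illustration, that a labelling model can be represented as a set of permitted label sequences. The label names and the sequences themselves are hypothetical and are not taken from the form reader, which works from layout constraints rather than from explicit sequence lists.

```python
# Hypothetical label sequences, used only to illustrate the superset
# relationship between the three labelling models.
BASE = ("STREET", "TOWN", "COUNTY")

# First model: a single, simple address format.
model_1 = {BASE + ("POSTCODE",)}

# Second model: additionally allows the postcode to be split into two words.
model_2 = model_1 | {BASE + ("POSTCODE_PART_1", "POSTCODE_PART_2")}

# Third model: additionally allows a repeated postcode and a two-word county.
model_3 = model_2 | {
    ("POSTCODE",) + BASE + ("POSTCODE",),
    ("STREET", "TOWN", "COUNTY_WORD_1", "COUNTY_WORD_2", "POSTCODE"),
}

# Each model is a proper subset of the next, so any layout accepted by a
# more restrictive model is also accepted by a more flexible one.
assert model_1 < model_2 < model_3
```

Under this representation, making a model more flexible never removes a previously accepted format; it can only add alternatives, which is also why the number of hypothesised labellings may grow.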
Labelling results for the 46 images are presented below. (The CPU times quoted were measured for the labelling algorithm running on a 20 MHz SPARCstation with 8 MB of RAM.)
Labelling results for 46 layout models. Columns give the first, second and third labelling models; n is the number of labellings hypothesised.

                                                        First    Second   Third
Correctly deemed outside the labelling model's bounds   74 %     52 %     43 %
n labellings hypothesised, one of which was correct     22 %     44 %     48 %
n labellings hypothesised, none of which was correct     4 %      4 %      9 %
Mean value of n                                         1.0      1.0      1.5
Mean time taken to perform labelling (seconds)          0.02     0.03     0.07
Maximum time taken to perform labelling (seconds)       0.03     0.04     0.15
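Because the percentages above are rounded figures for a 46-image testset, the approximate number of images behind each entry can be recovered by simple arithmetic. The sketch below does this. The counts it produces are reconstructions from the rounded percentages, not values reported for the experiment, and the dictionary keys are names chosen here for illustration.

```python
TOTAL_IMAGES = 46

# Outcome percentages for the three labelling models, in the order
# first, second, third (key names are illustrative).
percentages = {
    "model 1": {"rejected_correctly": 74, "correct_in_set": 22, "none_correct": 4},
    "model 2": {"rejected_correctly": 52, "correct_in_set": 44, "none_correct": 4},
    "model 3": {"rejected_correctly": 43, "correct_in_set": 48, "none_correct": 9},
}

def implied_counts(row, total=TOTAL_IMAGES):
    """Convert rounded percentages to the nearest whole number of images."""
    return {outcome: round(total * pct / 100) for outcome, pct in row.items()}

for model, row in percentages.items():
    # The three outcomes are exhaustive, so each column should sum to 100 %.
    assert sum(row.values()) == 100
    counts = implied_counts(row)
    print(model, counts, "total:", sum(counts.values()))
```

For each model the implied counts add up to 46, which suggests that the three outcome rows partition the testset.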
These initial results are encouraging, particularly since, even for the least restrictive labelling model, fewer than one in ten layout models was completely mislabelled. Mislabellings occur when the correct labelling for a particular image has not been encoded within a given labelling model, but the layout model nevertheless satisfies the layout constraints of one of the hypothesised labelling alternatives. Thus, successful analysis of a sample image either provides a set of hypothesised labellings which includes the correct labelling, or rejects the image as outside the bounds of the specified labelling model. Once contextual postprocessing is properly integrated within this system, detecting the mislabellings amongst the hypothesised set of labellings will be straightforward.
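The three outcomes distinguished in the results can be stated compactly in code. The following is a minimal sketch, assuming that the hypothesised labellings and the true labelling of an image are available as directly comparable values; the names `Outcome` and `classify` are illustrative and are not part of the form reader.

```python
from enum import Enum

class Outcome(Enum):
    REJECTED = "image deemed outside the labelling model's bounds"
    CORRECT_IN_SET = "n labellings hypothesised, one of which was correct"
    MISLABELLED = "n labellings hypothesised, none of which was correct"

def classify(hypotheses, truth):
    """Classify the result of labelling one image.

    hypotheses: list of labellings proposed by the labelling stage
    truth:      the correct labelling for the image
    Note: this sketch does not check whether a rejection was itself correct,
    i.e. whether the true labelling is genuinely absent from the model.
    """
    if not hypotheses:
        return Outcome.REJECTED
    if truth in hypotheses:
        return Outcome.CORRECT_IN_SET
    return Outcome.MISLABELLED

# Example with hypothetical labellings written as tuples of labels.
truth = ("STREET", "TOWN", "POSTCODE")
print(classify([], truth))
print(classify([truth, ("STREET", "COUNTY", "POSTCODE")], truth))
print(classify([("STREET", "COUNTY", "POSTCODE")], truth))
```

Only the last case is a mislabelling in the sense used above: at least one labelling is proposed, but none of the proposals is the correct one.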
To process images for which no labellings are correctly found (e.g. Figure 5), a more flexible superset of the labelling models used here is required.
Figure 4: An example image from the testset.
Figure 5: An image from the testset which is mislabelled.
Cracknell C R W