
1 Introduction.

1.1 Structured hand-printed document recognition.

A major issue currently inhibiting the commercial application of OCR to hand-print form reading tasks is low overall recognition accuracy. In common with other researchers, we consider that contextual information must be used to improve accuracy, since the performance of OCR is now comparable with human capability.

Elsewhere, we describe a novel contextual postprocessing library [3] which efficiently enforces contextual constraints to achieve an optimum global interpretation of a form's content. To make use of contextual postprocessing, however, we often need to assign labels correctly to a group (or groups) of hand-printed words on a form image, so that the appropriate contextual constraints can be applied to each word. For example, words making up an address may be written free-format in an address box, rather than in a fully constrained format where each address component is written in a specified physical field.

1.2 The labelling problem.

Many other applications exist where we wish to label a number of objects which appear within an image. Possible relationships between different objects, and the conditions under which a particular set of labels may or may not be applied, are often known a priori. The "labelling problem" is thus an important general topic in image understanding research.

Although other solutions to the labelling problem have been proposed (for example, relaxation labelling [4]), these methods typically create and prune a graph which initially maps all possible labels to all possible objects. However, there are many applications where the labelling model used permits so many alternative labellings that this procedure becomes computationally impractical.
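To give a feel for the scale involved, the following sketch counts the edges in an initial all-to-all graph and the number of complete assignments it admits when each object takes exactly one label. The function names and the example sizes are illustrative assumptions, not figures taken from this work.

```python
def initial_graph_edges(num_objects: int, num_labels: int) -> int:
    """Edges in a graph that initially maps every label to every object."""
    return num_objects * num_labels


def complete_assignments(num_objects: int, num_labels: int) -> int:
    """Assignments when each object independently takes exactly one label."""
    return num_labels ** num_objects


# Hypothetical sizes: 10 words in an address box, 8 candidate labels.
NUM_OBJECTS, NUM_LABELS = 10, 8
print(initial_graph_edges(NUM_OBJECTS, NUM_LABELS))
print(complete_assignments(NUM_OBJECTS, NUM_LABELS))
```

The graph itself stays small, but the number of complete assignments it represents grows exponentially with the number of objects.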

In this paper we propose a solution to the labelling problem which utilises a powerful and concise model representation method particularly suitable for domains where the labelling model in use is very large and very sparse. By this, we mean that the model encodes millions of alternative labellings, but that for a given set of objects, the constraints specified in the labelling model imply that only a few labelling alternatives will be hypothesised.

Our solution to the labelling problem has the following features:

Only a small number of constraint types are used to specify the conditions under which a particular labelling may be applied (for the domain of form reading, we use only layout and occurrence constraints).
Labelling is deterministic. Therefore, although the method may hypothesise several alternative labellings for the objects within an image, none of the alternatives is favoured.
Labelling is model-based.
- A labelling model encodes the logical constraints which apply to a set of labels. This determines all of the labellings which are possible, and which constraints must be satisfied in order for a particular labelling to be made.
- A layout model encodes the physical layout geometry which has been extracted from the set of objects for which labelling is to be performed. If the information within the layout model is within the bounds of the set of logical constraints for a particular labelling, then that labelling is recorded as being one hypothesised labelling for the objects.
- Both models make use of hierarchically structured information which is associated with a 2-dimensional image domain, and each model is economically implemented.
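The interplay of the two models can be illustrated with a minimal sketch. It assumes that the layout model is a list of word bounding boxes, that an occurrence constraint means each label in a labelling is used exactly once, and that layout constraints are pairwise geometric predicates. All names here (`Box`, `Labelling`, `left_of`, `above`, `hypothesise`) are hypothetical, and the brute-force enumeration is only a stand-in for illustration; it is not the hierarchical representation proposed in this paper.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import Callable


@dataclass(frozen=True)
class Box:
    """Bounding box of one word; image coordinates, y grows downwards."""
    left: float
    top: float
    right: float
    bottom: float


def left_of(a: Box, b: Box) -> bool:
    """Layout constraint: a lies entirely to the left of b."""
    return a.right <= b.left


def above(a: Box, b: Box) -> bool:
    """Layout constraint: a lies entirely above b."""
    return a.bottom <= b.top


@dataclass(frozen=True)
class Labelling:
    """One entry of the (assumed) labelling model."""
    labels: tuple[str, ...]  # occurrence: each label used exactly once
    layout: tuple[tuple[str, str, Callable[[Box, Box], bool]], ...]


def hypothesise(model: list[Labelling], boxes: list[Box]) -> list[dict[str, int]]:
    """Return every label -> box-index mapping that meets all constraints."""
    hypotheses = []
    for labelling in model:
        # Occurrence check: one object per label, no object left unlabelled.
        if len(labelling.labels) != len(boxes):
            continue
        for order in permutations(range(len(boxes))):
            assigned = dict(zip(labelling.labels, order))
            # Layout check: every pairwise predicate must hold.
            if all(rel(boxes[assigned[a]], boxes[assigned[b]])
                   for a, b, rel in labelling.layout):
                hypotheses.append(assigned)
    return hypotheses


# Hypothetical labelling model for an address box.
MODEL = [
    Labelling(("house_number", "street", "town"),
              (("house_number", "street", left_of), ("street", "town", above))),
    Labelling(("town", "postcode"), (("town", "postcode", left_of),)),
]

# Hypothetical layout model: two words on the first line, one on the second.
BOXES = [Box(0, 0, 20, 10), Box(30, 0, 90, 10), Box(0, 20, 60, 30)]

print(hypothesise(MODEL, BOXES))
```

Every labelling that survives both checks is returned without ranking, in keeping with the statement that none of the hypothesised alternatives is favoured.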


Cracknell C R W
Mon Jul 7 15:13:40 BST 1997