Proceedings Article | 2 November 2001
KEYWORDS: 3D image processing, Algorithm development, Matrices, Object recognition, Machine vision, Computer vision technology, Image understanding, Distance measurement, Detection and tracking algorithms, Information visualization
The general problem of single-view recognition is central to many image understanding and computer vision tasks; so central that it has been characterized as the holy grail of computer vision. In previous work, we have shown how to approach the general problem of recognizing three-dimensional geometric configurations (such as arrangements of lines, points, and conics) from a single two-dimensional view, in a manner that is view-independent. Our methods make use of advanced mathematical techniques from algebraic geometry, notably the theory of correspondences, and a novel equivariant geometric invariant theory. The machinery gives us a way to understand the relationship that exists between the 3D geometry and its residual in a 2D image. This relationship is shown to be a correspondence in the technical sense of algebraic geometry. Exploiting this, one can compute a set of fundamental equations in 3D and 2D invariants which generate the ideal of the correspondence, and which completely describe the mutual 3D/2D constraints. We have chosen to call these equations object/image equations. They can be exploited in a number of ways. For example, from a given 2D configuration, we can determine a set of nonlinear constraints on the geometric invariants of the 3D configurations capable of imaging to the given 2D configuration. Conversely, given a 3D configuration (features on an object), we can derive a set of equations that constrain the images of that object, which helps us determine whether that particular object appears in various images. One previous difficulty has been that the usual numerical geometric invariants are expressed as rational functions of the geometric parameters. As such, they are undefined wherever a denominator vanishes. This leads to degeneracies in algorithms based on these invariants.
We show how to replace these invariants by certain toric subvarieties of Grassmannians, where the object/image equations become resultant-like expressions for the existence of a nontrivial intersection of these subvarieties with certain Schubert varieties in the Grassmannian. We call this approach the global invariant approach. It greatly increases the robustness and numerical stability of the methods. This approach also has advantages when considering issues in geometric computation, notably geometric hashing. Here we can exploit the natural metric on the Grassmannian to measure distances between objects and images. Our ultimate aim is the development of new algorithms for geometric content-based retrieval. Content-based retrieval of information from large-scale databases, particularly visual/geometric information contained in images, schematics, design drawings, and geometric models of environments, mechanical parts, or molecules, will play an important role in future distributed information and knowledge systems.
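As a rough illustration of measuring distances on a Grassmannian, the sketch below computes the chordal (projection) distance between two k-dimensional subspaces of R^n, each given by a spanning set of vectors. The abstract does not say which metric is meant by "the natural metric", so the choice of the chordal distance is an assumption made here for simplicity, and the function names `orthonormal_basis` and `chordal_distance` are hypothetical. It is not the authors' algorithm and does not involve the toric subvarieties or object/image equations described above.

```python
import math


def orthonormal_basis(vectors, tol=1e-12):
    """Return an orthonormal basis for the span of `vectors`.

    Uses modified Gram-Schmidt; vectors that are (numerically)
    dependent on earlier ones are dropped.
    """
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            # Remove the component of w along the basis vector q.
            c = sum(wi * qi for wi, qi in zip(w, q))
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis


def chordal_distance(span_a, span_b):
    """Chordal distance between two subspaces of equal dimension k.

    With orthonormal bases Qa and Qb, the squared Frobenius norm of
    Qa^T Qb equals the sum of cos^2 of the principal angles, so the
    distance sqrt(k - ||Qa^T Qb||_F^2) is sqrt(sum of sin^2).
    The result does not depend on the bases chosen for the subspaces.
    """
    qa = orthonormal_basis(span_a)
    qb = orthonormal_basis(span_b)
    if len(qa) != len(qb):
        raise ValueError("subspaces must have the same dimension")
    k = len(qa)
    overlap = sum(
        sum(x * y for x, y in zip(u, v)) ** 2 for u in qa for v in qb
    )
    # Clamp at zero to absorb floating-point round-off.
    return math.sqrt(max(0.0, k - overlap))


if __name__ == "__main__":
    xy_plane = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    xz_plane = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
    print(chordal_distance(xy_plane, xz_plane))
```

Because the distance is computed from the subspaces rather than from ratios of coordinates, it is defined for every pair of subspaces of the same dimension, which is the kind of behavior that makes a Grassmannian representation attractive for hashing and nearest-neighbor lookup.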