Computer vision has gradually become an enabling technology for a range of applications, in particular those involving inspection and recovery of 3D models. The problem of recognition and categorisation of general sets of objects has, however, proved to be a significant challenge. This proposal addresses the problem of recognition, categorisation, scene interpretation and learning in the context of an embodied system. It aims to demonstrate that, through careful consideration of spatio-temporal context, task constraints, embodiment, and representation, it is indeed possible to provide the methods needed for demonstrating the basic functionalities of cognitive vision in the context of realistic problems.
The objective of this project is to provide the methods and techniques that enable construction of vision systems that can perform task-oriented categorisation and recognition of objects and events in the context of an embodied agent. This functionality will enable construction of mobile agents that can interpret the actions of humans and interact with the environment for tasks such as fetching and delivering objects in a realistic domestic setting.
Cognitive vision systems include facilities for "understanding", "knowing" and "learning". Understanding here involves recognition and categorisation of objects and events through association of semantic labels with data from the scene. It does, however, also involve interpretation and reasoning to enable construction of rich semantic models of the environment. Knowing implicitly specifies a need to consider memory as a common basis for representation and maintenance of information, including methods for associative access. Systems of realistic complexity cannot be fully engineered by hand. There is consequently a need for methods for automatic acquisition of models and representations, i.e. learning, to allow the system to operate in an open-ended fashion, beyond its initial specifications. Finally, the above issues can only be addressed in a meaningful manner in the context of a fully operational system, which implies that it must be embodied and continuously operating.
To address the issues outlined above, the work has been organised into four workpackages:
i) Recognition and Categorisation of Objects, Structures and Events,
ii) Reasoning and Interpretation about Scenes and Events,
iii) Learning and Adaptation, and
iv) Control and Integration.
Through the work in each of these packages, and through integration of these efforts into a number of operational systems, the key issues in cognitive vision will be studied. Particular emphasis is placed on studying these systems in realistic settings and on combining static and dynamic information, so that interpretation, control and knowledge acquisition can operate in concert.
The project will in particular deliver basic methods for recognition and categorisation of objects, events and actions in large-scale scenarios, new methods for robust interpretation of dynamic scenes, methods for acquisition of basic skills and environmental models, and techniques for fully distributed control of continuously operating systems. The methods studied will be integrated in prototype systems. Many of the results of the project are expected to be of immediate industrial utility.