Title | : | Identifying Salient Content from Images using Cognitive Models for Smart-CBIR design |
Speaker | : | Nitin Gupta (IITM) |
Details | : | Fri, 15 May, 2015 3:00 PM @ BSB 361 |
Abstract | : | Traditionally, content-based image retrieval (CBIR) techniques have been designed to retrieve similar images from a large database using a global content representation of the image. However, the salient (desired) content in an image is often localized (e.g. a car in a street, a face in a photograph) rather than holistic, which calls for a smart (object-centric) CBIR. Simultaneously detecting and recognizing the salient parts (mostly objects) of a scene is a powerful feature of the cognitive process of human visual perception. This motivated us to propose a novel, cognitively inspired 'What is Where' framework for solving the problem of smart CBIR.
The problem for smart CBIR is: given a scene (image) comprising one or more foreground objects, first identify and locate the object(s) present in the scene, and then retrieve images with similar object(s) from a categorized gallery of samples, to be displayed in rank order based on similarity. The key contributions of this work are: (i) a cognitively inspired design of a 'What is Where' framework for smart CBIR, using an iterative feedback mechanism; (ii) methods for improving the performance of a multi-class 'What' module, using Feature Integration Theory (FIT), multiple kernel learning (MKL) and hierarchical clustering. By design, the feedback mechanism iteratively filters out the undesired (spurious) proposals generated individually by the pair of modules, which helps them converge by consensus to a mutually acceptable solution. Results are shown on three real-world object-detection datasets (including the challenging PASCAL VOC 2007) to demonstrate the superior performance of the proposed method. |