Title | : | Hierarchical Feature Informative Prototype with Double Angular Margin Contrast for Few-Shot Class Incremental Learning |
Speaker | : | Riya Verma (IITM) |
Details | : | Tue, 23 Jan, 2024 11:00 AM @ SSB-334 |
Abstract | : | Deep learning holds state-of-the-art performance in many tasks, as deep networks are extremely powerful function approximators. However, despite these impressive advances, the conventional training paradigm demands large, static datasets, which align poorly with the dynamic nature of real-world environments, where new classes can emerge over time and data may be scarce. Humans and animals have an extraordinary ability to learn continually from experience, apply newly acquired knowledge and skills to new situations, and use them as a foundation for further learning. Our goal is to mimic this ability with machine intelligence in low-data regimes. Class incremental learning (CIL) has gained significant interest due to its ability to handle ample, distinct datasets arriving in successive sessions. However, the assumption of constant access to vast amounts of data is often impractical. Few-shot class incremental learning (FSCIL) addresses this challenge by enabling models to learn from a limited number of examples and to accommodate new classes without forgetting previously acquired knowledge. Because old training samples are unavailable at the incremental stages, FSCIL often suffers from catastrophic forgetting and from overfitting to the new, sparse data. We propose a Hierarchical Feature Informative Prototype with Double Angular Margin Contrast, which extracts multiple feature vectors with rich semantic content from each image at various scales and abstraction levels. Unlike the common practice of extracting a single feature vector per image, this fosters superior generalization to new classes, especially with limited data samples. A double angular margin-based contrastive learning framework ensures compact intra-class embeddings and enhances inter-class separability, preserving embedding space for novel classes and encouraging diverse feature embeddings.
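To make the double angular margin idea concrete, the following is a minimal NumPy sketch of a contrastive loss that applies one angular margin to the positive pair (tightening intra-class clusters) and another to the negatives (widening inter-class gaps). The margins `m_pos`, `m_neg` and scale `s` are illustrative assumptions; the exact formulation used in the talk may differ.

```python
import numpy as np

def double_angular_margin_loss(anchor, positive, negatives,
                               m_pos=0.2, m_neg=0.1, s=16.0):
    """Contrastive loss with angular margins on both pair types (sketch).

    anchor, positive: L2-normalised feature vectors, shape (d,)
    negatives: L2-normalised features from other classes, shape (k, d)
    """
    # Angle to the positive, enlarged by m_pos -> a "harder" positive,
    # which forces more compact intra-class embeddings.
    theta_pos = np.arccos(np.clip(anchor @ positive, -1.0, 1.0))
    pos_logit = s * np.cos(theta_pos + m_pos)

    # Angles to negatives, shrunk by m_neg -> "harder" negatives,
    # which pushes inter-class embeddings further apart.
    theta_neg = np.arccos(np.clip(negatives @ anchor, -1.0, 1.0))
    neg_logits = s * np.cos(np.maximum(theta_neg - m_neg, 0.0))

    # Softmax cross-entropy over [positive, negatives].
    logits = np.concatenate(([pos_logit], neg_logits))
    logits -= logits.max()  # numerical stability
    return -np.log(np.exp(logits[0]) / np.exp(logits).sum())
```

Minimising this loss drives the anchor's angle to its class prototype below the margin-adjusted angles to all negatives, which is what preserves space for classes arriving in later sessions.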
We utilize Contrastive Prototype Learning (CPL), in which the anchor is a class prototype, encouraging the model to learn features representative of the class and making it more robust. Instead of creating prototypes by straightforward feature averaging, we employ a non-parametric self-attention mechanism to obtain weighted prototypes. This gives prominence to the most informative and representative samples, yielding a sturdier and more reliable setup. We use Layer-wise Feature Augmentation to enhance specific types of features at each level, producing a richer and more diverse feature representation. For inference, we use set-based distance metrics to boost prediction confidence. The performance of the proposed method is verified on the benchmark datasets CIFAR100, CUB200, and miniImageNet, establishing a new state of the art. |
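One way the non-parametric attention-weighted prototype could be realised is sketched below: each support sample of a class is scored by its average similarity to its peers, so representative samples dominate the prototype while outliers are down-weighted relative to a plain mean. The temperature `tau` and this particular scoring rule are assumptions for illustration, not the speaker's exact design.

```python
import numpy as np

def weighted_prototype(support, tau=0.1):
    """Non-parametric attention-weighted class prototype (sketch).

    support: (n, d) L2-normalised support features of one class.
    Each shot is scored by its mean cosine similarity to the other
    shots; a softmax over these scores gives attention weights with
    no learned parameters, and the prototype is the weighted sum.
    """
    n = support.shape[0]
    sim = support @ support.T            # (n, n) pairwise cosine similarities
    np.fill_diagonal(sim, 0.0)           # ignore self-similarity
    score = sim.sum(axis=1) / (n - 1)    # mean similarity to the other shots
    w = np.exp(score / tau)
    w /= w.sum()                         # softmax attention weights
    return w @ support                   # weighted prototype, shape (d,)
```

Compared with simple averaging, a single mislabelled or atypical shot contributes almost nothing to the prototype here, which matters in the few-shot setting where one bad sample out of five can badly skew a mean.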