Composite Concept Extraction Through Backdooring

Ghosh Banibrata, Harikumar Haripriya, Doan Khoa D., Venkatesh Svetha, Rana Santu

Learning composite concepts, such as “red car”, from examples of the individual concepts alone (for instance, a white car representing “car” and a red strawberry representing “red”) is inherently challenging. This paper introduces a novel method called Composite Concept Extractor (CoCE), which leverages techniques from traditional backdoor attacks to learn such composite concepts in a zero-shot setting, requiring only examples of the individual concepts. By repurposing the trigger-based model backdooring mechanism, we create a strategic distortion in the manifold of the target object (e.g., “car”). The distortion is induced by example objects that carry the target property (e.g., “red”), drawn from other objects such as a red strawberry, so that it selectively affects target objects with the target property. Contrastive learning is then employed to further refine this distortion, and a method is formulated for detecting objects that are influenced by it. Extensive experiments with an in-depth analysis across different datasets demonstrate the utility and applicability of the proposed approach.

Publisher: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

ISSN (Electronic): 1611-3349

ISSN (Print): 0302-9743

Keywords

  • Backdoor attack
  • Concept extraction
  • Contrastive learning
  • Deep learning
  • Fine-grained classification
  • Trigger

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Publication year

2025
