🎯 Advanced AI Research

Zero-Shot Learning

Breaking barriers in Artificial Intelligence: teach once, recognize infinite possibilities. The next frontier on the path toward General-Purpose Artificial Intelligence Systems (GPAIS).


🎯 Introduction to Zero-Shot Learning

Zero-Shot Learning (ZSL) represents a revolution in the field of machine learning. Imagine a system that can recognize and classify objects it has never seen during training. This is not science fiction; it is the reality of ZSL.

Unlike traditional learning that requires examples of all possible classes, ZSL uses semantic knowledge to make inferences about completely new classes.
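
To make this concrete, here is a minimal sketch of classic attribute-based zero-shot inference, assuming an attribute predictor trained only on seen classes; the classes, attributes, and scores below are illustrative, not taken from our system.

```python
import numpy as np

# Illustrative semantic prototypes for classes never shown during training
# (attributes: has_stripes, has_hooves, is_aquatic).
unseen_prototypes = {
    "zebra":   np.array([1.0, 1.0, 0.0]),
    "dolphin": np.array([0.0, 0.0, 1.0]),
}

def zero_shot_predict(predicted_attributes: np.ndarray) -> str:
    """Return the unseen class whose semantic prototype is most similar
    (cosine similarity) to the attributes predicted for the input."""
    def cosine(a, b):
        return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    return max(unseen_prototypes,
               key=lambda c: cosine(predicted_attributes, unseen_prototypes[c]))

# An attribute predictor trained only on seen classes outputs scores for a
# new image; matching against the prototypes yields an unseen class.
print(zero_shot_predict(np.array([0.9, 0.8, 0.1])))  # -> zebra
```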

🌟 Why is it Revolutionary?

  • Infinite Scalability: No need for training data for each new class
  • Instant Adaptability: Recognizes new categories immediately
  • Resource Efficiency: Dramatically reduces data requirements
  • Universal Applicability: From rare species recognition to medical diagnostics

🧠 The Semantic Space

The semantic space is the foundation of Zero-Shot Learning, as it provides the attributes that link seen and unseen classes. These attributes can be constructed in two main ways:

🧑‍🏫 Human-Annotated Spaces

  • Annotated by human experts
  • Interpretable and transparent
  • Require significant human effort

🤖 Model-Generated Spaces

  • Derived from the output of a model
  • Require little human effort
  • Typically high-dimensional

In practice, neither type of semantic space is free from noise or irrelevant information, which makes refinement essential for improving generalization in Zero-Shot Learning.
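
As a toy contrast between the two space types (the attribute names are invented, and random vectors stand in for the output of a real embedding model):

```python
import numpy as np

# Human-annotated space: experts mark which attributes each class has.
# One interpretable prototype per class.
attributes = ["has_tail", "has_fur", "flies", "is_aquatic"]
human_space = {
    "dog":   np.array([1, 1, 0, 0]),
    "eagle": np.array([1, 0, 1, 0]),
}

# Model-generated space: prototypes come from a model's output, e.g. word
# embeddings of the class names. Cheaper to obtain, but high-dimensional
# and not directly interpretable. Random vectors stand in here for the
# output of a real embedding model.
rng = np.random.default_rng(0)
model_space = {name: rng.normal(size=300) for name in ("dog", "eagle")}
```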


⚙️ Intelligent Preprocessing

Preprocessing is crucial in ZSL because the quality of the semantic space directly affects the ability to generalize to unseen classes. Attributes are often noisy, redundant, or weakly correlated with visual differences, which undermines performance. By refining the semantic space, models can focus on the most informative attributes, improving both performance and interpretability.

🔧 Difficulties in preprocessing

  • Tabular structure with one instance per class: Unlike standard datasets, the semantic space has only one prototype vector per class, limiting statistical evaluation.
  • Risk of overfitting: Traditional preprocessing techniques overfit, producing subsets that fail to generalize.
  • Bias toward seen classes: Any preprocessing based only on seen classes (inductive setting) risks being unrepresentative for unseen classes.
  • Attribute Redundancy: Many attributes overlap (e.g., “has a tail” and “has four legs” for dogs and horses), adding little discriminative power.

Our preprocessing approach operates in an inductive setting (the most challenging one) and takes all of the above into account: it preserves critical semantic relationships while eliminating noise that could confuse the transfer process, avoids overfitting and bias toward seen classes, actively reduces attribute redundancy, and handles the single-instance-per-class structure.
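
As a simple illustration of the single-prototype-per-class structure and of redundancy removal, here is a naive correlation filter; this is NOT the RFS or GA methods presented below, and all values are invented.

```python
import numpy as np

# Class-attribute matrix with ONE semantic prototype per seen class
# (rows = classes, columns = attributes): the single-instance-per-class
# structure described above.
S = np.array([
    [1.0, 1.0, 0.0, 0.2],   # e.g. dog
    [1.0, 0.9, 0.0, 0.1],   # e.g. horse
    [0.0, 0.1, 1.0, 0.9],   # e.g. eagle
    [0.0, 0.2, 0.0, 0.4],   # e.g. snake
])

# Naive redundancy filter: drop an attribute when it is almost perfectly
# correlated with one already kept, since it adds little discriminative power.
corr = np.corrcoef(S.T)
kept = []
for j in range(S.shape[1]):
    if all(abs(corr[j, k]) < 0.95 for k in kept):
        kept.append(j)
print(kept)  # -> [0, 2, 3]: attribute 1 is redundant with attribute 0
```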

🔧 Figure: Feature Selection Algorithm, the preprocessing pipeline that optimizes attribute selection and dimensionality reduction.


🔄 Advanced Partitioning Algorithm

🧭 Objective: Recreate an inductive ZSL scenario using only the seen classes \(\mathcal{Y}^s\), generating pseudo-unseen partitions.

Stratified k-fold workflow

Each of the K folds rotates the seen classes \(\mathcal{Y}^s\) to produce mutually exclusive pseudo-seen \(\mathcal{Y}^{ps}\) and pseudo-unseen \(\mathcal{Y}^{pu}\) subsets.

1. 🧩 Stratify \(\mathcal{Y}^s\): split the seen classes into K class-stratified folds.

2. 🔄 Create \(\mathcal{Y}^{ps}\) and \(\mathcal{Y}^{pu}\): within each fold, assign buckets to \(\mathcal{Y}^{ps}\) (train) and \(\mathcal{Y}^{pu}\) (validation), ensuring strict disjointness.

3. 🔍 Rotate and evaluate: repeat K times so that every class appears exactly once in \(\mathcal{Y}^{pu}\), measuring inductive performance fold by fold.

📊 Figure: Class-stratified K-fold cross-validation scheme.

⚖️ Requirements for a fair split

  • Balance: the pseudo-unseen subsets differ in size by at most one class.
  • Coverage: every class in \(\mathcal{Y}^s\) appears exactly once in \(\mathcal{Y}^{pu}\) across the K folds.
  • Disjointness: pseudo-unseen subsets from different folds remain mutually exclusive.

These criteria prevent class bias and stabilize cross-fold evaluation.
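
A minimal sketch of a partition generator satisfying these three criteria, assuming integer class labels; the actual implementation may differ in its details.

```python
import numpy as np

def pseudo_unseen_folds(seen_classes, k=5, seed=0):
    """Rotate the seen classes through K mutually exclusive pseudo-unseen
    subsets; the remaining classes of each fold form the pseudo-seen set.
    Balance: subset sizes differ by at most one (np.array_split).
    Coverage: every class lands in exactly one pseudo-unseen subset.
    Disjointness: the K subsets never overlap."""
    rng = np.random.default_rng(seed)
    classes = rng.permutation(list(seen_classes))
    for bucket in np.array_split(classes, k):
        pseudo_unseen = set(bucket.tolist())
        pseudo_seen = [c for c in classes.tolist() if c not in pseudo_unseen]
        yield pseudo_seen, sorted(pseudo_unseen)

for fold, (ps, pu) in enumerate(pseudo_unseen_folds(range(10), k=5)):
    print(f"fold {fold}: pseudo-seen={ps} pseudo-unseen={pu}")
```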

  • \(\mathcal{Y}^s\): seen classes available during training (source).
  • \(\mathcal{Y}^{ps}\): pseudo-seen, the per-fold training split (train).
  • \(\mathcal{Y}^{pu}\): pseudo-unseen, the per-fold validation split that mimics unseen classes (eval).

🧭 Importance

This partitioning scheme enables working under inductive conditions. By simulating unseen classes with the seen ones, it recreates a controlled ZSL environment for training models, fostering generalization and knowledge transfer. Moreover, the systematic and balanced partitioning ensures that every class is treated as unseen exactly once, providing a fair and reliable evaluation.


⚡ Revolutionary Techniques

We have developed an arsenal of complementary techniques that work in synergy to achieve exceptional zero-shot performance. Each technique addresses different aspects of the challenge, creating a robust and versatile system.

🎨 RFS – Ranking-based Feature Selection

Ranking-based Feature Selection refines the semantic attribute space with an embedded ranking plus a cross-validated wrapper over a class-stratified partition. It selects the attribute subset that maximizes pseudo-unseen accuracy, and stabilizes it with an inter-fold consensus threshold.

  • Embedded Ranking: Build an attribute-importance ranking on the seen semantic prototypes using an embedded feature-selection method.
  • Wrapper: Select the optimal subset of attributes from the ranking via cross-validated evaluation.
  • Consensus: Retain the attributes that appear across folds above a threshold.
🏗️ Figure: RFS Architecture. Each fold builds an embedded ranking and evaluates top-i masks by training on pseudo-seen and validating on pseudo-unseen classes; the best mask per fold is retained. A cross-fold consensus with threshold T yields a stable final subset.
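
The sketch below mirrors that per-fold flow under stated assumptions: random-forest importances stand in for the embedded ranker, and `evaluate` is a hypothetical callback returning the ZSL model's pseudo-unseen accuracy for a candidate mask; neither name is prescribed by the method itself.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def rfs_fold(X_ps, y_ps, X_pu, y_pu, evaluate):
    """One RFS fold: embedded ranking, then a wrapper over top-i prefixes.
    `evaluate` scores a candidate mask by pseudo-unseen accuracy."""
    importances = RandomForestClassifier(random_state=0).fit(X_ps, y_ps).feature_importances_
    ranking = np.argsort(importances)[::-1]      # best attribute first
    best_mask, best_acc = None, -1.0
    for i in range(1, len(ranking) + 1):         # evaluate every top-i mask
        mask = np.zeros(X_ps.shape[1], dtype=bool)
        mask[ranking[:i]] = True
        acc = evaluate(X_ps[:, mask], y_ps, X_pu[:, mask], y_pu)
        if acc > best_acc:
            best_mask, best_acc = mask, acc
    return best_mask

def consensus(fold_masks, threshold):
    """Keep attributes selected in at least `threshold` fraction of folds."""
    return np.mean(fold_masks, axis=0) >= threshold
```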

🧬 Evolutionary Genetic Algorithm

The Genetic Algorithm (GA) frames attribute selection as a global combinatorial search.

  • Representation: Individuals encode attribute subsets as binary vectors of length N.
  • Fitness: Fold-averaged pseudo-unseen accuracy.
  • Genetic Operators: The population evolves across generations through selection, crossover, and mutation.

🔬 Evolutionary Process

Each generation of the algorithm refines attribute selection through specific genetic operators:

  • 🏟️ Tournament Selection: Selects parent individuals through competitive tournaments of size 3
  • 🧬 Uniform Crossover: Combines parent features with 0.5 probability per attribute
  • ⚡ Bit-Flip Mutation: Randomly toggles each attribute with probability 1/N (N = total number of attributes)
  • 🔄 Simple Evolution Strategy: Generational replacement maintaining constant population size
  • 🏆 Hall of Fame: Preserves the best solutions found across all generations
  • 📊 Dynamic Fold Training: Adapts evaluation criteria by rotating cross-validation folds

This evolutionary approach allows us to discover non-intuitive feature combinations that maximize zero-shot performance, overcoming the limitations of traditional selection methods.
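
As a concrete illustration, here is a plain-Python sketch of the evolutionary loop described above, assuming a `fitness` callback that returns the fold-averaged pseudo-unseen accuracy of a mask; a GA library could play the same role in practice, and none of the names below are prescribed by our implementation.

```python
import random

def genetic_attribute_selection(n_attrs, fitness, pop_size=50, generations=100, seed=0):
    """Evolve binary attribute masks. `fitness` is assumed to map a mask
    to fold-averaged pseudo-unseen accuracy (not implemented here)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_attrs)] for _ in range(pop_size)]
    hall_of_fame = max(pop, key=fitness)          # best mask seen so far
    for _ in range(generations):
        offspring = []
        while len(offspring) < pop_size:          # generational replacement
            # Tournament selection: each parent wins a size-3 tournament.
            p1 = max(rng.sample(pop, 3), key=fitness)
            p2 = max(rng.sample(pop, 3), key=fitness)
            # Uniform crossover: each gene comes from either parent with p=0.5.
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            # Bit-flip mutation: toggle each attribute with probability 1/N.
            child = [1 - g if rng.random() < 1 / n_attrs else g for g in child]
            offspring.append(child)
        pop = offspring
        best = max(pop, key=fitness)
        if fitness(best) > fitness(hall_of_fame):
            hall_of_fame = best                   # update the hall of fame
    return hall_of_fame

# Example (hypothetical): genetic_attribute_selection(85, my_cv_fitness)
```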

📚 Cite this work

If you use this work in your research, you can cite it using the BibTeX format provided below.

BibTeX
@article{SIAS25,
  title   = {Semantic Inductive Attribute Selection for Zero-Shot Learning},
  author  = {Herrera-Aranda, J. J. and Gomez-Trenado, G. and Triguero, I. and Herrera, F.},
  journal = {Journal Name},
  year    = {2025},
  volume  = {X},
  pages   = {X--X}
}
📥 Download .bib

🎓 Academic Collaboration & Funding

Universidad de Granada
University of Nottingham
DasCi
IAFER