
PRE: Vision-Language Prompt Learning with Reparameterization Encoder
Researchers from Kingston University, the University of Oxford, and Queen Mary University of London have developed “PRE” (Prompt Learning with Reparameterization Encoder), an innovative approach that significantly improves the ability of vision-language models such as CLIP to generalize to unseen classes.
Unlike traditional prompt engineering, which demands domain expertise and considerable time, PRE employs a prompt encoder to reparameterize the input prompt embeddings, enabling better exploration of domain-specific knowledge from few-shot data.
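The paper's exact encoder architecture is not reproduced here; the following is a minimal PyTorch sketch of the general idea, assuming a small residual MLP over CoOp-style learnable context vectors (the class name PromptEncoder and all hyperparameters are illustrative, not taken from the paper):

```python
import torch
import torch.nn as nn

class PromptEncoder(nn.Module):
    """Illustrative reparameterization encoder: maps raw learnable prompt
    embeddings to the prompt vectors actually fed to CLIP's text encoder."""

    def __init__(self, n_ctx: int = 16, dim: int = 512, hidden: int = 128):
        super().__init__()
        # Learnable context vectors, as in CoOp-style prompt learning.
        self.ctx = nn.Parameter(torch.randn(n_ctx, dim) * 0.02)
        # Small bottleneck MLP that reparameterizes the raw embeddings.
        self.encoder = nn.Sequential(
            nn.Linear(dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self) -> torch.Tensor:
        # Residual reparameterization: encoded offset plus the original
        # embedding, so the identity mapping stays easy to learn.
        return self.ctx + self.encoder(self.ctx)

# The resulting (n_ctx, dim) tensor would be prepended to each class-name
# token embedding before CLIP's text encoder, as in CoOp.
prompts = PromptEncoder()()
print(prompts.shape)  # torch.Size([16, 512])
```

Reparameterizing the prompt through a network in this way changes the optimization landscape of the context vectors rather than the frozen model they feed into, which is what allows the approach to extract more from few-shot data.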
This research addresses a critical challenge in AI deployment: the ability of learnable prompts to generalize to unseen classes. In extensive experiments across 8 benchmarks, PRE achieved remarkable improvements over CoOp in the 16-shot setting: a 5.60% increase in average accuracy on new classes and a 3% gain in the harmonic mean of base- and new-class accuracy.
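For context, the harmonic mean reported in such base-to-new generalization benchmarks (following CoOp and CoCoOp) balances base- and new-class accuracy, so a method cannot score well by sacrificing one for the other. A quick illustration in Python (the numbers below are made up, not results from the paper):

```python
def harmonic_mean(acc_base: float, acc_new: float) -> float:
    """Harmonic mean of base- and new-class accuracy (the H metric)."""
    return 2 * acc_base * acc_new / (acc_base + acc_new)

# Illustrative values only:
print(round(harmonic_mean(80.0, 70.0), 2))  # 74.67
```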
The work was presented at the DMLR (Data-centric Machine Learning Research) Workshop at ICLR 2024.
Research Team
- Thi Minh Anh Pham
- An Duc Nguyen
- Cephas Svosve
- Vasilis Argyriou
- Georgios (Yorgos) Tzimiropoulos
Further Information
The full paper is available here.
Follow us on LinkedIn and X for more content.
Funding Acknowledgment
This work was funded by UK Research and Innovation (UKRI) under the UK government’s Horizon Europe funding guarantee [grant number 10099264] and by the European Union under EC Horizon Europe grant agreement number 101135800 (RAIDO).
Part of the RAIDO Project, promoting Sustainable AI development through Horizon Europe.