Vision, Language and Reading

Welcome to the Vision, Language and Reading (VLR) research team at the Computer Vision Center in Barcelona, Spain.

The VLR research team conducts fundamental research and technology transfer at the frontier between vision, language and reading systems. We devise reading systems for text in the wild and incorporate scene text semantics into a multitude of computer vision tasks, such as captioning, visual question answering, cross-modal retrieval and fine-grained classification. In parallel, we advance document understanding, with a special interest in end-to-end approaches for Document Visual Question Answering.

News

Jan 2026 Start of the DocVQA2026 ICDAR Competition
Sep 2025 A new PhD student joined the group - welcome Christos Georgakilas!
Jun 2025 3 papers accepted at ICDAR 2025
May 2025 A new PhD student joined the group - welcome Yiming Xu!
May 2025 1 paper accepted at ICML 2025
Apr 2025 Dimosthenis Karatzas gave an invited talk at the 49th Pattern Recognition and Computer Vision Colloquium
Feb 2025 1 paper accepted at ICLR 2025
