Welcome to the Vision, Language and Reading (VLR) research team at the Computer Vision Center in Barcelona, Spain.
The VLR research team conducts fundamental research and technology transfer at the frontier between vision, language and reading systems. We devise reading systems for text in the wild, and incorporate scene text semantics into a range of computer vision tasks, including captioning, visual question answering, cross-modal retrieval and fine-grained classification. In parallel, we advance document understanding, with a special interest in end-to-end approaches for Document Visual Question Answering.
Jun 2025 3 papers accepted at ICDAR 2025
May 2025 A new PhD student joined the group - welcome Yiming Xu!
May 2025 1 paper accepted at ICML 2025
Feb 2025 1 paper accepted at ICLR 2025
Jan 2025 Dimosthenis Karatzas gave an invited talk at the AI for Complex Systems Workshop