Showing 1–5 of 5 results for author: Kajić, I

  1. arXiv:2404.16820


    Revisiting Text-to-Image Evaluation with Gecko: On Metrics, Prompts, and Human Ratings

    Authors: Olivia Wiles, Chuhan Zhang, Isabela Albuquerque, Ivana Kajić, Su Wang, Emanuele Bugliarello, Yasumasa Onoe, Chris Knutsen, Cyrus Rashtchian, Jordi Pont-Tuset, Aida Nematzadeh

    Abstract: While text-to-image (T2I) generative models have become ubiquitous, they do not necessarily generate images that align with a given prompt. While previous work has evaluated T2I alignment by proposing metrics, benchmarks, and templates for collecting human judgements, the quality of these components is not systematically measured. Human-rated prompt sets are generally small and the reliability of…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: Data and code will be released at:

  2. arXiv:2303.07172

    cs.AI cs.CV cs.LG

    Evaluating Visual Number Discrimination in Deep Neural Networks

    Authors: Ivana Kajić, Aida Nematzadeh

    Abstract: The ability to discriminate between large and small quantities is a core aspect of basic numerical competence in both humans and animals. In this work, we examine the extent to which the state-of-the-art neural networks designed for vision exhibit this basic ability. Motivated by studies in animal and infant numerical cognition, we use the numerical bisection procedure to test number discriminatio…

    Submitted 13 March, 2023; originally announced March 2023.

  3. arXiv:2211.01480

    cs.MA cs.CL cs.HC

    Over-communicate no more: Situated RL agents learn concise communication protocols

    Authors: Aleksandra Kalinowska, Elnaz Davoodi, Florian Strub, Kory W Mathewson, Ivana Kajic, Michael Bowling, Todd D Murphey, Patrick M Pilarski

    Abstract: While it is known that communication facilitates cooperation in multi-agent settings, it is unclear how to design artificial agents that can learn to effectively and efficiently communicate with each other. Much research on communication emergence uses reinforcement learning (RL) and explores unsituated communication in one-step referential tasks -- the tasks are not temporally interactive and lac…

    Submitted 2 November, 2022; originally announced November 2022.

  4. arXiv:2205.12191

    cs.CL cs.AI cs.CV cs.LG

    Reassessing Evaluation Practices in Visual Question Answering: A Case Study on Out-of-Distribution Generalization

    Authors: Aishwarya Agrawal, Ivana Kajić, Emanuele Bugliarello, Elnaz Davoodi, Anita Gergely, Phil Blunsom, Aida Nematzadeh

    Abstract: Vision-and-language (V&L) models pretrained on large-scale multimodal data have demonstrated strong performance on various tasks such as image captioning and visual question answering (VQA). The quality of such models is commonly assessed by measuring their performance on unseen data that typically comes from the same distribution as the training data. However, when evaluated under out-of-distribu…

    Submitted 1 April, 2023; v1 submitted 24 May, 2022; originally announced May 2022.

    Comments: Findings of EACL 2023. Aishwarya, Ivana, Emanuele, and Aida had equal first-author contributions. Elnaz and Anita had equal contributions. Aida and Aishwarya had equal senior contributions.

  5. arXiv:2004.01097

    cs.LG cs.CL cs.MA stat.ML

    Learning to cooperate: Emergent communication in multi-agent navigation

    Authors: Ivana Kajić, Eser Aygün, Doina Precup

    Abstract: Emergent communication in artificial agents has been studied to understand language evolution, as well as to develop artificial systems that learn to communicate with humans. We show that agents performing a cooperative navigation task in various gridworld environments learn an interpretable communication protocol that enables them to efficiently, and in many cases, optimally, solve the task. An a…

    Submitted 30 June, 2020; v1 submitted 2 April, 2020; originally announced April 2020.

    Comments: Accepted to CogSci 2020