Jaewoo Ahn

jaewoo.ahn AT vision.snu.ac.kr


Hi, I’m a Ph.D. candidate in the Department of Computer Science and Engineering at Seoul National University, where I am advised by Prof. Gunhee Kim as a member of the Vision & Learning Lab.

I am broadly interested in NLP and multimodal AI, with a particular focus on human-like conversational agents. My long-term goal is to develop embodied conversational agents, such as digital avatars or personal assistants, that interact naturally in real-world environments.

Currently, my research interests lie in integrating multisensory perception into LLMs to support multimodal interaction within diverse (e.g., computer-use, video game, embodied) environments. Specifically, I focus on enhancing the decision-making capabilities of LLM/VLM agents in 2D/3D environments.

You can find my CV here.

News

Jul 24, 2025 Our ChartCap paper was accepted to ICCV 2025 as a Highlight Poster!
Jun 05, 2025 Our Orak benchmark for video game agents is released!
May 15, 2025 Our MAC paper was accepted to ACL 2025!
Apr 07, 2025 I joined KRAFTON AI as a Research Scientist Intern — Ready to level up game agents! 🎮
Mar 08, 2025 Our CCPT paper was accepted to NAACL 2025 as an Oral Presentation!

Publications (* equal contribution)

  1. Orak: A Foundational Benchmark for Training and Evaluating LLM Agents on Diverse Video Games
    arXiv, 2025
  2. ChartCap: Mitigating Hallucination of Dense Chart Captioning
    Junyoung Lim, Jaewoo Ahn, and Gunhee Kim
    In ICCV, 2025 Highlight
  3. Can LLMs Deceive CLIP? Benchmarking Adversarial Compositionality of Pre-trained Multimodal Representation via Text Updates
    Jaewoo Ahn*, Heeseung Yun*, Dayoon Ko, and Gunhee Kim
    In ACL, 2025
  4. Is a Peeled Apple Still Red? Evaluating LLMs’ Ability for Conceptual Combination with Property Type
    In NAACL, 2025 Oral
  5. TimeChara: Evaluating Point-in-Time Character Hallucination of Role-Playing Large Language Models
    In ACL Findings, 2024
  6. Who Wrote this Code? Watermarking for Code Generation
    In ACL, 2024
  7. mRedditSum: A Multimodal Abstractive Summarization Dataset of Reddit Threads with Images
    In EMNLP, 2023
  8. MPCHAT: Towards Multimodal Persona-Grounded Conversation
    Jaewoo Ahn, Yeda Song, Sangdoo Yun, and Gunhee Kim
    In ACL, 2023
  9. Sequential Latent Knowledge Selection for Knowledge-Grounded Dialogue
    Byeongchang Kim, Jaewoo Ahn, and Gunhee Kim
    In ICLR, 2020 Spotlight