Jaewoo Ahn

jaewoo.ahn AT vision.snu.ac.kr


Hi, I’m a Ph.D. candidate in the Department of Computer Science and Engineering at Seoul National University, where I am advised by Prof. Gunhee Kim as a member of the Vision & Learning Lab.

I am broadly interested in NLP and multimodal AI, with a particular focus on human-like embodied conversational agents that interact naturally in real-world environments. To this end, my work has advanced consistent persona modeling (MPChat, TimeChara), robust perception (MAC), and embodied action (Orak, FlashAdventure).

Currently, my research focuses on integrating multisensory perception into LLMs to support multimodal interaction in diverse environments (e.g., computer use, video games, and embodied settings). In particular, I work on enhancing the decision-making capabilities of LLM/VLM agents across 2D and 3D environments.

You can find my CV here.

News

Aug 20, 2025 Our FlashAdventure paper got accepted to EMNLP 2025!
Jul 24, 2025 Our ChartCap paper got accepted to ICCV 2025 as a Highlight Poster!
Jun 05, 2025 Our Orak benchmark for video game agents is released!
May 15, 2025 Our MAC paper got accepted to ACL 2025!
Apr 07, 2025 I joined KRAFTON AI as a Research Scientist Intern! Ready to level up game agents 🎮

Publications (* equal contribution)

  1. FlashAdventure: A Benchmark for GUI Agents Solving Full Story Arcs in Diverse Adventure Games
    In EMNLP, 2025
  2. Orak: A Foundational Benchmark for Training and Evaluating LLM Agents on Diverse Video Games
    arXiv, 2025
  3. ChartCap: Mitigating Hallucination of Dense Chart Captioning
    Junyoung Lim, Jaewoo Ahn, and Gunhee Kim
    In ICCV, 2025 Highlight
  4. Can LLMs Deceive CLIP? Benchmarking Adversarial Compositionality of Pre-trained Multimodal Representation via Text Updates
    Jaewoo Ahn*, Heeseung Yun*, Dayoon Ko, and Gunhee Kim
    In ACL, 2025
  5. Is a Peeled Apple Still Red? Evaluating LLMs’ Ability for Conceptual Combination with Property Type
    In NAACL, 2025 Oral
  6. TimeChara: Evaluating Point-in-Time Character Hallucination of Role-Playing Large Language Models
    In ACL Findings, 2024
  7. Who Wrote this Code? Watermarking for Code Generation
    In ACL, 2024
  8. mRedditSum: A Multimodal Abstractive Summarization Dataset of Reddit Threads with Images
    In EMNLP, 2023
  9. MPCHAT: Towards Multimodal Persona-Grounded Conversation
    Jaewoo Ahn, Yeda Song, Sangdoo Yun, and Gunhee Kim
    In ACL, 2023
  10. Sequential Latent Knowledge Selection for Knowledge-Grounded Dialogue
    Byeongchang Kim, Jaewoo Ahn, and Gunhee Kim
    In ICLR, 2020 Spotlight