Publications (* equal contribution)

2026

Orak: A Foundational Benchmark for Training and Evaluating LLM Agents on Diverse Video Games

Dongmin Park^*, Minkyu Kim^*, Beongjun Choi^*, Junhyuck Kim, Keon Lee, Jonghyun Lee, Inkyu Park, Byeong-Uk Lee, Jaeyoung Hwang, Jaewoo Ahn, Ameya S. Mahabaleshwarkar, Bilal Kartal, Pritam Biswas, Yoshi Suhara, Kangwook Lee, and Jaewoong Cho

In ICLR, 2026


2025

FlashAdventure: A Benchmark for GUI Agents Solving Full Story Arcs in Diverse Adventure Games

Jaewoo Ahn^*, Junseo Kim^*, Heeseung Yun, Jaehyeon Son, Dongmin Park, Jaewoong Cho, and Gunhee Kim

In EMNLP, 2025

ChartCap: Mitigating Hallucination of Dense Chart Captioning

Junyoung Lim, Jaewoo Ahn, and Gunhee Kim

In ICCV, 2025 Highlight

Can LLMs Deceive CLIP? Benchmarking Adversarial Compositionality of Pre-trained Multimodal Representation via Text Updates

Jaewoo Ahn^*, Heeseung Yun^*, Dayoon Ko, and Gunhee Kim

In ACL, 2025

Is a Peeled Apple Still Red? Evaluating LLMs’ Ability for Conceptual Combination with Property Type

Seokwon Song^*, Taehyun Lee^*, Jaewoo Ahn, Jae Hyuk Sung, and Gunhee Kim

In NAACL, 2025 Oral


2024

TimeChara: Evaluating Point-in-Time Character Hallucination of Role-Playing Large Language Models

Jaewoo Ahn, Taehyun Lee, Junyoung Lim, Jin-Hwa Kim, Sangdoo Yun, Hwaran Lee, and Gunhee Kim

In ACL Findings, 2024

Who Wrote this Code? Watermarking for Code Generation

Taehyun Lee^*, Seokhee Hong^*, Jaewoo Ahn, Ilgee Hong, Hwaran Lee, Sangdoo Yun, Jamin Shin, and Gunhee Kim

In ACL, 2024


2023

mRedditSum: A Multimodal Abstractive Summarization Dataset of Reddit Threads with Images

Keighley Overbay, Jaewoo Ahn^*, Fatemeh Pesaran Zadeh^*, Joonsuk Park, and Gunhee Kim

In EMNLP, 2023

MPCHAT: Towards Multimodal Persona-Grounded Conversation

Jaewoo Ahn, Yeda Song, Sangdoo Yun, and Gunhee Kim

In ACL, 2023


2020

Sequential Latent Knowledge Selection for Knowledge-Grounded Dialogue

Byeongchang Kim, Jaewoo Ahn, and Gunhee Kim

In ICLR, 2020 Spotlight
