2025

Orak: A Foundational Benchmark for Training and Evaluating LLM Agents on Diverse Video Games
Dongmin Park*, Minkyu Kim*, Beongjun Choi*, Junhyuck Kim, Keon Lee, Jonghyun Lee, Inkyu Park, Byeong-Uk Lee, Jaeyoung Hwang, Jaewoo Ahn, Ameya S. Mahabaleshwarkar, Bilal Kartal, Pritam Biswas, Yoshi Suhara, Kangwook Lee, and Jaewoong Cho
arXiv, 2025

Can LLMs Deceive CLIP? Benchmarking Adversarial Compositionality of Pre-trained Multimodal Representation via Text Updates
Jaewoo Ahn*, Heeseung Yun*, Dayoon Ko, and Gunhee Kim
In ACL, 2025

Is a Peeled Apple Still Red? Evaluating LLMs’ Ability for Conceptual Combination with Property Type
Seokwon Song*, Taehyun Lee*, Jaewoo Ahn, Jae Hyuk Sung, and Gunhee Kim
In NAACL, 2025 (Oral)

2024

TimeChara: Evaluating Point-in-Time Character Hallucination of Role-Playing Large Language Models
Jaewoo Ahn, Taehyun Lee, Junyoung Lim, Jin-Hwa Kim, Sangdoo Yun, Hwaran Lee, and Gunhee Kim
In ACL Findings, 2024

Who Wrote this Code? Watermarking for Code Generation
Taehyun Lee*, Seokhee Hong*, Jaewoo Ahn, Ilgee Hong, Hwaran Lee, Sangdoo Yun, Jamin Shin, and Gunhee Kim
In ACL, 2024

2023

mRedditSum: A Multimodal Abstractive Summarization Dataset of Reddit Threads with Images
Keighley Overbay, Jaewoo Ahn*, Fatemeh Pesaran Zadeh*, Joonsuk Park, and Gunhee Kim
In EMNLP, 2023

MPCHAT: Towards Multimodal Persona-Grounded Conversation
Jaewoo Ahn, Yeda Song, Sangdoo Yun, and Gunhee Kim
In ACL, 2023

2020

Sequential Latent Knowledge Selection for Knowledge-Grounded Dialogue
Byeongchang Kim, Jaewoo Ahn, and Gunhee Kim
In ICLR, 2020 (Spotlight)