Choi, Eunbi; Choi, Kibong; Chun, Sehyun; Hong, Seokhee; Hwang, Junwon; Jeon, Hyojin; Jo, Ahra; Jo, Hyunjik; Jo, Yeonsik; Kim, Joonkee; Kim, Seonghwan; Kim, Soyeon; Kim, Sunkyoung; Kim, Yireun; Kim, Yongil; Lee, Changhun; Lee, Haeju; Lee, Jinsik; Lee, Kyungmin; Park, Sangha; Ryoo, Kwangrok; Seo, Minju; Yang, Sejong; Yeen, Heuiyeen; Chang, Hwan; Choi, Stanley Jungkyu; Choi, Yejin; Han, Kyubeen; Jang, Joonwon; Jeon, Kijeong; Jeong, Geunyeong; Jo, Gerrard Jeongwon; Jung, Jiyeon; Kim, Daeseong; Kim, Dohoon; Kim, Dohyun; Kim, Hyunseo; Kim, Minu; Kim, Myoungshin; Kim, Youchul; Ko, Byungoh; Lee, Christopher; Lee, Edward Hwayoung; Lee, Honglak; Lee, Jiyoung; Lee, Sangeun; Lim, Seungwon; Lim, Woohyung; Mun, Jueun; Park, Jaewoo; Park, Jimin; Park, Jinho; Park, Yongmin; Seo, Wooseok; Song, Yongwoo; Yi, Sihyuk; Yoo, Kyungjae; Yoon, Sangyeon
Abstract
This technical report introduces EXAONE 4.5, the first open-weight vision language model released by LG AI Research. EXAONE 4.5 is built by integrating a dedicated visual encoder into the existing EXAONE 4.0 framework, enabling native multimodal pretraining over both visual and textual modalities. The model is trained on large-scale, carefully curated data, with particular emphasis on document-centric corpora that align with LG's strategic application domains. This targeted data design yields substantial performance gains in document understanding and related tasks, while also delivering broad improvements in general language capabilities. EXAONE 4.5 supports a context length of up to 256K tokens, facilitating long-context reasoning and enterprise-scale use cases. Comparative evaluations show that EXAONE 4.5 achieves competitive performance on general benchmarks while outperforming state-of-the-art models of similar scale in document understanding and Korean contextual reasoning. As part of LG's ongoing effort toward practical industrial deployment, EXAONE 4.5 is designed to be continuously extended with additional domains and application scenarios to advance AI for a better life.