KPoEM: Korean Poetry Emotion Mapping Dataset and Model

Overview

[AI Summary]: The Korean Academy of Digital Humanities (KADH) has announced KPoEM (Korean Poetry Emotion Mapping), the first emotion classification dataset for Korean poetry. It covers 483 poems by five major Korean poets (Kim Sowol, Yun Dong-ju, Yi Sang, Im Hwa, and Han Yong-un) and contains 7,662 labeled data points across 44 emotion categories. Developed by the Academy of Korean Studies Digital Humanities Research Institute under Professor Kim Byung-jun, the dataset addresses a limitation of existing emotion datasets, which are built from internet comments: it supports poetry-specific emotion analysis that captures subtle literary expression and culturally specific Korean emotions such as ‘sorrow’ (서러움) and ‘heroic determination’ (비장함). The accompanying model achieves an F1-micro score of 0.60 and is reported to understand poetic metaphors and multi-layered emotions better than models trained on general text.

  • Lead Researcher: Kim Byung-jun (Academy of Korean Studies)
  • Dataset Size: 7,662 labeled instances (7,007 line-level, 615 work-level)
  • Emotion Categories: 44 (including Korean-specific emotions)
  • Availability: Hugging Face, Zenodo, GitHub
  • Paper: arXiv:2509.03932
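
For readers unfamiliar with the F1-micro metric cited in the overview, the following is a minimal sketch of how a micro-averaged F1 score can be computed when each line may carry several emotion labels. It assumes a multi-label setup, which the overview suggests but does not spell out. The function name `micro_f1` and the toy data are illustrative only and are not taken from the KPoEM code or dataset. The labels "longing" and "hope" are hypothetical placeholders.

```python
from typing import List, Set


def micro_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """Micro-averaged F1: pool true/false positives and false negatives
    over every (instance, label) pair, then compute a single F1."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same number of instances")

    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # labels predicted and present in the gold set
        fp += len(p - g)   # labels predicted but absent from the gold set
        fn += len(g - p)   # gold labels the prediction missed

    denom = 2 * tp + fp + fn
    # By convention here, return 0.0 when there are no labels at all.
    return 2 * tp / denom if denom else 0.0


# Toy data (hypothetical, not drawn from KPoEM): three poem lines,
# each with a set of gold labels and a set of predicted labels.
gold = [{"sorrow", "longing"}, {"heroic determination"}, {"sorrow"}]
pred = [{"sorrow"}, {"heroic determination", "hope"}, {"longing"}]

score = micro_f1(gold, pred)
print(f"F1-micro: {score:.2f}")
```

Because micro-averaging pools counts across all labels before computing the score, frequent emotion categories weigh more heavily than rare ones. That is worth keeping in mind when reading a single figure such as 0.60 over 44 categories.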