Tag
This paper presents the construction of a Korean evaluation-annotated corpus (EVAD) for fine-grained aspect-based sentiment analysis in e-commerce reviews using Semi-Automatic Symbolic Propagation. It evaluates KoBERT and KcBERT models on the dataset, achieving high F1 scores in aspect-value pair recognition.
This paper presents a method for generating large-scale, labeled training datasets for Korean legal chatbots using Local Grammar Graphs, achieving a 91% F1 score with a DIET classifier.
This paper presents a systematic benchmark of token pruning, a compression technique that removes tokens and embeddings for irrelevant languages, applied to Korean-centric LLM tasks. The study evaluates popular multilingual models (Qwen3, Gemma-3, Llama-3, Aya) across different vocabulary configurations and finds that token pruning significantly improves generation stability and reduces memory footprint for domain-specific deployments.
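The token pruning idea in the last summary can be illustrated with a toy sketch. The summary does not say how the benchmark selects which tokens to keep, so the rule below (keep ASCII and Hangul-syllable tokens) and the names `is_kept` and `prune_vocab` are assumptions made only for illustration, not the paper's procedure.

```python
def is_kept(token):
    """Hypothetical rule: keep tokens made only of ASCII or Hangul syllables."""
    return all(ord(ch) < 128 or 0xAC00 <= ord(ch) <= 0xD7A3 for ch in token)


def prune_vocab(vocab, embeddings):
    """Drop unwanted tokens and their embedding rows, then renumber the ids."""
    # Sort by old id so surviving tokens keep their relative order.
    kept = sorted((old_id, tok) for tok, old_id in vocab.items() if is_kept(tok))
    new_vocab = {tok: new_id for new_id, (_, tok) in enumerate(kept)}
    new_embeddings = [embeddings[old_id] for old_id, _ in kept]
    return new_vocab, new_embeddings


# Toy vocabulary: token -> row index in the embedding table.
vocab = {"hello": 0, "안녕": 1, "你好": 2, "리뷰": 3, "привет": 4}
embeddings = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8], [0.9, 1.0]]

new_vocab, new_embeddings = prune_vocab(vocab, embeddings)
print(new_vocab)
print(len(embeddings), "->", len(new_embeddings), "embedding rows")
```

In this toy case the Chinese and Russian tokens are dropped together with their embedding rows, which is where the memory saving comes from. In a real model the output projection would need the same row selection, and the tokenizer would have to be rebuilt to match the new ids.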