agglutinative-language

Morpheus: A Morphology-Aware Neural Tokenizer and Word Embedder for Turkish

arXiv cs.CL

This paper presents Morpheus, a neural tokenizer and word embedder for Turkish that learns morpheme boundaries without string normalization, achieving lossless tokenization and competitive embeddings for lexical retrieval, while using less GPU memory than subword tokenizers.
