## Summary

Describes the conversion of French verb Lexicon-Grammar tables into the LMF format, enhancing interoperability and standardization for NLP dictionaries.
# Conversion of Lexicon-Grammar tables to LMF. Application to French

Source: [https://arxiv.org/abs/2605.14816](https://arxiv.org/abs/2605.14816) | [View PDF](https://arxiv.org/pdf/2605.14816)

> Abstract: We describe the first experiment of conversion of Lexicon-Grammar tables for French verbs into the Lexical Markup Framework (LMF) format. The Lexicon-Grammar of the French language is currently one of the major sources of lexical and syntactic information for French. Its conversion into an interoperable representation format according to the LMF standard makes it usable in different contexts, thus contributing to the standardization and interoperability of natural language processing dictionaries. We briefly introduce the Lexicon-Grammar and the derived dictionaries; we analyse the main difficulties faced during the conversion; and we describe the resulting resource.

## Submission history

From: Eric Laporte [[view email](https://arxiv.org/show-email/d4eef2af/2605.14816)]

**[v1]** Thu, 14 May 2026 13:28:24 UTC (789 KB)
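To make the kind of conversion described above concrete, here is a minimal sketch of how one row of a Lexicon-Grammar verb table (a lemma plus a construction column such as `N0 V N1`) might be serialized as an LMF-style XML lexical entry. The element names (`LexicalEntry`, `Lemma`, `SyntacticBehaviour`) and the `feat att/val` pattern follow common LMF serialization conventions, but the specific mapping, the `construction` feature name, and the example row are assumptions for illustration, not the paper's actual scheme.

```python
# Sketch: turning one Lexicon-Grammar table row into an LMF-style XML entry.
# The feature names and the example row are illustrative assumptions.
import xml.etree.ElementTree as ET

def row_to_lmf_entry(lemma: str, construction: str) -> ET.Element:
    """Build an LMF-style LexicalEntry element for one table row."""
    entry = ET.Element("LexicalEntry")
    # Lemma with its written form, encoded as an LMF feature
    lem = ET.SubElement(entry, "Lemma")
    ET.SubElement(lem, "feat", att="writtenForm", val=lemma)
    # Syntactic behaviour carrying the table's construction code
    sb = ET.SubElement(entry, "SyntacticBehaviour")
    ET.SubElement(sb, "feat", att="construction", val=construction)
    return entry

# Example: the French verb "abandonner" in the transitive construction N0 V N1
print(ET.tostring(row_to_lmf_entry("abandonner", "N0 V N1"), encoding="unicode"))
```

In a full conversion, each of a table's several dozen binary-feature columns would map to further features or subcategorization frames; the sketch only shows the basic entry skeleton.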