Tag
This paper reformulates language generation as a stochastic optimal control problem to address limitations of autoregressive and diffusion models. It proposes a closed-loop diffusion method in latent control space using Flow Matching, achieving high-fidelity generation and efficient parallel sampling.
Introduces Discrete Stochastic Localization (DSL), a continuous-state diffusion framework for non-autoregressive text generation that uses unit-sphere token embeddings and a timestep-invariant denoiser, achieving better distributional faithfulness than masked discrete diffusion models on OpenWebText.
CRoCoDiL proposes a continuous and robust conditioned diffusion approach for language that shifts masked diffusion models into a continuous semantic space, achieving higher generation quality and 10x faster sampling than discrete methods such as LLaDA.
ConlangCrafter is a multi-hop LLM pipeline that automates constructed language (conlang) creation by decomposing the process into modular stages including phonology, morphology, syntax, lexicon generation, and translation. The system leverages LLMs' metalinguistic reasoning with randomness injection and self-refinement to produce coherent and typologically diverse constructed languages.
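The staged structure described for ConlangCrafter can be illustrated with a minimal sketch. Only the five stage names come from the summary above. The function names (`run_pipeline`, `echo_llm`), the prompt format, the integer "hint" used for randomness injection, and the single refinement round are all assumptions made for illustration, not the system's actual implementation.

```python
import random
from typing import Callable, Dict, List

# Stage names follow the summary above; everything else here is hypothetical.
STAGES: List[str] = ["phonology", "morphology", "syntax", "lexicon", "translation"]

# Any function mapping a prompt string to a completion string.
LLM = Callable[[str], str]


def run_pipeline(llm: LLM, seed: int = 0, refine_rounds: int = 1) -> Dict[str, str]:
    """Run each stage in order, feeding earlier stage outputs into later prompts."""
    rng = random.Random(seed)
    spec: Dict[str, str] = {}
    for stage in STAGES:
        # Randomness injection (assumed form): a random integer hint in the prompt.
        hint = rng.randint(0, 9999)
        context = "\n".join(f"{name}: {text}" for name, text in spec.items())
        draft = llm(f"[{stage}] hint={hint}\n{context}")
        # Self-refinement (assumed form): ask the model to revise its own draft.
        for _ in range(refine_rounds):
            draft = llm(f"[{stage}] revise for consistency\n{context}\ndraft: {draft}")
        spec[stage] = draft
    return spec


def echo_llm(prompt: str) -> str:
    """Stand-in for a real model call: returns the stage tag found in the prompt."""
    return prompt.split("]")[0].lstrip("[") + "-output"


if __name__ == "__main__":
    for name, text in run_pipeline(echo_llm, seed=1).items():
        print(name, "->", text)
```

Replacing `echo_llm` with a real model client is the only change needed to try the loop against an actual LLM. Passing each completed stage into later prompts is one simple way to keep, for example, the lexicon consistent with the phonology.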