@siddarthv66: I'm excited to announce something that's been cooking for a bit... We are organizing a COLM workshop in SF with a stacke…

X AI KOLs Following 05/28/26, 10:15 PM Events

Summary

Announcing the 'Context Beyond the Window' workshop at COLM in San Francisco, focused on LLM context length and diverse perspectives on extending it.


Original Article


Cached at: 06/01/26, 05:10 AM

I'm excited to announce something that’s been cooking for a bit…

We are organizing a COLM workshop in SF with a stacked cast of speakers. It will feature diverse perspectives on LLM context length and how to increase it (or maybe why we shouldn't!)

If you are working on

Dane Malenfant (@dvnxmvl_hdf5): 🚨 Excited to announce our workshop, Context Beyond the Window, hosted at COLM in SF! 🚨

LLMs have finite context windows, yet real-world tasks demand absorbing, retaining, and acting on information that far exceeds any single prompt.

1/3

We’re looking for submissions across:

Similar Articles

I built a 2.5D visual compiler for AI agents: It separates topology from geometry so LLMs stop generating spaghetti diagrams.

Reddit r/AI_Agents

An open-source 2.5D diagram engine in Go that separates topology from geometry to enable LLMs to generate clean architecture diagrams without spatial hallucinations.

SharQ: Bridging Activation Sparsity and FP4 Quantization for LLM Inference

arXiv cs.LG

SharQ introduces a training-free method combining activation sparsity and FP4 quantization for LLM inference, using sparse-dense decomposition and a unified FP4 weight payload. It achieves significant latency reduction and accuracy recovery over FP4-only baselines.

Can Large Language Models Reliably Code Qualitative Humanitarian Data? A Benchmark Study Against Human Expert Adjudication

arXiv cs.LG

This benchmark study evaluates 46 large language models against human experts for coding qualitative humanitarian data, finding that LLMs can achieve comparable reliability with structured prompts and reasoning, but require careful oversight for nuanced themes.

Optimizing CUDA like a Human: Micro-Profiling Tools as Expert Surrogates for LLM-Based GPU Kernel Optimization

arXiv cs.LG

KernelPro is a closed-loop multi-agent system that uses LLMs and micro-profiling tools to automatically optimize GPU kernel code, achieving geomean speedups of 2.42×/4.69×/5.30× on KernelBench and demonstrating a measured 11.6% energy reduction at matched speed.

CAT-Q: Cost-efficient and Accurate Ternary Quantization for LLMs

arXiv cs.CL

CAT-Q introduces a post-training ternary quantization method for LLMs that uses learnable modulation and softened ternarization, achieving superior performance over BitNet 1.58-bit while using only 512 calibration samples and scaling to 235B parameters.
