preprocessing

#preprocessing

How Good Can Linear Models Be for Time-Series Forecasting?

Hugging Face Daily Papers · 2026-06-25 Cached

This paper demonstrates that careful preprocessing (especially context length selection, normalization, and regularization) can make simple linear models like Ridge regression competitive with or superior to large Transformer, MLP, and CNN models on time-series forecasting benchmarks.
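
As a rough illustration of the idea in this summary, the sketch below fits a closed-form ridge regression on sliding windows of a toy series, with per-window normalization. It is not the paper's code: the function names, the synthetic series, and the choices `context_len=48`, `horizon=12`, and `alpha=1.0` are all assumptions made for the example.

```python
import numpy as np

def make_windows(series, context_len, horizon):
    """Slice a 1-D series into (context, target) pairs."""
    n = len(series) - context_len - horizon + 1
    X = np.stack([series[i:i + context_len] for i in range(n)])
    Y = np.stack([series[i + context_len:i + context_len + horizon] for i in range(n)])
    return X, Y

def normalize(X, Y=None, eps=1e-8):
    """Normalize each window with the mean/std of its own context only."""
    mu = X.mean(axis=1, keepdims=True)
    sigma = X.std(axis=1, keepdims=True) + eps
    Xn = (X - mu) / sigma
    Yn = None if Y is None else (Y - mu) / sigma
    return Xn, Yn, mu, sigma

def fit_ridge(X, Y, alpha):
    """Closed-form ridge: W = (X^T X + alpha * I)^-1 X^T Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ Y)

def forecast(W, context):
    """Normalize one context window, predict, then undo the normalization."""
    Xn, _, mu, sigma = normalize(context[None, :])
    return (Xn @ W * sigma + mu)[0]

# Toy data (an assumption for the example): a daily-like cycle plus a slow trend.
t = np.arange(400, dtype=float)
series = np.sin(2 * np.pi * t / 24) + 0.01 * t

X, Y = make_windows(series[:300], context_len=48, horizon=12)
Xn, Yn, _, _ = normalize(X, Y)
W = fit_ridge(Xn, Yn, alpha=1.0)
pred = forecast(W, series[300 - 48:300])
```

In this setup the three knobs named in the summary map to `context_len`, the `normalize` step, and `alpha`; in practice each would be chosen on a validation split rather than fixed by hand.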


Best Preprocessing Techniques for Sentiment Analysis

arXiv cs.CL · 2026-06-24 Cached

This paper systematically investigates the optimal order of preprocessing techniques for sentiment analysis on Twitter data, finding that tokenisation is most impactful and spelling correction least, with the best order being tokenisation, cleaning, stemming, then stopword removal.
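
The ordering reported in this summary (tokenisation, cleaning, stemming, then stopword removal) can be sketched as a small pipeline. The steps below are deliberately crude stand-ins, not the paper's tools: the whitespace tokeniser, the suffix-stripping stemmer, and the tiny `STOPWORDS` set are assumptions chosen to keep the example self-contained.

```python
import re

# A tiny illustrative stopword list (assumption; real lists are much longer).
STOPWORDS = {"the", "a", "an", "is", "are", "was", "to", "of", "and", "this", "i"}

def tokenise(text):
    """Step 1: split on whitespace."""
    return text.split()

def clean(tokens):
    """Step 2: drop URLs and @mentions, lowercase, keep letters only."""
    out = []
    for tok in tokens:
        if tok.startswith(("http://", "https://", "@")):
            continue
        tok = re.sub(r"[^a-z]", "", tok.lower())
        if tok:
            out.append(tok)
    return out

def stem(tokens):
    """Step 3: naive suffix stripping, a stand-in for a real stemmer."""
    suffixes = ("ing", "ed", "s")
    out = []
    for tok in tokens:
        for suf in suffixes:
            if tok.endswith(suf) and len(tok) - len(suf) >= 3:
                tok = tok[:-len(suf)]
                break
        out.append(tok)
    return out

def remove_stopwords(tokens):
    """Step 4: filter out stopwords."""
    return [t for t in tokens if t not in STOPWORDS]

def preprocess(text):
    """Apply the four steps in the order described in the summary."""
    return remove_stopwords(stem(clean(tokenise(text))))

tokens = preprocess("@bob I loved the new phones! https://t.co/x")
```

Because each step is a plain function over a token list, trying a different order only requires rearranging the calls inside `preprocess`.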


Hermes got expensive when I let every profile think like a senior engineer.

Reddit r/AI_Agents · 2026-05-19

The author describes how running multiple persistent AI agent profiles under Hermes drove up API costs. Tiered model policies per profile, input preprocessing, and an API gateway for cost visibility reduced daily costs from $14-18 to $7-10.
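
A per-profile tier policy combined with input preprocessing might look roughly like the sketch below. Nothing here comes from the post or from Hermes itself: the profile names, tier labels, `MAX_INPUT_CHARS`, and both function names are hypothetical, chosen only to show the shape of the idea.

```python
import re

# Hypothetical mapping from agent profile to model tier (assumption).
TIER_BY_PROFILE = {
    "architect": "large",
    "triage": "small",
    "summarizer": "small",
}
DEFAULT_TIER = "small"   # unknown profiles fall back to the cheap tier
MAX_INPUT_CHARS = 4000   # hypothetical cap on input size

def preprocess_input(text, limit=MAX_INPUT_CHARS):
    """Collapse runs of whitespace and truncate before the text reaches a model."""
    collapsed = re.sub(r"\s+", " ", text).strip()
    return collapsed[:limit]

def plan_request(profile, text):
    """Pick a tier for the profile and attach the trimmed input."""
    return {
        "tier": TIER_BY_PROFILE.get(profile, DEFAULT_TIER),
        "input": preprocess_input(text),
    }

plan = plan_request("triage", "  error:\n\n   disk   full  ")
```

Defaulting unknown profiles to the cheapest tier keeps a newly added profile from silently running on the most expensive model.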


A Triadic Suffix Tokenization Scheme for Numerical Reasoning

arXiv cs.CL · 2026-04-20 Cached

This paper introduces Triadic Suffix Tokenization (TST), a deterministic tokenization scheme that partitions digits into three-digit triads with explicit magnitude markers to improve numerical reasoning in large language models. The method addresses inconsistent number fragmentation in standard tokenizers by providing transparent order-of-magnitude relationships at the token level, with two implementation variants offering scalable vocabulary expansion.
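
To make the triad idea concrete, the sketch below splits a non-negative integer into three-digit groups from the right and tags each group with its power of ten. The `<eN>` suffix format and both function names are assumptions for illustration; the summary does not give the paper's actual token vocabulary or the details of its two variants.

```python
def triadic_tokens(number: str):
    """Split a non-negative integer string into triads with magnitude suffixes."""
    if not (number.isascii() and number.isdigit()):
        raise ValueError("expected a non-negative integer written in ASCII digits")
    digits = number.lstrip("0") or "0"
    triads = []
    while digits:
        triads.append(digits[-3:])   # take three digits from the right
        digits = digits[:-3]
    triads.reverse()
    top = len(triads) - 1
    # The suffix <eN> marks the triad as a multiple of 10**N (hypothetical format).
    return [f"{tri}<e{3 * (top - i)}>" for i, tri in enumerate(triads)]

def triadic_value(tokens):
    """Inverse of triadic_tokens: rebuild the integer from its tokens."""
    total = 0
    for tok in tokens:
        tri, exp = tok.rstrip(">").split("<e")
        total += int(tri) * 10 ** int(exp)
    return total

toks = triadic_tokens("1234567")
```

With this layout the same digits always produce the same tokens, and the order of magnitude of each group is readable from the token alone rather than from its position in the sequence.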
