#text-cleaning

Best Preprocessing Techniques for Sentiment Analysis

arXiv cs.CL · 17h ago

This paper systematically investigates the order in which preprocessing techniques should be applied for sentiment analysis on Twitter data. It finds that tokenisation has the greatest impact and spelling correction the least, and that the best order is tokenisation, cleaning, stemming, then stopword removal.
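The reported order (tokenisation, cleaning, stemming, stopword removal) can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the tiny stopword list, and the toy suffix-stripping stemmer are all assumptions made for illustration, and a real pipeline would likely use a proper tweet tokeniser and an established stemmer such as Porter's.

```python
import re

# Hypothetical, deliberately tiny stopword list; a real one would be much larger.
STOPWORDS = {"a", "an", "the", "is", "are", "and", "to", "of", "it", "so"}


def tokenise(text):
    # Step 1: split on whitespace so URLs, @mentions and #hashtags stay whole.
    return text.split()


def clean(tokens):
    # Step 2: lowercase, drop URLs and user mentions, keep letters only.
    cleaned = []
    for tok in tokens:
        tok = tok.lower()
        if tok.startswith(("http://", "https://", "@")):
            continue
        tok = re.sub(r"[^a-z]", "", tok)  # also strips '#' and punctuation
        if tok:
            cleaned.append(tok)
    return cleaned


def stem(tokens):
    # Step 3: toy suffix stripper standing in for a real stemmer (assumption).
    suffixes = ("ing", "ed", "ly", "s")
    stemmed = []
    for tok in tokens:
        for suf in suffixes:
            # Only strip when at least three characters of stem remain.
            if tok.endswith(suf) and len(tok) - len(suf) >= 3:
                tok = tok[: -len(suf)]
                break
        stemmed.append(tok)
    return stemmed


def remove_stopwords(tokens):
    # Step 4: runs last, so it only matches stopwords the stemmer left unchanged.
    return [t for t in tokens if t not in STOPWORDS]


def preprocess(text):
    # Apply the four steps in the order reported as best in the summary above.
    return remove_stopwords(stem(clean(tokenise(text))))


if __name__ == "__main__":
    print(preprocess("Loving the new phone!!! @alice https://example.com #happy"))
```

Because stemming runs before stopword removal in this order, a stopword list used this way should contain forms that survive the stemmer unchanged; that is a property of this sketch, not a claim made by the paper.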
