Tag
This paper systematically investigates the order in which preprocessing techniques should be applied for sentiment analysis on Twitter data. It finds that tokenisation has the greatest impact on performance and spelling correction the least, and that the best-performing order is tokenisation, cleaning, stemming, then stopword removal.
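To make the reported order concrete, the following is a minimal sketch of a pipeline that applies the four steps in that sequence: tokenisation, cleaning, stemming, then stopword removal. It is not the paper's implementation. The function names, the tiny stopword list, the cleaning rules (dropping URLs and @-mentions, stripping non-letters) and the naive suffix-stripping stemmer are all assumptions chosen for illustration. A real system would more likely use an established tokeniser and stemmer.

```python
import re

# Hypothetical, deliberately tiny stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "is", "are", "to", "of", "and", "it"}


def tokenise(text):
    """Step 1: split the raw tweet into whitespace-delimited tokens."""
    return text.split()


def clean(tokens):
    """Step 2: lowercase, drop URLs and @-mentions, strip non-letter characters.

    These cleaning rules are illustrative assumptions.
    """
    cleaned = []
    for tok in tokens:
        tok = tok.lower()
        if tok.startswith("http") or tok.startswith("@"):
            continue
        tok = re.sub(r"[^a-z]", "", tok)
        if tok:
            cleaned.append(tok)
    return cleaned


def stem(tokens):
    """Step 3: naive suffix stripping, a stand-in for a real stemmer."""
    suffixes = ("ing", "ed", "s")
    stemmed = []
    for tok in tokens:
        for suffix in suffixes:
            # Only strip when at least three characters would remain.
            if tok.endswith(suffix) and len(tok) - len(suffix) >= 3:
                tok = tok[: -len(suffix)]
                break
        stemmed.append(tok)
    return stemmed


def remove_stopwords(tokens):
    """Step 4: drop tokens found in the stopword list."""
    return [tok for tok in tokens if tok not in STOPWORDS]


def preprocess(text):
    """Apply the four steps in the order the summary reports as best."""
    return remove_stopwords(stem(clean(tokenise(text))))


if __name__ == "__main__":
    print(preprocess("@user Loving the new phones! http://t.co/xyz #happy"))
```

Because stemming runs before stopword removal in this order, the stopword list is matched against stemmed tokens, so in practice the list may need to contain stemmed forms as well.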