How to Prepare Texts, Reviews, Comments, Tweets for Sentiment Analysis with No-Code
clacla
Data analysts often need to prepare a list of product reviews, YouTube comments, tweets, etc. for sentiment analysis. Until now, that meant writing multiple steps of Python code. This tutorial shows how to achieve the same goal without writing a single line of code, while still having confidence in the result.
In this tutorial, we will remove unwanted characters, remove symbols and punctuation, remove stop words, and stem the words in the texts. We are going to use Fluidtable as the tool to achieve this.
Get your text in an Excel or CSV file
Get the texts you want to prepare into an Excel or CSV file. One text per row.
Then go to Fluidtable, sign up, and click on Upload Excel or CSV.
Once the file is loaded it will be visualised. Click on Data cleaner to get started.
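If your texts currently live somewhere else, such as a script or a scraper, the "one text per row" file described above can be produced with a few lines of Python. This is only a sketch: the `text` column header and the `texts_to_csv` helper are assumptions made for illustration, not something Fluidtable requires.

```python
import csv
import io

def texts_to_csv(texts):
    """Return CSV content with one text per row under a 'text' header (assumed name)."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(["text"])      # hypothetical header; adjust or drop as needed
    for text in texts:
        writer.writerow([text])    # the csv module quotes commas and quotes for us
    return buffer.getvalue()

sample = texts_to_csv(["Great phone!", "Battery died, twice."])
print(sample)
```

To save it to disk, write the returned string to a file opened with `newline=""` and `encoding="utf-8"`.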
Remove unwanted characters
On the right side of the screen, click the Add a cleaning rule button.
Search for and add the cleaning rule named Remove specific characters from text.
Select the characters that you want to remove. I suggest selecting: remove numbers, punctuation, symbols, non-ASCII characters. If you want you can also specify custom ones.
On the left, you will see a preview of the cleaning rule you just set up.
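For comparison, here is roughly what this kind of rule does if written by hand in Python. It is a minimal sketch using only the standard library, and it is not Fluidtable's implementation; the function name `remove_characters` is made up for this example.

```python
import re
import string

def remove_characters(text):
    """Drop non-ASCII characters, digits, and ASCII punctuation/symbols."""
    # Keep only ASCII characters (this drops emoji and accented letters).
    text = text.encode("ascii", errors="ignore").decode("ascii")
    # Remove digits and everything in string.punctuation,
    # which also covers symbols such as $ and +.
    text = re.sub(f"[{re.escape(string.punctuation)}0-9]", "", text)
    # Collapse the whitespace runs left behind by the removals.
    return " ".join(text.split())

print(remove_characters("Great phone!!! 10/10, would buy again 😀"))
# Great phone would buy again
```

Note that dropping non-ASCII characters also removes accented letters, so this choice suits English texts best.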
Remove stop words
Stop words are words that don’t add sentiment value, such as "me", "I", and "and". Click on Add a cleaning rule, then add the cleaning rule Remove specific words from text. Select Remove stop words.
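To make the idea concrete, stop word removal can be sketched in Python as a simple filter. The `STOP_WORDS` set below is a tiny hypothetical list chosen for illustration; real stop word lists are much longer, and the list Fluidtable uses is not shown here.

```python
# Tiny illustrative stop word list (an assumption, not Fluidtable's list).
STOP_WORDS = {"i", "me", "my", "and", "the", "a", "an", "is", "it", "to", "of"}

def remove_stop_words(text, stop_words=STOP_WORDS):
    """Keep only the words that are not in the stop word list."""
    # Compare in lowercase so "I" and "The" are matched too.
    kept = [word for word in text.split() if word.lower() not in stop_words]
    return " ".join(kept)

print(remove_stop_words("I love the camera and the battery"))
# love camera battery
```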
Stemming words in a text
To make the sentiment analysis algorithm more effective, you want to stem the words. Stemming transforms words like "borrowed" into "borrow" and "properly" into "proper". Click on Add a cleaning rule, then add the cleaning rule Apply stemming to words in a text.
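As an illustration of what stemming means, here is a deliberately naive suffix stripper in Python. It is a toy, not the Porter stemmer or any other established algorithm, and it says nothing about which stemmer Fluidtable uses; the suffix list and the minimum stem length of 3 are arbitrary assumptions.

```python
# Suffixes to strip, checked in this order (illustrative choice).
SUFFIXES = ("ing", "ly", "ed", "s")

def naive_stem(word):
    """Strip the first matching suffix if at least 3 letters remain."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def stem_text(text):
    """Lowercase the text and stem every whitespace-separated word."""
    return " ".join(naive_stem(word) for word in text.lower().split())

print(stem_text("She borrowed books properly"))
# she borrow book proper
```

Real stemmers handle many more cases (doubled consonants, "ies" endings, and so on), which is one reason to rely on a tested tool rather than hand-rolled rules.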
Bonus: perform sentiment analysis inside Fluidtable
If you don’t have a preference regarding which algorithm performs the sentiment analysis, you can do it inside Fluidtable. It’s quick and easy. Otherwise, just skip this step.
Click on Add a cleaning rule, then add the cleaning rule Replace with sentiment analysis score. The result is a numeric score: less than 0 is negative sentiment, equal to 0 is neutral sentiment, and more than 0 is positive sentiment. The further the score is from 0, the stronger the sentiment.
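To show how a signed score like this can be read, here is a small lexicon-based sketch in Python. The algorithm Fluidtable actually uses is not described here, so this is a swapped-in toy technique: the `LEXICON` words and weights are invented for the example.

```python
# Hypothetical word weights: positive words > 0, negative words < 0.
LEXICON = {"love": 2, "great": 2, "good": 1, "bad": -1, "terrible": -2, "hate": -2}

def sentiment_score(text):
    """Sum the weights of known words; unknown words count as 0."""
    return sum(LEXICON.get(word, 0) for word in text.lower().split())

def sentiment_label(score):
    """Map a score to the negative / neutral / positive reading."""
    if score < 0:
        return "negative"
    if score > 0:
        return "positive"
    return "neutral"

for text in ["love great camera", "terrible battery", "camera battery"]:
    score = sentiment_score(text)
    print(text, score, sentiment_label(score))
```

With this toy lexicon the three sample texts score 4, -2, and 0, so they are labelled positive, negative, and neutral respectively.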
Exporting the result
Click on Start cleaning on the top right of the screen. It will apply all these cleaning rules to all the input texts. It will then create a new table with the prepared texts.
In the toolbar, you will find the `Export` button to download an Excel or CSV with the prepared texts.
Conclusion
Doing sentiment analysis, or preparing texts for it, without writing code is possible and easy. It is powerful and ready for production.
If you have any feedback on how to improve Fluidtable for better sentiment analysis, let me know. I am the founder and developer.
In case you have questions, just leave me a comment.