The idea
- Analyze a user's tweets and classify each one as positive, negative, or neutral using machine learning techniques
- Show how the mood of your tweets changes over time
Why?
- Fun way to experiment with Sentiment Analysis
- Experiment with language detection
How
Gathering data
We analyzed tweets from Switzerland, England, and Brazil. We took extra care to make sure our model does well on Swiss-German text.
Make an awesome model in node
We created a custom, fast natural language processor in node.js. Why node? It performs very well at run time when dealing with lots and lots of strings. We used unsupervised machine learning techniques to teach our model how Swiss German and English are written. Once we had a working model, we added a couple of other models based on Bayesian inference to create an ensemble (en.wikipedia.org/wiki/Ensemble_learning).
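To make the Bayesian part more concrete, here is a minimal sketch of what a naive Bayes sentiment model and a simple averaging ensemble could look like in node.js. It is an illustration under assumptions, not the project's actual code: the NaiveBayes class, the ensemblePredict function, the tokenizer, and the tiny training sentences are all hypothetical.

```javascript
// Illustrative sketch only: a multinomial naive Bayes classifier with
// Laplace (add-one) smoothing. All names and data here are assumptions.
class NaiveBayes {
  constructor() {
    this.labels = new Map(); // label -> { docs, words, counts: Map(word -> count) }
    this.vocab = new Set();  // every word seen during training
    this.totalDocs = 0;
  }

  // Lowercase and split on anything that is not a letter or a digit.
  static tokenize(text) {
    return text.toLowerCase().split(/[^\p{L}\p{N}]+/u).filter(Boolean);
  }

  train(text, label) {
    if (!this.labels.has(label)) {
      this.labels.set(label, { docs: 0, words: 0, counts: new Map() });
    }
    const stats = this.labels.get(label);
    stats.docs += 1;
    this.totalDocs += 1;
    for (const word of NaiveBayes.tokenize(text)) {
      stats.counts.set(word, (stats.counts.get(word) || 0) + 1);
      stats.words += 1;
      this.vocab.add(word);
    }
  }

  // Returns an object mapping each label to P(label | text).
  probabilities(text) {
    const tokens = NaiveBayes.tokenize(text);
    const logScores = new Map();
    for (const [label, stats] of this.labels) {
      let score = Math.log(stats.docs / this.totalDocs); // log prior
      for (const word of tokens) {
        const count = stats.counts.get(word) || 0;
        score += Math.log((count + 1) / (stats.words + this.vocab.size));
      }
      logScores.set(label, score);
    }
    // Normalize in log space to avoid underflow on long texts.
    const max = Math.max(...logScores.values());
    let sum = 0;
    for (const s of logScores.values()) sum += Math.exp(s - max);
    const probs = {};
    for (const [label, s] of logScores) probs[label] = Math.exp(s - max) / sum;
    return probs;
  }
}

// Ensemble by averaging the class probabilities of several models.
function ensemblePredict(models, text) {
  const totals = {};
  for (const model of models) {
    const probs = model.probabilities(text);
    for (const label of Object.keys(probs)) {
      totals[label] = (totals[label] || 0) + probs[label] / models.length;
    }
  }
  const best = Object.keys(totals).reduce((a, b) => (totals[a] >= totals[b] ? a : b));
  return { label: best, probabilities: totals };
}

// Usage with two tiny models trained on made-up sentences.
const modelA = new NaiveBayes();
modelA.train('I love this sunny day', 'positive');
modelA.train('what a great match', 'positive');
modelA.train('I hate waiting for the train', 'negative');
modelA.train('this is terrible news', 'negative');

const modelB = new NaiveBayes();
modelB.train('love it, great stuff', 'positive');
modelB.train('terrible, I hate it', 'negative');

const result = ensemblePredict([modelA, modelB], 'I love this great day');
console.log(result.label, result.probabilities);
```

Averaging probabilities is only one way to combine models; weighted voting or stacking are common alternatives, and the text above does not say which one was used.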
Make a nice front-end
Once we got our server working, we thought about improving the UI. We asked our User Experience specialist, Laura, to suggest improvements. See for yourself:
Problems and learnings
Language detection is needed to use the right sentiment model
Designing a model for Swiss German is especially hard: the language is based on German but incorporates a lot of French and Italian words. The spelling of words also changes from canton to canton. Add the fact that most people are forced to use abbreviations when writing tweets, and you get the whole picture of the challenge.
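One common way to approach language detection on short, inconsistently spelled text is to compare character trigram profiles, since they tolerate spelling variation better than whole-word lookups. The sketch below illustrates that idea only and is not necessarily what this project does: the function names, the "en" and "gsw" labels, and the two tiny sample sentences are assumptions, and a real detector would need far more text per language.

```javascript
// Illustrative sketch only: language detection by comparing character
// trigram profiles with cosine similarity. Names and samples are assumptions.

// Count overlapping 3-character sequences, keeping letters and single spaces.
function trigramProfile(text) {
  const clean = ` ${text.toLowerCase().replace(/[^\p{L}]+/gu, ' ').trim()} `;
  const profile = new Map();
  for (let i = 0; i + 3 <= clean.length; i++) {
    const gram = clean.slice(i, i + 3);
    profile.set(gram, (profile.get(gram) || 0) + 1);
  }
  return profile;
}

// Cosine similarity between two trigram count maps (0 when either is empty).
function cosine(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (const [gram, count] of a) {
    normA += count * count;
    dot += count * (b.get(gram) || 0);
  }
  for (const count of b.values()) normB += count * count;
  return normA && normB ? dot / Math.sqrt(normA * normB) : 0;
}

// Return the label of the reference profile most similar to the text.
function detectLanguage(text, profiles) {
  const target = trigramProfile(text);
  let best = null;
  let bestScore = -1;
  for (const [lang, profile] of Object.entries(profiles)) {
    const score = cosine(target, profile);
    if (score > bestScore) {
      best = lang;
      bestScore = score;
    }
  }
  return best;
}

// Tiny made-up reference samples; Swiss German spelling varies by region.
const profiles = {
  en: trigramProfile('the weather is nice today and we are going to the city with the train'),
  gsw: trigramProfile('ich ha hüt kei ziit mir gönd go poschte und denn no is kino'),
};

console.log(detectLanguage('we are going to the train', profiles));
console.log(detectLanguage('ich ha kei ziit hüt', profiles));
```

The detected label can then be used to pick which sentiment model scores the tweet.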
An accurate model needs a lot of data
To get good results we needed to incorporate data from many people of different nationalities. The good thing is that the more you use our model, the more accurate it gets.
Training data is available
One of the problems is that irony and sarcasm are hard for humans to understand, especially in short tweets. So they are hard for a machine, too.
If you want to play with our results in this machine learning experiment:
I would like to thank Andrey Poplavskiy for his "css love", and Adrian Philipp for his huge contribution and encouragement towards this project.
PS.
Some of the comments we received were not so nice, but as always we are happy to receive any feedback.