Working with the Airline Tweets Dataset
You will find the Airline Tweets Dataset in our GitHub Repository. Download it to a suitable folder.
In this part of the tutorial, we will extract bigrams from the data and classify the tweets in R, then extract unigrams and classify the tweets in Weka.
The Airline Tweets Dataset contains tweets written by passengers about their flight experiences with various airlines. Each tweet is either positive or negative.
What are bigrams and unigrams? We hear you! Land here. |
---|
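
As a quick illustration of the two terms, here is a minimal Python sketch, separate from the R and Weka steps used in this tutorial. A unigram is a single token and a bigram is a pair of adjacent tokens. The `ngrams` helper and the sample tweet are hypothetical and are not taken from the dataset; tokenization is a naive whitespace split.

```python
def ngrams(tokens, n):
    """Return all runs of n adjacent tokens, each as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Hypothetical tweet for illustration only (not from the dataset).
tweet = "my flight was delayed again"
tokens = tweet.lower().split()  # naive whitespace tokenization

unigrams = ngrams(tokens, 1)  # single words
bigrams = ngrams(tokens, 2)   # pairs of adjacent words

print(unigrams)
print(bigrams)
```

A five-word tweet yields five unigrams and four bigrams. Bigrams keep some word-order context (for example, "was delayed") that unigrams lose.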
The Airline Tweets Dataset contains 2000 tweets, arranged as follows.
TWEETS | SPLIT |
---|---|
900 POSITIVE TWEETS + 900 NEGATIVE TWEETS | 90% TRAINING SET |
100 POSITIVE TWEETS + 100 NEGATIVE TWEETS | 10% TEST SET |
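
If you ever need to produce a balanced split of this shape yourself, the idea can be sketched in Python as below. This is only an illustration under stated assumptions: the `stratified_split` helper, the `(text, label)` pair layout, and the placeholder tweets are all hypothetical and do not reflect how the files in the repository are actually organized.

```python
import random

def stratified_split(examples, train_fraction=0.9, seed=42):
    """Split (text, label) pairs so each label keeps the same train/test ratio."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    by_label = {}
    for text, label in examples:
        by_label.setdefault(label, []).append((text, label))

    train, test = [], []
    for items in by_label.values():
        rng.shuffle(items)
        cut = round(len(items) * train_fraction)
        train.extend(items[:cut])   # first 90% of each class
        test.extend(items[cut:])    # remaining 10% of each class
    return train, test

# Placeholder data standing in for the real tweets: 1000 per class.
examples = [(f"positive tweet {i}", "positive") for i in range(1000)]
examples += [(f"negative tweet {i}", "negative") for i in range(1000)]

train, test = stratified_split(examples)
print(len(train), len(test))  # 1800 200
```

Splitting each class separately keeps both sets balanced, so the training set has 900 tweets per class and the test set has 100 per class, matching the table above.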