Sentiment Analysis Classification using Classical Machine Learning Approaches
Step-by-Step Tutorial with R and WEKA
Objective
To provide a hands-on Classification-101 tutorial using R and WEKA.
Overview & Purpose
This tutorial is an introductory guide to classification using R and WEKA. We will first explore the Iris dataset in R and construct a Decision Tree to see which attribute values determine the splits in the tree. We will then use a Support Vector Machine (SVM) classifier to classify the data.
In the second half of the tutorial, we will classify a set of tweets by the sentiment they express: positive or negative. We will perform Feature Extraction and then classify the tweets using several of the classifiers available in WEKA.
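As a rough preview of the Decision Tree step, the sketch below fits a tree to the built-in iris data and prints its splits. It assumes the rpart package (normally bundled with R) purely for illustration; the tutorial's own scripts may use a different package, and the names tree_model, predicted and confusion are hypothetical.

```r
# Minimal sketch: fit a decision tree on the built-in iris data.
# Assumption: rpart is used here for illustration only; the tutorial's
# scripts may rely on a different tree package.
library(rpart)

data(iris)

# Predict Species from the four measurement columns.
tree_model <- rpart(Species ~ ., data = iris, method = "class")

# Printing the model lists each split with the attribute and threshold used.
print(tree_model)

# Predicted classes on the training data, compared with the true labels.
predicted <- predict(tree_model, iris, type = "class")
confusion <- table(actual = iris$Species, predicted = predicted)
print(confusion)
```

The printed tree shows which attributes and thresholds were chosen for each split, and the confusion table shows how many flowers of each species were assigned to each predicted class.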
Tools & Software Required
R- R is a language and environment used for statistical computing, graphics and machine learning. It can be downloaded for Windows and Mac.
RStudio- An IDE for R. It can be downloaded for Windows and Mac.
WEKA- Data mining software that has tools for data pre-processing, classification, regression, clustering, association rules, and visualization. WEKA can be downloaded for Windows and Mac.
How to Read This DIY Tutorial Document
This document is written to be easy for a beginner to follow. Here are some tips for getting the most out of it.
The complete code scripts, the airline dataset, and this document are available on our GitHub account, ArabWICQatar, at https://github.com/ArabWICQatar/SentimentClassificationUsingR. You will need these materials for this tutorial. Feel free to “fork” the repository.
Commands for the RStudio console are marked with “>”. Any line that starts with “>” is a command that you can type into the RStudio console.
If you lack the background for a section, appropriate references are provided for that section.
If a section is marked “optional”, you can skip it without affecting your understanding of the upcoming sections. But beware, you will be missing out on the fun! :)
You will find certain commands marked “fun”. These have been included to give you a quick break.
We at ArabWIC Qatar hope you enjoy this tutorial!