Analyzing Political Views of Netizens: Real-time data streaming pipeline on Google Cloud Platform
Sentiment analysis has been predominantly used for analysis of customer feedback on products and services. Microblogging sites like Twitter are a very good platform for getting these reviews in an informal setting. Sentiment analysis of social media data is increasingly used by organizations as an effective tool to monitor user preferences and opinion.
With the advent of the Internet and Social Media in India, there is a surge in the expression of political views by the citizens on Twitter. Political parties also actively use this medium to create a targeted campaign and promote their ideology.
This was an opportunity for us to build an automated system that can analyze and monitor tweets in real time. The application was developed in less than 72 hours during Hackathon 2018 at Core Compete. We chose Google Cloud as our platform. Google Cloud Platform (GCP) provides a rich set of tools for creating scalable, serverless applications in a very short span of time, and it makes it possible to build a real-time data streaming pipeline efficiently and at very low cost. The availability of serverless services such as Cloud Functions, Cloud Pub/Sub, BigQuery, and the machine learning APIs made it very easy to develop the end-to-end pipeline to stream tweets and analyze them.
Solution Architecture and Technology Platform
The Google Cloud products listed below were used for end-to-end integration of the data pipeline with Google Data Studio.
- Cloud Pub/Sub
- Cloud Functions
- BigQuery
- Natural Language API
- Data Studio
Cloud Pub/Sub, a fully managed real-time messaging service, was used to stream data from the Twitter API for popular election-related hashtags. A Cloud Function processes each tweet: it calls the Google Cloud Natural Language API to get the sentiment score of the message and appends the processed data to a BigQuery table.
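The per-tweet processing step described above can be sketched as follows. This is a minimal illustration, not the code written during the hackathon. It assumes the Pub/Sub event carries a base64-encoded JSON payload with hypothetical `id` and `text` fields, and it takes the sentiment call as an injected function (`analyze`, also hypothetical) so that the example runs without cloud credentials. The output column names are likewise assumptions.

```python
import base64
import json
from datetime import datetime, timezone
from typing import Callable, Dict, Tuple

# Hypothetical analyzer signature: tweet text -> (score, magnitude).
SentimentFn = Callable[[str], Tuple[float, float]]


def build_row(event: Dict, analyze: SentimentFn) -> Dict:
    """Decode one Pub/Sub event and return a row for the results table."""
    # The message body is assumed to arrive base64-encoded under "data".
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    text = payload["text"]

    # Delegate scoring to the injected analyzer.
    score, magnitude = analyze(text)

    # Column names below are illustrative, not the original schema.
    return {
        "tweet_id": str(payload["id"]),
        "text": text,
        "sentiment_score": score,
        "sentiment_magnitude": magnitude,
        "processed_at": datetime.now(timezone.utc).isoformat(),
    }
```

In a deployed function, `analyze` would presumably wrap the Natural Language API sentiment request, and the returned row would be appended to the BigQuery table. Keeping both calls outside `build_row` makes the transformation easy to test locally.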
Google Cloud Natural Language reveals the structure and meaning of text both through powerful pre-trained machine learning models in an easy to use REST API and through custom models that are easy to build with AutoML Natural Language BETA.
Cloud Natural Language can be used to extract information about people, places, events, and much more mentioned in text documents, news articles, or blog posts. It can be used to understand the sentiment about your product on social media or parse intent from customer conversations happening in a call center or in a messaging app.
Google Data Studio was used to create the summary dashboard that displays real-time sentiment analysis of election-related tweets. The data used for visualization is stored in BigQuery, which is updated in real time.
The real-time data streaming solution for sentiment analysis described above has multiple applications. Customers use Twitter to express their sentiments about products and services, which gives organizations a new opportunity to interact with customers in real time.
Depending on the business objective of the sentiment analysis, a minor modification to the hashtags used is enough to stream the relevant tweets and pass them to the sentiment analysis pipeline. The dashboard in Data Studio can be customized to report the required metrics. The Natural Language API on Google Cloud Platform produces sentiment scores at the document level, where a document is the group of words parsed in one go. Consumers, however, often express different sentiments about different products or services in the same tweet or forum post, so a single document-level score may not be enough. Custom models suited to the business objective can be trained and deployed on Google Cloud Platform at low cost. Core Compete has previously deployed a Voice of Customer (VoC) solution for Lenovo on the AWS platform using SAS Text Analytics technology (Digital Transformation responding faster to the voice of the customer in product planning). Sentiment analysis has great value in managing customer experience and product development.
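To report a document-level score on a dashboard, it is often convenient to bucket it into a label first. The sketch below assumes the score lies between -1.0 (negative) and 1.0 (positive), as the Natural Language API reports it. The function name and the `neutral_band` threshold are illustrative choices, not part of the original solution, and would need tuning for a given business objective.

```python
def label_sentiment(score: float, neutral_band: float = 0.25) -> str:
    """Map a document-level sentiment score to a coarse label.

    Scores within +/- neutral_band of zero are treated as neutral.
    The default band is an arbitrary illustrative value.
    """
    if not -1.0 <= score <= 1.0:
        raise ValueError("score must be between -1.0 and 1.0")
    if score >= neutral_band:
        return "positive"
    if score <= -neutral_band:
        return "negative"
    return "neutral"
```

A label like this describes the whole tweet only. A tweet that praises one product and criticizes another can still land in the neutral band, which is the case where a custom model is worth considering.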
Below are some of the use cases:
Customer Service Experience: To maintain a long-term relationship with the customer, providing an excellent customer service experience is crucial. Customers express their views on various forums and also reach out to call centres to resolve their issues. They use social media such as Twitter and Facebook to share their experiences, both good and bad. Reviews from the various online platforms and transcripts of conversations with call centre agents can be combined and used to analyse customer sentiment. Real-time data streaming and analysis can help organizations address issues immediately and serve customers better.
A system can be designed to monitor customer sentiment in real time using customers' posts on online forums, including Facebook and Twitter. Based on the severity of the sentiment, agents can be assigned to reach out to the customer and resolve the issue.
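One way to sketch the severity-based assignment described above is a priority queue ordered by sentiment score, so that the most negative posts reach an agent first. The class names, the `threshold` default, and the field names are hypothetical and chosen only for illustration.

```python
import heapq
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(order=True)
class Ticket:
    score: float                          # lower score = more negative post
    post_id: str = field(compare=False)
    text: str = field(compare=False)


class TriageQueue:
    """Queue of negative posts, most severe first."""

    def __init__(self, threshold: float = -0.5) -> None:
        # Posts scoring above the threshold are not escalated.
        self.threshold = threshold
        self._heap: List[Ticket] = []

    def add(self, post_id: str, text: str, score: float) -> bool:
        """Queue the post if it is negative enough; return whether it was queued."""
        if score > self.threshold:
            return False
        heapq.heappush(self._heap, Ticket(score, post_id, text))
        return True

    def next_for_agent(self) -> Optional[Ticket]:
        """Return the most negative pending post, or None if the queue is empty."""
        return heapq.heappop(self._heap) if self._heap else None
```

A min-heap keyed on the score keeps insertion and retrieval cheap as posts stream in. A production system would likely also weigh factors such as customer tier or post age.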
Telecom, banking, and airlines are domains where such a use case can be highly effective. Sentiment analysis also plays a significant role in loyalty management in the hospitality and tourism industry.
Product Development: Electronics and fashion items have a very short life span, so it is important to design and develop products that get a strong response from customers at launch. Customers are generally generous in voicing their opinions and preferences, and they are likely to discuss their likes and dislikes on social forums. A deep-dive analysis of these discussions provides significant information for the managers of the products or services. Rich information about customers' preferences is useful for investing in the right categories and types of products.
Public Service and Governance: As per Statista, there are around 3.8 billion internet users in the world. According to Kantar IMRB, there will be 627 million internet users in India alone. Citizens are using various platforms available on the internet to give the government feedback on its policies and schemes. Opinion on policies also varies by region and demographics. The government has made apps available for citizens to share their views or suggestions about any scheme. It can use the rich information available on these platforms to listen to its citizens and implement changes accordingly. As with any product, citizens may approve of the objective of a scheme but disapprove of the mechanism adopted to execute it.
The solution developed by Core Compete has wide applications across industries, especially in manufacturing, retail, telecom, hospitality, and banking. Real-time data streaming pipelines for sentiment analysis not only enable organizations to act faster but also help deliver a great customer experience.