Project

Predictive insights

Client

SentiOne

2 complex models built in 3 months

85 categories for insight classification

0.7 (F1 score) for the popularity prediction model

Client

SentiOne lets companies quickly check what Internet users think about a product, brand, or company. The tool automatically collects reviews and articles from social networking sites, Internet forums, blogs, portals, and other sources in one place, which helps brands improve their communication with consumers online.

The problem

Social media marketing is crucial to modern-day commercial success. It helps businesses connect with their audience, build a stronger brand, drive website traffic, and, of course, increase sales — but how can marketing managers know their content will engage their audience?

Typically, they follow their intuition, drawing on lessons learned through years of on-the-job experience. However, such an approach is not only costly; it also often fails to account for emerging trends and customer preferences.

Moreover, designing compelling content requires ongoing analysis of enormous data sets, including past marketing campaigns, how the audience responded, and the reasons behind each success or failure.

Given there’s an endless trail of posts to view across multiple social media platforms, it’s next-to-impossible for a human to analyze even a small portion of what’s available — at least, not within a reasonable timeframe.

Instead, marketers need a tool that helps them automatically understand what content works on social media in different contexts.

DLabs.AI took on the task of developing a tool that could identify the relationship between user engagement and the semantic and syntactic features of particular content.

We then came up with a system to automatically detect the most engaging content from across the web for any given category based on a defined taxonomy.

The solution

The customer knew their requirements. Nevertheless, to ensure success, we invested time in identifying the core business needs, defining measurable KPIs, and designing tests.

The project’s main objective was to develop algorithms to automatically allocate content to specific industries and themes — using AI to detect the most engaging content per industry and theme.

Step 1: Build the dataset

We started by gathering data from Twitter and Facebook, gleaning insights from content posted by the client’s users. The source of the mention information was the Social Index provided by SentiOne, which, at the time, contained 965 topics on brands, people, products, TV series and films, sports teams, and cities, among others.

Based on our analysis — and following several iterations and client consultations — we created a three-tier taxonomy per industry, including:

  • 10 categories on the 1st level,
  • 46 categories on the 2nd level,
  • 29 categories on the 3rd level.

Once we had the corpora and taxonomy, the final step of this phase was to label the gathered insights.

Human annotators completed the labeling. To support the process, we iteratively selected the examples that had the most significant impact on classification quality, instead of following the typical approach of picking examples at random.

A total of 10,000 unique posts were annotated.
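As an illustration only, the iterative selection of examples for annotation can be sketched as follows. The case study does not state which selection criterion was used; margin-based uncertainty sampling is an assumption here, and the function names, post ids, and probabilities are hypothetical.

```python
from typing import Dict, List, Sequence


def margin(probs: Sequence[float]) -> float:
    """Gap between the two highest class probabilities (small gap = uncertain)."""
    top_two = sorted(probs, reverse=True)[:2]
    if len(top_two) < 2:
        return 1.0  # with a single class there is nothing to be uncertain about
    return top_two[0] - top_two[1]


def select_for_annotation(predictions: Dict[str, Sequence[float]], k: int) -> List[str]:
    """Return the ids of the k unlabeled posts the current model is least sure about."""
    ranked = sorted(
        predictions,
        key=lambda post_id: (margin(predictions[post_id]), post_id),
    )
    return ranked[:k]


# Hypothetical class probabilities from a model trained on the posts labeled so far.
unlabeled = {
    "post-a": [0.90, 0.05, 0.05],
    "post-b": [0.40, 0.35, 0.25],
    "post-c": [0.50, 0.48, 0.02],
    "post-d": [0.70, 0.20, 0.10],
}

# Posts with the smallest margins go to the human annotators next.
next_batch = select_for_annotation(unlabeled, k=2)
print(next_batch)
```

In a loop of this kind, the model would be retrained after each annotated batch and the remaining posts re-scored, so every round spends annotator time where the classifier is weakest.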

Step 2: Popularity prediction model

The concept of ‘popularity’ is not so easily defined, as each social media platform measures it using different metrics.

These metrics are directly related to a user’s specific reaction to the content presented to them on a social networking site, and some of them exist only on a particular platform.

Since our project analyzed Facebook and Twitter posts, based on learnings gained from current literature, we decided to divide the work and develop two separate popularity models: one for Twitter, another for Facebook.

We combined different machine learning techniques throughout the project, using expert knowledge and human input to train the algorithms until they exceeded a target accuracy level.

We tested several approaches and models to achieve the best results, and we used the F1 score to measure the quality of the trained models.
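For reference, the F1 score is the harmonic mean of precision and recall. The snippet below is a minimal sketch of the binary case; the labels are made-up sample data, and the case study does not say whether a binary or an averaged multi-class variant was reported.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives means precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels: 1 = "popular" post, 0 = "not popular".
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0]

score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

Unlike plain accuracy, F1 stays informative when popular posts are much rarer than unpopular ones, which is one common reason to prefer it for this kind of task.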

Step 3: Industry and category model

This phase’s starting point was the previously prepared and labeled corpus of social media posts, along with the three-level taxonomy of industries and categories.

After several tests and iterations, we settled on two trained models: one for predicting categories from the highest level of taxonomy, the other for predicting all categories.
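One way to picture the two label spaces is to derive both training targets from the same annotated taxonomy path. This is only a sketch under assumptions: the category names and the path representation are hypothetical, and the case study does not describe how labels were actually encoded for either model.

```python
from typing import List, Tuple

# Hypothetical annotations: each post carries a path through the
# three-level taxonomy, between one and three levels deep.
annotated_posts: List[Tuple[str, Tuple[str, ...]]] = [
    ("New phone camera is amazing", ("Technology", "Mobile", "Smartphones")),
    ("Best pizza in town", ("Food", "Restaurants")),
    ("Election results are in", ("Politics",)),
]


def top_level_label(path: Tuple[str, ...]) -> str:
    """Target for the first model: only the highest taxonomy level."""
    return path[0]


def full_label(path: Tuple[str, ...]) -> str:
    """Target for the second model: the most specific category on the path."""
    return "/".join(path)


top_level_targets = [top_level_label(path) for _, path in annotated_posts]
all_category_targets = [full_label(path) for _, path in annotated_posts]

print(top_level_targets)
print(all_category_targets)
```

Keeping the full path in the fine-grained label means a prediction from the second model can always be rolled up to its top-level category and compared with the first model's output.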

The result

After three months, thanks to the client’s ongoing communication and close cooperation, we managed to successfully build:

  • A ‘popularity’ prediction model scoring >0.7 (F1 score) when classifying content by user engagement.
  • An industry and category model scoring >0.6 (F1 score) when assigning a post to a specific category.

DLabs.AI co-built this tool to help social media managers quickly identify which posts perform best in their specific industry. Social media managers can now create compelling, engaging content in a time-efficient and cost-effective manner.

  

Technologies used

Python

Django

Pandas

fastText

TensorFlow

spaCy

Morfeusz


CLIENT’S OPINION

DLabs.AI maintained open lines of communication and met project deadlines. Professional and collaborative, the team provided excellent feedback and useful insights. Customers can expect a partner that is both technically proficient and dedicated to delivering results.

Daniel Kajak, Head of Business Development at SentiOne
