Graph database use case: social media analysis
Graph databases can be used in many different scenarios, but they are commonly used to analyze social networks. In fact, social networks make an ideal use case because they involve a large volume of nodes (user accounts) and multi-dimensional connections (engagements in many different directions). A graph analysis for a social network can determine:
- How active are users? (number of nodes)
- Which users have the most influence? (density of connections)
- Who has the most two-way engagement? (direction and density of connections)
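As a rough illustration, the three questions above can be answered from a plain list of directed engagements. This is a minimal sketch, not a production graph query: the account names, the `edges` list, and the `summarize` helper are all hypothetical, and "influence" is assumed here to mean the number of distinct incoming engagements.

```python
from collections import Counter

# Hypothetical sample data: (source, target) means source engaged with target.
edges = [
    ("ana", "ben"), ("ben", "ana"),
    ("cy", "ana"), ("dee", "ana"),
    ("ana", "cy"), ("dee", "ben"),
]

def summarize(edges):
    """Answer the three questions for a list of directed engagement edges."""
    edge_set = set(edges)  # ignore repeated engagements between the same pair

    # How many users appear in the graph? (number of nodes)
    users = {user for edge in edge_set for user in edge}

    # Who receives the most engagements? (assumed proxy for influence)
    incoming = Counter(target for _, target in edge_set)

    # Who has the most reciprocated engagements? (direction and density)
    mutual = Counter(
        src for src, dst in edge_set
        if src != dst and (dst, src) in edge_set
    )

    return {
        "active_users": len(users),
        "most_influential": incoming.most_common(1)[0][0],
        "most_two_way": mutual.most_common(1)[0][0],
    }

summary = summarize(edges)
print(summary)
```

A dedicated graph database would express the same questions as traversals over stored nodes and edges rather than as in-memory counters.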
However, this information is useless if it has been unnaturally skewed by bots. Fortunately, graph analytics can provide an excellent means for identifying and filtering out bots.
In a real-world use case, the Oracle team used Oracle Marketing Cloud to evaluate social media advertising and traction, specifically to identify fake bot accounts that skewed the data. The most common behavior by these bots involved retweeting target accounts, thus artificially inflating their popularity. A simple pattern analysis using retweet count and density of connections to neighbors made this behavior visible. Naturally popular accounts showed different relationships with their neighbors than bot-driven accounts did.
This image shows naturally popular accounts.
And this image shows the behavior of a bot-driven account.
The key here is using the power of graph analytics to identify a natural pattern versus a bot pattern. From there, it’s as simple as filtering out those accounts, though it’s also possible to dig deeper to examine, say, the relationship between bots and retweeted accounts.
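The detect-then-filter idea can be sketched in a few lines. The text does not give the actual rule or thresholds, so everything here is an assumption for illustration: an account is treated as suspect when its retweet count is high while its neighbors are rarely connected to one another, and the names `neighbor_density`, `flag_suspects`, `filter_out`, `min_retweets`, and `max_density` are hypothetical.

```python
from itertools import combinations

def neighbor_density(graph, account):
    """Fraction of an account's neighbor pairs that are connected to each other."""
    neighbors = graph.get(account, set())
    if len(neighbors) < 2:
        return 0.0
    pairs = list(combinations(sorted(neighbors), 2))
    linked = sum(1 for a, b in pairs if b in graph.get(a, set()))
    return linked / len(pairs)

def flag_suspects(graph, retweets, min_retweets=100, max_density=0.1):
    """Assumed rule: many retweets combined with sparsely connected neighbors."""
    return sorted(
        account for account in graph
        if retweets.get(account, 0) >= min_retweets
        and neighbor_density(graph, account) <= max_density
    )

def filter_out(graph, suspects):
    """Return a copy of the graph with the suspect accounts and their edges removed."""
    removed = set(suspects)
    return {a: nbrs - removed for a, nbrs in graph.items() if a not in removed}

# Hypothetical undirected graph: account -> set of connected accounts.
graph = {
    "acct_organic": {"a", "b", "c"},
    "a": {"acct_organic", "b", "c"},
    "b": {"acct_organic", "a", "c"},
    "c": {"acct_organic", "a", "b"},
    "acct_suspect": {"x", "y", "z"},
    "x": {"acct_suspect"},
    "y": {"acct_suspect"},
    "z": {"acct_suspect"},
}
# Hypothetical retweet counts per account.
retweets = {"acct_organic": 500, "acct_suspect": 450, "a": 3, "b": 5, "c": 2}

suspects = flag_suspects(graph, retweets)
cleaned = filter_out(graph, suspects)
print(suspects)
```

With this sample data, both high-retweet accounts pass the retweet threshold, but only `acct_suspect` is flagged, because the neighbors of `acct_organic` are all connected to one another. Real thresholds would need to be tuned against known accounts.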
Social media networks do their best to eliminate bot accounts, because bots degrade the experience of the overall user base. To verify that this process of bot detection was accurate, flagged accounts were checked after a month. The results were as follows:
- Suspended: 89%
- Deleted: 2.2%
- Still active: 8.8%
This extremely high percentage of punished accounts (91.2%) showed the accuracy of both pattern identification and the cleansing process. This would have taken significantly longer in a standard tabular database, but with graph analytics, it’s possible to identify complex patterns quickly.