What is Graphical Data?

Watch the video presentation:

Unlike most other data stores, a graphical database (more commonly called a graph database) stores the relationships between different entities. But what is an entity? Any object that can be linked to another object is an entity. For example, in the LinkedIn database, the LinkedIn users could be the entities. A relationship between these entities could be something like “Friendship”, “Subscriber”, etc.

The graphical database is optimized to store these entities and relationships and to help us navigate easily across them. It also helps us compute the distance from one entity to another. Suppose a friend of one of your friends on LinkedIn works for a company you want to apply to. Reaching that user takes 2 communication links (you to your friend, then your friend to that user). The longer the distance, the weaker the relationship.

What Are the Components of a Graphical Database Called?

  1. Nodes – These are the entities we discussed in the previous section. In this example, the nodes are Bob and Alice, but also a group, Chess. Hence, the same graphical database can hold different entity types, here person and game club (Chess).
  2. Edges – These are the relationships between those nodes. In this example, they are “Membership” of the club and “Friendship/Acquaintance” between Bob and Alice.
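To make the two terms concrete, here is a minimal Python sketch of the Bob, Alice, and Chess example. The `Node` and `Edge` classes and the `neighbors` helper are hypothetical names chosen for illustration, not the API of any particular graph database, and the sketch assumes that both Bob and Alice are members of the Chess club.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    kind: str  # entity type, e.g. "person" or "club"


@dataclass(frozen=True)
class Edge:
    source: Node
    target: Node
    relation: str  # e.g. "Friendship" or "Membership"


bob = Node("Bob", "person")
alice = Node("Alice", "person")
chess = Node("Chess", "club")

# Assumption: both people belong to the club.
edges = [
    Edge(bob, alice, "Friendship"),
    Edge(bob, chess, "Membership"),
    Edge(alice, chess, "Membership"),
]


def neighbors(node, edges):
    """Return (neighbor, relation) pairs, treating every edge as undirected."""
    result = []
    for edge in edges:
        if edge.source == node:
            result.append((edge.target, edge.relation))
        elif edge.target == node:
            result.append((edge.source, edge.relation))
    return result


for other, relation in neighbors(bob, edges):
    print(f"Bob --{relation}--> {other.name} ({other.kind})")
```

Note that person nodes and club nodes live in the same edge list; only the `kind` field tells them apart.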

What’s a Practical Example of a Graphical Database?

The most important and fundamental element in graphical data is distance. The distance is the number of hops required to reach a target person.

In this example, if A wants to reach out to B, only a single hop is required, so the two are very close. But if D wants to reach out to C, 3 hops are needed: D to A, A to B, and finally B to C.
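The hop count described above can be computed with a breadth-first search. The following is a minimal sketch rather than what a graph database does internally; it assumes the example graph contains only the links D–A, A–B, and B–C, and the function name `hop_distance` is made up for illustration.

```python
from collections import deque

# Undirected example graph as an adjacency list (assumed links: D-A, A-B, B-C).
graph = {
    "A": ["B", "D"],
    "B": ["A", "C"],
    "C": ["B"],
    "D": ["A"],
}


def hop_distance(graph, start, target):
    """Return the minimum number of hops from start to target, or None if unreachable."""
    if start == target:
        return 0
    visited = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor == target:
                return dist + 1
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


print(hop_distance(graph, "A", "B"))  # 1
print(hop_distance(graph, "D", "C"))  # 3
```

Breadth-first search visits nodes in order of increasing distance, so the first time it meets the target it has found the shortest path.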

What are Use-Cases for Graphical Data?

This information is highly useful for tasks like fraud detection. Suppose we have identified a fraudster in YouTube Ads. Using his friends and the accounts related to him by phone number or IP address, we can compute his distance from other users. If some users are near this fraudster, we can hypothesize that these nearby connections are more likely to be fraudsters too than the users who are far away.
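As a rough sketch of this idea, the snippet below lists every account within a chosen number of hops of a known fraudster. The account names, the links between them, and the 2-hop cutoff are all invented for illustration. A real system would derive the links from shared phone numbers, IPs, or friendships, and would treat the result as a list of candidates to review rather than as proof.

```python
from collections import deque


def within_hops(graph, source, max_hops):
    """Return {account: hops} for every account within max_hops of source."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand beyond the cutoff
        for neighbor in graph.get(node, []):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    del distances[source]  # the fraudster is not a suspect of himself
    return distances


# Hypothetical links between accounts.
links = {
    "fraudster": ["acct_1", "acct_2"],
    "acct_1": ["fraudster", "acct_3"],
    "acct_2": ["fraudster"],
    "acct_3": ["acct_1", "acct_4"],
    "acct_4": ["acct_3"],
}

suspects = within_hops(links, "fraudster", max_hops=2)
for account, hops in sorted(suspects.items(), key=lambda item: item[1]):
    print(account, hops)
```

Here `acct_4` sits 3 hops away, so it falls outside the cutoff and is not flagged.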

The degree of a node is the number of connections a particular entity has. Hence, if a person has many followers on LinkedIn, his degree is probably very high. This is used when we need to find an influencer on social media for marketing and sales purposes. Distance and degree can also be used together to come up with new ideas. I will leave it to you to think about this and give comments in the chat, and we can go through a few in the next video. You can see how this all works in the video below.
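Degree is simple to compute from a list of edges. The sketch below uses a made-up connection list and a hypothetical `degrees` helper; it counts how many connections each user has and picks the most connected one as an influencer candidate.

```python
def degrees(edges):
    """Count the connections of each node in a list of undirected (a, b) edges."""
    counts = {}
    for a, b in edges:
        counts[a] = counts.get(a, 0) + 1
        counts[b] = counts.get(b, 0) + 1
    return counts


# Hypothetical connections between users.
connections = [
    ("Ann", "Bo"),
    ("Ann", "Cy"),
    ("Ann", "Di"),
    ("Bo", "Cy"),
]

degree_by_user = degrees(connections)
top_user = max(degree_by_user, key=degree_by_user.get)
print(top_user, degree_by_user[top_user])  # Ann 3
```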

Files for the Tutorial