Introduction
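- graph theory goes back to Euler, whose 1736 paper on the Seven Bridges of Königsberg is usually regarded as its starting point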
There are different graph models to choose from:
- directed vs undirected graphs (edges have direction or not)
- weighted vs unweighted graphs (edges have weights or not)
- cyclic vs acyclic graphs (graphs with cycles or not)
- hypergraphs (edges can connect more than two nodes)
- property graphs (nodes and edges can have properties)
They can all be boiled down to four basic primitives that allow you to build any graph structure:
- nodes (entities)
- labels (names)
- relationships (edges)
- properties (attributes)
The last primitive is what characterizes the property graph model, in which nodes and edges can have key-value pairs associated with them. It is the most common model today.
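As a rough illustration of how the four primitives fit together, here is a minimal in-memory sketch of a property graph. The class names (`Node`, `Relationship`, `PropertyGraph`), the method names, and the sample data (`alice`, `acme`, `WORKS_AT`) are all made up for this example and do not correspond to any particular graph database.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """An entity, tagged with labels and carrying key-value properties."""
    node_id: str
    labels: set[str] = field(default_factory=set)
    properties: dict[str, object] = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, typed edge between two nodes, with its own properties."""
    start: str      # node_id of the source node
    end: str        # node_id of the target node
    rel_type: str   # relationship name, e.g. "WORKS_AT"
    properties: dict[str, object] = field(default_factory=dict)


@dataclass
class PropertyGraph:
    nodes: dict[str, Node] = field(default_factory=dict)
    relationships: list[Relationship] = field(default_factory=list)

    def add_node(self, node_id, *labels, **properties):
        self.nodes[node_id] = Node(node_id, set(labels), dict(properties))

    def add_relationship(self, start, rel_type, end, **properties):
        # Refuse dangling edges: both endpoints must already exist.
        if start not in self.nodes or end not in self.nodes:
            raise KeyError("add both endpoints before the relationship")
        self.relationships.append(
            Relationship(start, end, rel_type, dict(properties))
        )

    def neighbors(self, node_id, rel_type=None):
        """Return ids of nodes reachable over one outgoing relationship."""
        return [
            r.end
            for r in self.relationships
            if r.start == node_id and (rel_type is None or r.rel_type == rel_type)
        ]


# Hypothetical sample data: two nodes, one relationship, properties on each.
graph = PropertyGraph()
graph.add_node("alice", "Person", name="Alice")
graph.add_node("acme", "Company", name="Acme")
graph.add_relationship("alice", "WORKS_AT", "acme", since=2020)

print(graph.neighbors("alice", "WORKS_AT"))  # ['acme']
```

Storing relationships as a flat list keeps the sketch short; a real store would index them by start node so that lookups do not scan every edge.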
Motivation
- Data needs to be understandable in order to have value
- providing a contextualized understanding of data is a powerful way to make sense of it and generate knowledge
- graphs represent complex relationships and networks very intuitively, for both humans and machines
- very flexible, just add more nodes and edges as needed
Definition
- Knowledge graphs are a specific type of graph that provides contextual understanding.
- They provide a holistic view of data, capturing not only entities but also their relationships and attributes.
- Data can come from anywhere: from self-contained graph databases, to data lakes, to federated systems.
Holistic means dealing with or treating the whole of something or someone and not just a part. Federated means a group of entities that are joined together but still maintain their own autonomy.
Organizing principle
- graphs that are functional but not understandable are not very useful
- knowledge graphs are organized around semantics
- semantics is the meaning of things, made explicit so that both people and machines can understand the data
- semantics is what makes knowledge graphs different from regular graphs
- data shouldn’t need a manual to be understood
- organizing principles are contracts between the graph and its users
- taxonomies and ontologies are high-level organizing principles
- taxonomies are hierarchical classifications of concepts
- taxonomies can be defined as Category nodes connected by SUBCLASS_OF relationships
- ontologies capture more complex relationships between concepts
- ontologies allow users and machines to take actions based on the data
- there are several off-the-shelf widely used ontologies for particular domains, such as SNOMED CT
- you can choose to build your own ontology or extend an existing one, depending on your domain and use case
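To make the taxonomy idea concrete, the sketch below models Category nodes connected by SUBCLASS_OF relationships as plain (child, parent) pairs and walks them upward to answer "is this a kind of that?". The category names and the helper functions `ancestors` and `is_a` are invented for illustration; a real knowledge graph would typically run this as a query in its graph database.

```python
# Each pair reads: (child Category) -[SUBCLASS_OF]-> (parent Category).
# The category names are hypothetical.
SUBCLASS_OF = [
    ("Laptop", "Computer"),
    ("Desktop", "Computer"),
    ("Computer", "Electronics"),
    ("Phone", "Electronics"),
]


def ancestors(category, edges):
    """Walk SUBCLASS_OF edges upward and return every broader category."""
    parents = {}
    for child, parent in edges:
        parents.setdefault(child, []).append(parent)

    found = []
    stack = list(parents.get(category, []))
    while stack:
        current = stack.pop()
        if current in found:
            continue  # already visited; also guards against accidental cycles
        found.append(current)
        stack.extend(parents.get(current, []))
    return found


def is_a(category, candidate, edges):
    """True if `category` is `candidate` itself or one of its subclasses."""
    return category == candidate or candidate in ancestors(category, edges)


print(ancestors("Laptop", SUBCLASS_OF))            # ['Computer', 'Electronics']
print(is_a("Laptop", "Electronics", SUBCLASS_OF))  # True
print(is_a("Phone", "Computer", SUBCLASS_OF))      # False
```

This is the kind of contract an organizing principle gives its users: anything stored under Laptop can be found by a query for Electronics without the data having to say so explicitly.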