Neo4j’s Exploration of Cancer Types by David Wells, Aug 2024

SeniorTechInfo

How to Identify and Visualize Clusters in Knowledge Graphs

This post shows how to identify and visualize clusters of cancer types by analyzing the disease ontology as a knowledge graph. We set up Neo4j in a Docker container, import the ontology, generate graph clusters and embeddings, and use dimension reduction to plot the clusters and derive insights. The disease ontology is the example here, but the same steps apply to any ontology or graph database.

Cancer types viewed as embeddings and colored by cluster

In a graph database, data is stored as nodes and relationships between nodes, which makes it possible to surface connections that are not stated explicitly in the data. For instance, melanoma and carcinoma are both subcategories of cell type cancer tumor, so the two cancer types are related through that shared parent.
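To make the node-and-relationship idea concrete, here is a minimal sketch in plain Python, with no database involved. The edge list, the IS_A relationship name, and the simplified node names are illustrative assumptions, not data taken from the disease ontology.

```python
# Toy graph: each edge is (child, relationship, parent).
# Node names and the IS_A type are illustrative, not copied from the ontology.
EDGES = [
    ("melanoma", "IS_A", "cell type cancer"),
    ("carcinoma", "IS_A", "cell type cancer"),
    ("cell type cancer", "IS_A", "cancer"),
    ("cancer", "IS_A", "disease"),
]


def parents(node, edges=EDGES):
    """Direct parents of a node via IS_A relationships."""
    return [dst for src, rel, dst in edges if src == node and rel == "IS_A"]


def ancestors(node, edges=EDGES):
    """All ancestors of a node, nearest first (breadth-first traversal)."""
    seen = []
    queue = parents(node, edges)
    while queue:
        current = queue.pop(0)
        if current not in seen:
            seen.append(current)
            queue.extend(parents(current, edges))
    return seen


def shared_ancestors(a, b, edges=EDGES):
    """Ancestors that two nodes have in common, nearest to `a` first."""
    b_ancestors = set(ancestors(b, edges))
    return [n for n in ancestors(a, edges) if n in b_ancestors]


print(shared_ancestors("melanoma", "carcinoma"))
# → ['cell type cancer', 'cancer', 'disease']
```

Neither melanoma nor carcinoma mentions the other, yet walking the relationships reveals their common parent. A graph database performs this kind of traversal natively.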

Graph database example

Ontologies, formalized sets of concepts and relationships, play a crucial role in biological sciences. The disease ontology showcases the interrelations between different disease types, aiding in data extraction and interpretation.

Neo4j, a graph database management system, can be set up quickly in a Docker container, which keeps the setup for this analysis simple.

    docker run \
        -it --rm \
        --publish=7474:7474 --publish=7687:7687 \
        --env NEO4J_AUTH=neo4j/123456789 \
        --env NEO4J_PLUGINS='["graph-data-science","apoc","n10s"]' \
        neo4j:5.17.0
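
The container can take a little while to start, especially when it downloads plugins. As a rough sketch, the two published ports (7474 for the browser, 7687 for Bolt) can be polled from Python before running any queries. The helper names, retry count, and delay below are hypothetical choices, and an open port only suggests the server is listening, not that every plugin has finished loading.

```python
import socket
import time


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_ports(host="localhost", ports=(7474, 7687), attempts=30, delay=2.0):
    """Poll until all ports accept connections or the attempts run out."""
    for _ in range(attempts):
        if all(port_is_open(host, p) for p in ports):
            return True
        time.sleep(delay)
    return False
```

Calling wait_for_ports() with its defaults checks localhost for up to about a minute and returns True once both ports accept connections.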

Once Neo4j is up and running, you can import the disease ontology using the n10s plugin, enabling you to explore the ontology or embed your data within it.
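
The import step might look roughly like the sketch below, which uses the official neo4j Python driver and the n10s procedures n10s.graphconfig.init and n10s.onto.import.fetch. The ontology URL is an assumption (the OBO PURL for the disease ontology OWL file), as are the function names. The connection details mirror the docker command above. Re-running the configuration step on a database that already holds imported data may fail, so treat this as a first-run sketch.

```python
# Assumed source for the ontology file; swap in another URL if needed.
DOID_OWL_URL = "http://purl.obolibrary.org/obo/doid.owl"


def ontology_import_statements(url=DOID_OWL_URL, rdf_format="RDF/XML"):
    """Cypher statements (with parameters) for an n10s ontology import."""
    return [
        # n10s requires a uniqueness constraint on Resource.uri.
        ("CREATE CONSTRAINT n10s_unique_uri IF NOT EXISTS "
         "FOR (r:Resource) REQUIRE r.uri IS UNIQUE", {}),
        # Initialize the n10s graph configuration with default settings.
        ("CALL n10s.graphconfig.init()", {}),
        # Fetch the ontology from the URL and load it into the graph.
        ("CALL n10s.onto.import.fetch($url, $format)",
         {"url": url, "format": rdf_format}),
    ]


def run_import(uri="bolt://localhost:7687", user="neo4j", password="123456789"):
    """Run the import against a live Neo4j instance (needs `pip install neo4j`)."""
    from neo4j import GraphDatabase  # imported here so the module loads without it

    with GraphDatabase.driver(uri, auth=(user, password)) as driver:
        for query, params in ontology_import_statements():
            driver.execute_query(query, parameters_=params)
```

Calling run_import() with the container running executes the three statements in order. Keeping the statements in a separate function makes them easy to inspect or paste into the Neo4j browser instead.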

The full code, covering the clusters and embeddings generated from the ontology and the insights drawn from them, is available on GitHub.
