The List of Featured Graph Database Overviews and Benchmarks

Eugene Lahansky

Graph data stores provide index-free adjacency, which lets them traverse connected data much faster than a traditional RDBMS. Naturally, performance is the main concern for those who work with such databases. To predict the behavior of a graph database and find potential issues before actually implementing it, developers use benchmarks that simulate the workloads that real users will create. This post covers some useful graph database overviews and benchmarks.
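To make the idea of index-free adjacency concrete, here is a minimal Python sketch. It is only an illustration, not how any particular graph database is implemented, and the names `Node`, `edge_table`, and `edge_index` are hypothetical. It stores the same tiny graph twice: once with each node holding direct references to its neighbors, and once as a relational-style edge table that needs an index lookup for every hop.

```python
from collections import defaultdict


class Node:
    """A graph node that holds direct references to its neighbors."""

    def __init__(self, name):
        self.name = name
        self.neighbors = []  # direct references: following an edge needs no index


def neighbors_via_adjacency(node):
    # Index-free adjacency: follow the references stored on the node itself.
    return [n.name for n in node.neighbors]


def build_edge_index(edge_table):
    # Relational style: edges live in a separate table, and an index maps
    # a source key to its destination keys.
    index = defaultdict(list)
    for src, dst in edge_table:
        index[src].append(dst)
    return index


def neighbors_via_index(edge_index, name):
    # Every hop costs a lookup in the index.
    return list(edge_index.get(name, []))


# The same tiny graph in both representations (made-up data).
alice, bob, carol = Node("alice"), Node("bob"), Node("carol")
alice.neighbors = [bob, carol]
bob.neighbors = [carol]

edge_table = [("alice", "bob"), ("alice", "carol"), ("bob", "carol")]
edge_index = build_edge_index(edge_table)

print(neighbors_via_adjacency(alice))
print(neighbors_via_index(edge_index, "alice"))
```

Both calls return the same neighbors. The difference is where the lookup cost sits: in the first form it is paid once when the references are created, in the second it is paid on every hop.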

Graph DB overviews

There are plenty of resources and publications that describe the basics of graph databases.

  • Survey of Graph Database Models: This paper generalizes the research conducted in the field of graph database modeling. Concentrating on data structures, query languages, and integrity constraints, the authors compare graph database models against network, relational, semantic, object-oriented, and other influential DB models. In addition, the paper provides information on levels of abstraction, base data structure, information focus, and many other characteristics of today’s graph databases.
  • Graph Databases: This book published by O’Reilly discusses how graph databases can help you to manage and query highly connected data. Through examples, you will learn how to design and implement a graph database, discover alternative methods of storing data, such as relational and NoSQL databases, as well as learn the difference between these data storage models.
  • The Current State of Graph Databases: This paper gives an overall summary of the current state of graph databases. Enumerating different categories, algorithms, and paradigms, the authors describe the graph database models in use today.

Graph DB benchmarks

Getting valid performance results is not easy. The good news is that there are a number of graph database benchmarks available. Below you will find a collection of comparisons that have been published over the past couple of years. These might be useful when choosing the best option for your application.

1) Neo4j vs. DEX vs. OrientDB vs. a native RDF repository vs. SGDB

This publication presents the results of a comparative test run against Neo4j, DEX, OrientDB, a native RDF repository, and SGDB on a low-end machine with a two-core 2.4 GHz Intel processor and 2 GB of RAM. Data sets ranged in size from 1 K to 1 M. The workload included operations such as insertion of elements, local traversals, and global traversals. The preliminary results of the tests revealed issues with loading larger data sets into graph databases. In addition, overall performance was poor when the databases performed global traversal operations on larger networks. However, performance was stable for local traversals with 2–3 hops.
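The kind of workload described above (insertion, local traversals limited to a few hops, and global traversals) could be approximated along the following lines. This is a hedged sketch on an in-memory adjacency list, not the benchmark used in the publication. The graph generator, the graph sizes, and the helper names `make_random_graph`, `traverse`, and `timed` are assumptions made for illustration.

```python
import random
import time
from collections import deque


def make_random_graph(num_nodes, edges_per_node, seed=42):
    """Build an adjacency list for a random directed graph (synthetic data)."""
    rng = random.Random(seed)
    return {
        node: [rng.randrange(num_nodes) for _ in range(edges_per_node)]
        for node in range(num_nodes)
    }


def traverse(graph, start, max_hops=None):
    """Breadth-first traversal; max_hops=None visits everything reachable."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if max_hops is not None and depth == max_hops:
            continue  # local traversal: do not expand past the hop limit
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return seen


def timed(fn, *args, **kwargs):
    """Return the function result together with its wall-clock duration."""
    started = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - started


# Sizes are arbitrary and kept small so the sketch runs quickly.
for size in (1_000, 10_000, 100_000):
    graph, load_s = timed(make_random_graph, size, 3)
    local, local_s = timed(traverse, graph, 0, max_hops=2)
    reached, global_s = timed(traverse, graph, 0)
    print(
        f"{size:>7} nodes: load {load_s:.4f}s, "
        f"2-hop visited {len(local)} in {local_s:.6f}s, "
        f"global visited {len(reached)} in {global_s:.6f}s"
    )
```

A local traversal touches only the neighborhood of the start node, while a global traversal has to visit every reachable node, so its cost grows with the size of the graph. Timings from this sketch say nothing about any of the databases in the comparison.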

2) DEX

Another benchmark was designed to test the scalability and performance of DEX for applications with very large data sets. The authors tracked how many nodes and edges the database could create, the resulting size of the database, the time it took to load the database, and how many traversals could be made per unit of time. According to the results of this test, DEX provides sufficient loading and querying speed to deal with large data sets. In addition to performing well with billions of objects, DEX uses only 36 bytes per object.
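To get a feel for what 36 bytes per object means at this scale, here is a back-of-the-envelope calculation. The object counts are assumptions chosen for illustration. The estimate simply multiplies the quoted per-object figure by the number of objects, so it assumes that figure holds uniformly.

```python
BYTES_PER_OBJECT = 36  # per-object figure quoted above for DEX


def estimated_size_bytes(num_objects, bytes_per_object=BYTES_PER_OBJECT):
    """Rough storage estimate: number of objects times bytes per object."""
    return num_objects * bytes_per_object


# Hypothetical object counts, from one million to ten billion.
for num_objects in (1_000_000, 1_000_000_000, 10_000_000_000):
    size_gb = estimated_size_bytes(num_objects) / 10**9  # decimal gigabytes
    print(f"{num_objects:>14,} objects -> {size_gb:,.3f} GB")
```

At one billion objects the estimate comes to 36 GB.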

3) Neo4j vs. MySQL

This publication compares MySQL against Neo4j to find out which of them is more suitable for a data provenance system. In addition to structural query results for MySQL and Neo4j, the comparison takes into account database size and required disk space. Neo4j performed somewhat better than MySQL on most query types. The graph database was much faster (sometimes by a factor of 10) when processing traversal queries. On the other hand, Neo4j required up to twice the disk space used by MySQL, and MySQL needed more space in only one test out of 12.
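The difference between the two models for traversal queries is easiest to see in the shape of the query itself. The sketch below is an assumption-laden illustration, not the setup from the publication: SQLite (through Python's standard `sqlite3` module) stands in for MySQL, and a plain adjacency dictionary stands in for a graph database. The `edge` table, the sample edges, and the function names are hypothetical.

```python
import sqlite3
from collections import defaultdict

# Made-up sample edges: a -> b -> c -> d, plus a -> e.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "e")]

# Relational side: one row per edge; every extra hop adds another self-join.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edge (src TEXT, dst TEXT)")
conn.executemany("INSERT INTO edge VALUES (?, ?)", edges)


def two_hops_sql(start):
    rows = conn.execute(
        "SELECT DISTINCT e2.dst FROM edge e1 "
        "JOIN edge e2 ON e2.src = e1.dst "
        "WHERE e1.src = ?",
        (start,),
    )
    return sorted(row[0] for row in rows)


# Graph side: adjacency lists; a 2-hop query follows neighbors twice.
adjacency = defaultdict(list)
for src, dst in edges:
    adjacency[src].append(dst)


def two_hops_graph(start):
    return sorted({far for near in adjacency[start] for far in adjacency[near]})


print(two_hops_sql("a"))
print(two_hops_graph("a"))
```

Both functions return the nodes exactly two hops away from the start node. The relational version needs one more join for each additional hop, whereas the graph version only repeats the neighbor step. This shows the query shape only and makes no claim about the measured performance of MySQL or Neo4j.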

4) AllegroGraph vs. DEX vs. HypergraphDB vs. InfiniteGraph vs. Neo4j vs. Sones

This paper provides a comparison of current graph database models, including general features (for data storage and querying), data modeling features (data structures, query languages, and integrity constraints), and support for essential graph queries. The bottom line is that current graph DB models still need to mature. In particular, the ecosystem should define standard graph database languages (for defining, manipulating, and querying data) and notions of integrity constraints (to preserve the consistency of the database).

From these benchmarks, we can see that graph databases can outperform relational ones, particularly on traversal queries, although loading large data sets and global traversals remain weak spots. In addition, there are many different approaches to preparing and performing tests, and some of the sources mentioned above describe such approaches and algorithms. If you know of other recent graph database benchmarks, feel free to let us know.

3 Comments
  • kalyan

    Please implement a graph database (a kind of NoSQL). This graph database
    should consist of nodes (which have properties) for entities and edges (which
    have single or multiple properties and can be directional or bidirectional) for
    relationships, and support node indexing and query. The query language has
    the following keywords: START, MATCH, WHERE, RETURN, ORDER BY,
    AGGREGATE, SKIP, and LIMIT.

    Example Input:
    Node martin = graphDB.createNode();
    martin.setProperty("name", "Martin");
    Node pramod = graphDB.createNode();
    pramod.setProperty("name", "Pramod");
    Node barbara = graphDB.createNode();
    barbara.setProperty("name", "Barbara");
    martin.createRelationshipTo(pramod, FRIEND, since = 1998);
    pramod.createRelationshipTo(martin, FRIEND, since = 1998);
    martin.createRelationshipTo(barbara, FRIEND);
    barbara.createRelationshipTo(martin, FRIEND);
    START node = Node:nodeIndex(name = "Barbara")
    MATCH path = node-[relation*1..3]->related_node
    WHERE type(relation) = 'FRIEND'
    RETURN related_node.name, length(path), relation.since;

  • Alex Khizhnyak

    Sorry, kalyan, I do not get your comment. Could you please explain what you mean? Thanks.

  • webdevcloud

    Hi,
    We are planning to host Neo4j 2.3.2 on Cloud Foundry. Currently, the Neo4j version supported on Cloud Foundry is 1.5.x. Can you clarify how to host Neo4j 2.3.2 on Cloud Foundry? I have a Spring Boot application that communicates with CF using a user-provided service, and I have two profiles, cloud and default. What is the best way to run Neo4j in the cloud, with or without storage, so that I can launch the browser? The Neo4j browser runs on port 7474. Can Cloud Foundry listen on port 7474?
