Submitted
By Emil Eifrem
Graph databases based on NoSQL are increasingly the database of choice for big data because they can efficiently manage the zettabytes of unstructured data that is being collected.
According to Gartner, “graph analysis is possibly the single most effective competitive differentiator for organizations pursuing data-driven operations and decisions after the design of data capture.”
There is little doubt that enterprises have hit a major crossroads in traditional data processing due to relational databases being developed for tabular data, built on a consistent structure and a fixed schema.
Data Relationships
Relational database management systems (RDBMS) do a great job on problems that are well defined from the beginning. But answering queries about data relationships, such as providing online retail recommendations to consumers, requires a number of joins between database tables. This can be difficult and time-consuming. Graph databases, however, come into their own in this type of analysis as they were specifically designed to store, query, and analyze connections between things.
One big advantage relational databases have over their NoSQL counterparts, however, is the common SQL language itself. It is used for querying and editing information stored in an RDBMS, regardless of vendor. The easy accessibility and universal acceptance of SQL have been partly responsible for the success of relational databases.
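To make the contrast concrete, here is a minimal Python sketch of the retail recommendation case, assuming a toy in-memory purchase graph. The customer and product names and the `recommend` helper are hypothetical and do not come from any graph database product; the point is only that the answer falls out of following connections between customers and products rather than joining tables.

```python
from collections import Counter

# Hypothetical purchase graph: each customer points to the products they bought.
purchases = {
    "alice": {"laptop", "mouse"},
    "bob": {"laptop", "keyboard", "mouse"},
    "carol": {"laptop", "keyboard"},
    "dave": {"desk"},
}


def recommend(customer, graph):
    """Suggest products bought by customers who share a purchase with `customer`."""
    owned = graph[customer]
    scores = Counter()
    for other, items in graph.items():
        if other == customer or not (owned & items):
            continue  # skip the customer themselves and unconnected customers
        for product in items - owned:
            scores[product] += 1  # one vote per connected customer
    # Highest-scoring products first; ties broken alphabetically for stable output.
    return [p for p, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))]


print(recommend("alice", purchases))
```

A graph database performs this kind of traversal natively and at scale; the sketch only illustrates the shape of the question being asked.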
This is where, up until now, graph technology has been sold short. Every graph database vendor has opted to use a different query language. There hasn’t been a universally used vendor-neutral query language for graph that could send it mainstream.
The wait may be over
A change is on the horizon, however, that could put the graph database on center stage.
Open Graph Query Language
To understand this sea change we must look to the openCypher project—opencypher.org—an openly shared graph query language that does not fetter itself to any vendor or platform and which many of us are confident will bring enormous benefits for both vendors and enterprises.
Cypher is the graph space's best hope for a SQL-like query language that can develop rapidly and encourage competition.
Graph technologies are growing in popularity, primarily due to an increased understanding of real-world use cases, the creation of fully native graph databases, and an expanding deployment of graphs across key verticals such as healthcare and government. These have all contributed to a dramatic change in the graph landscape.
Big names such as Cisco, Facebook, Walmart, and Pitney Bowes have all been fast off the blocks to adopt graph. The emergence of a standard and open declarative query language for graphs via the openCypher project will accelerate uptake.
Cypher has a large amount of user validation and can cope technically with the challenges of querying connected data as it utilizes symbols to show patterns that correspond to the visual and intuitive representation of data. As a declarative query language, Cypher also allows users to center on their domain and pull out the data they want, rather than getting pulled into the quagmire of data access.
Cypher’s other great benefit is its ease of use. It was developed as a human-readable query language, so it appeals to developers and operations professionals alike. It is both powerful and intuitive, combining English prose and intuitive iconography to make queries clearer.
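As a rough illustration of that iconography, the Python sketch below assembles a small Cypher pattern from its parts. The `node` and `rel` helpers, the `Customer` and `Product` labels, and the `BOUGHT` relationship type are hypothetical names chosen for this example; only the pattern syntax itself, round brackets for nodes and an arrow with square brackets for relationships, is Cypher.

```python
def node(var, label):
    """Render a node: the round brackets resemble a circle in a diagram."""
    return f"({var}:{label})"


def rel(rel_type):
    """Render a directed relationship: the dashes and '>' resemble an arrow."""
    return f"-[:{rel_type}]->"


# Hypothetical labels and relationship type, chosen only for this illustration.
pattern = node("c", "Customer") + rel("BOUGHT") + node("p", "Product")
query = f"MATCH {pattern} RETURN c.name, p.name"

print(query)
```

The resulting query text reads much like a whiteboard sketch of a customer connected to a product by a "bought" arrow, which is the sense in which the language is visual.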
With the debut of the openCypher project, the language looks like it will be just as significant in the growth of graph processing and analysis as SQL was in speeding up the adoption of RDBMS.
The openCypher Community
The openCypher project is snowballing; it needs help to ensure it stays as open as possible and reaches its true potential. Over the coming months, more of the language artefacts will move across to GitHub to make them widely available to everyone.
The project delivers four key artefacts released under permissive licenses. These include Cypher reference documentation describing use of the Cypher query language with examples and tutorials, a technology certification kit comprising a number of tests that software suppliers can run to self-certify support for a given version of Cypher, and a reference implementation distributed under the Apache 2.0 license. The latter is a fully functional implementation of key parts of the stack needed to support Cypher inside a data platform tool. Finally, the Cypher language specification, licensed under the Creative Commons license, and a full semantic specification are also planned as a part of the openCypher project.
Benefits of Technology Independence
Data developers and analysts, together with those getting to grips with graph database skills, are already benefiting from Cypher. The openCypher project provides users with an enviable set of marketable skills, tooling will provide easier support for multiple database backends, and enterprises will see the real benefits of true technology independence.
Graph processing is becoming a key part of the big data world. openCypher is set to bring about a transformation in the database arena that can’t be overlooked.
Emil Eifrem is co-founder and CEO of Neo Technology, the company behind Neo4j, the world’s leading graph database.
May 2016, Software Magazine