Introduction to Graph Databases
Chicago Graph Database Meet-Up
Max De Marzi
About Me
•   Built the Neography Gem (Ruby wrapper to the Neo4j REST API)
•   Playing with Neo4j since 10/2009
•   My Blog: http://maxdemarzi.com
•   Find me on Twitter: @maxdemarzi
•   Email me: maxdemarzi@gmail.com
•   GitHub: http://github.com/maxdemarzi
Agenda
•   Trends in Data
•   NOSQL
•   What is a Graph?
•   What is a Graph Database?
•   What is Neo4j?
Trends in Data
Data is getting bigger:
“Every 2 days we
create as much
information as we did
up to 2003”

– Eric Schmidt, Google
Data is more connected:
•   Text (content)
•   HyperText (added pointers)
•   RSS (joined those pointers)
•   Blogs (added pingbacks)
•   Tagging (grouped related data)
•   RDF (described connected data)
•   GGG (Giant Global Graph: content + pointers + relationships +
    descriptions)
Data is more Semi-Structured:
• If you tried to collect all the data of every
  movie ever made, how would you model it?
• Actors, Characters, Locations, Dates, Costs,
  Ratings, Showings, Ticket Sales, etc.
NOSQL
Not Only SQL
Less than 10% of the NOSQL Vendors
Key Value Stores
• Most Based on Dynamo: Amazon’s Highly
  Available Key-Value Store
• Data Model:
  – Global key-value mapping
  – Big scalable HashMap
  – Highly fault tolerant (typically)
• Examples:
  – Redis, Riak, Voldemort
Key Value Stores: Pros and Cons
• Pros:
  – Simple data model
  – Scalable
• Cons
  – Create your own “foreign keys”
  – Poor for complex data
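
The "create your own foreign keys" drawback can be illustrated with a minimal sketch. It assumes a plain Python dict standing in for a key-value store; the key scheme (`user:1`), the `friend_keys` field, and the sample names are hypothetical and do not come from any particular product.

```python
# Hypothetical in-memory key-value store: a plain dict stands in for the database.
store = {
    "user:1": {"name": "Andreas", "friend_keys": ["user:2"]},
    "user:2": {"name": "Peter", "friend_keys": ["user:3"]},
    "user:3": {"name": "Emil", "friend_keys": []},
}


def friends_of_friends(store, key):
    """Follow hand-rolled "foreign keys" two hops out; every hop is another lookup."""
    result = []
    for friend_key in store[key]["friend_keys"]:
        for fof_key in store[friend_key]["friend_keys"]:
            if fof_key != key and fof_key not in result:
                result.append(fof_key)
    return result


fof_names = [store[k]["name"] for k in friends_of_friends(store, "user:1")]
print(fof_names)
```

The store itself knows nothing about the links: the application has to keep the `friend_keys` lists consistent and walk them by hand.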
Column Family
• Most Based on BigTable: Google’s Distributed
  Storage System for Structured Data
• Data Model:
  – A big table, with column families
  – Map Reduce for querying/processing
• Examples:
  – HBase, HyperTable, Cassandra
Column Family: Pros and Cons
• Pros:
  – Supports Semi-Structured Data
  – Naturally Indexed (columns)
  – Scalable
• Cons
  – Poor for interconnected data
Document Databases
• Data Model:
  – A collection of documents
  – A document is a key value collection
  – Index-centric, lots of map-reduce
• Examples:
  – CouchDB, MongoDB
Document Databases: Pros and Cons
• Pros:
  – Simple, powerful data model
  – Scalable
• Cons
  – Poor for interconnected data
  – Query model limited to keys and indexes
  – Map reduce for larger queries
Graph Databases
• Data Model:
  – Nodes and Relationships
• Examples:
  – Neo4j, OrientDB, InfiniteGraph, AllegroGraph
Graph Databases: Pros and Cons
• Pros:
  – Powerful data model, as general as RDBMS
  – Connected data locally indexed
  – Easy to query
• Cons
  – Sharding (lots of people working on this)
     • Scales UP reasonably well
  – Requires rewiring your brain
Living in a NOSQL World
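[Figure: data stores plotted by Complexity (vertical axis) against Size (horizontal axis). Key-Value Stores sit at the largest size and lowest complexity, followed by BigTable Clones, Document Databases, and Graph Databases as size falls and complexity rises. Relational Databases (RDBMS) are labeled as covering 90% of use cases.]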
What is a Graph?
• An abstract representation of a set of objects
  where some pairs are connected by links.

             Object (Vertex, Node)

             Link (Edge, Arc, Relationship)
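
As a minimal sketch of this definition, a graph can be written down as a set of objects plus a set of links between pairs of them. The vertex names below are placeholders chosen for illustration.

```python
# A graph as a set of vertices plus a set of links between pairs of them.
vertices = {"a", "b", "c", "d"}
edges = {("a", "b"), ("b", "c")}  # "d" has no links: only some pairs are connected


def neighbors(vertex, edges):
    """Return the vertices linked to `vertex`, treating every edge as undirected."""
    linked = set()
    for left, right in edges:
        if left == vertex:
            linked.add(right)
        elif right == vertex:
            linked.add(left)
    return linked


print(neighbors("b", edges))
print(neighbors("d", edges))
```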
Different Kinds of Graphs
• Undirected Graph
• Directed Graph

• Pseudo Graph
• Multi Graph

• Hyper Graph
More Kinds of Graphs
• Weighted Graph

• Labeled Graph

• Property Graph
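
A property graph combines several of these ideas: relationships are directed, carry a label (their type), and both nodes and relationships hold key/value properties. The sketch below is one possible in-memory shape, not Neo4j's storage format; the class names, the `since` and `weight` keys, and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Arbitrary key/value pairs attached to the node.
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: Node  # relationships are directed: start -> end
    end: Node
    type: str    # the label on the edge, e.g. "KNOWS"
    properties: dict = field(default_factory=dict)


andreas = Node({"name": "Andreas"})
peter = Node({"name": "Peter"})

# A weight is just another property, so a property graph can also act as a weighted graph.
knows = Relationship(andreas, peter, "KNOWS", {"since": 2009, "weight": 0.8})

print(knows.start.properties["name"], knows.type, knows.end.properties["name"])
```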
What is a Graph Database?
• A database with an explicit graph structure
• Each node knows its adjacent nodes
• As the number of nodes increases, the cost of
  a local step (or hop) remains the same
• Plus an Index for lookups
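
The points above can be sketched in a few lines, assuming a toy in-memory model rather than any real database engine: each node keeps direct references to its neighbors, and a separate index (here a dict keyed by a hypothetical `name`) is used only to find the starting node.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.adjacent = []  # direct references to neighboring nodes

    def connect(self, other):
        self.adjacent.append(other)
        other.adjacent.append(self)


# The index is used once, to find a starting node by name...
index = {name: Node(name) for name in ["Andreas", "Peter", "Emil"]}
index["Andreas"].connect(index["Peter"])
index["Peter"].connect(index["Emil"])

# ...after which each hop only reads the current node's own adjacency list,
# so its cost depends on that node's degree, not on the total number of nodes.
start = index["Andreas"]
two_hops = {n2.name for n1 in start.adjacent for n2 in n1.adjacent if n2 is not start}
print(two_hops)
```

Adding a million unrelated nodes to `index` would not change the work done by the two-hop loop, which is the property the slide describes.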
Compared to Relational Databases
• Relational databases: optimized for aggregation
• Graph databases: optimized for connections
Compared to Key Value Stores
• Key-value stores: optimized for simple look-ups
• Graph databases: optimized for traversing connected data
Compared to Document Stores
• Document stores: optimized for “trees” of data
• Graph databases: optimized for seeing the forest and the
  trees, and the branches, and the trunks
What is Neo4j?
• A Graph Database + Lucene Index
• Property Graph
• Full ACID
  (atomicity, consistency, isolation, durability)
• High Availability (with Enterprise Edition)
• 32 Billion Nodes, 32 Billion Relationships,
  64 Billion Properties
• Embedded Server
• REST API
Good For
• Highly connected data (social networks)
• Recommendations (e-commerce)
• Path Finding (how do I know you?)

• A* (Least Cost path)
• Data First Schema (bottom-up, but you still
  need to design)
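
For the least-cost path item, here is a small, generic A* sketch in Python. It is not Neo4j's implementation; the function name, the adjacency-dict graph format, and the `roads` data are assumptions for illustration. With the default zero heuristic it behaves like Dijkstra's algorithm; a custom heuristic should never overestimate the remaining cost.

```python
import heapq


def least_cost_path(graph, start, goal, heuristic=lambda node: 0):
    """A* search over {node: {neighbor: cost}}; returns (cost, path) or None."""
    # Queue entries: (cost so far + heuristic estimate, cost so far, node, path).
    frontier = [(heuristic(start), 0, start, [start])]
    best_cost = {start: 0}
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if cost > best_cost.get(node, float("inf")):
            continue  # stale entry: a cheaper route to this node was already found
        for neighbor, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best_cost.get(neighbor, float("inf")):
                best_cost[neighbor] = new_cost
                heapq.heappush(
                    frontier,
                    (new_cost + heuristic(neighbor), new_cost, neighbor, path + [neighbor]),
                )
    return None


# Hypothetical weighted, directed graph.
roads = {"A": {"B": 1, "C": 5}, "B": {"C": 1, "D": 6}, "C": {"D": 2}, "D": {}}
print(least_cost_path(roads, "A", "D"))
```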
Property Graph
    // then traverse to find results
    start n=(people-index, name, "Andreas")
    match (n)--()--(foaf) return foaf
Cypher
Pattern Matching Query Language (like SQL for graphs)
 // get node 0

 start a=(0) return a

 // traverse from node 1

 start a=(1) match (a)-->(b) return b

 // return friends of friends

 start a=(1) match (a)--()--(c) return c
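
To make the patterns concrete, the sketch below mimics them in plain Python over a list of directed relationships. The node ids and relationships are made up, and excluding the start node from the two-hop result is a simplifying assumption of this sketch, not a statement about Cypher's exact matching rules.

```python
# Directed relationships as (start_id, end_id) pairs; the ids are hypothetical.
relationships = [(1, 2), (2, 3), (4, 1)]


def outgoing(node_id):
    """Nodes b for the pattern (a)-->(b): follow relationships in their direction."""
    return {end for start, end in relationships if start == node_id}


def either_direction(node_id):
    """Nodes b for the pattern (a)--(b): follow relationships in either direction."""
    forward = {end for start, end in relationships if start == node_id}
    backward = {start for start, end in relationships if end == node_id}
    return forward | backward


def two_hops(node_id):
    """Nodes c for the pattern (a)--()--(c), leaving out the start node itself."""
    return {
        c
        for middle in either_direction(node_id)
        for c in either_direction(middle)
        if c != node_id
    }


print(outgoing(1))
print(either_direction(1))
print(two_hops(1))
```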
Gremlin
A Graph Scripting DSL (groovy-based)
 // get node 0

 g.v(0)

 // nodes with incoming relationship

 g.v(0).in

 // outgoing "KNOWS" relationship

 g.v(0).out("KNOWS")
If you’ve ever
•   Joined more than 7 tables together
•   Modeled a graph in a table
•   Written a recursive CTE
•   Tried to write some crazy stored procedure
    with multiple recursive self and inner joins

    You should use Neo4j
Relational model (three tables):

Language          LanguageCountry    Country
language_code     language_code      country_code
language_name     country_code       country_name
word_count        primary            flag_uri

Graph model (two nodes, one relationship):

Language (name, code, word_count) -[IS_SPOKEN_IN {as_primary}]-> Country (name, code, flag_uri)
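
The difference between the two models can be sketched in Python. This is an illustration under assumptions: the codes, rows, and `primary` values are invented sample data, and unrelated columns such as `word_count` and `flag_uri` are left out.

```python
# Relational form: look-up tables plus a join table.
language = {"en": "English", "fr": "French"}  # language_code -> language_name
country = {"CA": "Canada", "FR": "France"}    # country_code -> country_name
language_country = [                           # join table rows (sample values)
    {"language_code": "en", "country_code": "CA", "primary": True},
    {"language_code": "fr", "country_code": "CA", "primary": True},
    {"language_code": "fr", "country_code": "FR", "primary": True},
]


def spoken_in_relational(country_code):
    """Join LanguageCountry to Language for one country."""
    return sorted(
        language[row["language_code"]]
        for row in language_country
        if row["country_code"] == country_code
    )


# Graph form: the country node holds its IS_SPOKEN_IN relationships directly,
# and the join table's flag becomes the as_primary property on each relationship.
canada = {
    "name": "Canada",
    "is_spoken_in": [
        {"language": {"name": "English"}, "as_primary": True},
        {"language": {"name": "French"}, "as_primary": True},
    ],
}


def spoken_in_graph(country_node):
    """Read the languages straight off the node's relationships; no join needed."""
    return sorted(rel["language"]["name"] for rel in country_node["is_spoken_in"])


print(spoken_in_relational("CA"))
print(spoken_in_graph(canada))
```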
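As a property on the country:

name: “Canada”
languages_spoken: “[ ‘English’, ‘French’ ]”

As relationships:

language: “English” -[spoken_in]-> name: “USA”
language: “English” -[spoken_in]-> name: “Canada”
language: “French” -[spoken_in]-> name: “Canada”
language: “French” -[spoken_in]-> name: “France”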
Single Country table:

Country
name
flag_uri
language_name
number_of_words
yes_in_language
no_in_language
currency_code
currency_name

Graph model:

Country (name, flag_uri) -[SPEAKS]-> Language (name, number_of_words, yes, no)
Country (name, flag_uri) -- Currency (code, name)
Neo4j Data Browser
Neo4j Console
console.neo4j.org
Try it right now:
start n=node(*) match n-[r:LOVES]->m return n, type(r), m
Notice the two nodes in red; they are your result set.
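
For readers without the console at hand, the query's effect can be approximated in plain Python: scan every relationship and keep the ones whose type is LOVES. The names and relationships below are hypothetical sample data, not the console's actual dataset.

```python
# Relationships as (start, type, end) triples; the sample data is invented.
relationships = [
    ("Romeo", "LOVES", "Juliet"),
    ("Juliet", "LOVES", "Romeo"),
    ("Romeo", "KNOWS", "Mercutio"),
]


def match_typed(relationships, rel_type):
    """Rough equivalent of: match n-[r:TYPE]->m return n, type(r), m."""
    return [(n, t, m) for n, t, m in relationships if t == rel_type]


rows = match_typed(relationships, "LOVES")
result_nodes = {n for n, _, _ in rows} | {m for _, _, m in rows}
print(rows)
print(sorted(result_nodes))
```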
What does a Graph look like?
Questions?




Thank you!
 http://maxdemarzi.com

Editor's Notes

  • #22: An undirected graph is one in which edges have no orientation: the edge (a, b) is identical to the edge (b, a). A directed graph, or digraph, is an ordered pair D = (V, A). A pseudo graph is a graph with loops. A multi graph allows for multiple edges between nodes. A hyper graph allows an edge to join more than two nodes.
  • #23: A weighted graph has a number assigned to each edge. A labeled graph has a label assigned to each node or edge. A property graph has keys and values for each node or edge.
  • #29: Atomicity = all or nothing. Consistency = the data stays consistent from one transaction to the next. Isolation = no transaction will interfere with another. Durability = once a transaction is committed, it stays committed.
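
The "all or nothing" part of that note can be sketched as follows. This is a toy illustration of atomicity only, assuming a dict as the whole database and a copy-then-swap strategy; it says nothing about how Neo4j implements transactions, and every name in it is hypothetical.

```python
import copy


def run_transaction(graph, operations):
    """Apply every operation or none: work on a copy, swap it in only on success."""
    working = copy.deepcopy(graph)
    for operation in operations:
        operation(working)  # an exception here abandons the working copy
    graph.clear()
    graph.update(working)


def add_peter(g):
    g["nodes"].append("Peter")


def fail(g):
    raise ValueError("simulated failure")


graph = {"nodes": ["Andreas"]}

try:
    run_transaction(graph, [add_peter, fail])
except ValueError:
    pass
print(graph)  # the failed transaction left no partial changes behind

run_transaction(graph, [add_peter])
print(graph)
```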