In this post, I'll try to put together the Neo4j topics worth reading and the resources for them. I hope it can act as a trigger to learn Neo4j and as a quick recap whenever needed.
Important Resources
Books 
Learning Neo4j by Rik Van Bruggen. 
The Definitive Guide to Graph Databases for the RDBMS Developer by Michael Hunger
Tutorials
https://www.lynda.com/Neo4j-tutorials/Up-Running-Neo4j/155604-2.html
http://www.tutorialspoint.com/neo4j/
https://www.airpair.com/neo4j/posts/getting-started-with-neo4j-and-cypher
http://technoracle.blogspot.in/2012/04/getting-started-with-neo4j-beginners.html
Videos
https://www.youtube.com/watch?v=UJ81zWBMguc&list=PLAWPhrZnH759YHRieMBzsQRvr56JcYx5l
https://www.devcasts.io/tag/neo4j/
Sample source code
- Graph and Graph Theory
 - History
 - Seven Bridges of Königsberg, solved by Euler (1736)
 - Eulerian path
 - Field of Study
 - Social Science: Interaction, influence and idea sharing between people.
 - Biological: graphs describe metabolic pathways.
 - Computer Science: Path finding algorithms to analyze effect of change in design of artifacts.
 - Flow: Flow network, Maximum flow problem
 - Route Problems: Hamiltonian path problem, Route inspection problem, Shortest-path problem, Travelling salesman problem, Dijkstra's algorithm, A*
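To make the Königsberg item above concrete, here is a minimal sketch of Euler's degree argument. The land-mass labels A to D are illustrative names, and the check assumes the graph is connected (it only counts odd-degree nodes).

```python
from collections import Counter

# The seven bridges as edges between four land masses.
# The labels A-D are made up for this sketch.
BRIDGES = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"), ("B", "D"), ("C", "D"),
]


def odd_degree_nodes(edges):
    """Return the nodes touched by an odd number of edges, sorted by name."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(node for node, d in degree.items() if d % 2 == 1)


def has_eulerian_path(edges):
    """A connected graph has an Eulerian path only if 0 or 2 nodes have odd degree.

    Connectivity is assumed here, not checked.
    """
    return len(odd_degree_nodes(edges)) in (0, 2)


if __name__ == "__main__":
    print(odd_degree_nodes(BRIDGES))
    print(has_eulerian_path(BRIDGES))
```

All four land masses have an odd number of bridges, so no walk can cross every bridge exactly once.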
 - Graph Database
 - An online DBMS with CRUD operations that works on a graph data model.
 - Generally built for OLTP systems.
 - Engineered with transactional integrity and operational availability in mind.
 - Properties
 - Graph storage:
 - Native storage: designed specifically to store and manage graphs.
 - Non-native storage: graph data kept in a relational or object-oriented store, which is generally slower for traversals.
 - Graph Processing Engine: Native graph processing, a.k.a. index-free adjacency, is the most efficient way to process graphs because connected nodes physically point to each other.
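A toy illustration of the index-free adjacency idea, assuming nothing about Neo4j's actual storage layout: each node object keeps direct references to its neighbours, so a traversal follows pointers instead of looking relationships up in a global index. The `Node` class and the names used are hypothetical.

```python
class Node:
    """In-memory node that holds direct references to its neighbours."""

    def __init__(self, name):
        self.name = name
        self.neighbours = []  # adjacent Node objects, no lookup table involved

    def connect(self, other):
        """Create an undirected relationship between self and other."""
        self.neighbours.append(other)
        other.neighbours.append(self)


def friends_of_friends(start):
    """Walk two hops from start by following references only."""
    excluded = {start.name} | {friend.name for friend in start.neighbours}
    found = set()
    for friend in start.neighbours:
        for candidate in friend.neighbours:
            if candidate.name not in excluded:
                found.add(candidate.name)
    return sorted(found)


if __name__ == "__main__":
    alice, bob, carol = Node("alice"), Node("bob"), Node("carol")
    alice.connect(bob)
    bob.connect(carol)
    print(friends_of_friends(alice))
```

The cost of each hop depends on the number of relationships on the current node, not on the total size of the graph, which is the property the bullet above refers to.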
 - Advantages
 - Minutes-to-millisecond performance.
 - Accelerated development cycles.
 - Extreme business responsiveness.
 - Enterprise ready (ACID, availability, horizontal read scalability, storage of billions of entities)
 - Common Use Cases
 - Fraud Detection, Real-time recommendation engines, Master Data Management, Identity and Access Management, Graph based search
 - Where not to use?
 - Large set-oriented queries - RDBMS is better.
 - Simple aggregate-oriented queries - Document database is better.
 - Neo Databases
 - Suited to network-oriented data (organized in complex networks and deep trees) and semi-structured data.
 - Neo is an embedded persistence engine.
 - Installation and Getting Started
 - Data Model http://neo4j.com/developer/guide-data-modeling/
 - Best Practices
 - Design for queryability.
 - Example user story: "As an employee, I want to know who in the company I work for has similar skills to me, so that we can exchange knowledge."
 - Align relationships with use cases.
 - Look for n-ary relationships.
 - Granulate nodes.
 - Use in-graph indexes when appropriate
 - Pitfalls
 - Rich properties
 - Node representing multiple concepts e.g. country, language and currency.
 - Unconnected graph.
 - Dense node pattern, e.g. the "Madonna and her fans" problem.
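As a rough sketch of designing for queryability, the employee/skills user story above can be modelled with skills and companies as their own nodes rather than as properties. The relationship names (`WORKS_FOR`, `HAS_SKILL`) and all the sample data are assumptions made for this example; in Neo4j the same question would normally be asked in Cypher.

```python
from collections import defaultdict

# (start node, relationship type, end node) triples; every name here is made up.
RELATIONSHIPS = [
    ("Ian", "WORKS_FOR", "Acme"),
    ("Ben", "WORKS_FOR", "Acme"),
    ("Sue", "WORKS_FOR", "Globex"),
    ("Ian", "HAS_SKILL", "Java"),
    ("Ian", "HAS_SKILL", "Neo4j"),
    ("Ben", "HAS_SKILL", "Neo4j"),
    ("Ben", "HAS_SKILL", "Java"),
    ("Sue", "HAS_SKILL", "Neo4j"),
]


def build_adjacency(relationships):
    """Group relationships by (node, type) in both directions."""
    outgoing = defaultdict(set)
    incoming = defaultdict(set)
    for start, rel_type, end in relationships:
        outgoing[(start, rel_type)].add(end)
        incoming[(end, rel_type)].add(start)
    return outgoing, incoming


def colleagues_with_shared_skills(person, relationships):
    """Map each colleague in the same company to the skills shared with person."""
    outgoing, incoming = build_adjacency(relationships)
    colleagues = set()
    for company in outgoing[(person, "WORKS_FOR")]:
        colleagues |= incoming[(company, "WORKS_FOR")]
    colleagues.discard(person)

    my_skills = outgoing[(person, "HAS_SKILL")]
    shared = {}
    for colleague in colleagues:
        common = my_skills & outgoing[(colleague, "HAS_SKILL")]
        if common:
            shared[colleague] = sorted(common)
    return shared


if __name__ == "__main__":
    print(colleagues_with_shared_skills("Ian", RELATIONSHIPS))
```

Because skills are nodes, the question becomes a short walk (person, company, colleague, skill) instead of a comparison of property values across every employee.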
 - Cypher Query language
 - This is a vast topic in itself.
 - I have tried to cover more of it at http://www.i-satyam.blogspot.in/2016/03/neo4j-cypher-query-language.html
 - References
 - Capabilities
 - Data Security: Neo4j does not deal with data encryption explicitly, but supports all means built into the Java programming language and the JVM to protect data by encrypting it before storing.
 - Data Integrity: transactional architecture ensures that data is protected and provides for fast recovery from an unexpected failure.
 - Data Integration: event-based synchronization, periodic synchronization, periodic full export/import of data.
 - Availability and Reliability: Cold Spare, Hot Spare, High Availability Cluster
 - Capacity: File Size, Read Speed, Write Speed, Data Size
 - Transaction Management
 - The default isolation level is read-committed.
 - The Neo4j Java API enables explicit locking of nodes and relationships, which makes it possible to simulate higher isolation levels by acquiring and releasing locks explicitly.
 - Default Locking Behavior:
 - When adding, changing or removing a property on a node or relationship, a write lock is taken on that node or relationship.
 - When creating or deleting a node, a write lock is taken on that node.
 - When creating or deleting a relationship, a write lock is taken on that relationship and both of its nodes.
 - Handling Deadlock
 - TransactionTemplate class
 - We can also use our own retry-loop code.
 - Creating unique nodes
 - Creating the nodes from a single thread ensures uniqueness.
 - Unique constraints and cypher can also help with this.
 - A legacy index can also guarantee uniqueness through its putIfAbsent operation.
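The "own retry-loop" option mentioned under Handling Deadlock could look roughly like the sketch below. It is database-agnostic: `DeadlockDetected` is a hypothetical stand-in for whatever deadlock error your driver raises, and `run_with_retry`, `max_attempts` and `base_delay` are names invented for this example.

```python
import time


class DeadlockDetected(Exception):
    """Hypothetical stand-in for the deadlock error raised by a database driver."""


def run_with_retry(work, max_attempts=5, base_delay=0.01):
    """Call work() and retry it when a deadlock is reported.

    work is expected to open, run and commit its own transaction, so that
    each retry starts from a clean state. The last failure is re-raised.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return work()
        except DeadlockDetected:
            if attempt == max_attempts:
                raise
            # Back off a little longer after each failed attempt.
            time.sleep(base_delay * attempt)


if __name__ == "__main__":
    attempts = {"count": 0}

    def flaky_transaction():
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise DeadlockDetected()
        return "committed"

    print(run_with_retry(flaky_transaction))
```

The important design point is that the whole unit of work is retried, not just the statement that failed, since the deadlocked transaction has already been rolled back.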
 
And there is a lot more to learn! Hope this kick-starts the learning.

              
         
      