Fun with Neo4J and JRuby

Neo4J is a graph database built in Java with full support for Transactions. It work with nodes and relationships, you're able to put properties in both, graph structure are very simple structures but have great expressiveness and great work for some problems domains.

Graph are perfect for social media, because you have friends and the friends have friends and they have twitter, email, cellphone. You do have properties on the relationship like for how long time you know that find or where did you meet him last etc.


This is the classical Matrix movie relationship sample. Let's see how we structure this graph on code, I will show ruby code through JRuby. I made some changes and add new properties but the relationship didn't change that much.


#
 # I just create nodes with sumbol and put whatever you want
 #
 neo       = Neo4j::Node.new(:name => 'Neo', :realname => "Thomas Anderson", :rank => "The One")
 morpheus  = Neo4j::Node.new(:name => 'Morpheus', :rank => "Captain", :occupation => "total badass")
 trinity   = Neo4j::Node.new(:name => 'Trinity', :rank => "Girlfriend")
 smith     = Neo4j::Node.new(:name => 'Smith', :language => "C++", :version => "1.0b", :rank => "-1")
 architect = Neo4j::Node.new(:name => 'Architect', :rank => "0")
 cypher    = Neo4j::Node.new(:name => 'Cypher', :rank => "sucks")

 #
 # I'm defining a couple of relationships  
 #
 neo.outgoing(:knows) << trinity
 neo.outgoing(:knows) << morpheus
 morpheus.outgoing(:knows) << trinity
 morpheus.outgoing(:knows) << cypher
 cypher.outgoing(:knows) << smith
 smith.outgoing(:knows) << architect

 #
 # Here we have relationship properties(and my monkey ptach)
 #
 neo.outgoing(:knows).first[:since]        = "3 days"
 neo.outgoing(:knows).second[:since]       = "1 day" 

I used the Neo4j::Node to create nodes, when  you create a node you can define properties. You do that using symbols in ruby(like creating a map) and you can put as many properties you want, as you can see I put some common properties for each node BUT there unique properties for some nodes like :version for node smith.

There're relationship definitions as well, you see that when I do node.outgoing(:knows) and some other node is set. This is the most common relationship that you find on neo4j, you read from left to right and outgoing means from me to other node, you also can use ingoing to define the opposite relationship.

Finally use see some [:since] thats mean I'm defining properties for the relationship, like how long Thomas Anderson knows Trinity and Morpheus. Simple and Cool :D If you want you can get the full code on my github account.

Now, let's see a real problem solved by neo4j using Ruby through JRuby. For sake a time I reduced the problem to be smaller and more simple. Imagine you need get from one US state to another BUT imagine that you can do the shortest path and you can cross less states as possible and reach your final destination going through less states, this is a typical issue that neo4j solve for you. First let's see the big picture.

As I said I make the problem short, for this scenario you can can travel by the green cities, let's say that is our company travel policy :-) It's time to get down to code, first I will just create the nodes of this problem.
miami   = Neo4j::Node.new(:name => 'Miami'       , :state => "Florida"    , :ab => "FL")
  ny      = Neo4j::Node.new(:name => 'New York'    , :state => "New York"   , :ab => "NY")
  wichita = Neo4j::Node.new(:name => 'Wichita'     , :state => "Kansas"     , :ab => "KS")
  austin  = Neo4j::Node.new(:name => 'Austin'      , :state => "Texas"      , :ab => "TX")
  detroit = Neo4j::Node.new(:name => 'Detroit'     , :state => "Michigan"   , :ab => "MI")
  denver  = Neo4j::Node.new(:name => 'Denver'      , :state => "Colorado"   , :ab => "CO") 
  la      = Neo4j::Node.new(:name => 'Los Angeles' , :state => "California" , :ab => "CA")

I created all the cities that we need for the problem, I add some useful information like whats the city state and the city name as well. The next step is structure the relationships following the picture above.

miami.outgoing(:to) << ny 
  miami.outgoing(:to) << austin
  ny.outgoing(:to) << miami
  ny.outgoing(:to) << detroit
  detroit.outgoing(:to) << ny
  detroit.outgoing(:to) << wichita
  wichita.outgoing(:to) << detroit
  wichita.outgoing(:to) << denver
  denver.outgoing(:to) << wichita
  denver.outgoing(:to) << la
  austin.outgoing(:to) << miami
  austin.outgoing(:to) << wichita
  la.outgoing(:to) << austin
  la.outgoing(:to) << denver

Every city has two routes for other cities, there is no city with more them two routes in the case you really need cross other cities to archive you target. We are done with the structures, let's code a function in ruby that cover the shortest path to the following route.


We are in Miami and you want go to Wichita, I already draw the shortest path so this will help us to see if the code works, as you can see there are a longer path if we take NY, BUT if we go through Austin we reach the shortest path to our final destination Wichita.


class PathFinder
 def self.print_shortest_path(from,to)  
     puts "Route from #{from[:name]} to #{to[:name]}:"
  it   = Neo4j::Algo.shortest_path(from,to).iterator
  last = from[:name]
  it.next
  while it.hasNext
   n = it.next
   if !n[:name].blank?
    puts "  #{last} -> #{n[:name]}"
    last = n[:name]
   end
  end
 end
end

I'm using the Neo4j::Algo.shortest_path algorithm to find the shortest path, this function is already shipped with neo4j.I just iterate through the results node and printed the information in some reasonable format. This code will produce the following output.

$ jruby cities.rb
I, [2011-04-12T00:29:26.021000 #31908]  INFO -- : Enable remote shell at port port=9332
Route from Miami to Wichita:
  Miami -> Austin
  Austin -> Wichita

Hooray! Is working. You can change the cities and you gonna see that still works. This is it, with a very simple code we archive this great result, in the end of the day is all about the structure some structures are much better for some problems domains. You can get the full source code on my github account.

Cheers,
Diego Pacheco

Popular posts from this blog

Telemetry and Microservices part2

Installing and Running ntop 2 on Amazon Linux OS

Fun with Apache Kafka