I have been reading about neo4j last few days. I got very confused about whether I need to use REST API or if can I go with Java APIs.
My need is to create millions of nodes which will have some connection among them. I want to add indexes on few of node attributes for searching. Initially I started with embedded mode of GraphDB with Java API but soon reached
OutOfMemorywith indexing on few nodes so I thought it would be better if my neo4j is running as service and I connect to it through REST API then it will do all memory management by itself by swapping in/out data to underlying files. Is my assumption right?
Further, I have plans to scale my solution to billion of nodes which I believe wont be possible with single machine’s neo4j installation. I also believe Neo4j has the capability of running in distributed mode. For this reason also I thought continuing with REST API implementation is best idea.
Though I couldn’t find out any good documentation about how to run Neo4j in distributed environment.
Can I do stuff like batch insertion, etc. using REST APIs as well, which I do with Java APIs with Graph DB running in embedded mode?
Do you know why you are getting your
OutOfMemoryException? This sounds like you are creating all these nodes in the same transaction, which causes it to live in memory. Try committing small chunks at a time, so that
Neo4jcan write it to Disk. You don’t have to manage the memory of Neo4j aside from things like cache.
Distributed mode is in a Master/Slave architecture, so you’ll still have a copy of the entire DB on each system. Neo4j is very efficient for disk storage, a Node taking 9 Bytes, Relationship taking 33 Bytes, properties are variable.
There is a Batch REST API, which will group many calls into the same HTTP call, however making REST calls is still a slower then if this were embedded.
There are some disadvantages to using the REST API that you did not mentions, and that’s stuff like transactions. If you are going to do atomic operations, where you need to create several nodes, relationships, change properties, and if any step fails not commit any of it, you cannot do this in the REST API.