Similar Projects
To the best of our knowledge, no other Storage interface implementation offers both scalability and fault tolerance the way NEO does. Related projects include:
- FileStorage: single-file storage
- RelStorage: relational database storage
- DirectoryStorage: multi-file storage
- ZEO: networked, multi-storage RPC
- ZEORaid: fault-tolerant clustering of ZEO servers
- Ceph: although not an object database, its design is very close to NEO's
Pages on related topics
Some interesting pages on topics related to NEO, but not written for/about NEO:
- Wikipedia page on object databases
- xrootd/Scalla, a petascale object database system
- (Brewer's) CAP theorem (Wikipedia), and a very well-written article by Daniel Abadi on what CAP doesn't cover
- Weak Consistency: A Generalized Theory and Optimistic Implementations for Distributed Transactions by Atul Adya
- The Multi-Queue Replacement Algorithm for Second Level Buffer Caches by Yuanyuan Zhou and James F. Philbin, used for ZEO-level caching on client nodes
- The C10K Problem, by Dan Kegel
- Lessons Learned From Managing A Petabyte
Some notes from this paper, with NEO in mind:
- Thick client, thin server
status in NEO: designed this way
- Client-side compression
status in NEO: implemented
- Limiting file descriptor count
status in NEO: implemented on the client; still requires consideration on master and storage nodes
- Load balancing
status in NEO: static balancing is done, dynamic is not
- Local disc cache, data geographical proximity to user
status in NEO: not implemented; for now, NEO isn't expected to fit long-distance distribution
RAM caching is done, though, at both the NEO and ZODB levels; the NEO-level cache is implemented with a special intermediate caching algorithm
- Fine-grained locks are better
status in NEO: designed this way; the optimistic transaction consistency used in ZODB also helps
- Data flushed to disk only during commit
status in NEO: designed this way
- No need for central index updates for object changes
status in NEO: designed this way
- Large updates & update pooling
status in NEO: not expected to happen at the NEO level, but at the application level (e.g. Zope)
- Data duplication
status in NEO: designed this way
- Sequential storage support (e.g. tapes) the same way as random access storage (e.g. disks)
status in NEO: not implemented; for now, NEO requires all data (historical and current) to be accessible, to fit the needs of ZODB. It is unclear whether this can be implemented at all, and it depends heavily on application-level behaviour (e.g. Zope)
- Deferral system
status in NEO: not implemented; for the moment, NEO focuses on interactive use (short transactions) rather than heavy data processing (long transactions), so this feature is not a high priority
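To illustrate the "thick client, thin server" and "client-side compression" notes above, here is a minimal sketch of a client compressing a record before sending it, so that the server only stores opaque bytes. This is not NEO's actual code: the `pack` and `unpack` names, the `min_gain` parameter and the returned flag are assumptions made for the example; only the standard `zlib` module is used.

```python
import zlib
from typing import Tuple


def pack(data: bytes, min_gain: int = 1) -> Tuple[bool, bytes]:
    """Hypothetical client-side helper: compress before sending.

    Returns (is_compressed, payload). The raw bytes are kept when
    compression would not save at least `min_gain` bytes.
    """
    compressed = zlib.compress(data)
    if len(compressed) + min_gain <= len(data):
        return True, compressed
    return False, data


def unpack(is_compressed: bool, payload: bytes) -> bytes:
    """Hypothetical client-side helper: undo pack() after loading."""
    return zlib.decompress(payload) if is_compressed else payload
```

With this split, the CPU cost of compression is paid by each client, and the flag lets small or incompressible records travel unchanged.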
Finally, the biggest difference between the systems described in the paper and NEO/ZODB seems to lie in the meaning of "transaction" and in the expected application behavior inside a transaction. NEO provides the same level of isolation as ZODB, which is (supposed to be) PL-2+ in the terminology of Atul Adya's thesis (listed above). This looks stricter than the transaction isolation briefly described in the paper.
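The optimistic transaction consistency mentioned in the notes above can be sketched as a commit-time check: a transaction remembers the serial of each object it read, and the commit is refused if any object it wants to write has changed since. The `ToyStore` class and its methods are hypothetical and exist only for this example. They are neither the ZODB nor the NEO API, and the toy only detects write-write conflicts; it does not demonstrate PL-2+ by itself.

```python
class ConflictError(Exception):
    """Raised when an object changed after the transaction read it."""


class ToyStore:
    """Hypothetical in-memory store with optimistic commit validation."""

    def __init__(self):
        self._data = {}          # oid -> (serial, value)
        self._next_serial = 1

    def load(self, oid):
        # Unknown objects are reported with serial 0 and no value.
        return self._data.get(oid, (0, None))

    def commit(self, changes):
        """changes maps oid -> (serial_seen_at_read, new_value)."""
        # Validation phase: no lock was held while the transaction ran,
        # so every written object must still be at the serial we read.
        for oid, (serial_seen, _value) in changes.items():
            current_serial, _current = self.load(oid)
            if current_serial != serial_seen:
                raise ConflictError(oid)
        # Write phase: all changes share one new serial.
        serial = self._next_serial
        self._next_serial += 1
        for oid, (_serial_seen, value) in changes.items():
            self._data[oid] = (serial, value)
        return serial
```

Two transactions that read the same object at the same serial can both proceed without locking, but only the first commit succeeds; the second raises `ConflictError` and would have to be retried by the application.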