NEO, Nexedi's distributed transactional NoSQL object database, reached a storage size of 100 TB on November 24th, 2017, on a redundant array of inexpensive computers. This success demonstrates that both the ERP5 open source ERP/CRM and the Wendelin out-of-core Big Data platform can power the biggest commercial workloads, for transactional and data science applications alike.
The data stored in NEO consists of 30 big data streams of about 3.3 TB each. Every stream can be accessed sequentially or randomly.
To achieve this result, Nexedi ran a NEO ingestion test drive for 40 days on 6 inexpensive Dedibox computers, each with 3 SATA disks of 6 TB, one Intel® Xeon® E5 1410 v2 processor and 64 GB of RAM. During the test, 30 concurrent fluentd data streams were ingested into the ERP5 / Wendelin platform powered by NEO at an average growth rate of 2.5 TB per day. The NEO cluster was configured with 30 independent Zope application servers and 18 independent replicated storages, for a total disk usage of 88.6 TB. A compression factor of 43.5% was observed on the random data ingested in this test run.
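The figures above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the test setup: the function name `summarize` and the constant names are hypothetical, decimal units (1 TB = 10^12 bytes) are assumed, and the per-stream rate assumes the load was spread evenly over the 30 streams, which the announcement does not state.

```python
# Figures quoted in the announcement.
DAYS = 40
GROWTH_TB_PER_DAY = 2.5
STREAMS = 30
TB_PER_STREAM = 3.3
MACHINES = 6
DISKS_PER_MACHINE = 3
DISK_TB = 6

SECONDS_PER_DAY = 24 * 60 * 60
MB_PER_TB = 1_000_000  # assumption: decimal units


def summarize():
    """Derive totals from the quoted figures (hypothetical helper)."""
    return {
        # 40 days at 2.5 TB/day
        "ingested_tb": DAYS * GROWTH_TB_PER_DAY,
        # 30 streams of about 3.3 TB each
        "stream_total_tb": STREAMS * TB_PER_STREAM,
        # 6 machines x 3 disks; equals the 18 storages quoted
        "disk_count": MACHINES * DISKS_PER_MACHINE,
        # raw capacity of all disks, before any filesystem overhead
        "raw_capacity_tb": MACHINES * DISKS_PER_MACHINE * DISK_TB,
        # assumption: ingestion spread evenly across the streams
        "per_stream_mb_per_s": (
            GROWTH_TB_PER_DAY * MB_PER_TB / STREAMS / SECONDS_PER_DAY
        ),
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value:.2f}")
```

Under these assumptions, the 40-day run at 2.5 TB per day accounts for the 100 TB total, the 30 streams add up to roughly 99 TB, and each stream sustains a little under 1 MB/s on average. The disk count matches the 18 storages mentioned above, although the announcement does not say explicitly that one storage runs per disk.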
The NEO database relies on an innovative protocol that turns a cluster of independent storage engines into a single transactional storage space. NEO currently supports MariaDB, MySQL, SQLite and POSIX filesystem as possible storage engines. For the current tests, MariaDB was used with two different storage backends: RocksDB and TokuDB. Both showed similar performance: RocksDB write performance was more consistent, whereas TokuDB read performance was a bit better. A detailed test report will be published soon.