Rapid.Space has three design goals: low cost, high performance and ethics. These are achieved by operating re-certified Open Compute Project (OCP) servers with Free Software, hosted in energy-efficient data-centers, and by eliminating any form of hardware redundancy. Each OCP server has:
Data-centers which host Rapid.Space servers may have only a single electrical power source and a single network access. No power generator is needed. Batteries supply power for only 2 minutes. Servers are designed to operate at outdoor temperatures of up to 35 degrees Celsius in free-airflow data-centers with neither air-conditioning nor any other cooling.
All design principles that are usually taken for granted in data-centers or cloud computing systems have been questioned by Rapid.Space. Rather than relying on hardware redundancy, any form of redundancy or resiliency should be implemented through software and multiple data-centers.
For example, network resiliency is implemented through the re6st software, which is capable of overcoming routing issues that often happen on the Internet. Application resiliency can be implemented either using SlapOS's resiliency technology or by relying on databases which are natively redundant over multiple data-centers.
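The idea of replacing hardware redundancy with several simple data-centers can be illustrated with a small calculation. This is a minimal sketch, assuming data-center outages are statistically independent; the 0.99 per-site availability is a hypothetical figure, not a Rapid.Space measurement, and the function name is made up for this example.

```python
def combined_availability(single_site: float, sites: int) -> float:
    """Probability that at least one of `sites` independent data-centers is up.

    Assumes failures are independent, which is only an approximation
    of real outages.
    """
    if not 0.0 <= single_site <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if sites < 1:
        raise ValueError("at least one site is required")
    # The service is down only if every site is down at the same time.
    return 1.0 - (1.0 - single_site) ** sites


if __name__ == "__main__":
    # 0.99 is a hypothetical per-site availability, chosen for illustration.
    for n in (1, 2, 3):
        print(n, f"{combined_availability(0.99, n):.6f}")
```

Under these assumptions, each extra site multiplies the probability of a full outage by the single-site downtime fraction, which is why software-level replication across sites can stand in for redundant power feeds or network links inside one site.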
One of the challenges faced by Rapid.Space was to reach performance similar to or better than that of bare metal servers provided by OVH, Online or Hetzner, while cutting costs as much as possible on the server management infrastructure. A Rapid.Space data-center should ideally operate without any form of IPMI management network, without virtual keyboards and monitors (KVM), without power switches, without routers, without ever having to reinstall servers, without an anti-DDoS system, and without any of the sophisticated hardware or systems that took other providers a decade to build.
Simplification and cost cutting goals led to the following decisions:
This approach sounds ideal in many respects: lower cost, less hardware and less staff to maintain the system. The performance of networking and CPU is not degraded by virtualization thanks to virtio. However, this approach faces one hurdle: the risk of poor storage virtualization performance, especially in the case of database applications.
Storage virtualization often leads to poor I/O performance. We have experienced how Storage Area Network (SAN) systems often lead to overall database performance 10 to 100 times lower than with a consumer-grade 200€ SSD because of latency or congestion on the Fibre Channel network. Similar issues can happen with network block devices for similar reasons. Whenever poor database performance is the consequence of high latency to access the storage, not much can be done besides moving to locally attached storage.
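A rough way to see why storage latency dominates database performance: if every commit must wait for one synchronous write, a single connection cannot commit faster than one transaction per write latency. The sketch below only illustrates that bound; the latency values are hypothetical round numbers, not measurements from the tests described here.

```python
def max_commits_per_second(write_latency_s: float) -> float:
    """Upper bound on serial commits when each one waits for one synchronous write."""
    if write_latency_s <= 0:
        raise ValueError("latency must be positive")
    return 1.0 / write_latency_s


def slowdown(local_latency_s: float, remote_latency_s: float) -> float:
    """How many times slower the remote storage is for latency-bound work."""
    return max_commits_per_second(local_latency_s) / max_commits_per_second(remote_latency_s)


if __name__ == "__main__":
    # Hypothetical latencies, for illustration only.
    local_ssd = 0.0001   # 0.1 ms per synchronous write
    congested_san = 0.005  # 5 ms per synchronous write
    print(f"local SSD: {max_commits_per_second(local_ssd):.0f} commits/s")
    print(f"SAN:       {max_commits_per_second(congested_san):.0f} commits/s")
    print(f"slowdown:  {slowdown(local_ssd, congested_san):.0f}x")
```

With these assumed values the ratio falls inside the 10 to 100 times range mentioned above, and no amount of extra CPU changes it, since the bound depends only on the latency of the storage path.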
In order to evaluate Rapid.Space performance, we decided to measure the performance of a real ERP application developed by Nexedi, which produces about 300,000 invoices and posts them to the accounting ledger in about 8 hours: "ERP5 Billing Run". It is the biggest database application operated by Nexedi. It is currently hosted at OVH on high-end Xeon dedicated servers with locally attached SSD (NVMe). All services of this application are automatically orchestrated over bare metal with SlapOS.
This "ERP5 Billing Run" application exercises both CPU and disk I/O. It is written in Python. The underlying database is MariaDB, which is used both for relational queries and to store BLOBs with NEO. A faster CPU and faster disk I/O lead to a shorter time to complete the test.
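To put the benchmark in perspective, the figures quoted above (about 300,000 invoices in about 8 hours) can be turned into an average throughput. This is simple arithmetic on the rounded numbers from the text, so the results are approximate; the function names are made up for this example.

```python
def invoices_per_second(invoices: int, hours: float) -> float:
    """Average throughput of a billing run."""
    return invoices / (hours * 3600.0)


def milliseconds_per_invoice(invoices: int, hours: float) -> float:
    """Average wall-clock time spent per invoice, in milliseconds."""
    return hours * 3600.0 * 1000.0 / invoices


if __name__ == "__main__":
    # Rounded figures from the text: about 300,000 invoices in about 8 hours.
    print(f"{invoices_per_second(300_000, 8):.1f} invoices/s")
    print(f"{milliseconds_per_invoice(300_000, 8):.0f} ms per invoice")
```

Each invoice involves both Python computation and several database writes, so a time budget on the order of a hundred milliseconds per invoice leaves little room for slow storage.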
We then compared different scenarios:
In the case of OCP hardware, the qemu configuration was optimized for performance by using virtio, pass-through and cache parameters.
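As an illustration of what such tuning can look like, the sketch below assembles a qemu command line with a virtio disk backed by a raw block device, the host CPU model passed through to the guest, and an explicit cache mode. The exact parameters used by Rapid.Space are not given here, so the disk path, memory size, network setup and the choice of `cache=none` are all assumptions made for this example. The script only builds and prints the command; it does not start a virtual machine.

```python
import shlex
from typing import List

# Cache modes accepted by QEMU's -drive option.
VALID_CACHE_MODES = {"none", "writeback", "writethrough", "directsync", "unsafe"}


def build_qemu_args(disk_path: str, memory_mb: int = 4096, cache: str = "none") -> List[str]:
    """Build a qemu command line for a KVM guest with a virtio disk.

    All values are illustrative assumptions, not Rapid.Space's actual settings.
    """
    if cache not in VALID_CACHE_MODES:
        raise ValueError(f"unknown cache mode: {cache}")
    # Native Linux AIO needs O_DIRECT, which only "none" and "directsync" enable.
    aio = "native" if cache in ("none", "directsync") else "threads"
    drive = f"file={disk_path},if=virtio,format=raw,cache={cache},aio={aio}"
    return [
        "qemu-system-x86_64",
        "-enable-kvm",              # hardware-assisted virtualization
        "-cpu", "host",             # expose the host CPU model to the guest
        "-m", str(memory_mb),
        "-drive", drive,            # raw device handed to the guest via virtio
        "-netdev", "user,id=net0",
        "-device", "virtio-net-pci,netdev=net0",
    ]


if __name__ == "__main__":
    # /dev/sdb is a hypothetical dedicated SSD.
    print(shlex.join(build_qemu_args("/dev/sdb")))
```

Bypassing the host page cache with `cache=none` is a common choice for database guests because the guest and the database already cache data themselves, but the best mode depends on the workload and should be measured.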
Our conclusions are:
Based on these results, all Rapid.Space OCP servers are now configured with two SSD disks: one for the system image and one entirely dedicated to the qemu process. In the near future, Rapid.Space will add a third disk for bare metal deployment of databases.
Nexedi will keep relying on OVH because it is one of the best providers of high-performance dedicated servers in the world. Nexedi will use Rapid.Space VMs and bare metal to address use cases not covered by OVH or to cut costs on use cases that do not require the highest performance.
Overall, the "ERP5 Billing Run" test is consistent with the pedigree of Rapid.Space servers: very high-end servers that were built 3 years ago for large corporations and that, once re-certified and upgraded with a brand new SSD, are now about 30% slower than recent-generation high-end servers. Considering their cost, it is a good deal!
Even with the best possible configuration, qemu disk performance is still 30% slower than bare metal. This is obviously much better than the 800% slower, or even worse, that we often experienced with SAN or some public cloud computing services. Yet some applications may need even more disk performance. For those applications, Rapid.Space will provide dedicated FusionIO storage with each virtual machine. Rapid.Space customers will be able to deploy any SlapOS database profile (MariaDB, NEO, etc.) to benefit from the bare metal performance of locally attached FusionIO disks, while running their applications inside a qemu virtual machine.
We expect a performance boost of 200% to 1000% through this approach compared to best results with qemu.
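To make the "30% slower" and "800% slower" figures above concrete, the sketch below converts a slowdown percentage into a run time. It assumes that "N% slower" means the same job takes N% more time, which is one possible reading of those figures, and it uses a hypothetical 8-hour bare metal baseline.

```python
def runtime_with_slowdown(baseline_hours: float, percent_slower: float) -> float:
    """Run time of a job that takes `percent_slower` percent more time than the baseline."""
    if percent_slower < 0:
        raise ValueError("percent_slower must not be negative")
    return baseline_hours * (1.0 + percent_slower / 100.0)


if __name__ == "__main__":
    baseline = 8.0  # hypothetical bare metal run time, in hours
    for label, pct in (("tuned qemu", 30), ("slow SAN or cloud storage", 800)):
        print(f"{label}: {runtime_with_slowdown(baseline, pct):.1f} h")
```

Under this reading, a job that still fits in a working day on a tuned virtual machine would take several days on storage that is 800% slower.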
We also expect to open SlapOS nano-container technology to Rapid.Space users, so that any GNU/Linux software (MariaDB, PostgreSQL, ERP5, NEO, Wendelin, Spark, etc.) can be deployed on bare metal next to the dedicated VM. If a Rapid.Space user needs a different database, a different version or a different configuration, he or she can extend the SlapOS software library.
This is the other beauty of Rapid.Space: all its source code is Free Software. Rapid.Space customers are free to contribute to it and improve it.