High Performance Virtual Machines for Database Applications

Rapid.Space is a new provider of low-cost, ethical yet high-performance servers targeted at big data applications, high performance computing, software development and large database applications. Rapid.Space provisions on each server a single "dedicated virtual machine" which takes control of all resources of the server. Performance degradation for memory, CPU or 10 Gbps networking is not noticeable. Performance degradation for disk I/O is at most 30%, an 800% improvement compared to traditional storage virtualisation technologies used by some cloud computing services.
  • Last Update: 2018-10-01
  • Version: 002
  • Language: en

Rapid.Space has three design goals: low cost, high performance and ethics. This is achieved by running Free Software on re-certified Open Compute Project (OCP) servers hosted in energy-efficient data-centers and by eliminating any form of hardware redundancy. Each OCP server has:

  • a single 128GB SSD disk for base system image;
  • a single 4TB SSD (SATA) disk for data;
  • a single 10 Gbps network interface connected to a single switch per rack;
  • a single power supply per server;
  • 256 GB RAM;
  • 20 cores (Xeon 2680v2).

Data-centers which host Rapid.Space servers may only have a single electrical power source and a single network access. No power generator is needed. Batteries only supply power for 2 minutes. Servers are designed to operate at outdoor temperatures of up to 35 degrees Celsius in free airflow data-centers with no air-conditioning or cooling.

All design principles that are usually taken for granted in data-centers or cloud computing systems have been questioned by Rapid.Space. Rather than relying on hardware redundancy, any form of redundancy or resiliency should be implemented through software and multiple data-centers.

For example, network resiliency is implemented through re6st software which is capable of overcoming routing issues that often happen on the Internet. Application resiliency can be implemented either using SlapOS's resiliency technology or by relying on databases which are natively redundant over multiple data-centers.

No IPMI, No KVM, No Power Switches

One of the challenges faced by Rapid.Space was to reach performance similar to or better than that of bare metal servers provided by OVH, Online or Hetzner while cutting costs as much as possible on the server management infrastructure. A Rapid.Space data-center should ideally operate without any form of IPMI management network, without virtual keyboards and monitors (KVM), without power switches, without routers, without ever having to reinstall servers, without an anti-DDoS system, and without any of the sophisticated hardware or systems that took other providers a decade to build.

Simplification and cost cutting goals led to the following decisions:

  • the standard unit of management is the virtual machine: it eliminates the need for IPMI, KVM and power switches, which are replaced by a software equivalent (noVNC);
  • all networking is IPv6-based and self-routed by babel: it eliminates the need for routers or expensive IPv4 address ranges;
  • the bootloader is LinuxBoot with a read-only system image self-installed on a dedicated SSD: servers do not need to be reinstalled;
  • network traffic is blocked by default: it eliminates the need for expensive DDoS mitigation or malware detection.

This approach sounds ideal in many aspects: lower cost, less hardware and less staff to maintain the system. Performance of networking or CPU is not degraded by virtualization thanks to virtio. However, this approach faces one hurdle: the risk of poor storage virtualization performance, especially in the case of database applications.

High performance disk I/O with qemu

Storage virtualization often leads to poor I/O performance. We have experienced how Storage Area Network (SAN) systems often lead to overall database performance 10 to 100 times lower than with a consumer-grade 200€ SSD because of latency or congestion on the Fibre Channel network. Similar issues can happen with network block devices for similar reasons. Whenever poor database performance is the consequence of high storage access latency, not much can be done besides moving to locally attached storage.

In order to evaluate Rapid.Space performance, we decided to measure the performance of a real ERP application developed by Nexedi which produces and posts to the accounting ledger about 300,000 invoices in about 8 hours: "ERP5 Billing Run". It is the biggest database application operated by Nexedi. It is currently hosted at OVH on high-end Xeon dedicated servers with locally attached SSD (NVMe). All services of this application are automatically orchestrated over bare metal with SlapOS.

This "ERP5 Billing Run" application is both CPU-intensive and disk I/O-intensive. It is written in Python. The underlying database is MariaDB, which is used both for relational queries and to store BLOBs with NEO. Faster CPU and faster disk I/O both lead to a shorter time to complete the test.

We then compared different scenarios:

  • use of Rapid.Space OCP server and qemu with a qcow2 disk image on a locally attached storage (qemu qcow2);
  • use of Rapid.Space OCP server and qemu with a disk partition of a locally attached storage (qemu /dev/sdb2);
  • use of Rapid.Space OCP server and qemu with the complete device of a locally attached storage (qemu /dev/sdb);
  • use of bare metal Rapid.Space OCP server on a locally attached storage (bare metal);
  • use of an OVH OpenStack (dual C2-120 VM with 120 GB RAM each) with locally attached disk;
  • use of an OVH dedicated server (single Big-HG server with 256 GB RAM).

In the case of OCP hardware, the qemu configuration was optimized for performance by using virtio, pass-through and cache parameters.
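
As a rough illustration of what such a configuration might look like, the sketch below builds a qemu command line that attaches a whole block device through virtio. The exact options used by Rapid.Space are not listed in this article: the helper name `build_qemu_command`, the `cache=none` and `aio=native` settings, and the default memory and core counts are all assumptions chosen for the example.

```python
import shlex


def build_qemu_command(device="/dev/sdb", memory_gb=240, cores=40):
    """Return a qemu command line (as a list of arguments) for a dedicated VM.

    Hypothetical helper: the values below are assumptions for illustration,
    not the configuration actually deployed by Rapid.Space.
    """
    drive = ",".join([
        f"file={device}",  # whole device rather than a partition or qcow2 image
        "format=raw",      # no image format layer between guest and disk
        "if=virtio",       # paravirtualized block driver
        "cache=none",      # assumption: bypass the host page cache
        "aio=native",      # assumption: Linux native asynchronous I/O
    ])
    return [
        "qemu-system-x86_64",
        "-enable-kvm",     # hardware-assisted virtualization
        "-cpu", "host",    # expose the host CPU model to the guest
        "-smp", str(cores),
        "-m", f"{memory_gb}G",
        "-drive", drive,
    ]


command = build_qemu_command()
print(shlex.join(command))
```

Passing `device="/dev/sdb2"` or a path to a qcow2 file (with `format=qcow2`) instead would correspond to the other two qemu scenarios listed above.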

Disk performance benchmark of qemu with ERP5 Billing Run

| | OCP qemu qcow2 | OCP qemu /dev/sdb2 | OCP qemu /dev/sdb | OCP bare metal | OVH OpenStack | OVH Dedicated |
| --- | --- | --- | --- | --- | --- | --- |
| Duration | 85h | 85h | 12h55 | 9h51 | 16h22 | 8h27 |
| VM Manager | SlapOS | SlapOS | SlapOS | N/A | OpenStack | N/A |
| Orchestrator | SlapOS | SlapOS | SlapOS | SlapOS | SlapOS | SlapOS |
| Database | NEO + InnoDB | NEO + InnoDB | NEO + InnoDB | NEO + InnoDB | NEO + InnoDB | NEO + InnoDB |
| Pystone (CPU test) | 150,000 | 150,000 | 150,000 | 150,000 | 200,000 | 230,000 |
| vCore | 40 | 40 | 40 | 40 | 64 | 40 |
| Xeon Generation | v2 | v2 | v2 | v2 | N/A | v3 |
| Frequency | 2.8/3.6 GHz | 2.8/3.6 GHz | 2.8/3.6 GHz | 2.8/3.6 GHz | 3.1 GHz | 2.6/3.3 GHz |
| Storage | 4 TB SATA SSD | 4 TB SATA SSD | 4 TB SATA SSD | 4 TB SATA SSD | 4 TB High Speed | 4 TB NVMe SSD |
| RAM | 256 GB | 256 GB | 256 GB | 256 GB | 2 x 120 GB | 256 GB |
| Monthly Price | N/A | N/A | 195€ | 195€ | 2261€ | 769€ |
| Redundancy | No | No | No | No | Yes | No |

Our conclusions are:

  • it is possible to reach near bare metal disk performance with qemu by attaching a whole SSD device to qemu;
  • attaching a partition or using a disk image in qemu reduces disk performance by an order of magnitude;
  • bare metal execution of database is about 30% faster than the best qemu configuration;
  • OVH dedicated servers are faster than Rapid.Space but cost more;
  • OVH OpenStack VMs are smaller and slower than Rapid.Space despite a faster CPU; they provide redundancy but cost much more;
  • Rapid.Space costs much less but is about 30% slower than the fastest dedicated servers on the market.
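
The percentages above can be checked against the durations in the benchmark table. The following sketch, with hypothetical helper names `to_minutes` and `slowdown`, converts each duration to minutes and expresses it as extra run time relative to a reference configuration.

```python
def to_minutes(duration):
    """Convert a duration written like '12h55' or '85h' to minutes."""
    hours, _, minutes = duration.partition("h")
    return int(hours) * 60 + int(minutes or 0)


def slowdown(duration, reference):
    """Extra run time of `duration` relative to `reference`, in percent."""
    return (to_minutes(duration) / to_minutes(reference) - 1) * 100


# Durations copied from the benchmark table.
durations = {
    "OCP qemu qcow2": "85h",
    "OCP qemu /dev/sdb2": "85h",
    "OCP qemu /dev/sdb": "12h55",
    "OCP bare metal": "9h51",
    "OVH OpenStack": "16h22",
    "OVH Dedicated": "8h27",
}

reference = durations["OCP bare metal"]
for name, duration in durations.items():
    print(f"{name:20s} {slowdown(duration, reference):+7.0f}% vs OCP bare metal")
```

With these figures, the whole-device qemu run takes about 31% longer than bare metal on the same hardware, while the qcow2 and partition runs take roughly 8.6 times as long. Relative to the OVH dedicated server, the same calculation gives about +17% for OCP bare metal and about +53% for whole-device qemu.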

Based on these results, all Rapid.Space OCP servers are now configured with two SSD disks: one for the system image and one entirely dedicated to the qemu process. In the near future Rapid.Space will add a third disk for bare metal deployment of databases.

Nexedi will keep on relying on OVH because it is one of the best providers of high performance dedicated servers in the world. Nexedi will use Rapid.Space VMs and bare metal to address use cases not covered by OVH or to cut costs on use cases that do not require highest performance.

Overall, the "ERP5 Billing Run" test is consistent with the pedigree of Rapid.Space servers: very high-end servers that were built 3 years ago for large corporations and that, once re-certified and upgraded with a brand new SSD, are now about 30% slower than recent generation high-end servers. Considering their cost, it is a good deal!

Further performance with SlapOS nano-containers, FusionIO and Free Software

Even with the best possible configuration, qemu disk performance is still 30% slower than bare metal. This is obviously much better than the 800% slowdown, or even worse, that we often experienced with SAN or some public cloud computing services. Yet, some applications may need even more disk performance. For those applications, Rapid.Space will provide dedicated FusionIO storage with each virtual machine. Rapid.Space customers will be able to deploy any SlapOS database profile (MariaDB, NEO, etc.) to benefit from bare metal performance of locally attached FusionIO disks, while running their applications inside a qemu virtual machine.

We expect a performance boost of 200% to 1000% through this approach compared to best results with qemu.

We also expect to open SlapOS nano-container technology to Rapid.Space users, so that any GNU/Linux software (MariaDB, PostgreSQL, ERP5, NEO, Wendelin, Spark, etc.) can be deployed on bare metal next to the dedicated VM. If a Rapid.Space user needs a different database, a different version or a different configuration, he or she can extend the SlapOS software library.

This is the other beauty of Rapid.Space: all its source code is Free Software. Rapid.Space customers are free to contribute to it and improve it.

Contact

  • Jean-Paul Smets
  • jp (at) nexedi (dot) com
  • Jean-Paul Smets is the founder and CEO of Nexedi. After graduating in mathematics and computer science at ENS (Paris), he started his career as a civil servant at the French Ministry of Economy. He then left government to start a small company called “Nexedi” where he developed his first Free Software, an Enterprise Resource Planning (ERP) designed to manage the production of swimsuits in the not-so-warm but friendly north of France. ERP5 was born. In parallel, he led with Hartmut Pilch (FFII) the successful campaign to protect software innovation against the dangers of software patents. The campaign eventually succeeded by rallying more than 100,000 supporters and thousands of CEOs of European software companies (both open source and proprietary). The Proposed directive on the patentability of computer-implemented inventions was rejected on 6 July 2005 by the European Parliament by an overwhelming majority of 648 to 14 votes, showing how small companies in Europe can together defeat the powerful lobbying of large corporations. Since then, he has helped Nexedi to grow either organically or by investing in new ventures led by bright entrepreneurs.
  • Thomas Gambier
  • thomas (dot) gambier (at) nexedi (dot) com
  • Software Developer