Wendelin Exanalytics Libre

WENDELIN combines Scikit Learn machine learning and NEO distributed storage for out-of-core data analytics in python

Wendelin - Open Source Big Data Platform

The Wendelin platform is part of an ongoing research project Nexedi is leading. The goal is to develope the technological framework for a open source Big Data platform "Made in France". Wendelin will integrate libraries widely used in the data science community for data collection, analysis computation and visualization. The research project also comprises development of first prototype applications in the automative and green energy sector as it's purpose is to provide a ready-to-use solution applicable in different industrial scenarios.

Wendelin Logo Wendelin Core
Wendelin Logo NEO
Wendelin Logo ERP5
Wendelin Logo SlapOS

One Stack To Rule Them All

The Wendelin stack is written in 100% Python. On the base layer, SlapOS handles configuration, deployment and management of all components running on the Wendelin stack. Distributed storage is provided by NEO while ERP5 is used as platform to connect the various libraries, store data, provide a connecting user interface plus enable the creation of web-based Big Data applications up to integration of more complex business processes ("Convergence Ready"). At the heart of Wendelin is "Wendelin Core", component that will provide out-of-core computation capabilities allowing Wendelin based stacks to go beyond the limits imposed by available RAM in a cluster of machines. On top of this stack different libraries will be integrated - most importantly Scikit-Learn for machine learning and Juypter which was the topic for the current release.

New Features in 0.5

After focussing on easing installation in the last release, Wendelin 0.5 is all about integrating Juypter into Wendelin. As with scikit, the idea was to provide a familiar interface to get started quickly without having to understand everything that happens under the hood from the get-go. You can check out the iPython Notebook pictured below here.

Wendelin | Screenshot Integration Jupyter

Besides integrating Juypter we have also switched to using Wendelin.Core 0.5 by default which includes the new ZBLk1 block storage which stores multiple objects in the database (distributed) compared to ZBlk0 (more info/performance tests).

We have also switched to setting up Wendelin in a "cluster mode". Instead of initially running on a single node, Wendelin can now be setup on a cluster of Zope nodes (the smallest cluster just having a single node), allowing parallel code execution through activities (this is one of the features provided and managed by ERP5 sitting under Wendelin).

Finally, some work has gone into improving the wendelin-standalone script that installs all components required in a VM. The VM has also been switched to setup with auto-restart enabled, so that a reboot also updates all components to their latest versions. All efforts here are going into providing easy to use tools for someone to evaluate, test and develop on Wendelin.

Try yourself

If you want to try your hands at Wendelin, the website now has a Developer Documentation which includes a detailed Installation Guide along with instructions for configuration instructions. There is also a sample Jupyter demo showing how to work with Juypter on Wendelin.

A look ahead

Two major topics are scheduled for Wendelin 0.6.

  • We will provide means to track data ingestion and status of analytics being run. This will include finding a way to describe what data is being ingested, for example by adding a sensor/data source description and using Data Supply to mimic real examples. This will most likely be handled using a variation of ERP5 business processes along with trade_state.
  • We will wrap up Wendelin.Core version 1 and move on version 2, which means implementation on the file system for speed and simplicity. This will take quite a bit of time so don't expect to see Wendelin 0.6 before a couple of months. The new release will then also include a performance comparison showing how Wendelin.Core 0.6 is superior to 0.5.
  • On the sideline we will also work on SlapOS to provide containerization as it seems like this way of doing things seems to be moving from bandwagon to becoming a standard process which should also be available through SlapOS.