Wendelin Exanalytics Libre

WENDELIN combines Scikit-learn machine learning and NEO distributed storage for out-of-core data analytics in Python.

Wendelin vs. Hadoop

Component                           Wendelin                         Hadoop
High-level programming language     Python                           Java
Low-level language                  C/C++/FORTRAN                    N/A
Standard data structure             NumPy                            N/A
Native x86 compiler                 Numba                            N/A
GPU compiler                        Parakeet                         N/A
Machine learning                    Scikit-learn                     Weka
Distributed storage                 NEO                              Spark
Distributed processing              ERP5 Activity                    Job Tracker
Management portal                   ERP5 Data                        Cloudera Manager
Natural language processing         NLTK                             Lucene
Video processing                    OpenCV-python                    N/A
Financial statistics                Pandas                           N/A
Distributed index                   MariaDB, TokuDB, Spider, Sphinx  Solr
Cloud deployment and orchestration  SlapOS                           Zookeeper

Wendelin focuses on Python-based data analytics, and in particular on the NumPy standard, whereas Hadoop is mostly tied to the Java programming world. As a result, Wendelin can benefit more quickly from the growing homogenization of scientific computing around Python. Some similarities nevertheless exist between the two architectures, as the table above illustrates with typical examples of the software components used in each.
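
To make the idea of out-of-core analytics on NumPy arrays more concrete, here is a minimal sketch. It does not use Wendelin or NEO: as an assumption made purely for illustration, numpy.memmap over a temporary local file stands in for the distributed storage, and the helper names chunked_mean and demo are hypothetical. The point is only that an array larger than memory can still be processed through the ordinary NumPy interface, one slice at a time.

```python
import os
import tempfile

import numpy as np


def chunked_mean(path, shape, dtype=np.float64, chunk_rows=1000):
    """Column means of an on-disk 2-D array, reading chunk_rows rows at a time."""
    # Assumption: a local memory-mapped file stands in for distributed storage.
    data = np.memmap(path, dtype=dtype, mode="r", shape=shape)
    total = np.zeros(shape[1], dtype=np.float64)
    for start in range(0, shape[0], chunk_rows):
        # Only this slice is copied into memory.
        chunk = np.asarray(data[start:start + chunk_rows])
        total += chunk.sum(axis=0)
    del data  # release the mapping before the file is removed
    return total / shape[0]


def demo():
    """Write a small array to disk, then compute its column means in chunks."""
    shape = (10_000, 4)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.bin")
        writer = np.memmap(path, dtype=np.float64, mode="w+", shape=shape)
        writer[:] = np.arange(shape[0] * shape[1], dtype=np.float64).reshape(shape)
        writer.flush()
        del writer
        return chunked_mean(path, shape)


if __name__ == "__main__":
    print(demo())
```

The same chunked loop could feed a Scikit-learn estimator that supports incremental training through partial_fit, which is one common way to train a model on data that does not fit in memory.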