WENDELIN combines scikit-learn machine learning and NEO distributed storage for out-of-core data analytics in Python
Wendelin vs. Hadoop
| | Wendelin | Hadoop |
| --- | --- | --- |
| High-level programming language | Python | Java |
| Low-level language | C/C++/FORTRAN | N/A |
| Standard data structure | NumPy | N/A |
| Native x86 compiler | Numba | N/A |
| GPU compiler | Parakeet | N/A |
| Machine learning | scikit-learn | Weka |
| Distributed storage | NEO | HDFS |
| Distributed processing | ERP5 Activity | Job Tracker |
| Management portal | ERP5 Data | Cloudera Manager |
| Natural language processing | NLTK | Lucene |
| Video processing | OpenCV-python | N/A |
| Financial statistics | Pandas | N/A |
| Distributed index | MariaDB, TokuDB, Spider, Sphinx | Solr |
| Cloud deployment and orchestration | SlapOS | Zookeeper |
Wendelin focuses on Python-based data analytics, and in particular on the NumPy standard, whereas Hadoop is mostly tied to the Java programming world. As a result, Wendelin can benefit more quickly from the growing homogenization of scientific computing around Python. Some similarities nevertheless exist between the two architectures, as illustrated in the table above, which lists typical examples of software components used in each case.
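To give a feel for what out-of-core analytics on NumPy arrays can look like, here is a minimal sketch. It does not use Wendelin or NEO: it assumes a plain local file accessed through `numpy.memmap` as a stand-in for a persistent array, and the helper names `chunked_mean` and `demo` are hypothetical, chosen only for this illustration.

```python
import os
import tempfile

import numpy as np


def chunked_mean(path, n_rows, n_cols, chunk_rows=1000):
    """Column means of an on-disk float64 matrix, read one chunk at a time."""
    # Assumption: a local file stands in for distributed storage.
    # np.memmap maps the file lazily, so only the touched pages are loaded.
    data = np.memmap(path, dtype=np.float64, mode="r", shape=(n_rows, n_cols))
    total = np.zeros(n_cols, dtype=np.float64)
    for start in range(0, n_rows, chunk_rows):
        chunk = data[start:start + chunk_rows]  # at most chunk_rows rows
        total += chunk.sum(axis=0)
    return total / n_rows


def demo():
    """Write a small matrix to a temporary file, then average it in chunks."""
    n_rows, n_cols = 10_000, 4
    fd, path = tempfile.mkstemp(suffix=".dat")
    os.close(fd)
    try:
        out = np.memmap(path, dtype=np.float64, mode="w+", shape=(n_rows, n_cols))
        out[:] = np.arange(n_rows * n_cols, dtype=np.float64).reshape(n_rows, n_cols)
        out.flush()
        del out  # release the writable mapping before reading the file back
        return chunked_mean(path, n_rows, n_cols)
    finally:
        os.remove(path)
```

The point of the sketch is that the analysis code only ever sees ordinary NumPy slices, so the same loop works whether the full array fits in memory or not; only the size of each chunk has to fit.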