Wendelin Exanalytics Libre

WENDELIN combines scikit-learn machine learning and NEO distributed storage for out-of-core data analytics in Python.


This short HowTo shows how to ingest data into the Wendelin platform using fluentd. You need a running Wendelin instance and must know its URL and the username and password used to access it. No additional configuration is needed on the Wendelin side, as it comes preconfigured.

See wendelin-HowTo.Install.Wendelin.Standalone to learn how to install Wendelin.

For the purpose of this HowTo we ingest a simple JSON record, but the data can be anything.

Step 1: Install fluentd and the Wendelin fluentd plugin

root@debian: ~$ apt install ruby ruby-dev
ivan@debian: ~$ gem install --user-install fluentd
ivan@debian: ~$ gem install --user-install fluent-plugin-wendelin

Step 2: Clone the Wendelin fluentd plugin repository and start fluentd

Before this step, make sure you know your Wendelin instance's URL, username and password.

ivan@debian: ~$ git clone https://lab.nexedi.com/nexedi/fluent-plugin-wendelin.git
ivan@debian: ~$ cd fluent-plugin-wendelin/example
# set proper username / password and URL in configuration file!
ivan@debian: ~/fluent-plugin-wendelin/example$ vi to_wendelin.conf
ivan@debian: ~/fluent-plugin-wendelin/example$ ~/.gem/ruby/2.7.0/bin/fluentd -v -c to_wendelin.conf
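
The path ~/.gem/ruby/2.7.0 above depends on the installed Ruby version, so it may differ on your machine. The following is a minimal sketch, assuming ruby is on the PATH and the gems were installed with --user-install, that asks RubyGems for the user gem directory instead of hard-coding it. The function name user_gem_bindir is a hypothetical helper introduced here for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the directory that holds executables
# installed by `gem install --user-install`.
user_gem_bindir() {
    # Gem.user_dir is the per-user gem directory (for example ~/.gem/ruby/2.7.0);
    # gem executables such as fluentd live in its bin/ subdirectory.
    printf '%s/bin\n' "$(ruby -e 'puts Gem.user_dir')"
}
```

With this helper defined, fluentd could be started from the example directory with "$(user_gem_bindir)/fluentd" -v -c to_wendelin.conf.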

Step 3: Ingest

ivan@debian: ~$ curl -X POST -d 'json={"foo1":"bar1"}' http://localhost:8888/default_http_json
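
To watch the ingested data grow, it can help to send several records in a row. The following is a minimal sketch that repeats the curl call above in a loop. It assumes fluentd is listening on http://localhost:8888 with the default_http_json tag as in the example configuration; the function names make_payload and send_records and the FLUENTD_URL variable are hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# Assumption: fluentd listens here, as in the curl example above.
FLUENTD_URL="${FLUENTD_URL:-http://localhost:8888/default_http_json}"

# Hypothetical helper: build the form body for record number $1,
# in the same shape as the single-record example (json={"foo1":"bar1"}).
make_payload() {
    printf 'json={"foo%s":"bar%s"}' "$1" "$1"
}

# Hypothetical helper: POST $1 records, one per request.
# Stops and returns non-zero as soon as one request fails.
send_records() {
    count="$1"
    i=1
    while [ "$i" -le "$count" ]; do
        curl --silent --show-error --fail -X POST \
            -d "$(make_payload "$i")" "$FLUENTD_URL" || return 1
        i=$((i + 1))
    done
}
```

After loading these definitions in your shell, send_records 5 would post five small JSON records to the local fluentd instance.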

Step 4: Check that everything was successfully ingested on the Wendelin side

Wendelin's data model is quite complex. For the purpose of this HowTo it is enough to confirm that the data was ingested. In this example it is ingested into a "Data Stream" object with the reference "default_http_json-default_http_json". Open this object's view to see its size, which should increase after each additional "curl" call.