Overview

In this tutorial we will learn how to:
- Install and configure Wendelin on a Webrunner.
- Ingest data into Wendelin using Fluentd, an open source data collector (see the Fluentd Website).
- Manipulate data in Wendelin using Jupyter Notebook (see the Jupyter Website).
- Present data from Wendelin on a web site using the RenderJS and jIO JavaScript libraries.

1. Getting Wendelin on a Webrunner

Request a Webrunner

For this tutorial, you will use services provided by vifib.com.
On Vifib, request a Webrunner:
Go to Services.
Click the Add button.

Request a Webrunner

Choose Webrunner software.
On the next screen choose the latest version.
Choose a name for your Webrunner and click Proceed.
Wait for the connection parameters to appear and the Monitoring Status to turn green.

Request a Webrunner

You will need three connection parameters: url, init-user and init-password.
Click the url parameter and authenticate with the init-user and init-password parameters.

Install Wendelin

Click Open Software Release.
Open Slapos > Software > Wendelin and click Open Software.
You should now see a text file whose first line is [buildout]. Click the green arrow to start compilation.

Install Wendelin

Follow the log on the screen.
Compilation takes several hours.
If compilation fails, click the green arrow again. You may need to restart the compilation a few times.
In the end, at least the Building state should be completed.

Connection Information

When installation is complete, switch to Services and click the Connection Information tab.
In slappart0 you should see something like the screenshot above. If not, click the green arrow again, wait until it finishes and refresh.
Note down the family-default-v6, inituser-login and inituser-password parameters.

Request frontend

Go back to the Vifib screen of your Webrunner service and scroll down to the parameters Custom Frontend Backend URL and Custom Frontend Backend Type.
For Custom Frontend Backend URL enter the value of the family-default-v6 parameter, for Custom Frontend Backend Type enter Zope, and click Save.
Wait until custom-frontend-url appears under the connection parameters and click on it.

Login

Click on Zope management interface.
Log in using the username and password that you noted down.
Go to: http://softinstxxxx.host.vifib.net/erp5.

Configure Site

Go to: My Favourites > Configure your Site.

Your ERP5 instance is now ready, but it has only a very generic core library. This core can be specialized for different purposes through business templates.
This can be done manually, by installing the individual business templates that provide the utilities you need, in My Favourites > Manage Business Templates.
Wendelin, however, uses many business templates, so a configurator is provided that does this work for you and installs just the business templates that matter for Wendelin.

Install Configuration

Select Wendelin Configuration.

If there is no Wendelin Configurator, you have to install it:
- Go to My Favourites > Manage Business Templates.
- Click the Import/Export button (blue and red arrows).
- Select Exchange > Install Business Templates from Repositories.
- Check the box next to the erp5_wendelin_configurator business template and click Install Business Templates from Repositories.

Install Configuration

Make sure that all the boxes are checked.
Click Set of data Notebook module and, on the next screen, click Install.

Wait

Installation may take a few minutes.
Once done, click "Start using your ERP5 System".
Log in again.

Main Interface

Your instance is now ready to use.

The start screen shows a list of directly accessible modules (data types).
You can also access them through the Modules selection tab at the top of the screen.
You probably noticed that there are many more of them after configuration.
Modules can contain anything from Persons and Organizations to Data Streams.
Modules prefixed with Portal are not displayed (e.g. portal ingestion policies).

2. Simulate Sensor and ingest data via Fluentd

Create Ingestion Policy

Now we will set up Wendelin to receive data from Fluentd.
Go to: http://softinstxxxx.host.vifib.net/erp5/portal_ingestion_policies/
An ingestion policy is a security setting that prevents arbitrary streams from being sent.

Currently Fluentd and Wendelin are set up to receive streams of data.
A data stream is a file, created by the ingestion policy, to which data is continually appended.

Fast Input

Hit the "Green Runner" to create a new Ingestion Policy.
Enter "pydata" as reference name and "Pydata" as title and click Create Ingestion Policy.
This creates a new ingestion policy.

The ingestion policy includes default scripts to be called on incoming data.
If you want to modify data handling, you could now write your own script.
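
The address that Fluentd will later post to is the site URL followed by portal_ingestion_policies and the policy reference. A minimal sketch of assembling it, assuming the policy is reachable under the reference entered above ("pydata"); the helper name ingestion_policy_url is made up for illustration and is not part of Wendelin:

```python
from urllib.parse import quote


def ingestion_policy_url(site_url, policy_reference):
    """Build the URL of an ingestion policy below portal_ingestion_policies."""
    base = site_url.rstrip("/")
    # Quote the reference so that spaces or slashes cannot break the path.
    return "%s/portal_ingestion_policies/%s" % (base, quote(policy_reference, safe=""))


# softinstxxxx is the placeholder host name used throughout this tutorial.
url = ingestion_policy_url("http://softinstxxxx.host.vifib.net/erp5/", "pydata")
print(url)
```

The result has the same shape as the streamtool_uri value used in the Fluentd configuration file later in this tutorial.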

Wendelin in Production

The first Wendelin prototype in production was used to monitor wind turbines.
Wendelin is used to collect data and manage wind parks.
This made it possible to use machine learning for failure prediction and structural health monitoring.

Photo: Fotolia.fr - David Hense

Wendelin in Production

Each wind turbine is equipped with multiple sensors per blade and a PC collecting sensor data.
Each PC includes Fluentd for ingesting into Wendelin (network availability!).

Basic FluentD

Fluentd is an open source unified data collector.
It has a source and a destination, and is generic and easy to extend.
It can handle data collection even under poor network conditions.
More info: Fluentd Website
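
One way to picture how a collector copes with a poor network is a buffer that keeps records until delivery succeeds. The following is a conceptual sketch only, not how Fluentd is implemented; BufferedSender and flaky_send are hypothetical names:

```python
from collections import deque


class BufferedSender:
    """Keep records in memory until the send function accepts them."""

    def __init__(self, send):
        self.send = send          # callable that raises OSError on failure
        self.buffer = deque()

    def emit(self, record):
        self.buffer.append(record)

    def flush(self):
        """Deliver buffered records in order; stop at the first failure."""
        delivered = 0
        while self.buffer:
            try:
                self.send(self.buffer[0])
            except OSError:
                break             # network is down, keep the rest for later
            self.buffer.popleft()
            delivered += 1
        return delivered


# Simulated network that can be switched on and off.
state = {"up": False}
received = []


def flaky_send(record):
    if not state["up"]:
        raise OSError("network unreachable")
    received.append(record)


sender = BufferedSender(flaky_send)
sender.emit(b"chunk-1")
sender.emit(b"chunk-2")
first_try = sender.flush()    # network is down, records stay buffered
state["up"] = True
second_try = sender.flush()   # network is back, records are delivered in order
```

Nothing is lost while the network is down; the records simply wait in the buffer for the next flush.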

Complex FluentD

Setup of FluentD can be much more complex.
In Wendelin production, every turbine has its own FluentD.

Record Audio

To demonstrate data ingestion we will use a simple audio .wav file.
You can record it on your own, using another SlapOS Webrunner hosting only a webpage:
https://softinst56756.host.vifib.net/public/project/hyperconvergence/
Click "Record" to start/stop recording and "Save" to save.
Or you can download sample.wav.

Webrunner using fluentD

For this part of the tutorial you will need a Webrunner running Fluentd:
- Request another Webrunner for Fluentd.
- Install it the same way, but this time choose slapos > software > fluentd.
- When installation completes, go to the Editor. (You don't need to request a frontend.)

Upload to Monitor

Wendelin-ERP5 Monitor Webrunner File Upload

Go to the Editor, click "This Project"
Make a folder for this tutorial
Left click and select upload file
Upload your .wav file.

Forward File to fluentD

Wendelin-ERP5 Webrunner FluentD Configuration

In the Editor, open the folder that you created.
Create a new file and name it YOUR_NAME.cfg.
The code is on the next slide (please copy and paste).

We are now creating a configuration file to pass to Fluentd.
The file contains all parameters for Fluentd regarding data source and destination.
Normally this is set up front, but for the tutorial we hardcode it.

FluentD Configuration File (Gist)

<source>
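    # Input plugin used to read the file
    @type bin
    format none
    # File to upload: edit the folder and file name
    path /srv/slapgrid/slappart9/srv/runner/PUT_YOUR_WAV_HERE/!!YOURNAME!!*.wav
    # Position file created by fluentd to track progress in the file
    pos_file /srv/slapgrid/slappart9/srv/runner/!!YOURNAMEGOESHERE!!.pos
    enable_watch_timer false
    read_from_head true
    # Must match the pattern of the match tag below
    tag pydata
</source>
<match pydata>
    # Output plugin that sends the data to Wendelin
    @type wendelin
    @id wendelin_out
    # URI of your ingestion policy
    streamtool_uri http://!!URL_TO_YOUR_ZOPE!!/erp5/portal_ingestion_policies/pydata
    # Username and password for your Wendelin
    user zope
    password insecure
    buffer_type memory
    flush_interval 1s
    disable_retry_limit true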
</match>

Copy this script into the new file that you created and edit the paths to the wav file and to the pos file, the URI of your Wendelin instance, and the username and password.
Make sure all the paths are correct.
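
A quick way to catch forgotten edits is to scan the file for markers that are still in the !!...!! form. A minimal sketch, assuming placeholders always follow that pattern; it does not catch PUT_YOUR_WAV_HERE, which is written without exclamation marks, and find_placeholders is a name chosen for this example:

```python
import re

# Placeholders in the tutorial's configuration look like !!SOMETHING!!
PLACEHOLDER = re.compile(r"!![^!\s]+!!")


def find_placeholders(config_text):
    """Return (line_number, placeholder) pairs still present in the text."""
    found = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        for match in PLACEHOLDER.finditer(line):
            found.append((number, match.group(0)))
    return found


sample = """<source>
    path /srv/slapgrid/slappart9/srv/runner/PUT_YOUR_WAV_HERE/!!YOURNAME!!*.wav
    tag pydata
</source>"""

leftovers = find_placeholders(sample)
for number, placeholder in leftovers:
    print("line %d still contains %s" % (number, placeholder))
```

To check your own file, read it with open("YOUR_NAME.cfg").read() and pass the text to find_placeholders.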

A Fluentd configuration file has two parts: the input part, represented by the source tag, and the output part, represented by the match tag.
Each part first defines which plugin Fluentd shall use, with the @type parameter.
The source part then defines the path to the file that you wish to upload and the path for the .pos file, with the path and pos_file parameters.
The .pos file is created by Fluentd and is used for tracking the position in a file during ingestion.
The tag parameter in the source part must match the pattern given in the match tag.
The output part is the match tag. It contains the URI of your ingestion policy, and the username and password for your Wendelin.
For more information on writing Fluentd configuration files, check out the Fluentd configuration documentation.
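
Because a mismatch between the source tag and the match tag means the data never reaches the output, it can help to check the pairing before starting Fluentd. A minimal sketch that compares the names literally; Fluentd's match patterns also support wildcards, which this example deliberately ignores, and the function names are made up for illustration:

```python
import re


def source_tags(config_text):
    """Tags assigned inside <source> sections."""
    tags = []
    for block in re.findall(r"<source>(.*?)</source>", config_text, re.S):
        tags += re.findall(r"^\s*tag\s+(\S+)\s*$", block, re.M)
    return tags


def match_patterns(config_text):
    """Patterns named in <match ...> opening tags."""
    return re.findall(r"<match\s+([^>]+?)\s*>", config_text)


def unmatched_tags(config_text):
    """Source tags that no <match> section names literally."""
    patterns = set(match_patterns(config_text))
    return [tag for tag in source_tags(config_text) if tag not in patterns]


good = """<source>
    @type bin
    tag pydata
</source>
<match pydata>
    @type wendelin
</match>"""

bad = good.replace("<match pydata>", "<match other>")

print(unmatched_tags(good))
print(unmatched_tags(bad))
```

An empty list means every source tag has a literal counterpart in a match section.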

Save and send from Terminal

Wendelin-ERP5 Webrunner FluentD File Transfer

Switch to the terminal and run Fluentd with:
software/"YOUR SOFT NUMBER"/bin/fluentd -c path/to/YOUR_NAME.cfg
This runs Fluentd with your configuration, which includes what file to send to Wendelin.
It will create a new Data Stream in your Wendelin instance and send data to it.

Notice that Fluentd keeps running and waiting for new input until you interrupt it.
In our case we had just a small .wav file, but we could also have a continuous stream of data from a sensor, and Fluentd would continuously append it to the Data Stream.
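
The idea behind continuous appending with a position file can be sketched in a few lines: remember how far the file has been read, and on the next pass forward only what was added since. This is an illustration of the principle under simplified assumptions; the offset-only file written here is not Fluentd's actual .pos format, and read_new_bytes is a hypothetical helper:

```python
import os
import tempfile


def read_new_bytes(data_path, pos_path):
    """Return bytes appended to data_path since the offset stored in pos_path."""
    offset = 0
    if os.path.exists(pos_path):
        with open(pos_path) as pos_file:
            offset = int(pos_file.read().strip() or 0)
    with open(data_path, "rb") as data_file:
        data_file.seek(offset)
        chunk = data_file.read()
    # Remember the new position for the next call.
    with open(pos_path, "w") as pos_file:
        pos_file.write(str(offset + len(chunk)))
    return chunk


workdir = tempfile.mkdtemp()
data_path = os.path.join(workdir, "sensor.bin")
pos_path = os.path.join(workdir, "sensor.pos")

with open(data_path, "wb") as f:
    f.write(b"first")
first = read_new_bytes(data_path, pos_path)    # everything written so far

with open(data_path, "ab") as f:
    f.write(b"second")
second = read_new_bytes(data_path, pos_path)   # only the appended part
```

Because the position survives between calls, restarting the reader does not send the same data twice.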

Check created Data Stream

Wendelin-ERP5 - Verify File was received

Head back to Wendelin/ERP5.
In the Data Stream Module, check the file size of the pydata stream.
It should show a file size larger than 0.

3. Work with Ingested Data

Out-of-Core

Wendelin.Core enables computation beyond the limits of existing RAM.
We have integrated Wendelin and Wendelin.Core with Jupyter.
In Jupyter we can use the ERP5 Kernel (out-of-core compliant) or the Python 2 Kernel (default Jupyter).

Enable Data Notebook

Head to My Favourites > Preferences.
Click Default System Preferences and open the Data Notebook tab.
Check the box to enable Data Notebook and click the save icon.

Head to Jupyter

Go back to Connection Information in the Webrunner and open jupyter-url.
For the password, we have to find the partition where Jupyter is installed. Go to the Services > Process tab and check which slappart Jupyter runs in (e.g. slappart7); you will then find the password in instance/slappart7/knowledge0.cfg.
Authenticate.
Start a new ERP5 Notebook. This will make sure you use the ERP5 Kernel.

Note that you need IPv6 access to open jupyter-url. If you don't have it, request a new frontend on Vifib with jupyter-url as backend URL and notebook as backend type, then click the secure access URL to reach it.
The Python 2 Kernel is the default Jupyter kernel.
Using Python 2 will disregard Wendelin and Wendelin.Core, so it is basic Jupyter.
Using the ERP5 Kernel will use Wendelin.Core in the background.
To make good use of it, all code written should be out-of-core "compatible".
For example, you should not just load a large file into memory (see below).

Learn ERP5 Kernel

Help yourself with Notebook

Passing login/password will authenticate Jupyter with Wendelin/ERP5.
The reference you set will store your notebook in the Data Notebook Module.

Getting Started

Authenticate, and set an arbitrary reference for your notebook.

This is always the first step when you start a new notebook with the ERP5 Kernel.
It makes sure that you are connected to the ERP5/Wendelin instance and can work with objects on it.
It also creates a Data Notebook object on your ERP5/Wendelin instance.
You can go to the Data Notebook Module and see that your Data Notebook object is now saved there.

Accessing Objects

Type context, this will give you the Wendelin/ERP5 Object.
Type context.data_stream_module["1"] to get your uploaded sound file.

Accessing data works the same way throughout: [IPv6]:30002/erp5/[module_name]/[id].
All modules you see on the Wendelin/ERP5 start page can be accessed like this.
Once you have an object you can manipulate it.
Note that accessing a file by internal id (1) is only one way.
The standard way would be to use the reference of the respective object, which also allows you to use portal_catalog to query.
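
The correspondence between the URL path [module_name]/[id] and the expression context.data_stream_module["1"] can be pictured as one lookup per path segment. The sketch below uses plain dictionaries as a stand-in for a site; these are not real ERP5 objects, and traverse is a hypothetical helper, not an ERP5 function:

```python
def traverse(root, path):
    """Follow each path segment as a key lookup, starting from root."""
    obj = root
    for segment in path.strip("/").split("/"):
        obj = obj[segment]
    return obj


# Stand-in for an ERP5 site: module names map to containers of documents.
site = {
    "data_stream_module": {"1": "pydata stream"},
    "image_module": {},
}

stream = traverse(site, "data_stream_module/1")
print(stream)
```

In the notebook itself no such helper is needed, since the context object already supports this kind of access.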

Import libraries

Import necessary libs.

Accessing Data Itself

Try to get the length of the file using getData and by iterating.
Note that when using the ERP5 kernel all manipulations should be "Big Data aware".
Just loading a file via getData() works for small files, but will break with volume.

It's important to understand that manipulations outside of Wendelin.Core need to be Big Data "compatible".
Internally Wendelin.Core will run all manipulations "context-aware".
An alternative way to work would be to create your scripts inside Wendelin/ERP5 and call them from Jupyter.
Scripts/manipulations are stored in the Data Operations Module.
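
The difference between the two ways of measuring the length can be shown without Wendelin at all. In this sketch an in-memory io.BytesIO object stands in for the data stream (an assumption made so the example runs anywhere); the first measurement loads everything at once, the second never holds more than one chunk:

```python
import io


def length_in_chunks(stream, chunk_size=64 * 1024):
    """Count bytes by reading fixed-size chunks instead of the whole stream."""
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
    return total


payload = b"\x00\x01" * 100000                     # stand-in for the ingested audio

whole = len(io.BytesIO(payload).read())            # loads everything at once
chunked = length_in_chunks(io.BytesIO(payload))    # holds at most one chunk

print(whole, chunked)
```

Both approaches give the same number, but only the chunked one keeps memory use bounded as the data grows.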

Compute Fourier

Proceed to fetch data using getData for now
Extract one channel, save it back to Wendelin and compute FFT

We can call methods from Wendelin/ERP5.
Wendelin/ERP5 has a system of method acquisition. Every module can come with its own module-specific methods, and method names are always context specific ([object_name]_[method_name]). Base methods, on the other hand, are core methods of Wendelin/ERP5 and applicable to more than one object.
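
The computation on this slide (extract one channel, then compute an FFT) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the notebook's code: it uses a synthetic stereo signal instead of the recorded file, and a small textbook radix-2 FFT where a notebook would more likely call a numerical library.

```python
import cmath
import math


def extract_channel(samples, channel, channel_count):
    """Pick one channel from interleaved samples (L R L R ... for stereo)."""
    return samples[channel::channel_count]


def fft(values):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return [complex(values[0])]
    even = fft(values[0::2])
    odd = fft(values[1::2])
    result = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        result[k] = even[k] + twiddle
        result[k + n // 2] = even[k] - twiddle
    return result


# Synthetic stereo signal: left channel is a sine with 4 cycles, right is silence.
n = 64
left = [math.sin(2 * math.pi * 4 * i / n) for i in range(n)]
interleaved = [value for pair in zip(left, [0.0] * n) for value in pair]

channel = extract_channel(interleaved, 0, 2)
magnitudes = [abs(c) for c in fft(channel)]
peak_bin = max(range(n // 2), key=lambda k: magnitudes[k])
print("strongest frequency bin:", peak_bin)
```

The strongest bin corresponds to the number of sine cycles in the window, which is the kind of peak the Fourier graph of the recording makes visible.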

Display Fourier

Check the rendered Fourier graphs of your recorded sound file

Save Image

Save the image back to Wendelin/ERP5.
Close the figure with plt.close(), otherwise it will show on all outputs.

Create BigFile Reader

Add a new class BigFileReader.
It allows out-of-core objects to be passed.
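
The general shape of such a reader is a file-like object that fetches only the byte range it is asked for. The following is a sketch of that idea, not the class used in the tutorial's notebook; the read_range callable is a hypothetical stand-in for whatever range access the underlying object offers:

```python
import io


class BigFileReader:
    """File-like view on an object that can return a byte range on demand."""

    def __init__(self, read_range, size):
        self._read_range = read_range   # callable(offset, length) -> bytes
        self._size = size
        self._pos = 0

    def tell(self):
        return self._pos

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_SET:
            self._pos = offset
        elif whence == io.SEEK_CUR:
            self._pos += offset
        elif whence == io.SEEK_END:
            self._pos = self._size + offset
        else:
            raise ValueError("unsupported whence: %r" % whence)
        # Simplification: clamp the position to the valid range.
        self._pos = max(0, min(self._pos, self._size))
        return self._pos

    def read(self, length=-1):
        remaining = self._size - self._pos
        if length is None or length < 0:
            length = remaining
        data = self._read_range(self._pos, min(length, remaining))
        self._pos += len(data)
        return data


payload = bytes(range(256)) * 4       # stand-in for a large data stream


def read_range(offset, length):
    return payload[offset:offset + length]


reader = BigFileReader(read_range, len(payload))
header = reader.read(4)               # only four bytes are fetched
reader.seek(-2, io.SEEK_END)
tail = reader.read()                  # the last two bytes
```

Code that expects a file object with read, seek and tell can be handed such a reader without the full content ever being loaded.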

Rerun using Big File Reader

Rerun using the Big File Reader
Now one more step is out of core compliant
Verify graphs render the same

We are now showing, step by step, how to convert our code to be out-of-core compatible.
This is only possible for code we write ourselves.
Whenever we have to rely on 3rd party libraries, there is no guarantee that data will be handled in the correct way. The only way to be truly out-of-core is either to make sure the 3rd party methods used are compatible (fixing them and committing back where needed) or to reimplement the 3rd party library completely.

Check the graphs

Verify graphs render the same
Don't forget to close the figure.

Redraw from Wendelin

This is the way to redraw the plot directly from data stored in Wendelin/ERP5.
Immediately after you create content, though, it doesn't work: you must wait for the object to be catalogued.
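
Waiting for the catalogue can be automated by retrying the lookup until it returns something. A generic sketch of such a polling helper; wait_until is a made-up name, and the lookup function below merely simulates a query that succeeds on the third attempt rather than calling Wendelin:

```python
import time


def wait_until(check, timeout=60.0, interval=1.0):
    """Call check() until it returns a truthy value or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        result = check()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(interval)


# Stand-in for a catalog lookup that finds the object on the third attempt.
attempts = []


def lookup():
    attempts.append(1)
    return "data-array" if len(attempts) >= 3 else None


found = wait_until(lookup, timeout=5.0, interval=0.01)
print(found, "after", len(attempts), "attempts")
```

In a notebook, the check function would be replaced by the actual query for the newly created object.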

Verify Images are Stored

Head back to Wendelin/ERP5
Go to Image module and verify your stored images are there.

Verify Data Arrays are Stored

Switch to the Data Array module
Verify all computed files are there.

4. Visualize, Display computed data

Running Web Sites from Wendelin

The last step is to display the results in a web app.
Head back to the main section in Wendelin/ERP5.
Go to the Web Site Module.

One of the modules in ERP5 is the Web Site Module.
We will use it to create a simple web site to present our results.

WebSite Module

Website Module contains websites
Open renderjs_runner - ERP5 gadget interface

Front end components are written with two frameworks, jIO and renderJS
jIO (Gitlab) is used to access documents across different storages
Storages include: Wendelin, ERP5, Dropbox, webDav, AWS, ...
jIO includes querying, offline support, synchronization
renderJS (Gitlab) allows building apps from reusable components
Both jIO/renderJS are asynchronous using promises

Renderjs Runner

Parameters for website module
see ERP5 Application Launcher - base gadget
Open new tab: http://softinstxxxx/erp5/web_site_module/renderjs_runner/
It is important not to forget the / at the end of the URL, otherwise the link will not work!

Apps from gadgets are built as a tree structure, the application launcher is the top gadget
All other gadgets are child gadgets of this one
RenderJS allows gadgets to publish/acquire methods from other gadgets to keep functionality encapsulated

Renderjs Web App

ERP5 interface as responsive application
We will now create an application like this to display our data

Clone Website

Go back to renderjs_runner website
Clone the website

Rename Website

Change id to pydata_runner
Change name to PyData Runner
Save

Publish Website

Select action Publish and publish the site
This changes object state from embedded to published
Try to access: http://softinstxxxx/erp5/web_site_module/pydata_runner/

Every object in ERP5 has a state (for example draft, published, ...).
Workflows are used to change the state of objects.
The workflow in this case publishes a web site, which means changing its state from Embedded to Published.
Workflows (among other properties) can be security restricted. For example, everybody can see a Web Site in the published state, but only its creator can see it while it is still in the draft state.
This concept applies to all documents in ERP5.
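
The interplay of state, transition and visibility can be modelled in a few lines. This is a toy model for illustration only and has nothing to do with how ERP5 implements workflows; the class, the transition table and the user names are all invented for the example:

```python
class WebSite:
    """Toy document with a state and an owner (not ERP5's workflow engine)."""

    # Allowed transitions: (current state, action) -> new state
    TRANSITIONS = {
        ("draft", "publish"): "published",
        ("embedded", "publish"): "published",
    }

    def __init__(self, owner, state="draft"):
        self.owner = owner
        self.state = state

    def do(self, action):
        key = (self.state, action)
        if key not in self.TRANSITIONS:
            raise ValueError("cannot %s from state %s" % (action, self.state))
        self.state = self.TRANSITIONS[key]

    def can_view(self, user):
        # Published documents are visible to everybody, others only to the owner.
        return self.state == "published" or user == self.owner


site = WebSite(owner="alice")
visible_before = site.can_view("bob")   # draft: only the owner may look
site.do("publish")
visible_after = site.can_view("bob")    # published: everybody may look
```

The point of the model is that visibility follows from the state, so changing the state through a transition is what changes who can see the document.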

Layout Properties

Change to the "Layout Properties" tab.
Update Front Page Gadget to pydata.
Refresh your app (disable cache); it will be broken, as the pydata page gadget doesn't exist yet.

One advantage of working with an async promise-chain based framework like renderJS is the ability to capture errors.
It is possible to capture errors on the client side, send a report to ERP5 (stack trace, browser) and not fail the app.
Much more fine-grained control is possible; we currently just dump to screen/console.

Web Page Module

Now we will create the pydata page gadgets. As with the website, we will clone existing default gadgets and modify them.
Change to the web page module.
Search for reference %worklist%.

The web page module includes html, js and css files used to build the frontend UI
The usual way of working with static files is to clone a file, rename its reference and publish it alive (still editable)

Clone Worklist gadgets

Open both files in new tabs, clone, change title.
Replace "worklist" in references and titles with "pydata", save and publish alive.
When published alive, objects are still editable.
We will now edit both files to display our graph.

Pydata Gadget HTML

Go to edit tab on html gadget.
Copy and paste this script to the contents of the gadget.

This is a default gadget setup with some HTML.
Gadgets should be self-contained, so they always include all dependencies.
RenderJS uses a custom version of RSVP for promises (we can cancel promises).
The global gadget includes promisified event binding (single, infinite event listener).
We are using RenderJS and jIO javascript libraries.
More info:
- RenderJS Website
- jIO Website

Pydata Gadget JS

Same with javascript gadget.
Copy and paste this script to the javascript gadget (in the edit tab).

Save, refresh web app

Once you saved your files, go back to the web app and refresh
You should now have a blank page with header set correctly
These are just the default template gadgets.
We will now update our gadgets to fetch our graph and display it

Update Pydata Gadget HTML

Update html gadget with this script.

Taken from an existing project; the HTML was created to fit a responsive grid of graphs
Added JS library for multidimensional arrays: NDArray
Added JS library for displaying graphs: Dygraph

Pydata Gadget JS (1)

Update js gadget with this script (screenshots on this and next slides).

First we only define options for the Dygraph plugin
In a production system these are either set as defaults or stored along with the respective data

Pydata Gadget JS (2)

Add methods outside of the promise chain
Simplified (removed actual creation of date objects)

Pydata Gadget JS (3)

Edit url variable for your instance and for id of your spectrum2 Data Array!

"ready" triggered once gadget is loaded
define gadget specific parameters
"render" called by parent gadget or automatically
we hardcode url parameter, by default it would be URL based

Pydata Gadget JS (4)

Orchestrated process starting with a cancellable promise queue
First step requesting the full file (NOT OUT-OF-CORE compliant - we load the whole file)
Return file converted into ndarray
Convert data into graph compatible format, store onto gadget
"declareService" triggered once UI is built
Graph will be rendered there.

Refresh Web Application

The example computes client-side, as the project requires working offline "in the field"

Summary: What did we do?

We installed Wendelin on our Webrunner and configured it.
We ingested data with Fluentd.
We worked with data using Jupyter Notebook.
We presented data from Wendelin using the RenderJS and jIO JavaScript libraries.

Wendelin Tutorial - Installing and Using Wendelin on Webrunner
