Notebooks are a simple, fun and efficient way to create business reports or play with data science libraries such as scikit-learn or pandas. They are also a great tool for interactive visualisation of data. We have been using Jupyter notebooks at Nexedi as part of the Wendelin out-of-core Big Data project. Using Jupyter, we could increase the productivity of engineers in charge of producing reports about the structural health of wind turbines in Germany. Using Jupyter, we could also provide a tool to analyse and visualise sales trends of a trading company in China. We are currently integrating Jupyter-Lab as the default IDE of SlapOS Edge Computing software.
Jupyter has been of tremendous help at Nexedi. We will keep on using it, especially Jupyter-Lab for server-based software development. But Jupyter has two major drawbacks: it does not scale and it is not ubiquitous. It does not scale because serving Jupyter notebooks to thousands or millions of users requires powerful servers and a complex software setup, which tends to break the kind of efficiency or simplicity one can experience as a single user. It is not ubiquitous because one has to install Jupyter to use it, and most users today no longer want to install anything and instead expect everything to be free.
Along came Iodide, a project originated by Mozilla and led by Hamilton Ulmer and Brendan Colloran.
Iodide is distributed as a collection of static files that can be hosted on a low-cost Content Delivery Network (CDN). There is no need for any kind of application server to deploy Iodide and disseminate it to millions of users.
The absence of a server-side component is what makes Iodide so different and so much better than Jupyter for most applications. With Iodide, a multinational corporation can distribute to all its employees a business reporting framework packed with the best of A.I. libraries with virtually no investment in any form of license or infrastructure.
A surprising consequence of this dual-headed runtime is that it drastically cuts the time it takes to develop interactive, Web-based data analysis and visualisation applications. During our initial evaluation of Iodide, we tried to develop a simple application which we had previously developed using two other frameworks: Bokeh and Wendelin. Here are our results.
In both cases, the assigned developer was using the framework for the first time. Before Iodide existed, the choice was simple: Wendelin provided enterprise grade features that are needed to industrialise a data collection and processing project whereas Bokeh provided rapid development. It was one or the other.
It is also much easier to integrate Iodide into Wendelin than Bokeh or other frameworks based on an application server. With RenderJS, Iodide can become a UI gadget component, just like other existing components (graphs, spreadsheets, etc.). This is because one of the key concepts of Iodide is to have no dependency on any application server and to be based only on static files.
Nexedi therefore decided to invest in Iodide so that it becomes the standard rapid application development environment of the next generation Wendelin platform by combining benefits of Wendelin and of Iodide in the same environment. Iodide will also become a standard reporting tool of OfficeJS HTML5 suite with close integration to ERP5 open source ERP/CRM.
A senior developer, Roman Yurchak, was sponsored by Nexedi to contribute to the core of Pyodide. This includes adding dynamic module loading (from an arbitrary Web URL), automating NumPy test coverage and porting scikit-learn.
Meanwhile, Richard Szczerba, a young developer, has been finalising and extending the work of Laurent Sebellin (Nexedi GmbH) to integrate Iodide into OfficeJS. OfficeJS Iodide can now save notebooks in ERP5 or Dropbox, stream data from remote storages and, of course, operate offline.
This investment will continue in 2018 and 2019 thanks to funding secured from France's FUI public research fund. Any developer or intern is welcome to join Nexedi, for a short or a long period, and contribute to the development of Iodide or Pyodide (write to firstname.lastname@example.org).
OfficeJS Iodide Notebook is a new member of the OfficeJS appstore that provides a simple way to use, edit and manage, online or offline, multiple Iodide notebooks stored locally or on remote online storages.
OfficeJS is an HTML5 appstore that includes an OpenXML compatible office suite (text, spreadsheet, presentation), an HTML5 compatible office suite (text, spreadsheet, illustration, imaging) and a few applications for daily use (expense tracking, bookmarks, PDF, music player, etc.).
Thanks to service worker technology powered by CribJS, OfficeJS HTML5 applications can operate both online and offline (on iOS, only with the latest version). This is how OfficeJS Iodide Notebook can operate entirely offline.
Thanks to storage abstraction powered by JIO, OfficeJS HTML5 applications can store data locally inside the browser (IndexedDB), remotely onto online storages (Dropbox, Google Drive, WebDAV, ERP5, etc.) and synchronise both. This is how OfficeJS Iodide Notebook can store and retrieve the notebook's jsmd text to and from a wide variety of storage without depending on any application server or changing Iodide's code (currently, the last run of any cell is stored in localStorage by Iodide).
And thanks to RenderJS, a lightweight component framework, OfficeJS applications run fast on slow devices (low-end smartphones, ARM based chromebooks, etc.) and can integrate well with other frameworks (Angular, React, etc.). RenderJS is the framework that gives OfficeJS Iodide Notebook the ability to add and display a list of notebooks that support full text search.
All OfficeJS applications are implemented as a collection of static assets (HTML, CSS, JS, etc.) that can be encapsulated into a ZIP file and hosted on any static HTTPS server. No application server is needed. The same applies to OfficeJS Iodide Notebook.
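To illustrate what "no application server" means in practice, here is a minimal sketch, assuming the static assets have been unpacked into a local directory. It serves that directory with nothing but Python's standard library. The function name `start_static_server` is hypothetical, and this is only meant for local testing over plain HTTP on localhost; a production deployment would use a static HTTPS host or a CDN as described above.

```python
import functools
import threading
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def start_static_server(directory, port=0):
    """Serve `directory` as static files on localhost.

    Passing port=0 lets the operating system pick a free port.
    Returns the running server; call shutdown() on it to stop.
    """
    # Bind the stock static-file handler to the chosen directory.
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    server = ThreadingHTTPServer(("127.0.0.1", port), handler)
    # Serve requests in a background thread so the caller is not blocked.
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

Calling `start_static_server("path/to/assets")` and opening `http://127.0.0.1:<port>/` in a browser should be enough to load the files, since the server does nothing except return them as they are.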
The build process involves the following steps:
The complete OfficeJS Iodide Notebook consists of about 130 files for a total of 40 MB of assets. It was tested on a low-end ARM based laptop with reasonable performance.
We then decided to run some tests on Iodide to ensure that it can do the job in a real business context.
To compare Iodide and Jupyter, we selected a Python-based Jupyter notebook that retrieves sales records from an ERP (ERP5), processes them and visualises results with matplotlib. This report is used in a daily business situation to analyse best sales in a trading company and display trends.
We thus created an equivalent notebook using Iodide and Pyodide. We used JIO to retrieve sales records from ERP5 and Plotly.js to visualise results. The rest of the code, mainly in Python, is similar to the original Jupyter notebook.
We managed, in a rather short time, to build a Pyodide notebook that provides the same results as the original Jupyter notebook.
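As an illustration of the kind of Python processing such a notebook performs, here is a minimal sketch that ranks products by total sales amount. It is not the code of the original notebook: the record fields (`product`, `quantity`, `unit_price`), the function name `best_sales` and the sample data are all assumptions made up for this example. It uses only the standard library, so it should run unchanged in Jupyter or in Pyodide.

```python
from collections import defaultdict


def best_sales(records, top=3):
    """Return the `top` products ranked by total sales amount, highest first."""
    totals = defaultdict(float)
    for record in records:
        # Hypothetical record layout: one line per sale.
        totals[record["product"]] += record["quantity"] * record["unit_price"]
    # Sort by total descending, then by name so that ties are deterministic.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top]


# Made-up sample records, for illustration only.
sample = [
    {"product": "tea", "quantity": 10, "unit_price": 2.0},
    {"product": "rice", "quantity": 5, "unit_price": 3.0},
    {"product": "tea", "quantity": 4, "unit_price": 2.0},
    {"product": "salt", "quantity": 1, "unit_price": 1.0},
]
print(best_sales(sample, top=2))
```

The resulting list of (product, total) pairs is the sort of structure that can then be handed to a plotting library for visualisation.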
Here are our observations:
Automatically interfacing Iodide with data sources has so far been both a challenge and a dilemma:
We solved this dilemma in OfficeJS by integrating JIO as part of our standard Iodide build. JIO is a library that provides a unified way to access local and remote data sources from the Web browser:
It applies to both data (ex. files as in a file system) and records (ex. lines in a relational database) through a unified API.
Thanks to JIO, we were able to eliminate the need to export data to CSV/Excel. We could instead load data straight from our ERP (ERP5) through simple JIO calls.
The illustration below shows an example of accessing ERP5 data from Iodide:
This data can then be turned into a graph:
JIO can operate without any server side proxy or adapter. It can be extended to support more data sources (ex: Google Drive, Amazon S3, Qiniu, etc.). Its only (important) limitation is that it requires good support of cross-origin resource sharing (CORS), a standard feature of the HTTP protocol that some cloud providers still refuse to support. Without CORS, a simple HTTP proxy is a must.
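To make the CORS requirement more concrete, here is a simplified sketch of the check a browser applies to a cross-origin response: the response must carry an `Access-Control-Allow-Origin` header that is either `*` or the exact origin of the page. The function name `allows_origin` and the example origins are hypothetical, and the sketch deliberately ignores preflight requests and credentialed requests, which add further rules.

```python
def allows_origin(response_headers, origin):
    """Return True if Access-Control-Allow-Origin permits `origin`.

    Simplified: covers only the basic header comparison, not preflight
    requests or requests sent with credentials.
    """
    # HTTP header names are case-insensitive, so normalise them first.
    headers = {
        name.lower(): value.strip() for name, value in response_headers.items()
    }
    allowed = headers.get("access-control-allow-origin")
    return allowed == "*" or allowed == origin


# A storage that sends no CORS header cannot be read directly from the page.
print(allows_origin({"Content-Type": "application/json"}, "https://notebook.example"))
# A storage that allows any origin can.
print(allows_origin({"Access-Control-Allow-Origin": "*"}, "https://notebook.example"))
```

When a storage provider fails this check, the browser blocks the page from reading the response, which is why a proxy that adds the header becomes necessary.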
A few programmers at Nexedi started using Iodide and Pyodide as a scratchpad to test snippets of JavaScript and Python code.
Iodide notebooks are saved and synchronised in Nexedi's ERP5 instance, just like any other corporate document.
We hope to increasingly use Iodide to generate business reports for our own use or for our customers. Thanks to RenderJS, we plan to integrate Iodide into our business applications as iframes in the same way as we do currently with complex spreadsheets.
Based on our current use of Iodide and Pyodide, we would like to see or contribute to the following improvements:
Article Contributors: Richard Szczerba, Sven Franck, Valentin Benozillo, Jean-Paul Smets.