Analyse: Work with Ingested Data

Out-of-Core
Wendelin.Core enables computation beyond the limits of available RAM.
We have integrated Wendelin and Wendelin.Core with Jupyter.
ERP5 Kernel (out-of-core compliant) vs. Python 2 Kernel (default).

Todo: Head to Jupyter (Notebook)
Head to Jupyter at http://[x].pydata-class.erp5.cn
Start a new ERP5 Notebook. This makes sure you use the ERP5 Kernel.
The Python 2 Kernel is the default Jupyter kernel.
Using Python 2 will disregard Wendelin and Wendelin.Core, so it is plain Jupyter.
Using the ERP5 Kernel will use Wendelin.Core in the background.
To make good use of it, all code written should be out-of-core "compatible".
For example, you should not just load a large file into memory (see below).

Todo: Learn ERP5 Kernel (Notebook)
Note that you have to connect to Wendelin/ERP5.
The reference you set will store your notebook in the Data Notebook Module.
Passing login/password will authenticate Jupyter with Wendelin/ERP5.
Note that your ERP5_URL in this case should be your internal URL.
You can retrieve it by running erp5-show -s in your webrunner terminal.
Note that outside of the tutorial we would set the external IPv6 address of Zope.

Todo: Getting Started (Notebook)
Connect, set an arbitrary reference and authenticate.

Todo: Accessing Objects (Notebook)
Import the necessary libraries.
Type context; this will give you the Wendelin/ERP5 object.
Type context.data_stream_module["1"] to get your uploaded sound file.
Accessing data works the same way throughout: [IPv6]:30002/erp5/[module_name]/[id]
All modules you see on the Wendelin/ERP5 start page can be accessed like this.
Once you have an object, you can manipulate it.
Note that accessing a file by its internal id (1) is only one way.
The standard way would be to use the reference of the respective object, which also allows you to use portal_catalog to query.

Todo: Accessing Data Itself (Notebook)
Try to get the length of the file using getData and by iterating.
Note that when using the ERP5 Kernel, all manipulations should be "Big Data aware".
Just loading a file via getData() works for small files, but will break as data volume grows.
It is important to understand that manipulations outside of Wendelin.Core need to be Big Data "compatible".
Internally, Wendelin.Core will run all manipulations "context-aware".
An alternative way to work would be to create your scripts inside Wendelin/ERP5 and call them from Jupyter.
Scripts/manipulations are stored in the Data Operations Module.

Todo: Compute Fourier (Notebook)
Proceed to fetch the data using getData for now.
Extract one channel, save it back to Wendelin and compute the FFT.
Note that the ERP5 Kernel at this time does not support %matplotlib inline.
Note the way to call methods from Wendelin/ERP5 (Base_renderAsHtml).
Wendelin/ERP5 has a system of method acquisition. Every module can come with its own module-specific methods, and method names are always context-specific ([object_name]_[method_name]). Base methods, on the other hand, are core methods of Wendelin/ERP5 and apply to more than one object.

Todo: Display Fourier (Notebook)
Check the rendered Fourier graphs of your recorded sound file.

Todo: Save Image (Notebook)
Save the image back to Wendelin/ERP5.

Todo: Create BigFile Reader (Notebook)
Add a new class BigFileReader.
It allows passing out-of-core objects.

Todo: Rerun using Big File Reader (Notebook)
Rerun using the Big File Reader.
Now one more step is out-of-core compliant.
Verify that the graphs render the same.
We are showing how to convert our code step by step to be out-of-core compatible.
This is only possible for code we write ourselves.
Whenever we have to rely on third-party libraries, there is no guarantee that data will be handled in the correct way. The only way to be truly out-of-core is either to make sure the third-party methods used are compatible, fixing them and committing the fixes back where needed, or to reimplement the third-party library completely.
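
The idea behind the BigFileReader step can be sketched as follows. This is a minimal illustration, not the class used in the tutorial notebook: the read_range(start, end) callable and the iter_chunks helper are hypothetical names, and an in-memory byte string stands in for the Wendelin/ERP5 object. The point is only that a file-like wrapper which fetches byte ranges on demand lets downstream code work through the data chunk by chunk instead of loading it in one call.

```python
class BigFileReader(object):
    """File-like view over a source that is fetched range by range.

    read_range(start, end) -> bytes is a hypothetical callable; in a real
    notebook it would be backed by whatever ranged-read access the stored
    object offers (an assumption, not a documented Wendelin/ERP5 API).
    """

    def __init__(self, read_range, size):
        self._read_range = read_range
        self._size = size
        self._pos = 0

    def tell(self):
        return self._pos

    def seek(self, offset, whence=0):
        # whence follows the usual file convention: 0=start, 1=current, 2=end
        if whence == 0:
            pos = offset
        elif whence == 1:
            pos = self._pos + offset
        elif whence == 2:
            pos = self._size + offset
        else:
            raise ValueError("invalid whence: %r" % (whence,))
        # Clamp so the position always stays inside the data
        self._pos = max(0, min(pos, self._size))
        return self._pos

    def read(self, n=-1):
        # A negative or missing n reads to the end, like a regular file.
        # Callers that want to stay out-of-core should always pass n.
        if n is None or n < 0:
            end = self._size
        else:
            end = min(self._pos + n, self._size)
        data = self._read_range(self._pos, end)
        self._pos += len(data)
        return data


def iter_chunks(reader, chunk_size):
    """Yield successive chunks of at most chunk_size bytes until exhausted."""
    while True:
        chunk = reader.read(chunk_size)
        if not chunk:
            break
        yield chunk


# Stand-in for a stored file: 1024 bytes held in memory for the demo only
payload = bytes(bytearray(range(256))) * 4
reader = BigFileReader(lambda start, end: payload[start:end], len(payload))

# Length computed by iterating, never holding more than one chunk at a time
total = sum(len(chunk) for chunk in iter_chunks(reader, 100))
```

Here total equals len(payload), which mirrors the earlier exercise of getting the file length once via getData and once by iterating. Only code that reads through the wrapper in bounded chunks benefits; a library that calls read() without a size still pulls everything into memory.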

Todo: Redraw from Wendelin (Notebook)
Redraw the plot directly from data stored in Wendelin/ERP5.

Todo: Verify Images are Stored
Head back to Wendelin/ERP5.
Go to the Image module and verify your stored images are there.

Todo: Verify Data Arrays are Stored
Switch to the Data Array module.
Verify all computed files are there.