Wendelin Exanalytics Libre

WENDELIN combines scikit-learn machine learning and NEO distributed storage for out-of-core data analytics in Python


This note shows how to get data from a Data Stream or a Data Bucket Stream and how to write it to a Data Array.


Get Data From Data Stream

chunk_list = my_data_stream.readChunkList(start_offset, end_offset)

This returns a list of data chunks. To get a string directly:

my_string = ''.join(my_data_stream.readChunkList(start_offset, end_offset))
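For a large offset range, joining every chunk into one string can use a lot of memory. The following is a minimal sketch of reading the range in fixed-size windows instead. FakeDataStream is a hypothetical stand-in that only mimics the readChunkList(start_offset, end_offset) call shown above, and it assumes end_offset is exclusive; iter_windows and window_size are illustrative names, not part of Wendelin.

```python
class FakeDataStream(object):
    """Hypothetical stand-in for a Data Stream; it only mimics readChunkList."""

    def __init__(self, payload, chunk_size=4):
        self._payload = payload
        self._chunk_size = chunk_size

    def readChunkList(self, start_offset, end_offset):
        # Assumed semantics: return the bytes in [start_offset, end_offset)
        # split into chunks of at most chunk_size bytes.
        data = self._payload[start_offset:end_offset]
        return [data[i:i + self._chunk_size]
                for i in range(0, len(data), self._chunk_size)]


def iter_windows(data_stream, start_offset, end_offset, window_size):
    """Yield the range [start_offset, end_offset) as strings of at most window_size bytes."""
    offset = start_offset
    while offset < end_offset:
        stop = min(offset + window_size, end_offset)
        # Join only the chunks of the current window, never the whole range.
        yield b''.join(data_stream.readChunkList(offset, stop))
        offset = stop


my_data_stream = FakeDataStream(b'0123456789abcdef')
windows = list(iter_windows(my_data_stream, 2, 13, window_size=5))
```

Each yielded string can be processed and discarded before the next window is read, so memory use is bounded by window_size rather than by the size of the range.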

Get Data From Data Bucket Stream

bucket_list = my_data_bucket_stream.readBucketList(start_offset, bucket_count)

Returns a list of buckets. Each bucket is a string.
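To walk through all buckets without requesting them in a single call, the call above can be repeated in batches. This is only a sketch: FakeDataBucketStream is a hypothetical stand-in, and it assumes that start_offset counts buckets (not bytes) and that an empty list is returned once the end is reached. iter_buckets and batch_size are illustrative names.

```python
class FakeDataBucketStream(object):
    """Hypothetical stand-in for a Data Bucket Stream; it only mimics readBucketList."""

    def __init__(self, bucket_list):
        self._bucket_list = list(bucket_list)

    def readBucketList(self, start_offset, bucket_count):
        # Assumed semantics: start_offset is a bucket index, and the result
        # holds at most bucket_count buckets starting at that index.
        return self._bucket_list[start_offset:start_offset + bucket_count]


def iter_buckets(data_bucket_stream, batch_size):
    """Yield every bucket, fetching batch_size buckets per call."""
    offset = 0
    while True:
        bucket_list = data_bucket_stream.readBucketList(offset, batch_size)
        if not bucket_list:
            break  # assumed end-of-stream signal: an empty list
        for bucket in bucket_list:
            yield bucket
        offset += len(bucket_list)


my_data_bucket_stream = FakeDataBucketStream(
    [b'a,1', b'b,2', b'c,3', b'd,4', b'e,5'])
all_buckets = list(iter_buckets(my_data_bucket_stream, batch_size=2))
```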

Work with Data Array

my_zbigarray = my_data_array.getArray()
# check if zbigarray is already initialised:
if my_zbigarray is None:
  my_zbigarray = my_data_array.initArray(shape, dtype)
# append to zbigarray (my_new_ndarray: an ndarray whose dtype matches the array)
my_zbigarray.append(my_new_ndarray)
# get a slice from zbigarray as ndarray
my_ndarray = my_zbigarray[start:end]
# get whole zbigarray as ndarray (careful, zbigarray might be very big)
my_ndarray = my_zbigarray[:]
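Because loading the whole array at once may not fit in memory, one option is to process it slice by slice. The sketch below shows that pattern with a plain Python list standing in for the zbigarray, so it runs without Wendelin. iter_slices, length and block_size are illustrative names. With a real zbigarray the length would presumably come from its shape (for example shape[0]); that is an assumption here.

```python
def iter_slices(big_array, length, block_size):
    """Yield (start, block) pairs covering big_array[0:length] in order."""
    for start in range(0, length, block_size):
        stop = min(start + block_size, length)
        # Only this slice is materialised, not the whole array.
        yield start, big_array[start:stop]


# Stand-in data: a plain list plays the role of the zbigarray here.
my_zbigarray = list(range(10))

total = 0
block_lengths = []
for start, block in iter_slices(my_zbigarray, len(my_zbigarray), block_size=4):
    block_lengths.append(len(block))
    total += sum(block)
```

Keeping a running result (here a sum) per block means only block_size elements need to be held as an in-memory array at any time.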