Introduction

nuts-flow is a data processing pipeline that is largely based on Python’s itertools.

nuts are thin wrappers around itertools functions and provide a >> operator to chain iterators into pipelines that form data flows. The result is a data flow that is easier to read.
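To illustrate how a >> operator can chain plain iterators, here is a minimal sketch. It is not nuts-flow's actual implementation: the Pipe class and the where, take and collect helpers are hypothetical names invented for this example. The sketch only assumes that Python falls back to the right operand's __rrshift__ method when the left operand does not support >>.

```python
from itertools import islice


class Pipe:
    """Hypothetical wrapper for illustration; not part of nuts-flow."""

    def __init__(self, func, *args):
        self.func = func
        self.args = args

    def __rrshift__(self, iterable):
        # Python calls this for `iterable >> pipe` because ranges and
        # iterators do not define >> themselves.
        return self.func(iterable, *self.args)


def where(pred):
    # Lazily keep the elements for which pred is true.
    return Pipe(lambda it, p: filter(p, it), pred)


def take(n):
    # Lazily keep at most the first n elements.
    return Pipe(lambda it, k: islice(it, k), n)


def collect():
    # Consume the iterator and return its elements as a list.
    return Pipe(list)


result = range(10) >> where(lambda x: x > 5) >> take(3) >> collect()
print(result)
```

Because >> is left-associative, each step receives the iterator produced by the step before it, so the processing order matches the reading order.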

The following two examples show the same data flow. The first uses Python’s itertools and the second uses nuts-flow:

>>> from itertools import islice
>>> list(islice(filter(lambda x: x > 5, range(10)), 3))
[6, 7, 8]
>>> from nutsflow import Range, Filter, Take, Collect, _
>>> Range(10) >> Filter(_ > 5) >> Take(3) >> Collect()
[6, 7, 8]

Both data flows extract the first three integers in the interval [0, 9] that are greater than five. However, the linear arrangement of processing steps with nuts-flow is easier to read than the nested calls of itertools functions.

nuts-flow is the base library for nuts-ml, a data pre-processing pipeline for deep learning.