Divide and conquer¶
It is frequently necessary to either split a data flow into multiple flows or combine data flows. The following nuts are specifically designed for this purpose. In this context the Partition and the MapMulti nuts might be of interest as well.
Zip¶
Zip(*iterables) combines two or more iterables like a zipper taking at
every step an element from each iterable and outputting a tuple of the
grouped elements. Here an example
>>> from nutsflow import *
>>> numbers = [0, 1, 2]
>>> letters = ['a', 'b', 'c']
>>> numbers >> Zip(letters) >> Collect()
[(0, 'a'), (1, 'b'), (2, 'c')]
Zip finishes when the shortest iterable is exhausted. See
>>> Range(100) >> Zip('abc') >> Collect()
[(0, 'a'), (1, 'b'), (2, 'c')]
Note that Zip can zip more than two iterables:
>>> '12' >> Zip('ab', '+-') >> Collect()
[('1', 'a', '+'), ('2', 'b', '-')]
If the output of Zip is required to be flat Flatten can be called
>>> [0, 1, 2] >> Zip('abc') >> Flatten() >> Collect()
[0, 'a', 1, 'b', 2, 'c']
but using Interleave is simpler in this case.
Instead of nuts-flow’s Zip, Python’s zip could be used alternatively:
>>> zip(numbers, letters) >> Print() >> Consume()
(0, 'a')
(1, 'b')
(2, 'c')
Unzip¶
Unzip(container=None) reverses a Zip operation:
>>> numbers, letters = [0, 1, 2] >> Zip('abc') >> Unzip()
>>> list(numbers)
[0, 1, 2]
>>> list(letters)
['a', 'b', 'c']
Per default Unzip returns iterators but often the results are required
as lists or other collections (see above). Unzip allows to provide a
container to collect the results:
>>> zip([0, 1, 2], 'abc') >> Unzip(list) >> Collect()
[[0, 1, 2], ['a', 'b', 'c']]
This equivalent to Unzip() >> Map(list) >> Collect() but shorter.
Interleave¶
Interleave works like Zip but does not group zipped results in
tuples. Instead an iterator over a flattened sequence of interleaved elements
is returned:
>>> numbers = [0, 1, 2]
>>> letters = ['a', 'b', 'c']
>>> numbers >> Interleave(letters) >> Collect()
[0, 'a', 1, 'b', 2, 'c']
Also in contrast to Zip, Interleave does not stop when the shortest
input iterable is depleted. Elements are returned until all inputs are
depleted:
>>> Range(10) >> Interleave('abc') >> Collect()
[0, 'a', 1, 'b', 2, 'c', 3, 4, 5, 6, 7, 8, 9]
Concat¶
Apart from zipping or interleaving iterators, they can also be concatenated
using Concat:
>>> Range(5) >> Concat('abc') >> Collect()
[0, 1, 2, 3, 4, 'a', 'b', 'c']
>>> '12' >> Concat('abcd', [3, 4, 5]) >> Collect()
['1', '2', 'a', 'b', 'c', 'd', 3, 4, 5]
Note that Concat is memory efficient and does not materialize any of the
input iterables or the concatenated result in memory; e.g. in contrast to the
following code:
>>> list(Range(5)) + list('abc')
[0, 1, 2, 3, 4, 'a', 'b', 'c']
Tee¶
Tee([n=2]) creates multiple independent iterators from a single iterable.
>>> numbers1, numbers2  = Range(5) >> Tee(2)
>>> numbers1 >> Collect()
[0, 1, 2, 3, 4]
>>> numbers2 >> Collect()
[0, 1, 2, 3, 4]
Tee is only useful if the returned iterators are advanced largely
synchronously. Otherwise the memory consumption is identical to simply
materializing the input iterable and referencing it, e.g.
>>> numbers1 = Range(5) >> Collect()
>>> numbers2 = numbers1
A simple example where Tee is useful would be to add each number in the
input iterable to its predecessor:
>>> add = lambda a, b: a + b
>>> numbers1, numbers2  = Range(5) >> Tee(2)
>>> numbers1 >> Drop(1) >> Map(add, numbers2) >> Collect()
[1, 3, 5, 7]
Iterators, in contrast to streams, do not allow to go back and Tee provides
a way to overcome this limitation.