Transforming images

Images are rarely in the shape, format or condition suitable for training and need to be transformed in some fashion. nuts-ml provides a wide and easily extensible range of transformation functions.

In the following example we resize two input images, read from disk, to width 64 and height 128, using TransformImage:

>>> imagenames = ['color.jpg', 'grayscale.jpg']
>>> read_image = ReadImage(None, 'tests/data/img_formats/nut_*')
>>> resize = TransformImage(0).by('resize', 64, 128)
>>> imagenames >> read_image >> resize >> PrintColType() >> Consume()
item 0: <tuple>
  0: <ndarray> shape:128x64x3 dtype:uint8 range:0..242
item 1: <tuple>
  0: <ndarray> shape:128x64 dtype:uint8 range:23..235

TransformImage extracts the images from column 0 of the tuples returned by ReadImage and applies the transformation specified via the by method. As the output of PrintColType shows, the resulting images are indeed of the specified shape, with 128 rows, 64 columns and, in the case of color images, a channel axis.
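Note that resize takes the width first and the height second, while the array shape lists rows (height) before columns (width). The following dependency-free sketch makes this mapping explicit. The helper expected_shape is hypothetical and not part of nuts-ml:

```python
def expected_shape(width, height, channels=None):
    """Return the array shape (rows, cols[, channels]) for an image size.

    Hypothetical helper for illustration only; not part of nuts-ml.
    """
    # Rows correspond to the image height, columns to the image width.
    shape = (height, width)
    # Color images carry an additional channel axis; grayscale images do not.
    return shape + (channels,) if channels else shape


print(expected_shape(64, 128, 3))  # color image: (128, 64, 3)
print(expected_shape(64, 128))     # grayscale image: (128, 64)
```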

Transformations can be chained. For instance, we can easily resize the images, adjust the contrast and convert all images to RGB format:

>>> normalize = TransformImage(0).by('resize', 64, 128).by('contrast', 1.1).by('gray2rgb')
>>> imagenames >> read_image >> normalize >> PrintColType() >> Consume()
item 0: <tuple>
  0: <ndarray> shape:128x64x3 dtype:uint8 range:0..250
item 1: <tuple>
  0: <ndarray> shape:128x64x3 dtype:uint8 range:0..241

As you can see, all images now have a channel axis, have a larger range (due to the contrast adjustment) and are of the specified dimensions.

See TransformImage in transformer.py for a list of available transformations or run help(TransformImage.by). Each transformation can also be used for image augmentation (more on that later). Custom transformations can be added via register:

>>> def my_brightness(image, c):
...     return (image * c).astype('uint8')
>>> TransformImage.register('my_brightness', my_brightness)

>>> normalize = TransformImage(0).by('resize', 64, 128).by('my_brightness', 0.5)
>>> imagenames >> read_image >> normalize >> PrintColType() >> Consume()
item 0: <tuple>
  0: <ndarray> shape:128x64x3 dtype:uint8 range:0..121
item 1: <tuple>
  0: <ndarray> shape:128x64 dtype:uint8 range:11..117
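
To illustrate the general pattern behind register and by, here is a minimal, dependency-free sketch of a name-based registry with chainable steps. It is not the nuts-ml implementation: the class ChainedTransform and the transformations 'scale' and 'invert' are made up for this example, and images are plain lists of pixel values rather than NumPy arrays.

```python
class ChainedTransform:
    """Toy stand-in for a transformer with named, chainable steps (not nuts-ml code)."""

    _registry = {}  # maps transformation name -> function(image, *args)

    def __init__(self):
        self._steps = []  # (name, args) pairs in the order they were added

    @classmethod
    def register(cls, name, func):
        """Make func available under the given name."""
        cls._registry[name] = func

    def by(self, name, *args):
        """Append a registered transformation and return self to allow chaining."""
        if name not in self._registry:
            raise ValueError('Unknown transformation: ' + name)
        self._steps.append((name, args))
        return self

    def __call__(self, image):
        """Apply all steps to the image in the order they were chained."""
        for name, args in self._steps:
            image = self._registry[name](image, *args)
        return image


# Images are plain lists of pixel values to keep the sketch dependency-free.
ChainedTransform.register('scale', lambda image, c: [int(p * c) for p in image])
ChainedTransform.register('invert', lambda image: [255 - p for p in image])

transform = ChainedTransform().by('scale', 0.5).by('invert')
print(transform([0, 100, 200]))  # [255, 205, 155]
```

Because by returns the transformer itself, steps can be appended in a single expression, and they are applied in the order in which they were chained.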

Note

In most cases image transformations expect RGB or grayscale images of data type uint8, though there are exceptions (e.g. rerange). When chaining transformations, make sure that the output format of each transformation matches the input format expected by the next one.

In addition, it is easy to implement custom nuts that can perform arbitrarily complex operations. For instance, instead of using TransformImage we can implement a custom transformation on the samples ourselves:

@nut_function
def ChangeBrightness(sample, c):
    image, label = sample
    new_image = (image * c).astype('uint8')
    return new_image, label

samples = [('nut_color.gif', 'color'), ('nut_monochrome.gif', 'mono')]
read_image = ReadImage(0, 'tests/data/img_formats/*')
samples >> read_image >> ChangeBrightness(0.5) >> PrintColType() >> Consume()

In this case, however, we also have to extract the image from column 0 of the sample ourselves and return a new sample containing the transformed image and the label.
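
For readers curious how a decorated function can take part in a >> pipeline, the following sketch shows one way such a decorator could work, using Python's reflected right-shift operator. It is a simplified stand-in and makes no claim about how nutsflow implements nut_function; MapNut and nut_function_sketch are hypothetical names, and images are plain lists of pixel values.

```python
class MapNut:
    """Toy stand-in for a nut that applies func to every item flowing through it."""

    def __init__(self, func, *args):
        self.func = func
        self.args = args

    def __rrshift__(self, iterable):
        # Called for 'iterable >> nut' because lists and generators define no '>>'.
        return (self.func(item, *self.args) for item in iterable)


def nut_function_sketch(func):
    """Turn func(item, *args) into a factory that creates MapNut objects."""
    def factory(*args):
        return MapNut(func, *args)
    return factory


@nut_function_sketch
def ChangeBrightness(sample, c):
    image, label = sample
    new_image = [int(p * c) for p in image]  # scale every pixel value
    return new_image, label


samples = [([10, 20], 'color'), ([100, 200], 'mono')]
result = list(samples >> ChangeBrightness(0.5))
print(result)  # [([5, 10], 'color'), ([50, 100], 'mono')]
```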

Note

Style guide: names of (custom) nuts are written in CamelCase to distinguish them from plain Python functions. Nuts are also implemented as classes, which is consistent with the use of CamelCase.

Transformations can be applied to multiple images in a sample. In the following code, each sample contains two images (columns 0 and 1) that are resized and converted to RGB:

>>> samples = [('color.jpg', 'monochrome.jpg'), ('color.png', 'monochrome.png')]
>>> read_image = ReadImage((0,1), 'tests/data/img_formats/nut_*')
>>> normalize = TransformImage((0,1)).by('resize', 64, 128).by('gray2rgb')
>>> samples >> read_image >> normalize >> PrintColType() >> Consume()
item 0: <tuple>
  0: <ndarray> shape:128x64x3 dtype:uint8 range:0..242
  1: <ndarray> shape:128x64x3 dtype:uint8 range:0..255
item 1: <tuple>
  0: <ndarray> shape:128x64x3 dtype:uint8 range:0..242
  1: <ndarray> shape:128x64x3 dtype:uint8 range:0..255
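
Conceptually, selecting columns means that a function is applied to the chosen elements of each sample tuple while all other elements pass through unchanged. The sketch below illustrates this idea under simplifying assumptions: transform_columns is a hypothetical helper, not a nuts-ml function, and images are plain lists of pixel values.

```python
def transform_columns(sample, columns, func):
    """Apply func to the given column(s) of a sample; leave other columns unchanged.

    Hypothetical helper for illustration only; not part of nuts-ml.
    """
    # Accept a single column index as well as a tuple of indices.
    if isinstance(columns, int):
        columns = (columns,)
    return tuple(func(value) if i in columns else value
                 for i, value in enumerate(sample))


def double(image):
    """Toy transformation: double every pixel value."""
    return [2 * p for p in image]


sample = ([1, 2], [3, 4], 'label')
print(transform_columns(sample, (0, 1), double))  # ([2, 4], [6, 8], 'label')
print(transform_columns(sample, 0, double))       # ([2, 4], [3, 4], 'label')
```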

TransformImage converts each input image to a corresponding output image. A common task, however, is to extend the training data set by creating multiple output images for an input image. These so-called augmentations are the topic of the next section.