When working in PyTorch, you are often faced with the need to manipulate tensors with multiple dimensions into various shapes, or to combine dimensions together. There’s a wealth of functions specifically for that, too. To name a few, view, permute, stack, tile, and concat are some of the most common ones, but the list goes on. One big problem with these functions, though, is that it can sometimes be really hard to wrap your head around what they are doing to your tensors.
Enter Einops. This library was introduced to me by a friend who also works in the CV industry, and it was clear from the beginning how it can help you write tensor manipulation code in a way that is a bit easier to reason about, and hopefully clearer to whoever is reading the code afterwards as well.
For a deep dive into all the different things Einops can do, check out their webpage. In this article, I will cover some clear use cases that I see for Computer Vision and then talk about a few instances where I would probably not use Einops over the default options.
Let’s quickly set up a data loader before we start so we can use it for some real-world examples.
import torch
from torchvision import datasets
from torchvision.transforms import ToTensor
from torch.utils.data import DataLoader
import matplotlib.pyplot as plt
from einops import rearrange, reduce, repeat, pack, unpack
training_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor()
)
train_dataloader = DataLoader(training_data, batch_size=16, shuffle=True)
images, labels = next(iter(train_dataloader))
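Before moving on, it helps to print the batch shape once, since every example below leans on the (batch, channels, height, width) layout:
print(images.shape)   # torch.Size([16, 1, 28, 28]) -> (batch, channels, height, width)
print(labels.shape)   # torch.Size([16])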
Some Potential Use Cases
Stacking Images
Having experimented with the library, I can see a few areas in my own projects that would immediately benefit from using einops. The first, super obvious situation is joining images into a stack. There are a variety of reasons to do this, but a very simple use case is when you want to visualize a batch of images quickly.
For me, someone who writes PyTorch fairly regularly, my first instinct is to grab stack or concat to do this. However, if you attempt that, you’ll remember that stack and concat both expect a list or tuple of tensors rather than a single tensor. Instead, you would want to use view in this situation. Here is what that code might look like:
stacked_image = images.view(images.shape[0] * images.shape[2], images.shape[3])
figure = plt.figure(figsize=(10, 10))
plt.imshow(stacked_image.squeeze(), cmap="gray")
As you can see, it is a bit clunky. You have to multiply different values from the tensor’s shape together to get the new shape that you want. You could simplify this a bit by unpacking the shape into intuitively named variables and then using those to calculate the new shape; a rough sketch of that approach:
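# Unpack the shape into named dimensions first, then build the new shape from them.
# This only works because the channel dimension c is 1 for FashionMNIST.
b, c, h, w = images.shape
stacked_image = images.view(b * h, w)
Even with named variables, though, it is still a little uncomfortable, and it isn’t always easy to remember right away what view actually does. If you use einops, not only can you assign an intuitive name to each dimension, but you can do it all in one function call, and it is very clear that you are “rearranging” the tensor, which is exactly what is happening.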
stacked_image = rearrange(images, 'b c h w -> c (b h) w')
figure = plt.figure(figsize=(10, 10))
plt.imshow(stacked_image.squeeze(), cmap="gray")
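As a side note, the same notation scales to fancier layouts with no extra machinery. Here is a sketch laying out the batch of 16 as a 4-by-4 grid instead of one tall column; b1 and b2 are just names I picked for the two halves of the split batch axis:
# Split the batch axis into 4 rows of 4 images and tile them into a grid.
grid_image = rearrange(images, '(b1 b2) c h w -> c (b1 h) (b2 w)', b1=4)
plt.imshow(grid_image.squeeze(), cmap="gray")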
Manipulating Bounding Boxes
Another situation that comes up a lot, for me anyway, is the need to join and separate bounding boxes and their corresponding predictions. Generally, I would use torch.concat to join the bounding boxes and predictions together. However, torch.concat requires that you specify which dimension to join on, and sometimes it can be hard to imagine how your data is getting concatenated at first glance.
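To make the snippets below concrete, here is a hypothetical setup; the shapes are my own assumption, with one (x1, y1, x2, y2) box per row and one confidence score per box:
num_boxes = 8
bounding_boxes = torch.rand(num_boxes, 4)   # (8, 4): one box per row
predictions = torch.rand(num_boxes)         # (8,): one score per box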
torch.concat([bounding_boxes, predictions.unsqueeze(dim=1)], dim=1)
In this case, you could use einops’ pack method instead. Using the einsum-inspired notation, you can see how the height is being kept while the widths are being joined together. To me it is slightly debatable whether this syntax is actually more readable than a regular concat call, but one thing that is very nice about pack is that it also returns a “Packed Shapes” variable, which is essentially a record of the way your data was packed.
bbox_and_pred, ps = pack([bounding_boxes, predictions], 'h *')
This can be very useful if you are concatenating and splitting the same data over and over again. Splitting my bounding boxes and predictions back up by hand would require a bit of fussing with tensor indices, but with unpack I can pass in my Packed Shapes variable and it splits the data back into the original parts very simply.
bounding_boxes, predictions = unpack(bbox_and_pred, ps, 'h *')
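A quick sanity check with the dummy data from above shows that the round trip restores the original shapes, with no unsqueeze or squeeze needed:
# pack joined (8, 4) and (8,) into one (8, 5) tensor; unpack uses the
# Packed Shapes list to give back the original shapes, missing axis included.
print(bbox_and_pred.shape)    # torch.Size([8, 5])
print(bounding_boxes.shape)   # torch.Size([8, 4])
print(predictions.shape)      # torch.Size([8])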
The only downside I can see is having to keep track of the Packed Shapes variable later on. There is a good chance that if you returned your packed data from a function, you would accidentally lose the Packed Shapes variable, or you would have to work out a way to return it from the function as well. All in all, I don’t think it is a huge problem though.
Where I Wouldn’t Use Einops
In addition to all of the wonderful examples on the documentation page for einops, there are also some examples of things like MaxPool and image augmentation. To be honest, as much as I think rearrange is clearer than concat or view, I don’t believe that using it in place of MaxPool is easier to reason about.
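For comparison, here is roughly what the two versions look like side by side. The reduce pattern is the 2x2 max-pooling example from the einops documentation, and x is just a dummy batch of images:
x = torch.randn(16, 3, 32, 32)
# Built-in version: the operation is right there in the name.
pooled_torch = torch.nn.MaxPool2d(kernel_size=2)(x)
# einops version: split h and w into 2x2 blocks, then take the max over each block.
pooled_einops = reduce(x, 'b c (h h2) (w w2) -> b c h w', 'max', h2=2, w2=2)
assert torch.allclose(pooled_torch, pooled_einops)  # same result either way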
My recommendation for anyone using einops is to think hard about what is actually more readable. In many cases, rearrange and pack can be much more intuitive than the names and usage of the default torch methods. However, as in any programming language, attempting to be too clever can result in unreadable code.
Source Code
As always, here is the source code for my post. This time it’s just a Jupyter notebook, since I’m simply illustrating some of the einops functions.
Conclusion
Einops is a great package for tensor manipulation that can make your code more readable and save you time when working out how to manipulate your tensors. Just remember to be sensible when using it, and prefer the simplest way to write something over the most clever way.