Here is one thing that every engineer working with AI hears at some point:
Garbage In, Garbage Out
Despite that, one thing I still see often in the computer vision industry, and which no doubt affects other machine learning disciplines as well, is a complete lack of checking what comes out of the data loader.
Once I was working with a client whose object detection model performed okay, but every attempt to improve it failed. They tried tweaking augmentation parameters, changing model hyperparameters, and adding more data, but nothing made a notable difference.
When I finally got my hands on the source code, something felt a bit fishy about their augmentations. I did a bit of digging and found that their object detection model had originally been written in Chainer. The model itself was completely rewritten in TensorFlow, but the augmentations had simply been copied over on the assumption that they would work exactly as they did in the original Chainer implementation.
The first thing I did was implement a visualizer that takes a batch from the training data loader and displays it with the bounding boxes. It turned out that the augmentation functions didn't work as expected, and the result was bounding boxes that were nowhere near the objects they were associated with.
After that, I removed all the Chainer augmentation functions, replaced them with an augmentation library I knew worked well with TensorFlow, double-checked the result by visualizing a batch again, and then retrained the model. From then on, the model performed much better, and the client learned a valuable lesson about GIGO.
So How Do I Visualize?
The first way is simply to display your batch with matplotlib or save it as an image using OpenCV. I usually stack my images into one big image for convenience. Just make sure to draw the bounding boxes on the images before you visualize them!
Another great way is to log your images to Weights &amp; Biases. If you use the PyTorch Lightning WandbLogger, it's even easier, because the logger provides a log_image function. One thing you could even consider is automating the logging of a batch of images before each training run, so you can go back and see what a batch looked like if a training run doesn't go well.
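That automation can be as small as the sketch below. The function name and the assumption that each batch is an `(images, targets)` pair are mine; the `log_image(key=..., images=...)` call matches the interface PyTorch Lightning's WandbLogger exposes, so you could call this once at the start of training.

```python
def log_first_batch(logger, dataloader, key="train_batch"):
    """Log one batch of images before training so you can audit it later.

    logger: assumed to expose a log_image(key=..., images=...) method,
            as PyTorch Lightning's WandbLogger does.
    dataloader: assumed to yield (images, targets) batches.
    """
    images, _targets = next(iter(dataloader))
    logger.log_image(key=key, images=list(images))
```

Hooked up before `trainer.fit(...)`, this gives every run a permanent record of what its input actually looked like.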
A Word of Caution
When you visualize your batch, make sure to undo normalization, or it will be hard to tell what your images look like. Likewise, don't turn normalization off just to visualize and then forget to turn it back on before passing images to your model.
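Undoing normalization is just the inverse of the transform your pipeline applied. A minimal sketch, assuming channel-wise `(img - mean) / std` normalization with the common ImageNet constants (substitute your own pipeline's values):

```python
import numpy as np

# ImageNet normalization constants (an assumption; use your pipeline's values).
MEAN = np.array([0.485, 0.456, 0.406])
STD = np.array([0.229, 0.224, 0.225])


def denormalize(img):
    """Invert (img - MEAN) / STD and convert back to uint8 for display.

    img: float array of shape (H, W, 3), normalized channel-wise.
    """
    img = img * STD + MEAN             # undo the normalization
    img = np.clip(img, 0.0, 1.0)       # guard against values outside [0, 1]
    return (img * 255).round().astype(np.uint8)
```

Run your batch through this right before displaying it, and leave the training path untouched.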
Putting garbage into your model can be detrimental to getting any sort of good performance. Always visualize your batches before you start training!