Computer Vision: How Your Laptop Sees Things

You may have already noticed that computers being able to understand images has become widespread. When you open your photo album app and search for “baby kittens”, the computer somehow knows you are looking for adorable little furballs instead of last night’s dinner or whatever else you thought was an appropriate target for your photography skills. On a more critical level, self-driving cars need to be able to differentiate between humans, bicycles, cars and, in a worst-case scenario, baby kittens when driving. So how exactly does a computer see using images?

The Two Big Categories of Computer Vision Algorithms

To put it simply, computers use algorithms to understand images. The algorithms used to understand images and perform tasks based on that information are called Computer Vision algorithms. Generally speaking, they can be split into two categories: traditional and deep learning-based. Traditional algorithms exploit properties of a photo to accomplish tasks such as finding edges, detecting specific shapes and determining corresponding points in two photos. Deep learning algorithms, on the other hand, use neural networks to “digest” an image and use that digested output to perform tasks that are impossible for many traditional algorithms. These include classifying images (e.g. is it a dog or a cat?), detecting specific kinds of objects in an image, or following a person as they move in a video.
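To make the “traditional” side a little more concrete, here is a minimal sketch of edge finding using the Sobel operator, written in plain Python so it runs without any libraries. The tiny test image, the function names and the threshold value are my own assumptions for illustration; in a real project you would more likely reach for a library such as OpenCV.

```python
# Toy example: Sobel edge detection in pure Python (no third-party libraries).
# The image is a small grayscale grid; values are brightness from 0 to 255.

# Standard 3x3 Sobel kernels for horizontal and vertical brightness changes.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]


def sobel_magnitude(image):
    """Return the gradient magnitude for every interior pixel of `image`.

    Border pixels are left at 0.0 because the 3x3 kernel does not fit there.
    """
    height, width = len(image), len(image[0])
    result = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    pixel = image[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * pixel
                    gy += SOBEL_Y[dy][dx] * pixel
            result[y][x] = (gx * gx + gy * gy) ** 0.5
    return result


def find_edges(image, threshold=200):
    """Mark pixels whose gradient magnitude exceeds `threshold`.

    The default threshold is an arbitrary value chosen for this toy image.
    """
    return [[magnitude > threshold for magnitude in row]
            for row in sobel_magnitude(image)]


# A 5x6 image: dark on the left half, bright on the right half.
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
edges = find_edges(image)
for row in edges:
    print("".join("#" if is_edge else "." for is_edge in row))
```

The printed grid marks the pixels on either side of the dark-to-bright boundary. Notice that nothing here was trained: the algorithm only exploits the fact that brightness changes sharply at an edge.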

So Which Algorithms are Better? Traditional or Deep Learning Algorithms?

Truth be told, neither kind of algorithm is a perfect solution to all computer vision problems. There are tradeoffs to make when selecting one or the other. Deep learning algorithms can be extremely powerful, so it is tempting to reach for them first, but traditional algorithms have two major advantages: they tend to be very efficient, in the sense that they require less processing power, and they do not require any training before use. Under the right circumstances, a solution built on traditional algorithms can be both faster and cheaper to develop than a deep learning solution.

The important thing is that you consider both possibilities when starting a new project. My recommendation would be to always start by checking whether there are traditional algorithms that solve your problem and whether they work effectively on the types of images you want to process. If there are none, or the ones that exist don’t work with your images, then check whether there are pre-trained deep learning models you can use. This can give you the benefits of deep learning without the cost and time commitment of training a model yourself. Finally, if neither of those is an option, then you know that you need to train your own model.
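If it helps to see that order of checks written down, here is a small sketch of the decision as a function. The function and parameter names are hypothetical, and the two yes/no answers are something you would work out yourself by trying candidates on your own images.

```python
def choose_approach(traditional_works, pretrained_works):
    """Pick the cheapest option that works, checking in the recommended order.

    Both arguments are booleans you determine by testing on your own images;
    the names are made up for this illustration.
    """
    if traditional_works:
        return "traditional algorithm"
    if pretrained_works:
        return "pre-trained deep learning model"
    return "train your own model"


# Example: no traditional algorithm fits, but a pre-trained model does.
print(choose_approach(traditional_works=False, pretrained_works=True))
```

The point of checking in this order is that each step costs more time and money than the one before it, so you only pay for training when nothing cheaper works.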

Learning About Computer Vision Algorithms

It’s impossible for me to cover all the different kinds of algorithms in this blog post (although I will be working to cover more topics as I go along!), so if you would like to learn more about each kind of algorithm, here are a few resources for you.

For traditional algorithms, browsing the OpenCV documentation is a great start. OpenCV is a library for C++ and Python that offers a wide variety of tools for working with traditional computer vision algorithms. Likewise, learnopencv.com, a website dedicated to teaching how to use OpenCV, is another good resource and provides many examples with source code.

For deep learning algorithms, the PyTorch documentation has many tutorials on Computer Vision that are a good place to start. If you want an even deeper dive, check out PapersWithCode and arXiv to find the latest Deep Learning research on Computer Vision.

Conclusion

Giving computers the ability to “see”, to understand images and perform tasks based on that understanding, has opened a whole new world of possibilities for what we can do. The field of Computer Vision is expanding on a daily basis and new algorithms are coming out all the time. The algorithms I mentioned today are just a small portion of all the different kinds out there. If you are interested in using Computer Vision for a project or product, or are even just curious about the technology, then make sure to take a little bit of time to learn about both categories of algorithms we talked about today. With a little luck, you might even find that information here on Perception ML!

