A Starting Point for Learning Visual SLAM

If you are like me, you are probably fascinated by the idea of SLAM, and since you are here reading Perception ML articles, you are probably even more excited about Visual SLAM. As part of my effort to become a better engineer, I have been doing some self-study on Visual SLAM.

Of course, it’s not exactly a beginner-friendly topic, so there aren’t a ton of resources out there on it. However, there is one free book that most people point to, and it is a great starting point for learning about Visual SLAM. It’s called “Introduction to Visual SLAM”, and it can be obtained for free from the author’s GitHub repository. In addition to covering a wide range of useful topics, it includes C++ code samples that you can experiment with to learn more about the content in each chapter.

It has 13 chapters split into the following topics:

Part 1

  1. Introduction to Visual SLAM
  2. 3D Rigid Body Motion
  3. Lie Group and Lie Algebra
  4. Cameras and Images
  5. Non-linear Optimization

Part 2

  1. Visual Odometry: Part 1
  2. Visual Odometry: Part 2
  3. Filters and Optimization Approaches: Part 1
  4. Filters and Optimization Approaches: Part 2
  5. Loop Closure
  6. Dense Reconstruction
  7. Practice: Stereo Visual Odometry
  8. Discussions and Outlook

This touches on a lot of the core knowledge necessary for doing Visual SLAM, but I want to emphasize the word “touches” here. The book makes a point of keeping things simple by primarily introducing the parts of each topic directly related to Visual SLAM. This is great because you don’t spend forever covering math that isn’t directly related to what you want to do. On the other hand, it can also be a problem, because you will sometimes lack knowledge and intuition that would help you understand the topics better.

How I Recommend Using this Book

One issue I have run into while reading it is not having enough background knowledge about some of the topics covered. This is a problem because, for me at least, it leads to skimming over the parts I don’t know and missing important information.

Instead of skimming, I highly recommend taking a deeper dive into each topic. All of these topics have great standalone resources that you can use to catch up on areas you are unfamiliar with, which will help you better understand the concepts in this book.

Another thing you should do is bounce questions off someone who knows the topic. Of course, that might not always be an option. I don’t have a SLAM engineer waiting to answer my questions at any given time either. In that case, try asking ChatGPT about things you don’t understand, and ask it to confirm ideas you think you understand. You can even get it to quiz you. Just remember to keep an eye out for fishy responses. We are all adults here, though. You should know not to trust ChatGPT blindly.

Anyway, here are a few useful resources I found for some of the individual concepts. I’ll add more as I find them.

Some Useful Supplementary Resources

Lie Group and Lie Algebra:  Tom Drummond’s slides on Lie Groups and Lie Algebra

Cameras and Images: First Principles of Computer Vision

Kalman Filters: kalmanfilters.net


Learning Visual SLAM is a large task. Introduction to Visual SLAM is a great starting point, but make sure to look for supporting resources to fill in the blanks where you need them. Most of the resources are from famous universities like Columbia and Stanford, so you can even think of it as giving yourself a top-class education! Good luck with your studies, and feel free to drop me a message if you have any questions.

