For anyone reading articles on this website, I don’t think I need to stress the importance of logging metrics for ML models, but when it comes to LibTorch, things just aren’t quite as easy as they are with PyTorch. You could, of course, save everything to a CSV file and visualize it yourself later on, but it’d be nice to see results in real time while your model is actually training. Beyond that, you’ll likely also want an organized way to keep track of your training runs.
If you look around a bit, there are some options that might work, but it’s difficult to decide which one is actually best. When working with PyTorch, all of the major logging platforms typically provide official Python APIs, but in C++ (at least as far as I can tell) you have two options:
- Using a ready-made library from a repo on GitHub
- Creating your own logger
Ready-made Options
Jumping straight into the topic, here are a few ready-made options if you just want to grab something and go. This route is convenient because there isn’t much thinking involved: you just add the library to your project and follow its usage instructions to get started. (A short usage sketch for the TensorBoard option follows the list.)
- Logging in TensorBoard using this logger by RustingSword:
https://github.com/RustingSword/tensorboard_logger
- Logging in MLFlow using this logger by Estshorter:
https://github.com/estshorter/mlflow.cpp
- Logging in Weights & Biases using this logger by yhisaki:
https://github.com/yhisaki/wandb-cpp
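For instance, the TensorBoard logger writes scalar values to an event file that you then point TensorBoard at. Here is a rough usage sketch based on the tensorboard_logger README; double-check the repository for the exact, current API.

// Rough sketch of logging a scalar with RustingSword/tensorboard_logger.
// TensorBoard only picks up event files whose names contain "tfevents".
#include "tensorboard_logger.h"

int main() {
    TensorBoardLogger logger("logs/tfevents.pb");

    // Write one scalar per step, then run TensorBoard with --logdir logs
    for (int step = 0; step < 10; ++step) {
        float loss = 1.0f / (step + 1);
        logger.add_scalar("loss", step, loss);
    }
}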
If, for some reason, the above options don’t work for you or you’d prefer an alternative where you can interface directly with an API provided by a logging tool, then you can also choose to create your own logger.
Creating Your Own Logger
MLFlow offers a REST API in addition to providing packages for Python, R, and Java, meaning that if you have an HTTP request library available to you, you can interface with it directly. (This might be convenient for projects already using Boost, which includes its own HTTP library, Boost.Beast.)
MLFlow provides endpoints for creating and finding experiments and runs as well as logging metrics, so you can write your own class that creates a new experiment and run and then logs your metrics. The interface might look something like this:
#include <vector>

#include "mlflow_logger.h" // header for the custom logger class (name it whatever fits your project)

int main() {
    // Point the logger at the tracking server and give the experiment a name
    MLFlowLogger logger{"mlflow:5050", "test2"};

    // Placeholder loss values standing in for a real training loop
    std::vector<float> loss = {300.0, 100.0, 50.0, 10.0, 5.0, 1.0, 0.5};
    int step = 0;
    for (auto value : loss) {
        logger.log_metric("loss", value, step);
        step++;
    }
}
Obviously, I’m just using placeholder values for the loss here, so you’d want to incorporate this into your training loop, but that shouldn’t be too hard to do.
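As a rough illustration, dropping the logger into a LibTorch training loop might look something like the sketch below; the model, optimizer, tensors, and header name are all placeholders for whatever your project actually uses.

// Minimal sketch: logging the training loss from a LibTorch loop.
#include <torch/torch.h>

#include "mlflow_logger.h" // the custom logger class from above (header name is illustrative)

void train(torch::nn::Sequential& model, torch::optim::SGD& optimizer,
           const torch::Tensor& input, const torch::Tensor& target,
           MLFlowLogger& logger, int num_epochs) {
    for (int epoch = 0; epoch < num_epochs; ++epoch) {
        optimizer.zero_grad();
        auto output = model->forward(input);
        auto loss = torch::mse_loss(output, target);
        loss.backward();
        optimizer.step();

        // Log the scalar loss value for this epoch
        logger.log_metric("loss", loss.item<float>(), epoch);
    }
}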
I would share the full implementation of the logger here, but that would make this post extremely long, so if you would like to see a simple example of a working logger (with MLFlow set up in a Docker container), check out mine here. I make use of the restclient-cpp library and a simple JSON library to build a very basic logger.
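To give a rough idea of what goes on under the hood, here is a minimal sketch of the REST call behind a log_metric method, assuming restclient-cpp and nlohmann/json (the JSON library here is an assumption, and the run ID would come from an earlier call to MLFlow’s runs/create endpoint).

// Hypothetical sketch of log_metric built on MLFlow's runs/log-metric REST endpoint.
#include <chrono>
#include <string>

#include <nlohmann/json.hpp>
#include "restclient-cpp/restclient.h"

// tracking_uri would be something like "http://mlflow:5050"
bool log_metric(const std::string& tracking_uri, const std::string& run_id,
                const std::string& key, float value, int step) {
    using namespace std::chrono;

    nlohmann::json body;
    body["run_id"] = run_id;
    body["key"] = key;
    body["value"] = value;
    body["step"] = step;
    // MLFlow expects a Unix timestamp in milliseconds
    body["timestamp"] =
        duration_cast<milliseconds>(system_clock::now().time_since_epoch()).count();

    RestClient::Response response = RestClient::post(
        tracking_uri + "/api/2.0/mlflow/runs/log-metric",
        "application/json", body.dump());

    return response.code == 200;
}

Creating the experiment and the run works the same way against the experiments/create and runs/create endpoints, each of which returns the IDs you need for subsequent calls.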
Conclusion
There are several ways to log metrics for your LibTorch models, and what you need for your project will ultimately come down to your own constraints. There is no wrong answer, but definitely consider which method will be easier for you to develop and maintain as your project continues.
References
For information on the MLFlow REST API, check out the MLFlow documentation.