Posts by Nick Romero

PyTorch Offline Tuning with TunableOp

24 February 2026

In an earlier blog post, we explored how PyTorch TunableOp can potentially accelerate models through online tuning - where during model execution, PyTorch benchmarks and selects optimal BLAS kernels. While online tuning is effective, it introduces overhead due to the time needed to execute the ML model from end-to-end. If this is done once, the overhead may be acceptable, but for repeated tuning it may be cost-prohibitive to keep re-running the model.

Read more ...

Empowering Developers to Build a Robust PyTorch Ecosystem on AMD ROCm™ with Better Insights and Monitoring

21 October 2025

The PyTorch ecosystem is a vibrant and expansive collection of tools, libraries, and community-driven projects that enhance and extend the core PyTorch framework. It empowers researchers and developers to build, train, and deploy deep learning models across a wide range of domains with flexibility and efficiency.

Read more ...