monitoring Archives - ML Journey

Best Practices for Monitoring ML Models in AWS

November 9, 2025 by Peter Song

Machine learning models deployed to production require continuous monitoring to maintain their effectiveness and reliability. Unlike traditional software, where bugs manifest as clear errors, ML models degrade silently as data distributions shift, business contexts evolve, and edge cases emerge that weren’t present in training data. AWS provides comprehensive monitoring capabilities through SageMaker Model Monitor, CloudWatch, …

Monitoring Machine Learning Models with Prometheus and Grafana

September 14, 2025 by Peter Song

Machine learning models in production require continuous monitoring to ensure they perform as expected over time. Unlike traditional software applications, ML models face unique challenges including data drift, concept drift, and model degradation that can silently erode performance. This comprehensive guide explores how to leverage Prometheus and Grafana to build robust monitoring systems for your …

How to Monitor Machine Learning Models in Production

November 2, 2025 / July 19, 2025 by Peter Song

Deploying a machine learning model to production is just the beginning of your ML journey. The real challenge lies in ensuring your model continues to perform effectively over time. Without proper monitoring, even the most sophisticated models can silently degrade, leading to poor business outcomes and eroded user trust. Machine learning model monitoring in production …