AWS Archives - ML Journey

Best Practices for AWS DMS Monitoring and Logging

December 7, 2025 by Peter Song

AWS Database Migration Service (DMS) has become the go-to solution for migrating databases to AWS, enabling everything from simple lift-and-shift moves to complex heterogeneous migrations and ongoing replication for hybrid architectures. Yet the power of DMS comes with operational complexity: replication tasks can lag, fail silently during full loads, encounter data type conversion errors, or experience network …

How to Use AWS Data Pipeline for Machine Learning

December 6, 2025 by Peter Song

Machine learning workflows are inherently data-intensive, requiring orchestration of complex sequences: data extraction from multiple sources, transformation and cleaning, feature engineering, model training, validation, and deployment. Managing these workflows manually quickly becomes unsustainable as complexity grows. AWS Data Pipeline, a web service for orchestrating and automating data movement and transformation, provides infrastructure for building reliable, …

Deploying Debezium on AWS ECS or Fargate

December 1, 2025 by Peter Song

Debezium’s change data capture capabilities transform databases into event streams, enabling real-time data pipelines, microservices synchronization, and event-driven architectures. While Kafka Connect provides the standard deployment model for Debezium connectors, running this infrastructure on AWS demands careful consideration of container orchestration options. ECS (Elastic Container Service) and Fargate offer distinct approaches to deploying Debezium: ECS provides …

Custom Model Deployment with SageMaker Endpoints

November 23, 2025 by Peter Song

Deploying machine learning models to production is one of the most critical yet challenging phases of any ML project. While training a model that achieves excellent accuracy on test data is an accomplishment, the real value emerges only when that model serves predictions reliably at scale. Amazon SageMaker Endpoints provide a powerful managed infrastructure for …

Real-World AWS ML Use Cases in Retail and Marketing

November 22, 2025 by Peter Song

Machine learning has transitioned from experimental technology to core business infrastructure in retail and marketing. Companies leveraging AWS ML services report measurable improvements: conversion rate increases of 15-40%, customer acquisition cost reductions of 20-35%, and inventory efficiency gains exceeding 25%. These aren’t aspirational projections but documented results from organizations that moved beyond pilot projects to production …

AWS Textract Machine Learning Use Cases

November 22, 2025 by Peter Song

Amazon Textract represents a significant advancement in document processing, leveraging machine learning to automatically extract text, handwriting, tables, and structured data from scanned documents. Unlike traditional optical character recognition (OCR), which simply identifies text characters, Textract understands document context, relationships, and layout, making it capable of handling complex real-world documents that have challenged automation efforts …

Building Serverless CDC Pipelines with Lambda and Firehose

November 13, 2025 by Peter Song

Change Data Capture (CDC) has become essential for modern data architectures, enabling real-time analytics, audit trails, and downstream system synchronization. While traditional CDC solutions require managing complex infrastructure (database servers, streaming platforms, and processing clusters), AWS Lambda and Kinesis Firehose offer a fully serverless alternative that scales automatically, requires no infrastructure management, and costs nothing when idle. …

Data Engineering on AWS – Everything You Need to Know

November 11, 2025 by Peter Song

Data engineering has become the backbone of modern data-driven organizations, and Amazon Web Services (AWS) provides one of the most comprehensive ecosystems for building robust data pipelines and analytics platforms. Whether you’re migrating from on-premises infrastructure or building a greenfield data platform, understanding AWS’s data engineering capabilities is essential for making informed architectural decisions. This …

How to Send CDC Events to Kinesis: Complete Implementation Guide

November 10, 2025 by Peter Song

Streaming database changes to Amazon Kinesis unlocks real-time data processing capabilities, enabling event-driven architectures, powering analytics dashboards with fresh data, and triggering automated workflows within seconds of database modifications. Change Data Capture (CDC) to Kinesis represents a powerful pattern, but implementing it correctly requires understanding multiple integration approaches, configuration nuances, and operational considerations. Poor implementations result …

Best Practices for Monitoring ML Models in AWS

November 9, 2025 by Peter Song

Machine learning models deployed to production require continuous monitoring to maintain their effectiveness and reliability. Unlike traditional software where bugs manifest as clear errors, ML models degrade silently as data distributions shift, business contexts evolve, and edge cases emerge that weren’t present in training data. AWS provides comprehensive monitoring capabilities through SageMaker Model Monitor, CloudWatch, …