Detect and handle data skew on AWS Glue
AWS Big Data
MAY 1, 2024
How to detect data skew
When an AWS Glue job has issues with local disks (split disk issues), doesn’t scale with the number of workers, or shows low CPU usage, you may have a data skew issue. Enabling Amazon CloudWatch metrics for the job makes the CPU usage visible. You can also check the summary metrics for each stage.
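As a rough illustration of reading per-stage summary metrics, the sketch below compares the slowest task in a stage with the median task. It assumes you have already collected per-task durations for one stage (for example, by copying them from the stage's summary view). The function names, the sample numbers, and the threshold of 5 are hypothetical choices for this example, not values from AWS Glue or Spark.

```python
from statistics import median


def skew_ratio(values):
    """Return max/median for a list of per-task measurements.

    A ratio close to 1 means tasks did similar amounts of work;
    a large ratio means a few tasks dominate the stage.
    """
    if not values:
        raise ValueError("values must not be empty")
    mid = median(values)
    if mid == 0:
        # Avoid division by zero when most tasks did no measurable work.
        return float("inf") if max(values) > 0 else 1.0
    return max(values) / mid


def looks_skewed(values, threshold=5.0):
    """Flag a stage as possibly skewed (threshold is an assumed heuristic)."""
    return skew_ratio(values) >= threshold


# Hypothetical per-task durations in seconds for two stages.
balanced = [10, 11, 12, 10, 11]
skewed = [10, 11, 12, 10, 240]

print(f"balanced ratio: {skew_ratio(balanced):.2f}, skewed? {looks_skewed(balanced)}")
print(f"skewed ratio:   {skew_ratio(skewed):.2f}, skewed? {looks_skewed(skewed)}")
```

The same comparison works for other per-task measurements, such as records read or shuffle bytes. A single slow task is only a hint, so treat a high ratio as a reason to inspect the stage further rather than as proof of skew.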