AWS Data Engineer (Glue and Amazon S3)

Posted on September 3, 2025


Job Description

*AWS Data Engineer (Glue and Amazon S3) Job Description*

*No. of Engineers:* 1   *Location:* Remote   *Duration:* 4 months

*Description:*

This role focuses on the design, development, and maintenance of data pipelines and ETL (Extract, Transform, Load) processes within the AWS ecosystem. It leverages AWS Glue for data transformation and Amazon S3 for data storage.

*Key Responsibilities:*

  • Data Pipeline Design and Development: Design, build, and maintain scalable and efficient data pipelines using AWS Glue, integrating with other AWS services such as Lambda, EMR, Redshift, Kinesis, and Athena.
  • ETL Job Creation and Management: Develop and optimize AWS Glue ETL jobs (using Spark with Python or Scala) to clean, transform, and enrich raw data for analytics and other downstream applications.
  • Data Cataloging and Discovery: Use the AWS Glue Data Catalog and crawlers to automatically discover, catalog, and manage metadata for various data sources, including those stored in Amazon S3.
  • Amazon S3 Data Management: Manage data storage in Amazon S3, including setting up appropriate storage classes, managing data lifecycle policies, and ensuring data security and accessibility.
  • Data Quality and Validation: Implement data quality checks and validation rules within the ETL process to ensure data accuracy and reliability.
  • Monitoring and Troubleshooting: Monitor the performance and health of Glue jobs and data pipelines, troubleshoot issues, and optimize for efficiency and cost-effectiveness.
  • Collaboration and Communication: Work with data engineers, data scientists, and other stakeholders to understand data requirements and deliver robust data solutions.

*Required Skills and Qualifications:*

  • AWS Services Expertise: In-depth knowledge of AWS Glue, Amazon S3, and other relevant AWS big data and analytics services.
  • Programming Languages: Proficiency in the languages commonly used with AWS Glue, such as Python and Scala, along with Apache Spark.
  • Data Warehousing and Data Lakes: Understanding of data warehousing concepts, data lake architectures, and best practices for data storage and retrieval.
  • SQL and Database Skills: Strong SQL skills for data manipulation and analysis, and an understanding of database design principles.
  • Analytical and Problem-Solving Skills: Ability to analyze complex data problems, design effective solutions, and troubleshoot technical issues.
  • Communication Skills: Effective communication skills to collaborate with team members and explain technical concepts clearly.
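The Amazon S3 data management responsibility mentions storage classes and lifecycle policies; a minimal lifecycle configuration of the kind that might apply here (the `raw/` prefix and retention periods are hypothetical choices, not requirements from this posting) could transition raw data to colder storage and eventually expire it:

```json
{
  "Rules": [
    {
      "ID": "archive-raw-data",
      "Filter": { "Prefix": "raw/" },
      "Status": "Enabled",
      "Transitions": [
        { "Days": 30, "StorageClass": "STANDARD_IA" },
        { "Days": 90, "StorageClass": "GLACIER" }
      ],
      "Expiration": { "Days": 365 }
    }
  ]
}
```

A configuration like this can be attached to a bucket via the S3 console, the `put-bucket-lifecycle-configuration` CLI command, or infrastructure-as-code tooling.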
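To illustrate the data quality and validation responsibility above, the sketch below shows row-level validation rules of the kind that might run inside a Glue ETL job before curated data is written back to Amazon S3. The record layout and rule set are hypothetical, and plain Python stands in for the PySpark/DynamicFrame API a real Glue job would use; treat it as a sketch of the idea, not a production implementation.

```python
# Hypothetical row-level data quality checks, of the kind a Glue ETL job
# might apply before writing curated data downstream. Field names and
# rules are illustrative assumptions, not from any real schema.
from datetime import datetime


def _is_iso_date(value):
    """True if value parses as an ISO-8601 date/timestamp string."""
    try:
        datetime.fromisoformat(value)
        return True
    except (TypeError, ValueError):
        return False


# Each rule maps a field name to a predicate it must satisfy.
RULES = {
    "order_id": lambda v: isinstance(v, str) and len(v) > 0,
    "amount":   lambda v: isinstance(v, (int, float)) and v >= 0,
    "order_ts": _is_iso_date,
}


def validate(record):
    """Return the list of field names in a record that fail their rule."""
    return [field for field, ok in RULES.items()
            if not ok(record.get(field))]


def split_valid_invalid(records):
    """Partition records so bad rows can be quarantined (e.g. written to
    an 'errors/' prefix in S3) while clean rows continue downstream."""
    valid, invalid = [], []
    for rec in records:
        (invalid if validate(rec) else valid).append(rec)
    return valid, invalid
```

In a real pipeline the same pattern would typically be expressed as Spark filters or a Glue Data Quality ruleset, with the invalid partition routed to a quarantine location for inspection.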
