AWS Data Engineer. (Glue and Amazon S3)
Posted on September 3, 2025
Job Description
- *No. of Engineers:* 1
- *Location:* Remote
- *Duration:* 4 months
- *Description:* This role focuses on the design, development, and maintenance of data pipelines and ETL (Extract, Transform, Load) processes within the AWS ecosystem, leveraging AWS Glue for data transformation and Amazon S3 for data storage.
- *Key Responsibilities:*
- * Data Pipeline Design and Development: Design, build, and maintain scalable and efficient data pipelines using AWS Glue, integrating with other AWS services like Lambda, EMR, Redshift, Kinesis, and Athena.
- * ETL Job Creation and Management: Develop and optimize AWS Glue ETL jobs (using Spark/Python/Scala) to clean, transform, and enrich raw data for analytics or other downstream applications.
- * Data Cataloging and Discovery: Use AWS Glue Data Catalog and crawlers to automatically discover, catalog, and manage metadata for various data sources, including those stored in Amazon S3.
- * Amazon S3 Data Management: Manage data storage in Amazon S3, including setting up appropriate storage classes, managing data lifecycle policies, and ensuring data security and accessibility.
- * Data Quality and Validation: Implement data quality checks and validation rules within the ETL process to ensure data accuracy and reliability.
- * Monitoring and Troubleshooting: Monitor the performance and health of Glue jobs and data pipelines, troubleshoot issues, and optimize for efficiency and cost-effectiveness.
- * Collaboration and Communication: Work with data engineers, data scientists, and other stakeholders to understand data requirements and deliver robust data solutions.
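The ETL and data-quality responsibilities above can be illustrated with a minimal sketch. In a real Glue job this logic would run inside a PySpark script on the Glue runtime (for example via `Map.apply` on a DynamicFrame); here the record-level cleaning and validation are shown as plain Python functions, and the field names (`order_id`, `amount`, `order_date`) are hypothetical, not taken from the posting:

```python
from datetime import datetime

def clean_record(record: dict) -> dict:
    """Normalize a raw record: trim string fields and cast numeric ones.
    Field names here are illustrative assumptions."""
    cleaned = {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}
    if "amount" in cleaned:
        cleaned["amount"] = float(cleaned["amount"])
    return cleaned

def validate_record(record: dict) -> list[str]:
    """Return a list of data-quality violations; an empty list means the record passes."""
    errors = []
    if not record.get("order_id"):
        errors.append("missing order_id")
    if record.get("amount", 0) < 0:
        errors.append("negative amount")
    try:
        datetime.strptime(record.get("order_date", ""), "%Y-%m-%d")
    except ValueError:
        errors.append("bad order_date")
    return errors

def run_quality_gate(records: list[dict]) -> tuple[list, list]:
    """Split a batch into valid rows and quarantined (record, errors) pairs,
    as a quality-check step in an ETL pipeline might."""
    valid, rejected = [], []
    for raw in records:
        rec = clean_record(raw)
        errs = validate_record(rec)
        if errs:
            rejected.append((rec, errs))
        else:
            valid.append(rec)
    return valid, rejected
```

In a Glue job, the quarantined records would typically be written to a separate S3 prefix for review rather than silently dropped.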
- *Required Skills and Qualifications:*
- * AWS Services Expertise: In-depth knowledge of AWS Glue, Amazon S3, and other relevant AWS big data and analytics services.
- * Programming Languages: Proficiency in programming languages commonly used with AWS Glue, such as Python, Scala, and Spark.
- * Data Warehousing and Data Lakes: Understanding of data warehousing concepts, data lake architectures, and best practices for data storage and retrieval.
- * SQL and Database Skills: Strong SQL skills for data manipulation and analysis, and understanding of database design principles.
- * Analytical and Problem-Solving Skills: Ability to analyze complex data problems, design effective solutions, and troubleshoot technical issues.
- * Communication Skills: Effective communication skills to collaborate with team members and explain technical concepts clearly.
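The Amazon S3 data-management duties described earlier (storage classes and lifecycle policies) can be sketched as a lifecycle configuration of the kind applied with boto3's `put_bucket_lifecycle_configuration`. The bucket name, prefix, and transition schedule below are illustrative assumptions, not values from the posting:

```python
import json

# Hypothetical policy: move raw data to Infrequent Access after 30 days,
# to Glacier after 90 days, and expire it after a year.
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "raw-data-tiering",        # illustrative rule name
            "Filter": {"Prefix": "raw/"},    # hypothetical key prefix
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 365},
        }
    ]
}

# Applying it would look like this (requires AWS credentials, so not run here):
# import boto3
# s3 = boto3.client("s3")
# s3.put_bucket_lifecycle_configuration(
#     Bucket="example-data-lake-bucket",  # hypothetical bucket
#     LifecycleConfiguration=lifecycle_configuration,
# )

print(json.dumps(lifecycle_configuration, indent=2))
```

Keeping the policy as a plain dictionary like this also makes it easy to version-control alongside the pipeline code.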