Data Engineer – Python, Data Streaming & Databricks

Posted on August 12, 2025


Job Description

  • Job Title: Data Engineer – Python, Data Streaming & Databricks
  • Location: Remote
  • Experience Level: 5+ years

Job Summary

We are looking for a skilled Data Engineer with strong expertise in Python, real-time data streaming, and Databricks to join our data team. You will design, build, and maintain scalable data pipelines and infrastructure that support both real-time and batch processing. The ideal candidate is passionate about data engineering and eager to work with cutting-edge technologies in a collaborative environment.

Key Responsibilities

  • Design and develop robust data pipelines for real-time and batch processing using tools like Apache Kafka, Spark Structured Streaming, and Databricks (a minimal sketch of such a pipeline follows this list).
  • Implement ETL/ELT workflows using Python and optimize them for scalability and reliability.
  • Utilize Databricks notebooks and Delta Lake for data transformations and processing at scale.
  • Collaborate with data scientists, analysts, and software engineers to understand data requirements and deliver clean, reliable datasets.
  • Monitor, troubleshoot, and improve performance of data systems in production.
  • Ensure data quality, integrity, and governance across all data platforms.
  • Automate data workflows and implement CI/CD for data pipeline deployments.
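
To give a concrete flavor of the streaming work described above, here is a minimal sketch of a pipeline that reads JSON events from a Kafka topic with Spark Structured Streaming and appends them to a Delta Lake table. The broker address, topic name, event schema, and storage paths are hypothetical placeholders, not details of our actual stack.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

# Hypothetical event schema; a production pipeline would typically
# source this from a schema registry or data contract.
event_schema = StructType([
    StructField("user_id", StringType()),
    StructField("action", StringType()),
    StructField("occurred_at", TimestampType()),
])

spark = SparkSession.builder.appName("events-to-delta").getOrCreate()

# Subscribe to a Kafka topic (broker and topic names are placeholders).
raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "user-events")
    .load()
)

# Kafka delivers the payload as bytes; cast to string and parse into typed columns.
events = raw.select(
    from_json(col("value").cast("string"), event_schema).alias("event")
).select("event.*")

# Append the parsed stream to a Delta table; the checkpoint location
# gives the sink fault tolerance and exactly-once semantics.
query = (
    events.writeStream.format("delta")
    .option("checkpointLocation", "/mnt/checkpoints/user-events")
    .outputMode("append")
    .start("/mnt/delta/user_events")
)

query.awaitTermination()
```

On Databricks, code like this runs largely unchanged in a notebook cell, since the runtime ships with Delta Lake and the Kafka connector for Structured Streaming.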

Required Qualifications

  • 5+ years of experience in data engineering roles.
  • Strong programming skills in Python for data processing and automation.
  • Hands-on experience with real-time data streaming tools such as Apache Kafka, Spark Streaming, or Flink.
  • Proficient in using Databricks for big data processing and notebook-based development.
  • Experience working with Delta Lake, Parquet, and data lakehouse architectures.
  • Familiarity with SQL and distributed data processing frameworks.
  • Experience with at least one cloud platform (AWS, Azure, or GCP).
  • Strong understanding of data modeling, data warehousing, and pipeline orchestration tools (e.g., Airflow, Azure Data Factory); a minimal Airflow sketch follows this list.
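
As one illustration of the orchestration experience we look for, the sketch below defines a minimal daily ETL DAG in Apache Airflow (2.x style). The DAG ID, schedule, and task bodies are illustrative placeholders only, not a description of our pipelines.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract() -> None:
    # Placeholder: a real task would pull data from a source system.
    print("extracting source data")


def transform() -> None:
    # Placeholder: a real task would clean and reshape the extracted data.
    print("transforming extracted data")


def load() -> None:
    # Placeholder: a real task would write to the warehouse or lakehouse.
    print("loading into the warehouse")


# A minimal daily ETL DAG; the ID and schedule are illustrative.
with DAG(
    dag_id="example_daily_etl",
    start_date=datetime(2025, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)

    # Run the three stages sequentially.
    extract_task >> transform_task >> load_task
```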

Preferred Qualifications

  • Experience with containerization and orchestration tools (Docker, Kubernetes).
  • Knowledge of CI/CD pipelines for data engineering projects.
  • Experience with data observability, lineage, and governance tools.
  • Background in supporting machine learning workflows or data science teams.
