GCP Data Engineer (5+ Years, IC Role, Bangalore/Hyderabad - Hybrid)
Posted on July 7, 2025
Job Description
- Location: Bangalore, Hyderabad
- Work Model: Hybrid (3 days from office)
- Experience Required: 5+ years
- Role Type: Individual Contributor
Role Summary
We are hiring a GCP Data Engineer (Individual Contributor) to join a large-scale data transformation initiative for a US-based global bank. The engineer will build and optimize high-performance batch and real-time data pipelines on Google Cloud Platform (GCP), collaborating with data architects, business teams, and DevOps. This role requires strong command of BigQuery, Dataflow, Composer, Pub/Sub, and GCS, as well as hands-on experience with SQL and Python scripting for building production-grade data solutions. Exposure to Terraform-based GCP provisioning is expected. You will be responsible for developing reusable and scalable data workflows that support analytics, reporting, and digital applications in a regulated banking environment, with attention to security, compliance, and performance.
- Must-Have Skills & Required Depth
| Skill | Skill Depth |
| --- | --- |
| BigQuery | Independently handled end-to-end ingestion pipelines from GCS and Pub/Sub; complex SQL logic (joins, CTEs, aggregations, partitioning); performance tuning using clustering, materialized views, and scheduled queries. |
| GCS (Google Cloud Storage) | Managed bucket-level configuration, access permissions (IAM), versioning, and lifecycle rules. Integrated GCS with Dataflow and BigQuery for historical and incremental loads. Performed large file operations (CSV, JSON) for structured data processing. |
| Cloud Dataflow (Apache Beam) | Contributed to building batch and streaming pipelines with Pub/Sub as source and BigQuery as sink. Familiar with windowing, watermarking, PCollections, and DoFn transforms. Exposure to the Python-based Beam SDK preferred; full pipeline architecture design not mandatory. |
| SQL (Advanced) | Hands-on with SQL query design for analytical workloads. Used advanced constructs such as window functions, nested queries, lateral joins, and time-based functions. Tuned queries for performance and cost using partition filters and explain plans. |
| Cloud Composer / Airflow | Contributed to DAG creation and enhancement for orchestration of GCP workflows. Experience with retries, branching, task dependencies, sensor usage, and scheduling. Comfortable debugging DAG failures via the Airflow UI and logs; full DAG architecture ownership not required. |
| Pub/Sub | Implemented real-time ingestion from messaging streams into Dataflow and BigQuery. Familiar with topic creation, subscription patterns (pull/push), and message acknowledgment handling. Used for event-driven pipelines with error handling in production. |
| Python | Used for ETL transformations, DAG scripting in Composer, data cleansing logic (nulls, special characters), and API integration. Proficient with libraries such as pandas, json, and re, and with the GCP SDKs. |
| Terraform | Exposure to provisioning GCP services such as buckets, datasets, and service accounts using reusable Terraform modules. Contributed to infrastructure-as-code practices in DevOps teams; did not lead end-to-end module authoring. |
- Nice-to-Have Skills
| Skill | Skill Depth |
| --- | --- |
| Google Kubernetes Engine (GKE) | Conceptual understanding of container orchestration in GCP. Familiar with the basics of deployments, services, and Helm; has not managed production-grade GKE clusters. |
| Bigtable (NoSQL) | Exposure to read/write APIs for large-scale columnar data; familiar with schema modeling but has not managed NoSQL production workloads. |
| Kafka / Hadoop | Experience migrating from legacy big data stacks; basic hands-on experience with message streaming and HDFS-based pipelines. |
| Oracle GoldenGate | Worked on ingesting historical data into GCS using GoldenGate as part of hybrid migration setups. |
| CI/CD (Jenkins, GitLab) | Integrated GCP jobs into Jenkins pipelines; used GitLab for version control and basic YAML-based pipeline triggers. |
| PySpark | Used for distributed processing of large files in legacy Hadoop environments; experience with DataFrame APIs and performance tuning in cluster mode. |
| Data Reconciliation / Audit | Implemented row-count matching and metadata audits between source and target systems; built audit tables in BigQuery to support compliance. |
Required Skills
GCP Data Engineer