Cassandra Expert

Posted on January 9, 2026

Job Description

Overview

Experience required: 4-5 years

Location: Remote

Background verification (BGV): Yes

Key Responsibilities

  • Strong, hands-on experience managing production-grade Apache Cassandra clusters
  • Experience defining and enforcing Cassandra best practices, governance, and operational standards
  • Ability to create detailed runbooks and SOPs for:
    • Node addition and removal
    • Cluster rebalancing
    • Repair operations
    • Version upgrades (experience with Cassandra 4.x required; upgrade planning to 5.x expected)
  • Proven experience in Cassandra performance tuning, including JVM tuning, cache configuration, and thread pool and timeout tuning
  • Strong understanding of, and hands-on ability to identify and resolve:
    • Hot partitions
    • Read/write amplification issues
    • High latency and failure scenarios
  • Deep understanding of Cassandra architecture and internals
  • Experience reviewing and optimizing:
    • Cluster topology (replication strategy, consistency level, etc.)
    • Disk, memory, and storage layouts
  • Ability to define and maintain capacity planning guidelines based on data growth and workload patterns
  • Hands-on experience setting up monitoring and alerting for Cassandra clusters
  • Ability to monitor and alert on critical metrics such as:
    • Read/write latency
    • Read/write failures
    • Repair health
    • Disk usage and storage trends
  • Experience defining and executing backup and restore strategies:
    • Snapshots vs incremental backups
    • Backup validation and restore drills
  • Ability to plan and execute disaster recovery (DR) simulations and ensure operational readiness
  • Strong Linux fundamentals and troubleshooting skills
  • Automation and scripting skills (Shell/Python)
  • Experience operating Cassandra in cloud environments (AWS/GCP/Azure)

Preferred Skills

  • Hands-on experience with Cassandra 5.x or large-scale version upgrades
  • Experience with infrastructure as code (Terraform, Ansible, etc.)
  • Exposure to SRE practices (SLIs, SLOs, error budgets)
  • Experience integrating Cassandra monitoring with tools like:
    • Prometheus & Grafana
    • Datadog, New Relic, or similar observability platforms
  • Experience optimizing cost efficiency for large-scale database operations
  • Exposure to Kubernetes-based Cassandra deployments

Qualifications

  • Prior experience on projects involving Cassandra.

Other Details

  • Standardize Cassandra best practices and governance
  • Create runbooks and SOPs for:
    • Node addition/removal
    • Rebalancing
    • Repair operations
    • Version upgrade (we want to move from 4.1.3 to 5.x)
  • Establish capacity planning guidelines
  • Set up monitoring, alerting, and observability for the Cassandra database:
    • Monitoring for latency, read/write failures, repair health, disk usage
  • Current state analysis and tuning
  • Deep-dive review of existing setup
  • Cluster topology (replication, etc.)
  • Disk, memory, and JVM tuning
  • Tuning of JVM, cache, thread pool, and timeout settings
  • Identify and fix:
    • Hot partitions
    • Read/write amplification issues, etc.
  • Operational reliability and DR
  • Define backup and restore strategy:
    • Snapshots vs incremental backups
    • Restore drills
  • Cost efficiency of database operations

Required Skills

Governance, Apache Cassandra, Clusters