Data Engineer

Onsite (CBD Belapur), India | 4-5 Years | Full Time | Posted 2026-04-18

Job Description

Role Overview:
We are seeking a Data Engineer who can design, build, and optimize scalable ETL pipelines and real-time data processing systems, handling high-volume datasets in distributed environments.

Key Highlights:
• Expertise in PySpark, Apache Spark (Core, SQL, Structured Streaming)
• Strong experience with Kafka (Confluent/Apache) for real-time data ingestion
• Hands-on with Informatica or similar ETL tools
• Experience with Cloudera Hadoop ecosystem and distributed databases (SingleStore preferred)
• Strong programming skills in Python / Scala / SQL
• Exposure to large-scale data processing (~40TB data, ~5TB daily ingestion)
• Experience in batch and real-time architectures
• Knowledge of Medallion Architecture (Bronze, Silver, Gold layers) is a plus
• Familiarity with workflow orchestration tools like Airflow/Oozie
• Experience in BFSI or Telecom domains preferred

Key Responsibilities:
• Build and optimize scalable ETL pipelines (batch + real-time)
• Develop streaming frameworks for low-latency analytics
• Ensure data quality, governance, and performance optimization
• Collaborate with cross-functional teams for data-driven insights
• Improve system scalability, reliability, and cost efficiency
