Cynosure Corporate Solutions
Databricks Developer - ETL/PySpark
Job Location
Chennai, India
Job Description
- UK shift

Key Skills: DevOps, ETL, Databricks, Spark SQL, PySpark, SQL, cloud platforms (AWS, Azure)

We are seeking a highly motivated and experienced Databricks Developer to join our growing team. In this role, you will be responsible for designing, developing, and optimizing data pipelines within the Databricks environment. You will leverage your expertise in Spark SQL, PySpark, and cloud platforms (AWS or Azure) to transform and process data from various sources, ensuring high data quality and performance. You will also collaborate with cross-functional teams to deliver impactful data solutions that drive business insights.

Key Responsibilities:
- Design, develop, and maintain efficient and scalable data pipelines using Databricks for ETL processes.
- Optimize data pipelines for performance, reliability, and cost-effectiveness.
- Implement data transformations and processing logic using Spark SQL and PySpark.
- Integrate data from diverse sources, including data lakes, APIs, and external databases, using Unity Catalog.
- Manage and maintain data quality, ensuring accuracy and consistency.
- Develop and manage views and stored procedures within Databricks.
- Perform in-depth data analysis to identify data quality issues and propose solutions.
- Monitor data pipeline performance and troubleshoot issues.
- Implement data validation and testing procedures.
- Collaborate with front-end teams to support report generation and dashboard creation.
- Work closely with data engineers, data scientists, and business analysts to understand data requirements.
- Communicate effectively with stakeholders to provide updates and address concerns.
- Use GitLab for version control and CI/CD automation, and Jira for task management.
- Implement and maintain DevOps best practices for data pipeline deployment and management.
- Contribute to the automation of deployment and testing processes.
- Work with healthcare or clinical trial data, understanding the nuances of this type of data.
- Implement data processing that adheres to healthcare industry standards and regulations.

Required Skills and Qualifications:
- Bachelor's or Master's degree in Data Engineering, Computer Science, or a related field.
- Minimum of 3 years of hands-on experience in Databricks development.
- Strong proficiency in Spark SQL and PySpark.
- Extensive experience with SQL and relational databases.
- Proven experience in designing and developing ETL workflows and data models.
- Expertise in creating and optimizing views and stored procedures in Databricks.
- Solid understanding of cloud platforms, specifically AWS or Azure.
- Experience with version control systems, particularly Git and GitLab.
- Familiarity with CI/CD pipelines and DevOps practices.
- Excellent problem-solving and analytical skills.
- Strong communication and collaboration skills.
- Experience with Jira.
- Experience with Unity Catalog.
- Experience with data lakes.
- Experience with healthcare or clinical trial data.
- Knowledge of data governance and security best practices.
- Experience with other big data technologies.

(ref:hirist.tech)
Location: Chennai, IN
Posted Date: 5/13/2025
Contact Information
Contact: Human Resources, Cynosure Corporate Solutions