Mastech Digital
StreamSets Lead - Data Validation
Job Location
Tamil Nadu, India
Job Description
Job Description : StreamSets Lead

About Talent Solutions Partner (Placeholder) :
This detailed job description is presented by a talent solutions partner, working on behalf of our esteemed client. Our client is a leading organization seeking to enhance their data engineering capabilities with a focus on stream data processing.

About the Role :
We are seeking a highly skilled and experienced StreamSets Lead to join our client's team in Chennai. This is a permanent, full-time opportunity for a data engineering professional with a minimum of 5 years of experience, specifically focusing on designing, developing, and managing data pipelines using the StreamSets Data Collector platform. The role requires a strong understanding of data integration principles, pipeline performance optimization, and collaboration with various data stakeholders. While primarily a permanent position, we are also open to considering experienced Freelancers/Contractors with strong StreamSets and SQL expertise for need-based remote support.

Location : Chennai, Tamil Nadu (3 days/week in office)
Opportunity Type : Permanent (Full Time)
Experience : 5 years in data engineering, with significant hands-on experience in StreamSets.
Notice Period : Immediate Joiner / Currently Serving / Less than 45 days

Job Responsibilities :
As a StreamSets Developer/Engineer Lead, your responsibilities will encompass the entire data pipeline lifecycle:
- Lead the design and development of robust and efficient data pipelines using the StreamSets Data Collector based on diverse business requirements.
- Implement complex data transformations, validations, and enrichments within StreamSets pipelines to ensure data quality and usability.
- Thoroughly test pipelines for performance, reliability, data quality, and adherence to specifications.
- Establish and manage monitoring for pipeline performance metrics, including throughput, latency, and error rates.
- Troubleshoot and quickly resolve issues related to pipeline failures, data inconsistencies, or performance bottlenecks.
- Optimize existing pipelines for improved efficiency, scalability, and cost-effectiveness as data volumes and processing requirements grow.
- Ensure that all data handling practices within StreamSets pipelines comply with organizational security policies and relevant regulatory requirements.
- Implement and maintain encryption, access controls, and auditing mechanisms within pipelines to protect sensitive data.
- Create and maintain comprehensive documentation for pipeline configurations, dependencies, operational procedures, and best practices.
- Share knowledge and provide guidance to team members through training sessions, documentation, and collaborative tools to enhance collective expertise.
- Analyze pipeline performance to identify bottlenecks and optimize configurations for improved efficiency and faster data processing.
- Plan and implement strategies for scaling pipelines to handle increasing data volumes and processing requirements effectively.
- Collaborate effectively with cross-functional teams, including data architects, developers, data scientists, and business analysts, to understand data requirements and ensure technical solutions align with business objectives.
- Communicate updates, issues, technical challenges, and resolutions clearly and effectively to all relevant stakeholders.

Required Skills :
- Proven expertise in designing, developing, and deploying data pipelines using StreamSets Data Collector.
- Strong proficiency in configuring data sources, processors (transformations), and destinations within StreamSets.
- Solid experience in monitoring, troubleshooting, and optimizing StreamSets pipeline performance.
- Strong SQL skills for data querying, validation, and pipeline logic implementation.
- Basic understanding of Apache Kafka architecture, including topics, partitions, brokers, and consumer groups.
- Experience collaborating with data architects and business stakeholders.
- Understanding of data governance, security, and compliance principles in the context of data pipelines.

(ref:hirist.tech)
Location: Tamil Nadu, IN
Posted Date: 5/9/2025
Contact Information
Contact: Human Resources, Mastech Digital