Data Engineer
• Lead
• On-Site
• Data
✨ The Role in One Sentence
Tide is seeking a Lead Data Engineer with PySpark expertise to design, develop, and optimize data pipelines and platforms.
📋 What You'll Likely Do
40%: Design and implement scalable ETL/ELT pipelines using PySpark for batch and real-time data processing.
30%: Optimize PySpark performance by identifying and resolving bottlenecks using Spark UI and advanced techniques.
30%: Implement data quality checks and monitoring mechanisms to ensure data accuracy and reliability.
🧑‍💻 Profiles Doing This Job
High Priority: 8+ years in data engineering, with 4+ years focused on PySpark.
High Priority: Expert-level proficiency in PySpark and strong Python programming skills.
High Priority: Experience with distributed data storage solutions and version control systems (Git).
📈 How This Role Will Look on Your CV
Led the design and optimization of data pipelines using PySpark in a production environment.