Key Responsibilities
- Design, develop, and optimize data pipelines using Databricks and PySpark (a minimal sketch follows this list).
- Build scalable ETL/ELT solutions that integrate structured and unstructured data from diverse sources.
- Work extensively with Azure Data Lake Storage (ADLS) for data ingestion, transformation, and storage.
- Develop robust, reusable, and high-performance code in Python.
- Create and maintain SQL scripts, queries, and stored procedures to support data transformations and analytics.
- Implement data workflows using Azure Data Factory (ADF).
- Apply functional programming principles to build clean, modular, and testable data processing logic.
- Collaborate with cross-functional teams in an Agile environment to deliver end-to-end data engineering solutions.
- Ensure data quality and governance, and follow best practices in pipeline development.
- Perform performance tuning, debugging, and troubleshooting of pipelines and jobs.
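
As a concrete but purely illustrative reference for the pipeline work described above, the sketch below reads raw JSON from ADLS Gen2, applies pure, composable PySpark transformations, and writes a partitioned Delta table. Every name in it is an assumption for illustration only: the storage account ("mydatalake"), containers ("raw", "curated"), dataset ("orders"), and column names are hypothetical, and the Delta output assumes a Databricks runtime.

```python
# Illustrative sketch only; storage account, container, dataset, and
# column names below are hypothetical placeholders.
from pyspark.sql import SparkSession, DataFrame
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-pipeline").getOrCreate()

# Read raw JSON from ADLS Gen2 (abfss:// is the ADLS Gen2 URI scheme).
raw = spark.read.json("abfss://raw@mydatalake.dfs.core.windows.net/orders/")

def clean_orders(df: DataFrame) -> DataFrame:
    """Pure transformation: drop malformed rows and normalize types."""
    return (
        df.dropna(subset=["order_id", "amount"])
          .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
          .withColumn("order_date", F.to_date("order_date"))
    )

def daily_revenue(df: DataFrame) -> DataFrame:
    """Pure aggregation: total revenue per order date."""
    return df.groupBy("order_date").agg(F.sum("amount").alias("revenue"))

# Compose the pure functions with DataFrame.transform and persist the
# result as a Delta table, partitioned for downstream query performance.
(raw.transform(clean_orders)
    .transform(daily_revenue)
    .write.format("delta")
    .mode("overwrite")
    .partitionBy("order_date")
    .save("abfss://curated@mydatalake.dfs.core.windows.net/daily_revenue/"))
```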
Required Skills & Qualifications
- 5+ years of experience as a Data Engineer or similar role.
- Strong expertise with Databricks notebooks, workflows, and cluster management.
- Hands-on experience with PySpark for distributed data processing.
- Strong proficiency in Python and SQL.
- Experience with Azure Data Lake Storage (ADLS), Azure Data Factory (ADF), and other Azure data services.
- Background in functional programming concepts and best practices (see the sketch after this list).
- Experience working with Agile methodologies (Scrum/Kanban).
- Excellent problem-solving and communication skills.
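
As a small illustration of the functional-programming and testability expectations above, the sketch below shows a pure PySpark transformation paired with a unit test that runs against a local SparkSession (e.g. under pytest). The function name, column names, and threshold are hypothetical examples, not project requirements.

```python
# Illustrative sketch only; names and values are hypothetical.
from pyspark.sql import SparkSession, DataFrame
from pyspark.sql import functions as F

def flag_large_orders(df: DataFrame, threshold: float = 100.0) -> DataFrame:
    """Pure function: same input always yields the same output, no side effects."""
    return df.withColumn("is_large", F.col("amount") > F.lit(threshold))

def test_flag_large_orders():
    # A local single-threaded session is enough to unit-test the logic.
    spark = SparkSession.builder.master("local[1]").appName("test").getOrCreate()
    df = spark.createDataFrame([("a", 50.0), ("b", 150.0)], ["order_id", "amount"])
    result = {r["order_id"]: r["is_large"] for r in flag_large_orders(df).collect()}
    assert result == {"a": False, "b": True}
```

Because the transformation takes a DataFrame and returns a DataFrame with no hidden state, it can be composed into larger pipelines (for example via DataFrame.transform) and verified in isolation.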