Job Description
Data Pipeline Architecture and Development:
- Architect and optimize scalable data storage solutions, including data lakes, warehouses, and NoSQL databases, supporting large-scale analytics.
- Design and maintain efficient data pipelines using technologies such as Apache Spark, Kafka, Fabric Data Factory, and Airflow, based on cross-functional team requirements.
Data Integration and ETL:
- Develop robust ETL processes for reliable data ingestion, using tools such as SSIS, ADF, and custom Python scripts to ensure data quality and streamline workflows.
- Optimize ETL performance through techniques like partitioning and parallel processing.
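The partitioning and parallel-processing techniques above can be sketched generically in Python. This is a tool-agnostic illustration, not a Spark or SSIS recipe; the transform and the "name" field are hypothetical:

```python
from concurrent.futures import ThreadPoolExecutor

def transform_partition(rows):
    """Hypothetical transform step: uppercase the 'name' field of each row."""
    return [{**row, "name": row["name"].upper()} for row in rows]

def partition(records, n_parts):
    """Split records into n_parts roughly equal chunks (round-robin)."""
    return [records[i::n_parts] for i in range(n_parts)]

def run_parallel_etl(records, n_parts=4):
    """Partition the input, transform chunks concurrently, then recombine.
    For CPU-bound transforms, a ProcessPoolExecutor would be the usual swap-in."""
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        results = pool.map(transform_partition, partition(records, n_parts))
    return [row for chunk in results for row in chunk]

print(len(run_parallel_etl([{"name": f"user{i}"} for i in range(8)])))  # 8
```

Frameworks like Spark apply the same partition-transform-recombine pattern, but distribute the chunks across executors rather than local workers.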
Data Modeling and Schema Design:
- Define and implement data models and schemas for structured and semi-structured sources, ensuring consistency and efficiency while collaborating with data teams to optimize performance.
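For semi-structured sources, schema definition often amounts to declaring expected fields and types and validating records against them. A minimal sketch, assuming a hypothetical "events" schema (field and type choices are illustrative only):

```python
# Hypothetical schema: field name -> (expected type, required?)
EVENT_SCHEMA = {
    "event_id": (str, True),
    "user_id": (int, True),
    "properties": (dict, False),  # semi-structured payload kept as-is
}

def validate_record(record, schema):
    """Return a list of schema violations for one record (empty if valid)."""
    errors = []
    for field, (ftype, required) in schema.items():
        if field not in record:
            if required:
                errors.append(f"missing required field: {field}")
            continue
        if not isinstance(record[field], ftype):
            errors.append(
                f"{field}: expected {ftype.__name__}, "
                f"got {type(record[field]).__name__}"
            )
    return errors

print(validate_record({"event_id": "e1", "user_id": 42}, EVENT_SCHEMA))  # []
```

In practice this role would apply the same idea through warehouse DDL, Avro/JSON Schema, or a framework's schema objects rather than hand-rolled checks.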
Data Governance, Security, and Compliance:
- Establish and enforce data governance policies, ensuring data quality, security, and regulatory compliance, leveraging the built-in governance and auditing capabilities of platforms such as Microsoft SQL Server.
- Implement access controls, encryption, and auditing to protect sensitive data and collaborate with IT to address vulnerabilities.
Infrastructure Management and Optimization:
- Manage and optimize cloud and on-premises infrastructure for data processing, monitor system performance, and implement disaster recovery enhancements.
- Leverage automation for provisioning, configuration, and deployment to improve operational efficiency.
Team Leadership and Mentorship:
- Provide technical leadership, mentoring team members in best practices and cloud technologies, while aligning data engineering initiatives with strategic goals.
Skills Required
- Bachelor’s degree or higher in Software Engineering, Computer Science, Engineering, or a related field.
- 3–5 years of experience in data engineering, with a proven track record of designing and implementing complex data infrastructure.
- Proficiency in Python, Scala, or Java, with experience building scalable, distributed systems.
- Strong knowledge of cloud computing platforms and related services such as AWS Glue, Azure Data Factory, or Google Dataflow.
- Expertise in data modeling, schema design, and SQL query optimization for both relational and NoSQL databases.
- Excellent communication and leadership skills, with the ability to collaborate effectively with cross-functional teams and stakeholders.
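The SQL query optimization skill above can be illustrated end to end with Python's built-in sqlite3 module. The table, column, and index names are hypothetical; the point is how an index changes the query plan from a full scan to an index search:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

def plan(sql):
    """Return the detail column of the first EXPLAIN QUERY PLAN step."""
    return conn.execute("EXPLAIN QUERY PLAN " + sql).fetchone()[3]

query = "SELECT total FROM orders WHERE customer_id = 7"
before = plan(query)  # without an index: a full scan of the table
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = plan(query)   # with the index: a targeted index search
print(before)
print(after)
```

The same discipline scales up: production warehouses expose analogous EXPLAIN output, and the optimization work is reading those plans and adding indexes, partitions, or rewrites accordingly.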