Part-time Data Engineer - Python, Google Cloud Platform, Apache Beam, and Apache Airflow Specialist

Upwork  ·  US  ·  $160k/yr - $260k/yr

We are a dynamic and forward-thinking organization seeking a skilled Part-time Data Engineer with expertise in Python, Google Cloud Platform (GCP), Apache Beam, Apache Airflow, and BigQuery. This part-time contracting position offers a flexible work arrangement while providing an opportunity to contribute to exciting data engineering projects.

Responsibilities:

Data Pipeline Development: Collaborate with the team to design, develop, and maintain data pipelines using Python and Apache Beam. These pipelines will efficiently extract, transform, and load data from diverse sources into target data repositories on Google Cloud Platform.
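
For a sense of the shape this work takes, here is a minimal Beam pipeline sketch that reads CSV files from Cloud Storage and appends rows to BigQuery; the project, bucket, table, and schema are hypothetical placeholders:

```python
# Minimal Apache Beam sketch: read CSV lines from Cloud Storage, parse them,
# and append the rows to a BigQuery table. All names are placeholders.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def parse_row(line: str) -> dict:
    """Turn one CSV line into a dict matching the BigQuery schema below."""
    user_id, event, ts = line.split(",")
    return {"user_id": user_id, "event": event, "ts": ts}


def run():
    options = PipelineOptions(
        runner="DataflowRunner",        # "DirectRunner" for local testing
        project="example-project",      # hypothetical GCP project
        region="us-central1",
        temp_location="gs://example-bucket/tmp",
    )
    with beam.Pipeline(options=options) as p:
        (
            p
            | "Read" >> beam.io.ReadFromText("gs://example-bucket/events/*.csv")
            | "Parse" >> beam.Map(parse_row)
            | "Write" >> beam.io.WriteToBigQuery(
                "example-project:analytics.events",
                schema="user_id:STRING,event:STRING,ts:TIMESTAMP",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )


if __name__ == "__main__":
    run()
```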

Apache Airflow Workflow Management: Apply your experience with Apache Airflow to design, schedule, and monitor complex data workflows. Implement data pipelines as Airflow DAGs (Directed Acyclic Graphs) so they can be orchestrated and scheduled cleanly, as sketched below.
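
As a concrete illustration, a minimal Airflow DAG expressing a two-step ETL dependency might look like this (the DAG id, schedule, and task bodies are hypothetical):

```python
# Minimal Airflow DAG sketch: a daily schedule with two dependent tasks.
# The DAG id and callables are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    print("pulling data from the source system")


def load():
    print("loading transformed data into BigQuery")


with DAG(
    dag_id="events_etl",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)

    extract_task >> load_task  # run extract first, then load
```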

Google Cloud Platform (GCP) Expertise: Leverage your proficiency in GCP services, including BigQuery, Dataflow, and Cloud Storage, to implement robust data pipelines that ensure data availability, security, and cost-effectiveness.
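
For example, a batch load from Cloud Storage into BigQuery can be driven from Python with the official client library; this is a minimal sketch with hypothetical bucket and table names:

```python
# Minimal sketch of a BigQuery load job from Cloud Storage using the
# google-cloud-bigquery client. URIs and table names are hypothetical.
from google.cloud import bigquery

client = bigquery.Client(project="example-project")  # hypothetical project

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,
    autodetect=True,  # let BigQuery infer the schema for this sketch
)

load_job = client.load_table_from_uri(
    "gs://example-bucket/events/2023-01-01.csv",
    "example-project.analytics.events",
    job_config=job_config,
)
load_job.result()  # block until the load finishes
print(f"Loaded {load_job.output_rows} rows")
```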

Data Transformation and Validation: Implement data transformation processes to cleanse, enrich, and validate data as part of the ETL workflow. Incorporate data quality checks and monitoring mechanisms to guarantee data accuracy and integrity.
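
One common pattern for such checks in Beam is to route failing records to a dead-letter output using tagged outputs; the sketch below assumes hypothetical field names:

```python
# Minimal sketch of a validation step in Beam: records failing a basic
# check are routed to a dead-letter output. Field names are hypothetical.
import apache_beam as beam


class ValidateRecord(beam.DoFn):
    VALID = "valid"
    INVALID = "invalid"

    def process(self, record: dict):
        # Require a non-empty user_id and a timestamp field.
        if record.get("user_id") and record.get("ts"):
            yield beam.pvalue.TaggedOutput(self.VALID, record)
        else:
            yield beam.pvalue.TaggedOutput(self.INVALID, record)


with beam.Pipeline() as p:
    results = (
        p
        | beam.Create([{"user_id": "u1", "ts": "2023-01-01"}, {"user_id": ""}])
        | beam.ParDo(ValidateRecord()).with_outputs(
            ValidateRecord.VALID, ValidateRecord.INVALID
        )
    )
    results[ValidateRecord.VALID] | "LogValid" >> beam.Map(print)
    results[ValidateRecord.INVALID] | "LogInvalid" >> beam.Map(
        lambda r: print("dead-letter:", r)
    )
```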

Performance Optimization: Analyze and optimize data pipelines to enhance performance, minimize latency, and improve overall data processing efficiency.

Data Modeling and Schema Design: Collaborate with data analysts and stakeholders to understand data requirements and design appropriate data models and schemas for efficient data storage and retrieval.
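
In BigQuery, such a design often reduces to an explicit schema plus a partitioning choice; here is a minimal sketch with hypothetical dataset, table, and field names:

```python
# Minimal sketch of defining a partitioned BigQuery table with the client
# library. The dataset, table, and fields are hypothetical.
from google.cloud import bigquery

client = bigquery.Client(project="example-project")

schema = [
    bigquery.SchemaField("user_id", "STRING", mode="REQUIRED"),
    bigquery.SchemaField("event", "STRING", mode="NULLABLE"),
    bigquery.SchemaField("ts", "TIMESTAMP", mode="REQUIRED"),
]

table = bigquery.Table("example-project.analytics.events", schema=schema)
# Partitioning on the timestamp keeps scans (and cost) proportional to
# the dates actually queried.
table.time_partitioning = bigquery.TimePartitioning(field="ts")
client.create_table(table, exists_ok=True)
```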

Monitoring and Troubleshooting: Set up monitoring systems and dashboards to proactively identify issues in data pipelines. Provide timely resolution to any pipeline-related problems and troubleshoot incidents efficiently.
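
In Airflow, one lightweight building block for this is a failure callback that pushes an alert the moment a task fails; a minimal sketch follows (the webhook URL is a hypothetical placeholder):

```python
# Minimal sketch of proactive failure alerting in Airflow via a task-level
# failure callback. The webhook URL is a hypothetical placeholder.
from datetime import datetime

import requests
from airflow import DAG
from airflow.operators.python import PythonOperator


def alert_on_failure(context):
    """Post the failing task's identity to a chat webhook."""
    ti = context["task_instance"]
    requests.post(
        "https://example.com/hooks/data-alerts",  # hypothetical webhook
        json={"text": f"Task {ti.task_id} in DAG {ti.dag_id} failed."},
        timeout=10,
    )


with DAG(
    dag_id="monitored_etl",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args={"on_failure_callback": alert_on_failure},
) as dag:
    # Deliberately failing task to demonstrate the callback firing.
    PythonOperator(task_id="load", python_callable=lambda: 1 / 0)
```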

Security and Compliance: Ensure data handling and storage adhere to security and privacy policies. Implement access controls and data encryption techniques to safeguard sensitive information.
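
Access controls on GCP typically come down to IAM bindings; for instance, granting read-only object access on a bucket from Python (the group and bucket names are hypothetical):

```python
# Minimal sketch of tightening access on a Cloud Storage bucket with IAM.
# The bucket and group are hypothetical placeholders.
from google.cloud import storage

client = storage.Client(project="example-project")
bucket = client.bucket("example-bucket")

policy = bucket.get_iam_policy(requested_policy_version=3)
policy.bindings.append(
    {
        "role": "roles/storage.objectViewer",
        "members": {"group:data-readers@example.com"},
    }
)
bucket.set_iam_policy(policy)
```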

Documentation: Create and maintain comprehensive technical documentation for data pipelines, data flows, and associated systems.

Requirements:

Bachelor's or Master's degree in Computer Science, Engineering, or a related field.

Proven experience as a Data Engineer, Software Engineer, or similar role with a focus on data engineering and ETL processes.

Strong proficiency in Python programming and experience with data processing libraries and frameworks.

Hands-on experience with Google Cloud Platform services, particularly BigQuery, Dataflow, and Cloud Storage.

In-depth knowledge of Apache Beam for building data processing pipelines.

Experience with Apache Airflow for workflow management and orchestration.

Familiarity with data modeling, data warehousing, and data storage principles.

Proficiency in SQL for querying and manipulating large datasets.

Strong problem-solving and analytical skills, with the ability to tackle complex data engineering challenges.

Excellent communication and collaboration skills to work effectively in a team-oriented environment.

Prior experience in a cloud-based, scalable, and distributed data environment is a plus.

Join us in this part-time opportunity to showcase your data engineering expertise and contribute to innovative projects. Apply now and be part of a team that values flexibility, excellence, and a passion for cutting-edge technology.
