The Role
The Data Engineering team builds and manages the data pipelines and data visualizations that power analytics, machine learning, and AI across Linktree, including finance, growth, product, customer support, sales, marketing, and people functions. We maintain Linktree's enterprise data and build a creative, reliable, and scalable analytics data model that provides a unified way to analyze our customers and products, driving growth and innovation.
You'll be joining a team that is smart and direct. We ask hard questions and challenge each other to continually improve our work. We are self-driven yet collaborative, and we're all about enabling growth by delivering the right data and insights, in the right way, to partners across the company.
What You Will Do
- Partner with product managers, analytics, and business teams to gather data, reporting, and analytics requirements, and build trusted, scalable data models, data extraction processes, and data applications that help answer complex questions.
- Design and implement data pipelines to ETL data from multiple sources into a central data warehouse.
- Design and implement real-time data processing pipelines using Apache Spark Streaming.
- Improve data quality by leveraging internal tools/frameworks to automatically detect and mitigate data quality issues.
- Develop and implement data governance procedures to ensure data security, privacy, and compliance.
- Implement new technologies to improve data processing and analysis.
- Coach and mentor junior data engineers to enhance their skills and foster a collaborative team environment.
What You Will Bring
- A degree in Computer Science or equivalent, with 6+ years of professional experience as a Software Engineer or in a similar role.
- Experience building scalable data pipelines in Spark, scheduled with Airflow or similar orchestration tools.
- Experience with Databricks and its APIs.
- Experience with modern databases (Redshift, DynamoDB, MongoDB, Postgres, or similar) and data lakes.
- Proficiency in one or more programming languages such as Python or Scala, plus rock-solid SQL skills.
- A champion of automated builds and deployments, using CI/CD tooling and version control systems such as Git and Bitbucket.
- Experience working with large-scale, high-performance data processing systems (batch and streaming).
- Experience working at SaaS companies.
- Experience with machine learning.
- Experience contributing code to open-source projects.
- Experience building self-service tooling and platforms.
This position is now closed and is no longer accepting applications.