
Senior Data Engineer IRC275102
- Poland
- Permanent
- Full-time

Requirements
- Experience architecting ML-based solutions in partnership with data science, software engineering, and product teams.
- Proven experience translating data science prototypes into production services with clear APIs, SLAs/SLOs, and acceptance criteria in high-volume, low-latency contexts (e.g., AdTech).
- Proven experience designing, building, and operating batch/streaming feature pipelines with schema control, validation, lineage, and offline/online parity using Python, Airflow/Composer, Kafka, and BigQuery, with Spark, MySQL, and Redis where appropriate (a minimal sketch of one such pipeline follows this list).
- Proven experience implementing reproducible ML training workflows (data prep, hyperparameter tuning, evaluation) with artifact and model versioning on public cloud (GCP strongly preferred).
- Proven experience packaging and deploying models as containers/services with staged promotion, canary/shadow/A/B rollouts, rollbacks, and environment parity via CI/CD.
- Proven experience running scalable inference (batch, microservice, streaming) that meets latency/error budgets, with autoscaling, observability, and SRE-style reliability practices.
- Proven experience establishing CI/CD for data and models with automated tests, data quality gates, model regression/drift detection, and API/data contract testing.
- Proven experience applying DevSecOps in ML systems: IAM, secrets management, network policies, vulnerability scanning, artifact signing, and policy-as-code on GCP.
- Proven experience collaborating with data science on feature design, labeling/annotation strategies, evaluation metrics, error analysis, and defining retraining triggers/schedules.
- Exposure to product strategy and KPI definition, including planning A/B experiments and prioritizing ML features aligned with SaaS delivery and operational needs.
- Exposure to coaching and upskilling teams on data/ML testing, observability, CI/CD, trunk-based development/XP, and writing clear documentation (design docs, runbooks, model/data cards).
- Proven experience operating in ambiguous, fast-changing environments; iterating from prototype to production with safe rollouts, clear ownership, and continuous improvement.
- Strong English, with excellent communication, influencing, and documentation skills.
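
By way of illustration only (not an additional requirement): a daily batch feature job of the kind described above might look roughly like the sketch below, assuming Airflow 2.x and the google-cloud-bigquery client. The project, dataset, table names, and SQL are placeholders, not our production code.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator
from google.cloud import bigquery

# Placeholder feature query: daily click-through-rate features per user,
# written to a partitioned offline feature table in BigQuery.
FEATURE_SQL = """
CREATE OR REPLACE TABLE `example-project.features.user_ctr_daily`
PARTITION BY event_date AS
SELECT
  user_id,
  DATE(event_ts) AS event_date,
  SAFE_DIVIDE(COUNTIF(event = 'click'), COUNTIF(event = 'impression')) AS ctr_1d
FROM `example-project.raw.ad_events`
WHERE DATE(event_ts) = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
GROUP BY user_id, event_date
"""


def build_features() -> None:
    """Materialise yesterday's features into the offline store."""
    bigquery.Client().query(FEATURE_SQL).result()  # block until the job finishes


def validate_features() -> None:
    """Fail the run (and block downstream consumers) if the output looks wrong."""
    rows = list(
        bigquery.Client()
        .query(
            "SELECT COUNT(*) AS n, COUNTIF(ctr_1d > 1) AS bad "
            "FROM `example-project.features.user_ctr_daily`"
        )
        .result()
    )
    if rows[0].n == 0 or rows[0].bad > 0:
        raise ValueError("feature validation failed: empty output or CTR > 1")


with DAG(
    dag_id="user_ctr_daily_features",
    start_date=datetime(2024, 1, 1),
    schedule_interval="0 3 * * *",  # daily, after raw events have landed
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=10)},
) as dag:
    build = PythonOperator(task_id="build_features", python_callable=build_features)
    validate = PythonOperator(task_id="validate_features", python_callable=validate_features)
    build >> validate
```

Offline/online parity would then come from serving these same stored values (for example, out of Redis) rather than recomputing features at request time.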

Job responsibilities
- Work with product, product engineering, data engineering, and data science peers to build and support our AdTech platform.
- Build data-oriented solutions that are simple, scalable, reliable, secure, maintainable, and make a measurable impact.
- Provide our teams with the data they need to build, sell, and manage our platform, and scale data science prototypes into production solutions.
- Develop, deliver, and maintain batch and real-time data pipelines, analysis services, workflows, and orchestrations, and build and manage the platforms and data infrastructure that store, secure, cleanse, validate, and govern our data.
- Manage our data platform, built on Airflow, CloudSQL, BigQuery, Kafka, Dataproc, and Redis running on Kubernetes and GCP.
- Support our data science teams by providing access to data, performing code reviews, aiding model evaluation and testing, deploying models, and supporting their execution in production (see the serving sketch at the end of this list).
- Employ modern pragmatic engineering principles, practices, and tooling, including TDD/BDD/ATDD, XP, QA Engineering, Trunk Based Development, Continuous Delivery, automation, DevSecOps, and Site Reliability Engineering.
- Help drive ongoing improvements to our engineering principles, practices, and tooling, and provide support and mentorship to junior engineers.
- Develop and maintain a current understanding of AdTech developments, industry standards, partner and competitor platforms, and commercial models, from an engineering perspective.
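
For a flavour of the model-deployment side of the role, a data science prototype might be wrapped as a containerised inference service along roughly the lines below. This is a minimal sketch, assuming FastAPI, a scikit-learn-style binary classifier saved with joblib, and placeholder paths, feature names, and version strings.

```python
import joblib
from fastapi import FastAPI
from pydantic import BaseModel

MODEL_PATH = "/models/ctr_model.joblib"  # placeholder: baked into the image or mounted
model = joblib.load(MODEL_PATH)          # load once per process, at startup

app = FastAPI()


class ScoreRequest(BaseModel):
    user_ctr_1d: float
    placement_id: int


class ScoreResponse(BaseModel):
    score: float
    model_version: str = "ctr-2024-01-01"  # surfaced so canary/shadow traffic can be compared


@app.get("/healthz")
def healthz() -> dict:
    """Liveness/readiness probe target for Kubernetes."""
    return {"status": "ok"}


@app.post("/score", response_model=ScoreResponse)
def score(req: ScoreRequest) -> ScoreResponse:
    """Score a single request; a batch endpoint would follow the same pattern."""
    features = [[req.user_ctr_1d, req.placement_id]]
    return ScoreResponse(score=float(model.predict_proba(features)[0][1]))
```

If this file were saved as serve.py, it could be run locally with `uvicorn serve:app --port 8080`; in production the same container would sit behind autoscaling, observability, and the staged canary/shadow rollout process described in the requirements.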