Data Engineer

Fetcherr

  • Warsaw, Masovian Voivodeship
  • Permanent
  • Full-time
  • 17 hours ago
Fetcherr, a company with expertise in deep learning, algorithms, e-commerce, and digitization, is disrupting traditional systems with its AI technology. At its core is the Large Market Model (LMM), an adaptable AI engine that forecasts demand and market trends with precision, empowering real-time decision-making. Specializing initially in the airline industry, Fetcherr aims to revolutionize industries with dynamic AI-driven solutions.

Fetcherr is seeking a Data Engineer to build large-scale, optimized data pipelines using modern technology and tools. We're looking for someone with advanced Python skills and a deep understanding of memory and CPU optimization in distributed environments. This is a high-impact role with responsibilities that directly influence the company's strategic decisions and data-driven initiatives.

Key Responsibilities:
  • Design and build scalable, cross-client data pipelines and transformation workflows using modern ELT tools, ensuring high performance, reusability, and cost-efficiency across diverse data products. Leverage orchestration frameworks like Dagster to manage dependencies, retries, and monitoring.
  • Develop and operate distributed data processing systems that handle large-scale workloads efficiently, adapting to dynamic data volumes and infrastructure constraints. Apply frameworks such as Dask or Spark to unlock parallelism and optimize compute resource utilization.
  • Deliver robust, maintainable Python solutions by applying sound software engineering principles, including modular architecture, reusable components, and shared libraries. Ensure code quality and operational resilience through CI/CD best practices and containerized deployments.
  • Collaborate with data scientists, engineers, and product teams to deliver validated, analytics-ready data that aligns with business requirements. Support team-wide adoption of data modeling standards and efficient data access patterns.
  • Proactively safeguard data quality and reliability by implementing anomaly detection, validation frameworks, and statistical or ML-based techniques to forecast trends and catch regressions early. Enforce backward compatibility and data contract integrity across pipeline changes.
  • Document workflows, interfaces, and architectural decisions in a clear and structured manner to support long-term maintainability. Maintain up-to-date data contracts, system runbooks, and onboarding guides for effective cross-team collaboration.
You’ll be a great fit if you have...
  • 4+ years of hands-on experience building and maintaining production-grade data pipelines at scale
  • Expertise in Python, with a strong grasp of data structures, performance optimization, and modern data processing libraries (e.g. pandas, NumPy)
  • Practical experience with distributed computing frameworks such as Dask or Spark, including performance tuning and memory management
  • Proficiency in SQL, with a deep understanding of query optimization, analytical functions, and cost-efficient query design
  • Experience designing and managing transformation logic using dbt, with a focus on modular development, testability, and scalable performance across large datasets
  • Strong understanding of ETL/ELT architecture, data modeling principles, and data validation
  • Familiarity with cloud platforms (e.g. GCP, AWS) and modern data storage formats and technologies (e.g. Parquet, BigQuery, Delta Lake)
  • Experience with CI/CD workflows, Docker, and orchestrating workloads in Kubernetes
Nice to Have:
  • Experience with Dagster or similar workflow orchestration tools
  • Familiarity with automated testing frameworks for data workflows, such as pytest, Great Expectations, Pandera, Behave, Hamilton, dbt tests or similar
  • Deep interest in performance optimization and vectorized computation, especially in Dask/pandas-based pipelines
  • Ability to design cross-client, cost-efficient solutions that prioritize scalability, modularity, and minimal resource consumption
  • Strong grounding in software architecture best practices, including adherence to SOLID, YAGNI, KISS, DRY, CoC, OOP, CoI, and LoD principles, and code reuse through shared libraries (a strong plus)
