Ref: #67471

Principal Data Engineer

  • Practice: Data

  • Technologies: Business Intelligence Jobs and Data Recruitment

  • Location: Manitowoc, United States

  • Type: Permanent

A Series A business is revolutionizing the shopping experience, using generative AI and rich messaging technologies to build a personalized shopping assistant for every consumer.

The Role

We are seeking a Principal Data Engineer with deep expertise in Spark to lead the design of the data infrastructure. This is a senior-level, hands-on technical role, ideal for someone passionate about building scalable data systems, mentoring engineers, and helping shape data strategy. As a thought leader on the data engineering team, you will architect systems that support high-performance batch and real-time data processing, power advanced analytics, and drive the AI team forward.

Key Responsibilities:

  • Own the architecture and strategic direction of scalable, distributed data infrastructure across cloud platforms
  • Design and build a data compilation system to normalize, match, and merge products, reviews, and editorial data from thousands of data sources
  • Use the latest NLP, LLMs, and embedding models to generate the highest quality datasets with automated data auditing and reporting
  • Implement real-time and batch data processing systems to power AI/ML use cases
  • Collaborate with engineering, AI and product teams to ensure data availability and reliability
  • Develop backend data solutions that support a microservices architecture and a rapidly scaling product environment
  • Manage and extend integrations with third-party e-commerce platforms to expand Wizard’s data ecosystem
  • Mentor and support data engineers, establishing best practices

You

  • 8+ years of software development and data engineering experience with demonstrated ownership of production-grade data infrastructure
  • Bachelor's degree in Computer Science or a related field, or equivalent practical experience
  • Deep expertise in building ETL pipelines using Apache Spark, Databricks, or Hadoop is required
  • Strong understanding of distributed computing and modern data modeling techniques for scalable systems
  • Expert in Python with experience implementing software engineering best practices
  • Hands-on experience with both relational (MySQL / PostgreSQL) and NoSQL (MongoDB, DynamoDB, Cassandra) databases
  • Excellent communicator and collaborator, with a passion for mentoring, knowledge-sharing, and team growth

Nice to Have:

  • Experience working in early-stage, high-growth environments
  • Familiarity with MLOps pipelines and integrating ML models into data workflows
  • Passionate about problem-solving with a proactive approach to finding innovative solutions
