Ref: #61425

Lead Data Engineer (Remote)

  • Practice: Data

  • Technologies: Business Intelligence Jobs and Data Recruitment

  • Location: Manitowoc, United States

  • Type: Permanent

A Series A startup is revolutionizing the shopping experience, using the power of generative AI and rich messaging technologies to build a personalized shopping assistant for every consumer.

The Role
They are currently looking for a Lead Data Engineer to take charge of their data engineering initiatives, focusing on enhancing data collection, storage, and analysis. This senior position is integral to their data infrastructure, enabling data-driven decision-making and supporting their ambitious growth objectives.

Key Responsibilities:

  • Architect and scale a state-of-the-art data infrastructure capable of handling batch and real-time data processing needs with unparalleled performance.
  • Collaborate closely with the data science team to oversee data systems, ensuring accurate monitoring and insightful analysis of business processes.
  • Design and implement robust ETL (Extract, Transform, Load) data pipelines, optimizing data flow and accessibility.
  • Develop comprehensive backend data solutions to bolster microservices architecture, ensuring seamless data integration and management.
  • Engineer and manage integrations with third-party e-commerce platforms.

Experience Required

  • A Bachelor's degree in Computer Science or a related field, with a solid foundational knowledge of data engineering principles.
  • 7-10 years of software development experience, with a significant focus on data engineering.
  • Proficiency in Python or Java, with a deep understanding of software engineering best practices.
  • Expertise in distributed computing and data modeling, capable of designing scalable data systems.
  • Demonstrated experience in building ETL pipelines using tools such as Apache Spark, Databricks, or Hadoop.
  • Extensive experience with NoSQL databases, including MongoDB, Cassandra, DynamoDB, and CosmosDB.
  • Proficiency in real-time stream processing systems such as Kafka, AWS Kinesis, or GCP Dataflow.
  • Skilled in utilizing caching and search technologies like Redis, Elasticsearch, or Solr.
  • Familiarity with message queuing systems, including RabbitMQ, AWS SQS, or GCP Cloud Tasks.
  • Experience with Delta Lake, Parquet files, and AWS, GCP, or Azure cloud services.
  • A strong advocate for Test Driven Development (TDD) and experienced in version control using Git platforms like GitHub or Bitbucket.