Senior Data Engineer

  • Remote
    • São Paulo, São Paulo, Brazil
  • Technology

Job description

We are assisting a leading cloud consulting firm specializing in cloud-native development, data and AI modernization, and secure cloud operations. As an AWS Premier Partner, they help organizations scale with cutting-edge technologies while fostering a culture of innovation, collaboration, and continuous learning.

As a Senior Data Engineer, you will design and maintain secure, scalable, production-ready data pipelines that support both machine learning and analytics use cases. You'll work across all data layers—ingestion, transformation, and serving—ensuring high data quality, traceability, and consistency across environments. Your work will directly empower the data science team and inference pipelines by delivering clean, well-documented, and reliable datasets and features.

Key Responsibilities

  • Build and maintain data ingestion pipelines from sources like Snowflake, DynamoDB, and others into S3, Athena, and SageMaker Feature Store.

  • Ensure metadata cataloging and data lineage through AWS Glue Catalog and standardized logging practices.

  • Perform data quality checks using AWS Glue Data Quality before promoting data into ML pipelines.

  • Develop transformation jobs for batch and real-time features consumed by ML models.

  • Implement scheduled or event-based triggers for data refreshes and model retraining workflows.

  • Collaborate closely with ML Engineers to align feature formats, delivery timing, and expectations between training and inference environments.

Job requirements

Must-Have

  • Hands-on experience with AWS Glue, Athena, S3, Snowflake, and DynamoDB.

  • Strong skills in SQL, PySpark, and data transformation logic.

  • Familiarity with metadata management, data cataloging, and lineage tracking.

  • Experience with data quality frameworks, preferably AWS Glue Data Quality.

  • Proficiency in scripting and infrastructure-as-code tools such as Terraform or Terragrunt.

Nice-to-Have

  • Exposure to ML pipelines, SageMaker Feature Store, and MLflow integration.

  • Understanding of CI/CD pipelines, especially using GitLab.
