
Job Description
We are supporting an early-stage AI startup building a cutting-edge knowledge retrieval platform designed to help enterprises and software teams unlock the full value of their data with AI that delivers trusted, accurate, and scalable results.
Their platform enables domain-optimized retrieval, retrieval-augmented generation (RAG), and real-time tuning, helping organizations tackle some of the most complex AI challenges: preventing hallucinations, scaling retrieval pipelines, and keeping AI grounded in dynamic enterprise data.
About the Role
We are hiring a Senior DevOps Engineer to lead the design and operation of the cloud infrastructure that powers their AI platform.
This engineer will play a key role in building scalable, secure, and highly performant environments, working closely with the AI/ML and data engineering teams.
The role involves hands-on work with IaC tools, CI/CD pipelines, Kubernetes (EKS), Kafka, and hybrid cloud/on-prem architectures — with a strong emphasis on automation, observability, and infrastructure reliability.
Responsibilities
Design and implement Infrastructure as Code (Terraform, Ansible)
Build and optimize CI/CD pipelines (GitHub Actions, Jenkins, Bitbucket Pipelines) for rapid and reliable deployments
Deploy and manage Kubernetes (AWS EKS) clusters and containerized applications
Operate Kafka-based data pipelines for real-time data streaming
Support hybrid infrastructure (cloud + on-premises) with a focus on security and performance
Implement AWS networking configurations (EC2, VPC, Transit Gateway, Route Tables)
Write and maintain automation scripts (TypeScript, Python, Bash)
Ensure observability, monitoring, and incident response readiness
Collaborate with AI/ML, Data, and Product teams to align infrastructure with evolving platform needs
Job Requirements
Must-Have
Proven experience with Infrastructure as Code (Terraform, Ansible)
Strong skills in CI/CD pipeline design (GitHub Actions, Jenkins, Bitbucket Pipelines)
Expertise in Kubernetes (AWS EKS) and container orchestration
Experience operating Kafka and supporting data streaming pipelines
Hands-on experience with hybrid cloud/on-premises environments
Strong knowledge of AWS services (EC2, VPC, Transit Gateway, Route Tables)
Proficiency in TypeScript, Python, and Bash
Solid Git expertise and experience with source control best practices
Experience with Linux and Windows environments
Nice-to-Have
Experience in Big Data / Analytics or Machine Learning environments
Familiarity with AWS CloudFormation
Exposure to cloud-to-cloud or multi-cloud migrations
Background working in high-compliance environments (e.g., healthcare, finance)
Why Join?
This is an opportunity to work with a highly skilled, collaborative team tackling real-world AI engineering challenges — where your contributions will directly impact mission-critical AI systems used by enterprise customers.
Location: Brazil