The application deadline for this job has passed.
Job Description
Our client is an Information Technology & Services company offering products and services such as telecom, AI, a customer relationship management platform, and more. They are looking for an experienced DevOps Engineer to take a lead role in their DevOps environment, bringing experience, best practices, and a collaborative attitude to help drive DevOps initiatives. The responsibilities include managing and building automation processes as well as contributing to the development of internal tools to achieve operational efficiency.
Job Responsibilities
- Develop automation flows that enable fast delivery and replace manual operating procedures wherever they exist to enable self-service operations
- Drive analysis, design, and development of automation tools for deployment, development, and operational tasks
- Deploy & operate Kubernetes clusters
- Maintain AI infrastructure clusters
- Maintain model training infrastructure (GPU clusters)
- Lead architectural improvements and evaluate new technologies
- Maintain and enhance our CI/CD pipelines
- Deploy & manage monitoring/observability infrastructure for staging and production-level systems and services
- Prioritize requests from operations & engineering teams fairly while demonstrating a sense of empathy
- Make sure new environments meet requirements and conform to best practices
Requirements
- 2+ years of hands-on technical experience in DevOps/Cloud engineering
- Good knowledge of Python or Golang
- Infrastructure as Code experience (Terraform, CloudFormation, etc.)
- Experience with Kubernetes deployment patterns and tools such as Helm, Kustomize and Operators
- Experience utilizing DevOps tool chains including Jenkins, Docker, Nexus, SonarQube, GitHub
- Experience with observability tools such as Elasticsearch, Kibana, Grafana, Prometheus, Jaeger, etc.
- Experience with SQL & NoSQL databases such as PostgreSQL and MongoDB
- Exposure to Agile environments (use of Jira/Confluence, sprints, etc.)
- Enterprise-level experience
- Great communication skills and team player mentality
- Interest in AI projects
Desirable
- Experience with event streaming tools (e.g. Apache Kafka) and architecture patterns
- Experience with OLAP architectures
- Understanding of network engineering and security principles (e.g. protocols, routing, switching, filtering, firewall rules, etc.)
- Experience with the Azure public cloud is a plus
- Experience with AI infrastructure projects is a strong plus
- Experience with Kubeflow or Fairing frameworks is a strong plus