Job Description
We are seeking an experienced Senior DevOps Engineer to join our team and lead the design, deployment, and management of both cloud-based and on-premise infrastructure. The ideal candidate will have hands-on experience building scalable systems, automating deployments, and ensuring reliability, security, and performance across diverse environments.
Responsibilities
- Design, build, and maintain scalable CI/CD pipelines across multiple environments (development, staging, production).
- Build, manage, and optimize AWS and Azure cloud infrastructure for high availability and cost efficiency.
- Configure and maintain on-premise infrastructure, including virtualization, networking, and storage systems.
- Automate infrastructure provisioning using Terraform, Ansible, or CloudFormation.
- Implement monitoring, alerting, and logging systems (e.g., Prometheus, Grafana, ELK) to ensure visibility and uptime.
- Collaborate with development teams to streamline deployments, improve release processes, and ensure smooth integration between code and infrastructure.
- Ensure systems comply with security, data protection, and compliance standards (e.g., SOC 2, ISO 27001).
- Troubleshoot infrastructure, networking, and performance issues across environments.
- Optimize system performance and scalability to handle variable workloads and traffic spikes.
- Support disaster recovery and backup strategies for both on-premise and cloud infrastructure.
Qualifications
- Bachelor's degree in Computer Science, Information Technology, or a related field.
- 5+ years of experience in DevOps, Cloud Engineering, or Infrastructure Management.
- Proven experience with hybrid infrastructure setups (cloud and on-premise).
- Strong proficiency in Docker, Kubernetes, and containerized application deployment.
- Experience with message brokers and task queues (e.g., RabbitMQ, Celery) and related monitoring tools (e.g., Flower).
- Expertise in infrastructure as code (Terraform, Ansible, CloudFormation).
- Strong scripting and automation skills (Bash, Python, or similar).
- Solid understanding of networking, load balancing, VPNs, and system security.
- Hands-on experience with monitoring and observability tools (Grafana, ELK, CloudWatch, Prometheus).
- Experience with CI/CD tools, preferably GitHub Actions or equivalent.
- Excellent problem-solving, collaboration, and communication skills.
- Experience working in fast-paced environments and scaling systems to support dynamic, high-traffic use cases.
- Applicants must reside in the Southeast Asia region (Malaysia and Taiwan preferred).