SalaryPeak

HPC / Linux / Cloud Lead Engineer (GPU, Kubernetes, AWS/Azure) – 1 Year Contract

TEAMLEASE DIGITAL SINGAPORE PTE. LTD.
Singapore | 8+ years | Posted May 9, 2026

Salary Range

SGD 156,000 - SGD 174,000/year

SGD 13,000 - SGD 14,500/month

Skills Required

Microsoft Cluster, Computing Infrastructure, HPC, Kubernetes, Implementing Monitoring Tools, Job Scheduling, High Availability, Managing Workload, Docker, Docker Container, Hybrid Cloud

Job Description

HPC / Linux / Cloud Engineer (GPU, Kubernetes, AWS/Azure) – 1 Year Renewable Contract

About the Role

We are seeking motivated HPC / Linux / Cloud Engineers (Lead Engineer) to support and scale high-performance computing (HPC) environments powering advanced analytics, AI/ML, and GPU-intensive workloads.

This role offers a rare opportunity to work across Linux infrastructure, GPU clusters, Kubernetes platforms, and cloud-native environments supporting enterprise-scale compute platforms.

Candidates with strengths in Linux systems, cloud engineering, platform engineering, containers, or HPC environments are encouraged to apply.

Lead Engineer

  • 8+ years of relevant experience

Key Responsibilities

  • Manage and support Linux-based compute infrastructure and HPC clusters
  • Administer CPU/GPU environments for AI/ML and high-performance workloads
  • Monitor infrastructure performance, availability, and system health
  • Troubleshoot Linux, storage, networking, and cluster-related issues
  • Support and optimize workload schedulers such as PBS Professional or Slurm
  • Deploy and manage containerized workloads using Docker and Kubernetes
  • Support cloud-based HPC deployments across AWS, Azure, or GCP
  • Assist AI/ML teams with GPU utilization, workload tuning, and performance optimization
  • Automate infrastructure provisioning and configuration using Terraform, Ansible, or similar tools
  • Maintain system documentation, SOPs, and operational procedures
  • Collaborate with engineering, data science, and infrastructure teams to improve platform reliability and scalability
  • Participate in incident response, root cause analysis, and system recovery activities

Required Skills & Experience

  • Hands-on Linux administration experience (RHEL/CentOS/Ubuntu or similar)
  • Experience supporting infrastructure, cloud, or platform environments
  • Exposure to AWS, Azure, or Google Cloud Platform
  • Familiarity with containers and orchestration technologies
  • Basic scripting or automation experience (Bash, Python, Shell, etc.)
  • Understanding of infrastructure monitoring and troubleshooting

Preferred / Good-to-Have Skills

  • HPC or cluster computing environments
  • Job scheduling platforms such as PBS Professional and Slurm
  • Kubernetes, Docker, or Singularity
  • GPU computing technologies: NVIDIA CUDA, PyTorch, TensorFlow
  • Infrastructure as Code: Terraform, CloudFormation, Ansible
  • Monitoring and observability tools: Grafana, Prometheus, ELK Stack
  • Experience supporting AI/ML workloads
  • Exposure to hybrid cloud infrastructure
  • Understanding of networking, storage, and security concepts in HPC environments

Ideal Candidate Profile

You are someone who:

  • Enjoys solving infrastructure and performance challenges
  • Has strong Linux troubleshooting skills
  • Is interested in GPU, AI/ML, or large-scale compute systems
  • Can work across cloud, infrastructure, and platform engineering domains
  • Is eager to learn and grow within HPC and modern compute environments

We value depth in a few technical areas rather than expecting expertise across every technology listed.

Why Join

  • Work on cutting-edge GPU and AI/ML platforms
  • Gain exposure to both cloud and HPC technologies — a highly sought-after combination
  • Opportunity to work on enterprise-scale compute environments
  • Build expertise in modern platform engineering and high-performance infrastructure
  • Grow into specialized HPC, Kubernetes, GPU, or cloud engineering roles