Job Description

HPC / Linux / Cloud Engineer (GPU, Kubernetes, AWS/Azure) – 1 Year Renewable Contract

About the Role

We are seeking motivated HPC / Linux / Cloud Engineers at Lead Engineer level to support and scale high-performance computing (HPC) environments powering advanced analytics, AI/ML, and GPU-intensive workloads.

This role offers a rare opportunity to work across Linux infrastructure, GPU clusters, Kubernetes platforms, and cloud-native environments supporting enterprise-scale compute platforms.

Candidates with strengths in Linux systems, cloud engineering, platform engineering, containers, or HPC environments are encouraged to apply.

Lead Engineer

8+ years of relevant experience

Key Responsibilities

Manage and support Linux-based compute infrastructure and HPC clusters
Administer CPU/GPU environments for AI/ML and high-performance workloads
Monitor infrastructure performance, availability, and system health
Troubleshoot Linux, storage, networking, and cluster-related issues
Support and optimize workload schedulers such as PBS Professional or Slurm
Deploy and manage containerized workloads using Docker and Kubernetes
Support cloud-based HPC deployments across AWS, Azure, or GCP
Assist AI/ML teams with GPU utilization, workload tuning, and performance optimization
Automate infrastructure provisioning and configuration using Terraform, Ansible, or similar tools
Maintain system documentation, SOPs, and operational procedures
Collaborate with engineering, data science, and infrastructure teams to improve platform reliability and scalability
Participate in incident response, root cause analysis, and system recovery activities

Required Skills & Experience

Hands-on Linux administration experience (RHEL/CentOS/Ubuntu or similar)
Experience supporting infrastructure, cloud, or platform environments
Exposure to AWS, Azure, or Google Cloud Platform
Familiarity with containers and orchestration technologies
Basic scripting or automation experience (Bash, Python, or similar)
Understanding of infrastructure monitoring and troubleshooting

Preferred / Good-to-Have Skills

HPC or cluster computing environments
Job scheduling platforms such as PBS Professional or Slurm
Kubernetes, Docker, or Singularity
GPU computing technologies: NVIDIA CUDA, PyTorch, TensorFlow
Infrastructure as Code: Terraform, CloudFormation, Ansible
Monitoring and observability tools: Grafana, Prometheus, ELK Stack
Experience supporting AI/ML workloads
Exposure to hybrid cloud infrastructure
Understanding of networking, storage, and security concepts in HPC environments

Ideal Candidate Profile

You are someone who:

Enjoys solving infrastructure and performance challenges
Has strong Linux troubleshooting skills
Is interested in GPU, AI/ML, or large-scale compute systems
Can work across cloud, infrastructure, and platform engineering domains
Is eager to learn and grow within HPC and modern compute environments

We value depth in a few technical areas rather than expecting expertise across every technology listed.

Why Join

Work on cutting-edge GPU and AI/ML platforms
Gain exposure to both cloud and HPC technologies, a highly sought-after combination
Opportunity to work on enterprise-scale compute environments
Build expertise in modern platform engineering and high-performance infrastructure
Grow into specialized HPC, Kubernetes, GPU, or cloud engineering roles

About TEAMLEASE DIGITAL SINGAPORE PTE. LTD.
