Infrastructure Engineer

SECOND TALENT SG PTE. LTD.
Singapore | 5+ years | Posted Apr 13, 2026

Salary Range

SGD 108,000 - SGD 156,000/year

SGD 9,000 - SGD 13,000/month

Skills Required

Cloud Administration, Kubernetes, PostgreSQL, Ubuntu, Architect, Cluster Server, GPU, Network Load Balancing, managing workload, Architectural Glass, Infrastructure Architecture, Grafana, Redis, Firewalls

Job Description

We are representing a client: an innovative, fast-growing company in the robotics and AI space. The company is building an open-source humanoid robotics platform and is committed to democratizing access to advanced robotics by openly sharing hardware designs, firmware, and machine learning models with the global developer community.

The Role

As an Infrastructure Engineer, you will own and evolve the platform that everything at the firm runs on, from inference serving to training rigs to the agentic coding infrastructure that powers day-to-day engineering. You will work deep in the stack, across OpenStack, Kubernetes, and bare metal, and set the technical direction for how the company's Cloud scales.

What You'll Do

  • Own and operate the company's Cloud: OpenStack Nova compute, Neutron networking, and Trove database services across Prod, Dev, and Sysadmin clusters
  • Manage Kubernetes clusters via Cluster API and kubeadm, including control plane operations, node lifecycle, and cluster upgrades
  • Manage and improve our inference platform: vLLM serving and AIBrix for multi-model orchestration and autoscaling across a fleet of NVIDIA GPUs
  • Build and maintain autoscaling at every layer: Cluster Autoscaler, HPA, and KEDA for event-driven workload scaling
  • Operate platform services: Kafka, Redis, PostgreSQL, OpenSearch, Prometheus
  • Own the observability stack: Grafana, Mimir, Tempo, Loki, Pyroscope, and OnCall, providing one pane of glass across all clusters
  • Manage GitOps deployments via ArgoCD and identity via Keycloak integrated with Google Workspace
  • Harden network security across private load balancers, firewalls, and VPC segmentation
  • Support training infrastructure: self-service VM provisioning, RunPod burst capacity, and Weights & Biases integration
  • Drive infrastructure reliability, cost efficiency, and capacity planning as the platform scales

What We're Looking For

  • 5+ years of hands-on infrastructure engineering experience in production environments
  • Extensive experience with OpenStack in production: Nova, Neutron, Cinder, Trove, Horizon, and CLI administration
  • Strong Kubernetes experience without managed control planes: Cluster API, kubeadm, self-managed clusters
  • Deep Linux proficiency: RHEL, Ubuntu, or equivalent, including kernel-level debugging and performance tuning
  • Experience with infrastructure-as-code and automation: Ansible, Terraform, or equivalent
  • Familiarity with GPU infrastructure: inference serving, vLLM, model orchestration, and cluster management
  • Solid understanding of GitOps workflows and tools like ArgoCD
  • Experience with observability: Prometheus, Grafana, distributed tracing, log aggregation
  • Strong networking fundamentals: VPCs, firewalls, load balancers, private cluster architecture
  • Comfort operating in a high-ownership environment where you make architecture decisions and move fast

Bonus points for:

  • Experience with KVM virtualization and storage backends like Ceph
  • Familiarity with vLLM internals: PagedAttention, continuous batching, tensor parallelism
  • Experience with KEDA or event-driven autoscaling patterns
  • Background in AI/ML infrastructure or GPU cluster operations at scale
  • Prior open-source contributions to OpenStack, Kubernetes, or adjacent projects