Platform Operations Engineers

DRAKE INTERNATIONAL (SINGAPORE) LIMITED
Singapore | 4+ years | Posted Mar 31, 2026

Salary Range

SGD 60,000 - SGD 96,000/year

SGD 5,000 - SGD 8,000/month

Skills Required

Integration, Design, Vulnerability Scanning, Commission Statements, AWS, ICT Disaster Recovery Management, Full Stack Development, RPO, Access Control, Backup Policies, ECS, I-9 Compliance

Job Description

Key Responsibilities

A. Day 1 Scope: Platform Enablement (Design, Build, Validate, Commission) & Application Migration (3 applications)

A1. Platform Enablement

  • Infrastructure readiness & onboarding: Fulfil CStack Self-Hosted onboarding prerequisites; review and validate VPC, subnets, route tables, VPCE, CIDR allocation, trust relationships, and related configurations prior to stack launch.
  • Identity integration: Design and configure Microsoft Entra ID OpenID Connect (OIDC) integration for access to ArgoCD and application workloads, aligned to NYP security requirements.
  • AWS service provisioning via Infrastructure as Code (IaC): Deploy and configure non-CStack AWS services (e.g., RDS, S3, AWS Backup, SES/SQS where applicable) using the IaC deployment method (e.g., Terraform) and governance process.
  • Disaster Recovery (DR) design & implementation within GCC AWS environment: Design, build, and configure disaster recovery architecture for the CStack Self-Hosted platform and tenant workloads, including DR environment setup, data replication/synchronisation, backup and restore configurations, recovery procedures, and annual IT DR testing to meet company-defined RTO and RPO objectives.
  • Security & compliance engineering: Implement least-privilege IAM roles/policies and permission boundaries for workloads, pipelines, and supporting services; ensure alignment to CloudSCAPE remediation requirements and IM8 controls.
  • DevSecOps pipelines: Configure SHIP-HATS pipelines aligned with SGTS standards (build/test/security scans/quality gates/deployment validation), with auditable traceability and compliance with IM8.
  • GitOps enablement: Implement ArgoCD GitOps governance including repository structures, environment overlays (non-prod/prod), automated sync policies, drift detection, rollback approach, and RBAC for developer access.
  • Deployment standards: Create reusable Helm/Kustomize templates to standardise onboarding of future workloads and ensure pull-based, compliant deployments.
  • Observability: Deploy Prometheus, Grafana and alerting rules; integrate with approved monitoring/log aggregation platforms where required; ensure dashboards and alerts cover cluster/workload health, resource utilisation, deployment activity, and backup status.
  • Secrets & certificates: Implement ExternalSecrets (integrated with AWS Secrets Manager or SSM Parameter Store) and cert-manager for TLS certificate lifecycle automation; ensure ingress endpoints are secured via HTTPS using approved certificate authorities.
  • Security assurance: Ensure delivered configurations/components have no known critical vulnerabilities (e.g., OWASP Top Ten/CVEs) and support remediation of findings.
  • Documentation & handover: Produce clear English documentation (runbooks, operational guides, architecture/design artefacts) and deliver knowledge transfer sessions to enable the company to manage and maintain the platform.

A2. Application Migration (3 applications from one CStack to another)

  • Restructure and standardise Kubernetes manifests into GitOps conventions; establish environment overlays for staging and production.
  • Optimise CI/CD pipelines (SHIP-HATS) to support secure builds, testing, scanning and controlled releases.
  • Plan and execute secure data migration to Amazon RDS and the migration of secrets, DNS, and ingress resources under supervision, ensuring confidentiality and data integrity.
  • Deploy migrated applications to non-production and production environments; conduct deployment validation, smoke tests, and support UAT.
  • Define and execute cutover plans with clear rollback/fallback procedures to minimise service disruption.
  • Prepare operational documentation for migrated workloads (runbooks, troubleshooting guides, backup/restore procedures).
  • Collaborate with application teams on platform compatibility (e.g., container compatibility with CStack worker nodes) and performance stability under expected load.

B. Day 2 Operations & Maintenance (Year 1 firm, with option to renew for up to 4 years)

  • Operational support: Provide incident management and troubleshooting for the CStack Self-Hosted Platform hosted on GCC AWS; coordinate on platform-level incidents when required.
  • Service continuity: Ensure continuity of operations with minimal service disruption during maintenance and upgrade windows.
  • Routine operations: Perform scheduled health checks (logs, metrics, performance thresholds), patch management, and minor configuration changes.
  • Cloud security posture: Perform ongoing remediation of CloudSCAPE findings and support remediation of VAPT findings for tenant-managed components, at no additional cost where required by contract.
  • EKS upgrade lifecycle: Support CStack’s EKS upgrade cycles (including API deprecation review, manifest updates, functional testing, and post-upgrade validation).
  • Operational reporting: Produce monthly operational reports covering incidents, changes, risks, compliance posture, capacity/performance trends, and improvement actions.
  • Support alignment: Deliver support in accordance with agreed service levels and response/resolution expectations (including emergency services where activated).

Key Deliverables

  • End-to-end platform architecture package (HLD/LLD diagrams, network/VPC design, IAM model, DNS design, OIDC design, GitOps & CI/CD designs).
  • Implemented and validated CStack Self-Hosted tenant configurations and supporting AWS services in GCC AWS environment.
  • Configured SHIP-HATS pipelines with security scanning and auditable deployment traceability.
  • ArgoCD GitOps setup (repo structure, RBAC, sync policies, drift detection/rollback patterns).
  • Observability stack (Prometheus, Grafana dashboards, alert rules) and any required integration with NYP monitoring/log aggregation.
  • Backup and recovery implementation (Velero + AWS Backup), retention policies, restoration test evidence meeting RTO/RPO targets.
  • Secrets and certificate management implementation (ExternalSecrets + cert-manager) with secure ingress/TLS.
  • Migration artefacts for 3 applications (plans, updated manifests/templates, cutover/rollback plans, UAT support materials, runbooks).
  • VAPT/CloudSCAPE remediation evidence and tracking.
  • Knowledge transfer sessions and handover package.
  • Monthly operational reports during Year 1 operations.

Required Qualifications & Experience

  • Proven hands-on experience designing and operating Kubernetes platforms (preferably EKS) in production, including multi-environment (prod/non-prod) setups.
  • Strong AWS engineering skills across networking (VPC/subnets/routes/VPCE/CIDR), IAM (roles/policies/trust relationships/permission boundaries), and core services (RDS/PostgreSQL, S3, AWS Backup, KMS, SES/SQS where applicable).
  • Demonstrated experience implementing GitOps with ArgoCD (RBAC, repo strategy, sync policies, environment overlays, drift detection).
  • Demonstrated experience building secure CI/CD pipelines using SHIP-HATS including automated testing, security scanning, and deployment validation.
  • Experience in implementing platform observability (Prometheus/Grafana/Alertmanager or equivalent) and operational alerting practices.
  • Experience in implementing backup/restore for Kubernetes and AWS services; ability to plan and evidence recovery tests against RTO/RPO targets.
  • Experience with secrets management and certificate automation (ExternalSecrets, AWS Secrets Manager/SSM, cert-manager, TLS/PKI basics).
  • Experience in remediating cloud security findings (e.g., CloudSCAPE-like controls, vulnerability management, hardening) and supporting VAPT remediation.
  • Experience in migrating containerised applications between Kubernetes platforms, including data/secrets/DNS/ingress cutovers, rollback planning, and minimal-downtime execution.
  • Ability to produce clear, audit-friendly documentation and runbooks; experience in conducting knowledge transfer workshops.
  • Working knowledge of application stacks referenced in scope (e.g., NodeJS, ReactJS) and AWS services (such as RDS, S3, and EKS) to support migration troubleshooting.
  • At least 1 year of experience on the CStack platform (application deployment and operations support) and other SGTS products.

Mandatory Compliance / Eligibility Requirements

  • Location: All personnel must be based in Singapore.
  • Compliance: Ability to work in environments governed by IM8 and cloud governance requirements, including CloudSCAPE scanning standards and CStack security model.
  • AWS certification: Proposed Platform Operations Engineer must hold the AWS Certified Solutions Architect – Professional and AWS Certified DevOps Engineer – Professional certifications and list them in their CV.
  • Scrum Master certification: Proposed Business Analyst must hold a Certified Scrum Master certification and have experience in agile application modernisation projects.
  • Endpoint security: Must use own engineering machines capable of onboarding to SEED (Security Suite for Engineering Endpoint Devices).
  • Data handling: Maintain confidentiality of all data handled during implementation, migration, and testing; do not export production data outside environments unless explicitly authorised in writing; conduct data migration under supervision.
  • Security assurance: Ensure no known critical vulnerabilities in delivered configurations/components and support remediation of CloudSCAPE/VAPT findings as required.

Preferred / Nice-to-Have

  • Experience in implementing Kubernetes security controls (pod security standards, network policies, namespace isolation) and service-to-service communication controls.
  • Experience with IT service management practices (incident/problem/change) and producing structured monthly operational reporting.
  • Experience with application modernisation projects involving containerisation.

Success Measures (What “Good” Looks Like)

  • Platform delivered, validated, and commissioned by the required timeline, with clear audit-ready documentation.
  • GitOps and CI/CD workflows are operational, secure, and repeatable, with RBAC and traceability implemented.
  • Backup and restore processes are implemented and evidenced, meeting target RTO/RPO and retention/audit requirements.
  • CloudSCAPE and VAPT findings are remediated within agreed timelines, with clear evidence and tracking.
  • 3 applications migrated successfully with minimal disruption, successful UAT, and documented rollback plans.
  • Day 2 operations deliver stable service levels (availability and response/resolution expectations), with proactive reporting and continuous improvement.