SalaryPeak

Senior Site Reliability Engineer

ADVANCED E-SOLUTIONS PTE. LTD.
Singapore · 5+ years · Posted 4w ago

Salary Range

SGD 84,000 - SGD 102,000/year

SGD 7,000 - SGD 8,500/month

Skills Required

Splunk, Kubernetes, Incident Response, AWS, Load Balancing, Replication, Production Support, Release Engineering, Cloud Services, Capacity Planning, Databases, Application Deployment

Job Description

Must have

• Excellent hands-on experience in Core Java 1.8, Spring, Spring Boot, and Quartz
• Hands-on experience with messaging systems such as RabbitMQ and IBM MQ
• Strong working experience with Linux operating systems (Oracle Linux 7.6)
• Experience with application servers, preferably IBM WebSphere or Apache Tomcat 8.5.x
• Excellent and proven experience in Oracle SQL and PL/SQL


Good to have
• Experience with monitoring tools such as Tivoli and Splunk
• Prior experience in payments processing systems or the banking/financial services industry
• Experience with shell scripting
• Understanding of how to support large, complex, high-availability, high-volume applications
• Understanding of failover mechanisms and disaster recovery


POSITION OVERVIEW: Software Development Specialist


POSITION GENERAL DUTIES AND TASKS:
Role Summary
The Senior Site Reliability Engineer (L7) is a hands-on technical engineering role responsible for building, automating, scaling, and maintaining highly reliable, secure, and resilient cloud and hybrid infrastructure platforms.
The role focuses on infrastructure engineering, container orchestration, Infrastructure as Code (IaC), observability, incident response, and platform-level application development.
Key Responsibilities
• Good to have: Deploy, configure, and maintain AWS resources including EC2, ECS, EKS, VPC, IAM, NAT, and networking components.
• Good to have: Build secure and scalable cloud networking (VPCs, subnets, routing, VPN, firewalls).
• Work with load balancers, reverse proxies, API gateways, DNS management, and network routing.
• Build CI/CD pipelines using Jenkins, GitLab CI, or GitHub Actions.
• Support application releases and coordinate deployments across environments.
• Implement logging/monitoring using Prometheus, Grafana, Datadog, Splunk, or CloudWatch.
• Participate in incident response, troubleshooting, on-call rotation, and post-incident RCA.
• Perform system performance tuning, patching, capacity planning, and optimization.
• Improve system reliability through automation, redundancy, and engineering best practices.
• Implement and maintain IaC using Terraform or CloudFormation.
• Automate provisioning, configuration, and environment setup using scripting (Python, Bash, Go).
• Develop reusable automation modules, templates, pipelines, and cloud engineering patterns.
• Build, deploy, and manage containerized applications using Docker.
• Operate and optimize Kubernetes clusters (EKS or on-prem).
• Implement autoscaling, service mesh, pod security, and workload monitoring.
• Develop automation services, internal tooling, and platform utilities using Core Java, Spring Boot, Quartz, and Erlang.
• Build wrappers/services for IBM MQ and RabbitMQ messaging flows.
• Create schedulers, orchestration components, and internal microservices for operational tasks.
• Write integrations, connectors, and event-driven components for infra automation.
• Build custom alerts, webhook handlers, log processors, and reliability tooling.
Technical Skills:

Area: Technologies / Tools
Operating Systems & Virtualization: Enterprise Linux, VMware, OVM, x86 server clusters
Containerization & Orchestration: Kubernetes, Docker
Application Development (Platform): Core Java 1.8, Spring, Spring Boot, Quartz, Erlang
Messaging Platforms: IBM MQ, RabbitMQ, Erlang/Mnesia
IaC & Automation: Terraform, Ansible, CloudFormation, Chef
Scripting Languages: Python, Go, Bash
CI/CD Tooling: Jenkins, GitLab CI, GitHub Actions
Observability & Logging: Prometheus, Grafana, Datadog, Splunk
Databases & Storage: Oracle, HA DB clusters, NFS, HPE Nimble, DataDomain
Load Balancing & Networking: F5 LTM/ASM/ASR, DNS, network routing, proxies
File Transfer & Directory Services: GoAnywhere, Tivoli Directory Server
Cloud Platforms: AWS, Azure, GCP
Security Technologies: Hardware Security Modules (payShield or equivalent)

Experience Requirements
• 5+ years of experience as an SRE, DevOps Engineer, Cloud Engineer, or Platform Engineer.
• Strong hands-on expertise with AWS cloud services (EC2, ECS, EKS).
• Practical experience with IaC tools such as Terraform and CloudFormation.
• Deep working knowledge of Kubernetes, Docker, cloud networking, load balancers, and proxies.
• Hands-on experience with CI/CD pipelines, release engineering, observability tooling, and monitoring stacks.
• Experience supporting databases, including partitioning, replication, sharding, and high-availability setups.
• Prior involvement in incident response, production support, and reliability engineering practices.
Desirable
• Good understanding of infrastructure, F5, and networking.
• Knowledge of ISO 20022, ISO 8583, and SWIFT MT formats.
• Experience in shell scripting and Python.
• Experience within payments processing systems or the finance/banking industry.
• Experience in supporting applications using different languages and/or character sets.