Senior Applications Support Specialist

Remote - United Kingdom JR013381

Key Responsibilities

Incident & Problem Management

  • Lead major incident (MI) bridges and restore service with minimal business impact.
  • Handle all L3 escalations and perform deep diagnostics across Java, JVM, middleware, OS, and infrastructure.
  • Own technical RCAs and drive long‑term, systemic remediation.
  • Identify recurring failure patterns and risks.

Reliability Engineering

  • Apply SRE principles: SLIs/SLOs, error budgets, resilience patterns.
  • Tune JVM parameters, analyze thread/heap dumps, and improve performance.
  • Influence application architecture for fault tolerance, scalability, and recoverability.
  • Validate DR readiness, failover behavior, and resilience testing outcomes.

Change, Release & Risk

  • Provide technical approval and risk assessment for high-risk changes.
  • Enforce operational readiness for new apps and major releases.
  • Ensure changes meet audit, compliance, and regulatory expectations.

Automation, Monitoring & Observability

  • Build advanced automation using Shell/Python/PowerShell.
  • Develop frameworks for health validation, automated recovery, and compliance checks.
  • Define observability standards; optimize alerts and improve MTTR.

Leadership & Mentorship

  • Mentor L1/L2 teams; review and approve runbooks, SOPs, and KB articles.
  • Act as a trusted technical advisor to stakeholders and leadership.

Skills & Qualifications

Technical (Mandatory)

  • Strong knowledge of application architecture, distributed systems, and middleware.
  • Java expertise: JVM internals, GC, memory management, thread/heap dump analysis, performance tuning.
  • .NET expertise: CLR internals, garbage collection, memory management, thread/dump analysis, and application performance tuning.
  • Strong Unix/Linux, networking basics, and advanced scripting (Shell/Python/PowerShell/VBS).
  • Advanced SQL and understanding of databases; Autosys (or equivalent scheduler).
  • Hands-on experience with observability tools: Splunk, AppDynamics/Dynatrace, ELK, Grafana, Prometheus.

Reliability & Operations

  • Major incident leadership, deep RCA, change/release readiness, DR & resilience engineering.
  • Experience in regulated production environments.

Soft Skills

  • Strong technical leadership and decision‑making.
  • Clear communication during high‑pressure incidents.
  • Ownership mindset and business awareness.

Experience & Education

  • 7–12+ years in Application Reliability, Production Support, SRE, or platform operations.
  • Bachelor’s degree in Computer Science/Engineering or equivalent.
  • ITIL, cloud, or industry certifications (preferred).
  • Banking/financial domain experience (preferred).

Working Conditions

  • On‑call and after‑hours support as required.
  • Fast‑paced environment with multiple priorities.
  • Hybrid working model.

More career opportunities at Ensono

Explore additional openings with our team, and apply today.