Randstadeos
Senior SRE
SRE
Develop a telco-grade PaaS capability for Sky.
• Design, document, and implement a PaaS solution to onboard and integrate vendor-provided or requested applications with Sky’s telecommunications infrastructure.
• Take part in an on-call rota to action symptoms before they become outages.
• As a senior SRE, be responsible for the engineering and support of production environments, including automation of patches, upgrades, and reliability and performance improvements.
• Own the lab facilities for PaaS Dev & Test activities.
• Develop assurance, monitoring, and management capabilities for PaaS infrastructure using Zabbix, Prometheus, Grafana, and the ELK stack.
• Act as technical escalation point for colleagues within the team.
• Act as a day-to-day technical point of contact for engineers in other teams.
• Lead creation of automated reports for various services and PaaS infrastructure.
• Manage the operational playbook for the PaaS infrastructure and the services running within it.
• Automate dashboards and reporting for the platform against SLOs, SLAs and KPIs.
• Support managers with inputs on resourcing as needed.
• Monitor and manage Linux VMs, containers, and applications.
• Support and lifecycle management of various applications and services, including patching, upgrades, updates and troubleshooting.
• Plan and lead proactive disaster recovery testing.
• Work with suppliers to onboard their VNFs and CNFs.
• Experience working with public cloud, OpenStack, VMs, and Linux boxes (cloud and Linux experience is important).
• Strong background automating the configuration and management of large-scale platforms: Linux, Git, and at least one scripting language such as Python, Go, or Bash.
• Experience in database deployment and management (SQL, NoSQL), e.g. Couchbase, PostgreSQL.
• Linux system administration & configuration management, primarily with CentOS or Ubuntu.
• Experience of building and maintaining CI/CD pipelines.
• Experience with automation/orchestration with tools such as Ansible and Terraform.
• Knowledge of web servers such as nginx or Apache (experience with any web server is acceptable if the candidate has not used these).