Incident Manager - Digital Health Products

Pune, MH

Job Description

At Roche you can show up as yourself, embraced for the unique qualities you bring. Our culture encourages personal expression, open dialogue, and genuine connections, where you are valued, accepted and respected for who you are, allowing you to thrive both personally and professionally. This is how we aim to prevent, stop and cure diseases and ensure everyone has access to healthcare today and for generations to come. Join Roche, where every voice matters.

The Position

Who We Are

At Roche, we are passionate about transforming patients’ lives, and we are bold in both decision and action - we believe that good business means a better world. That is why we come to work every single day. We commit ourselves to scientific rigor, unassailable ethics and access to medical innovations for all. We do this today to build a better tomorrow.

Roche is strongly committed to a diverse and inclusive workplace. We strive to build teams that represent a range of backgrounds, perspectives and skills. Embracing diversity enables us to create a great place to work and to innovate for patients.

Step into the Future of IT with Roche!

As a seasoned Site Reliability Engineer (SRE) Incident Manager at Roche, you will leverage your deep software engineering expertise to propel our products to new heights of robustness, scalability and reliability. You’ll be at the helm of incident management, orchestrating swift and effective responses to ensure our digital infrastructure remains available, performant, and secure. This isn't just a role; it's an invitation to shape the backbone of technological innovation.

Your Mission

Design and maintain cutting-edge tools, scripts and frameworks that automate repetitive tasks, streamline software deployment and manage expansive systems with unparalleled efficiency.

Partner closely with forward-thinking development teams to architect and implement high-performance solutions that elevate system efficiency, optimize resource utilization and enhance deployment processes for superior uptime and user satisfaction.

Your Impact

Lead the charge in incident management and response. Detect system anomalies, troubleshoot swiftly and conduct thorough root cause analyses to prevent recurring issues.

Champion continuous improvement by refining monitoring and alerting mechanisms, conducting insightful post-incident reviews and embedding best practices in software lifecycle management. Your strategic foresight and meticulous planning will ensure our systems are not only reliable but also highly performant.

By joining our elite team, you will play a pivotal role in delivering seamless experiences to our end-users, exceeding business and customer demands, and solidifying Roche's reputation as a leader in IT innovation.

Your Core Responsibilities

  • Lead the lifecycle of incidents from initial detection to successful resolution and post-incident review.

  • Coordinate and guide response teams to resolve incidents quickly and efficiently.

  • Develop, implement, and refine incident management processes and procedures.

  • Conduct regular reviews to identify root causes and drive the implementation of corrective actions.

  • Work on call outside normal working hours and on weekends, as scheduled, to ensure continuous support.

  • Oversee the onboarding of new products into our support ecosystem.

  • Provide consistent and clear updates to technical teams, business users, and management.

  • Partner with engineering, cybersecurity, DevOps, product managers, test engineers, support teams, and administrators to enhance system reliability.

  • Identify and implement opportunities to enhance incident management processes and tools.

  • Align SRE activities with product release planning to streamline product adoption.

  • Actively contribute to the growth and development of the SRE team's capabilities, nurturing a stronger, more inclusive, and resilient team.

Who You Are

  • Minimum Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent professional experience.

  • Experience in site reliability engineering, software engineering, or a related field, including production on-call experience.

  • Deep understanding of incident prioritization, escalation processes, and service level management (SLA/SLO/SLI).

  • Proven ability to engage, influence, and build relationships with stakeholders across various levels.

  • Familiarity with AWS and/or Azure, including setting up, monitoring, and maintaining cloud resources (including knowledge of Kubernetes and managed services such as EKS, AKS, and GKE).

  • Proficiency with observability tools.

  • Self-motivated to improve system reliability and operational efficiencies.

  • Hands-on experience with incident management tools.

  • Demonstrated troubleshooting capabilities, especially in cloud and distributed system environments.

  • Experience driving war rooms for P1 incidents and leading or participating in blameless postmortems.

  • Excellent communication, teamwork and documentation skills.

  • We value and encourage candidates from diverse backgrounds and experiences, believing that diverse perspectives drive innovation and success.

  • Excellent spoken and written English communication skills.

Why Join Us?

By joining our team, you will be part of a dynamic environment where your contributions will directly impact the resilience and reliability of our services. You will have opportunities for professional growth and the ability to collaborate with industry leaders. Let’s drive the future of IT stability together, ensuring an exceptional experience for our customers.

Ready to make a difference? Apply now to be our next SRE Incident Manager and help us build a more reliable future!

Who we are

A healthier future drives us to innovate. Together, more than 100’000 employees across the globe are dedicated to advancing science, ensuring everyone has access to healthcare today and for generations to come. Our efforts result in more than 26 million people treated with our medicines and over 30 billion tests conducted using our Diagnostics products. We empower each other to explore new possibilities, foster creativity, and keep our ambitions high, so we can deliver life-changing healthcare solutions that make a global impact.


Let’s build a healthier future, together.

Roche is an Equal Opportunity Employer.