Cloud Native Live: Designing and Operating Reliable Cloud Services
Watch Now

We care about reliability, even when nothing is down

Reliability.org is a community for people interested in achieving better software application and infrastructure reliability through design, development, and operations.

Join us if you are passionate about:
Discussing reliability practices, research, & frameworks
Experimenting and sharing experiences that drive improvement
Connecting with other reliability-minded people

CNCF Live Stream Event

Thu, Mar 30, 10:00 AM (PDT)

Designing and Operating Reliable Cloud Services – A View from the Trenches

Watch the Replay
It's no secret

Delivering reliability comes at a cost

Superior customer experience is a strategic advantage that can be damaged by unexpected outages. But ensuring reliability takes a lot of hard work.

42%

say 'decreasing costs' is a top leadership priority for infrastructure reliability

2,084

hours spent by engineering to diagnose and repair incidents in the past year

72%

of engineers see repetitive ops tasks as a necessary evil or the worst part of their job

Quotes

What industry leaders have to say about reliability

"Reliability is a key concern when it comes to the cloud. Businesses need to be able to trust that their data and applications will be available when they need them."

John Roese

"Automating incident response enables organizations to detect and resolve issues faster, reducing the mean time to resolution and minimizing the impact of incidents on the business."

Gene Kim

"The biggest challenge facing cloud computing is figuring out how to make it more reliable and predictable."

Werner Vogels

"Cloud services are becoming more reliable, but they're not immune to outages. The key to achieving reliability in the cloud is to build in redundancy and to have a clear incident response plan."

Gartner

Let's tackle challenges we share

Great reads on reliability

The book heuristic: If you've enjoyed books like these, there's a 90% chance you've found the right community 🙂

Count me in!

Site Reliability Engineering

Reliability and Availability of Cloud Computing

The Site Reliability Workbook: Practical Ways to Implement SRE

The Phoenix Project: A Novel About IT, DevOps, and Helping Your Business Win

The DevOps Handbook: How to Create World-Class Agility, Reliability, & Security in Technology Organizations

Implementing Service Level Objectives

DevSecOps: A leader's guide to producing secure software without compromising flow, feedback and continuous improvement

Team Topologies: Organizing Business and Technology Teams for Fast Flow

The Lean Startup

COME ALONG FOR THE RIDE

Community Roadmap

🚀

Launch the community

Bring together like-minded folks to explore software service reliability

📚

Curate a content library

A repository for Eng, DevOps, and SRE research, tools, and content

👥

Community-built best practices

Develop open-source resources for delivering world-class reliability

📅

Community-led events & services

Create opportunities to connect practitioners and organizations

Get Involved

Join the conversation

Discover how others are improving reliability in their own organizations

Join Our Slack Community