Embedding Resilience Beyond Checkboxes: My DevOpsCon NYC 2024 Presentation
Today, I had the privilege of presenting at DevOpsCon NYC 2024 on a topic close to my heart: “Beyond Checkboxes: Embedding Resilience.” As the field of DevOps evolves, it’s critical to shift from superficial metrics to truly embedding resilience into every layer of our systems. My presentation focused on how we can think differently, learn from historical approaches, and adopt resilient frameworks that are built to withstand the challenges of modern infrastructures.
Key Highlights:
1. Rethinking Technical Debt
I explored the concept of technical debt, as first coined by Ward Cunningham, and how it has evolved over the decades. From the 1980s' focus on automated operations to the 2020s' shift toward agentic infrastructure, our industry’s relationship with debt and resilience has profoundly changed. I emphasized the importance of addressing this debt with foresight to avoid creating fragile systems.
2. Learning from History
Drawing on lessons from historical figures like Abraham Wald, who improved aircraft survivability in WWII by focusing on what was missing from the data (the damage on planes that didn’t return, rather than the bullet holes on those that did), I urged the audience to adopt a similar mindset in assessing and strengthening their systems. Often, what we overlook or don’t measure can be just as critical as what we do.
3. Deming's Profound Knowledge
I introduced W. Edwards Deming’s System of Profound Knowledge as a lens for understanding complexity. This includes four elements: appreciation for a system, knowledge about variation, theory of knowledge, and psychology. These elements are crucial for fostering sustainable change in organizations, particularly as we move into the AI-driven future.
4. AI and the Future of DevOps
I discussed the increasing role of AI in shaping DevOps practices. However, I warned that while AI offers numerous benefits, it also comes with risks, including shadow AI, technical debt, and new cybersecurity threats. I highlighted how leaders must adopt an AI vision that prioritizes scalability and maintainability to ensure long-term success.
5. Operational Definitions and Resilience
A significant theme was the need for clear operational definitions, particularly in AI and DevOps. Consistent definitions lead to better communication and fewer systemic weaknesses. By aligning on definitions across teams and adopting continuous improvement, we can enhance the resilience of our systems.
The future of DevOps is both exciting and challenging, especially with the rise of AI. It’s essential that we embed resilience into our systems—not just check off boxes. I’m thrilled to have shared these insights and look forward to continuing the conversation with the DevOps community.
Feel free to connect with me to discuss these topics further!
#DevOps #Resilience #TechnicalDebt #AI #DevOpsCon2024 #JohnWillis #WEdwardsDeming #SystemsThinking #Leadership
Deming Updates
Muhammad Hanif writes on Dr. W. Edwards Deming's System of Profound Knowledge and how it can provide a holistic approach for improving organizational effectiveness.
https://www.linkedin.com/feed/update/urn:li:activity:7249823778677940228/
Dr. Marcell Vollmer covers 8 Japanese techniques to stop overthinking.
https://www.linkedin.com/feed/update/urn:li:activity:7248185170837614592/
Dave Mangot critiques the unrealistic goal of zero defects in software development, and advocates for a focus on quality engineering instead.
https://www.linkedin.com/feed/update/urn:li:activity:7249831297693450241/
Christopher Chapman notes updates on the Taguchi Loss Function Analyzer app, showcasing how different loss functions can be used to fit Red Bead Experiment data.
https://www.linkedin.com/feed/update/urn:li:activity:7249050919168811008/
Daniel Prager highlights Dr. Deming's view that a teacher's role is to foster a love for learning, provide deep subject knowledge, and uncover each individual's unique learning process.
https://www.linkedin.com/feed/update/urn:li:activity:7248607807279640576/
Mike Rother emphasizes the importance of practicing scientific thinking to achieve success through learning and adaptiveness by using Toyota Kata.
https://www.linkedin.com/feed/update/urn:li:activity:7248706957769908224/
Carlos Jiménez contrasts the challenges of implementing "LEAN 1.0" focused on broad training with the success of "LEAN 2.0", which emphasized a targeted approach using Lean Champions, daily practice, and management commitment.
https://www.linkedin.com/feed/update/urn:li:activity:7247771758303264768/
Paul Dakin argues that poor management, rooted in outdated scientific management theories, is at the core of the productivity problem, and advocates for adopting Deming's principles.
https://www.linkedin.com/feed/update/urn:li:activity:7245407071112687616/