The old way of thinking about web systems? Nothing can go down, so we must build the ultimate cluster. The new way of thinking about third-party cloud services? Things go down, so we need to know how to deal with that.
“I could say it’s a problem with serverless, but it’s a problem with any service we use. Who’s taken time to get coffee while GitHub is down, or Maven, or NPM?” asked Zender.tv CEO Patrick Debois in his keynote at the Oracle Code San Francisco conference, colocated with JavaOne and Oracle OpenWorld 2017.
The savings that developers gain by not building critical services themselves come at a new cost: increased risk around availability. Not surprisingly for the man who coined the term DevOps, Debois has a knack for finding the trickiest behavioral problems in software development. Serverless architectures, and what he called “service-full” architectures, are his latest focus.
Ops Don’t Go Away In Serverless
Serverless computing, such as Oracle’s just-announced open source serverless functions platform, is a form of utility computing. The infrastructure is completely abstracted from the developer, who concentrates only on the execution of application logic. Surveying users of serverless architectures about their pain points, Debois found that “it’s monitoring, debugging, testing, latency, deployment, and so on, which is funny. When you ask people in serverless what’s their problem, it’s all those ops topics that you want to go away; you hit them again,” he said.
Debois described several gotchas his team found working with Amazon Web Services for live events and using its functions-as-a-service offering, Lambda: “The first time we used AWS Lambda, we built a meme generator for one of our customers, and when it got into production, we got errors about disk full. Like, what? Apparently, the disk gets reused and there’s a limit. Which is obviously in the documentation, but…something we hit.”
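One common way to avoid this kind of disk-full surprise is to remove scratch files before each invocation ends, so nothing accumulates if the runtime reuses the same disk. The sketch below is not Debois’s code. It assumes a Python function with the usual `handler(event, context)` shape; `render_meme`, the `text` event field, and the `scratch_dir` key in the response are hypothetical and exist only for illustration.

```python
import os
import tempfile


def render_meme(text: str, workdir: str) -> str:
    """Hypothetical stand-in for the real work: writes one scratch file."""
    path = os.path.join(workdir, "meme.txt")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(text.upper())
    return path


def handler(event, context=None):
    """Lambda-style entry point (the event shape is an assumption)."""
    # TemporaryDirectory deletes the directory and everything in it when the
    # with-block exits, so scratch files do not pile up across invocations.
    with tempfile.TemporaryDirectory() as workdir:
        path = render_meme(event.get("text", ""), workdir)
        with open(path, encoding="utf-8") as fh:
            body = fh.read()
    # scratch_dir is returned only so the cleanup can be checked afterward.
    return {"statusCode": 200, "body": body, "scratch_dir": workdir}
```

Calling `handler({"text": "hi"})` returns a body of `HI`, and the scratch directory no longer exists once the call returns.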
Other issues that Debois noted were performance lags due to cold starts of underlying idle machines, load balancing that didn’t scale fast enough, container version issues, and difficulty deciphering the APIs.
“Amazon services are API-centric, but not very user-friendly. There’s typically a layer of translation needed to make it easier to work with,” he said.
“All this reasoning about these dependencies reminded me of promise theory, championed by my friend Mark Burgess,” Debois explained, referencing Burgess’s book on the topic with Jan Bergstra. “In the ecosystem that we are working in, you are an agent. You make promises to others in the system, but a promise doesn’t guarantee the outcome.”
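In code, treating a dependency as a promise rather than a guarantee often means planning for the promise to be broken. The following is a minimal sketch of that idea, not something taken from the talk or from promise theory itself: `call_with_fallback` and its parameters are hypothetical names, and retry-then-fallback is only one of several possible strategies.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_fallback(
    primary: Callable[[], T],
    fallback: Callable[[], T],
    attempts: int = 3,
    delay_seconds: float = 0.0,
) -> T:
    """Try the primary dependency a few times, then use the fallback."""
    for attempt in range(attempts):
        try:
            return primary()
        except Exception:
            # The promise was not kept this time; pause before retrying,
            # but do not sleep after the final failed attempt.
            if attempt < attempts - 1:
                time.sleep(delay_seconds)
    # Every attempt failed, so serve the degraded answer instead.
    return fallback()
```

If `primary` raises on every attempt, the caller gets whatever `fallback` returns, for example a cached copy, instead of an unhandled error.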
8 Promises That Cloud Services Should Keep
After outlining the relationships and architectural complexities of promise theory, Debois shared principles that service providers should follow, such as:
- Communicate the status of your promise outward: “The number one lesson learned from the outage on Amazon was they didn’t have a status page.”
- Expose your own metrics: A graphics service had the guts to show “severe render errors—that’s very brave.”
- Show that you care about other agents: Expose your logs.
- Provide an API to back up external data: “What if your account is wiped? How do you reproduce your settings?”
- Provide and seek fast feedback on your change status.
- Do post-mortems: “If you had a real failure, man up and describe what happened.”
- Be proactive: Make others keep their promises.
- Share your roadmap: “Show that you listen to those who depend on you.”
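The first two principles, communicating status and exposing metrics, can be as simple as publishing a small machine-readable report. The sketch below is one illustrative shape for such a report, not a format Debois proposed: the field names, the state labels, and the 5% and 50% error-rate thresholds are all assumptions.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass
class StatusReport:
    service: str
    state: str         # "ok", "degraded", or "down" (labels are assumptions)
    error_rate: float  # fraction of failed requests in the reporting window
    updated_at: str    # ISO 8601 timestamp in UTC


def build_status(service: str, requests: int, errors: int) -> StatusReport:
    """Derive a public status from raw request and error counts."""
    rate = errors / requests if requests else 0.0
    # Thresholds are illustrative; a real service would choose its own.
    if rate >= 0.5:
        state = "down"
    elif rate >= 0.05:
        state = "degraded"
    else:
        state = "ok"
    now = datetime.now(timezone.utc).isoformat()
    return StatusReport(service, state, round(rate, 4), now)


def to_json(report: StatusReport) -> str:
    """Serialize the report for a status page or a metrics endpoint."""
    return json.dumps(asdict(report), sort_keys=True)
```

With these thresholds, `build_status("render", 200, 20)` reports the service as `degraded` with an error rate of 0.1.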
The Hardest Problems Are Human
As one of Debois’s many humorous slides put it, “I introduced DevOps and all I got was a remote API.” As a culture of serverless and service-full development processes supplants the face-to-face interactions that DevOps celebrates, the difficulties haven’t disappeared—they’ve just migrated elsewhere, Debois argued: “I don’t have any money invested in DevOps companies, so I’m OK.”
Which part is hardest? People. “It’s the human touch that we’re looking for in this,” he said, recommending talking at conferences as one of many ways to bridge the communication gap. “I learn a lot about the internal workings of a company when I listen to its people talk about their projects.”