Live Blogging Developer Week, 2020, Day 1
Keynote: LaunchDarkly, What Developers Need to Know, 6:00 Pacific
Let’s talk feature flags and testing in production!
The recent Iowa Caucus debacle made for a good segue into the subject of production software testing. TC Currie introduced the topic that way and talked about testing in production and feature flags, and then started off the conversation by asking Edith about the same.
Edith paused for a brief nod to Oakland, where LaunchDarkly and this conference are both based, and then talked about something I can easily relate to: long periods of development, followed by long periods of testing, and then deployments. Those were the bad old days. And they featured a lot of misalignment between testing and production realities.
Feature flags combat this.
The simplest concept is that you push code into production guarded by a conditional. You can then, via configuration, turn the bits on or off on the fly without deploying additional code.
This confers the powerful ability to segment users. You could, for instance, turn a new feature on for only 5 people. Or maybe you only want to turn it on for users in a certain state or country.
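To make the guarded-conditional idea concrete, here is a minimal sketch in Python. This is not LaunchDarkly's SDK or anything shown in the keynote; the `FlagConfig` shape, the `is_enabled` helper, and the `new-checkout` flag key are all names I made up for illustration. It shows a master switch, targeting by user or region, and a percentage rollout.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class FlagConfig:
    """Hypothetical flag configuration, as it might arrive from a config service."""
    enabled: bool = False                               # master switch (doubles as a kill switch)
    allowed_users: set = field(default_factory=set)     # explicit allow-list, e.g. 5 people
    allowed_regions: set = field(default_factory=set)   # e.g. a state or country
    rollout_percent: int = 0                            # 0-100, share of everyone else


def bucket(flag_key: str, user_id: str) -> int:
    """Map a user to a stable bucket in [0, 100) so a rollout stays consistent."""
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag_key: str, config: FlagConfig, user_id: str, region: str) -> bool:
    if not config.enabled:
        return False
    if user_id in config.allowed_users or region in config.allowed_regions:
        return True
    return bucket(flag_key, user_id) < config.rollout_percent


def checkout(user_id: str, region: str, config: FlagConfig) -> str:
    # The new code path is deployed, but stays dark until configuration says otherwise.
    if is_enabled("new-checkout", config, user_id, region):
        return "new checkout flow"
    return "old checkout flow"
```

Flipping `enabled` to `False` in configuration sends everyone back down the old path without a deploy, which is the whole point.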
Testing in Production
This conversation led to a natural question about testing in production. Obviously, you can’t avoid testing in production, but you also don’t need to do all of your testing in production, either. What’s the right balance?
Edith actually wrote an article once about killing your staging server, admittedly as a bit of hyperbole. But the idea is that the staging server can provide a false sense of security that nothing is wrong.
Feature flagging helps you manage production complexity. In a world with all kinds of dependent microservices, you need the ability to switch these things on and off at will, which is what feature flags allow. This gives you a level of control in production for situations that you can’t adequately prepare for ahead of time.
Feature flags provide a failsafe or a killswitch. Edith refers to this as “minimizing the blast radius.”
Avoiding Deployment Stress
Another interesting benefit of feature flagging is avoiding what Edith calls “push and prayer releases.” She describes a situation where a West Coast developer, giant cup of coffee in hand, is deploying something at 4:47 AM and hoping for the best.
Feature flags really ramp down the risk in such a situation. If something goes wrong, you don’t do frantic support or rollbacks. Instead, you just turn off the feature.
Setting Up Tests in Production
From here, TC segued to a question about what developers need to know about setting up for testing in production.
Edith answered that, first of all, it’s not an either/or proposition. You should still have your pre-production validations and such. But now you’re building in production safety valves to deploy more confidently.
The idea is that your development process now involves creating failsafes ahead of time, as well as pre-thinking your rollout strategy. So this gives you a more intuitive understanding of options in production, meaning that you can run experiments more easily and with less risk.
Feature Flag Development Tips/Tricks
TC asked an interesting wrap-up question about how developers can implement feature flags successfully in their own applications. Here were Edith’s tips:
- Think about boundaries. Why are you segmenting your users?
- Have a good naming strategy.
- Have a good strategy for your defaults, agree on it, and document. Does “false” correspond to “on” or “off”?
- Have a feature flag removal strategy, since feature flags are technical debt.
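As a rough sketch of how the naming, defaults, and removal tips could live in code, here is one way to do it in Python. Everything here is an assumption on my part rather than something Edith showed: the `FlagDefinition` fields, the `<team>.<feature>` naming convention, and the idea of checking removal dates in a build step.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FlagDefinition:
    key: str          # naming convention assumed here: <team>.<feature>
    default: bool     # documented value to use when no live value is available
    owner: str        # who to ask before deleting the flag
    remove_by: date   # date by which the flag should be gone from the code


# Hypothetical registry; the entries are invented for illustration.
REGISTRY = [
    FlagDefinition("payments.new-checkout", default=False, owner="payments", remove_by=date(2020, 6, 1)),
    FlagDefinition("search.typeahead", default=True, owner="search", remove_by=date(2020, 3, 1)),
]


def resolve(key: str, live_values: dict) -> bool:
    """Return the live value if there is one, otherwise the documented default."""
    definition = next(d for d in REGISTRY if d.key == key)
    return live_values.get(key, definition.default)


def overdue_flags(today: date) -> list:
    """List flags past their removal date, e.g. so a build check can complain."""
    return [d.key for d in REGISTRY if d.remove_by < today]
```

Keeping the default and the removal date next to the flag's name means the "agree on it and document" step happens in the same place developers already look.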
And, that was it — an interesting talk about testing in production and how feature flags help with that.
How to Build an Enterprise-Wide API Strategy, 5:00 Pacific
Unfortunately, I came in just a little late. I snuck out between sessions to wolf down some food because I was starving, and I missed a little bit, despite my best efforts.
But I’m very interested in this topic, so I hustled down.
APIs: Current State — Lots of Opportunities
When I came in, Iddo was talking about three buckets of API:
- Private APIs
- Partner APIs
- Public APIs
And he was also talking about the lifecycle and proliferation of APIs in general. The gist, as I understand it, is that companies keep these things around and tend to layer them on top of each other, making it increasingly hard to discover them.
APIs also have a significant commercial angle. Consider:
These companies are all generating at least 50% of their revenue via their APIs, with Expedia generating 90% this way. This underscores the importance of the partner API.
But There Are Also Challenges
We’re also seeing more companies exposing public APIs. That figure has increased exponentially over the last 15 years.
This happens because it significantly increases developer productivity and accelerates development. Developers can leverage platforms that they don’t have time or resources to develop themselves.
But this creates a lot of challenges. Integrating APIs is hard. Developers have to learn, in a sense, almost a new language every time they want to use one. It also creates an explosion in runtime dependencies and exposure to risk over which you have no control, including concerns like compliance (e.g. GDPR).
Creating an API Strategy
So how, then, do you create a broad strategy? There are 5 components to a successful API strategy.
- Executive support. While developer adoption is critical, executive support is the catalyst.
- Organization structure. Who approves APIs to be used and published? Who defines API standards? Broad standards, or up to each individual team? Companies need to define who owns what, from an org chart perspective.
- API platforming. You need a standard place in the organization that developers can go and discover APIs — a marketplace, so to speak. At Rapid, they describe this as the “API Hub.” The idea is to have a standard place for developers to self-service publish and consume.
- Peripheral tooling. Consider things like API design and documentation. Developers can use tools like Postman and Swagger to make their lives easier with respect to API creation. Testing and monitoring are also important peripheral concerns.
- Education and awareness. It’s important to think of API development as a skill, so having programs in place to provide education around development, testing, etc. of APIs is important.
RapidAPI is a solution that provides API marketplaces. It’s available as a white-label service within enterprises, but also serves as a first-class way to search for public APIs.
Iddo took us through a demo of RapidAPI, discovering and making use of an API in real time in front of us. He was actually able to generate a language-specific code snippet to get going very quickly with consumption.
This one is pretty easy, as far as I’m concerned.
One of the things that I enjoy the least when working on our internal software for Hit Subscribe is consuming external APIs. It’s always an unpleasant hodgepodge of reading documentation, trial and error, debugging authentication issues in Fiddler, etc. It’s hours of my life wasted.
So, anything that makes this easier is a win in my column. The next time I have to consume a new API, I’m going to head here, check out the marketplace, and see what kind of boost I get to my productivity. I see no downside to trying this out.
An Introduction to Microservices with the Serverless Framework, 4:00 Pacific
I’ll cut him some slack, as long as I’m one of the people he stops the talk to help.
Anyway, he outlined 4 different goals:
- Develop/test a microservice
- Deploy to AWS
- Debug the service
- Monitor the service
Let’s do it. How would this go? He has material for people afterward.
- Create an AWS account
- Install the AWS CLI.
- Configure the AWS CLI
- Install the Serverless Framework: binary installer for Mac/Linux, Chocolatey on Windows, NPM.
- Clone the project.
All of this will get to the point of building his toy application, “Serverless Jams.” I like where this is going.
Serverless Jams allows you to add a phone number, receive a voting code text, and then vote on songs. The idea is to exercise a number of different APIs.
We’re going to do this with Amazon API Gateway, Lambda, a DynamoDB table, and Simple Notification Service.
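To give a feel for how those pieces fit together, here is a rough Python sketch of the two back-end functions an app like this might have. To be clear, this is not Fernando's code. The function names and payload fields are my own guesses, and plain in-memory dictionaries stand in for the DynamoDB table and the SNS text message so that the sketch runs anywhere. The only thing taken from AWS is the general shape of a Lambda handler and the `statusCode`/`body` response that API Gateway's proxy integration expects.

```python
import json
import secrets

# In-memory stand-ins (assumptions): in the real architecture these would be
# a DynamoDB table and text messages published through SNS.
CODES = {}        # phone number -> voting code
VOTES = {}        # song name -> vote count
SENT_TEXTS = []   # (phone number, message) pairs that would go out as texts


def _response(status: int, body: dict) -> dict:
    """Build a response in the shape used by API Gateway's Lambda proxy integration."""
    return {"statusCode": status, "body": json.dumps(body)}


def request_code(event, context):
    """Generate a six-digit voting code for a phone number and 'text' it."""
    phone = json.loads(event["body"])["phone"]
    code = f"{secrets.randbelow(10**6):06d}"
    CODES[phone] = code
    SENT_TEXTS.append((phone, f"Your voting code is {code}"))
    return _response(200, {"message": "code sent"})


def vote(event, context):
    """Record a vote only if the submitted code matches the one on file."""
    body = json.loads(event["body"])
    if CODES.get(body["phone"]) != body["code"]:
        return _response(403, {"message": "invalid code"})
    VOTES[body["song"]] = VOTES.get(body["song"], 0) + 1
    return _response(200, {"song": body["song"], "votes": VOTES[body["song"]]})
```

Swapping the dictionaries for real AWS calls is where the permissions in the configuration file come into play.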
What does the code look like?
Simple enough. Now, on the front-end:
And then serverless.yaml, which wires things up from a configuration perspective. This is where the magic happens, and Fernando walked us through all of the information that needs to go in here to create the right sets of permissions.
I know only enough about this to shoot myself in the foot, but he’s making me feel like I’d be able to handle it in an afternoon.
In furtherance of this, Fernando asked how many lines of code all of this would take, combined with the front and back end?
The answer? 436 lines of code, “with me being a pretty poor front end developer.”
(I’m with you there, Fernando, but I think if we switched to a language without significant whitespace on the backend, we could probably slam all of it into 50 lines, tops)
With the structure and code squared away, he mentioned 3 potential deployment options.
- Local AWS Keys (with 2 different sub-options)
- Using the Serverless Dashboard, assuming you’ve created an account.
- Serverless CI/CD. (He won’t touch on this today, but this is a good approach for teams that have established a serverless approach)
And, with that, he popped open an IDE and started working. Then it was time to sign up for serverless and get into the dashboard. He did this all pretty quickly, but it honestly looked really straightforward.
After development, configuration and deployment, it was time to test and debug. The first issue was that he mistyped his phone number, and the second was that he’d already used his number for another demo. So this led to the actually-even-cooler situation where someone in the audience gave a phone number, received the code, and entered it.
At the end, perhaps not surprisingly, everything worked. This left a little time for a deeper dive, and the audience voted to take a more detailed look at the front end.
In the end, enthusiastic applause.
This was a lot of fun to watch and well-presented. And there’s something infectious about watching a talk that’s basically, “look, this cool thing is totally more doable than you might have thought.”
My takeaway here is that technologies like Lambdas make glue code much more approachable. But that’s been true for a while, and I was aware of that.
But now, looking at Serverless as an orchestration framework, for lack of a better term, I have the sense that I can put together some very well integrated functionality very quickly. I have, perhaps, a false sense of confidence that I could do this in an afternoon. But in reality, I can’t imagine it’d take terribly longer.
If I put on my “old man programmer” hat, I can recall a time when building an app that sent me authentication codes via text would have required weeks, lots of planning and integration, and much pain. And it’d probably have been expensive.
Now, a lot of cool things are possible. Look for me to auto-spam you all text messages sometime soon.
Interlude, 3:50 Pacific
One of the things that I’m interested in doing, longer term, is to see whether sponsors, speakers, or conference organizers would be interested in live blogging as a value-add in some capacity. In order to evaluate that, I’m just running the experiment of doing this, journalism-style.
One thing that I’m learning is that documenting a bunch of talks in a row is kind of tiring, in a weird way. It’s almost like a mental version of the soreness you get from doing a weird form of exercise that stretches a muscle you barely knew you had. “I didn’t even know it was possible to be sore there.”
Anyway, this is more mentally tiring than I would have thought.
Building Highly Scalable Applications, 3:00 Pacific
This is a talk by Mark Piller, CEO/founder of Backendless.
We start off here with the premise that scalability is hard and that “let’s throw more servers at it” is a recipe for failure. So it must not be an afterthought; it must be baked in from the beginning.
So think of scalability as a part of every single component of your architecture. And with that in mind, this is going to be a walk through Mark’s experience.
These are rules that you need to think about before adding additional servers. You need to squeeze everything you can out of existing resources:
- Avoid blocking execution — you don’t want to create bottlenecks like this.
- Avoid centralization — in general, centralized information or decision-points scale poorly.
- Foster easy replication — it should always be easy to add additional servers, whatever kind of servers they are.
- Stateless programming model — it’s easier to scale up without transient information to track.
- Pagination — whenever you work with a database, it’ll grow, so a “SELECT * FROM…” approach isn’t going to cut it.
- Test and validate before committing to any component of your tech stack — load testing is key.
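The pagination rule is the easiest of these to show in code. Here is a minimal sketch using Python's built-in `sqlite3` module. It is not how Backendless does it; the `users` table, the `fetch_page` helper, and the page size are assumptions for illustration. It uses keyset pagination (filtering on the last seen id) rather than pulling the whole table.

```python
import sqlite3


def fetch_page(conn, after_id=0, page_size=3):
    """Return one page of rows plus the cursor to pass in for the next page.

    The cursor is None when this page came back short, meaning there is
    nothing more to fetch.
    """
    rows = conn.execute(
        "SELECT id, name FROM users WHERE id > ? ORDER BY id LIMIT ?",
        (after_id, page_size),
    ).fetchall()
    next_cursor = rows[-1][0] if len(rows) == page_size else None
    return rows, next_cursor


# Throwaway in-memory database with seven made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [(f"user{i}",) for i in range(1, 8)])

# Walk the table one bounded page at a time instead of SELECT * on everything.
cursor = 0
pages = []
while cursor is not None:
    rows, cursor = fetch_page(conn, after_id=cursor)
    pages.append(rows)
```

Each query touches at most `page_size` rows no matter how large the table gets, which is the property you want as the data grows.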
I’m not in a position to snap pictures of the slides, but he posted a diagram of their architecture. Slides will be live in a couple of weeks, so you can check that out when they are. It’s worth a look.
He talked about how this reference architecture allows for easy horizontal scale.
Brief Summary/Rest of Talk
At this point, I’m going to take a break from faithful and time-consuming note-taking/blogging to take in a bit of the talk. 2.5 consecutive hours of live blogging makes me want to come up for air.
So I’m going to enjoy the talk more passively and document just an overview. Here’s the gist of what he talked about:
- Database: a lot of good detail on how Backendless approaches database scalability.
- Redis: they use it for caching, job/task queues, atomic counters and synchronization.
- File System: they tried a lot of different approaches but settled on GlusterFS.
- Caching: they prefer Ehcache.
- Async Job Processing (mass emails, push notifications, etc): they use a Redis queue.
- Code best practices: avoid using synchronized (Java) and favor the Future API.
- Kubernetes: lets you scale fast by just adding pods to workers.
- Monitoring: if you expect failure, you can learn to react to signs of failure and prevent it.
In some senses, this was a pearls before swine situation. Meaning, I find the subject of massive scale to be theoretically interesting to consider, but it doesn’t have a lot of application to a guy who earns a living running a content marketing company and occasionally updating our internal line of business software.
So to me, this talk was conceptually interesting, but it also served as a way for me to index technologies that I should check out, should the need arise. In other words, here is a company successfully implementing these principles in the trenches. And Mark shared both which technologies had worked for them, and which hadn’t.
So, if I suddenly found myself, Quantum-Leap-Style, as the CIO of a company looking for scale, I’d have a first place to go to guide my research. I could review the techs that he’d mentioned, along with why some worked and some didn’t, and have good guidance for where to start.
(This talk generated a LOT of questions/discussion, part of which was probably because it was a packed house. But I think this is top of mind for a lot of folks.)
Good Rules for Bad Apps, 2:00 Pacific
This is a talk by Shem Magnezi, engineer at WeWork. It’s a series of rules that you can follow for building apps that suck.
He’s apparently given this talk a lot, and it shows. The slides are fun/engaging. I’m currently looking at one that has Christian Bale’s batman confronting Heath Ledger’s Joker, and life is good.
He wants to clarify, upfront, that he’s talking about “bad” apps not in the colloquial sense of “bad as good,” but in the sense of bad, making you miserable.
He also mentioned that he used to talk a lot about how to build good apps, but that it could be more helpful to think in terms of how you ruin apps (and how to avoid that).
So, how does one ruin an application?
1. Ask for As Many Permissions as Possible
Consider a flashlight. One screen. Single button to turn the flashlight on and off. Simple stuff.
So it needs to be able to take and record video, right? I mean, obviously. Sure, you may not need these permissions now, but who knows what the future holds?
And then, it probably goes without saying that you need to prevent the user from using the app altogether if they don’t agree.
2. Don’t Communicate Anything
Alright, now let’s say that we’re building a reminder application. What do I have to do over the next week?
When it loads, you see nothing there. Great! Nothing to do. Or, wait… is it just that your reminders haven’t loaded? Do you not have a connection?
A good way to make a bummer of a user experience is not to communicate anything about its current state or what’s happening behind the scenes.
3. Don’t Save Screen State
Let’s move on to a less trivial app: buying a few books through an eCommerce app.
Add the books to the cart, go to checkout, and then, well, fill in a lot of information. First name, last name… you get the idea. Then you need, of course, your credit card information.
So, as you reach into your wallet for your credit card, the screen rotates accidentally. Oops, everything is all gone. For some reason, that triggers a refresh of the form.
There’s no better way to create a maddening experience than forcing you to fill all of that out again for no good reason.
4. Don’t Optimize App Size
You’re looking through the app store and you decide to install an app. You go into the store, find some well-reviewed, heavily downloaded app and you get ready to go.
But then, wait a second. Why is this app 70 MB? Yikes! What if you’re somewhere with a bad signal or you don’t have time to wait?
So you skip it, do something else, and later wonder why the download is so large. Then, maybe, you dig into it and realize that they’re packaging in all kinds of images of different sizes and iterations, perhaps for features that you’re not even going to use.
But then maybe you dig in further and find that there’s a huge file containing all sorts of phone numbers in different countries. You probably don’t need all of those.
And maybe, this continues with a lot of different examples, all of which combine to add up to a lot of unnecessary data coming along with each download. This is an experience that Shem has had, and it stopped him from downloading an app when he could have used it, which, obviously, is bad.
(As an aside, this is an interesting analog to the SOLID “interface segregation principle.”)
5. Ignore Material Design Specs
Have you ever seen a beautifully laid out app that had buttons and a general user interface paradigm that was completely new and foreign to you? That’s an interesting conundrum.
You may like it aesthetically, but you’ll have no idea what to do. We’ve come to expect a mobile experience where things are intuitive, lining up with what we’re already familiar with.
So if you want to create a bad app, you can make sure to do stuff the user has no experience with. Bonus points if it’s not even aesthetically pleasing.
6. Create Intro, Overlay, and Hints
Can you picture an app that shows you a LOT of explanations? It requires six pages of onboarding wizards to help you understand what’s happening. And it pops dialogs to help with new features, many of which are non-dismissible.
Some of this, I’d imagine, can be useful. But a good way to create a bad app is to bludgeon the user with exposition at every stage of use. If you find yourself needing to do this, your app probably needs to be more intuitive.
7. Ignore Standard Icons and Widgets
The phone providers give you a lot of standard icons and widgets on the screen. You should probably use those.
But if you want to build a bad app, use your own mysterious ones that nobody understands.
8. Create Your Own Login Screen
You know the feeling of getting a new app and immediately being prompted to generate login credentials? Well, take that, and add to it the feeling of having to hand type in and then remember a new password.
Bummer, right? Wouldn’t it be better to just log in with Google or Facebook or whatever?
When you ask new users to use a login screen that you’ve hand-created, you’re asking them to trust you. A lot. You probably shouldn’t do this unless you want to build a bad app.
9. Support the Oldest OS Version
When you’re a mobile developer, you need to look at the different OS versions that you need to support. You can actually go and check out a breakdown of the current user base to see who is using what. You make decisions with this, like the minimum version to support.
Product management, of course, by default, won’t want to lose any users. “We should go back and support all the things!”
But they don’t understand the complexity of checking for those users, adding conditional code, and generally juggling all of these concerns. You, as the developer, will.
But, if you want to build a bad app, let yourself be overridden on this account. Support all the things.
10. Make Decisions without Data
Imagine an app with a button, like a “donate” (and give us money) button. You probably want as many people clicking on this as possible.
So the product manager wants this button to be green. But then the designer has the idea that the button should really be red. And the developer, well, the developer doesn’t care about color but wants it to be at the place on the screen where it would be the easiest to implement.
What should you do? Well, if you want to build a bad app, you should probably duke it out, going with the strongest opinion. But if you don’t want to build a bad app, you should probably rely on real, measurable data to see what works best.
If You Don’t Want to Build a Bad App, What Should You Do Instead?
So, flipping out of fun sarcasm mode, what’s a quick list of things you should do instead? Here is Shem’s guidance there, in a nutshell:
- Permissions? Instead, ask for only what you need.
- Communications? Instead, notify about loading and empty state.
- Lose screen state? Instead, save screen state.
- Large app size? Instead, use vectors and modules.
- Unknown UX? Instead, use material design whenever possible.
- Have introductory exposition? Instead, let them explore and offer hints in context.
- Mysterious icons? Instead, use predefined icons.
- Roll your own login screen? Instead, use single sign-on.
- Support every OS version? Instead, know your users and strategically target them.
- Make decisions with no real data? Instead, measure data and use A/B testing.
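On that last point, settling the green-versus-red button argument with data can be a few lines of arithmetic. Here is a minimal sketch in Python of a two-proportion z-test, which is one common way to compare click rates between two variants. Shem didn't show this; the function name and the click counts below are made up for illustration.

```python
import math


def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Return (z, two-sided p-value) for the difference in click rates, B minus A."""
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    # Pooled rate under the assumption that both variants perform the same.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up numbers: the green button got 120 clicks in 2,400 views,
# the red button got 156 clicks in 2,400 views.
z, p = two_proportion_z_test(120, 2400, 156, 2400)
red_wins = z > 0 and p < 0.05
```

With these invented numbers, the red button's lead would clear the conventional 5% significance bar, so the designer wins this round on evidence rather than volume.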
He has more rules, which you can find at his site.
I really enjoyed this talk. He’s a good speaker and it’s an engaging and relatable premise, but my enjoyment goes deeper than that.
Like anyone, I spend a lot of my time using my phone for very tactical, often time-sensitive purposes. I’m trying to catch a ride or look up whether my plane is late or whatever. So my phone is present for some of the tensest, most annoying moments of my life.
And it is these moments that provide some of the most intense technical frustration. Waiting for something to take forever to load when you’re already late, or getting bonked with some kind of cryptic error message that won’t let you proceed to the next screen.
And when you’re confronted with these moments, nobody around you cares. They’re not interested in the temper tantrum that you’re bottling up.
So for all of those angry, frustrated moments when I had no one to confide in, I feel vindication from this talk. Shem captured a bunch of frustrating, relatable moments and made them into actionable lessons, in a funny way. It’s nice to know that I’m not alone in my intense frustration with mind-boggling UX choices.
A 3 Part Scoring Formula for Promoting Reliable Code, 1:00 Pacific
This is a talk by Chen Harel, co-founder of OverOps.
He wants to suggest a few quality gates that those of us in attendance can take back to our engineering groups. This is ultimately about preventing severity one production defects.
Consider that speed and stability are naturally at odds when it comes to software development and deployment. With an emphasis on time to market, the cost of poorly written software is actually growing, notwithstanding agile methodologies and increased awareness of the importance of software quality.
And here’s a contextual linchpin for the talk:
“Speed is important. Quality is fundamental.”
So how do organizations address this today? Here are some ways:
- DevOps: “you build it, you run it,” to increase accountability.
- A “shift left” approach to software quality — bake quality concerns into the process earlier.
But how do we measure quality?
Measuring Code Quality
Measuring software quality is all about good data. And we tend not to have that data readily at our disposal as much as we might want.
Here are some conventional sources of this type of data:
- Static code analysis
- Code coverage
- Log files
But what about the following as a new approach?
- New errors
- Increasing errors
Using these metrics, the OverOps team was able to create a composite means of scoring code quality.
A Scoring Process
So, let’s look at the reliability score. Releases are scored for stability and safety based on these measures. And it requires the following activities for data gathering:
- Detect all errors and slowdowns
- Classify each detected event
- Prioritize them by severity
- Score the build
- Block builds that score too low
- And then, in retrospect, visualize the data
Toward this, let’s consider some data detection methods. Manually, they have log files and metrics libraries. But they can automatically detect issues using APM, log aggregators, and error tracking.
(Halfway through the talk, and I’m really enjoying this. As both a techie and business builder, gathering actionable data is a constant focus for me these days. I love the idea of measuring code quality both with leading and lagging indicators, and feeding both into a feedback loop.)
Now, drilling a little further into step (2) above, classification. Ask questions like:
- Is the dev group responsible?
- What’s the type or potential impact?
- What dependent services are affected?
- What is the volume and rate of this issue — how often did it happen or not happen?
When it comes to prioritizing, we can think of new errors as ones that have never been observed. And a new error is severe if either of these is true:
- It’s a critical exception
- Its volume/rate exceeds a threshold
There’s also the idea of increasing errors that can help determine severity. An error can become severe if its rate or volume is increasing past a threshold.
You can also think about errors in terms of seasonality to mitigate this concern a bit. That is, do you have cyclical error rates, depending on time of day or week, or other cyclical factors? If so, you want to account for that, so that temporary rate increases that are part of the normal course of business don’t get flagged as severe.
And, finally, you can think of prioritizing slowdowns. Slowdowns mean response time starts to take longer, and a slowdown becomes severe based on the number of standard deviations it is away from normal operation.
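The standard deviation idea is simple enough to sketch. Here is a small Python example using the built-in `statistics` module. This is my own illustration, not OverOps code: the function name, the thresholds of 2 and 3 standard deviations, and the sample timings are all assumptions.

```python
import statistics


def slowdown_severity(baseline_ms, current_ms, warn_at=2.0, severe_at=3.0):
    """Classify a response time by its distance from normal, in standard deviations.

    baseline_ms needs at least two samples taken during normal operation.
    """
    mean = statistics.mean(baseline_ms)
    stdev = statistics.stdev(baseline_ms)
    if stdev == 0:
        # No variation in the baseline, so any increase counts as abnormal.
        return "severe" if current_ms > mean else "ok"
    deviations = (current_ms - mean) / stdev
    if deviations >= severe_at:
        return "severe"
    if deviations >= warn_at:
        return "slowdown"
    return "ok"


# Made-up response times in milliseconds from a "normal" period.
baseline = [100, 110, 90, 105, 95]
labels = [slowdown_severity(baseline, ms) for ms in (112, 118, 130)]
```

To account for seasonality, you could keep a separate baseline per hour of day or day of week and compare against the matching one.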
So based on classification and priority, the OverOps team starts to assign points to errors that occur. They took a look at severity, as measured by things like “did this get us up in the middle of the night,” and adjusted scoring weights accordingly until they fit known data.
This then provides the basis for future prediction and a reliable scoring mechanism.
Now, assume all of this is in place. You can automate the gathering of this type of data and generate scores right from within your CI/CD setup, using them as a quality gate.
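As a rough sketch of what a score-and-gate step could look like, here is a Python version. The real OverOps weights were tuned against their own incident history and weren't something I captured, so the point values, the starting score of 100, and the passing threshold of 85 below are invented for illustration, as are the `Event` and `passes_gate` names.

```python
from dataclasses import dataclass


@dataclass
class Event:
    kind: str      # "new_error", "increasing_error", or "slowdown"
    severe: bool   # outcome of the prioritization step


# Hypothetical point deductions per (kind, severe) pair.
PENALTIES = {
    ("new_error", True): 5,
    ("new_error", False): 2,
    ("increasing_error", True): 5,
    ("increasing_error", False): 2,
    ("slowdown", True): 3,
    ("slowdown", False): 1,
}


def reliability_score(events, start=100):
    """Subtract points for every classified event in a build, never going below zero."""
    penalty = sum(PENALTIES[(e.kind, e.severe)] for e in events)
    return max(0, start - penalty)


def passes_gate(events, threshold=85):
    """Return True if the build should be allowed through the quality gate."""
    return reliability_score(events) >= threshold
```

In a CI/CD pipeline, a step would call something like `passes_gate` with the events detected for the build and fail the job when it returns `False`.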
A Code Quality Report
Having this integrated into your build does more than let you reject builds that don’t pass the quality gate. You can also generate some nice reporting.
Have a readout for why a given build failed, and have general reporting on the measured quality of each build that you do.
I’ve spent a lot of time in my career on static code analysis, which I find to be a fascinating topic. It promises to be a compile-time, leading indicator of code quality, and, in some ways, it does this quite well. But the weakness here is that it’s never really tied reliably into actual runtime behaviors.
In a sense, a lot of static analysis involves predicting the future. “Methods this complex will probably result in bugs” or “you might have exposed yourself to a SQL injection.”
But the loop never gets closed. Does this code wind up causing problems?
I love this approach because it starts to close the loop. By all means, keep doing static analysis. But also run experiments and measure what’s actually happening when you deploy code and feed that information back into how you work.
Joy! 12:50 Pacific
As I come up on my 40th birthday, I think the appropriate headline for this is “Old Man Figures Out Smart Phone.”
Sched support was very responsive and helpful. From what I can tell, the email address it registered me with is the one associated with my Facebook account. So “logging in” via Facebook apparently meant using Facebook’s email address on file as my Sched username.
Live and learn, I guess.
Time to grab a soda from the lounge and go listen to a talk about a scoring formula for reliable code. As anyone who follows my stuff on static analysis will know, this one is of particular interest to me.
Troubleshooting, 12:35 Pacific
So far, so, well, not the best. I signed up for my schedule of events a while back through some kind of (at the time) seamless combination of Eventbrite, the conference itself, and something called Sched. I did this all with my Facebook account.
Apparently, the Facebook auth option went away, though. So now I’m left with no way to log in and show the ushers that I’ve reserved a seat.
C’est la vie. I think I probably picked a relative corner case way to sign up. Plus, I can’t imagine the complexity of coordinating all of this stuff, logistically, across several apps and authentication protocols.
I’d be happier if I could attend the talks, but the people doing support are being really responsive, so hopefully it’ll all be sorted soon.
Pre-Conference, 11:30 Pacific
Those of you who follow my blog might be disappointed to know that I put no effort into second passes on my writing. What you get is just a dump of whatever pops into my head, typos and all.
In a lot of content situations, you might consider this a liability. Well, today, I’m going to strive to make it into an asset. I’m going to live blog this conference in a style that will probably read like most of my blog posts.
This week, Amanda and I are in San Francisco/Oakland to meet with clients and so that I can attend DeveloperWeek. I’m hoping to enjoy some talks, meet some folks, find potential clients, and de-hermit a little bit.
And now, here I am, attending a professional conference for the first time in years, ready to have a whole lot of human interaction (for some definition of “ready”). And I’m going to live-blog the whole thing, gonzo style (but with less chemical alteration than Hunter Thompson).
So stay tuned to the blog for updates throughout the day for the next three days. I’ll blog a lot about the talks, but more generally what I’m up to. And I invite you to weigh in about which talks I should attend or what you’d like to hear about. Comments, tweets, etc. all welcome.
If you’re curious about the location and the hotel, I’d like to report that I’m enjoying both. I have enough status left over with Marriott from my management consulting days that they gave us a room with a great view of Oakland.
So life is good! Ready to hear about the latest and greatest in tech.