Buy My Book, "The Manager's Path," Available March 2017!

Saturday, June 12, 2021

An incomplete list of skills senior engineers need, beyond coding

For varying levels of seniority, from senior, to staff, and beyond.

  1. How to run a meeting, and no, being the person who talks the most in the meeting is not the same thing as running it
  2. How to write a design doc, take feedback, and drive it to resolution, in a reasonable period of time
  3. How to mentor an early-career teammate, a mid-career engineer, a new manager who needs technical advice
  4. How to indulge a senior manager who wants to talk about technical stuff that they don’t really understand, without rolling your eyes or making them feel stupid
  5. How to explain a technical concept behind closed doors to a senior person too embarrassed to openly admit that they don’t understand it
  6. How to influence another team to use your solution instead of writing their own
  7. How to get another engineer to do something for you by asking for help in a way that makes them feel appreciated
  8. How to lead a project even though you don’t manage any of the people working on the project
  9. How to get other engineers to listen to your ideas without making them feel threatened
  10. How to listen to other engineers’ ideas without feeling threatened
  11. How to give up your baby, that project that you built into something great, so you can do something else
  12. How to teach another engineer to care about that thing you really care about (operations, correctness, testing, code quality, performance, simplicity, etc.)
  13. How to communicate project status to stakeholders
  14. How to convince management that they need to invest in a non-trivial technical project
  15. How to build software while delivering incremental value in the process
  16. How to craft a project proposal, socialize it, and get buy-in to execute it
  17. How to repeat yourself enough that people start to listen
  18. How to pick your battles
  19. How to help someone get promoted
  20. How to get information about what’s really happening (how to gossip, how to network)
  21. How to find interesting work on your own, instead of waiting for someone to bring it to you
  22. How to tell someone they’re wrong without making them feel ashamed
  23. How to take negative feedback gracefully

Enjoy this post? You might like my book, The Manager’s Path, available on Amazon and Safari Online!

Saturday, May 29, 2021

Management Basics: Determining a Performance Rating

 originally posted on

One of the most stressful parts of the end-of-year process for managers is the dreaded performance rating. This process forces you to boil down all of the work that a person did over the year, all of their accomplishments and misses, into a numeric score (often from 1–5) that may also come with words like ‘meets expectations’, ‘exceeds expectations’, or the unhappy ‘misses expectations’.

If you work for a company that has a ‘pay for performance’ model, your rating will influence the employee’s compensation. It may be used as an input for promotions, and yes, as a factor in firing or laying off employees. And so while this process is painful, every manager needs to get comfortable with assigning ratings to their engineers, and justifying those ratings to both the person receiving the rating, and potentially to their peers, boss, or other stakeholders who have a say in the distribution of rating scores at the company.

Your manager rating is a combination of measurable inputs and your own judgment. Today, I’m going to focus on how you can start to get comfortable with this step, by developing your own judgment and the supporting metrics you can use to guide your ratings.

Evaluations start long before it’s time to actually determine a rating

The first major input in any kind of fair evaluation is based on the work the employee needed to accomplish, and the work they did accomplish. This process starts with both goal-setting and a review of the expectations of the role, as well as any areas you already know they need to improve on. This may seem obvious but, as we’ll see, it is actually a tricky thing to get right!

As an example of using goal-setting as an input in performance rating, let’s explore a method that splits goals up into three categories: must be achieved, stretch goals, and moonshot goals.

By setting ‘must be achieved’ goals, you inform the employee on what the important work for the year should be, given their level and role. Then you collaboratively think about what stretch goals might look like, including a goal or two that would be really outstanding if they could achieve it — a moonshot goal. When you have your whole organization calibrated on how to set these kinds of goals, you can use them as a critical input for performance rating. If someone met the base goals, they probably met expectations. If they met the base goals and some of the stretch goals, they exceeded or substantially exceeded expectations. If they achieve base, stretch, and a moonshot goal, that would imply a year of incredible achievement that would support the highest rating of all.
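The mapping from goal tiers to a starting-point rating described above can be sketched in a few lines. This is only an illustration under assumptions: the `Goal` class, the tier names, and the rating labels are hypothetical, and the output is an input to your judgment, not a final rating.

```python
from dataclasses import dataclass


@dataclass
class Goal:
    name: str
    tier: str        # hypothetical tier names: "must", "stretch", "moonshot"
    achieved: bool


def suggested_rating(goals):
    """Return a starting-point label for the rating conversation."""
    def in_tier(tier):
        return [g for g in goals if g.tier == tier]

    must_met = all(g.achieved for g in in_tier("must"))
    stretch_hit = any(g.achieved for g in in_tier("stretch"))
    moonshot_hit = any(g.achieved for g in in_tier("moonshot"))

    if not must_met:
        # A missed base goal may be irrelevant, unplanned work, or a real miss.
        return "needs manager judgment"
    if stretch_hit and moonshot_hit:
        return "highest rating"
    if stretch_hit:
        return "exceeds expectations"
    if moonshot_hit:
        # Base plus moonshot without any stretch goal is not covered by the
        # three-tier description, so this sketch defers to the manager.
        return "needs manager judgment"
    return "meets expectations"


goals = [
    Goal("ship the billing rewrite", "must", True),
    Goal("cut page load time in half", "stretch", True),
    Goal("open source the framework", "moonshot", False),
]
print(suggested_rating(goals))
```

Every case the three-tier description leaves open falls through to judgment here, which is where the next paragraph picks up.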

Of course, the downside of goal-setting is that goals often become irrelevant in the course of a year. This is where your judgment must come into play. When they missed a goal, was it because it became irrelevant, because they failed to execute, or because they were unavoidably busy with unplanned critical work? Ideally you revisit goals regularly and adjust them when they become irrelevant, but I’m a realist and know that many people forget to do this or get too busy to bother. It’s a lot of work to constantly track this! So most of us must get comfortable with looking across the scope of goals (hit, missed, and deferred) and valuing them as they are.

Judgment is a major part of evaluating more senior people’s goals, particularly more-senior managers. On the one hand, they will tend to be responsible for the achievements of a team of people, and many things can happen to a team in a given year to derail their goals. On the other hand, the more senior a manager gets, the more they are expected to anticipate ways in which their team may fail to achieve their goals; this anticipation and course correction is part of performing well as an experienced manager. Only you can tell the difference between a manager who was blindsided by events and a manager who just failed to plan well, or who couldn’t course correct their team effectively.

Decide on your own axes of evaluation and attempt to apply them evenly

A manager I used to work with had a very methodical approach that he used to evaluate managers on his team. He had seven characteristics of management that he considered essential to doing the job, and would score each manager on each area, then roughly average them to get a final score. A different manager that I worked with at Rent the Runway did something similar with our four engineering ladder attributes. She would grade each person based on their level and role, and use that to justify how she rated her team.

I find this approach to be a helpful part of my ratings process. Looking across a set of attributes that I believe are important forces me to think about all of the skills a person brings to the table, and how they seem to be doing at each of them. This helps me identify strengths and weaknesses, and structures my thinking when I am evaluating across people in similar roles.

Rating by category works when you have a lot of clarity about what is important (as in the seven characteristics of managers), or a ladder that is well-written to support this. But it does have its downsides, and these will become apparent when you try to put this into practice. People are not easy to put into boxes. You may have a manager that is incredibly weak in one area and incredibly strong in another. Does this average out? Or are they actually underperforming, because their weak area is so essential that it means they aren’t doing their job? Or on the flip side, are they over-performing because, for this role and team, the weak area doesn’t matter so much? Now you have to add judgment into the mix.
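To make the averaging problem concrete, here is a minimal sketch of scoring by category. The attribute names, the 1 to 5 scale, and the `weak_threshold` cutoff are all assumptions for illustration, not the method either manager actually used.

```python
def summarize(scores, weak_threshold=2):
    """Average per-attribute scores and flag any attribute at or below the threshold."""
    average = sum(scores.values()) / len(scores)
    weak_areas = sorted(name for name, score in scores.items() if score <= weak_threshold)
    return {
        "average": average,
        "weak_areas": weak_areas,
        # A weak area means the average alone should not decide the rating.
        "needs_judgment": bool(weak_areas),
    }


# Hypothetical manager: very strong in three areas, very weak in one.
scores = {"delivery": 5, "hiring": 5, "coaching": 1, "planning": 5}
print(summarize(scores))
# {'average': 4.0, 'weak_areas': ['coaching'], 'needs_judgment': True}
```

The average of 4.0 looks strong, but it says nothing about whether the weak area is essential to this role; the flag just marks where the judgment call has to happen.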

It’s also hard to apply this model when you have a team of people who all have somewhat different jobs. It’s very hard to write level criteria that work well for both front-end and back-end engineers, let alone systems specialists, reliability engineers, and DBAs. Then add in the fact that you may be managing someone who is in a bespoke role, say developer relations or technical writing. Now you may not have enough data points to make sure that you are rating the person fairly, because there is really no one else for you to compare them to.

The final component has to be your own judgment

This brings us to the final aspect of performance rating: manager judgment. As with all things in the world of people management, there is no perfect algorithm you can apply to ensure total fairness and accuracy. You do not want to set yourself up to deny good ratings to people doing good work when they don’t fit perfectly in a box, or when they had all of their goals upended by a strategic shakeup halfway through the year. But you also don’t want to unfairly reward or punish people through an arbitrary process that relies on how much you like or dislike them.

Setting clarity at the beginning of the performance evaluation period via goal-setting and job alignment is important because it gives your employees a clearer idea of what they are going to be evaluated against. Breaking roles down into components and rating each person against each component helps you consider a balanced picture of someone’s strengths and weaknesses across the areas that matter. But ultimately, these inputs are merely some of the data you need, and you must consider the full picture of their work against the ever-changing requirements and challenges of your workplace.

The most interesting and useful part of this exercise is comparing what you get from this data against your gut reaction to the rating you think someone should get. If you are open-minded, the data will show you that your gut is too high in some cases and too low in others.

This is an opportunity for you to broaden your thinking: what aspects of this role are really important but unstated in our level guidelines? It will force you to postmortem your own leadership: how did we miss on the goals here so badly? And sometimes, it will force you to acknowledge your own bias: why do I always want to give this kind of person a lower rating even though objectively their work looks as good as this other kind of person?

Your own ratings are rarely the final step. Most companies force an alignment across teams and managers via a calibration exercise in order to ensure ratings are fairly applied across the company. But if you spend good time up-front getting very clear in your own mind the rating you believe someone deserves, and the reasoning behind that rating, you are well-prepared to go into that calibration exercise and defend your ratings with thoughtfulness and care. And you want to be prepared, because once the rating is set, you’ll have what may be the hardest conversation of all: the one where you share the rating with the employee.

Sunday, January 24, 2021

Make Boring Plans

You’re probably familiar with the concept of Choose Boring Technology. If you’re not, I’ll wait for you to read the excellent blog post by Dan McKinley that inspired a much-needed correction in tech to balance “innovation” with stability. I’m here to take this to the next level, and talk about how “boring” should apply not just to your technology choices, but to your plans.

I spoke to someone several months ago who was frustrated with their management chain. They were anxious about the fact that the management chain was always pushing on delivery in an unpredictable way. The team felt really high pressure, even though the projects they were working on were all part of long-running infrastructure renovations. Why was this so stressful? Why, they asked, was the plan not already laid out? Why isn’t this boring?

Why isn’t this boring?

It might say something about the area that I focus on, Platform Engineering*, that “why isn’t this boring” would ever come up. You see, usually when people are in this situation, they blame everything but the lack of planning for their problems. It is a common belief in engineering that, with a clear enough vision, the rest of the pieces of work will fall into place. With a well-understood goal and smart engineers, the idea is that you can trust that people will work towards that vision faithfully and deliver something great. And this does, in rare cases, seem to work. After all, half of the hiring wisdom of the past has been “hire smart people and get out of their way.” Magic can happen with a small, highly-motivated group of people building a new thing towards a clear goal.

However, this concept of building towards a grand vision falls apart when you are building the underlying software that other engineers rely on. For better and for worse, Platform often has to be the place where we push new things to the rest of the company. A big change in the platform is the definition of an innovation token being spent. You want to move to Kubernetes? You’re gonna spend a lot of time figuring out how to operate it well in your environment, to start. You want to support a massive monorepo for the whole company? Hello innovation tokens everywhere, as you try to make it scale and perform well for all of your engineers and all of the languages they want to use. Speaking of new languages, you want to introduce Rust, or OCaml, or even just C++17? The platform will have to support it.

Before you go blaming the Platform team for spending all of the innovation tokens for the company, remember that these initiatives are often driven by someone else. If Platform doesn’t support Kubernetes, some team will decide to build shadow infrastructure because they’re convinced that it will solve the problem they have to handle with their tens of microservices, and then it will land in the lap of Platform after a year with none of the work done to make it easy to operate, but all of the operational expectations anyway. Our goal is to build just enough ahead of you so that when you realize you need the capacity, it’s there, or can be with minimal fuss, and it’s reliable to boot.

Novel Technology Deserves Boring Plans

Since we often end up in the land of novel technology, we owe it to ourselves and our customers to be boring in other ways. And the most important way that a Platform team can be boring is by writing boring plans.

It’s great to have a vision for the future of the platform. To achieve this vision, a non-trivial amount of our job is not just building new big, scalable, complex software infrastructure, but moving everyone from the last generation of this software infrastructure to the next generation. Upgrade the programming languages, the operating systems, the libraries. Move from OpenStack to Kubernetes, from on-prem to the cloud, from Maven to Bazel, from SVN to Git. Migrate from the old storage system that was optimized for a rare legacy use case, to a new storage system with higher availability and performance.

Making these changes happen, under the covers, has both interesting parts and boring parts. If you’re not a platform engineer, you shouldn’t see the interesting parts. The interesting parts are where we go and tune the kernel to perform well for our workloads. The interesting parts are where we build out automatic failover, so that we can meet the availability needs of the workloads. The interesting parts are the many patches we might contribute back to the inevitably-broken open source projects that hold half the world together but still don’t seem to understand how to work with FQDNs. The interesting parts are where we understand deeply the dependencies of our technology stack, the opportunities and limitations, and build solutions for our customers that fix limitations and unlock new opportunities.

When we don’t attend to the boring parts by making our plans predictable, the interesting parts turn into extra stress on top of the overwhelming anxiety of juggling these moves. When you make plans that start and end with the vision “we will move everyone to the public cloud, and it will be great,” you find yourself in the exhausting situation of running all of your old infrastructure, trying to figure out the new cloud stuff, and dealing with customers who are confused and angry that the thing they want to do doesn’t seem to quite work in either world.

Contrast this to the team that turns that vision into boring plans. They start with a small proof of concept, migrating perhaps a single application and learning in the process. Then they do the work of looking across other applications on the old platform, to see which ones are similar to the one that is now in the cloud. They work with those users to get them migrated and running, all the while gaining comfort with this new environment and uncovering the interesting gotchas. They write down what they’re learning, so that each new step in the migration builds on the last, and others can be pulled in without a huge knowledge transfer. The team focuses on the hard parts of the moment, whether they are figuring out data mirroring, or fixing a bug in a popular open source project, and they are free from the anxious overhead of wondering what is happening tomorrow. The users are also free from the stress of wondering when the work they need will be delivered, because the team has communicated plans that account for this process of iteration, learning, and gradual migration.

A Strategic Plan Is Obvious and Simple, Even Boring

Making boring plans is a foundational step in getting good at setting engineering strategy. Strategy is often confused with innovation and vision in tech circles, but they are far from the same thing. Having a future vision and recognizing the potential of innovations is valuable in building great strategy, but strategies that rely on unproven magic bullets are not good strategies. Good strategy identifies a problem with the current situation, proposes a principled approach to overcome it, and then shows you a coherent roadmap to follow. Strategy is not in the business of razzle-dazzle; it’s in the business of getting to the core of the issues so that the solution becomes simple and obvious. Good strategy provides the clarity that enables boring plans.

To become great at technology strategy, start by getting good at making boring plans. Get clear about the problem you are overcoming with your plans. Make the principles of the work at each stage clear:

  • How do we know when we’re in exploration mode, and how do we know when we’re ready to commit to a direction?
  • Have we talked to our users? Do we understand how they are using our systems, and have we made plans that account for their needs?
  • What are the problems we’re focused on solving right now, and which problems are we leaving to worry about another day?
  • How do we know if we’re on the wrong track? What are the guardrails, milestones, or metrics that tell us whether the plan needs review?

Your teams need more than a clear idea of the end state and the hope that smart engineers will inevitably get you there. Plans that are formed around hope are failing plans; hope is not a plan. Plans that change constantly are failing plans. When your plans are constantly changing, it is a sign that you either are making plans that express a certainty you don’t have, or you haven’t done your research to get the right certainty in place. Either of these is a waste of time and an unnecessary stress on the team.

So leaders, you owe it to your teams, and to your users, to free them from the tyranny and stress of uncertainty. You must do the work to go beyond vision, create concrete actions, and make boring plans.

Saturday, November 21, 2020

Driving Cultural Change Through Software Choices


This tweet got me thinking about change, and how software engineers (and especially, Platform teams) can drive cultural change throughout companies.

First, let’s take the question. You want to change the engineering values that your company is expressing. You don’t just want to create a heavyweight process (your checkin fails if you don’t reach X code coverage, for example), you want engineers to start to value these things enough that they don’t need a process to enforce them.

I’ve driven and watched culture change happen enough times to know how to do it from the position of senior leadership. You change what you reward and focus on, and repeat that change enough that people will start to naturally change their perspective (or, sometimes, leave). If you shift your focus and attention from new projects to stability by taking the time to praise teams who improve their stability, promoting engineers who complete projects that are related to stabilization, and, crucially, setting prioritized goals for your teams to work on stability, your culture will change from prioritizing new projects to prioritizing stability.

This is a powerful force, but it is slow, and what’s worse, it can have negative consequences. You don’t want teams who are afraid to do new things for fear that they will be punished for a lack of stability, for example. And if you overcorrect for a perceived cultural gap, you can end up chasing away otherwise great engineers who believe that their skills are no longer wanted or valued because they no longer have an important skill set. So trying to make your teams change their technology approaches purely via a cultural focus is not always the best approach.

There is another lever that is available, however, particularly to platform developers. And that is the lever of product features.

I’ve never met an engineer who didn’t occasionally copy-paste-modify some code. One of my earliest professional software lessons was that when you set up a codebase full of tests, other engineers are likely to write tests for their code because there will be lots of examples for how to test. This generalizes to the observation that people are most likely to take an existing thing and tweak it into a new thing that does what they need, and in the process they will take the good and bad from that existing thing. So if you want them to follow a best practice, put it in their starting templates.

Take building a service. If you start with nothing but a system that runs a simple web server, you might go through the effort to also set up metrics and monitoring and healthchecks, but you might also feel like you’re busy and you just want to get the code that you absolutely must write done. On the other hand, when you start with a service framework that is already set up with metrics and monitoring and healthchecks, you’re more likely to do a little bit of work to make those at least mildly useful. This was one of the insights that Dropwizard gave me back in the day: pre-integration of stuff that you really need to run a service well means that your services are better from day one.
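As a rough illustration of what such a starting template might look like, here is a minimal sketch using only the Python standard library. It is not Dropwizard's API or any particular framework's; the `/healthz` and `/metrics` paths, the `route` function, and the in-memory counter are assumptions chosen to keep the example small.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory metric: number of requests seen per path.
REQUEST_COUNTS = {}


def route(path):
    """Return (status_code, payload) for a request path, counting every request."""
    REQUEST_COUNTS[path] = REQUEST_COUNTS.get(path, 0) + 1
    if path == "/healthz":
        return 200, {"status": "ok"}
    if path == "/metrics":
        return 200, {"requests": dict(REQUEST_COUNTS)}
    # Application routes go here; the template ships with none.
    return 404, {"error": "not found"}


class TemplateHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, payload = route(self.path)
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Run the template service until interrupted."""
    HTTPServer(("127.0.0.1", port), TemplateHandler).serve_forever()
```

Calling `serve()` starts the service on localhost. An engineer who copies this file and adds their own routes inherits the health check and the request counter without having to decide to build them.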

Platform developers these days get this. Your infrastructure software now comes out of the box with observability hooks built in. We’re all probably more attuned to basics of creating reliable software now because our tools push it on us from the get-go, so there’s no cultural revolution necessary.

We can go further than observability. Security can be a process, or a cultural value, but you can also go quite far by providing tools and platforms that have good security practices baked in to them, so that you’re not relying on the good citizenship of your development team. Testing is often hampered by the overhead of running tests, and investment into infrastructure that makes tests easy and fast to run is important to supporting a culture of software quality validation.

All of this is to say that developers have more power than they imagine to change the engineering culture around them. As you build software that others will use or that your peers will work on, are you making it easy for them to do the right thing? If you build platforms, bake in easy integrations for the software values you want to see. If you’re in the position to choose new tools, pick ones that support the standards you want taken seriously. And as you write code, make it easy for others who will copy-paste what you’ve done to then do the right thing.

Tuesday, September 22, 2020

The Management Flywheel

Have you ever worked on a team that felt like it was just stuck in a rut? Somehow things were always just one fix away from improving: the next project, the next quarter, the next hire, this would turn the situation around. And yet these projects came, the quarters went by, new people were hired and joined and left and nothing ever really improved. It’s a sadly common situation, and one of the few that I believe can be laid squarely at the feet of the team’s manager.

Birmingham Museums Trust — Richard Trevithick’s 1802 steam locomotive with flywheel

I’ve spent a lot of time over the past few years thinking about how you know whether a manager is great. When everything is going well, all a decent manager has to do is not screw things up, and it’s not always easy to tell on paper whether a manager is merely good or truly excellent. A person might have thorough training, they might have a large team, they might even have smart things to say on Twitter, but are they actually great at the job? None of these things will tell you the answer.

But ask a manager about how they’ve gotten a team out of a rut, and now you start to hear about a real, common situation that a manager can make or break. There are some managers out there who just know how to take a team that is in a rut and turn them around. They may have different ways to describe their approach, but the outcome is the same: they start to turn the management flywheel.

“The Flywheel” is a popular startup analogy, and is best described in this classic writeup. The flywheel is heavy and painful to start, and starts off slow, but as it gathers momentum over time it goes faster and faster. Turning around a team feels like getting this flywheel spinning. Most managers who see a team in a rut can quickly detect many things that are going wrong. But it’s the response to these problems that distinguishes the great ones.

Some managers in this situation will proclaim that they are going to make big changes. Technical managers often see the flaws in the architecture or the legacy approaches to technology that the team is using, and immediately set about to overhaul the whole system. They want to change the language that the system is written in, or move to event-driven microservices, or rewrite the whole thing to run on Lambda so that there’s no support to worry about. Product-vision managers see that the product vision is lacking, and immediately paint a big picture for the team that articulates a beautiful world they could be providing for their customers. Talent-focused managers take one look at the people on the team and immediately decide that they are just the wrong people, and the only solution is to immediately overhaul the people and hire a bunch of “the right”** engineers to replace them.

My experience is that most of these managers, in these situations, will fail. I have personally failed in all of these ways at least once in my career. But we won’t admit we failed. The technical managers will blame the legacy system and how impossible it is to rewrite the system while supporting the terrible decisions of the previous leaders. The product-vision managers will blame the team for not figuring out how to take their grand vision and turn it into a roadmap that makes sense. The talent-focused managers will somehow never manage to get the right team in place.

The managers who succeed in this may have big ideas about the technology, the product, and the talent and culture of the team, but they don’t just start with these ideas. Instead, they identify the little things that can be changed. Questions like “how do we decide what we’re working on today” and “do we have clear responsibilities for core tasks” start to get resolved. These managers may tackle confusing on-call schedules, or onerous project management expectations. The best will look across the projects and quickly re-prioritize work to gain focus for the team.

These little things start to build up steam. Now the team feels less burdened by unwieldy processes, and starts to make decisions faster. They know who is responsible for being on call, and stop swarming on every incident. They are working on fewer projects, and slowly start actually finishing those projects instead of dragging them out indefinitely.

While this is happening, the team is getting happier, and the manager is learning more about the deeper challenges of the team. Is the product direction muddy? Is the team missing a critical set of skills that need to be hired? Does the entire architecture need to be revamped? These are critical questions to figure out and answer, but they are never the first questions that a manager over a team in a rut should be worrying about. There is inevitably a set of simple things that can be improved to get the team through this transition, to start building the flywheel that will make the big things easier. Because it’s easier to demand more of your product counterparts when the team is executing. It’s easier to hire when the team is engaged and excited to add new talent. And it’s much, much easier to fix a bad architecture when the team is able to ship changes.

When you find yourself in a rut, remember that you don’t have to solve the root cause of everything wrong with the team as a first act. Start with the little problems. Give the team some small wins, clarity, and focus. Make their job a little bit easier, and help them work a little bit faster. Build up speed on the flywheel. Once you’ve gotten the team to be productive, they will still need you to set direction, and to resolve interpersonal and inter-team conflicts. They will need vision, mission, psychological safety, and inspirational goal-setting. But big things start small, so don’t forget to sweat the small stuff.

** The right engineers range from “Engineers from the best colleges/FAANGs-only” to “only startup engineers who are hungry for an opportunity” to “only people who actually understand this industry” to “only people who appreciate my values” depending on the manager.

Thanks to Kelly Shortridge for feedback on this post.

Saturday, May 9, 2020

Product for Internal Platforms

For the past 3 years, I have been running a platform engineering organization. Since that term is vague, where I work it means the software side of infrastructure. Compute platforms like Kubernetes, storage systems, software development tools, and frameworks for services are part of the mandate. Our customers are other engineers at the company.

I also oversee the product team for this area. Now, I’m not a product manager (which I’ll shorten to PM for the rest of this post, not to be confused with project manager), and I rely on my PM team heavily for their expertise. But that doesn’t mean that I can personally neglect the product side, and indeed I spend a lot of time thinking about the products and strategy for my org as part of my day-to-day work.

These are some things I’ve observed and learned in this process. I share them with you because I think many folks in platform-type teams, especially engineers and those of you on platform teams without formal product managers, might benefit from understanding how to approach these problems.

What’s so hard about product for (internal) platform?

Customer Group Size

Platform product decision-making is something of a unique discipline. When many people think about product managers, the image that springs to mind is a product manager for large consumer-facing products. Metrics, metrics, metrics! A/B tests and user studies and KPIs and design sprints and revenue models. I’m sure that any time you are building products for a large user base, this is a factor, and PMs for AWS probably have a lot of metrics to guide them. But A/B testing for an audience of hundreds doesn’t usually teach you much. When you’re building platform for internal customers at a small-to-midsized company, a metrics-driven strategy is harder to apply.
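To put a rough number on the small-audience problem, here is a sketch using the common normal-approximation formula for comparing two proportions. The baseline rate, target rate, significance level, and power below are illustrative assumptions, not figures from any real platform, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def users_needed_per_arm(p_control, p_variant, alpha=0.05, power=0.80):
    """Approximate users needed in each arm of a two-sided A/B test.

    Uses the normal approximation for comparing two proportions.
    All inputs are hypothetical; plug in your own.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_power = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_variant * (1 - p_variant)
    effect = p_variant - p_control
    return math.ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


# Hypothetical: 50% of engineers complete a task today, and we hope for 55%.
needed = users_needed_per_arm(0.50, 0.55)
print(needed)
```

Under these assumptions the test needs more than 1,500 users per arm, which is well beyond what an internal audience of a few hundred engineers can supply.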

Captive Audience

Not only do we have a small group of customers, we have a captive audience. Other teams can and sometimes do decide to go off on their own and build their own platforms, but for many types of products, we provide the only option. You can ask customers what they want or how they like your products, but they may not want to complain to their colleagues. Of course, platform products also suffer from the problem that some engineers always think they could build something better, if only they had the time, so you also have a customer segment that seems to never be satisfied no matter how hard you work.

A captive audience leads us to believe that basic metrics for customer adoption are not interesting, which in turn leads us to ignore them, sometimes to our peril. Many platform teams end up with several overlapping half-finished products because they assumed a captive audience would lead to product success.

It’s Hard To Think Like Your Customer

Finally, there is the universal challenge of platforms. Good software products for engineers tend to come from someone with a clear problem in front of them, who built specifically to solve that problem. They are intimately familiar with the customer because they are the customer. They are building for one person or one team, and they can clearly see what needs to be solved. But the platform team doesn’t build solutions for only one user. The whole value of a platform team is providing broadly-useful systems, so we are rarely presented with a well-specified need to fill.

When you are on a platform team, it is easy to lose the feel for what it is like to use your own products, because you are deep in the details. You spend your day living and breathing the ins and outs of git, and your users know the 3 commands they have to memorize and otherwise rely on ohshitgit to get themselves out of trouble. In a perfect world, platform product managers are regularly using the products they support in order to identify pain points and gaps that engineers might miss and users may not complain about. In the real world it’s hard to find time to use your products in anger when you are also dealing with all the other parts of the PM job.

You can observe this dilution of focus and quality in a lot of open source platform products that are created by big companies in order to sell you on their cloud solution or to create an industry standard. It starts to look like software that is built for the sake of building. And you absolutely see it in internal platform projects at companies big and small. When platform teams build to be building, especially when they have grand visions of complex end goals with few intermediary states, you end up with products that are confusing, overengineered, and far from beloved.

So how do you solve this?

If the challenges can be summed up as: a small, captive audience, that is hard to truly empathize with, and a tendency to build thoughtlessly, what can you do? Here are a few approaches I’ve found to help:

Assimilate and Expand

You don’t have a huge customer base to test things on, so how do you find a successful product? Don’t be ashamed to take over a system from a team that built it with themselves in mind, if that system seems to be the right general concept for the wider company. A lot of platform teams don’t like doing this, because they think that it means they will have to live with decisions that they don’t agree with. They forget that when you take a product from a team that built it, you already have a reasonably satisfied customer to start with! For better or worse, someone showed that they had a problem, and they solved it, and you wouldn’t be taking it over if you didn’t think this problem was worth solving in a holistic fashion, right?

I did this when I built a global service discovery solution long ago. Another team had first identified the problem and created their own version of a solution using ZooKeeper. The solution was fine for their needs, but didn’t solve the general needs of everyone at the company for global scaling. So I took over the idea of the project, and turned it into true platform infrastructure, built for a big company and not just one team therein. There were plenty of product decisions to make as part of that work, but the core identification of the problem as worth solving was done for me. There is a lot of interesting work in taking a solution that is locally-optimized and turning it into something that can be used by a diverse set of applications.

Partner to Prototype

Another way to identify promising new opportunities is to partner with another team, and even embed someone into that team, to understand a problem better. Partner teams are likely to come by and ask if you are planning to build something to solve their various problems. When you believe that this is a good specific example of something that will become a general pattern, take advantage of this request to learn more! In fitting with the goal of really understanding the feel of a problem, having platform engineers build an application with a prototype idea for a platform within it, then using the lessons from that project to extract a more general system, is a productive way to quickly iterate an idea into something that is usable. After all, the hardest part of the product side of platform engineering is figuring out usability. Want to know how people will actually write code around this offering? Well, writing code around the offering yourself is a good way to figure that out.

Make a Migration Strategy Early

In platform teams a lot of the product job is figuring out how to make open source products or popular strategies work for your company. Take kubernetes. The product challenges around internal kubernetes are in the decisions you make on how to integrate it into the existing ecosystem in order to get people to adopt it without too much argument. If you are a company of a certain age, you may already have an old private cloud solution running around. Everyone is used to running on VMs, but you think kubernetes will give you some operational improvements and also encourage the company to start to rethink its software practices to be a bit more modern.

That is all well and good, but the product work is not “tell everyone you have kubernetes now and they have to use it.” Instead, the product work is to identify different types of customers and figure out what will make it easy for them to migrate. What are the carrots you can provide to get people to do work that they don’t care about doing? Perhaps the carrots are efficiencies in getting access to compute or storage. Perhaps you can offer a higher SLO with the new product. Perhaps it is faster, more secure. But these things don’t just happen. You have to choose which features you are highlighting to your customers, you have to help them understand the offering and advantages, and you have to deliver on those promises.

Despite having captive audiences, platform teams are notorious for creating half-finished product offerings that somehow fail to get adopted. When your platform organization is running three different generations of solutions to the same problem with no clear plan to remove any of them, and your customers are both confused by the offerings and dissatisfied with them, you have a serious product failure on your hands. The migration strategy must be a primary part of the product planning.
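One lightweight way to keep the migration visible is to count which services still run on each generation of the platform. The sketch below is only an illustration: the inventory format, the service names, and the generation labels are all made up, and a real inventory would come from whatever deployment records you already have.

```python
from collections import Counter


def migration_report(service_to_generation, retiring):
    """Summarize how many services still run on each platform generation.

    service_to_generation: dict of service name -> generation label.
    retiring: set of generation labels you intend to shut down.
    Returns (share of services per generation, services left to migrate).
    """
    counts = Counter(service_to_generation.values())
    total = sum(counts.values())
    if total == 0:
        return {}, []
    shares = {gen: count / total for gen, count in counts.items()}
    remaining = sorted(
        name for name, gen in service_to_generation.items() if gen in retiring
    )
    return shares, remaining


# Hypothetical inventory; every name here is invented for the example.
inventory = {
    "billing": "vm-cloud",
    "search": "kubernetes",
    "reports": "vm-cloud",
    "auth": "kubernetes",
}
shares, remaining = migration_report(inventory, retiring={"vm-cloud"})
print(shares)
print(remaining)
```

If the list of remaining services is not shrinking from quarter to quarter, that is an early sign the old generation will never be removed.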

You Aren’t Google, So Don’t Build When You Don’t Have To

My final piece of product advice to platform teams is to remember that you aren’t Google (unless you are, in which case, hi!). When you have a platform team of 7, or even 100, you must be extremely thoughtful about what you choose to build. Platform teams of all sizes can get bogged down trying to imitate systems that have been built up over years at big companies. Even when those big companies provide their solutions as open source software, they often encode all kinds of assumptions about the surrounding ecosystem of available products and the culture and needs of the engineers using the product that may not work well in your company. It is not good product management to say “Google does it, therefore we should.”

Instead, start with a clear understanding of the problem, and an accounting of your existing ecosystem and culture, before diving into a technical solution. Say your data volume is out of control. You might need to solve this with a better storage solution, or you might need to solve it by identifying the top data producers, and asking whether the data they’re storing is actually valuable. You’ll often find that the data is garbage, or the developers can change their workflow, or a little bit of query performance tuning makes this application scale just fine in a normal RDBMS. Only build when you have exhausted the alternatives.

Summing Up

Great platform teams can tell a story about what they have built, what they are building, and why these products make the overall engineering team more effective. They have strong partner relationships that drive the evolution of the platform with focused offerings that meet and anticipate future needs of the rest of the company. They are admired as strong engineers who build what is needed, to high standards, and they are able to invest the time to do that because they don’t overbuild.

Whether you are a platform engineer, engineering manager, or PM, it pays to remember that you still need to be customer-focused and strategic about your platform offerings. Without a clear strategy for showing impact and value, you end up overlooked and understaffed, and no amount of cool new technology will solve that problem.

Thanks to my darling product and platform friends for their feedback on drafts, especially Renee and Pete.


Sunday, May 12, 2019

OPP (Other People's Problems)

A hard lesson for me over the past several years of my career has been figuring out how to pick my battles. I’ve seen many friends and colleagues struggle with this as well: how do you know when to involve yourself in something, and how do you know when to stay out of it? How do you figure out where the line is?

The setup

If you’re reading this looking for advice, you’re probably a go-getter. You consider yourself a responsible person, who cares deeply about doing things right. Your care may be focused on software and systems, or on people and organizations, or on processes and policies, or all of the above.

This attitude has probably served you well in your career, especially those of you who have been working for a number of years. You’ve been described as having a “strong sense of ownership,” and people admire your ability to think broadly about problems. You try to think about the whole system around a problem, and that helps you come up with robust solutions that address the real challenges and not just the symptoms.

And yet, despite these strengths, you’re often frustrated. You see so many problems, and when you identify those problems, people sometimes get mad. They don’t take your feedback well. They don’t want to let you help fix the situation. Your peers rebuff you, your manager doesn’t listen to you, your manager’s manager nods sympathetically and then proceeds to do nothing about it.

That kind of grinding frustration can wear you down over time. I know, because I’ve been there. I left a cushy big company job because I saw too many things I felt powerless to fix. When I got into management, I figured that I would have the power to make things right. Then I thought that when I became the leader of all of engineering I could do it. Then I thought that when I became an executive I would be able to do it. Chasing that dream of being able to fix all the things contributed to me feeling exhausted, and dare I say a bit burned out, when I finally decided to leave my CTO gig.

Escaping the Trap

I’ve come to realize that there isn’t a job where you can fix all the things. It is true that founders have immense ability to set direction and culture, but trying to control everything happening in a company causes many other problems that are outside of the scope of this essay. Assuming that you are not a founder, you should just take a minute to really let it sink in: there is no place you can go where you can control everything and fix all the problems, no matter how much you get promoted. There’s always going to be something you can’t fix.

So how do you decide where to exert your energy?

Step one: Figure out who owns this problem

If it’s your job (or the job of someone who reports to you), great. Go to it! Tend your own garden first. Make systems that are as robust as you believe systems should be. Follow processes that you believe are effective and efficient. If you are not leading by example, you have to start there. Stop reading now and go fix the things!

If there’s no clear owner, do you know why? Is it just because no one has gotten around to doing it, or has the organization specifically decided not to do it? If no one’s gotten around to doing it, can you do it yourself? Can your org do it, just within your org?

If it’s someone else’s job, how much does it affect your day to day life? Does it bother you because they’re doing it wrong, or does it actually, really, significantly make it harder for you to do your job? Really? That significantly? There’s no workaround at all? If it is not directly affecting your job, drop it!

Step two: Talk to all the people

If you don’t clearly own the problem, you need to talk to people. If you feel tired by the idea of talking to all the people, stop! This is a sign that you should not pick this battle! It’s already draining you and you haven’t even started on the path to addressing it. It is probably a good idea to just try to let it go, or at most, tell your manager that you are worried about whatever it is, and then let it go.

If you’re ok with talking to all the people, then get out there and get a sense of the problem beyond you and your team. You can do this formally, with a document that you prepare addressing the problem as you see it, or informally, as a series of user interviews. You will need this information to make a case to fix it, and to make a plan for how to fix it. Does the information you’ve gathered from others make you think that perhaps this problem really isn’t as important as you first thought it was? Is someone else already solving this problem? Great! Let it go!

If you know who should own this, you need to give them a chance to fix it. Which means you need to come with examples of how the problem is impacting you or your team. Missing those examples? Stop! This is not your problem to fix! Don’t go bringing problems based on people from other teams complaining to you. Those teams need to bring up the problems themselves. If you must, tell their manager that you’re hearing these complaints, and let that manager decide whether to deal with it.

Step three: Plan the fix

Ok, so you talked to all the people and the problem is still not fixed. Assuming no one owns the problem and you really still want to own it and fix it, great. Make a concrete plan for how you will fix it, and share that plan with the people who need to know about it. You should expect that you will need to get feedback and revise your plan, and the amount of feedback and revision required will be directly related to how big the problem is, how many people it impacts, and how controversial the fix you’re proposing is. Expect this feedback, buy-in, and revision process to take a while. You’ll need feedback from all corners: friends are a good place to start, but be sure to include your skeptics too. Your goal is to convince everyone that they want you to solve the problem. Yes, that means a lot more talking. If you’re tired now, maybe this isn’t your problem to solve!

If this problem is with another team, and you talked to that team, brought them clear examples of why it is truly a big deal, and they haven’t answered your concerns to your satisfaction, you have a choice. Do you escalate to your manager? If you have clear examples of why it’s a problem, and your peer hasn’t been able to do anything, this is a perfectly fine time to escalate! Consider though whether you can negotiate a fix with the other team, or work around the other team. And, especially when the problem is cultural, consider whether you really need to make this problem into a big deal, or whether you can just let it go.
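For readers who think in code, the questions in steps one through three can be sketched as a small decision function. This is a loose simplification rather than a rule to apply mechanically: the field names and the returned strings are hypothetical, and real situations will not reduce to six booleans.

```python
from dataclasses import dataclass


@dataclass
class Problem:
    # Each field is a hypothetical yes/no version of a question asked above.
    i_own_it: bool = False
    has_owner: bool = False
    blocks_my_work: bool = False
    willing_to_talk_to_everyone: bool = False
    have_concrete_examples: bool = False
    someone_already_fixing: bool = False


def next_move(p: Problem) -> str:
    """Suggest a next move based on the questions in steps one to three."""
    if p.i_own_it:
        return "fix it"
    if p.has_owner and not p.blocks_my_work:
        return "let it go"
    if not p.willing_to_talk_to_everyone:
        return "tell your manager, then let it go"
    if p.someone_already_fixing:
        return "let it go"
    if p.has_owner:
        if not p.have_concrete_examples:
            return "let it go"
        return "bring examples to the owner, escalate if needed"
    return "write a plan and get buy-in"


# An unowned problem that you have the energy to talk to everyone about.
print(next_move(Problem(willing_to_talk_to_everyone=True)))
```

Most paths end in "let it go", which is the point of the exercise.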

Step four: Enact the plan

And now we get to the tricky part. You saw this problem, you complained about it, you made it your own, and now you have to fix it! This is what you wanted, right?

Unfortunately, the fix will almost certainly take a lot longer than you’re thinking it will take, and it will probably be a lot of work on your part to see it all the way through. And by the way, it’s unlikely that you get to give up any of your other problems to take this one on. But this is something you feel passionately about, and that should make the extra work worth it to you.

Step five: Think about how many of these you can actually do

Good job! You fixed the problem! It was probably a little bit harder to fix than you expected, huh? Especially if it was a cultural thing that you needed to change. But it should feel good to have it fixed. You can see the improvement you were aiming for, and you’ve got a great story to tell.

Take a moment to reflect on whether it was worth the effort to you, and think about how many more things like this you see at the company you’re in that you really want to change just as much as that one.

Think about what else you could be doing with that extra energy. Finishing a critical project sooner? Hiring a few people who are more in tune with your way of doing things, who might be able to fix things for you? Writing a novel? Getting a personal best on your deadlift?

Pick Your Culture First and Foremost

Learning how to pick your battles is also about learning how to pick your company and pick your boss, because your job really shouldn’t be all or even mostly about battles. Going through this exercise of solving an unowned problem is fun once in a while, but it’s a real drag when you feel like you’re surrounded by such problems, you can’t ignore them, and you’re powerless to fix them. That is a good sign that it’s time to find a new job, preferably somewhere that is more in tune with your way of doing things. Life is so much more fun when you have people around you that you trust to solve problems, even the problems you have a lot of opinions about.

Flowchart courtesy twitter user