Buy My Book, "The Manager's Path," Available March 2017!

Sunday, August 14, 2022

The Product Culture Shift

Adding product management to more traditional software infrastructure organizations, sometimes with a shift towards platform engineering, is all the rage today. As someone who has done both these things, it doesn’t surprise me to see so many people struggling to make it work. Both of these shifts require going from a siloed, process- and tech-focused mindset to a portfolio-, usability-, and customer-focused mindset. This is a hard transformation, and it’s easy for people who have spent their whole career building infrastructure to misunderstand what product and platform really mean. So I thought I’d share the secret to making this work.

Your whole engineering culture has to change.

Yes, seriously.

Infrastructure organizations tend to be good at many things. They are good at cost management, vendor negotiations, and running systems at scale. They have specialists who know the murky depths of databases, the nuances of networking, and how to debug nasty kernel issues. They may even be good at triaging bug requests from tens or hundreds of teams, planning large-scale failure tests, and coordinating massive migrations.

Unfortunately, they are not usually very good at thinking about the people who will use their systems, taking their preferences into account, and treating them like customers who they are trying to keep. Why should they? The people who use their systems are a captive audience! As I mentioned in my post about product for internal platforms, this is a major challenge that platform and infrastructure teams have to overcome to build great products.

This culture, with its focus on cost, scale, and process, over people and usability, is very hard to root out. And you don’t want to lose all those rare skills in the process. So what do you do?

You can’t just rub product managers on it and call it a day.

To start, let’s be clear about one thing: as tempting as it might be, just hiring product managers won’t fix this problem. Even if you could find enough good product managers who want this type of job, which you can’t, product managers are only useful when they are paired with willing engineering teams. If the engineering teams don’t feel a sense of ownership for delivering a great product to their customers, product managers are unlikely to close that gap, and they will more likely turn into glorified backlog groomers than true product leaders.

You need to change the way you support your products.

The ticket system black hole is a great way to make your customers feel more like a burden than a focus. I understand that it is hard to manage all the incoming requests for your teams, but keeping a close eye on how you provide support, what your response time is for questions, and how you triage incoming issues is critical to this transition. Your engineers should spend time supporting their products. If they are not regularly answering questions, they are missing a chance to appreciate the pain that customers are facing when trying to use the systems. Be careful about making this optional, or leaving it to only junior engineers. Your senior folks will not build the kind of humane products that you need if they are incapable of interacting with the users in a polite and helpful way, no matter how brilliant they might seem. If you have someone you can’t trust to engage productively in help situations, watch out, because this person is probably not building products that are easy to use. Over time, you may find yourself redoing a lot of their work because it is harder to support and drives a high volume of complaints.

You need to update your interview process.

I recommend adding screening for what I call “customer empathy” to all of your interview lineups. This doesn’t have to be deep; it can be as simple as asking candidates how they write code so that other developers can understand it, or what their approach is to answering questions about the systems they have built. But you want to set the tone that you expect your developers to think not just about how to build the systems, but about the people who are going to use them or work with them.

You need to update your systems of recognition and reward.

If you only promote people who solve big technical problems, you’re going to have a hard time retaining the people who do the work to smooth out the usability edges, actively listen to the customer teams, and adjust their work priorities to fix the stuff that is causing the most pain. So look closely at what you are celebrating, paying, and promoting, and make sure you are including work that makes the product better, whatever that looks like, even if it isn’t the hardest technical bits. Remember, this is a cultural change, and cultural changes that don’t involve changes to what is valued when it comes to recognition and rewards are destined to fail.

Do you have too many project managers?

There may always be some need for project managers, but in infrastructure organizations, heavy reliance on project managers can result in a lack of up-front technical planning around one of the most common infrastructure team tasks: migrations. If your migrations are so painful that both your team and your customer teams need project managers to understand where all the dependencies lie and track what is happening, you are not taking ownership of the user experience for your software. Yes, migrations are part of your UX! I am astonished at how often infrastructure teams offer new systems that do not have compatibility with the systems they are replacing, and expect the customers to do all of the work to migrate to those new systems, often on a timeline dictated by the initiating team.

If you are moving to a platform model, you are going to need to own much more of the migration than you have up until now. That platform value add must include lowering the migration pain for customers, which means getting better at doing them entirely yourselves. By limiting the number of project managers now, you force engineers to face project management work that they will not want to do. And the good ones will realize that if they created automation to support the migration, whether it is detection of dependencies, compatibility bridging libraries, or abstractions that allow them to change the internals without changing the client libraries, they won’t have to do so much of that tedious project management work. By saving themselves time, they will save their customers time. So limiting project managers is a good forcing function. Just make sure that you are giving teams time to do this work, and not just making them miserable or shifting the project management onto your new product managers.
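As a rough illustration of the “abstractions that allow them to change the internals without changing the client libraries” idea, here is a minimal sketch. Every name in it (`LegacyStore`, `NewStore`, `StoreClient`, the `use_new` flag) is hypothetical and invented for this example; it is one possible shape for a compatibility bridge, not a prescription.

```python
class LegacyStore:
    """Hypothetical stand-in for the system being replaced."""

    def __init__(self):
        self._rows = {}

    def write_row(self, key, value):
        self._rows[key] = value

    def read_row(self, key):
        return self._rows.get(key)


class NewStore:
    """Hypothetical stand-in for the replacement, which has a different API."""

    def __init__(self):
        self._docs = {}

    def upsert(self, doc_id, body):
        self._docs[doc_id] = body

    def fetch(self, doc_id):
        return self._docs.get(doc_id)


class StoreClient:
    """The client library customer teams import.

    Its put/get interface stays fixed while the platform team
    changes which backend actually serves the data.
    """

    def __init__(self, legacy, new, use_new=False):
        self._legacy = legacy
        self._new = new
        self._use_new = use_new  # flipped by the platform team, not by customers

    def put(self, key, value):
        # Dual-write during the migration so either backend can serve reads.
        self._legacy.write_row(key, value)
        self._new.upsert(key, value)

    def get(self, key):
        if self._use_new:
            value = self._new.fetch(key)
            if value is not None:
                return value
        # Fall back to the legacy system for data not yet copied over.
        return self._legacy.read_row(key)


# Customer code only ever calls put/get, before and after the cutover.
client = StoreClient(LegacyStore(), NewStore(), use_new=True)
client.put("service-a/config", {"replicas": 3})
print(client.get("service-a/config"))
```

The point of the sketch is that the migration work (dual writes, fallback reads, the cutover switch) lives inside the library the platform team owns, so customer teams do not have to change their code or track the migration themselves.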

Your teams are going to spend more time talking to customers, and less time purely writing code.

There’s no way to shortcut the product mindset transition for your engineering team. You can’t just add a product manager or a customer advisory board meeting once a quarter and call it a day. The team will need to spend more time with the customers, and more time strategically planning for how to address holistic concerns, rather than just triaging the latest set of customer complaints. There will be an up-front cost as you change the way people work. They may complete fewer tickets or other process-oriented measures of productivity, and the pace of work might look slower than it did when they were just churning through a never-ending backlog of tickets. But over time, the work that is produced should be better, as measured by customer surveys, adoption, migration timelines, and eventually, engineering productivity.

Keep it fun!

By the time companies go through this transition, they are often in a deep state of us-vs-them between their infrastructure organization and their other business/product engineering teams. This situation never feels good to the infrastructure organization. No matter how much they may claim outwardly that their users are hopeless or ungrateful (or worse), it is simply not much fun to have an antagonistic relationship with your users, and to feel like you are deep inside of an us-vs-them dynamic with colleagues. So while this transition is going to be tricky, it can also be fun if you let it. Get feedback from your users about what they love about the product. Share kudos as they come in, and take the time to celebrate improvements in your customer satisfaction metrics. Make sure your teams are part of the celebrations when their work enables an application team to do something they couldn’t do before. This is an exciting opportunity, a chance to learn, to modernize your approaches to work, and to create a more positive culture, and leading with a positive attitude will make all the difference in how fun it is for everyone.

Wrapping up

In many ways, this cultural shift echoes the changes that happened during the “devops”/SRE transformation. Engineers in SRE-focused organizations do not build code that they carelessly throw over the wall to an operations team. In the same way, engineers in a product-focused organization do not build software without consideration of the users of that software. These transformations ask more of the engineering teams, but deliver higher-quality outcomes as a result. It’s expensive and takes time but I promise you, it’s worth it.

Enjoy this post? You might like my book, The Manager’s Path, available on Amazon and Safari Online!

Saturday, January 15, 2022

Structural Lessons in Engineering Management

Software engineers are attracted to formulas, algorithms, and structures. As people whose job it is to take ideas and turn them into predictable executable code, it is unsurprising that we’re drawn to ways of thinking that categorize and systematize things. This attraction continues as engineers become engineering managers and leaders. I should know, I wrote a pretty popular book that is full of best practices, tips, tricks, and process suggestions to apply to the various challenges of leadership.

However, I think it’s worth talking about the downside of going too far with structural thinking as it pertains to teams and organizations. Let’s take a common area where this structural approach happens, namely that of organizational structures and interactions.

There are certain rules that I believe in, one of which is that engineering managers do not scale well beyond about 7–8 direct reports. It’s hard to keep the context of what each of these people is doing and thinking about in your head well enough that you can effectively guide them. You end up spending an outsized amount of time on management tasks when your direct team size grows much beyond this number; or, even worse, you just start doing a bad job at those management tasks because you don’t have the time to do them well.

On the flip side, many organizations do not want managers who only manage a couple of people directly. Someone who manages only two people is often in a limbo state: are they an individual contributor? Are they a manager? The HR system is binary: if they have direct reports, they must be a manager. But the reality is that if you only manage a couple of people, most of your time is spent doing things other than management. If this person is a new manager, they may not be learning the things they need to learn to grow into a confident managerial role. On the other hand, if this person is and wants to remain an individual contributor, their management is likely to be a bit lacking for these two individuals.

The temptation is to move from these two beliefs into the notion that you must create a clean tree structure for your organization. Each manager manages three to eight people; when a team grows too large you rebalance the tree. All is clean, and easy; our organizational structure works.

Experienced managers know how quickly this falls apart. People quit, and suddenly the manager of 5 people only has 2 direct reports. Managers don’t just drift between teams depending on the immediate size of the organization, they have skills and interests that make them sticky to their group. Individual contributors get rightly frustrated when their manager changes too often, and they prefer managers who have some understanding of their work. The teams need to have some coherence of vision and purpose; hiring is never a steady drip but comes and goes in bursts. Your ideals quickly fall apart when faced with the reality of a living organization and the needs of the people in it, so your trees sometimes have skinny branches, and sometimes very fat ones. The best you can do is try to nudge it back into shape over time, or very occasionally, reorganize into something more appropriate.


Another failure mode happens when leaders get enamored of applying engineering analogies to define team interactions. APIs are a fine concept but the idea that team interactions can be strictly defined by an API is as laughable as the idea that programmers won’t figure out a way to exploit every undocumented feature you provide in one. In this case, Hyrum’s Law applies as much to engineering organizations as it does to software. Humans will and must talk to each other, trade favors, and negotiate based on changing circumstances. When you are in the thick of trying to get things done, you will use what you can find, whether it’s an undocumented API or an engineer on another team who is willing to lend a hand.

Ironically, a lot of software engineers turned managers believe that they are taking humans into account with these rigid structures. When you see the world in systems, interactions, and efficiencies, you can trick yourself into believing that the humans inside of these systems will be happier if the system is well-organized. While I agree that there is value to organizing your teams effectively, the marginal value that the teams might experience is likely to be offset by the unintentional friction that happens for the people who don’t fit perfectly into your structure. If you don’t work to stay ahead of what the people in your team want and need, if you don’t create growth opportunities or appropriate technical challenges even though it’s not the most optimal thing for the system as it currently exists, you will find yourself losing good engineers and managers. The people in the system are much trickier than you are imagining, and the structure that accounts for their skills and happiness will have organic features as much as centrally-planned ones.

Managing a startup team at a new and growing company can lead inexperienced managers to believe that they can define the structures and processes to be perfectly tuned to what their company needs at the moment. I went through this phase myself, and I know the temptation to allow the ever-changing demands of startup life to justify changing and tweaking of teams and processes and projects in pursuit of the ideal execution machine. While managing a large organization at a more stable company, I’ve come to appreciate the cost and overhead of such a systems-focused approach. When you are thinking in terms of years and not months, it becomes important to accept that you may hang out in imperfect structures for many months or even longer, because you are trying to hang onto people for years.

So, new leaders, read those books and expose yourself to those ideas, but don’t stop there. Listen to your people, and don’t be afraid to break your perfect structures to accommodate their needs. They are your best source of learning, and your most valuable asset.


Saturday, October 9, 2021

How New Managers Fail Individual Contributors

Most companies have carefully created separate senior career tracks that provide details of the differences between being a manager and being an individual contributor (IC). And yet, many people still believe that you can’t get ahead without becoming a manager, and many companies who want more senior individual contributors struggle to promote people on this path. This is a shame; great engineers really shouldn’t need to manage large teams to get promoted, and companies lose out on a critical skillset when they push all of their good engineers into management.

Why do we have this problem despite all our efforts? I believe the problem with keeping people on the technical track starts with managers. Specifically, it starts with new managers. You see, most people become managers right at the point where career tracks split between “technical” and “management” specializations. The result is that many new managers have most recently been very technical, yet have no idea what it means to climb the technical track, even though they will be managing people who want to follow that path. To be a great manager, you can’t afford to let the ICs on your team feel that they have no career path, so it’s up to you to manage this well. Here are some common pitfalls that you should work to avoid.

1. Doing all the technical design work yourself

You’re just coming off of being an IC or maybe a tech lead, so you are still pretty deep in the technical details (especially if you’re now managing the team you were working on). You might still be writing some code, which is fine if you’re managing a very small team. But it is critical that you step back on the technical decisions to make room for the team members to own things and grow.

This is going to be hard because you may have a small team with a lot on its plate, and the other ICs may not have your skills at communication or project management. If you respond by filling in for their skill gaps, you are going to quickly hit two problems. First, you won’t be able to scale because you’ll be too busy doing technical stuff to take on a bigger team. Second, you won’t be able to scale because you won’t have a person to whom you can delegate.

2. Doing all of the project management yourself

This one you would probably love to give up, I know. If you have a very small team, as a manager you’re the right person to do most of the project management for the team. But for their career growth, your technical track folks also need to learn how to run projects themselves. The more senior you get on the technical track, the more that you will be expected to understand not only how to solve really hairy technical problems but how to break down the solution into milestones, and even into projects that can be worked on by multiple people.

Teaching someone how to run a project is painful. They will often say that they don’t want to do the work or learn it, and they may make your life difficult in the process. And yet, teaching your ICs these skills is one of the best things you can do for their future promotion prospects! Plus, it’s one of the best things you can do for your own future prospects. A manager who successfully creates a tech lead capable of solid design work and project management now has the bandwidth to take on more and expand their scope.

3. Neglecting to give feedback

Many new managers are comfortable giving technical feedback, and uncomfortable giving other kinds of feedback. They freely criticize the design and technical work of their team, but they don’t challenge their team members on other growth areas like collaboration, communication style, or project ownership. The result is the impression that management is the way to have technical authority over a group, which leads ICs to wonder what the technical track is even for.

One of the ways to give feedback that will stick is to give it in context of career growth. Take the time to understand the technical and non-technical skills that your company looks for in senior engineers, and use that framing to set goals on both of these aspects for your team. This will force you to pay attention to more than just the technical delivery, and make it easier to talk about non-technical areas for improvement as needed for future promotion.

4. Hoarding information

You’re now in a position where people will naturally pass information on to you. You may be in more planning meetings with the product team, or staff meetings with your boss and peers, and you may become the person who gets pinged directly when someone has a question or request for your group. This means that you’ve now got a lot more details about what is going on around your team, and this information is critical for you to lead your team well. You must distill this information and then communicate it to your team in a way that helps them understand their work.

When you don’t give your team the context for the work and just pass on tasks and work items to them, you make it clear that they are simply “doers” and your job is the job of “decider.” There is a fine line between giving the team focus time and excluding them from meetings where they would get necessary information and context to feel ownership of the projects. Your growth challenge is to learn the balance of providing information to the team and inviting them along to get that information, while not overwhelming them with meetings.

5. Focusing too much on your personal output

As a manager, your output is not measured by your individual work. Rather, your output is measured by the work of your team and the people that you influence. The work you choose to do, and the work you choose to neglect or delegate, will lead to amplified outcomes in both positive and negative directions.

If you continue to focus on your personal contributions, such as writing code, technical design, and day-to-day decision-making, you will constrain the output of your team to only what you can fit into your schedule. If it’s your code that gets you over the finish line for every project, you aren’t providing multiplicative value for the team, you’re providing the additive value of your work as an engineer. When you turn your focus to the work you can do to improve the team’s output, by training them to do these tasks, ensuring that they work well as a team, and giving them the context they need to make decisions themselves, you now start to create multiplicative value. When they become more productive and less reliant on your hands-on work, your time is freed to identify bigger challenges. This is the path to growth for the whole team, but it’s hard to find if you’re heads-down in the details.

Conclusion

New managers, make sure that you aren’t trying to be a senior engineer who has direct reports. If your heart is in the code and systems, perhaps you should be on that technical track yourself! Otherwise, remember that your job is now about generating leverage by developing your team, which means delegating the technical work to them while helping them identify other skills they will need to successfully grow as an engineer. If you can do this, you’ll have a bright career in management, and a loyal group of amazing senior individual contributors to work with in the future.

Originally published on leaddev.com


Saturday, June 12, 2021

An incomplete list of skills senior engineers need, beyond coding

For varying levels of seniority, from senior, to staff, and beyond.

  1. How to run a meeting, and no, being the person who talks the most in the meeting is not the same thing as running it
  2. How to write a design doc, take feedback, and drive it to resolution, in a reasonable period of time
  3. How to mentor an early-career teammate, a mid-career engineer, a new manager who needs technical advice
  4. How to indulge a senior manager who wants to talk about technical stuff that they don’t really understand, without rolling your eyes or making them feel stupid
  5. How to explain a technical concept behind closed doors to a senior person too embarrassed to openly admit that they don’t understand it
  6. How to influence another team to use your solution instead of writing their own
  7. How to get another engineer to do something for you by asking for help in a way that makes them feel appreciated
  8. How to lead a project even though you don’t manage any of the people working on the project
  9. How to get other engineers to listen to your ideas without making them feel threatened
  10. How to listen to other engineers’ ideas without feeling threatened
  11. How to give up your baby, that project that you built into something great, so you can do something else
  12. How to teach another engineer to care about that thing you really care about (operations, correctness, testing, code quality, performance, simplicity, etc)
  13. How to communicate project status to stakeholders
  14. How to convince management that they need to invest in a non-trivial technical project
  15. How to build software while delivering incremental value in the process
  16. How to craft a project proposal, socialize it, and get buy-in to execute it
  17. How to repeat yourself enough that people start to listen
  18. How to pick your battles
  19. How to help someone get promoted
  20. How to get information about what’s really happening (how to gossip, how to network)
  21. How to find interesting work on your own, instead of waiting for someone to bring it to you
  22. How to tell someone they’re wrong without making them feel ashamed
  23. How to take negative feedback gracefully


Saturday, May 29, 2021

Management Basics: Determining a Performance Rating

Originally posted on LeadDev.com

One of the most stressful parts of the end-of-year process for managers is the dreaded performance rating. This process forces you to boil down all of the work that a person did over the year, all of their accomplishments and misses, into a numeric score (often from 1–5) that may also come with words like ‘meets expectations’, ‘exceeds expectations’, or the unhappy ‘misses expectations’.

If you work for a company that has a ‘pay for performance’ model, your rating will influence the employee’s compensation. It may be used as an input for promotions, and yes, as a factor in firing or laying off employees. And so while this process is painful, every manager needs to get comfortable with assigning ratings to their engineers, and justifying those ratings to both the person receiving the rating, and potentially to their peers, boss, or other stakeholders who have a say in the distribution of rating scores at the company.

Your manager rating is a combination of measurable inputs and your own judgment. Today, I’m going to focus on how you can start to get comfortable with this step, by developing your own judgment and the supporting metrics you can use to guide your ratings.

Evaluations start long before it’s time to actually determine a rating

The first major input in any kind of fair evaluation is based on the work the employee needed to accomplish, and the work they did accomplish. This process starts with both goal-setting and a review of the expectations of the role, as well as any areas you already know they need to improve on. This may seem obvious but, as we’ll see, it is actually a tricky thing to get right!

As an example of using goal-setting as an input in performance rating, let’s explore a method that splits goals up into three categories: must be achieved, stretch goals, and moonshot goals.

By setting ‘must be achieved’ goals, you inform the employee on what the important work for the year should be, given their level and role. Then you collaboratively think about what stretch goals might look like, including a goal or two that would be really outstanding if they could achieve it — a moonshot goal. When you have your whole organization calibrated on how to set these kinds of goals, you can use them as a critical input for performance rating. If someone met the base goals, they probably met expectations. If they met the base goals and some of the stretch goals, they exceeded or substantially exceeded expectations. If they achieve base, stretch, and a moonshot goal, that would imply a year of incredible achievement that would support the highest rating of all.
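The mapping described above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the `GoalResults` type, the `suggested_rating` function, and the exact label strings are hypothetical, and the handling of missed base goals and of a moonshot achieved without any stretch goals is my own guess, since the text does not spell those cases out.

```python
from dataclasses import dataclass


@dataclass
class GoalResults:
    """Hypothetical tally of one person's goals for the year."""
    base_met: int
    base_total: int
    stretch_met: int
    moonshot_met: int


def suggested_rating(r: GoalResults) -> str:
    """Return a starting-point label; it is an input to a rating, not the rating."""
    if r.base_met < r.base_total:
        # Assumption: a missed base goal needs a human to ask why it was
        # missed (irrelevant, failed execution, unplanned critical work).
        return "needs manager judgment"
    if r.stretch_met > 0 and r.moonshot_met > 0:
        return "highest rating"
    if r.stretch_met > 0 or r.moonshot_met > 0:
        # Assumption: a moonshot without any stretch goal still counts as exceeding.
        return "exceeds expectations"
    return "meets expectations"


print(suggested_rating(GoalResults(base_met=3, base_total=3, stretch_met=1, moonshot_met=0)))
```

Even in this toy form, the first branch refuses to produce a rating on its own, because a missed goal says nothing about why it was missed.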

Of course, the downside of goal-setting is that goals often become irrelevant in the course of a year. This is where your judgment must come into play. When they missed a goal, was it because it became irrelevant, because they failed to execute, or because they were unavoidably busy with unplanned critical work? Ideally you revisit goals regularly and adjust them when they become irrelevant, but I’m a realist and know that many people forget to do this or get too busy to bother. It’s a lot of work to constantly track this! So most of us must get comfortable with looking across the scope of goals (hit, missed, and deferred) and valuing them as they are.

Judgment is a major part of evaluating more senior people’s goals, particularly those of more senior managers. On the one hand, they will tend to be responsible for the achievements of a team of people, and many things can happen to a team in a given year to derail their goals. On the other hand, the more senior a manager gets, the more they are expected to anticipate ways in which their team may fail to achieve their goals; this anticipation and course correction is part of performing well as an experienced manager. Only you can tell the difference between a manager who was blindsided by events and a manager who just failed to plan well, or who couldn’t course correct their team effectively.

Decide on your own axes of evaluation and attempt to apply them evenly

A manager I used to work with had a very methodical approach that he used to evaluate managers on his team. He had seven characteristics of management that he considered essential to doing the job, and would score each manager on each area, then roughly average them to get a final score. A different manager that I worked with at Rent the Runway did something similar with our four engineering ladder attributes. She would grade each person based on their level and role, and use that to justify how she rated her team.

I find this approach to be a helpful part of my ratings process. Looking across a set of attributes that I believe are important forces me to think about all of the skills a person brings to the table, and how they seem to be doing at each of them. This helps me identify strengths and weaknesses, and structures my thinking when I am evaluating across people in similar roles.
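To make the score-and-average approach concrete, here is a small sketch. The attribute names, the 1–5 scale, and the `weak_threshold` cutoff are all made up for the example; neither manager’s actual list of characteristics is given here, so treat this as one possible shape rather than their method.

```python
from statistics import mean


def average_score(scores: dict[str, float]) -> float:
    """Rough average across attributes, rounded to one decimal place."""
    return round(mean(scores.values()), 1)


def weak_areas(scores: dict[str, float], weak_threshold: float = 2.0) -> list[str]:
    """Attributes scored at or below the (hypothetical) threshold."""
    return sorted(name for name, s in scores.items() if s <= weak_threshold)


# Hypothetical scores on a 1-5 scale for one manager.
scores = {"hiring": 4, "delivery": 5, "communication": 1, "coaching": 4}

print(average_score(scores))  # 3.5
print(weak_areas(scores))     # ['communication']
```

Listing the weak areas next to the average is a deliberate choice: a 3.5 looks like a solid year, while the per-attribute view shows one area that may need a closer look.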

Rating by category works when you have a lot of clarity about what is important (as in the seven characteristics of managers), or a ladder that is well-written to support this. But it does have its downsides, and these will become apparent when you try to put this into practice. People are not easy to put into boxes. You may have a manager that is incredibly weak in one area and incredibly strong in another. Does this average out? Or are they actually underperforming, because their weak area is so essential that it means they aren’t doing their job? Or on the flip side, are they over-performing because, for this role and team, the weak area doesn’t matter so much? Now you have to add judgment into the mix.

It’s also hard to apply this model when you have a team of people who all have somewhat different jobs. It’s very hard to write level criteria that work well for both front-end and back-end engineers, let alone systems specialists, reliability engineers, and DBAs. Then add in the fact that you may be managing someone who is in a bespoke role, say developer relations or technical writing. Now you may not have enough data points to make sure that you are rating the person fairly, because there is really no one else for you to compare them to.

The final component has to be your own judgment

This brings us to the final aspect of performance rating: manager judgment. As with all things in the world of people management, there is no perfect algorithm you can apply to ensure total fairness and accuracy. You do not want to set yourself up to deny good ratings to people doing good work when they don’t fit perfectly in a box, or when they had all of their goals upended by a strategic shakeup halfway through the year. But you also don’t want to unfairly reward or punish people through an arbitrary process that relies on how much you like or dislike them.

Setting clarity at the beginning of the performance evaluation period via goal-setting and job alignment is important because it gives your employees a clearer idea of what they are going to be evaluated against. Breaking roles down into components and rating each person against each component helps you consider a balanced picture of someone’s strengths and weaknesses across the areas that matter. But ultimately, these inputs are merely some of the data you need, and you must consider the full picture of their work against the ever-changing requirements and challenges of your workplace.

The most interesting and useful part of this exercise is comparing what you get from this data against your gut reaction to the rating you think someone should get. If you are open-minded, the data will show you that your gut is too high in some cases and too low in others.

This is an opportunity for you to broaden your thinking: what aspects of this role are really important but unstated in our level guidelines? It will force you to postmortem your own leadership: how did we miss on the goals here so badly? And sometimes, it will force you to acknowledge your own bias: why do I always want to give this kind of person a lower rating even though objectively their work looks as good as this other kind of person?

Your own ratings are rarely the final step. Most companies force an alignment across teams and managers via a calibration exercise in order to ensure ratings are fairly applied across the company. But if you spend good time up-front getting very clear in your own mind about the rating you believe someone deserves, and the reasoning behind that rating, you are well-prepared to go into that calibration exercise and defend your ratings with thoughtfulness and care. And you want to be prepared, because once the rating is set, you’ll have what may be the hardest conversation of all: the one where you share the rating with the employee.


Sunday, January 24, 2021

Make Boring Plans

You’re probably familiar with the concept of Choose Boring Technology. If you’re not, I’ll wait for you to read the excellent blog post by Dan McKinley that inspired a much-needed correction in tech to balance “innovation” with stability. I’m here to take this to the next level, and talk about how “boring” should apply not just to your technology choices, but to your plans.

I spoke to someone several months ago who was frustrated with their management chain. They were anxious about the fact that the management chain was always pushing on delivery in an unpredictable way. The team felt really high pressure, even though the projects they were working on were all part of long-running infrastructure renovations. Why was this so stressful? Why, they asked, was the plan not already laid out? Why isn’t this boring?

Why isn’t this boring?

It might say something about the area that I focus on, Platform Engineering*, that “why isn’t this boring” would ever come up. You see, usually when people are in this situation, they blame everything but the lack of planning for their problems. It is a common belief in engineering that, with a clear enough vision, the rest of the pieces of work will fall into place. With a well-understood goal and smart engineers, the idea is that you can trust that people will work towards that vision faithfully and deliver something great. And this does, in rare cases, seem to work. After all, half of the hiring wisdom of the past has been “hire smart people and get out of their way.” Magic can happen with a small, highly-motivated group of people building a new thing towards a clear goal.

However, this concept of building towards a grand vision falls apart when you are building the underlying software that other engineers rely on. For better and for worse, Platform often has to be the place where we push new things to the rest of the company. A big change in the platform is the definition of an innovation token being spent. You want to move to Kubernetes? You’re gonna spend a lot of time figuring out how to operate it well in your environment, to start. You want to support a massive monorepo for the whole company? Hello innovation tokens everywhere, as you try to make it scale and perform well for all of your engineers and all of the languages they want to use. Speaking of new languages, you want to introduce Rust, or OCaml, or even just C++17? The platform will have to support it.

Before you go blaming the Platform team for spending all of the innovation tokens for the company, remember that these initiatives are often driven by someone else. If Platform doesn’t support Kubernetes, some team will decide to build shadow infrastructure because they’re convinced that it will solve the problem they have to handle with their tens of microservices, and then it will land in the lap of Platform after a year with none of the work done to make it easy to operate, but all of the operational expectations anyway. Our goal is to build just enough ahead of you so that when you realize you need the capacity, it’s there, or can be with minimal fuss, and it’s reliable to boot.

Novel Technology Deserves Boring Plans

Since we often end up in the land of novel technology, we owe it to ourselves and our customers to be boring in other ways. And the most important way that a Platform team can be boring is by writing boring plans.

It’s great to have a vision for the future of the platform. To achieve this vision, a non-trivial amount of our job is not just building new big, scalable, complex software infrastructure, but moving everyone from the last generation of this software infrastructure to the next generation. Upgrade the programming languages, the operating systems, the libraries. Move from OpenStack to Kubernetes, from on-prem to the cloud, from Maven to Bazel, from SVN to Git. Migrate from the old storage system that was optimized for a rare legacy use case, to a new storage system with higher availability and performance.

Making these changes happen, under the covers, has both interesting parts and boring parts. If you’re not a platform engineer, you shouldn’t see the interesting parts. The interesting parts are where we go and tune the kernel to perform well for our workloads. The interesting parts are where we build out automatic failover, so that we can meet the availability needs of the workloads. The interesting parts are the many patches we might contribute back to the inevitably-broken open source projects that hold half the world together but still don’t seem to understand how to work with FQDNs. The interesting parts are where we understand deeply the dependencies of our technology stack, the opportunities and limitations, and build solutions for our customers that fix limitations and unlock new opportunities.

When we don’t attend to the boring parts by making our plans predictable, the interesting parts turn into extra stress on top of the overwhelming anxiety of juggling these moves. When you make plans that start and end with the vision “we will move everyone to the public cloud, and it will be great,” you find yourself in the exhausting situation of running all of your old infrastructure, trying to figure out the new cloud stuff, and dealing with customers who are confused and angry that the thing they want to do doesn’t seem to quite work in either world.

Contrast this to the team that turns that vision into boring plans. They start with a small proof of concept, migrating perhaps a single application and learning in the process. Then they do the work of looking across other applications on the old platform, to see which ones are similar to the one that is now in the cloud. They work with those users to get them migrated and running, all the while gaining comfort with this new environment and uncovering the interesting gotchas. They write down what they’re learning, so that each new step in the migration builds on the last, and others can be pulled in without a huge knowledge transfer. The team focuses on the hard parts of the moment, whether they are figuring out data mirroring, or fixing a bug in a popular open source project, and they are free from the anxious overhead of wondering what is happening tomorrow. The users are also free from the stress of wondering when the work they need will be delivered, because the team has communicated plans that account for this process of iteration, learning, and gradual migration.

A Strategic Plan Is Obvious and Simple, Even Boring

Making boring plans is a foundational step in getting good at setting engineering strategy. Strategy is often confused with innovation and vision in tech circles, but they are far from the same thing. Having a future vision and recognizing the potential of innovations is valuable in building great strategy, but strategies that rely on unproven magic bullets are not good strategies. Good strategy identifies a problem with the current situation, proposes a principled approach to overcome it, and then shows you a coherent roadmap to follow. Strategy is not in the business of razzle-dazzle, it’s in the business of getting to the core of the issues so that the solution becomes simple and obvious. Good strategy provides the clarity that enables boring plans.

To become great at technology strategy, start by getting good at making boring plans. Get clear about the problem you are overcoming with your plans. Make the principles of the work at each stage clear:

  • How do we know when we’re in exploration mode, and how do we know when we’re ready to commit to a direction?
  • Have we talked to our users? Do we understand how they are using our systems, and have we made plans that account for their needs?
  • What are the problems we’re focused on solving right now, and which problems are we leaving to worry about another day?
  • How do we know if we’re on the wrong track, what are the guardrails, milestones, or metrics that tell us whether the plan needs review?
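One way to keep yourself honest about these questions is to write each stage of the plan down as data, so the answers are explicit rather than implied. The following is only a sketch: the `PlanStage` class, its field names, and every value in the example are hypothetical illustrations, not a prescribed format.

```python
from dataclasses import dataclass


@dataclass
class PlanStage:
    """Hypothetical record of one stage of a plan; all field names are illustrative."""
    name: str
    mode: str                # "explore" or "commit"
    in_scope: list           # problems we are solving right now
    deferred: list           # problems we are leaving for another day
    guardrail_metric: str    # what we watch to know if we are on the wrong track
    review_threshold: float  # the plan gets a review once the metric exceeds this

    def needs_review(self, observed):
        """True when the observed guardrail value says the plan should be revisited."""
        return observed > self.review_threshold


# Made-up example values for a pilot migration stage.
stage = PlanStage(
    name="migrate pilot application",
    mode="explore",
    in_scope=["data mirroring for one application"],
    deferred=["migrating the remaining applications"],
    guardrail_metric="weeks spent on pilot",
    review_threshold=6,
)

on_track = not stage.needs_review(4)  # 4 weeks in: keep going
off_track = stage.needs_review(8)     # 8 weeks in: time to review the plan
```

The value is not in the code itself but in being forced to fill in every field: a stage with no guardrail or no deferred list is a sign that the plan still rests on hope.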

Your teams need more than a clear idea of the end state and the hope that smart engineers will inevitably get you there. Plans that are formed around hope are failing plans; hope is not a plan. Plans that change constantly are failing plans. When your plans are constantly changing, it is a sign that you either are making plans that express a certainty you don’t have, or you haven’t done your research to get the right certainty in place. Either of these is a waste of time and an unnecessary stress on the team.

So leaders, you owe it to your teams, and to your users, to free them from the tyranny and stress of uncertainty. You must do the work to go beyond vision, create concrete actions, and make boring plans.

Saturday, November 21, 2020

Driving Cultural Change Through Software Choices

This tweet got me thinking about change, and how software engineers (and especially, Platform teams) can drive cultural change throughout companies.

First, let’s take the question. You want to change the engineering values that your company is expressing. You don’t just want to create a heavyweight process (your checkin fails if you don’t reach X code coverage, for example), you want engineers to start to value these things enough that they don’t need a process to enforce them.

I’ve driven and watched culture change happen enough times to know how to do it from the position of senior leadership. You change what you reward and focus on, and repeat that change enough that people will start to naturally change their perspective (or, sometimes, leave). Say you go from putting all of your focus and attention on new projects to prioritizing stability: you take the time to praise teams who improve their stability, promote engineers who complete projects that are related to stabilization, and, crucially, set prioritized goals for your teams to work on stability. Your culture will change from prioritizing new projects to prioritizing stability.

This is a powerful force, but it is slow, and what’s worse, it can have negative consequences. You don’t want teams who are afraid to do new things for fear that they will be punished for a lack of stability, for example. And if you overcorrect for a perceived cultural gap, you can end up chasing away otherwise great engineers who believe that their skills are no longer wanted or valued because they no longer have an important skill set. So trying to make your teams change their technology approaches purely via a cultural focus is not always the best approach.

There is another lever that is available, however, particularly to platform developers. And that is the lever of product features.

I’ve never met an engineer who didn’t occasionally copy-paste-modify some code. One of my earliest professional software lessons was that when you set up a codebase full of tests, other engineers are likely to write tests for their code because there will be lots of examples for how to test. This generalizes to the observation that people are most likely to take an existing thing and tweak it into a new thing that does what they need, and in the process they will take the good and bad from that existing thing. So if you want them to follow a best practice, put it in their starting templates.

Take building a service. If you start with nothing but a system that runs a simple web server, you might go through the effort to also set up metrics and monitoring and healthchecks, but you might also feel like you’re busy and you just want to get the code that you absolutely must write done. On the other hand, when you start with a service framework that is already set up with metrics and monitoring and healthchecks, you’re more likely to do a little bit of work to make those at least mildly useful. This was one of the insights that Dropwizard gave me back in the day: pre-integration of stuff that you really need to run a service well means that your services are better from day one.
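To make the idea concrete, here is a minimal sketch of what a pre-integrated starting point might look like. This is not Dropwizard's API or any real framework; `ServiceTemplate`, `GreetingService`, and the `/healthcheck` and `/metrics` paths are hypothetical names chosen for illustration. The point is that the service author only writes the application handler, and the health check and request counting are already there.

```python
import time


class ServiceTemplate:
    """Hypothetical starter class: a health check and basic metrics come built in."""

    def __init__(self, name):
        self.name = name
        self.started_at = time.monotonic()
        self.request_counts = {}  # path -> number of requests seen

    def handle(self, path):
        """Route a request; every call is counted before any service code runs."""
        self.request_counts[path] = self.request_counts.get(path, 0) + 1
        if path == "/healthcheck":
            return 200, {"service": self.name, "status": "ok"}
        if path == "/metrics":
            return 200, {
                "uptime_seconds": time.monotonic() - self.started_at,
                "requests": dict(self.request_counts),
            }
        return self.handle_app_request(path)

    def handle_app_request(self, path):
        """The only method a new service has to override."""
        return 404, {"error": "not found"}


class GreetingService(ServiceTemplate):
    """What a busy engineer actually writes: just the application logic."""

    def handle_app_request(self, path):
        if path == "/hello":
            return 200, {"message": "hello"}
        return super().handle_app_request(path)


service = GreetingService("greeter")
hello_status, hello_body = service.handle("/hello")
health_status, health_body = service.handle("/healthcheck")
metrics_status, metrics_body = service.handle("/metrics")
```

Anyone who copies `GreetingService` as the basis for the next service inherits the health check and metrics without having to decide to add them.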

Platform developers these days get this. Your infrastructure software now comes out of the box with observability hooks built in. We’re all probably more attuned to the basics of creating reliable software now because our tools push it on us from the get-go, so there’s no cultural revolution necessary.

We can go further than observability. Security can be a process, or a cultural value, but you can also go quite far by providing tools and platforms that have good security practices baked into them, so that you’re not relying on the good citizenship of your development team. Testing is often hampered by the overhead of running tests, and investment in infrastructure that makes tests easy and fast to run is important to supporting a culture of software quality validation.

All of this is to say that developers have more power than they imagine to change the engineering culture around them. As you build software that others will use or that your peers will work on, are you making it easy for them to do the right thing? If you build platforms, bake in easy integrations for the software values you want to see. If you’re in the position to choose new tools, pick ones that support the standards you want taken seriously. And as you write code, make it easy for others who will copy-paste what you’ve done to then do the right thing.