Buy My Book, "The Manager's Path," Available March 2017!

Friday, January 6, 2017

How Do Individual Contributors Get Stuck? A Primer

Occasionally, you may be asked to give constructive feedback on your peers, perhaps as part of review season. If you aren’t a naturally critical person but you want to give someone a valuable insight, you may find this task daunting. To that end, I suggest the following:

Pay attention to how they get stuck.

Everyone has at least one area that they tend to get stuck on. An activity that serves as an attractive sidetrack. A task they will do anything to avoid. With a bit of observation, you can start to see the places that your colleagues get stuck. This is a super power for many reasons, but at a baseline, it is great for when you need to write a review and want to provide useful constructive feedback.
How do people get sidetracked? How do people get stuck? Well, my friend, here are two incomplete lists to get you started:

Individual Contributors often get sidetracked by…

  1. Brainstorming/architecture: “I must have thought through all edge cases of all parts of everything before I can begin this project”
  2. Researching possible solutions forever (often accompanied by desire to do a “bakeoff” where they build prototypes in different platforms/languages/etc)
  3. Refactoring: “this code could be cleaner and everything would be just so much easier if we cleaned this up… and this up… and…”
  4. Helping other people instead of doing their assigned tasks
  5. Jumping on fires even when not on-call
  6. Working on side projects instead of the main project
  7. Excessive testing (rare)
  8. Excessive automation (rare)

Individual Contributors often get stuck when they need to…

  1. Finish the last 10–20% of a project
  2. Start a project completely from scratch
  3. Do project planning (You need me to write what now? A roadmap?)
  4. Work with unfamiliar code/libraries/systems
  5. Work with other teams (please don’t make me go sit with data engineering!!)
  6. Talk to other people (in engineering, or more commonly, outside of engineering)
  7. Ask for help (far beyond the point they realized they were stuck and needed help)
  8. Deal with surprises or unexpected setbacks
  9. Navigate bureaucracy
  10. Pull the trigger and go into prod
  11. Deal with vendors/external partners
  12. Say no, because they can’t seem to just say no (instead of saying no they just go into avoidance mode, or worse, always say yes)

“AHA! Wait! Camille is missing something! People don’t always get stuck!” This is true. While almost everyone has some areas that they get overly hung up on, some people also get sloppy instead of getting stuck. Sloppy looks like never getting sidetracked from the main project but never finishing anything completely, letting the finishing touches of the last project drop as you rush heedlessly into the next project.

Noticing how people get stuck is a super power, and one that many great tech leads (and yes, managers) rely on to get big things done. When you know how people get stuck, you can plan your projects to rely on people for their strengths and provide them help or even completely side-step their weaknesses. You know who is good to ask for which kinds of help, and who hates that particular challenge just as much as you do.

The secret is that all of us get stuck and sidetracked sometimes. There’s actually nothing particularly “bad” about this. Knowing the ways that you get hung up is good because you can choose to either a) get over the fears that are sticking you (lack of knowledge, skills, or confidence), b) avoid such tasks as much as possible, and/or c) be aware of your habits and use extra diligence when faced with tackling these areas.

Wednesday, January 4, 2017

Hey Diddle Diddle, Data to Fiddle

When I worked in finance ages ago, there was a system used by many (but not me!) that was basically a combination of a gigantic distributed database plus a scripting language that allowed you to run calculations over information in that database. One of the things that you could easily do, as far as I understand, was "diddle" a piece of information. The "diddle" would change that piece of data inside of a particular scope, so that you could quickly see different calculations over the graph, without necessarily persisting that data back to the larger system. This was a useful construct for exploring what might happen with changes to different input data and exploring different scenarios. (The first half of this blog post provides some insights into how the system might have worked).

Whether my understanding is exactly right or not is irrelevant except that this concept of "diddling" stuck with me. There are times when what you want to do is take persistent data, make a small change to it on the fly, and use the results of that change without necessarily persisting it back to the original data set.

I've often thought that this concept is particularly useful in places like personalization. Imagine the situation where you have a complex set of results that you wish to display to a user, like, say, Google search results. We all know that the ranking system for Google results is a complex beast, relying on a huge amount of precomputed data, for example, the links between pages. But now, to compute that graph with personalization taken into account? You're probably not calculating all personalization vectors on the fly, but when I go to search for "java" you're also probably not doing a lookup of pre-computed personalized results for "java + userid:camille." Instead, you're applying a "diddle" function to the top set of the overall graph, and showing me the results in the diddled order that makes sense for me.

There are two parts to the concept that make it powerful for me. The first is the idea that you are changing things temporarily. To serve large sets of results fast (or, in the case of Google, to be able to function at all), you need to pre-calculate a huge amount of data. You're doing a complex piece of work that takes some time, and you don't want to have to redo it for every request. However, you also don't want to force yourself to store all the work for all possible scenarios up-front. Diddling, in my mind, also presents a second element: it is only applied to the set of data within a limited scope. You don't diddle across the entire search index. You diddle the first few results, the ones that matter to the user in question.

There can be a ton of technical complexity to implement such a concept in practice. One immediate challenge is that of "diddling" in such a way as to drop results from the top set, thus requiring a re-querying for additional responses to get enough data to satisfy the user. The purpose of this post is not to go into the technical details of how you might implement such a thing, but to show you that you can reframe your thinking on a problem like this through its phases. Just because you have a list of thousands as your first pass of results doesn't mean you need to personalize across that whole data set to get the best results for the end user. If your goal is to get the most personally relevant of the most generally relevant, you probably want to operate on the top of the generally-ordered list, not necessarily the whole list itself.
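To make the "operate on the top of the list" idea concrete, here is a minimal sketch, not a description of how any real search engine works. The names `diddle_top` and `adjust`, the sample data, and the scores are all made up for illustration. It re-scores only a small window at the head of the generally ordered list, widens the window when drops leave the page short (the "re-querying" challenge mentioned above), and never writes anything back to the source list.

```python
from typing import Callable, List, Optional, Tuple

Result = Tuple[str, float]  # (document id, general relevance score)


def diddle_top(
    ranked: List[Result],
    adjust: Callable[[str, float], Optional[float]],
    page_size: int = 3,
    window: int = 3,
) -> List[Result]:
    """Re-rank only the head of a generally ordered list for one user.

    `adjust` returns a personalized score, or None to drop the result.
    `ranked` is never modified, so nothing is persisted.
    """
    kept: List[Result] = []
    start = 0
    # Fetch another window only when drops leave the page short.
    while len(kept) < page_size and start < len(ranked):
        for doc_id, score in ranked[start:start + window]:
            new_score = adjust(doc_id, score)
            if new_score is not None:
                kept.append((doc_id, new_score))
        start += window
    kept.sort(key=lambda result: result[1], reverse=True)
    return kept[:page_size]


# Hypothetical general ranking for the query "java".
GENERAL = [
    ("java-island", 0.95), ("java-lang", 0.90), ("java-coffee", 0.85),
    ("javascript", 0.60), ("java-docs", 0.55), ("java-jobs", 0.50),
]


def for_programmer(doc_id: str, score: float) -> Optional[float]:
    """Hypothetical per-user diddle: drop coffee, boost programming pages."""
    if doc_id == "java-coffee":
        return None
    if doc_id in ("java-lang", "java-docs"):
        return score + 0.3
    return score


page = diddle_top(GENERAL, for_programmer)
print([doc_id for doc_id, _ in page])
```

The general ordering still does most of the work here; the diddle only reshuffles and trims the few results this user will actually see.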

There are many ways to attack such problems, and I have no idea how companies like Google solve the challenge of personalizing results under the hood. I do know that, to me, the idea of mixing indexed and computed results in personalized querying is a sticky one, and it's an analogy I use frequently. It helps you remember that there is value in the underlying order of the results as provided by the source of truth, and that personalization is often an enhancement on an underlying set of computed data, not the fundamental computation itself. Pulling out to a larger picture, remember that when your data gets in front of a human end-user, they are going to operate on only the tiny surface area they, as a human, can process at any one time. So in cases where you're serving humans, you can apply different patterns on the fly to the human-visible surface area that would be too expensive to apply to the entire data set at large.

Tuesday, November 29, 2016

Building and Motivating Engineering Teams

I have agreed to give a guest lecture for a class at Yale, and they’ve asked me to speak about “building and motivating engineering teams” from the perspective of a smaller startup. The readings for my section include A Field Guide to Software Developers by Joel Spolsky. I remember reading it when it was first written. I admire Joel’s work, and the piece has many valuable takeaways.

However. The industry has changed a lot in the years since this piece was written. In 2007, the options for engineers here in NYC were far more limited. You worked for a bank, a media company, ad tech was starting to blossom, some e-commerce. Google or Fog Creek if you were lucky. There were plenty of jobs but there was far less of a classic “tech company” presence than there is today.

Over the past 9 years, we’ve seen a massive increase in the number of startups here in NYC. Every major tech company has an office here; the Google office alone has thousands of engineers. We’ve also seen an increase in the supply of engineers. Between students realizing the value of a tech degree, people changing careers, and bootcamps rapidly churning out graduates, the mix of types of people who write software has changed. Most of the people I knew writing software in 2007 had tech or tech-adjacent (math, physics) degrees. My team when I left Rent the Runway had a significant number of developers who had moved into tech from other areas, some without degrees at all.

All of this is to say, playing to 2007 nerd stereotypes is not always a good way to build a team here in NYC. What DO engineers want? I believe that you have 3 axes that you can twiddle. Depending on your company, you can lean more on one to cover problems with the others, but you have to find your balance.

Money. Joel hides this at the bottom of his piece. It is correct to say that engineers don’t care about money above all else, but far too many people delude themselves into thinking that this means that you can lag industry standard salaries and build a good team on the basis of some other factor.

Salaries have gone way up in the past 10 years. People know that they can get paid a certain amount of money, and if you try to hire people who can easily make 50% more somewhere else, they are almost certainly not going to accept your offer. Figure out what the market spread is. The top will be companies like Google, Facebook, some financial companies. The bottom might be non-profits or extremely early startups with significant equity grants. Even a startup with 10–20 people is going to struggle to hire without paying close to market rate. Engineers are expensive. New engineers are expensive. Senior engineers are really expensive. They know their value. You have to pay them.

You are building a company. You are not going to have a perfect, smooth ride. Things will go well, things will go poorly. Very few teams have a straight shot to greatness that is so clear it can cover all management woes. When you don’t pay people well enough, you contribute to undermining their resilience in the face of problems at work. Think of it as the baseline of Maslow’s Hierarchy. Money does not solve all problems for most people, but lack of money exacerbates all irritations.

Purpose. You’re building a company. You want to inspire people to work for you, so you sell them on the mission of the company, the product they’ll be working on. You have to do this because you are not building a company with a bunch of insanely hard tech problems. If you are, you can lean on that to hire people, but to be real, these days most of us are not. The days when everyone had hard technical scaling problems have ended. Scaling is not the same bold new frontier that it was even 5 years ago. Sure, there are always technical challenges to be found in organizations, but for many companies those technical challenges revolve around matching technology with the product and business. This means you have to work harder to sell the learning opportunities.

Leadership undermines their teams when they refuse to let engineers into the non-technical decision-making processes. In Joel’s article he talks about developers wanting to be allowed to make decisions within their own realm of expertise, which is certainly a bare minimum. I would encourage you to go further than that. If you are building a product-focused business, where the challenges are less purely technical and more about engaging a customer, your engineers need to feel connection to that business. They need to feel like they understand it, like they can have ideas about it. This is why I’m such an advocate for cross-functional product development teams. Put engineering, product management, marketing, operations together as a group and let them work as a team to solve problems. Don’t just throw work over the wall to engineering and expect them to implement it.

Respect. The undercurrent that I like least in Joel’s piece is the suggestion that engineers need to be coddled and pampered. Giving them “toys” instead of “tools,” keeping them out of the politics. There are plenty of engineers who want to be given hard work to do and left alone, in their private offices, to think and code. But there are increasingly more engineers who want to build businesses, and they want to be treated like adults in the process.

Our engineering teams are not overgrown children. They are not idiot-savants who can produce software but must be given sufficient cookies to do so. They are highly paid professionals. Let’s treat them that way. You should expect your engineers to show up for your business. Respect is not pampering, it is not treating the team like the stars of the show. Rather, respect is challenging the team to show up and grow up. Respect is giving them clear, achievable goals and holding them accountable. My experience has been that most great engineers want to work somewhere that inspires them to achieve. Many of us stop at the idea of “hard technical problem” when we think about inspiring our engineering teams, but challenging them to partner with people who have different perspectives is another way you can help them grow.

Respect that engineers are smart individuals who often have more to add to your business than just their coding talents, and teach them to respect that the other parts of the business have equally valuable skills and perspectives. Engineers don’t need to feel like the company royalty to be inspired to do good work, but they do need the opportunity to be treated like a partner.

These days, you have a lot of competition for talent, but you also have a lot of talent to choose from. Understand your company’s positioning. If you can’t pay top of market, you will have to rely on a balance of finding undeveloped talent and giving engineers other reasons to want to work for your company. For most of us, that means giving them a voice beyond the purely technical, and challenging them to see and understand perspectives outside of engineering.

Friday, August 19, 2016

Microservices: Real Architectural Patterns

A dissection of our favorite folk architecture


I’m fascinated by the lore and mystery behind microservices. As a concept, microservices feels like one of the most interesting folk architectures of the modern era. It’s useful enough to be applied widely across different usage patterns and also vague enough to mean many different things.

I’ve been struggling for a while with understanding what people really mean when they discuss “microservices.” Despite deploying what I would consider to be a version of that pattern in my last gig, it’s quite clear to me that the architecture we used is not the same as the pattern that all other companies use. Recently I finally interrogated someone who has deployed the pattern in a very different way than I have, and so I decided it would be illustrative to compare and contrast the circumstances of our architectures for those in the larger technical audience.

This article is going to have two examples. The first is the rough way “microservices” was deployed in my last gig, and why I made the decisions I made in the architecture. The second is an example of an architecture that is much closer to the “beautiful dream” microservices as I have heard it preached, for architectures that are stream-focused.

Microservices Basics

I think that microservices as an architecture evolved due to a few factors.
  1. A bunch of startups in the late 2000s started on monoliths like Rails, scaled their business and team quickly, and hit the wall on what could reasonably be done in that monolith
  2. The cloud made it significantly easier to get access to a new server instance to run software
  3. We all got much more comfortable with the idea that we were dealing with distributed systems and in particular got comfortable making network calls as part of our systems

This combination of factors (scaling woes, easy access to new hardware, distributed systems and network access) played a huge part in what I might call “microservices for CRUD.” If you have managed to scale a company to a certain level of success on a monolith but you are having trouble scaling the technology and/or the engineering team, breaking the monolith into a services-style architecture makes sense. This is a situation I encountered first-hand.

The arguments for microservices here look something like:
  1. Services allow for independent axes of scaling. If you have a part of the system with higher load or capacity requirements than other parts, you can scale to meet its needs. This is certainly doable in a monolith, but somewhat more complicated to reason about.
  2. Services allow for independent failure domains, to a more limited extent. Insofar as parts of your system are independently operable, you may want to allow for partial availability by splitting them out into services. For example, in a commerce app, if you can serve the checkout flow even when the product search flow is down, that might be considered a good thing. This is much more complicated in practice than it is in theory, and people make many silly claims about microservices that imply that any overlap in services means that they are not valuable. Independent failure domains are sometimes more of a “nice to have” than a necessity, and making the architecture truly account for this is not easy.
  3. Services allow for teams to work independently on parts of the system. Again, you can do this in a monolith. I have done this in a monolith. But the challenge with a monolith (and a related challenge with services in a monorepo, or single source repository) is that humans struggle to tangibly understand domains that are theoretically separate when they are presented as colocated by the source code. If I can see all of the code and it all compiles together and feels like a single thing, my tendency is to want to use it as a single thing. Grab code from here to use there, grab data from there to use here, etc.

A few more notes. “Monolith” and “monorepo” often get tangled up when talking about this world. A monolithic application is one where you have a set of code that compiles into a single main server artifact (possibly with some additional client artifacts produced). You can use configuration to make monoliths do almost anything you can imagine, including all of the services-type things above, but the image produced tends to include most if not all of the code in the repository. This does get fuzzy because sometimes teams evolve their monoliths to compile to a couple of specialized server artifacts via a combination of build tooling and configuration. I would generally still call this a monolithic architecture.

Monorepo, or monolithic repository, is the model where you have a single repository that holds all of the code for any system you are actively changing (so, possibly excluding the source code for your OSS/external dependencies). The repository itself contains source code that accounts for multiple artifacts that are run as separate applications, and which can be compiled/packaged and tested separately without using the entire repository. Often this is used to enable certain shared libraries to change across all of the services that use those libraries, so that developers who support shared libraries can more easily evolve them instead of having to wait for each dependent team to adopt the newest version. The biggest downside of the monorepo model is that there’s not much OSS tooling that supports this, because most OSS is not built this way, so large investments in tooling are usually needed to make this work.

Microservices for CRUD-based Applications

Before I get to how to evolve a CRUD monolith to microservices, let me further articulate the architecture needed to build your traditional mid-sized CRUD platform. This type of platform covers a use case that is pretty well-trod, that of “transactions” and “metadata.”

Transactions: User does an action that you want to persist, consistency of data is very valuable. The “Create, Update, Delete” of CRUD. Much less frequent than the “Read” actions of CRUD. 
Metadata: Information that describes things to the users, but is usually only modified by internal content creators, or rarely by external users (reviews, for example). Changes less frequently, often highly cacheable. Even more, can often tolerate a degree of temporary inconsistency (showing stale data).

Are there more things that CRUD-heavy companies want to do, especially in the analytical space here? Sure. You may want to adjust results frequently based on user behavior as the user is browsing the site, and other personalization actions. However, that is a hard thing to do real-time and you don’t always have the volume of data you need from the user to actually do that well, so it isn’t generally the first-order concern of the system.

The process for moving off of a monolith in this type of architecture is relatively straightforward:
  1. Identify independent entities. This paper by Pat Helland, “Life Beyond Txns”, has some useful and interesting definitions there. It’s better to go a little bit too big early than to go too small and end up having to implement significant distributed transactions. You probably want data-owning services for the major business objects (products, users, etc), and then sets of integration services that implement aggregations and logic over those objects.
  2. Pull out the logic into services entity by entity. Try to change the data model as little as possible in this process. Redirect the monolith to call APIs in the new services as functionality is moved.

That’s basically it. You pull pieces out until you have enough to cover a particular set of user functionality in data and integration terms, then you can start to evolve that part of the user functionality to do new things in the services.
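The "redirect the monolith as functionality is moved" step can be sketched roughly as follows. This is only an illustration under simplifying assumptions: `EntityRouter`, `legacy_get_user`, and `service_get_user` are hypothetical names, and the service call is a local stand-in for what would really be a network call to the new service's API. Note that both paths return the same data shape, in keeping with not changing the data model during the move.

```python
from typing import Callable, Dict, Set

Handler = Callable[[str], dict]


class EntityRouter:
    """Sends monolith lookups to legacy code or to a new service, per entity."""

    def __init__(self) -> None:
        self._legacy: Dict[str, Handler] = {}
        self._service: Dict[str, Handler] = {}
        self._migrated: Set[str] = set()

    def register(self, entity: str, legacy: Handler, service: Handler) -> None:
        self._legacy[entity] = legacy
        self._service[entity] = service

    def migrate(self, entity: str) -> None:
        """Flip one entity over once its logic lives in the new service."""
        self._migrated.add(entity)

    def get(self, entity: str, entity_id: str) -> dict:
        handlers = self._service if entity in self._migrated else self._legacy
        return handlers[entity](entity_id)


# Hypothetical data and handlers, for illustration only.
USERS = {"42": {"id": "42", "name": "Ada"}}


def legacy_get_user(user_id: str) -> dict:
    # In-process code path that still lives in the monolith.
    return {**USERS[user_id], "served_by": "monolith"}


def service_get_user(user_id: str) -> dict:
    # Stand-in for an API call to the extracted, data-owning user service.
    return {**USERS[user_id], "served_by": "user-service"}


router = EntityRouter()
router.register("user", legacy_get_user, service_get_user)
print(router.get("user", "42")["served_by"])  # monolith
router.migrate("user")
print(router.get("user", "42")["served_by"])  # user-service
```

Callers inside the monolith keep calling `router.get` throughout, so each entity can be moved (or moved back) without touching them.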

These services are not classic SOA, but nor are they teeny-tiny microservices. The services that own the data may be fairly sophisticated. You may not want to have too many services because you want to be able to satisfy requests from the user without having to make a ton of network hops, and ideally, without needing to do distributed transactions.

You are probably not making new services every day, and especially if you have a sub-50-person engineering team and a long product roadmap, you may not want to invest extensive engineering time into complex orchestration and tooling that enables people to dynamically add new services at the click of a button (nb: the products to support this are getting better all the time, and so at some point this will be worth doing even for that smaller team. It is unclear to me whether that time is now or not.)

The equation to apply for determining how much to invest in tooling is pretty straightforward: how much time does it cost devs to have a less automated process for adding a new service, vs how long does it take to implement and maintain the automation for doing it easily, and how many new services do you expect to want to deploy over time? You’re making a guess. Obviously, if you think there is value to enabling people to spin up tiny services fast and frequently, it is better to invest time and tooling into this. As with all engineering process optimization decisions, it’s not a matter of getting it perfectly right, but rather, of deciding for the foreseeable future and periodically re-evaluating.
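As a back-of-the-envelope version of that equation, here is a small sketch. The function name and every number in it are invented for illustration, and it assumes that adding a service costs roughly nothing once the automation exists, which is optimistic.

```python
def automation_net_hours(
    manual_hours_per_service: float,
    expected_new_services: int,
    build_hours: float,
    upkeep_hours_per_year: float,
    years: float,
) -> float:
    """Hours saved (positive) or lost (negative) by automating service creation."""
    manual_total = manual_hours_per_service * expected_new_services
    automation_total = build_hours + upkeep_hours_per_year * years
    return manual_total - automation_total


# A small team expecting 6 new services over 2 years: automation loses.
print(automation_net_hours(16, 6, build_hours=200, upkeep_hours_per_year=40, years=2))   # -184
# A stream-heavy team expecting 40 new services over 2 years: automation wins.
print(automation_net_hours(16, 40, build_hours=200, upkeep_hours_per_year=40, years=2))  # 360
```

The inputs are guesses, so the output is a guess too. The point is to write the guess down and re-run it when the number of expected services changes.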

There are many microservices “must-haves” in this instance that I have found to be anything but. I mentioned extensive orchestration above. Dynamic service discovery is also not needed if you are not automatically spinning up services or moving services around frequently (load balancers are pretty nice for doing this at a basic level).

Allowing teams to choose their ideal language, framework, and data store per service is also certainly not a must-have and in fact it’s likely to be far more of a headache than a boon to your team.
Having independent data stores for the services is also not a must-have, although it does mean that you will have a high-risk SPOF on the shared database. As I was writing this piece I discovered a section of some writing on microservices from 2015:
Create a Separate Data Store for Each Microservice
Do not use the same back-end data store across microservices. … Moreover, with a single data store it’s too easy for microservices written by different teams to share database structures, perhaps in the name of reducing duplication of work. You end up with the situation where if one team updates a database structure, other services that also use that structure have to be changed too.
This is true, but for smaller teams you can prevent sharing of database structures by convention (process and code review, and automated testing and checking for such access if it is a huge worry). When you carefully define the data-owner services, it’s less likely this will happen. And the alternative is the next paragraph:
Breaking apart the data can make data management more complicated, because the separate storage systems can more easily get out of sync or become inconsistent, and foreign keys can change unexpectedly. You need to add a tool that performs master data management (MDM) by operating in the background to find and fix inconsistencies. For example, it might examine every database that stores subscriber IDs, to verify that the same IDs exist in all of them (there aren’t missing or extra IDs in any one database). You can write your own tool or buy one. Many commercial relational database management systems (RDBMSs) do these kinds of checks, but they usually impose too many requirements for coupling, and so don’t scale. (original)
This paragraph probably leads to sighs of exhaustion from anyone with experience doing data reconciliation. It’s due to this overhead that I encourage those of you in smaller organizations to at least evaluate a convention-based approach before deciding to use entirely independent and individual data stores. This is a decision you can delay as needed.
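One way the "automated testing and checking for such access" part of a convention-based approach might look is sketched below. Everything here is an assumption for illustration: the `TABLE_OWNERS` map, the service names, and especially the regular expression, which is a deliberately naive way to spot table names and would miss plenty of real SQL. A check like this could run in CI against the queries each service ships.

```python
import re
from typing import List

# Hypothetical ownership map: each table is owned by exactly one service.
TABLE_OWNERS = {
    "users": "user-service",
    "products": "product-service",
    "orders": "order-service",
}

# Naive match for table names following FROM / JOIN / INTO / UPDATE.
_TABLE_REF = re.compile(r"\b(?:from|join|into|update)\s+([a-z_][a-z0-9_]*)", re.IGNORECASE)


def ownership_violations(service: str, queries: List[str]) -> List[str]:
    """List every place `service` touches a table owned by another service."""
    violations = []
    for sql in queries:
        for table in _TABLE_REF.findall(sql):
            owner = TABLE_OWNERS.get(table.lower())
            if owner is not None and owner != service:
                violations.append(f"{service} touches {table} (owned by {owner})")
    return violations


order_queries = [
    "SELECT * FROM orders WHERE id = 1",
    "SELECT o.id FROM orders o JOIN users u ON u.id = o.user_id",
]
print(ownership_violations("order-service", order_queries))
```

The second query is the kind of shortcut a shared database makes tempting; the fix under this convention is to ask the user service for the data instead of joining against its table.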

This version of the microservices architecture is very compelling for the scaled CRUD world because it lets you do a rewrite piece by piece. You can do the whole system, or you can simply take out pieces that are most sensitive to scaling. You proactively engage with many of the bits of distributed systems complexity by thinking carefully about the data and where transactions on that data will be needed. You probably don’t need a ton of fancy data pipelines floating around. You know where the data will be modified.

Do you have to go to microservices to scale this? Probably not, but that doesn’t mean using microservices to scale such systems is a bad idea. However, going extreme with the microservices model may be a bad idea, because you really don’t want to slice your data up in a way that ends up in distributed transaction land.

Microservices For Data Stream Processing

Now, let’s talk about a very different use case. This use case is not your classic CRUD application, thick with business rules around transactionally-updated objects. Instead, this use case has a large pipeline of data. It has small bits of data flowing into it from many different sources, a very large volume of many bits of data. This large volume of input data sources also has many different services that will consume it, modify it, and pass it along for further processing.

The major concern of this application is ingesting large quantities of ever-changing data, processing it in various ways, and showing a view of it to customers. CRUD concerns are secondary to the larger concerns of keeping up with the data stream and recalculating information based on what is happening on that stream.

Let’s take a metrics-aggregating SaaS application, for example. This application has customers all over the world with various applications, services, and machines that are reporting out metrics to the aggregator. These customers only need to see their data, although the combined total of data for any one customer may be very large. Our aggregator needs to consume these metrics and send them off to the application that is going to show them to the customer. The customer-facing application may be operating on a combination of incoming metrics in real-time plus historical data that comes from cache or a backing storage system. A large part of the value of the data is in the moving-window of what is happening right now/recently.

This architecture from the start has considerations of volume that even our scaled CRUD world may not care about for a very, very long time. Additionally, the data itself is mostly a stream of updates over time. The notion of the “stateful” data that is transactionally updated is minimal, the most useful data is more like a timeseries or log of events. The transactional data, say, stored user views and user configuration, may be more like the “metadata” of our CRUD application in the first example, infrequently changed compared to the updates coming in from the stream. The majority of developer time is most likely spent not in dealing with these transactional changes but rather in managing the streams of inputs, providing new types of inputs, applying new calculations to the stream of inputs, and changing the calculations.

In this example, you can imagine a service that wants to run an experiment by doing a different calculation across a particular element on the stream. Instead of modifying the existing code, the experimental service listens to the data stream at the same point as the existing calculation, provides a new calculation value, and pushes that calculation value back into the data pipeline on a different channel. At some point an experiment service pulls this data out for the customers who are assigned to the experimental treatment and shows the results of that calculation instead of the standard calculation. In all of these places you need a record of what happened in order to do analysis of experiment success and debugging, but that record does not need to be strongly, transactionally related to the record of other events in the system at this time, even across related users.

In this example, it may very well be much more effective to spin up new services as needed, in order to run quick experiments, rather than changing existing services. Especially in cases where the service can do this without needing to worry about coordinating the data consumption or production with any existing service. This is the world of what I would like to call “stream-centric microservices.”
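A toy version of that experiment setup is sketched here, using an in-memory publish/subscribe bus in place of a real streaming platform. The channel names, the `Bus` and `CustomerView` classes, and the mean-versus-max calculations are all hypothetical. What it tries to show is that the experimental service is added alongside the standard one without modifying it, and that every event lands in an append-only log for later analysis.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, Iterable, List, Tuple

RAW, STANDARD, EXPERIMENT = "metrics.raw", "calc.standard", "calc.experiment"


class Bus:
    """Minimal in-memory stand-in for a data pipeline with named channels."""

    def __init__(self) -> None:
        self._subs: DefaultDict[str, List[Callable[[dict], None]]] = defaultdict(list)
        self.log: List[Tuple[str, dict]] = []  # append-only record for analysis

    def subscribe(self, channel: str, handler: Callable[[dict], None]) -> None:
        self._subs[channel].append(handler)

    def publish(self, channel: str, event: dict) -> None:
        self.log.append((channel, event))
        for handler in list(self._subs[channel]):
            handler(event)


def start_standard_calc(bus: Bus) -> None:
    def on_raw(event: dict) -> None:
        mean = sum(event["samples"]) / len(event["samples"])
        bus.publish(STANDARD, {"customer": event["customer"], "value": mean})
    bus.subscribe(RAW, on_raw)


def start_experiment_calc(bus: Bus) -> None:
    # Listens at the same point as the standard calc, writes to its own channel.
    def on_raw(event: dict) -> None:
        bus.publish(EXPERIMENT, {"customer": event["customer"], "value": max(event["samples"])})
    bus.subscribe(RAW, on_raw)


class CustomerView:
    """Shows the experimental value only to customers in the treatment group."""

    def __init__(self, bus: Bus, treatment_customers: Iterable[str]) -> None:
        self._treatment = set(treatment_customers)
        self._latest: Dict[Tuple[str, str], float] = {}
        bus.subscribe(STANDARD, lambda e: self._remember(STANDARD, e))
        bus.subscribe(EXPERIMENT, lambda e: self._remember(EXPERIMENT, e))

    def _remember(self, channel: str, event: dict) -> None:
        self._latest[(channel, event["customer"])] = event["value"]

    def value_for(self, customer: str) -> float:
        channel = EXPERIMENT if customer in self._treatment else STANDARD
        return self._latest[(channel, customer)]


bus = Bus()
start_standard_calc(bus)
start_experiment_calc(bus)
view = CustomerView(bus, treatment_customers={"acme"})
bus.publish(RAW, {"customer": "acme", "samples": [1, 2, 6]})
bus.publish(RAW, {"customer": "globex", "samples": [2, 4]})
print(view.value_for("acme"), view.value_for("globex"))
```

Ending the experiment in this sketch means no longer starting `start_experiment_calc`; the standard calculation never knew it existed.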

If there is enormous value to your business to manage real-time data streams, and you are going to have a lot of developers consuming those streams by creating new services to listen to them and produce results, then you absolutely must be willing to commit to the investment in tooling to make the process of creating services and putting them into production as easy as possible. You will probably use this for all of your services over time, once you have it, but realize that the clear value is that you have dynamic data that can be processed and manipulated and experimented on independently.

Cron Jobs as Microservices

I’d be remiss if I didn’t mention this pattern. When it becomes very easy to make anything a microservice, everything becomes a microservice, including things we would traditionally run as cron jobs.

But cron jobs are a nice concept, and not everything has to be a “service.” You can use CloudWatch Events from AWS for this purpose, or scheduled Lambda functions. You could also use Gearman, a queue and async job runner, to schedule cron jobs. Remember that your cron jobs need to be idempotent (running one twice on the same input should leave you with the same outcome as running it once). If you have an easy way to spin up services and it’s easy to create tiny services that are basically cron jobs, no problem, but cron jobs in and of themselves are not a great reason to create a large, orchestrated services environment.
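As a rough illustration of what idempotent means in practice, here is a small Python sketch. The job, the event shape, and the dictionary used as a store are all made up for the example; the only idea it is meant to show is that a job which recomputes and overwrites its result can safely run twice, while one that increments cannot.

```python
def run_daily_rollup(events, store, day):
    """Recompute the total for `day` from raw events and overwrite the stored value."""
    total = sum(e["amount"] for e in events if e["day"] == day)
    store[day] = total  # overwrite rather than increment, so reruns are harmless
    return total


def run_daily_rollup_unsafe(events, store, day):
    """Non-idempotent variant: every rerun adds the day's total again."""
    total = sum(e["amount"] for e in events if e["day"] == day)
    store[day] = store.get(day, 0) + total
    return total


events = [
    {"day": "2016-07-01", "amount": 5},
    {"day": "2016-07-01", "amount": 7},
    {"day": "2016-07-02", "amount": 3},
]

safe_store, unsafe_store = {}, {}
for _ in range(2):  # simulate the scheduler firing the same job twice
    run_daily_rollup(events, safe_store, "2016-07-01")
    run_daily_rollup_unsafe(events, unsafe_store, "2016-07-01")
```

After both runs, `safe_store` still holds the correct total for the day, while `unsafe_store` has double-counted it. Whatever you use to schedule the job, a retry or an overlapping run will happen eventually, so the overwrite style is the one to aim for.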


I hope that this has been a useful breakout across a few axes of the wild world of microservices. Going through the thought experiment was very useful for me, personally. It helped me understand how what seems obvious to people at one extreme, say those who spend most of their time focused on stream processing, doesn’t make as much sense for people who are more focused on the world of CRUD application scaling.

(This was originally published on Medium)

Monday, July 18, 2016

The Virtue of Hubris and The Value of Complaining

In my previous post, I discussed the leadership virtues of Laziness and Impatience. But as you may know, I neglected one of the core virtues in my list, namely, that of hubris. Hubris. Pride. As Larry Wall says,
"Excessive pride, the sort of thing Zeus zaps you for. Also the quality that makes you write (and maintain) programs that other people won't want to say bad things about. Hence, the third great virtue of a programmer."
I would translate this as taking pride in one's work, and being willing to not just take pride in it, but show off that work, talk about it, teach others its magic. And hubris is important. One of the challenges of impatience is that it sometimes drives us to cut corners. Cutting corners can make work go faster, but it can also have a price in the long run. So we balance that desire to cut corners with a desire to maintain pride in our work, and use those conflicting values to keep each other in check.

Hubris done well in my opinion has some interesting expressions. You may think of the person who takes pride in their work as someone who loves to learn and share new things. Who loves to brag about the good stuff. This is certainly part of hubris, sharing lessons learned and trying to help others by showing off our wins. Many tech teams encourage this actively, through rituals like "drinks and demos" where teams get up to share what they accomplished during the week. We encourage people to write up cool stuff we've built, to go speak at conferences and talk about cool technology, and this is all a great thing to do.

However, I think there's more to it than just showing off the good stuff. Within a team, hubris also shows in people who are willing to complain about the bad stuff. Yes, that's right, I think that there is value to expressing not just the positive, but also the negative. In fact, I think that you are actively harming your culture and creating a culture of false pride when you only encourage people to speak up to share good things.

Complaining is all about context. The problems we are facing are our context, and the solutions to those problems must be made with an understanding of that context. Context is what makes microservices right for one team and wrong for another. Context is what makes hiring a certain way successful in a high-growth startup but devastating in a big company. Context is so important that when you misunderstand the role that it plays in a solution, you run the risk of misapplying that solution to a place where it will cause you more problems than it solves. Applying someone else's lessons to your context without understanding is how we end up with these cargo cult solutions.

So, the details of the problem are pretty important for putting the solution we're bragging about into context. But here's the thing. If you squash people who want to complain or criticize, you lose the details of your problems. Those complaints contain the details!

Does your company have a practice of telling people to "bring solutions, not complaints?" That is at best hiding problems, not avoiding them. It is unrealistic to expect people to be able to solve every problem they see in front of them. I mean, can you do that, really? It is hard enough to expect your executives to be able to do this, believe me, I know. Your team is going to see problems that they will not know how to solve, and to tell them to keep that to themselves until they figure out the solution is a great way to avoid dealing with real issues.

Instead, I encourage you to ask people to give you details when they have complaints. Help them put their complaints in context. If they complain a system sucks, ask them why. Maybe the answer is that they don't like the formatting standards, in which case an appropriate response might be, unfortunately not everything goes your way. On the other hand, maybe the answer is that it takes them a long time to make changes because the system has no tests and breaks easily, in which case, perhaps you want to think about actually fixing that problem.

If you do this well, you actually teach people how to understand which problems are important, and which problems are not. Letting people complain might seem like it will do nothing but encourage negativity and drama, but if you guide people to learn from their complaints it can instead help your team grow. It's great when people can bring problems AND solutions to you simultaneously, but it's more likely that they will need help to see the best solution. Helping them see the best solution starts by helping them understand how to state the problem.

We are going to have disagreements and conflict in our teams. None of us sees the world in the same way, and that is good. We form teams because as a group, sharing our perspectives, we can create things that are greater than the sum of their parts. Trying to create conflict-free environments is a fool's errand. But you can guide conflict and complaints to result in an increased understanding of context. Instead of discouraging all disagreement, push people to be specific about their thoughts and concerns, and attempt to understand them. As a leader, ask questions to tease out details, and show that you are actually interested in the perspectives on your team, even when you might disagree.

Taking pride sometimes means speaking up when something doesn't seem to be right, when something seems to be less than what it could be. Criticism can help us become even better than we are, if we are willing to listen to its details. Please don't smother this in the name of harmony or positivity, because repressing conflict only leads to a false sense of security and prevents us from achieving true greatness.

Friday, June 10, 2016

The Virtues of Laziness and Impatience

This is an excerpt from my work in progress, a book on engineering management. If you're interested in getting occasional updates you can subscribe to my newsletter!

I love the idea of Laziness, Impatience, and Hubris as virtues of engineers, articulated in “Programming Perl” by Larry Wall. I believe these virtues carry over into leadership, and learning how to channel these traits into advantages is something I encourage all managers to do.

As a manager, when you are dealing with people 1-1 you probably don’t want to be impatient, of course. Impatience can be rude when it is directed at individuals. And you don’t want to seem lazy: there’s nothing worse than working for a manager who seems to be taking it easy while you kill yourself to deliver projects. But impatience, paired with laziness, is wonderful when directed at processes and decisions. Impatience and laziness, applied to process, are the key elements of focus.

As you grow more into leadership positions, people will look to you for behavioral guidance. What you want to teach them is how to focus. To that end, there are two areas I encourage you to practice showing, right now: figuring out what’s important, and going home.

I can’t stand watching people waste their energy approaching problems with brute force and spending time rather than thought, and yet, any culture where you are encouraged to work excessive hours all the time is almost certainly doing just that. What is the value of automation if you don’t use it to make your job easier? We engineers automate so that we can focus on the fun stuff, and the fun stuff is the stuff that uses the most of your brain, and it’s not usually something you can do for hours and hours, day after day.

So be impatient to figure out the nut of what is important. As a leader, any time you see something being done that feels inefficient, start to ask the question, why does this feel inefficient to me? What is the value in the thing we are doing? Can we deliver that value in a way that is faster? Can we strip down this project into something simpler and get it done more quickly?

The problem with this line of questioning is that often when managers ask, can it be done faster, what they explicitly or implicitly want to know is, can the team work harder or longer hours to deliver it in fewer days. This is why I encourage you to develop and show the value of laziness. Because “faster” is not about “same number of hours but fewer total days.” “Faster” is about “the same value to the company in less total time.” If the team works 60 hours in a week to deliver something that otherwise would’ve taken a week and a half, they haven’t worked faster, they’ve just given the company more of their free time.

This is where going home comes in. Go home! And stop emailing people at all hours of the night and all hours of the weekend! Forcing yourself to disengage is essential for your mental health, believe me. Burnout is a real problem in the American workforce these days, and almost everyone I know who has worked sustained excess hours has experienced it to some degree. It’s terrible for individuals, terrible for their families, and terrible for teams. But this isn’t just about preventing your own burnout, it’s about preventing your team’s burnout. When you work later than everyone else, when you send those emails at all hours, even if you don’t expect your team to respond to those emails or work those hours, they see you doing it, and think it’s important. And that overwork makes them less effective, especially at the detailed knowledge work that engineers need to perform.

When you are a newish manager, and you haven’t figured out the tricks to do your job effectively, you might find yourself needing to work more hours to get it all done. That is ok, for a little while. But I encourage you to figure out a way to work those hours without encouraging your team to do so, or making them feel obligated to be on your schedule. Queue up the weekend and overnight emails for the next work day. Put your chat status as “away” in off hours. Take vacation and don’t answer email during that time. And constantly ask yourself the same questions you ask your team: can I do this faster? Do I need to be doing this at all? What is the value that I am providing with this work?

Laziness and impatience. We focus so we can go home, and we encourage going home because it forces us to constantly focus. This is how great teams scale.

Thursday, May 5, 2016

Thoughts on Take Home Interviews

There is a movement now in tech to really think about what it would take to improve our interview process. This is a movement a long time coming. Whiteboard coding interviews are clearly a strange way to measure a person's ability to actually do the day to day work of a modern software engineer. And we know that we tend to have a lot of bias in our interview processes that takes what we wish were an objective evaluation of skills and turns it into something very, very subjective.

Recently, my friend Julia Grace wrote about the interview process at Slack, to grant more transparency into what it takes to become an engineer at one of the hottest companies around. While this is a recruiting tactic, it's also great that they are helping people understand what to expect and how to apply. Slack is taking pains to try to avoid bias, by having people complete a take-home technical exercise:
"This varies by position, but generally you’ll have a week to complete a technical exercise and submit the code and working solution back to us.
"Since we don’t do any whiteboard coding during the onsite interview, the technical exercise is one of the best ways we’ve found to evaluate programming competency.
"The exercise is graded against a rigorous set of over 30 predetermined criteria. We’re looking for code that is clean, readable, performant, and maintainable. We put significant effort into developing these criteria to ensure that, regardless of who grades the exercise, the score is an accurate reflection of the quality of the work. We do this to limit bias. If there are clear criteria, variations that might impact score but have nothing to do with the candidate (such as if the grader is having a good day) are less likely to influence the outcome."
On Twitter, a discussion ensued about whether asking people to spend time at home doing exercises itself causes bias against those who do not have a lot of spare time for take-home exercises. Julia mentioned that they expect it to take 2-4 hours, but admitted that some people got really into the project and spent far longer than that.

This brings up three good questions that I want to address:

1) Are take-home exercises, on balance, good?
2) Is it reasonable to expect people to spend 2-4 hours of their own time on a take-home exercise?
3) What about the people who will spend more time?

On the issue of 1, I think that yes, take-home exercises can help a lot to address the bias that happens when you know the person who wrote the code. They have the potential to be the blind audition step that the tech industry needs.

On the issue of 2, I actually ALSO think it is ok to ask candidates to spend a few hours on these exercises. This comes with two caveats. First, you should genuinely believe that the exercise can be done in the number of hours you expect by a candidate qualified for that level (so, measure how long candidates report spending on it). Second, these exercises should only take "a few hours," not "tens of hours."

I feel ok, within bounds, saying that you should find a few hours to do a coding assignment, especially if it reduces the time you would need to spend onsite. The benefits of getting rid of whiteboard coding and that particular bit of evaluation bias outweigh the possible inconvenience. If you're looking for a job, you have to budget time to interview. So take half a day off if you can to do this project. Just because the exercise says take home doesn't mean you actually have to do it in your off hours. If you're unable to take time off of work to interview at all, unfortunately, you're going to have a problem getting a new job anywhere.

Which brings us to issue 3, what about the people who will spend more time? This is the most interesting part.

Julia said that candidates get really into solving the problem, and spend more time on it because they're excited about what they're building. That is awesome, but it also makes my blood run cold. At this point you're talking about something that is both a test of your programming abilities and a creative project. Let's contrast that to the description that Foursquare gives of their take-home portion:

"Instead we give out a take-home exercise that takes about three hours. The exercise consists of three questions:
1. A single-function coding question
2. A slightly more complicated coding question that involves creating a data structure
3. A short design doc (less than a page) on how to implement a specific service and its endpoints.
Every question we use is based on a real problem we’ve had to solve and has a preamble explaining the reason we need to solve this problem. If there is an obvious solution with a poor running time we mention it since we can’t help course-correct when the work isn’t being done live. We also provide scaffolding for the coding questions to save the candidate time."
This appears to be designed as an exercise that will only require a lot more time if you are struggling with the solution. Sure, you can go nuts creating a crazy creative data structure or design doc, but this is a pretty clinical test. There are few places to add bells and whistles, and they're unlikely to get you any brownie points with the interviewers.

These are two processes with the same goal, to reduce bias in our interviewing process, but slightly different tactics in the take-home. I would guess that Slack's take-home is fun and appealing to a set of people, those who enjoy tinkering or creating cool new projects. And it will find those people and make them shine, and probably serve as a good bit of recruiting to make them even more excited about the possibility of working there.

But I can tell you that for someone like me, I would hate being given the "creative" take-home coding problem. I'm happy to write some code to show you that I can. But I don't like to tinker, and I prefer that my creative work be collaborative. It feels like you are wasting my time and instead of making me more excited, it makes me far less excited.

The creative take-home also seems likely to select for those with free time, because if it is really an exercise that some people want to overdo, they will overdo it and you will have a hard time not rewarding that enthusiasm (why shouldn't you!). And while it's ok to ask for a few hours, building something that rewards those who can spend far longer is likely to bias against those who have, say, kids to take care of after work and on weekends, or other activities that limit their free time.

On the other hand, Foursquare's test is the first take-home I ever read about where I thought, "yes, that I would do." It is respectful of my time, and gives me something constrained to complete. We can elaborate on creative topics in the in-person interview; I do much of my best creative thinking with other people, elaborating on ideas together. Of course, I am a strong in-person communicator, and having a process that pushes the creativity into the in-person interviews selects for people like me.

In the end, you probably want to hire a spectrum of people. When thinking about how to design your take-home tests, make sure that you are being considerate of the candidate's time, and decide if you really need this test to be a test of both technical skills and creativity, or merely a screening for technical skills. It may be that different roles need different screenings, and you may even want to offer both! That's a lot to ask for, but no one ever said that fixing hiring was going to be easy.