Introducing the Mixmax Engineering Handbook

December 3, 2020

For our third blog post of Mixmax Advent 2020, we’ll be sharing a chapter from our internal Mixmax Engineering Handbook, our central hub for all engineering documentation. It provides an overview of our engineering practices, system architecture, tooling, infrastructure, and security practices. Every engineer at Mixmax worked hard to create the handbook and everyone takes pride in maintaining it. You could say that this blog post is many years in the making 😀.

The Handbook is organized into 5 chapters:

  • Chapter 1: Day-To-Day Development
  • Chapter 2: Architecture
  • Chapter 3: Delivery
  • Chapter 4: Infrastructure
  • Chapter 5: Security

Today, we’ll be sharing the first chapter with you.


Chapter 1: Day-To-Day Development

This section of the handbook describes how Mixmax engineers write code on a day-to-day basis, using Git and related services to continuously deliver work to users. Every new engineer should read this sometime in their first week, and return frequently thereafter. It aims to answer the following questions, among others:

  • How should I approach the practice of writing code at Mixmax?
  • How do I submit changes to Mixmax’s codebase?
  • Who will review my work, and how?
  • When should my code go out to users?

It focuses on the philosophy and mechanics of development, not the planning and communication that occur before and during coding.

1.1 Coding Principles & Philosophy

This section does not attempt to enumerate every standard good coding practice, but rather the ones specific to our software and our overarching goals as a team and a company. In general: be a good engineering citizen and steward of the code. Functioning as an engineering team means having trust and respect for each other, and taking responsibility for producing the best software that we’re able to.

  • Mixmax has to feel rock solid. We’re taking responsibility for our customers’ outbound communications, which is how they present themselves to the professional world. If our software seems flaky, they won’t trust it.
  • Mixmax has to feel snappy. We sell Mixmax as a solution to improve productivity and efficiency. Interfaces that lag or force a user to wait can make them feel like Mixmax is slowing them down.
  • Mixmax has to feel intuitive. Users already feel comfortable with Gmail and their CRM, so when Mixmax is integrated with those systems, it can’t be a jarring change. It should feel like a natural extension that augments the user’s existing expectations and workflows.
  • Mixmax code has to feel cohesive. An engineer who starts to work in a part of the code they haven’t worked with before shouldn’t need a long ramp-up period to understand what’s going on. Variables and functions should be self-evident, with names that clearly reflect their purpose, and with documentation and comments as needed to clarify behavior and establish context. It should be obvious how to trace a chain of function calls or how props are passed. There should be very little cleverness or magic.
  • Mixmax must be secure. Our users’ data is their data, and it’s our job to protect it. When developing, make sure to follow best practices (see the OWASP Secure Coding Checklist) and to review code with an eye to ensuring application security.

1.2: Continuous Delivery

At Mixmax, we practice continuous delivery. This means that we build and deploy our code in small, frequently shipped batches. We try to get our work into users’ hands as soon as possible after fixing a bug or adding a feature, even if the change is small.

This approach gets features and fixes out to customers as soon as possible, makes subsequent deploys smaller (and thus less risky), and helps reveal problems that did not surface during testing (e.g. scale issues). The philosophy and benefits of continuous delivery go way beyond what’s listed here!

In practice, “continuous delivery” means that, in your day-to-day development, you will:

  1. Get a Jira issue to work on
  2. Write code to address that issue in your favorite text editor
  3. Use Git to submit your changes to one of our services
  4. Use GitHub to get code review on your work
  5. Test the code for your issue
  6. Ship your code to production!

Making changes and submitting pull requests

We use Git to track changes to our source, and host our code on GitHub. At Mixmax, we prefer the following Git practices:

  • Work on a new branch off of master, named <engineer>/<ticket>, e.g. brad/SUP-1337
  • Use git diff to check our own code for lint errors or dead code before committing
  • Write semantic commit messages that explain why you made changes
  • Use interactive rebasing to clean up our work’s commit history into atomic commits
  • Pull request branches onto master and get PR review from relevant engineers / mentors
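As an illustration of the branch naming convention above, here is a minimal TypeScript sketch that checks whether a branch name has the <engineer>/<ticket> shape. The function name and the exact pattern (lowercase engineer name, uppercase Jira project key, numeric ID) are assumptions inferred from the brad/SUP-1337 example, not an actual Mixmax tool.

```typescript
// Hypothetical helper: checks the <engineer>/<ticket> branch convention.
// The pattern is an assumption based on the example "brad/SUP-1337".
const BRANCH_PATTERN = /^[a-z][a-z0-9-]*\/[A-Z][A-Z0-9]*-\d+$/;

function isConventionalBranchName(branch: string): boolean {
  return BRANCH_PATTERN.test(branch);
}

console.log(isConventionalBranchName("brad/SUP-1337")); // true
console.log(isConventionalBranchName("fix-stuff"));     // false
```

A check like this could run in a pre-push hook or in CI, though whether that is worth enforcing is a team decision.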

A few other Mixmax-specific Git tips and tricks:

Adding Test Cases and Test Plans

It’s a good idea to add test cases via jest as well as to document a test plan in your PR description. The test plan acts as a sanity check when smoke testing your own changes and ensures that other engineers can easily test your changes.

Avoiding large PRs

Generally, extremely large PRs (>1000 lines of diff, dozens of files) are not desirable. Reviewing them is difficult, both because it can be hard to follow the flow of logic in GitHub’s “files changed” interface and because it takes a remarkable amount of energy and focus to be thorough in reviewing a huge PR. Responding to 50+ comment reviews can also be a real slog!

The way to create smaller PRs is to organize your changes into smaller independent groups. For instance, if you’re working on a full-stack feature, consider shipping the backend API before the frontend, since the backend won’t have any dependencies. Nothing will depend on the backend at the time you deploy it either, so it’ll be low risk to put into production, and it’ll be easy to extend if your continued work on the frontend motivates changes.

(“But how can I make sure the backend works without the frontend?” you may say. One way is to develop the backend alongside tests for the backend. 😉 Another way is to actually develop both backend and frontend simultaneously and use Git Magic™ like interactive staging or rebasing to make it look like you had done the backend first.)

If you can eliminate such inter-project dependencies, then you can actually ship these smaller PRs! This doesn’t mean that the work has to be ready for users, just that it makes sense to review on its own, and is low-risk to put into production at an early stage / by itself. If you don’t want users to begin using the work-in-progress feature, you can hide it from them using feature flags.
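To make the feature flag idea concrete, here is a minimal TypeScript sketch of a per-user flag check. The flag store, the flag name, and the user IDs are all hypothetical; this is not how Mixmax's flag system is implemented, only one simple way such a check could look.

```typescript
// Hypothetical flag store: flag name -> set of user IDs who can see the feature.
// A real system would likely load this from a database or a flag service.
type FlagStore = Map<string, Set<string>>;

function isFeatureEnabled(flags: FlagStore, flag: string, userId: string): boolean {
  // Unknown flags default to "off" so unfinished work stays hidden.
  return flags.get(flag)?.has(userId) ?? false;
}

const flags: FlagStore = new Map([
  ["new-scheduler-ui", new Set(["internal-tester-1"])],
]);

// The code can ship to production while only allow-listed users see the feature.
console.log(isFeatureEnabled(flags, "new-scheduler-ui", "internal-tester-1")); // true
console.log(isFeatureEnabled(flags, "new-scheduler-ui", "customer-42"));       // false
```

Defaulting unknown flags to off means a typo in a flag name hides the feature rather than exposing it.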

If even the smaller PRs aren’t safe to merge to master, you can and should get review early using a WIP (work in progress) PR. And if that PR starts to turn into a big PR, you can make amendments easier to review by making them in their own PRs—branching off your initial PR branch—rather than pushing them straight to the initial branch.

WIP PRs

Usually you’ll open a pull request when your code is fully-functional and tested. But you might open a PR earlier if

  • you think your final PR will be huge, and it would be easier to review it incrementally
  • you’re using a technology or framework that’s new to you, and you want tips from other engineers before you commit to a design pattern
  • you want a designer to begin styling your work while you continue coding

In this case, open a Draft PR, optionally put the “work-in-progress” label on the PR, and either tag specific reviewers on GitHub, or use Slack to let the rest of the team know how they can best help you out (by reviewing, designing, etc.).

Reviewers can find draft PRs helpful too. Also feel free to discuss your draft PRs live!

Once you have completed functionality, mark the PR as ready to merge, remove the “work-in-progress” label, and comment saying “ready for final review” in addition to asking for a review via Slack. (The comment lets observers other than reviewers know that they might take a look.)

Pull Reminders

Many folks choose to use a tool called Pull Reminders to be actively reminded of PR review requests, and stale PRs. This gives a consolidated view of your (and others’) PRs across all the Mixmax repos, and proactive notifications are often nice nudges to review others’ work or pick up on a review you’ve been requested on.

Once you’re part of the mixmaxhq organization, you can be added to start receiving updates as well.

Reviewing and merging PRs

Before asking for review, please practice self-review as described in our internal guide to PR review.

For your first few PRs, ask for a review from:

  • your project mentor (if applicable, otherwise one of the suggested reviewers in the upper right of the PR)
  • your product team using the automatic pull assigners in GitHub
  • your manager (if different than the above).

When you’ve addressed a round of feedback, comment with “updated” on the PR to trigger a notification to the reviewer. (The reviewer may receive a notification when you push to the PR, depending on their GitHub settings, but we recommend using comments because people may push to a WIP PR without it yet being ready to review.)

If the PR is still blocked on design, apply the “blocked-on-design” label.

After the PR is approved, you (the submitter) will merge, since you’ll know if there are any last things to do before merging (like testing, updating DB indexes, etc.).

You can stay on top of reviews happening at Mixmax in the #github-notifications Slack channel or by enabling Notifications in your personal GitHub Settings.

Testing

Code that we deploy must be tested. At Mixmax, we practice a philosophy of “low-cost testing”.

At the minimum, low-cost testing means that you the submitter have thoroughly exercised your code—even before opening your PR. It can also mean writing automated tests and/or testing with other people, depending on how easy a feature is to test and how risky the changes are.

Deploying

Wait, I deploy immediately? Really?

Yes. For most changes, once you’ve tested, you’re good to go. Caveats:

  • When you are deploying major new features / bug fixes…

    • For new features, the PM needs to sign off to make sure we're not introducing any half-baked features.
    • Product and Marketing may wish to coordinate larger feature releases with marketing efforts. Talk with them about that.
  • When your deploy is risky, we hold this deploy until the end of the day EST (3pm PST) so we get some volume, but less than peak.

  • We should not deploy anything except extremely low-risk features after CS gets off (7pm PST) or on Fridays. It’s better to deploy when there is enough usage volume to surface a regression quickly than to leave a regression live for most of people’s working day (Pacific midnight - 10am) or the weekend, when CS will not be available to help handle outages.

The determination of risk and the decision to deploy involves input from product and CS but ultimately comes down to engineers. If an engineer doesn’t believe the PR to be risky then they can deploy it at will. The bar for confidence just goes up depending on the product area and time of day, as described above.
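The timing caveats above can be sketched as a small decision function. This is only an illustration in TypeScript: the three risk levels, the function name, and the return values are assumptions, and the final call still rests with the engineer. Weekends are not encoded because the text only names Fridays and the 7pm cutoff explicitly.

```typescript
// Hypothetical risk levels; "minimal" stands for "extremely low-risk" changes.
type Risk = "minimal" | "normal" | "risky";
type Advice = "deploy" | "hold-until-3pm-pacific" | "hold-for-next-working-day";

const FRIDAY = 5; // same numbering as Date.prototype.getDay(): 0 = Sunday

// hourPacific is the current hour (0-23) in Pacific time.
function deployAdvice(risk: Risk, dayOfWeek: number, hourPacific: number): Advice {
  // After CS gets off (7pm Pacific) or on Fridays, only extremely low-risk changes go out.
  const outsideSupportedWindow = hourPacific >= 19 || dayOfWeek === FRIDAY;
  if (outsideSupportedWindow && risk !== "minimal") return "hold-for-next-working-day";
  // Risky deploys wait until 3pm Pacific, when there is some volume but less than peak.
  if (risk === "risky" && hourPacific < 15) return "hold-until-3pm-pacific";
  return "deploy";
}

console.log(deployAdvice("normal", 2, 10)); // deploy
console.log(deployAdvice("risky", 2, 10));  // hold-until-3pm-pacific
console.log(deployAdvice("normal", 5, 10)); // hold-for-next-working-day
```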

Will users actually get this code immediately?

Server-side code is always immediately updated. For client-side code, our application code will attempt to refresh the browser, but might not be able to if it knows that a refresh will disrupt the user. As such, you will need to plan for lag accordingly, i.e. make API changes compatible with older clients.
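As one illustration of keeping an API change compatible with older clients, suppose a request field is renamed. The sketch below, in TypeScript, accepts both the old and the new field so that clients that have not refreshed yet keep working. The field names and the function are hypothetical and are not taken from Mixmax's API.

```typescript
// Hypothetical request shape: older clients send `to`, newer clients send `recipients`.
interface SendRequest {
  to?: string[];         // legacy field, still sent by clients that have not refreshed
  recipients?: string[]; // new field
}

function normalizeRecipients(body: SendRequest): string[] {
  // Prefer the new field, fall back to the legacy one, and accept either shape.
  return body.recipients ?? body.to ?? [];
}

console.log(normalizeRecipients({ to: ["a@example.com"] }));         // [ 'a@example.com' ]
console.log(normalizeRecipients({ recipients: ["b@example.com"] })); // [ 'b@example.com' ]
```

Once no older clients remain in the wild, the legacy branch can be removed in a follow-up change.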

Deploying to staging

After you’ve passed code review and your code is thoroughly tested locally, it’s time to deploy to staging. When you submit a change to a service (by merging a PR into master), our CI service will then automatically deploy your code to the service’s staging environment. 

If you submitted a change to a module, our semantic-release bot will publish a new version and you'll need to submit PR(s) to the affected microservices to pick up the new version of the library.
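semantic-release picks the next version number from commit messages. The TypeScript sketch below is a simplified illustration of that mapping, assuming the Angular-style commit convention that semantic-release analyzes by default. This post does not say which convention Mixmax configures, and the function name is hypothetical.

```typescript
type Release = "major" | "minor" | "patch" | null;

// Simplified illustration only: the real analyzer handles more cases than this.
function releaseTypeFor(commitMessage: string): Release {
  // A "BREAKING CHANGE: " note in the commit body or footer forces a major release.
  if (/^BREAKING CHANGE: /m.test(commitMessage)) return "major";
  const header = commitMessage.split("\n")[0] ?? "";
  if (/^feat(\(.+\))?: /.test(header)) return "minor";
  if (/^(fix|perf)(\(.+\))?: /.test(header)) return "patch";
  return null; // e.g. docs, chore, test: no release under the assumed defaults
}

console.log(releaseTypeFor("feat(composer): add templates")); // minor
console.log(releaseTypeFor("fix: handle empty recipients"));  // patch
console.log(releaseTypeFor("docs: update readme"));           // null
```

This is also why semantic commit messages matter for modules: the commit type, not a manual version bump, determines what gets published.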

Integration testing

Before shipping to production, do whatever extra testing you deem necessary on staging. This should be pretty minimal assuming that you tested locally.

Automated Integration Testing

Our services also make use of automated integration testing via integration-testing-for-robots (or ITFR). Jenkins runs these jest-powered puppeteer tests against staging to test a few common workflows, including loading the app signup page, sending an email from both the app and Gmail, and managing users in a workspace.

When a deploy PR is opened, ITFR will run and report its results back to GitHub. If it fails, you'll need to open the run in Jenkins and inspect the output to determine which test failed and why. Since it's running an actual browser and connecting to actual servers, real-world issues sometimes cause transient failures (e.g. timeouts when trying to interact with Gmail), and simply re-running the tests will often resolve those.

Once both the manual and the automated integration tests pass, you’re ready to deploy!

Deploying to Production

Once the staging deploy has finished, open a PR titled “deploy” from https://github.com/mixmaxhq/PROJECT/compare/master-production...master.
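For illustration, the compare URL above can be built from a repository name. This is a small TypeScript sketch, and the function name is hypothetical. It simply substitutes the repository name for PROJECT in the URL pattern shown above.

```typescript
// Builds the compare URL used to open a "deploy" PR for a given repository.
// `project` is the repository name under the mixmaxhq organization.
function deployCompareUrl(project: string): string {
  return `https://github.com/mixmaxhq/${encodeURIComponent(project)}/compare/master-production...master`;
}

console.log(deployCompareUrl("PROJECT"));
// https://github.com/mixmaxhq/PROJECT/compare/master-production...master
```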

Your bias should be to deploy immediately after testing, modulo the caveats above. If you’re not ready to deploy for some reason, here are ways to signal that:

  • Prefix the PR title with “[do not merge]” and put relevant TODOs in the PR’s description so that other engineers don’t deploy it either.
  • If another service’s deploy needs to go first, prefix your PR title with “[ after ]”. If two deploys need to go together, but the order doesn’t necessarily matter, prefix each PR title with “[ with ]”.

If someone else's code is waiting to deploy, give it a quick review. But you can assume it's good to deploy unless that person has marked the PR as "do not merge". You can always ask that person too. Engineers who have left deploy PRs open for an extended period of time, “blocking” staging, may be gently chivvied. 😛

Your code will be deployed to production when you merge the deploy PR! If your code requires creating new indexes or migrating user data or settings, make sure to do these things before deploying.

After deploying

You can tell when the deploy has finished by watching the build process in Jenkins.

Then, keep an eye on #sentry in Slack for a bit, to see if any bugs pop up related to what you deployed.

After deploying a bug fix, mark any relevant Sentry bugs as resolved. For client-side bugs, you can mark the Sentry bug as “resolved in next release” as soon as you know how to fix the bug—even before you deploy!

Also close any associated Jira issues. The notifications will let Customer Success know to reach out to users to let them know that their issue was fixed!

If you just deployed a high-value bug fix or feature, also post in the relevant Slack channel so the team can celebrate. :)
