Run locally with prod data

Problem

Engineers complain that it is hard to develop new features because the dev environment is not aligned with production.

Context

Engineers run the app on their laptops to get quick feedback on features under development.
The app integrates with the dev environment APIs, which can be internal or external to the company.

Data returned by the dev environment APIs are scarce and inconsistent.
The APIs also have poor availability and performance in the dev environment.

Solution

Engineers run the app locally and point it at production APIs.
Production data are abundant and consistent, and the production APIs are highly available.
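
As a rough illustration, a local run could read its API target from environment variables and default to production. This is only a sketch: the variable names `API_BASE_URL` and `LOCAL_API_KEY`, the `ApiSettings` type and the URL are all hypothetical.

```python
from dataclasses import dataclass
from typing import Mapping

# Hypothetical production URL, used only for illustration.
PROD_API_URL = "https://api.example.com"


@dataclass(frozen=True)
class ApiSettings:
    base_url: str
    api_key: str


def load_local_settings(env: Mapping[str, str]) -> ApiSettings:
    """Build the settings for a local run that points at production APIs."""
    # LOCAL_API_KEY is an assumed variable name for the key used on laptops.
    api_key = env.get("LOCAL_API_KEY", "")
    if not api_key:
        raise RuntimeError("LOCAL_API_KEY is not set")
    # Default to production; API_BASE_URL (also assumed) can override it.
    base_url = env.get("API_BASE_URL", PROD_API_URL)
    return ApiSettings(base_url=base_url, api_key=api_key)
```

In an app this might be called as `load_local_settings(os.environ)`; failing fast on a missing key avoids silently sending unauthenticated requests.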

Security

API keys used locally must be different from the ones used by production deployments.
The API keys used locally are scoped read-only whenever possible.
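
One way an API could enforce such a scope is to check the HTTP method against the key's permissions. The sketch below assumes that read APIs are exposed through safe HTTP methods; the `ApiKey` type and `is_allowed` function are hypothetical names, not part of any existing system.

```python
from dataclasses import dataclass

# Assumption: read APIs use only these safe HTTP methods.
READ_METHODS = frozenset({"GET", "HEAD", "OPTIONS"})


@dataclass(frozen=True)
class ApiKey:
    key_id: str
    read_only: bool


def is_allowed(key: ApiKey, http_method: str) -> bool:
    """Return True if the key may perform a request with this HTTP method."""
    if key.read_only:
        return http_method.upper() in READ_METHODS
    return True
```

For example, `is_allowed(ApiKey("local-dev", read_only=True), "POST")` evaluates to `False`, so a write attempted with a read-only local key would be rejected.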

Test accounts

A read-only scope is the obvious choice for read APIs, but write APIs are trickier because we do not want to pollute production data. The answer here is test user accounts: data generated by test user accounts cannot be seen by real users.
A common implementation is a "test" boolean column in a database table.
APIs that perform side effects (e.g. payment gateway, order fulfillment) return success to requests from test user accounts without carrying out the side effect.
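
The two rules above can be sketched as follows. Everything here is hypothetical: the `User`, `Order` and `PaymentGateway` types are illustrative, and the `charged` list stands in for a real charge.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class User:
    user_id: int
    is_test: bool  # mirrors the "test" boolean column


@dataclass(frozen=True)
class Order:
    order_id: int
    owner: User


def orders_visible_to_real_users(orders: list[Order]) -> list[Order]:
    """Hide data generated by test accounts from real users."""
    return [order for order in orders if not order.owner.is_test]


@dataclass
class PaymentGateway:
    # Stand-in for the real side effect: records (user_id, amount_cents).
    charged: list[tuple[int, int]] = field(default_factory=list)

    def charge(self, user: User, amount_cents: int) -> bool:
        """Report success to every caller, but only charge real users."""
        if user.is_test:
            return True  # skip the side effect for test accounts
        self.charged.append((user.user_id, amount_cents))
        return True
```

Returning the same success value on both paths lets the calling code behave identically for test and real accounts.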

Availability

The APIs enforce quotas per API key to prevent local bugs (e.g. an infinite loop) from affecting production performance.
Although a single machine is unlikely to generate enough traffic to affect production systems, API quotas are a good practice that should apply to the API keys used for local testing too.
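
A per-key quota can be as simple as a fixed-window counter. The sketch below is one possible approach, not necessarily how any particular API gateway implements it; the class name and parameters are hypothetical, and the counters live in memory of a single process.

```python
import time
from typing import Callable


class FixedWindowQuota:
    """Allow at most `limit` requests per API key in each time window."""

    def __init__(
        self,
        limit: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        # api_key -> (window index, requests counted in that window)
        self._counters: dict[str, tuple[int, int]] = {}

    def allow(self, api_key: str) -> bool:
        window = int(self.clock() // self.window_seconds)
        seen_window, count = self._counters.get(api_key, (window, 0))
        if seen_window != window:
            count = 0  # a new window started, reset the counter
        if count >= self.limit:
            return False  # quota exhausted for this key in this window
        self._counters[api_key] = (window, count + 1)
        return True
```

A runaway loop on a laptop would then start receiving rejections once its own key hits the limit, while other keys keep their separate budgets.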

Why keep the dev environment

Even if we do not use the dev environment for testing, it is still a valuable safety net for infrastructure changes, for example database migrations.

Notes

Alternative approach: local data setup

An alternative to using the dev environment APIs is to rely on data fixtures defined locally.
This is easy to set up, especially if well-defined test fixtures are already in place.
When the app runs locally, it fakes all API integrations and reads from the local data fixtures instead.
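
A minimal sketch of such a fake, assuming the fixtures are stored as one JSON file per resource (the `FixtureApiClient` name and the file layout are hypothetical):

```python
import json
from pathlib import Path
from typing import Any


class FixtureApiClient:
    """Fake API client that serves responses from local JSON fixture files."""

    def __init__(self, fixtures_dir: Path) -> None:
        self.fixtures_dir = fixtures_dir

    def get(self, resource: str) -> Any:
        # For example, get("orders") reads <fixtures_dir>/orders.json.
        path = self.fixtures_dir / f"{resource}.json"
        with path.open(encoding="utf-8") as fixture_file:
            return json.load(fixture_file)
```

The app would use this client in place of the real one when running locally, so no network call leaves the laptop.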

Compared to testing against the dev environment, this local approach gives engineers more autonomy.

However, compared to testing against prod, there are two shortcomings:

  1. Data are not as rich.
  2. The local data are usually not maintained, and they slowly rot until they become unusable.