What is code review?

March 18, 2024

#codereview #productquality #softwaredevelopment

Author

Valletta Software Editorial

Editorial team at Valletta Software, a Malta-based software development partner. We publish hands-on guides on AI development, SaaS architecture, staff augmentation, and OpenClaw self-hosted AI agents. All content is reviewed by senior engineers before publishing.

Code review is the practice of having one or more engineers other than the author read a proposed change before it lands in the main branch. That is the surface answer to "what is code review." The interesting part is what code review is actually for, why teams that do it well outperform teams that do not, and the specific patterns that make a review process useful rather than a tax.

Key takeaways

Code review is not bug-hunting first. It is a knowledge transfer and shared ownership mechanism. Bug catching is a useful side effect.
The most effective format is a pull request under 400 lines of diff, reviewed within 24 hours by two engineers, one of whom will go on to maintain the code.
Code review without a clear standard is theatre. Teams need shared norms for what reviewers actually look for.
Asynchronous review beats synchronous in distributed teams. Pair programming complements it; it does not replace it.
Automated checks should land before a human review starts. Reviewers should not spend cycles on style or basic test coverage.

A working definition of code review

Code review is a structured conversation between an author and one or more reviewers about a proposed change to the codebase. It typically happens through a pull request (PR) or merge request (MR), where the change is presented as a diff, and the reviewers leave comments, request changes, or approve the change for merge. The conversation can run synchronously (over a call) or asynchronously (in PR comments), and the standards for what to look for vary by team.

The reason it exists is broader than catching bugs. Code review distributes knowledge of the codebase across the team, enforces shared coding norms without anyone having to police style manually, and creates a documented audit trail of design decisions. Done well, it raises team output. Done badly, it adds days to lead time and demoralizes engineers without improving quality.

What code review is actually for

A well-run review process produces five distinct outcomes, listed here in rough order of importance:

Shared ownership. After review, the reviewer also understands the code well enough to debug it later. This reduces bus factor risk and makes on-call rotations sustainable.
Knowledge spread. Junior engineers see how senior engineers think about edge cases. Senior engineers see what newer engineers find confusing, which surfaces documentation gaps.
Design feedback. Most architectural drift in a codebase happens one small PR at a time. Review is the cheapest place to catch a design smell before it propagates.
Bug catching. Real, but the smallest of the five outcomes in most measured studies. Automated tests catch more bugs more cheaply.
Style and norm enforcement. Should be automated where possible (linters, formatters) and only surface in review when a true judgement call is needed.

The formats that work

Three review formats cover almost every team. They are not mutually exclusive, but each has a different cost-benefit profile.

Asynchronous pull request review is the default for distributed and partially remote teams. The author writes a PR with a clear description, requests review from two engineers, and the reviewers leave comments at their own cadence. It works well for changes that are not blocking other work, and it produces a permanent record. The main risk is reviewer latency, which inflates lead time.

Synchronous review means the author and reviewer sit together (or screen-share) and walk through the change. It is faster for resolving discussion, useful for tricky design questions, and effective for onboarding. The downside is that nothing gets written down, so reasoning vanishes after the call.

Pair programming is review in real time during authorship. Two engineers write the code together. This effectively pre-empts the review step, but it doubles the cost of the work for routine changes and is best reserved for high-novelty or high-risk areas.

What to actually look for in a review

Reviewers waste time looking at the wrong things. A useful checklist, in the order to apply it:

The numbers that matter

Published research on peer review tends to converge on similar thresholds for effective review.

PR size. Reviewer effectiveness drops sharply above 400 lines of diff. Above 1,000 lines, reviewers find roughly half as many issues per line as in smaller PRs.
Reviewer count. Two reviewers catch most issues that one would miss. A third reviewer adds little defect detection but is useful for knowledge spread.
Time to first review. A median under 8 working hours keeps lead time healthy. Beyond 24 hours, engineers context-switch away from the change.
Reviewer time per change. Effective reviewers spend roughly 5 to 10 minutes per 100 lines of diff. Faster reviews catch fewer issues; slower reviews show diminishing returns.
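As a rough illustration, the size, reviewer-count, and latency thresholds above can be turned into a small check that a team runs against its own PR data. This is a minimal sketch, not a definitive implementation: the `PullRequest` record and its field names are hypothetical, and the constants are simply the figures quoted in this section, which each team should tune.

```python
from dataclasses import dataclass
from statistics import median

# Defaults copied from the thresholds listed above; treat them as tunable.
MAX_DIFF_LINES = 400
TARGET_REVIEWERS = 2
MAX_MEDIAN_HOURS_TO_FIRST_REVIEW = 8


@dataclass
class PullRequest:
    """Hypothetical record; field names are illustrative, not a real API."""
    diff_lines: int
    reviewer_count: int


def pr_warnings(pr: PullRequest) -> list[str]:
    """Return human-readable warnings for a single PR."""
    warnings = []
    if pr.diff_lines > MAX_DIFF_LINES:
        warnings.append(
            f"diff is {pr.diff_lines} lines; consider splitting (limit {MAX_DIFF_LINES})"
        )
    if pr.reviewer_count < TARGET_REVIEWERS:
        warnings.append(
            f"{pr.reviewer_count} reviewer(s) assigned; target is {TARGET_REVIEWERS}"
        )
    return warnings


def first_review_latency_ok(hours_to_first_review: list[float]) -> bool:
    """Compare the median time to first review (working hours) with the target."""
    if not hours_to_first_review:
        return True  # nothing measured yet, so nothing to flag
    return median(hours_to_first_review) < MAX_MEDIAN_HOURS_TO_FIRST_REVIEW
```

Using the median rather than the mean keeps one week-late outlier from hiding an otherwise healthy queue, which matches how the threshold is stated above.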

Anti-patterns that ruin code review

Four review anti-patterns show up most often in our engagements with B2B SaaS teams:

The rubber stamp. One LGTM in 30 seconds. Defeats the purpose entirely; often a culture symptom rather than a tooling problem.
The bottleneck reviewer. One senior engineer reviews everything. Lead time balloons, knowledge stays trapped in one head.
Bike-shedding. Long debates over trivial issues (variable names, formatting) while real design problems slip through. Usually a sign that the team needs explicit reviewer norms.
The week-late review. By the time feedback arrives, the author has context-switched. Either the feedback is ignored, or the author has to reload context, which is expensive.
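Two of these anti-patterns, the rubber stamp and the bottleneck reviewer, leave traces in review data that a team could look for. The sketch below is one possible heuristic, assuming a hypothetical `Review` record: the 30-second cutoff comes from the description above, while the 50% share used to flag a bottleneck is purely an illustrative assumption.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Review:
    """Hypothetical review record; field names are illustrative."""
    reviewer: str
    seconds_spent: float
    comment_count: int


def looks_like_rubber_stamp(review: Review, max_seconds: float = 30.0) -> bool:
    """Flag an approval with no comments that took 30 seconds or less."""
    return review.comment_count == 0 and review.seconds_spent <= max_seconds


def bottleneck_reviewers(reviews: list[Review], max_share: float = 0.5) -> list[str]:
    """Return reviewers who handled more than max_share of all reviews.

    The 0.5 default is an assumption for illustration, not a researched figure.
    """
    counts = Counter(r.reviewer for r in reviews)
    total = sum(counts.values())
    if total == 0:
        return []
    return sorted(name for name, n in counts.items() if n / total > max_share)
```

A flag from either function is a prompt for a conversation, not a verdict: a fast approval on a one-line change can be entirely reasonable.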

Code review and AI tools

AI-assisted code review tools (GitHub Copilot Review, CodeRabbit, Sourcegraph, others) have improved dramatically by 2026. They are good at catching obvious bugs, style violations, missing test coverage, and security smells. They do not yet replace human review for design feedback, contextual judgement, or knowledge transfer.

The pattern that works: AI tools run first, surfacing easy issues so the human reviewer can focus on judgement calls. This roughly halves the time per review while preserving the parts of review that AI cannot do. Teams that try to use AI as the only reviewer end up with the rubber-stamp anti-pattern by another name.
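The ordering described here, automated and AI checks first and humans second, can be expressed as a simple gate. This is a minimal sketch under stated assumptions: `CheckResult` and the stage names are hypothetical, and a real setup would read these results from whatever CI system the team uses.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    """Hypothetical result of one automated stage (lint, tests, AI review)."""
    name: str
    passed: bool


def blocking_checks(results: list[CheckResult]) -> list[str]:
    """Names of automated stages that still need the author's attention."""
    return [r.name for r in results if not r.passed]


def ready_for_human_review(results: list[CheckResult]) -> bool:
    """Request human review only once every automated stage has run and passed."""
    return bool(results) and not blocking_checks(results)
```

Requiring at least one result before returning `True` is a deliberate choice in this sketch: a PR with no automated checks at all should not slip straight through to a human reviewer.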

For our take on auditing AI-generated code specifically, see how to audit a vibe-coded app. For the broader case for review, see the fivefold impact of code reviews. For a widely cited industry baseline, see Google's engineering practices on code review and the long-running SmartBear research on peer review effectiveness.

Frequently asked questions

How long should a code review take?

Roughly 5 to 10 minutes per 100 lines of diff for a serious review. A 200-line PR should get 10 to 20 minutes of attention. Anything significantly faster suggests rubber-stamping; significantly slower suggests the PR is too large and should be split.
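The arithmetic behind that answer is simple enough to sketch. The function name below is hypothetical, and the 5 and 10 minute rates are the figures quoted above rather than universal constants.

```python
def review_budget_minutes(
    diff_lines: int,
    low_per_100: float = 5.0,
    high_per_100: float = 10.0,
) -> tuple[float, float]:
    """Return the (low, high) review time in minutes for a diff of this size."""
    if diff_lines < 0:
        raise ValueError("diff_lines must be non-negative")
    hundreds = diff_lines / 100
    return (low_per_100 * hundreds, high_per_100 * hundreds)


print(review_budget_minutes(200))  # (10.0, 20.0)
print(review_budget_minutes(400))  # (20.0, 40.0)
```

At the 400-line ceiling the upper bound is already 40 minutes of focused attention, which is one practical reason to split anything larger.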

Should junior engineers review senior engineers' code?

Yes. The knowledge transfer is bidirectional. Junior engineers ask the questions that surface implicit assumptions in senior engineers' code, and they learn by reading patterns they would not have written themselves. The only adjustment is that a junior reviewer should not be the only reviewer on a high-risk change.

What if reviewers always say yes?

That pattern is almost always a symptom of culture or tooling. Either reviewers feel they are not allowed to push back (culture), or the review tooling makes substantive feedback expensive (tooling: clunky inline comments, no draft review feature). Fix the substrate before blaming people.

Is code review still useful with extensive automated testing?

Yes. Tests catch bugs, not design problems. Tests verify what the code does, not whether it should do that thing at all. The four other outcomes of review (shared ownership, knowledge spread, design feedback, norm enforcement) are largely orthogonal to test coverage.

How do you keep reviews from blocking delivery?

Three rules: keep PRs under 400 lines, set explicit SLAs for time-to-first-review (8 working hours is realistic for most teams), and rotate reviewer assignment automatically rather than relying on the author to pick. The combination keeps median lead time under a day for most changes.
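Automatic rotation, the third rule, can be as simple as a round-robin over the team roster that skips the author. The sketch below is one way to do it, assuming a hypothetical `ReviewerRotation` class and an in-memory pointer; hosted platforms offer their own assignment features, which this does not attempt to model.

```python
class ReviewerRotation:
    """Round-robin reviewer assignment that never picks the author."""

    def __init__(self, team: list[str]):
        self._team = list(team)
        self._next = 0  # index of the next engineer in line

    def assign(self, author: str, count: int = 2) -> list[str]:
        eligible = {member for member in self._team if member != author}
        if count > len(eligible):
            raise ValueError("not enough eligible reviewers on the team")
        picked: list[str] = []
        # Walk the roster from the current pointer, skipping the author.
        while len(picked) < count:
            candidate = self._team[self._next % len(self._team)]
            self._next += 1
            if candidate != author and candidate not in picked:
                picked.append(candidate)
        return picked


rotation = ReviewerRotation(["ana", "ben", "cho", "dev"])
print(rotation.assign("ana"))  # ['ben', 'cho']
print(rotation.assign("ben"))  # ['dev', 'ana']
```

Because the pointer keeps advancing between calls, load spreads across the roster instead of defaulting to whoever the author knows best, which is the failure mode behind the bottleneck reviewer.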