What Makes a High-Quality E2E Test? A Practical Quality Standard Based on Global Models and Real-World Experience

When people talk about the quality of end-to-end (E2E) tests, the same explanations appear again and again:
tests should be stable, critical user flows should be covered, and everything should be automated and run in CI.

None of this is wrong.
But I have repeatedly seen E2E test suites that met all of these conditions and still could not be trusted in day-to-day work.

The tests passed.
CI was green.
And yet production incidents still happened.
When a change was ready to ship, engineers still hesitated.

I see many articles discussing “E2E test quality” without addressing this disconnect.

In this article, I will first clarify the globally shared framework for software quality that E2E testing is expected to support.
Then I will explain why applying that framework directly to real projects often fails—and what quality standard actually works in practice.

This article explains:

How software quality is defined at a global, industry-wide level
Why E2E tests often fail when those ideas are applied mechanically
A practical quality standard for E2E tests that supports real decision-making

This article is for engineers who already use E2E tests and want to trust them when release decisions matter.

The quality of an E2E test is determined by whether its result enables correct decisions.

1. The Global Definition of Quality in E2E Testing

Before discussing E2E tests specifically, we need a shared definition of software quality.

One widely referenced model is ISO/IEC 25010.

This model defines software quality through characteristics such as:

Functional suitability — does the system behave as intended?
Reliability — does it continue to work under expected conditions?
Usability — is it understandable and usable for humans?
Maintainability — can it be changed and fixed without excessive cost?

An important point is often overlooked:
the quality of tests is evaluated by how well they protect these qualities.

From this perspective, E2E tests are positioned as:

A mechanism for continuously judging whether the system, from a user’s point of view, remains correct, reliable, and safe to use.

At a conceptual level, this makes sense.
The problem begins when this abstraction is applied directly to real projects.

2. Why Applying Global Standards Directly Breaks E2E Tests

The ISO/IEC 25010 quality model is sound, but it is also highly abstract.

When teams try to implement it, it is often translated into rules like:

All critical user flows must be covered by E2E tests
As many scenarios as possible should be automated
The test suite must remain stable in CI at all times

On the surface, none of this sounds wrong.

However, I have repeatedly watched E2E test suites built on these principles become less useful over time.

The reason is simple:

The criteria for “measuring quality” become disconnected from “making decisions.”

3. The Core Point: E2E Tests Do Not Measure Quality

This is where most discussions go off track.

E2E tests are not meant to quantify quality or maximize coverage.
They exist to support decisions about quality.

Specifically:

Can this change be released?
What exactly broke?
How far should we assume the impact spreads?

If you follow the global quality model to its logical conclusion, the real question becomes:

Does this test result allow a human to make the right decision?

Any E2E test that cannot pass this question, even if it looks correct in theory, reduces quality in practice.

4. Why a “Subjective” Standard Is Unavoidable

At this point, the discussion inevitably sounds subjective.
But this is not about emotions or personal preference.

When E2E tests stop being useful in practice, it usually happens in the same way:

A test fails, but the cause is not immediately clear
It is unclear whether the failure indicates a real defect
Someone ends up reading code to understand what happened

At that moment, the E2E test stops helping decisions and starts delaying them.

That is why I use this standard:

When the test breaks, do we hesitate?

This is not a vague feeling; it is a question about the quality of decision-making.
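
To make the failure pattern in this section concrete, here is a minimal sketch of one way to reduce hesitation: attach the intent of a step to its failure message, so that nobody has to read code to find out what broke. Everything in it is an assumption made for illustration. StepContext, describeFailure, runStep, and the checkout example are hypothetical names, not part of any real test framework, and the step runs synchronously to keep the sketch short, whereas real E2E steps are usually asynchronous.

```typescript
// Hypothetical helper; these names do not come from any real test framework.
interface StepContext {
  flow: string;       // user flow under test
  protects: string;   // the specific failure this step is meant to catch
  nextAction: string; // what to check first when the step fails
}

// Build a failure message that answers "what broke?" and "what do I do next?".
function describeFailure(context: StepContext, cause: unknown): string {
  const observed = cause instanceof Error ? cause.message : String(cause);
  return [
    `[${context.flow}] ${context.protects}`,
    `observed: ${observed}`,
    `next action: ${context.nextAction}`,
  ].join("\n");
}

// Run one step of a scenario and rethrow any error with its context attached.
// Synchronous for brevity; a real E2E step would normally be awaited.
function runStep<T>(context: StepContext, body: () => T): T {
  try {
    return body();
  } catch (cause) {
    throw new Error(describeFailure(context, cause));
  }
}

// Example context with made-up values.
const paymentStep: StepContext = {
  flow: "checkout",
  protects: "order is not confirmed after a successful payment",
  nextAction: "check the payment webhook handler before rerunning the test",
};
```

With a wrapper like this, a failure inside runStep(paymentStep, ...) reports the flow, the defect the step exists to catch, the observed error, and the first thing to check. Whether those three lines are enough to remove hesitation still depends on how honestly the context is written.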

5. A Practical Quality Standard for E2E Tests

Based on both global models and real-world experience, I evaluate E2E tests using three questions.

If you cannot answer these immediately, the test’s quality is low.

The Three Questions

What specific failure is this test designed to catch?
When it fails, is the next action obvious?
Does this test result make release decisions more confident?

If the answer to even one of these is weak, the test likely claims to protect quality while actually slowing decisions.
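
The three questions can also be recorded next to each test so that missing answers become visible during review. The following is a rough sketch under stated assumptions: TestIntent, unansweredQuestions, weakTests, and the sample entries are hypothetical and invented for this example. It only detects answers that are absent; judging whether an answer is actually good remains a human task.

```typescript
// Hypothetical record of the three questions for one E2E test.
interface TestIntent {
  name: string;
  catches: string;           // What specific failure is this test designed to catch?
  nextAction: string;        // When it fails, what is the obvious next action?
  releaseConfidence: string; // How does the result make a release decision more confident?
}

type Question = "catches" | "nextAction" | "releaseConfidence";
const QUESTIONS: Question[] = ["catches", "nextAction", "releaseConfidence"];

// Return the questions that have no answer (empty or whitespace only).
function unansweredQuestions(intent: TestIntent): Question[] {
  return QUESTIONS.filter((q) => intent[q].trim() === "");
}

// List the tests for which at least one question is unanswered.
function weakTests(intents: TestIntent[]): { name: string; missing: Question[] }[] {
  return intents
    .map((intent) => ({ name: intent.name, missing: unansweredQuestions(intent) }))
    .filter((entry) => entry.missing.length > 0);
}

// Sample data with made-up values.
const intents: TestIntent[] = [
  {
    name: "checkout completes with a saved card",
    catches: "payment succeeds but the order is never created",
    nextAction: "inspect the order service logs for the failing request",
    releaseConfidence: "a pass means the main purchase path still creates orders",
  },
  {
    name: "settings page renders",
    catches: "",
    nextAction: "",
    releaseConfidence: "pages load",
  },
];

const flagged = weakTests(intents);
```

Here flagged contains only the second test, with catches and nextAction listed as missing. A test that shows up in this list is a candidate for rewriting or removal, not automatically a bad test.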

Summary

The quality of an E2E test is defined by whether it strengthens correct decision-making.

This does not contradict global quality standards.
It is the result of translating an abstract model back into something usable.

Meaning over coverage
Clarity over mere stability
Judgment over quantity

The true quality of E2E tests does not live in the test suite itself.
It appears in how confidently humans can decide.