Test Quality Standards Explained
Test quality standards define what makes a test "good" in TEA. These aren't suggestions - they're the Definition of Done that prevents tests from rotting in review.
Overview
TEA's Quality Principles:
- Deterministic - Same result every run
- Isolated - No dependencies on other tests
- Explicit - Assertions visible in test body
- Focused - Single responsibility, appropriate size
- Fast - Execute in reasonable time
Why these matter: Tests that violate these principles create maintenance burden, slow down development, and lose team trust.
The Problem
Tests That Rot in Review
```typescript
// ❌ The anti-pattern: This test will rot
test('user can do stuff', async ({ page }) => {
  await page.goto('/');
  await page.waitForTimeout(5000); // Non-deterministic

  if (await page.locator('.banner').isVisible()) { // Conditional
    await page.click('.dismiss');
  }

  try { // Try-catch for flow control
    await page.click('#load-more');
  } catch (e) {
    // Silently continue
  }

  // ... 300 more lines of test logic
  // ... no clear assertions
});
```
What's wrong:
- Hard wait - Flaky, wastes time
- Conditional - Non-deterministic behavior
- Try-catch - Hides failures
- Too large - Hard to maintain
- Vague name - Unclear purpose
- No explicit assertions - What's being tested?
Result: PR review comments: "This test is flaky, please fix" → never merged → test deleted → coverage lost
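Several of the problems listed above leave recognizable traces in the source text, so a rough first pass can be automated. The sketch below is a hypothetical line-based scanner, not TEA's implementation: the rule names, the regular expressions, and the `findAntiPatterns` function are assumptions made for illustration, and a real check would work on the parsed syntax tree, not on raw lines.

```typescript
// Hypothetical lint-style scanner; rule names and patterns are illustrative only.
type Finding = { line: number; rule: string; text: string };

const RULES: { rule: string; pattern: RegExp }[] = [
  { rule: 'hard-wait', pattern: /\.waitForTimeout\s*\(/ },
  { rule: 'conditional-flow', pattern: /\bif\s*\(\s*await\b/ },
  { rule: 'try-catch-flow', pattern: /\btry\s*\{/ },
];

function findAntiPatterns(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split('\n').forEach((text, index) => {
    for (const { rule, pattern } of RULES) {
      if (pattern.test(text)) {
        // Report 1-based line numbers, as editors and reviewers expect
        findings.push({ line: index + 1, rule, text: text.trim() });
      }
    }
  });
  return findings;
}

// A shortened copy of the anti-pattern test body shown above
const sample = [
  "await page.goto('/');",
  'await page.waitForTimeout(5000);',
  "if (await page.locator('.banner').isVisible()) {",
  "  await page.click('.dismiss');",
  '}',
  'try {',
  "  await page.click('#load-more');",
  '} catch (e) {}',
].join('\n');

for (const finding of findAntiPatterns(sample)) {
  console.log(`line ${finding.line}: ${finding.rule}`);
}
```

For this sample the scanner reports the hard wait on line 2, the conditional on line 3, and the try-catch on line 6. It cannot judge size, naming, or missing assertions, which still need review.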
AI-Generated Tests Without Standards
AI-generated tests without quality guardrails:
```typescript
// AI generates 50 tests like this:
test('test1', async ({ page }) => {
  await page.goto('/');
  await page.waitForTimeout(3000);
  // ... flaky, vague, redundant
});

test('test2', async ({ page }) => {
  await page.goto('/');
  await page.waitForTimeout(3000);
  // ... duplicates test1
});

// ... 48 more similar tests
```
Result: 50 tests, 80% redundant, 90% flaky, 0% trusted by team - low-quality outputs that create maintenance burden.
The Solution: TEA's Quality Standards
1. Determinism (No Flakiness)
Rule: Test produces same result every run.
Requirements:
- ❌ No hard waits (`waitForTimeout`)
- ❌ No conditionals for flow control (`if/else`)
- ❌ No try-catch for flow control
- ✅ Use network-first patterns (wait for responses)
- ✅ Use explicit waits (waitForSelector, waitForResponse)
Bad Example:
```typescript
test('flaky test', async ({ page }) => {
  await page.click('button');
  await page.waitForTimeout(2000); // ❌ Might be too short

  if (await page.locator('.modal').isVisible()) { // ❌ Non-deterministic
    await page.click('.dismiss');
  }

  try { // ❌ Silently handles errors
    await expect(page.locator('.success')).toBeVisible();
  } catch (e) {
    // Test passes even if assertion fails!
  }
});
```
Good Example (Vanilla Playwright):
```typescript
test('deterministic test', async ({ page }) => {
  const responsePromise = page.waitForResponse(
    resp => resp.url().includes('/api/submit') && resp.ok()
  );

  await page.click('button');
  await responsePromise; // ✅ Wait for actual response

  // Modal should ALWAYS show (make it deterministic)
  await expect(page.locator('.modal')).toBeVisible();
  await page.click('.dismiss');

  // Explicit assertion (fails if not visible)
  await expect(page.locator('.success')).toBeVisible();
});
```
With Playwright Utils (Even Cleaner):
```typescript
import { test } from '@seontechnologies/playwright-utils/fixtures';
import { expect } from '@playwright/test';

test('deterministic test', async ({ page, interceptNetworkCall }) => {
  const submitCall = interceptNetworkCall({
    method: 'POST',
    url: '**/api/submit'
  });

  await page.click('button');

  // Wait for actual response (automatic JSON parsing)
  const { status, responseJson } = await submitCall;
  expect(status).toBe(200);

  // Modal should ALWAYS show (make it deterministic)
  await expect(page.locator('.modal')).toBeVisible();
  await page.click('.dismiss');

  // Explicit assertion (fails if not visible)
  await expect(page.locator('.success')).toBeVisible();
});
```
Why both work:
- Waits for actual event (network response)
- No conditionals (behavior is deterministic)
- Assertions fail loudly (no silent failures)
- Same result every run (deterministic)
Playwright Utils additional benefits:
- Automatic JSON parsing
- `{ status, responseJson }` structure (can validate response data)
- No manual `await response.json()`
2. Isolation (No Dependencies)
Rule: Test runs independently, no shared state.
Requirements:
- ✅ Self-cleaning (cleanup after test)
- ✅ No global state dependencies
- ✅ Can run in parallel
- ✅ Can run in any order
- ✅ Use unique test data
Bad Example:
```typescript
// ❌ Tests depend on execution order
let userId: string; // Shared global state

test('create user', async ({ apiRequest }) => {
  const { body } = await apiRequest({
    method: 'POST',
    path: '/api/users',
    body: { email: 'test@example.com' } // Hard-coded
  });
  userId = body.id; // Store in global
});

test('update user', async ({ apiRequest }) => {
  // Depends on previous test setting userId
  await apiRequest({
    method: 'PATCH',
    path: `/api/users/${userId}`,
    body: { name: 'Updated' }
  });
  // No cleanup - leaves user in database
});
```
Problems:
- Tests must run in order (can't parallelize)
- Second test fails if first skipped (`.only`)
- Hard-coded data causes conflicts
- No cleanup (database fills with test data)
Good Example (Vanilla Playwright):
```typescript
test('should update user profile', async ({ request }) => {
  // Create unique test data
  const testEmail = `test-${Date.now()}@example.com`;

  // Setup: Create user
  const createResp = await request.post('/api/users', {
    data: { email: testEmail, name: 'Original' }
  });
  const user = await createResp.json();

  // Test: Update user
  const updateResp = await request.patch(`/api/users/${user.id}`, {
    data: { name: 'Updated' }
  });
  const updated = await updateResp.json();

  expect(updated.name).toBe('Updated');

  // Cleanup: Delete user
  await request.delete(`/api/users/${user.id}`);
});
```
Even Better (With Playwright Utils):
```typescript
import { test } from '@seontechnologies/playwright-utils/api-request/fixtures';
import { expect } from '@playwright/test';
import { faker } from '@faker-js/faker';

test('should update user profile', async ({ apiRequest }) => {
  // Dynamic unique test data
  const testEmail = faker.internet.email();

  // Setup: Create user
  const { status: createStatus, body: user } = await apiRequest({
    method: 'POST',
    path: '/api/users',
    body: { email: testEmail, name: faker.person.fullName() }
  });

  expect(createStatus).toBe(201);

  // Test: Update user
  const { status, body: updated } = await apiRequest({
    method: 'PATCH',
    path: `/api/users/${user.id}`,
    body: { name: 'Updated Name' }
  });

  expect(status).toBe(200);
  expect(updated.name).toBe('Updated Name');

  // Cleanup: Delete user
  await apiRequest({
    method: 'DELETE',
    path: `/api/users/${user.id}`
  });
});
```
Playwright Utils Benefits:
- `{ status, body }` destructuring (cleaner than `response.status()` + `await response.json()`)
- No manual `await response.json()`
- Automatic retry for 5xx errors
- Optional schema validation with `.validateSchema()`
Why it works:
- No global state
- Unique test data (no conflicts)
- Self-cleaning (deletes user)
- Can run in parallel
- Can run in any order
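The unique-data point deserves one caveat: a timestamp alone can repeat when two parallel workers create data in the same millisecond. A minimal sketch of a more collision-resistant generator follows. The `uniqueEmail` helper and its address format are hypothetical and not part of Playwright or Playwright Utils.

```typescript
// Hypothetical helper: the name and the address format are assumptions for illustration.
let sequence = 0;

function uniqueEmail(prefix = 'test'): string {
  sequence += 1;
  const random = Math.random().toString(36).slice(2, 8);
  // Timestamp separates runs, the sequence separates calls within one process,
  // and the random suffix reduces (but does not eliminate) collisions across workers.
  return `${prefix}-${Date.now()}-${sequence}-${random}@example.com`;
}

const emails = Array.from({ length: 1000 }, () => uniqueEmail());
console.log(new Set(emails).size === emails.length); // true: the sequence keeps in-process values distinct
```

If the project already uses a data library such as faker, as in the Playwright Utils example, that remains the simpler choice. The sketch only shows what "unique" has to cover once tests run in parallel.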
3. Explicit Assertions (No Hidden Validation)
Rule: Assertions visible in test body, not abstracted.
Requirements:
- ✅ Assertions in test code (not helper functions)
- ✅ Specific assertions (not generic `toBeTruthy`)
- ✅ Meaningful expectations (test actual behavior)
Bad Example:
```typescript
// ❌ Assertions hidden in helper
async function verifyProfilePage(page: Page) {
  // Assertions buried in helper (not visible in test)
  await expect(page.locator('h1')).toBeVisible();
  await expect(page.locator('.email')).toContainText('@');
  await expect(page.locator('.name')).not.toBeEmpty();
}

test('profile page', async ({ page }) => {
  await page.goto('/profile');
  await verifyProfilePage(page); // What's being verified?
});
```
Problems:
- Can't see what's tested (need to read helper)
- Hard to debug failures (which assertion failed?)
- Reduces test readability
- Hides important validation
Good Example:
```typescript
// ✅ Assertions explicit in test
test('should display profile with correct data', async ({ page }) => {
  await page.goto('/profile');

  // Explicit assertions - clear what's tested
  await expect(page.locator('h1')).toContainText('Test User');
  await expect(page.locator('.email')).toContainText('test@example.com');
  await expect(page.locator('.bio')).toContainText('Software Engineer');
  await expect(page.locator('img[alt="Avatar"]')).toBeVisible();
});
```
Why it works:
- See what's tested at a glance
- Debug failures easily (know which assertion failed)
- Test is self-documenting
- No hidden behavior
Exception: Use helper for setup/cleanup, not assertions.
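To make the exception concrete, here is a minimal sketch of a setup helper that stays on the right side of the line: it builds data and returns it, but asserts nothing. The `buildUser` factory and the `User` shape are hypothetical names chosen for illustration.

```typescript
// Hypothetical setup helper: it returns data and contains no assertions.
type User = { email: string; name: string; bio: string };

function buildUser(overrides: Partial<User> = {}): User {
  return {
    email: 'test@example.com',
    name: 'Test User',
    bio: 'Software Engineer',
    ...overrides, // callers override only the fields their test cares about
  };
}

// In a test, the helper supplies the expected values...
const user = buildUser({ name: 'Ada Lovelace' });

// ...and the assertions stay in the test body, next to the values they check, e.g.:
//   await expect(page.locator('h1')).toContainText(user.name);
//   await expect(page.locator('.email')).toContainText(user.email);
console.log(user.name, user.email);
```

The commented lines show where the Playwright assertions would go. They are left as comments so the sketch runs without a browser.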
4. Focused Tests (Appropriate Size)
Rule: Test has single responsibility, reasonable size.
Requirements:
- ✅ Test size < 300 lines
- ✅ Single responsibility (test one thing well)
- ✅ Clear describe/test names
- ✅ Appropriate scope (not too granular, not too broad)
Bad Example:
```typescript
// ❌ 500-line test testing everything
test('complete user flow', async ({ page }) => {
  // Registration (50 lines)
  await page.goto('/register');
  await page.fill('#email', 'test@example.com');
  // ... 48 more lines

  // Profile setup (100 lines)
  await page.goto('/profile');
  // ... 98 more lines

  // Settings configuration (150 lines)
  await page.goto('/settings');
  // ... 148 more lines

  // Data export (200 lines)
  await page.goto('/export');
  // ... 198 more lines

  // Total: 500 lines, testing 4 different features
});
```
Problems:
- Failure in line 50 prevents testing lines 51-500
- Hard to understand (what's being tested?)
- Slow to execute (testing too much)
- Hard to debug (which feature failed?)
Good Example:
```typescript
// ✅ Focused tests - one responsibility each

test('should register new user', async ({ page }) => {
  await page.goto('/register');
  await page.fill('#email', 'test@example.com');
  await page.fill('#password', 'password123');
  await page.click('button[type="submit"]');

  await expect(page).toHaveURL('/welcome');
  await expect(page.locator('h1')).toContainText('Welcome');
});

test('should configure user profile', async ({ page, authSession }) => {
  await authSession.login({ email: 'test@example.com', password: 'pass' });
  await page.goto('/profile');

  await page.fill('#name', 'Test User');
  await page.fill('#bio', 'Software Engineer');
  await page.click('button:has-text("Save")');

  await expect(page.locator('.success')).toBeVisible();
});

// ... separate tests for settings, export (each < 50 lines)
```
Why it works:
- Each test has one responsibility
- Failure is easy to diagnose
- Can run tests independently
- Test names describe exactly what's tested
5. Fast Execution (Performance Budget)
Rule: Individual test executes in < 1.5 minutes.
Requirements:
- ✅ Test execution < 90 seconds
- ✅ Efficient selectors (getByRole > XPath)
- ✅ Minimal redundant actions
- ✅ Parallel execution enabled
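Two of these requirements, the 90-second budget and parallel execution, can be backed by Playwright's own configuration. The fragment below is a sketch and not a recommended baseline: `timeout`, `fullyParallel`, and `forbidOnly` are standard Playwright Test options, while the specific values are assumptions that mirror the numbers in this section.

```typescript
// playwright.config.ts - a sketch; tune the values to your project
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // Fail any single test that runs longer than 90 seconds
  timeout: 90_000,
  // Run tests within each file in parallel, not only across files
  fullyParallel: true,
  // Fail CI if a stray test.only was committed
  forbidOnly: !!process.env.CI,
});
```

A per-test timeout only caps the worst case. It does not make a slow test fast, so the selector and redundancy requirements still have to be met in the test code itself.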
Bad Example:
```typescript
// ❌ Slow test (3+ minutes)
test('slow test', async ({ page }) => {
  await page.goto('/');
  await page.waitForTimeout(10000); // 10s wasted

  // Navigate through 10 pages (2 minutes)
  for (let i = 1; i <= 10; i++) {
    await page.click(`a[href="/page-${i}"]`);
    await page.waitForTimeout(5000); // 5s per page = 50s wasted
  }

  // Complex XPath selector (slow)
  await page.locator('//div[@class="container"]/section[3]/div[2]/p').click();

  // More waiting
  await page.waitForTimeout(30000); // 30s wasted

  await expect(page.locator('.result')).toBeVisible();
});
```
Total time: 3+ minutes (90 seconds wasted on hard waits: 10s + 50s + 30s)
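As a quick check of the budget, the hard waits in the slow example can be added up directly. The variable names below are illustrative. The durations are the `waitForTimeout` arguments from the test above.

```typescript
// Hard waits in the slow example, in milliseconds
const initialWait = 10_000;      // waitForTimeout(10000)
const perPageWaits = 10 * 5_000; // waitForTimeout(5000) inside a 10-iteration loop
const finalWait = 30_000;        // waitForTimeout(30000)

const wastedMs = initialWait + perPageWaits + finalWait;
console.log(`${wastedMs / 1000} seconds spent in hard waits`); // 90 seconds spent in hard waits
```

The hard waits alone use up the whole 90-second budget, before any navigation or assertion time is counted.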
Good Example (Vanilla Playwright):
```typescript
// ✅ Fast test (< 10 seconds)
test('fast test', async ({ page }) => {
  // Set up response wait
  const apiPromise = page.waitForResponse(
    resp => resp.url().includes('/api/result') && resp.ok()
  );

  await page.goto('/');

  // Direct navigation (skip intermediate pages)
  await page.goto('/page-10');

  // Efficient selector
  await page.getByRole('button', { name: 'Submit' }).click();

  // Wait for actual response (fast when API is fast)
  await apiPromise;

  await expect(page.locator('.result')).toBeVisible();
});
```
With Playwright Utils:
```typescript
import { test } from '@seontechnologies/playwright-utils/fixtures';
import { expect } from '@playwright/test';

test('fast test', async ({ page, interceptNetworkCall }) => {
  // Set up interception
  const resultCall = interceptNetworkCall({
    method: 'GET',
    url: '**/api/result'
  });

  await page.goto('/');

  // Direct navigation (skip intermediate pages)
  await page.goto('/page-10');

  // Efficient selector
  await page.getByRole('button', { name: 'Submit' }).click();

  // Wait for actual response (automatic JSON parsing)
  const { status, responseJson } = await resultCall;

  expect(status).toBe(200);
  await expect(page.locator('.result')).toBeVisible();

  // Can also validate response data if needed
  // expect(responseJson.data).toBeDefined();
});
```
Total time: < 10 seconds (no wasted waits)
Both examples achieve:
- No hard waits (wait for actual events)
- Direct navigation (skip unnecessary steps)
- Efficient selectors (getByRole)
- Fast execution
Playwright Utils bonus:
- Can validate API response data easily
- Automatic JSON parsing
- Cleaner API
TEA's Quality Scoring
TEA reviews tests against these standards in `test-review`:
Scoring Categories (100 points total)
Determinism (35 points):
- No hard waits: 10 points
- No conditionals: 10 points
- No try-catch flow: 10 points
- Network-first patterns: 5 points
Isolation (25 points):
- Self-cleaning: 15 points
- No global state: 5 points
- Parallel-safe: 5 points
Assertions (20 points):
- Explicit in test body: 10 points
- Specific and meaningful: 10 points
Structure (10 points):
- Test size < 300 lines: 5 points
- Clear naming: 5 points
Performance (10 points):
- Execution time < 1.5 min: 10 points
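The category weights above can be read as a simple additive score. The sketch below copies the weights from the list. Everything else is an assumption made for illustration: the `Checklist` shape, the `qualityScore` function, and the all-or-nothing treatment of each criterion. How `test-review` actually awards partial credit is not specified here.

```typescript
// Weights copied from the scoring categories above; the data shapes are hypothetical.
const WEIGHTS = {
  noHardWaits: 10,
  noConditionals: 10,
  noTryCatchFlow: 10,
  networkFirst: 5,
  selfCleaning: 15,
  noGlobalState: 5,
  parallelSafe: 5,
  explicitAssertions: 10,
  specificAssertions: 10,
  sizeUnder300Lines: 5,
  clearNaming: 5,
  under90Seconds: 10,
} as const;

type Criterion = keyof typeof WEIGHTS;
type Checklist = Record<Criterion, boolean>;

// Assumption: each criterion is all-or-nothing (full weight when met, zero otherwise).
function qualityScore(checks: Checklist): number {
  return (Object.keys(WEIGHTS) as Criterion[]).reduce(
    (total, key) => total + (checks[key] ? WEIGHTS[key] : 0),
    0,
  );
}

const allPassing = (Object.keys(WEIGHTS) as Criterion[]).reduce(
  (acc, key) => ({ ...acc, [key]: true }),
  {} as Checklist,
);

console.log(qualityScore(allPassing)); // 100
console.log(qualityScore({ ...allPassing, noHardWaits: false, selfCleaning: false })); // 75
```

The second call shows why cleanup matters: self-cleaning carries the largest single weight, so a test with one hard wait and no cleanup drops to 75.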
Quality Scoring Breakdown
```mermaid
%%{init: {'theme':'base', 'themeVariables': { 'fontSize':'14px'}}}%%
pie title Test Quality Score (100 points)
    "Determinism" : 35
    "Isolation" : 25
    "Assertions" : 20
    "Structure" : 10
    "Performance" : 10
```
```mermaid
%%{init: {'theme':'base', 'themeVariables': { 'fontSize':'13px'}}}%%
flowchart LR
    subgraph Det[Determinism - 35 pts]
        D1[No hard waits<br/>10 pts]
        D2[No conditionals<br/>10 pts]
        D3[No try-catch flow<br/>10 pts]
        D4[Network-first<br/>5 pts]
    end

    subgraph Iso[Isolation - 25 pts]
        I1[Self-cleaning<br/>15 pts]
        I2[No global state<br/>5 pts]
        I3[Parallel-safe<br/>5 pts]
    end

    subgraph Assrt[Assertions - 20 pts]
        A1[Explicit in body<br/>10 pts]
        A2[Specific/meaningful<br/>10 pts]
    end

    subgraph Struct[Structure - 10 pts]
        S1[Size < 300 lines<br/>5 pts]
        S2[Clear naming<br/>5 pts]
    end

    subgraph Perf[Performance - 10 pts]
        P1[Time < 1.5 min<br/>10 pts]
    end

    Det --> Total([Total: 100 points])
    Iso --> Total
    Assrt --> Total
    Struct --> Total
    Perf --> Total

    style Det fill:#ffebee,stroke:#c62828,stroke-width:2px
    style Iso fill:#e3f2fd,stroke:#1565c0,stroke-width:2px
    style Assrt fill:#f3e5f5,stroke:#6a1b9a,stroke-width:2px
    style Struct fill:#fff9c4,stroke:#f57f17,stroke-width:2px
    style Perf fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px
    style Total fill:#fff,stroke:#000,stroke-width:3px
```
Score Interpretation
| Score | Interpretation | Action |
|---|---|---|
| 90-100 | Excellent | Production-ready, minimal changes |
| 80-89 | Good | Minor improvements recommended |
| 70-79 | Acceptable | Address recommendations before release |
| 60-69 | Needs Work | Fix critical issues |
| < 60 | Critical | Significant refactoring needed |
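The table above maps directly onto a small lookup. The following sketch encodes the same bands. The `interpretScore` function and the `Band` type are hypothetical names, and rejecting scores outside 0-100 is an added assumption.

```typescript
type Band = { label: string; action: string };

function interpretScore(score: number): Band {
  // Assumption: scores outside 0-100 are treated as input errors
  if (score < 0 || score > 100) throw new RangeError(`score out of range: ${score}`);
  if (score >= 90) return { label: 'Excellent', action: 'Production-ready, minimal changes' };
  if (score >= 80) return { label: 'Good', action: 'Minor improvements recommended' };
  if (score >= 70) return { label: 'Acceptable', action: 'Address recommendations before release' };
  if (score >= 60) return { label: 'Needs Work', action: 'Fix critical issues' };
  return { label: 'Critical', action: 'Significant refactoring needed' };
}

console.log(interpretScore(95).label); // Excellent
console.log(interpretScore(45).label); // Critical
```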
Comparison: Good vs Bad Tests
Example: User Login
Bad Test (Score: 30/100):
```typescript
test('login test', async ({ page }) => { // Vague name
  await page.goto('/login');
  await page.waitForTimeout(3000); // -10 (hard wait)

  await page.fill('[name="email"]', 'test@example.com');
  await page.fill('[name="password"]', 'password');

  if (await page.locator('.remember-me').isVisible()) { // -10 (conditional)
    await page.click('.remember-me');
  }

  await page.click('button');

  try { // -10 (try-catch flow)
    await page.waitForURL('/dashboard', { timeout: 5000 });
  } catch (e) {
    // Ignore navigation failure
  }

  // No assertions! -20
  // No cleanup! -15
});
```
Issues:
- Determinism: 5/35 (hard wait, conditional, try-catch)
- Isolation: 10/25 (no cleanup)
- Assertions: 0/20 (no assertions!)
- Structure: 10/10 (okay)
- Performance: 5/10 (slow)
- Total: 30/100
Good Test (Score: 95/100):
```typescript
test('should login with valid credentials and redirect to dashboard', async ({ page, authSession }) => {
  // Use fixture for deterministic auth
  const loginPromise = page.waitForResponse(
    resp => resp.url().includes('/api/auth/login') && resp.ok()
  );

  await page.goto('/login');
  await page.getByLabel('Email').fill('test@example.com');
  await page.getByLabel('Password').fill('password123');
  await page.getByRole('button', { name: 'Sign in' }).click();

  // Wait for actual API response
  const response = await loginPromise;
  const { token } = await response.json();

  // Explicit assertions
  expect(token).toBeDefined();
  await expect(page).toHaveURL('/dashboard');
  await expect(page.getByText('Welcome back')).toBeVisible();

  // Cleanup handled by authSession fixture
});
```
Quality:
- Determinism: 35/35 (network-first, no conditionals)
- Isolation: 25/25 (fixture handles cleanup)
- Assertions: 20/20 (explicit and specific)
- Structure: 10/10 (clear name, focused)
- Performance: 5/10 (< 1 min)
- Total: 95/100
Example: API Testing
Bad Test (Score: 50/100):
```typescript
test('api test', async ({ request }) => {
  const response = await request.post('/api/users', {
    data: { email: 'test@example.com' } // Hard-coded (conflicts)
  });

  if (response.ok()) { // Conditional
    const user = await response.json();
    // Weak assertion
    expect(user).toBeTruthy();
  }

  // No cleanup - user left in database
});
```
Good Test (Score: 92/100):
```typescript
test('should create user with valid data', async ({ apiRequest }) => {
  // Unique test data
  const testEmail = `test-${Date.now()}@example.com`;

  // Create user
  const { status, body } = await apiRequest({
    method: 'POST',
    path: '/api/users',
    body: { email: testEmail, name: 'Test User' }
  });

  // Explicit assertions
  expect(status).toBe(201);
  expect(body.id).toBeDefined();
  expect(body.email).toBe(testEmail);
  expect(body.name).toBe('Test User');

  // Cleanup
  await apiRequest({
    method: 'DELETE',
    path: `/api/users/${body.id}`
  });
});
```
How TEA Enforces Standards
During Test Generation (atdd, automate)
TEA generates tests following standards by default:
```typescript
// TEA-generated test (automatically follows standards)
test('should submit contact form', async ({ page }) => {
  // Network-first pattern (no hard waits)
  const submitPromise = page.waitForResponse(
    resp => resp.url().includes('/api/contact') && resp.ok()
  );

  // Accessible selectors (resilient)
  await page.getByLabel('Name').fill('Test User');
  await page.getByLabel('Email').fill('test@example.com');
  await page.getByLabel('Message').fill('Test message');
  await page.getByRole('button', { name: 'Send' }).click();

  const response = await submitPromise;
  const result = await response.json();

  // Explicit assertions
  expect(result.success).toBe(true);
  await expect(page.getByText('Message sent')).toBeVisible();

  // Size: 15 lines (< 300 ✓)
  // Execution: ~2 seconds (< 90s ✓)
});
```
During Test Review (test-review)
TEA audits tests and flags violations:
```markdown
## Critical Issues

### Hard Wait Detected (tests/login.spec.ts:23)

**Issue:** `await page.waitForTimeout(3000)`
**Score Impact:** -10 (Determinism)
**Fix:** Use network-first pattern

### Conditional Flow Control (tests/profile.spec.ts:45)

**Issue:** `if (await page.locator('.banner').isVisible())`
**Score Impact:** -10 (Determinism)
**Fix:** Make banner presence deterministic

## Recommendations

### Extract Fixture (tests/auth.spec.ts)

**Issue:** Login code repeated 5 times
**Score Impact:** -3 (Structure)
**Fix:** Extract to authSession fixture
```
Definition of Done Checklist
When is a test "done"?
Test Quality DoD:
- No hard waits (`waitForTimeout`)
- No conditionals for flow control
- No try-catch for flow control
- Network-first patterns used
- Assertions explicit in test body
- Test size < 300 lines
- Clear, descriptive test name
- Self-cleaning (cleanup in afterEach or test)
- Unique test data (no hard-coded values)
- Execution time < 1.5 minutes
- Can run in parallel
- Can run in any order
Code Review DoD:
- Test quality score > 80
- No critical issues from `test-review`
- Follows project patterns (fixtures, selectors)
- Test reviewed by team member
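The Code Review DoD can be expressed as a small gate that lists whatever is still unmet. This is a sketch under stated assumptions: the `ReviewResult` shape and its field names are hypothetical, and only the four checklist items above are encoded.

```typescript
// Hypothetical shape of a review result; the field names are assumptions.
type ReviewResult = {
  score: number;
  criticalIssues: number;
  followsProjectPatterns: boolean;
  reviewedByTeamMember: boolean;
};

function unmetCodeReviewItems(result: ReviewResult): string[] {
  const unmet: string[] = [];
  // The checklist says "> 80", so a score of exactly 80 does not pass
  if (!(result.score > 80)) unmet.push('Test quality score > 80');
  if (result.criticalIssues > 0) unmet.push('No critical issues from test-review');
  if (!result.followsProjectPatterns) unmet.push('Follows project patterns (fixtures, selectors)');
  if (!result.reviewedByTeamMember) unmet.push('Test reviewed by team member');
  return unmet;
}

const borderline = unmetCodeReviewItems({
  score: 80,
  criticalIssues: 0,
  followsProjectPatterns: true,
  reviewedByTeamMember: true,
});
console.log(borderline); // one unmet item: the score check
```

Returning the unmet items, instead of a single pass/fail flag, keeps the reason for a rejection visible in the review.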
Common Quality Issues
Issue: "My test needs conditionals for optional elements"
Wrong approach:
```typescript
if (await page.locator('.banner').isVisible()) {
  await page.click('.dismiss');
}
```
Right approach - Make it deterministic:
```typescript
// Option 1: Always expect banner
await expect(page.locator('.banner')).toBeVisible();
await page.click('.dismiss');

// Option 2: Test both scenarios separately
test('should show banner for new users', async ({ page }) => { /* ... */ });
test('should not show banner for returning users', async ({ page }) => { /* ... */ });
```
Issue: "My test needs try-catch for error handling"
Wrong approach:
```typescript
try {
  await page.click('#optional-button');
} catch (e) {
  // Silently continue
}
```
Right approach - Make failures explicit:
```typescript
// Option 1: Button should exist
await page.click('#optional-button'); // Fails loudly if missing

// Option 2: Button might not exist (test both)
test('should work with optional button', async ({ page }) => {
  const hasButton = await page.locator('#optional-button').count() > 0;
  if (hasButton) {
    await page.click('#optional-button');
  }
  // But now you're testing optional behavior explicitly
});
```
Issue: "Hard waits are easier than network patterns"
Short-term: Hard waits seem simpler
Long-term: Flaky tests waste more time than learning network patterns
Investment:
- 30 minutes to learn network-first patterns
- Prevents hundreds of hours debugging flaky tests
- Tests run faster (no wasted waits)
- Team trusts test suite
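The difference between a hard wait and an event-based wait shows up even without a browser. The sketch below is a generic polling helper, not a Playwright API: `waitUntil`, its option names, and its defaults are assumptions made for illustration. In Playwright tests the built-in `page.waitForResponse` and web-first assertions already do this job.

```typescript
// Generic polling helper, not a Playwright API; names and defaults are assumptions.
async function waitUntil(
  condition: () => boolean | Promise<boolean>,
  { timeoutMs = 5_000, intervalMs = 50 } = {},
): Promise<number> {
  const start = Date.now();
  while (true) {
    // Resolve as soon as the condition holds, returning the elapsed time
    if (await condition()) return Date.now() - start;
    if (Date.now() - start >= timeoutMs) {
      // Fail loudly instead of continuing with an unmet precondition
      throw new Error(`Condition not met within ${timeoutMs} ms`);
    }
    await new Promise(resolve => setTimeout(resolve, intervalMs));
  }
}

// Simulated application state that becomes ready after about 100 ms
let ready = false;
setTimeout(() => { ready = true; }, 100);

waitUntil(() => ready).then(elapsed => {
  console.log(`ready after ~${elapsed} ms instead of a fixed 5000 ms sleep`);
});
```

A fixed sleep always costs its full duration and still fails when the application is slower than expected. The polling version returns as soon as the condition is true and throws when it never becomes true.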
Technical Implementation
For detailed test quality patterns, see:
Related Concepts
Core TEA Concepts:
- Risk-Based Testing - Quality scales with risk
- Knowledge Base System - How standards are enforced
- Engagement Models - Quality in different models
Technical Patterns:
- Network-First Patterns - Determinism explained
- Fixture Architecture - Isolation through fixtures
Overview:
- TEA Overview - Quality standards in lifecycle
- Testing as Engineering - Why quality matters
Practical Guides
Workflow Guides:
- How to Run Test Review - Audit against these standards
- How to Run ATDD - Generate quality tests
- How to Run Automate - Expand with quality
Use-Case Guides:
- Using TEA with Existing Tests - Improve legacy quality
- Running TEA for Enterprise - Enterprise quality thresholds
Reference
- TEA Command Reference - `test-review` command
- Knowledge Base Index - Test quality fragment
- Glossary - TEA terminology
Generated with BMad Method - TEA (Test Architect)