
# How to Run Test Review with TEA

Use TEA’s `test-review` workflow to audit test quality with objective scoring and actionable feedback. TEA reviews tests against its knowledge base of best practices.

## When to Use

Use `test-review` when you:

  • Want to validate test quality objectively
  • Need quality metrics for release gates
  • Are preparing for production deployment
  • Are reviewing team-written tests
  • Are auditing AI-generated tests
  • Are onboarding new team members (show good patterns)

## Prerequisites

  • BMad Method installed
  • TEA agent available
  • Tests written (to review)
  • Test framework configured

## Run the Workflow

Start a fresh chat and load TEA:

```
tea
test-review
```

TEA will ask what to review.

## Choose the Scope

Review one test file:

```
tests/e2e/checkout.spec.ts
```

Best for:

  • Reviewing specific failing tests
  • Quick feedback on new tests
  • Learning from specific examples

Review all tests in a directory:

```
tests/e2e/
```

Best for:

  • Reviewing E2E test suite
  • Comparing test quality across files
  • Finding patterns of issues

Review all tests:

```
tests/
```

Best for:

  • Release gate quality check
  • Comprehensive audit
  • Establishing baseline metrics

## Example Report

TEA generates a comprehensive quality report with scoring:

# Test Quality Review Report
**Date:** 2026-01-13
**Scope:** tests/e2e/
**Overall Score:** 76/100
## Summary
- **Tests Reviewed:** 12
- **Passing Quality:** 9 tests (75%)
- **Needs Improvement:** 3 tests (25%)
- **Critical Issues:** 2
- **Recommendations:** 6
## Critical Issues
### 1. Hard Waits Detected
**File:** `tests/e2e/checkout.spec.ts:45`
**Issue:** Using `page.waitForTimeout(3000)`
**Impact:** Test is flaky and unnecessarily slow
**Severity:** Critical
**Current Code:**
```typescript
await page.click('button[type="submit"]');
await page.waitForTimeout(3000); // ❌ Hard wait
await expect(page.locator('.success')).toBeVisible();
```

**Fix:**

```typescript
// Start waiting for the API response before clicking, so a fast
// response cannot arrive before the listener is registered
const responsePromise = page.waitForResponse(
  (resp) => resp.url().includes('/api/checkout') && resp.ok()
);
await page.click('button[type="submit"]');
await responsePromise;
await expect(page.locator('.success')).toBeVisible();
```

**Why This Matters:**

- Hard waits are fixed timeouts that don’t wait for actual conditions
- Tests fail intermittently on slower machines
- Time is wasted even when the response is fast
- Network-first patterns are more reliable

### 2. Conditional Flow Control

**File:** `tests/e2e/profile.spec.ts:28`
**Issue:** Using if/else to handle optional elements
**Impact:** Non-deterministic test behavior
**Severity:** Critical

**Current Code:**

```typescript
if (await page.locator('.banner').isVisible()) {
  await page.click('.dismiss');
}
// ❌ Test behavior changes based on banner presence
```

**Fix:**

```typescript
// Option 1: Make banner presence deterministic
await expect(page.locator('.banner')).toBeVisible();
await page.click('.dismiss');

// Option 2: Test both scenarios separately
test('should show banner for new users', async ({ page }) => {
  // Test with banner
});

test('should not show banner for returning users', async ({ page }) => {
  // Test without banner
});
```

**Why This Matters:**

- Tests should be deterministic (same result every run)
- Conditionals hide bugs (what if the banner should always show?)
- Conditionals make debugging harder
- Conditionals violate the test isolation principle

## Recommendations

### 1. Extract Login Fixture

**File:** `tests/e2e/profile.spec.ts`
**Issue:** Login code duplicated in every test
**Severity:** Medium
**Impact:** Maintenance burden, test verbosity

**Current:**

```typescript
test('test 1', async ({ page }) => {
  await page.goto('/login');
  await page.fill('[name="email"]', 'test@example.com');
  await page.fill('[name="password"]', 'password');
  await page.click('button[type="submit"]');
  // Test logic...
});

test('test 2', async ({ page }) => {
  // Same login code repeated
});
```

**Fix (Vanilla Playwright):**

```typescript
// Create fixture in tests/support/fixtures/auth.ts
import { test as base, Page } from '@playwright/test';

export const test = base.extend<{ authenticatedPage: Page }>({
  authenticatedPage: async ({ page }, use) => {
    await page.goto('/login');
    await page.getByLabel('Email').fill('test@example.com');
    await page.getByLabel('Password').fill('password');
    await page.getByRole('button', { name: 'Sign in' }).click();
    await page.waitForURL(/\/dashboard/);
    await use(page);
  },
});

// Use in tests
test('test 1', async ({ authenticatedPage }) => {
  // Already logged in
});
```

**Better (With Playwright Utils):**

```typescript
// Use the built-in auth-session fixture
import { test as base } from '@playwright/test';
import { createAuthFixtures } from '@seontechnologies/playwright-utils/auth-session';

export const test = base.extend(createAuthFixtures());

// Use in tests - even simpler
test('test 1', async ({ page, authToken }) => {
  // authToken already available (persisted, reused)
  await page.goto('/dashboard');
  // Already authenticated via authToken
});
```

**Playwright Utils Benefits:**

- Token persisted to disk (faster subsequent runs)
- Multi-user support out of the box
- Automatic token renewal if expired
- No manual login flow needed

### 2. Add Network Assertions

**File:** `tests/e2e/api-calls.spec.ts`
**Issue:** No verification of API responses
**Severity:** Low
**Impact:** Tests don’t catch API errors

**Current:**

```typescript
await page.click('button[name="save"]');
await expect(page.locator('.success')).toBeVisible();
// ❌ What if the API returned 500 but the UI shows cached success?
```

**Enhancement:**

```typescript
const responsePromise = page.waitForResponse(
  (resp) => resp.url().includes('/api/profile') && resp.status() === 200
);
await page.click('button[name="save"]');
const response = await responsePromise;

// Verify the API response
const data = await response.json();
expect(data.success).toBe(true);

// Verify the UI
await expect(page.locator('.success')).toBeVisible();
```

### 3. Improve Test Names

**File:** `tests/e2e/checkout.spec.ts`
**Issue:** Vague test names
**Severity:** Low
**Impact:** Hard to understand test purpose

**Current:**

```typescript
test('should work', async ({ page }) => { });
test('test checkout', async ({ page }) => { });
```

**Better:**

```typescript
test('should complete checkout with valid credit card', async ({ page }) => { });
test('should show validation error for expired card', async ({ page }) => { });
```
## Score Breakdown

| Category | Score | Target | Status |
|----------|-------|--------|--------|
| Determinism | 26/35 | 30/35 | ⚠️ Needs Improvement |
| Isolation | 22/25 | 20/25 | ✅ Good |
| Assertions | 18/20 | 16/20 | ✅ Good |
| Structure | 7/10 | 8/10 | ⚠️ Minor Issues |
| Performance | 3/10 | 8/10 | ❌ Critical |

**Determinism (35 points max):**

- No hard waits: 0/10 ❌ (found 3 instances)
- No conditionals: 8/10 ⚠️ (found 2 instances)
- No try-catch flow control: 10/10 ✅ (see the sketch below)
- Network-first patterns: 8/15 ⚠️ (some tests missing)
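
For reference, this is the kind of try-catch flow control the check flags. A minimal sketch; the `.popup` selector is hypothetical:

```typescript
// ❌ Anti-pattern: try-catch used to branch test flow
// ('.popup' is an illustrative selector)
try {
  await page.click('.popup .close', { timeout: 2000 });
} catch {
  // Popup did not appear - silently continue
}

// ✅ Deterministic alternative: control app state so the popup
// reliably appears, then assert and dismiss it
await expect(page.locator('.popup')).toBeVisible();
await page.click('.popup .close');
```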

**Isolation (25 points max):**

- Self-cleaning: 20/20 ✅
- No global state: 5/5 ✅
- Parallel-safe: 0/0 ✅ (not tested)

**Assertions (20 points max):**

- Explicit in test body: 15/15 ✅
- Specific and meaningful: 3/5 ⚠️ (some weak assertions)

**Structure (10 points max):**

- Test size < 300 lines: 5/5 ✅
- Clear names: 2/5 ⚠️ (some vague names)

**Performance (10 points max):**

- Execution time < 1.5 min: 3/10 ❌ (3 tests exceed limit)
## File-by-File Breakdown

| File | Score | Issues | Status |
|------|-------|--------|--------|
| tests/e2e/checkout.spec.ts | 65/100 | 4 | ❌ Needs Work |
| tests/e2e/profile.spec.ts | 72/100 | 3 | ⚠️ Needs Improvement |
| tests/e2e/search.spec.ts | 88/100 | 1 | ✅ Good |
| tests/api/profile.spec.ts | 92/100 | 0 | ✅ Excellent |
## Next Steps

**Immediate (fix critical issues):**

1. Remove hard waits in checkout.spec.ts (lines 45, 67, 89)
2. Fix conditional in profile.spec.ts (line 28)
3. Optimize slow tests in checkout.spec.ts

**Short-term improvements:**

1. Extract login fixture from profile.spec.ts
2. Add network assertions to api-calls.spec.ts
3. Improve test names in checkout.spec.ts

**Long-term quality goals:**

1. Re-run test-review after fixes (target: 85/100)
2. Add performance budgets to CI
3. Document test patterns for team

TEA reviews tests against the patterns in its knowledge base: determinism, isolation, assertion quality, structure, and performance.

## Understanding the Scores
### What Do Scores Mean?
| Score Range | Interpretation | Action |
|-------------|----------------|--------|
| **90-100** | Excellent | Minimal changes needed, production-ready |
| **80-89** | Good | Minor improvements recommended |
| **70-79** | Acceptable | Address recommendations before release |
| **60-69** | Needs Improvement | Fix critical issues, apply recommendations |
| **< 60** | Critical | Significant refactoring needed |
### Scoring Criteria
**Determinism (35 points):**
- Tests produce same result every run
- No random failures (flakiness)
- No environment-dependent behavior
**Isolation (25 points):**
- Tests don't depend on each other
- Can run in any order
- Clean up after themselves
**Assertions (20 points):**
- Verify actual behavior
- Specific and meaningful (see the example below)
- Not abstracted away in helpers
**Structure (10 points):**
- Readable and maintainable
- Appropriate size
- Clear naming
**Performance (10 points):**
- Fast execution
- Efficient selectors
- No unnecessary waits
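
To illustrate the "specific and meaningful" assertion check, a hedged sketch; the `.result-row` selector and expected values are hypothetical:

```typescript
// ⚠️ Weak: only proves that something rendered
await expect(page.locator('.result-row').first()).toBeVisible();

// ✅ Specific: verifies the count and content users actually care about
await expect(page.locator('.result-row')).toHaveCount(3);
await expect(page.locator('.result-row').first()).toContainText('Playwright');
```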
## What You Get
### Quality Report
- Overall score (0-100)
- Category scores (Determinism, Isolation, etc.)
- File-by-file breakdown
### Critical Issues
- Specific line numbers
- Code examples (current vs fixed)
- Why it matters explanation
- Impact assessment
### Recommendations
- Actionable improvements
- Code examples
- Priority/severity levels
### Next Steps
- Immediate actions (fix critical)
- Short-term improvements
- Long-term quality goals
## Tips
### Review Before Release
Make test review part of release checklist:
```markdown
## Release Checklist
- [ ] All tests passing
- [ ] Test review score > 80
- [ ] Critical issues resolved
- [ ] Performance within budget
```

### Review AI-Generated Tests

Always review AI-generated tests:

1. Run `atdd` or `automate`
2. Run `test-review` on the generated tests
3. Fix critical issues
4. Commit the tests

### Use Scores as Quality Gates

Use scores as quality gates. A minimal sketch, assuming the review report is saved to `test-review.md` with an `Overall Score: NN/100` line:

```yaml
# .github/workflows/test.yml
- name: Review test quality
  run: |
    # Parse the overall score from the report
    # (test-review.md is an assumed output path)
    SCORE=$(sed -n 's/.*Overall Score[^0-9]*\([0-9][0-9]*\).*/\1/p' test-review.md | head -1)
    if [ "$SCORE" -lt 80 ]; then
      echo "Test quality below threshold"
      exit 1
    fi
```

### Schedule Periodic Reviews

Schedule periodic reviews:

  • Per story: Optional (spot check new tests)
  • Per epic: Recommended (ensure consistency)
  • Per release: Recommended for quality gates (required if using formal gate process)
  • Quarterly: Audit entire suite

### Review Large Suites Incrementally

For large suites, review incrementally:

  • Week 1: Review E2E tests
  • Week 2: Review API tests
  • Week 3: Review component tests (Cypress CT or Vitest)
  • Week 4: Apply fixes across all suites

Component Testing Note: TEA reviews component tests using framework-specific knowledge:

  • Cypress: Reviews Cypress Component Testing specs (*.cy.tsx)
  • Playwright: Reviews Vitest component tests (*.test.tsx); see the sketch below
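
For context, this is the shape of a Vitest component test TEA would review. A minimal sketch assuming a React + Vitest + Testing Library setup; the `Banner` component and its props are hypothetical:

```typescript
// Banner.test.tsx - hypothetical component test (assumes React,
// Vitest, and @testing-library/react are configured)
import { render, screen } from '@testing-library/react';
import { expect, test } from 'vitest';
import { Banner } from './Banner'; // hypothetical component

test('should show a dismiss button for new users', () => {
  render(<Banner isNewUser />);
  // getByRole throws if the button is missing, making the check explicit
  expect(screen.getByRole('button', { name: 'Dismiss' })).toBeDefined();
});
```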

### Share Reports with the Team

Share reports with the team:

Team Meeting:
- Review test-review.md
- Discuss critical issues
- Agree on patterns
- Update team guidelines

### Track Improvement Over Time

Track improvement across reviews:

```markdown
## Quality Trend
| Date | Score | Critical Issues | Notes |
|------|-------|-----------------|-------|
| 2026-01-01 | 65 | 5 | Baseline |
| 2026-01-15 | 72 | 2 | Fixed hard waits |
| 2026-02-01 | 84 | 0 | All critical resolved |
```

## Troubleshooting

### Tests Are Flaky

Symptoms:

  • Tests fail randomly
  • “Works on my machine”
  • CI failures that don’t reproduce locally

Common Causes:

  • Hard waits (waitForTimeout)
  • Conditional flow control (if/else)
  • Try-catch for flow control
  • Missing network-first patterns

Fix: Review the determinism criteria and apply network-first patterns.

### Tests Are Slow

Symptoms:

  • Tests take > 1.5 minutes each
  • Test suite takes hours
  • CI times out

Common Causes:

  • Unnecessary waits (hard timeouts)
  • Inefficient selectors (XPath, complex CSS)
  • Not using parallelization
  • Heavy setup in every test

Fix: Optimize waits, improve selectors, and use fixtures; a selector sketch follows below.
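
To illustrate the selector point, a hedged before/after sketch; the XPath and the 'Save' button name are hypothetical:

```typescript
// ⚠️ Brittle and slow: deep XPath tied to DOM structure (hypothetical)
await page.click('//div[@id="main"]/div[2]/form/div[4]/button[1]');

// ✅ Resilient: user-facing role-based locator
await page.getByRole('button', { name: 'Save' }).click();
```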

### Tests Are Not Isolated

Symptoms:

  • Tests fail when run in different order
  • Tests fail in parallel
  • Test data conflicts

Common Causes:

  • Shared global state
  • Tests don’t clean up
  • Hard-coded test data
  • Database not reset between tests

Fix: Use fixtures, clean up in afterEach, and use unique test data, as in the sketch below.
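
A hedged sketch of that fix, assuming a hypothetical `createUser`/`deleteUser` API helper pair:

```typescript
import { test as base } from '@playwright/test';
import { randomUUID } from 'node:crypto';

// Hypothetical helpers - replace with your app's API client
import { createUser, deleteUser } from './support/api';

export const test = base.extend<{ testUser: { id: string; email: string } }>({
  testUser: async ({}, use) => {
    // Unique data per run avoids collisions in parallel workers
    const email = `user-${randomUUID()}@example.com`;
    const user = await createUser(email);
    await use(user);
    // Self-cleaning: remove the user even after the test body finishes
    await deleteUser(user.id);
  },
});
```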

### Too Many Issues

Problem: The report shows 50+ issues and feels overwhelming.

Solution: Prioritize:

  1. Fix all critical issues first
  2. Apply top 3 recommendations
  3. Re-run review
  4. Iterate

Don’t try to fix everything at once.

### Review Takes Too Long

Problem: Reviewing the entire suite takes hours.

Solution: Review incrementally:

  • Review new tests in PR review
  • Schedule directory reviews weekly
  • Full suite review quarterly
