# How to Run Test Review with TEA

Use TEA's `test-review` workflow to audit test quality with objective scoring and actionable feedback. TEA reviews tests against its knowledge base of best practices.
## When to Use This

- Want to validate test quality objectively
- Need quality metrics for release gates
- Preparing for production deployment
- Reviewing team-written tests
- Auditing AI-generated tests
- Onboarding new team members (show good patterns)
## Prerequisites

- BMad Method installed
- TEA agent available
- Tests written (to review)
- Test framework configured
## 1. Load TEA Agent

Start a fresh chat and load TEA:

```
tea
```

## 2. Run the Test Review Workflow

```
test-review
```

## 3. Specify Review Scope
TEA will ask what to review.

### Option A: Single File

Review one test file:

```
tests/e2e/checkout.spec.ts
```

Best for:
- Reviewing specific failing tests
- Quick feedback on new tests
- Learning from specific examples
### Option B: Directory

Review all tests in a directory:

```
tests/e2e/
```

Best for:
- Reviewing E2E test suite
- Comparing test quality across files
- Finding patterns of issues
### Option C: Entire Suite

Review all tests:

```
tests/
```

Best for:
- Release gate quality check
- Comprehensive audit
- Establishing baseline metrics
## 4. Review the Quality Report

TEA generates a comprehensive quality report with scoring.

Report Structure (`test-review.md`):

# Test Quality Review Report

**Date:** 2026-01-13
**Scope:** `tests/e2e/`
**Overall Score:** 76/100
## Summary
- **Tests Reviewed:** 12
- **Passing Quality:** 9 tests (75%)
- **Needs Improvement:** 3 tests (25%)
- **Critical Issues:** 2
- **Recommendations:** 6
## Critical Issues
### 1. Hard Waits Detected
**File:** `tests/e2e/checkout.spec.ts:45`
**Issue:** Using `page.waitForTimeout(3000)`
**Impact:** Test is flaky and unnecessarily slow
**Severity:** Critical

**Current Code:**

```typescript
await page.click('button[type="submit"]');
await page.waitForTimeout(3000); // ❌ Hard wait
await expect(page.locator('.success')).toBeVisible();
```

**Fix:**

```typescript
// Wait for the API response that triggers the success message.
// Register the wait before the click so a fast response isn't missed.
const responsePromise = page.waitForResponse(
  resp => resp.url().includes('/api/checkout') && resp.ok()
);
await page.click('button[type="submit"]');
await responsePromise;
await expect(page.locator('.success')).toBeVisible();
```

**Why This Matters:**
- Hard waits are fixed timeouts that don't wait for actual conditions
- Tests fail intermittently on slower machines
- Wastes time waiting even when response is fast
- Network-first patterns are more reliable
### 2. Conditional Flow Control

**File:** `tests/e2e/profile.spec.ts:28`
**Issue:** Using `if/else` to handle optional elements
**Impact:** Non-deterministic test behavior
**Severity:** Critical

**Current Code:**

```typescript
if (await page.locator('.banner').isVisible()) {
  await page.click('.dismiss');
}
// ❌ Test behavior changes based on banner presence
```

**Fix:**

```typescript
// Option 1: Make banner presence deterministic
await expect(page.locator('.banner')).toBeVisible();
await page.click('.dismiss');

// Option 2: Test both scenarios separately
test('should show banner for new users', async ({ page }) => {
  // Test with banner
});

test('should not show banner for returning users', async ({ page }) => {
  // Test without banner
});
```

**Why This Matters:**
- Tests should be deterministic (same result every run)
- Conditionals hide bugs (what if banner should always show?)
- Makes debugging harder
- Violates test isolation principle
## Recommendations

### 1. Extract Repeated Setup

**File:** `tests/e2e/profile.spec.ts`
**Issue:** Login code duplicated in every test
**Severity:** Medium
**Impact:** Maintenance burden, test verbosity

**Current:**

```typescript
test('test 1', async ({ page }) => {
  await page.goto('/login');
  await page.fill('[name="email"]', 'test@example.com');
  await page.fill('[name="password"]', 'password');
  await page.click('button[type="submit"]');
  // Test logic...
});

test('test 2', async ({ page }) => {
  // Same login code repeated
});
```

**Fix (Vanilla Playwright):**

```typescript
// Create fixture in tests/support/fixtures/auth.ts
import { test as base, Page } from '@playwright/test';

export const test = base.extend<{ authenticatedPage: Page }>({
  authenticatedPage: async ({ page }, use) => {
    await page.goto('/login');
    await page.getByLabel('Email').fill('test@example.com');
    await page.getByLabel('Password').fill('password');
    await page.getByRole('button', { name: 'Sign in' }).click();
    await page.waitForURL(/\/dashboard/);
    await use(page);
  },
});

// Use in tests
test('test 1', async ({ authenticatedPage }) => {
  // Already logged in
});
```

**Better (With Playwright Utils):**

```typescript
// Use built-in auth-session fixture
import { test as base } from '@playwright/test';
import { createAuthFixtures } from '@seontechnologies/playwright-utils/auth-session';

export const test = base.extend(createAuthFixtures());

// Use in tests - even simpler
test('test 1', async ({ page, authToken }) => {
  // authToken already available (persisted, reused)
  await page.goto('/dashboard');
  // Already authenticated via authToken
});
```

Playwright Utils Benefits:
- Token persisted to disk (faster subsequent runs)
- Multi-user support out of the box
- Automatic token renewal if expired
- No manual login flow needed
### 2. Add Network Assertions

**File:** `tests/e2e/api-calls.spec.ts`
**Issue:** No verification of API responses
**Severity:** Low
**Impact:** Tests don't catch API errors

**Current:**

```typescript
await page.click('button[name="save"]');
await expect(page.locator('.success')).toBeVisible();
// ❌ What if API returned 500 but UI shows cached success?
```

**Enhancement:**

```typescript
const responsePromise = page.waitForResponse(
  resp => resp.url().includes('/api/profile') && resp.status() === 200
);
await page.click('button[name="save"]');
const response = await responsePromise;

// Verify API response
const data = await response.json();
expect(data.success).toBe(true);

// Verify UI
await expect(page.locator('.success')).toBeVisible();
```

### 3. Improve Test Names
**File:** `tests/e2e/checkout.spec.ts`
**Issue:** Vague test names
**Severity:** Low
**Impact:** Hard to understand test purpose

**Current:**

```typescript
test('should work', async ({ page }) => { });
test('test checkout', async ({ page }) => { });
```

**Better:**

```typescript
test('should complete checkout with valid credit card', async ({ page }) => { });
test('should show validation error for expired card', async ({ page }) => { });
```

## Quality Scores by Category

| Category | Score | Target | Status |
|---|---|---|---|
| Determinism | 26/35 | 30/35 | ⚠️ Needs Improvement |
| Isolation | 22/25 | 20/25 | ✅ Good |
| Assertions | 18/20 | 16/20 | ✅ Good |
| Structure | 7/10 | 8/10 | ⚠️ Minor Issues |
| Performance | 3/10 | 8/10 | ❌ Critical |
## Scoring Breakdown

**Determinism (35 points max):**

- No hard waits: 0/10 ❌ (found 3 instances)
- No conditionals: 8/10 ⚠️ (found 2 instances)
- No try-catch flow control: 10/10 ✅
- Network-first patterns: 8/15 ⚠️ (some tests missing)

**Isolation (25 points max):**

- Self-cleaning: 20/20 ✅
- No global state: 5/5 ✅
- Parallel-safe: 0/0 (not tested)

**Assertions (20 points max):**

- Explicit in test body: 15/15 ✅
- Specific and meaningful: 3/5 ⚠️ (some weak assertions)

**Structure (10 points max):**

- Test size < 300 lines: 5/5 ✅
- Clear names: 2/5 ⚠️ (some vague names)

**Performance (10 points max):**

- Execution time < 1.5 min: 3/10 ❌ (3 tests exceed limit)
## Files Reviewed

| File | Score | Issues | Status |
|---|---|---|---|
| `tests/e2e/checkout.spec.ts` | 65/100 | 4 | ❌ Needs Work |
| `tests/e2e/profile.spec.ts` | 72/100 | 3 | ⚠️ Needs Improvement |
| `tests/e2e/search.spec.ts` | 88/100 | 1 | ✅ Good |
| `tests/api/profile.spec.ts` | 92/100 | 0 | ✅ Excellent |
## Next Steps

### Immediate (Fix Critical Issues)

- Remove hard waits in `checkout.spec.ts` (lines 45, 67, 89)
- Fix conditional in `profile.spec.ts` (line 28)
- Optimize slow tests in `checkout.spec.ts`

### Short-term (Apply Recommendations)

- Extract login fixture from `profile.spec.ts`
- Add network assertions to `api-calls.spec.ts`
- Improve test names in `checkout.spec.ts`

### Long-term (Continuous Improvement)

- Re-run `test-review` after fixes (target: 85/100)
- Add performance budgets to CI
- Document test patterns for the team

## Knowledge Base References

TEA reviewed against these patterns:

- `test-quality.md` - Execution limits, isolation
- `network-first.md` - Deterministic waits
- `timing-debugging.md` - Race conditions
- `selector-resilience.md` - Robust selectors
## Understanding the Scores
### What Do Scores Mean?
| Score Range | Interpretation | Action |
|-------------|----------------|--------|
| **90-100** | Excellent | Minimal changes needed, production-ready |
| **80-89** | Good | Minor improvements recommended |
| **70-79** | Acceptable | Address recommendations before release |
| **60-69** | Needs Improvement | Fix critical issues, apply recommendations |
| **< 60** | Critical | Significant refactoring needed |
### Scoring Criteria
**Determinism (35 points):**
- Tests produce the same result every run
- No random failures (flakiness)
- No environment-dependent behavior

**Isolation (25 points):**
- Tests don't depend on each other
- Can run in any order
- Clean up after themselves

**Assertions (20 points):**
- Verify actual behavior
- Specific and meaningful
- Not abstracted away in helpers

**Structure (10 points):**
- Readable and maintainable
- Appropriate size
- Clear naming

**Performance (10 points):**
- Fast execution
- Efficient selectors
- No unnecessary waits
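The overall score is the sum of the five category scores; the sample report's 26 + 22 + 18 + 7 + 3 does add up to 76/100. Here is a minimal TypeScript sketch of that roll-up (TEA's actual scoring logic may differ in detail):

```typescript
// Category maxima mirror the sample report: 35 + 25 + 20 + 10 + 10 = 100
interface CategoryScores {
  determinism: number; // out of 35
  isolation: number;   // out of 25
  assertions: number;  // out of 20
  structure: number;   // out of 10
  performance: number; // out of 10
}

function overallScore(s: CategoryScores): number {
  return s.determinism + s.isolation + s.assertions + s.structure + s.performance;
}

// Sample report: 26 + 22 + 18 + 7 + 3 = 76
console.log(overallScore({ determinism: 26, isolation: 22, assertions: 18, structure: 7, performance: 3 }));
```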
## What You Get
### Quality Report
- Overall score (0-100)
- Category scores (Determinism, Isolation, etc.)
- File-by-file breakdown

### Critical Issues
- Specific line numbers
- Code examples (current vs. fixed)
- Why-it-matters explanation
- Impact assessment

### Recommendations
- Actionable improvements
- Code examples
- Priority/severity levels

### Next Steps
- Immediate actions (fix critical issues)
- Short-term improvements
- Long-term quality goals
## Tips
### Review Before Release
Make test review part of your release checklist:

```markdown
## Release Checklist
- [ ] All tests passing
- [ ] Test review score > 80
- [ ] Critical issues resolved
- [ ] Performance within budget
```

### Review After AI Generation

Always review AI-generated tests:

1. Run `atdd` or `automate`
2. Run `test-review` on the generated tests
3. Fix critical issues
4. Commit tests

### Set Quality Gates
Use scores as quality gates in CI:

```yaml
- name: Review test quality
  run: |
    # Run test review
    # Parse score from report
    if [ $SCORE -lt 80 ]; then
      echo "Test quality below threshold"
      exit 1
    fi
```
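The `# Parse score from report` step above is left abstract. One way to make it concrete is a small Node script that extracts the overall score and fails the build below the threshold. This is a sketch: the `test-review.md` path and the `**Overall Score:** NN/100` line format are assumptions based on the sample report above:

```typescript
// check-test-quality.ts - hypothetical CI gate script
import { readFileSync } from 'node:fs';

const THRESHOLD = 80;
const report = readFileSync('test-review.md', 'utf8');
const match = report.match(/\*\*Overall Score:\*\*\s*(\d+)\/100/);

if (!match) {
  console.error('Could not find an overall score in test-review.md');
  process.exit(1);
}

const score = Number(match[1]);
if (score < THRESHOLD) {
  console.error(`Test quality ${score}/100 is below threshold ${THRESHOLD}`);
  process.exit(1);
}
console.log(`Test quality ${score}/100 meets threshold ${THRESHOLD}`);
```

Run it (for example with `npx tsx check-test-quality.ts`) in place of the inline shell check.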
### Review Regularly

Schedule periodic reviews:
- Per story: Optional (spot check new tests)
- Per epic: Recommended (ensure consistency)
- Per release: Recommended for quality gates (required if using formal gate process)
- Quarterly: Audit entire suite
### Focus Reviews

For large suites, review incrementally:

- Week 1: Review E2E tests
- Week 2: Review API tests
- Week 3: Review component tests (Cypress CT or Vitest)
- Week 4: Apply fixes across all suites

**Component Testing Note:** TEA reviews component tests using framework-specific knowledge:

- **Cypress:** Reviews Cypress Component Testing specs (`*.cy.tsx`)
- **Playwright:** Reviews Vitest component tests (`*.test.tsx`)
### Use Reviews for Learning

Share reports with the team:

Team Meeting:
- Review `test-review.md`
- Discuss critical issues
- Agree on patterns
- Update team guidelines

### Compare Over Time

Track improvement:

**Quality Trend**

| Date | Score | Critical Issues | Notes |
|------|-------|-----------------|-------|
| 2026-01-01 | 65 | 5 | Baseline |
| 2026-01-15 | 72 | 2 | Fixed hard waits |
| 2026-02-01 | 84 | 0 | All critical resolved |

## Common Issues
### Low Determinism Score

**Symptoms:**

- Tests fail randomly
- "Works on my machine"
- CI failures that don't reproduce locally

**Common Causes:**

- Hard waits (`waitForTimeout`)
- Conditional flow control (`if/else`)
- Try-catch for flow control
- Missing network-first patterns
**Fix:** Review the determinism section above and apply network-first patterns; try-catch flow control is illustrated below.
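Try-catch flow control never gets a code example in the report above, so here is an illustrative before/after (the `.banner` and `.dismiss` selectors are hypothetical):

```typescript
// ❌ Try-catch as flow control: a failure is silently swallowed,
// so the test passes whether or not the banner ever appears
try {
  await page.click('.dismiss');
} catch {
  // banner wasn't there - carry on
}

// ✅ Deterministic: assert the state you expect, then act on it
await expect(page.locator('.banner')).toBeVisible();
await page.locator('.dismiss').click();
```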
### Low Performance Score

**Symptoms:**

- Tests take > 1.5 minutes each
- Test suite takes hours
- CI times out

**Common Causes:**

- Unnecessary waits (hard timeouts)
- Inefficient selectors (XPath, complex CSS)
- Not using parallelization
- Heavy setup in every test
**Fix:** Optimize waits, improve selectors, use fixtures, and enable parallelization (see the config sketch below).
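For the parallelization and timeout points, a minimal `playwright.config.ts` sketch (the values are illustrative assumptions, not TEA recommendations):

```typescript
import { defineConfig } from '@playwright/test';

export default defineConfig({
  fullyParallel: true,                      // run tests within each file in parallel
  workers: process.env.CI ? 4 : undefined,  // cap workers in CI, use all cores locally
  timeout: 90_000,                          // fail any test exceeding the 1.5-minute budget
});
```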
### Low Isolation Score

**Symptoms:**

- Tests fail when run in a different order
- Tests fail in parallel
- Test data conflicts

**Common Causes:**

- Shared global state
- Tests don't clean up
- Hard-coded test data
- Database not reset between tests
**Fix:** Use fixtures, clean up in `afterEach`, and use unique test data, as in the sketch below.
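A minimal sketch of the self-cleaning, unique-data pattern, assuming a hypothetical `/api/users` endpoint and a `baseURL` configured in `playwright.config.ts`:

```typescript
import { test } from '@playwright/test';
import { randomUUID } from 'node:crypto';

let userId: string;

test.beforeEach(async ({ request }) => {
  // Unique data per test avoids collisions between parallel workers
  const res = await request.post('/api/users', {
    data: { email: `test-${randomUUID()}@example.com` },
  });
  userId = (await res.json()).id;
});

test.afterEach(async ({ request }) => {
  // Self-cleaning: remove whatever the test created
  await request.delete(`/api/users/${userId}`);
});
```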
### "Too Many Issues to Fix"

**Problem:** The report shows 50+ issues, which feels overwhelming.

**Solution:** Prioritize:
- Fix all critical issues first
- Apply top 3 recommendations
- Re-run review
- Iterate
Don't try to fix everything at once.
### Reviews Take Too Long

**Problem:** Reviewing the entire suite takes hours.

**Solution:** Review incrementally:
- Review new tests in PR review
- Schedule directory reviews weekly
- Full suite review quarterly
## Related Guides

- How to Run ATDD - Generate tests to review
- How to Run Automate - Expand coverage to review
- How to Run Trace - Coverage complements quality
## Understanding the Concepts

- Test Quality Standards - What makes tests good
- Network-First Patterns - Avoiding flakiness
- Fixture Architecture - Reusable patterns
## Reference

- Command: `*test-review` - Full command reference
- Knowledge Base Index - Patterns TEA reviews against
Generated with BMad Method - TEA (Test Architect)