Test Prompts That Actually Work

Module 03: TDD with AI | Expansion Guide


The Problem

You ask AI to "write tests" and get back a test suite that looks great at first glance. It runs. Everything passes. You commit it. Then production breaks because the tests were useless - they tested implementation details, missed edge cases, or had assertions so weak they'd pass with completely wrong code.

The issue: Generic prompts produce generic tests. AI needs specific guidance to write tests that actually catch bugs.

Most developers either accept whatever tests AI generates (dangerous) or write all tests manually (slow). Neither approach leverages AI effectively for test generation.

The Core Insight

Good test prompts include examples, specify edge cases, and demand strong assertions. They teach AI what "working" means for your specific code.

AI can write excellent tests if you provide: the happy path example, edge cases to cover, specific assertion patterns, and the testing philosophy you want followed. Think of your prompt as a brief specification document.

The Walkthrough

Pattern 1: The Example-Driven Test Prompt

Start with concrete examples of how the code should behave. AI extrapolates from examples better than from abstract descriptions.

# BAD: Vague prompt
"Write tests for the email validator function"

# GOOD: Example-driven prompt
"Write tests for validate_email(email: str) -> bool

Examples of valid emails:
- user@example.com
- user.name+tag@example.co.uk
- user_123@subdomain.example.com

Examples of invalid emails:
- @example.com (missing local part)
- user@.com (missing domain)
- user name@example.com (contains space)
- user@example (missing TLD)

The function should:
1. Return True for valid emails
2. Return False for invalid emails
3. Not raise exceptions for any input

Write comprehensive tests covering these cases plus any additional edge cases you identify."

This gives AI a clear picture of what constitutes valid/invalid, plus permission to think of more cases.
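
For reference, a response to that prompt might look like the sketch below. The `validate_email` implementation is a hypothetical regex-based stand-in, included only so the tests run; your real function will differ.

```python
import re

# Hypothetical implementation, included only so the tests below run.
_EMAIL_RE = re.compile(
    r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}"
)

def validate_email(email):
    if not isinstance(email, str):
        return False  # never raise, per the prompt's requirement 3
    return _EMAIL_RE.fullmatch(email) is not None

def test_valid_emails():
    for email in ["user@example.com",
                  "user.name+tag@example.co.uk",
                  "user_123@subdomain.example.com"]:
        assert validate_email(email), email

def test_invalid_emails():
    for email in ["@example.com",           # missing local part
                  "user@.com",              # missing domain
                  "user name@example.com",  # contains space
                  "user@example"]:          # missing TLD
        assert not validate_email(email), email

def test_never_raises():
    # Requirement 3: no input should crash the validator.
    for bad_input in [None, 42, "", b"bytes"]:
        validate_email(bad_input)
```

Notice how the generated tests mirror the prompt's example lists almost one-to-one; that traceability is exactly what the example-driven style buys you.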

Pattern 2: The Edge Case Checklist

AI often misses edge cases unless you explicitly list categories to test. Provide a checklist:

"Write tests for the calculate_discount(price, coupon_code) function.

Test these categories:
1. **Happy path:** Valid price + valid coupon = correct discount
2. **Boundary values:**
   - price = 0
   - price = 0.01 (minimum)
   - price = 999999 (very large)
3. **Invalid inputs:**
   - Negative price
   - None/null price
   - Non-numeric price
4. **Coupon edge cases:**
   - Empty coupon code
   - Expired coupon
   - Coupon already used
   - Invalid coupon code
   - Case sensitivity (should 'SAVE10' == 'save10'?)
5. **Calculation edge cases:**
   - Discount larger than price
   - Multiple coupons (if applicable)
   - Rounding edge cases (e.g., $10.00 with 33% off)

For each test, use descriptive names and include docstrings explaining the scenario."

Why This Works

The checklist forces AI to think systematically. You're not doing the work - AI generates the actual test code - but you're providing the test strategy.
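
To make that concrete, here is roughly the shape of tests the checklist tends to produce. Both `calculate_discount` and its coupon table are hypothetical stand-ins, assuming invalid prices raise `ValueError`; the real function's error behavior may differ.

```python
# Hypothetical implementation so the tests run; whether bad input
# raises or returns an error depends on your actual code.
_COUPONS = {"SAVE10": 0.10}

def calculate_discount(price, coupon_code):
    if isinstance(price, bool) or not isinstance(price, (int, float)):
        raise ValueError("price must be numeric")
    if price < 0:
        raise ValueError("price must be non-negative")
    rate = _COUPONS.get((coupon_code or "").upper(), 0.0)
    return round(price * (1 - rate), 2)

def test_zero_price_yields_zero_total():
    """Boundary: a free item stays free after any coupon."""
    assert calculate_discount(0, "SAVE10") == 0

def test_negative_price_rejected():
    """Invalid input: negative prices must be refused, not discounted."""
    try:
        calculate_discount(-5, "SAVE10")
        assert False, "expected ValueError"
    except ValueError:
        pass

def test_coupon_code_is_case_insensitive():
    """Coupon edge case: 'SAVE10' and 'save10' apply the same discount."""
    assert calculate_discount(100, "save10") == calculate_discount(100, "SAVE10")
```

Each test maps back to one checklist category, with the descriptive name and docstring the prompt demanded.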

Pattern 3: The Assertion Template

Weak assertions are the most common test smell. Show AI what strong assertions look like:

"Write tests for the UserService.create_user() method.

Use these assertion patterns:

**For successful creation:**
```python
user = service.create_user(email="test@example.com", name="Test")

# Assert object properties
assert user.id is not None
assert user.email == "test@example.com"
assert user.name == "Test"
assert isinstance(user.created_at, datetime)

# Assert side effects
assert User.query.count() == 1
assert email_service.sent_emails[0].to == "test@example.com"
```

**For validation errors:**
```python
with pytest.raises(ValidationError) as exc_info:
    service.create_user(email="invalid", name="Test")

# Assert error details
assert "email" in str(exc_info.value)
assert "invalid format" in str(exc_info.value).lower()
```

**For duplicate prevention:**
```python
service.create_user(email="test@example.com", name="Test 1")

with pytest.raises(DuplicateUserError):
    service.create_user(email="test@example.com", name="Test 2")

# Assert first user still exists, second wasn't created
assert User.query.count() == 1
```

Generate comprehensive tests following these assertion patterns."

Pattern 4: The Test Philosophy Specification

Different projects need different testing approaches. Tell AI your philosophy:

"Write integration tests for the PaymentProcessor class.

Testing philosophy for this project:
- **Test behavior, not implementation:** Don't assert on private methods or internal state
- **No mocking external APIs:** Use VCR.py to record real API responses
- **Test idempotency:** Every test that creates data should clean up after itself
- **Realistic test data:** Use factories, not hardcoded values like 'test@test.com'
- **Parallel-safe:** Tests must not depend on shared state or execution order

For async methods, use pytest-asyncio. For database tests, use the test_db fixture.

Cover these scenarios:
1. Successful payment processing
2. Declined payment handling
3. Network timeout handling
4. Idempotent retry behavior
5. Webhook signature verification"
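
As one illustration of scenario 4, a generated test might resemble the following. `FakePaymentProcessor` is a hypothetical stand-in keyed on an idempotency token, not a real gateway client; it exists only to show the behavior-not-implementation style the philosophy asks for.

```python
class FakePaymentProcessor:
    """Hypothetical stand-in: charge() is idempotent per idempotency key."""
    def __init__(self):
        self._charges = {}

    def charge(self, idempotency_key, amount_cents):
        # Replaying the same key must not create a second charge.
        if idempotency_key not in self._charges:
            self._charges[idempotency_key] = amount_cents
        return self._charges[idempotency_key]

    def charge_count(self):
        return len(self._charges)

def test_retry_with_same_key_charges_once():
    """A network retry replays the request; the customer pays only once."""
    processor = FakePaymentProcessor()
    first = processor.charge("idem-abc123", 1299)
    retried = processor.charge("idem-abc123", 1299)
    assert first == retried == 1299
    assert processor.charge_count() == 1  # asserted via the public API
```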

Failure Patterns

1. Tests That Only Test Happy Path

Symptom: All tests pass in development, production crashes on first error case.

Fix: Always include negative test cases in your prompt.

# Add to every test prompt:
"For each function, write at least:
- 1 happy path test
- 2 edge case tests
- 1 error handling test"

2. Tests Coupled to Implementation

Symptom: Refactoring breaks tests even when behavior is unchanged.

Fix: Explicitly tell AI to test public interface only.

"Write tests for the ShoppingCart class.

IMPORTANT: Only test public methods. Do NOT:
- Assert on private attributes (e.g., self._items)
- Mock internal helper methods
- Test implementation details

DO:
- Test observable behavior through public API
- Assert on return values and side effects
- Use the class as an end user would"
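
The contrast is easiest to see side by side. `ShoppingCart` below is a minimal hypothetical class; both tests pass today, but only the second survives a refactor that swaps the internal dict for a list.

```python
class ShoppingCart:
    """Minimal hypothetical cart, for illustration only."""
    def __init__(self):
        self._items = {}  # name -> (unit_price, quantity)

    def add(self, name, unit_price, quantity=1):
        _, qty = self._items.get(name, (unit_price, 0))
        self._items[name] = (unit_price, qty + quantity)

    def total(self):
        return sum(price * qty for price, qty in self._items.values())

# BAD: coupled to the private dict -- breaks if storage changes
def test_add_coupled_to_internals():
    cart = ShoppingCart()
    cart.add("apple", 0.50, 3)
    assert cart._items["apple"] == (0.50, 3)  # implementation detail

# GOOD: observable behavior through the public API
def test_add_updates_total():
    cart = ShoppingCart()
    cart.add("apple", 0.50, 3)
    assert cart.total() == 1.50
```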

3. Missing Test Data Context

Symptom: Tests use nonsensical data like user_id=1 everywhere.

Fix: Provide realistic test data in your prompt.

"Write tests for order processing logic.

Use realistic test data:
- User IDs: UUIDs like '550e8400-e29b-41d4-a716-446655440000'
- Order IDs: Format 'ORD-2026-001234'
- Timestamps: Use freezegun to control time
- Products: Reference the test_products.json fixture

Generate meaningful test scenarios like:
- Customer orders 3 items, 1 out of stock
- Customer applies expired coupon code
- Order total exceeds credit limit"
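
A small factory module covering the first two bullets might look like this; the `ORD-2026-NNNNNN` shape and the starting sequence number are assumptions taken from the prompt's example, not a real format.

```python
import itertools
import uuid

_order_seq = itertools.count(1234)

def make_user_id():
    # Realistic user IDs: random UUID4 strings instead of user_id=1.
    return str(uuid.uuid4())

def make_order_id():
    # Matches the 'ORD-2026-001234' shape from the prompt (format assumed).
    return f"ORD-2026-{next(_order_seq):06d}"
```

Factories like these keep test data varied and collision-free across parallel runs, which hardcoded IDs cannot.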

The Copy-Paste Trap

AI sometimes generates tests by copy-pasting the same test with minor variations. Review generated tests for redundancy. If 10 tests are nearly identical, you probably need better prompt guidance on what variations matter.

Advanced Techniques

Parameterized Test Generation

"Write parameterized tests for password_strength(password) -> int

Use pytest.mark.parametrize with these test cases:

Weak passwords (should return 0-30):
- '12345678'
- 'password'
- 'qwerty'

Medium passwords (should return 31-70):
- 'Password1'
- 'my-password-123'

Strong passwords (should return 71-100):
- 'C0mpl3x!P@ssw0rd'
- 'correct-horse-battery-staple-2024'

Format as a single parameterized test, not individual test functions."
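
One possible shape for the answer is below. The toy heuristic scorer stands in for the real `password_strength` (its weights are invented for illustration), and the loop plays the role of `pytest.mark.parametrize` so the sketch runs without pytest installed.

```python
import string

def password_strength(password):
    # Toy heuristic, assumed for illustration: capped length plus
    # character-class variety, scaled into a 0-100 score.
    classes = (string.ascii_lowercase, string.ascii_uppercase,
               string.digits, string.punctuation)
    variety = sum(any(ch in cls for ch in password) for cls in classes)
    return min(int(min(len(password), 20) * 2.5) + (variety - 1) * 15, 100)

# Each row: (password, expected minimum, expected maximum).
STRENGTH_CASES = [
    ("12345678", 0, 30),
    ("password", 0, 30),
    ("qwerty", 0, 30),
    ("Password1", 31, 70),
    ("my-password-123", 31, 70),
    ("C0mpl3x!P@ssw0rd", 71, 100),
    ("correct-horse-battery-staple-2024", 71, 100),
]

def test_password_strength_bands():
    # With pytest available, this table drops straight into
    # @pytest.mark.parametrize("password,lo,hi", STRENGTH_CASES).
    for password, lo, hi in STRENGTH_CASES:
        assert lo <= password_strength(password) <= hi, password
```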

Property-Based Testing Prompts

"Write property-based tests using Hypothesis for the merge_sorted_lists() function.

Properties to test:
1. Output length equals sum of input lengths
2. Output is sorted
3. All elements from inputs appear in output
4. No elements appear in output that weren't in inputs
5. Function is commutative: merge(a, b) == merge(b, a)

Use hypothesis.strategies.lists with integers."
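
Without assuming Hypothesis is installed, the same five properties can be sketched with random inputs; in real use you would replace the loop with `@given(st.lists(st.integers()), st.lists(st.integers()))` and let Hypothesis generate and shrink failing cases. The heapq-based `merge_sorted_lists` here is a hypothetical implementation under test.

```python
import heapq
import random
from collections import Counter

def merge_sorted_lists(a, b):
    # Hypothetical implementation under test: heapq.merge lazily
    # merges inputs that are already sorted.
    return list(heapq.merge(a, b))

def test_merge_properties():
    rng = random.Random(0)
    for _ in range(200):
        # Properties assume sorted inputs, so generate them sorted.
        a = sorted(rng.randint(-50, 50) for _ in range(rng.randint(0, 20)))
        b = sorted(rng.randint(-50, 50) for _ in range(rng.randint(0, 20)))
        out = merge_sorted_lists(a, b)
        assert len(out) == len(a) + len(b)               # property 1
        assert out == sorted(out)                        # property 2
        assert Counter(out) == Counter(a) + Counter(b)   # properties 3 & 4
        assert out == merge_sorted_lists(b, a)           # property 5
```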

Quick Reference

Essential Prompt Components:

1. Concrete examples of valid and invalid behavior
2. An explicit edge case checklist
3. Assertion patterns to follow
4. Your project's testing philosophy

Test Coverage Checklist to Include:

1. Happy path
2. Boundary values (0, 1, max, min)
3. Invalid inputs (null, wrong type, malformed)
4. Error conditions (exceptions, error returns)
5. Edge cases specific to domain
6. Integration points (if applicable)

Prompt Template:

"Write tests for [function/class name]

Signature: [exact signature with types]

Behavior:
- [what it does in plain English]

Valid inputs:
- [example 1]
- [example 2]

Invalid inputs:
- [example 1] -> should [raise/return X]
- [example 2] -> should [raise/return Y]

Test these scenarios:
1. [scenario 1]
2. [scenario 2]

Use [testing framework] with [specific patterns/fixtures].

Each test should have:
- Descriptive name
- Docstring explaining scenario
- Arrange-Act-Assert structure
- Specific assertions (not just truthy checks)"

Red Flags in Generated Tests:

- Only happy path cases, no error handling
- Assertions on private attributes or internal state
- Weak, truthy-only assertions
- Near-duplicate tests with trivial variations
- Nonsensical hardcoded data (user_id=1 everywhere)