The Problem
You ask AI to "write tests" and get back a test suite that looks great at first glance. It runs. Everything passes. You commit it. Then production breaks because the tests were useless - they tested implementation details, missed edge cases, or had assertions so weak they'd pass with completely wrong code.
The issue: Generic prompts produce generic tests. AI needs specific guidance to write tests that actually catch bugs.
Most developers either accept whatever tests AI generates (dangerous) or write all tests manually (slow). Neither approach leverages AI effectively for test generation.
The Core Insight
Good test prompts include examples, specify edge cases, and demand strong assertions. They teach AI what "working" means for your specific code.
AI can write excellent tests if you provide: the happy path example, edge cases to cover, specific assertion patterns, and the testing philosophy you want followed. Think of your prompt as a brief specification document.
The Walkthrough
Pattern 1: The Example-Driven Test Prompt
Start with concrete examples of how the code should behave. AI extrapolates from examples better than from abstract descriptions.
# BAD: Vague prompt
"Write tests for the email validator function"
# GOOD: Example-driven prompt
"Write tests for validate_email(email: str) -> bool
Examples of valid emails:
- user@example.com
- user.name+tag@example.co.uk
- user_123@subdomain.example.com
Examples of invalid emails:
- @example.com (missing local part)
- user@.com (missing domain)
- user name@example.com (contains space)
- user@example (missing TLD)
The function should:
1. Return True for valid emails
2. Return False for invalid emails
3. Not raise exceptions for any input
Write comprehensive tests covering these cases plus any additional edge cases you identify."
This gives AI a clear picture of what constitutes valid/invalid, plus permission to think of more cases.
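A prompt like this tends to produce table-driven tests. As a sketch of what good output looks like, here is one possible shape; note the `validate_email` implementation below is a hypothetical stand-in for the function under test, not a production-grade validator:

```python
import re

# Hypothetical stand-in for the function under test; a real validator
# would be more permissive per RFC 5322.
_EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+$")

def validate_email(email) -> bool:
    """Return True for valid-looking emails; never raise (rule 3)."""
    if not isinstance(email, str):
        return False
    return _EMAIL_RE.fullmatch(email) is not None

# The prompt's examples, plus extra edge cases the AI is invited to add.
VALID = ["user@example.com", "user.name+tag@example.co.uk",
         "user_123@subdomain.example.com"]
INVALID = ["@example.com", "user@.com", "user name@example.com",
           "user@example", "", None]

for email in VALID:
    assert validate_email(email), f"should accept {email!r}"
for email in INVALID:
    assert validate_email(email) is False, f"should reject {email!r}"
```

Because the prompt enumerated both sides of the boundary, the generated tests exercise rejection paths, not just acceptance.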
Pattern 2: The Edge Case Checklist
AI often misses edge cases unless you explicitly list categories to test. Provide a checklist:
"Write tests for the calculate_discount(price, coupon_code) function.
Test these categories:
1. **Happy path:** Valid price + valid coupon = correct discount
2. **Boundary values:**
- price = 0
- price = 0.01 (minimum)
- price = 999999 (very large)
3. **Invalid inputs:**
- Negative price
- None/null price
- Non-numeric price
4. **Coupon edge cases:**
- Empty coupon code
- Expired coupon
- Coupon already used
- Invalid coupon code
- Case sensitivity (should 'SAVE10' == 'save10'?)
5. **Calculation edge cases:**
- Discount larger than price
- Multiple coupons (if applicable)
- Rounding edge cases (e.g., $10.00 with 33% off)
For each test, use descriptive names and include docstrings explaining the scenario."
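To make the checklist concrete, here is a minimal sketch of the kind of code and tests it should produce. The coupon table, case-insensitivity rule, and error behavior are assumptions invented for illustration, not part of the original prompt:

```python
# Hypothetical implementation so the checklist cases have concrete answers.
COUPONS = {"SAVE10": 0.10, "TAKE33": 0.33}

def calculate_discount(price, coupon_code):
    """Return the discount amount for a price and coupon code."""
    if not isinstance(price, (int, float)) or isinstance(price, bool):
        raise ValueError("price must be numeric")
    if price < 0:
        raise ValueError("price must be non-negative")
    rate = COUPONS.get((coupon_code or "").upper(), 0.0)  # case-insensitive
    return min(round(price * rate, 2), price)  # discount never exceeds price

assert calculate_discount(100, "SAVE10") == 10.0          # happy path
assert calculate_discount(0, "SAVE10") == 0               # boundary: price = 0
assert calculate_discount(100, "save10") == 10.0          # case sensitivity answered
assert calculate_discount(100, "") == 0.0                 # empty coupon code
assert calculate_discount(10.00, "TAKE33") == 3.3         # rounding edge case
try:
    calculate_discount(-5, "SAVE10")                      # invalid input
except ValueError:
    pass
else:
    raise AssertionError("negative price should raise ValueError")
```

Each assertion maps back to a numbered category in the checklist, which makes gaps easy to spot in review.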
Why This Works
The checklist forces AI to think systematically. You're not doing the work - AI generates the actual test code - but you're providing the test strategy.
Pattern 3: The Assertion Template
Weak assertions are the most common test smell. Show AI what strong assertions look like:
"Write tests for the UserService.create_user() method.
Use these assertion patterns:
**For successful creation:**
```python
user = service.create_user(email="test@example.com", name="Test")
# Assert object properties
assert user.id is not None
assert user.email == "test@example.com"
assert user.name == "Test"
assert isinstance(user.created_at, datetime)
# Assert side effects
assert User.query.count() == 1
assert email_service.sent_emails[0].to == "test@example.com"
```
**For validation errors:**
```python
with pytest.raises(ValidationError) as exc_info:
    service.create_user(email="invalid", name="Test")
# Assert error details
assert "email" in str(exc_info.value)
assert "invalid format" in str(exc_info.value).lower()
```
**For duplicate prevention:**
```python
service.create_user(email="test@example.com", name="Test 1")
with pytest.raises(DuplicateUserError):
    service.create_user(email="test@example.com", name="Test 2")
# Assert first user still exists, second wasn't created
assert User.query.count() == 1
```
Generate comprehensive tests following these assertion patterns."
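To see why truthy-only assertions are the smell this pattern targets, consider a deliberately broken factory (hypothetical, for illustration) that a weak assertion waves straight through:

```python
def create_user(email, name):
    # Deliberately buggy: id is never assigned and email is dropped.
    return {"id": None, "email": "", "name": name}

user = create_user("test@example.com", "Test")

# Weak assertion: passes even though the object is wrong.
assert user

# Strong assertions in the style above would fail here and flag both bugs:
# assert user["id"] is not None                 # fails: id was never assigned
# assert user["email"] == "test@example.com"    # fails: email was dropped
```

The weak test is green against broken code; the strong versions fail immediately, which is exactly what showing assertion patterns in the prompt buys you.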
Pattern 4: The Test Philosophy Specification
Different projects need different testing approaches. Tell AI your philosophy:
"Write integration tests for the PaymentProcessor class.
Testing philosophy for this project:
- **Test behavior, not implementation:** Don't assert on private methods or internal state
- **No mocking external APIs:** Use VCR.py to record real API responses
- **Self-cleaning tests:** Every test that creates data should clean up after itself so reruns start from a clean slate
- **Realistic test data:** Use factories, not hardcoded values like 'test@test.com'
- **Parallel-safe:** Tests must not depend on shared state or execution order
For async methods, use pytest-asyncio. For database tests, use the test_db fixture.
Cover these scenarios:
1. Successful payment processing
2. Declined payment handling
3. Network timeout handling
4. Idempotent retry behavior
5. Webhook signature verification"
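Scenario 4 is the subtlest of the five. A minimal in-memory sketch shows the behavior a retry test should pin down; the class shape, `charge` signature, and result fields here are assumptions, not a real gateway API:

```python
import uuid

class PaymentProcessor:
    """Hypothetical in-memory sketch; a real processor delegates to a gateway."""
    def __init__(self):
        self._by_key = {}  # idempotency key -> recorded charge

    def charge(self, amount_cents, idempotency_key):
        # A retried request with the same key returns the original charge
        # instead of charging the customer twice.
        if idempotency_key not in self._by_key:
            self._by_key[idempotency_key] = {
                "id": str(uuid.uuid4()),
                "amount_cents": amount_cents,
                "status": "succeeded",
            }
        return self._by_key[idempotency_key]

processor = PaymentProcessor()
first = processor.charge(1999, idempotency_key="order-42")
retry = processor.charge(1999, idempotency_key="order-42")

# Per the philosophy above: assert on observable behavior, not internal state.
assert retry["id"] == first["id"]
assert retry == first
```

Note the test never peeks at `_by_key`: it proves idempotency purely through the public return values, consistent with "test behavior, not implementation."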
Failure Patterns
1. Tests That Only Test Happy Path
Symptom: All tests pass in development, production crashes on first error case.
Fix: Always include negative test cases in your prompt.
# Add to every test prompt:
"For each function, write at least:
- 1 happy path test
- 2 edge case tests
- 1 error handling test"
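Applied to a tiny hypothetical function, the 1 + 2 + 1 rule looks like this (`safe_divide` and its error contract are invented for illustration):

```python
def safe_divide(a, b):
    """Divide a by b, raising ValueError (not ZeroDivisionError) for b == 0."""
    if b == 0:
        raise ValueError("division by zero")
    return a / b

assert safe_divide(10, 2) == 5.0           # 1 happy path test
assert safe_divide(0, 5) == 0.0            # edge case: zero numerator
assert safe_divide(-9, 3) == -3.0          # edge case: negative operand
try:                                       # 1 error handling test
    safe_divide(1, 0)
except ValueError:
    pass
else:
    raise AssertionError("b == 0 should raise ValueError")
```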
2. Tests Coupled to Implementation
Symptom: Refactoring breaks tests even when behavior is unchanged.
Fix: Explicitly tell AI to test public interface only.
"Write tests for the ShoppingCart class.
IMPORTANT: Only test public methods. Do NOT:
- Assert on private attributes (e.g., self._items)
- Mock internal helper methods
- Test implementation details
DO:
- Test observable behavior through public API
- Assert on return values and side effects
- Use the class as an end user would"
3. Missing Test Data Context
Symptom: Tests use nonsensical data like user_id=1 everywhere.
Fix: Provide realistic test data in your prompt.
"Write tests for order processing logic.
Use realistic test data:
- User IDs: UUIDs like '550e8400-e29b-41d4-a716-446655440000'
- Order IDs: Format 'ORD-2026-001234'
- Timestamps: Use freezegun to control time
- Products: Reference the test_products.json fixture
Generate meaningful test scenarios like:
- Customer orders 3 items, 1 out of stock
- Customer applies expired coupon code
- Order total exceeds credit limit"
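A lightweight way to get such data without hardcoding it everywhere is a builder function. This sketch uses only the stdlib; the builder names and the fixed timestamp (standing in for what freezegun would pin) are assumptions:

```python
import uuid
from datetime import datetime, timezone

# Hypothetical builders; the ID formats mirror the prompt above.
def make_user_id() -> str:
    return str(uuid.uuid4())

def make_order_id(seq: int, year: int = 2026) -> str:
    return f"ORD-{year}-{seq:06d}"

def make_order(seq=1234, items=("widget",)):
    return {
        "order_id": make_order_id(seq),
        "user_id": make_user_id(),
        # Fixed timestamp, playing the role freezegun plays in real tests.
        "created_at": datetime(2026, 1, 15, tzinfo=timezone.utc),
        "items": list(items),
    }

order = make_order()
assert order["order_id"] == "ORD-2026-001234"
assert len(order["user_id"]) == 36      # canonical UUID string length
```

Tests then override only the fields a scenario cares about, and the rest stays realistic by default.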
The Copy-Paste Trap
AI sometimes generates tests by copy-pasting the same test with minor variations. Review generated tests for redundancy. If 10 tests are nearly identical, you probably need better prompt guidance on what variations matter.
Advanced Techniques
Parameterized Test Generation
"Write parameterized tests for password_strength(password) -> int
Use pytest.mark.parametrize with these test cases:
Weak passwords (should return 0-30):
- '12345678'
- 'password'
- 'qwerty'
Medium passwords (should return 31-70):
- 'Password1'
- 'my-password-123'
Strong passwords (should return 71-100):
- 'C0mpl3x!P@ssw0rd'
- 'correct-horse-battery-staple-2024'
Format as a single parameterized test, not individual test functions."
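One plausible result of that prompt, sketched with a hand-rolled scoring heuristic chosen purely so the example passwords land in the requested bands (a real scorer would check dictionaries, entropy, and so on). The case table is exactly what a single `@pytest.mark.parametrize` test would consume:

```python
import string

# Hypothetical heuristic: length points plus a bonus per extra character class.
def password_strength(password: str) -> int:
    classes = sum([
        any(c.islower() for c in password),
        any(c.isupper() for c in password),
        any(c.isdigit() for c in password),
        any(c in string.punctuation for c in password),
    ])
    score = min(len(password), 20) * 2 + max(classes - 1, 0) * 16
    return min(score, 100)

# (password, expected_low, expected_high) -- feed to @pytest.mark.parametrize.
CASES = [
    ("12345678", 0, 30), ("password", 0, 30), ("qwerty", 0, 30),
    ("Password1", 31, 70), ("my-password-123", 31, 70),
    ("C0mpl3x!P@ssw0rd", 71, 100),
    ("correct-horse-battery-staple-2024", 71, 100),
]
for pw, lo, hi in CASES:
    assert lo <= password_strength(pw) <= hi, pw
```

Keeping the cases in one table makes it obvious which bands are thin, and adding a case is a one-line diff.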
Property-Based Testing Prompts
"Write property-based tests using Hypothesis for the merge_sorted_lists() function.
Properties to test:
1. Output length equals sum of input lengths
2. Output is sorted
3. All elements from inputs appear in output
4. No elements appear in output that weren't in inputs
5. Function is commutative: merge(a, b) == merge(b, a)
Use hypothesis.strategies.lists with integers."
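The same five properties can be sketched with only the stdlib, using seeded random inputs as a stand-in for Hypothesis's `@given(st.lists(st.integers()), ...)`; the `merge_sorted_lists` implementation here is a hypothetical function under test built on `heapq.merge`:

```python
import heapq
import random
from collections import Counter

def merge_sorted_lists(a, b):
    """Hypothetical function under test."""
    return list(heapq.merge(a, b))

random.seed(0)  # deterministic, so failures are reproducible
for _ in range(200):
    a = sorted(random.choices(range(-50, 50), k=random.randint(0, 12)))
    b = sorted(random.choices(range(-50, 50), k=random.randint(0, 12)))
    merged = merge_sorted_lists(a, b)
    assert len(merged) == len(a) + len(b)               # property 1
    assert merged == sorted(merged)                      # property 2
    assert Counter(merged) == Counter(a) + Counter(b)    # properties 3 and 4
    assert merge_sorted_lists(b, a) == merged            # property 5
```

The payoff over example-based tests: 200 random inputs per run, including empty lists and duplicates you would never think to write by hand.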
Quick Reference
Essential Prompt Components:
- Function signature: Exact types and return value
- Happy path example: What success looks like
- Edge cases list: Boundaries, nulls, errors
- Assertion style: Show an example of good assertions
- Testing constraints: No mocking X, use fixture Y, etc.
Test Coverage Checklist to Include:
1. Happy path
2. Boundary values (0, 1, max, min)
3. Invalid inputs (null, wrong type, malformed)
4. Error conditions (exceptions, error returns)
5. Edge cases specific to domain
6. Integration points (if applicable)
Prompt Template:
"Write tests for [function/class name]
Signature: [exact signature with types]
Behavior:
- [what it does in plain English]
Valid inputs:
- [example 1]
- [example 2]
Invalid inputs:
- [example 1] -> should [raise/return X]
- [example 2] -> should [raise/return Y]
Test these scenarios:
1. [scenario 1]
2. [scenario 2]
Use [testing framework] with [specific patterns/fixtures].
Each test should have:
- Descriptive name
- Docstring explaining scenario
- Arrange-Act-Assert structure
- Specific assertions (not just truthy checks)"
Red Flags in Generated Tests:
- Assertion is just `assert result` with no specific check
- Test names are generic: `test_function_1`, `test_function_2`
- No error case testing
- All tests use same input values
- Tests assert on private attributes