Sandbox demo

Running Tests with Claude

Tight feedback loops: run, read failure, fix, run again — without you driving.

Once tests are wired up, the agent can drive the red→green loop on its own. You describe what should be true; it writes the assertion; it watches the test fail; it edits the implementation; it re-runs until green.

The skill here is *trusting the loop*. Don't peek at every iteration — let it run, then review the diff at the end.

Sandbox transcript
Claude Code · sandbox replay
Press Play to replay this transcript step-by-step.
0 / 8

The transcript is canned — no real API calls. The point is to internalize the loop: user → plan → tool → result → answer.

Check your understanding
Q1. What's the value of letting the agent run the test *before* writing the implementation?
· Score 100% on the quiz.