Sandbox demo

Running Tests with Claude

Tight feedback loops: run, read failure, fix, run again — without you driving.

Once tests are wired up, the agent can drive the red→green loop on its own. You describe what should be true; it writes the assertion; it watches the test fail; it edits the implementation; it re-runs until green.

The skill here is *trusting the loop*. Don't peek at every iteration — let it run, then review the diff at the end.

Sandbox transcript

Claude Code · sandbox replay

Press Play to replay this transcript step-by-step.

0 / 8

The transcript is canned — no real API calls. The point is to internalize the loop: user → plan → tool → result → answer.

Check your understanding

Q1. What's the value of letting the agent run the test *before* writing the implementation?

· Score 100% on the quiz.