
If you write end-to-end tests, you already know the slow part. It is not running the suite. It is the back-and-forth of opening the app, clicking through a flow, inspecting the DOM, and translating all of that into selectors and assertions. The Playwright MCP server removes most of that friction by letting an AI assistant like Claude drive a real browser, observe what actually renders, and turn a live session into a working Playwright test.
This tutorial is for developers who already test with Playwright (or want to) and use an MCP-capable client such as Claude Desktop, Claude Code, or Cursor. You will learn what the Playwright MCP server is, how it talks to the browser through the accessibility tree, how to configure it, and how to use it to explore an app and generate maintainable E2E tests. By the end, you will have a repeatable workflow that turns “click around and figure out the selectors” into a few sentences of instruction.
What Is the Playwright MCP Server?
The Playwright MCP server is an official Microsoft tool that exposes browser automation as Model Context Protocol (MCP) tools, so any MCP client can navigate pages, click elements, fill forms, and read page structure. It lets an AI assistant control a real Chromium, Firefox, or WebKit browser and observe results through the accessibility tree rather than guessing from raw HTML.
That last detail matters. Many AI browser tools work from screenshots, which forces the model to reason visually about pixel coordinates. The Playwright MCP server instead hands the model a structured snapshot of the page: roles, labels, and accessible names, similar to what a screen reader sees. As a result, actions are deterministic, fast, and easy to translate into real Playwright selectors like getByRole and getByLabel.
If MCP itself is new to you, start with our guide to the MCP protocol, then come back here. This post assumes you understand the basic client-server model.
Why Drive E2E Tests Through MCP?
Traditional test authoring is a manual translation job. You perform an action, open DevTools, find a stable selector, and write the assertion by hand. Multiply that across a checkout flow with ten steps, and a single test eats an afternoon.
With the Playwright MCP server, you describe the flow in plain language. The assistant opens the page, reads the accessibility snapshot, performs each step, and reports what it saw. Because it works from real page structure, it picks role-based selectors that survive CSS refactors. Furthermore, it can spot flakiness sources, such as a button that is briefly disabled, before you ever commit the test.
This pairs naturally with Claude’s tool use, which is the underlying mechanism the model uses to call each browser action. MCP just standardizes how those tools are exposed.
How the Accessibility Tree Approach Works
When the assistant calls browser_snapshot, the server returns a YAML-like tree of the current page. A login form might come back looking like this:
- document:
  - heading "Sign in" [level=1]
  - textbox "Email"
  - textbox "Password"
  - button "Log in"
  - link "Forgot password?"
Each node carries a stable reference the model uses to act. To click the button, the assistant calls browser_click with that element’s reference, not a brittle XPath or a pixel coordinate. Consequently, the same instruction works whether the button is styled with Tailwind, plain CSS, or a component library.
This is also why generated tests read cleanly. A snapshot node like button "Log in" maps directly to page.getByRole('button', { name: 'Log in' }), which is exactly the locator the Playwright team recommends for resilient tests.
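The mapping from snapshot node to locator can be sketched in a few lines. This is a minimal illustration, assuming the simplified line format shown above; parseSnapshotLine and toLocator are hypothetical helper names, not part of Playwright or the MCP server, and the server's real snapshot output carries more detail than this regex handles.

```typescript
// Hypothetical helpers: turn one simplified snapshot line into a locator expression.
type SnapshotNode = { role: string; name: string };

function parseSnapshotLine(line: string): SnapshotNode | null {
  // Matches lines such as: - button "Log in"   or   - heading "Sign in" [level=1]
  const match = line.trim().match(/^-\s*(\w+)\s+"([^"]*)"/);
  return match ? { role: match[1], name: match[2] } : null;
}

function toLocator(node: SnapshotNode): string {
  // Escape backslashes and single quotes so the name stays a valid string literal.
  const escaped = node.name.replace(/\\/g, "\\\\").replace(/'/g, "\\'");
  return `page.getByRole('${node.role}', { name: '${escaped}' })`;
}

const loginButton = parseSnapshotLine('- button "Log in"');
console.log(loginButton ? toLocator(loginButton) : "no match");
// prints: page.getByRole('button', { name: 'Log in' })
```

Lines without a quoted accessible name, such as the document root, return null and produce no locator.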
Prerequisites
Before configuring the server, make sure you have the following in place:
- Node.js 18 or newer, since the server ships as an npm package run through npx.
- An MCP client, such as Claude Desktop, Claude Code, or Cursor.
- A running app to test, either locally (for example http://localhost:3000) or a deployed staging URL.
- Basic Playwright familiarity. If you have never set Playwright up, read our end-to-end testing with Playwright guide first.
You do not need to install Playwright browsers separately for the MCP server itself. On first run, it downloads the Chromium build it needs.
Step 1: Add the Playwright MCP Server to Your Client
The server runs as a local process that your MCP client launches over stdio. Configuration is a small JSON block. For Claude Desktop, open the config file (claude_desktop_config.json) and add the server:
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}
For Claude Code, you can register it from the terminal instead, which writes the same configuration for you:
# Register the Playwright MCP server for the current project
claude mcp add playwright -- npx @playwright/mcp@latest
# Verify it connected
claude mcp list
# The playwright server should be listed as connected (exact output varies by client version)
After saving the config, restart the client. The assistant now has access to browser tools such as browser_navigate, browser_click, browser_type, and browser_snapshot. You did not write a single line of automation code to get here.
Step 2: Configure Headless, Viewport, and Origin Limits
The defaults work, but production use benefits from a few flags. You pass them as additional args. For example, to run headless in CI and restrict the browser to your own domains, configure it like this:
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": [
        "@playwright/mcp@latest",
        "--headless",
        "--browser=chromium",
        "--viewport-size=1280,720",
        "--allowed-origins=http://localhost:3000;https://staging.example.com"
      ]
    }
  }
}
A few of these deserve explanation. The --allowed-origins flag is a guardrail: it prevents the assistant from navigating off to arbitrary sites, which matters when you let an AI control a browser autonomously. Meanwhile, --headless is what you want in CI, whereas leaving it off opens a visible window that is genuinely useful while you watch the assistant explore a flow for the first time.
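To make the guardrail concrete, here is a minimal sketch of what an origin allowlist check amounts to. It is illustrative only and does not reproduce the server's actual code; parseAllowedOrigins and isNavigationAllowed are hypothetical names, and the only assumption taken from the config is that the flag value is a semicolon-separated list of origins.

```typescript
// Illustrative sketch of an origin allowlist check, not the server's implementation.
function parseAllowedOrigins(flagValue: string): Set<string> {
  // Split the semicolon-separated flag value and normalize each entry to its origin.
  return new Set(
    flagValue
      .split(";")
      .map((entry) => entry.trim())
      .filter((entry) => entry.length > 0)
      .map((entry) => new URL(entry).origin)
  );
}

function isNavigationAllowed(targetUrl: string, allowed: Set<string>): boolean {
  // Only scheme, host, and port are compared; path and query are ignored.
  return allowed.has(new URL(targetUrl).origin);
}

const allowed = parseAllowedOrigins("http://localhost:3000;https://staging.example.com");
console.log(isNavigationAllowed("http://localhost:3000/login", allowed)); // true
console.log(isNavigationAllowed("https://example.org/", allowed)); // false
```

Because the comparison is by origin, a different port or scheme on the same host counts as a different site, which is why both entries in the config spell out the scheme and port explicitly.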
By default the server uses the snapshot (accessibility) mode described earlier. If you specifically need visual reasoning, such as testing a canvas element, add --vision to switch to screenshot-based interaction. Flag names have changed between releases, so confirm the current spelling with npx @playwright/mcp@latest --help before relying on it. For most form-and-button web apps, snapshot mode is faster and more reliable, so keep it as the default.
Step 3: Explore a Flow in Plain Language
With the server connected, you drive it through conversation against your own running app. Suppose you want to test a login flow. Pointing at your local dev URL with a seeded test account, you might prompt (swap in your real login URL and test credentials):
Open my app’s login page at <your-local-url>/login, sign in with the seeded test account, and confirm the dashboard heading appears.
Behind the scenes, the assistant runs a sequence of tool calls:
- browser_navigate to the login URL
- browser_snapshot to read the form structure
- browser_type into the email and password textboxes
- browser_click on the “Log in” button
- browser_snapshot again to verify the dashboard rendered
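Written out as plain data, the sequence above might look like the following sketch. The tool names come from the steps listed; the argument shapes, the element descriptions, and the ref values such as "e3" are illustrative assumptions, since real references are issued by the server in each snapshot.

```typescript
// A sketch of the login flow as a list of tool calls. Argument shapes and refs are assumed.
type ToolCall = { tool: string; args: Record<string, string> };

function loginFlowCalls(baseUrl: string, email: string, password: string): ToolCall[] {
  return [
    { tool: "browser_navigate", args: { url: `${baseUrl}/login` } },
    { tool: "browser_snapshot", args: {} }, // read the form structure
    { tool: "browser_type", args: { element: "Email textbox", ref: "e3", text: email } },
    { tool: "browser_type", args: { element: "Password textbox", ref: "e4", text: password } },
    { tool: "browser_click", args: { element: "Log in button", ref: "e5" } },
    { tool: "browser_snapshot", args: {} }, // verify the dashboard rendered
  ];
}

for (const call of loginFlowCalls("http://localhost:3000", "test@example.com", "hunter2")) {
  console.log(call.tool, JSON.stringify(call.args));
}
```

The two snapshot calls bracket the actions: the first supplies the references the later calls act on, and the second is what the assistant reads to confirm the outcome.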
Crucially, it reports each step and what it observed. If the password field had a different accessible label than expected, you would see that immediately instead of discovering it as a failing assertion three hours later. This live feedback loop is the real productivity gain.
This style of autonomous browser control is closely related to Claude’s computer use, but scoped to the browser and driven by structured snapshots rather than screenshots, which makes it both cheaper and more deterministic.
Step 4: Generate a Maintainable Playwright Test
Exploration is useful, but the goal is a committed test. Once the flow works, ask the assistant to turn the session into a Playwright spec:
Generate a Playwright test for that login flow. Use role-based locators, add a web-first assertion for the dashboard, and follow the Arrange-Act-Assert structure.
A good result looks like production code, not a recording dump:
import { test, expect } from '@playwright/test';
test('user can log in and reach the dashboard', async ({ page }) => {
  // Arrange: start from a clean login page
  await page.goto('/login');
  // Act: complete the sign-in form
  await page.getByLabel('Email').fill('test@example.com');
  await page.getByLabel('Password').fill('hunter2');
  await page.getByRole('button', { name: 'Log in' }).click();
  // Assert: web-first assertion auto-waits for the dashboard
  await expect(
    page.getByRole('heading', { name: 'Dashboard' })
  ).toBeVisible();
});
Notice why this code is solid. It uses getByLabel and getByRole, so a class rename will not break it. It relies on expect(...).toBeVisible(), which is a web-first assertion that auto-waits instead of needing a hardcoded waitForTimeout. These are exactly the patterns that prevent flaky suites, and the assistant chose them because the snapshot gave it real roles to work with.
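To see why auto-waiting beats a fixed sleep, consider this conceptual sketch of a retrying check. It is not Playwright's implementation, which is considerably more sophisticated; pollUntil is a hypothetical helper that only illustrates the idea of re-checking a condition until it holds or a timeout elapses.

```typescript
// Conceptual sketch only: retry a condition until it passes or the timeout expires.
async function pollUntil(
  check: () => boolean | Promise<boolean>,
  timeoutMs = 5000,
  intervalMs = 100
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    if (await check()) return true; // condition met: stop immediately
    if (Date.now() >= deadline) return false; // give up once the timeout elapses
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Simulated element that becomes "visible" on the third check.
let checks = 0;
pollUntil(() => ++checks >= 3, 1000, 10).then((visible) => {
  console.log(`visible=${visible} after ${checks} checks`);
});
```

A fixed waitForTimeout always pays the full delay and still fails if the page is slower than the guess, whereas a retrying check returns as soon as the condition holds.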
Always review generated tests before committing. The assistant produces a strong first draft, yet you still own correctness, edge cases, and whether the assertion actually proves what you care about.
Step 5: Wire Generated Tests Into CI
Generated specs are ordinary Playwright tests, so they run wherever your suite already runs. You do not need the MCP server in CI at all; it is an authoring tool, not a runtime dependency. A typical GitHub Actions step stays unchanged:
# .github/workflows/e2e.yml
- name: Install Playwright browsers
  run: npx playwright install --with-deps chromium
- name: Run E2E tests
  run: npx playwright test
  env:
    BASE_URL: http://localhost:3000
Keep a clean separation in your head: the Playwright MCP server helps you write and debug tests interactively, while playwright test executes the committed suite deterministically. Mixing the two, for instance trying to run the MCP server inside CI to “test live,” is a common mistake that adds nondeterminism for no benefit.
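One detail connects the generated spec to the workflow step: the test calls page.goto('/login') with a relative path, so the runner needs a base URL. A minimal playwright.config.ts along these lines would read the BASE_URL variable set in the workflow. Treat it as a sketch under assumptions: the ./tests directory, the retry count, and the single Chromium project are illustrative choices, not requirements from this tutorial.

```typescript
// playwright.config.ts (sketch; directory name and retry policy are assumptions)
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  testDir: './tests',
  // Retry only in CI so local failures surface immediately.
  retries: process.env.CI ? 2 : 0,
  use: {
    // Relative paths such as page.goto('/login') resolve against this URL.
    baseURL: process.env.BASE_URL ?? 'http://localhost:3000',
    // Record a trace on the first retry to help debug CI-only failures.
    trace: 'on-first-retry',
  },
  projects: [
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
  ],
});
```

With this in place, the same spec runs against local, staging, or CI targets by changing only the environment variable.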
When to Use the Playwright MCP Server
- You are authoring new E2E tests and want stable, role-based locators chosen from real page structure.
- You are debugging a flaky test and need to watch exactly what the browser does step by step.
- You are exploring an unfamiliar app or a teammate’s feature and want to map out user flows quickly.
- You want to generate a first-draft regression test for a bug before fixing it.
When NOT to Use the Playwright MCP Server
- For your CI run itself; commit the generated playwright test specs and run those instead.
- For pure visual or pixel-perfect checks, where dedicated visual regression tools like Percy or Chromatic fit better.
- For load or performance testing, since browser automation is the wrong layer for that work.
- In any environment where an AI driving a browser against production data is unacceptable; restrict it to local and staging origins.
Common Mistakes with the Playwright MCP Server
- Committing AI-generated tests without review, which lets weak or tautological assertions slip in.
- Leaving --allowed-origins unset, so the assistant can wander to external sites during a session.
- Forcing --vision mode by default, which is slower and less deterministic than snapshot mode for standard web UIs.
- Treating the server as a test runner and depending on it in CI instead of the standard Playwright CLI.
A Realistic Scenario: Onboarding a New Feature’s Tests
Consider a small team shipping a multi-step signup wizard in a mid-sized SaaS app. The feature works, but it landed without E2E coverage, and the engineer who built it is out for the week. A teammate needs to add regression tests against a flow they have never seen.
Rather than reverse-engineering selectors from the component source, they point the Playwright MCP server at the staging URL and ask the assistant to walk the wizard end to end. Over a few minutes, the assistant navigates each step, reports the accessible labels it finds, and surfaces a real problem: the “Continue” button on step two shares an accessible name with a footer link, so a text-based locator such as getByText('Continue') would match both elements and fail Playwright's strict mode. Knowing that up front, the teammate adds a data-testid to disambiguate, then has the assistant generate three specs covering the happy path and two validation branches. The trade-off is clear and worth naming: the AI accelerates discovery and drafting, but a human still decides which branches matter and verifies that each assertion proves something real.
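The duplicate-name problem in this scenario is easy to check for mechanically. The sketch below flags accessible names that appear on more than one node; findAmbiguousNames is a hypothetical helper, and the node list, including the "Your details" heading, is invented sample data standing in for a parsed snapshot of step two.

```typescript
// Hypothetical pre-check: report accessible names shared by two or more snapshot nodes.
type NamedNode = { role: string; name: string };

function findAmbiguousNames(nodes: NamedNode[]): Map<string, string[]> {
  const rolesByName = new Map<string, string[]>();
  for (const node of nodes) {
    const roles = rolesByName.get(node.name) ?? [];
    roles.push(node.role);
    rolesByName.set(node.name, roles);
  }
  // Keep only the names that occur more than once.
  const ambiguous = new Map<string, string[]>();
  rolesByName.forEach((roles, accessibleName) => {
    if (roles.length > 1) ambiguous.set(accessibleName, roles);
  });
  return ambiguous;
}

// Invented sample data for step two of the wizard.
const stepTwo: NamedNode[] = [
  { role: "heading", name: "Your details" },
  { role: "button", name: "Continue" },
  { role: "link", name: "Continue" },
];

findAmbiguousNames(stepTwo).forEach((roles, accessibleName) => {
  console.log(`"${accessibleName}" is shared by: ${roles.join(", ")}`);
});
// prints: "Continue" is shared by: button, link
```

When the shared name spans different roles, as here, a role-based locator still tells the elements apart; the check matters most for text-based locators or for two elements with the same role and name.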
Conclusion
The Playwright MCP server turns test authoring from a manual translation chore into a guided conversation. Because it reads the accessibility tree instead of screenshots, it produces deterministic actions and role-based locators that survive refactors, and it shows you problems while you explore rather than after a CI failure. Set it up once in your MCP client, restrict its origins, and use it to draft tests, then commit and run those specs with the standard Playwright CLI.
To go deeper, pair this workflow with our end-to-end Playwright setup and patterns guide, and if you run other tools through the same client, see how to manage MCP servers in Claude Code. Start by generating one test for your most fragile flow, review it carefully, and add it to your suite today.