Mastra: TypeScript AI Framework Setup and Patterns

If you build AI agents in TypeScript today, you are probably stitching together the Vercel AI SDK, a vector store client, a workflow engine, and your own eval harness. Mastra bundles all of that into a single TypeScript framework with a local playground, durable workflows, and first-class types. This guide walks through setup, the core primitives, and the patterns that hold up in production.

The post assumes you have shipped at least one LLM feature before and you are comfortable with async TypeScript. By the end, you will have a Mastra project running locally with an agent, two tools, a multi-step workflow, RAG over a small corpus, and a deployable build.

What Is Mastra?

Mastra is an open-source TypeScript framework for building AI agents and workflows. It was created by the team behind Gatsby and sits on top of the Vercel AI SDK, which means it works with OpenAI, Anthropic, Google, Groq, and any other provider the SDK supports. Unlike Python frameworks such as LangGraph or CrewAI, Mastra is designed for the JavaScript ecosystem from day one. Types, IDE autocompletion, and deployment to Vercel, Cloudflare Workers, and Hono-based servers are not afterthoughts.

The framework gives you five primary primitives. Agents are LLM calls with tools, memory, and a system prompt. Tools are typed functions the agent can invoke. Workflows are durable, branching state machines that survive process restarts. RAG wraps embeddings, chunking, and vector store reads in one API. Evals score agent output against metrics you define.

Each piece can be used independently. You can run a Mastra agent inside a Next.js API route without touching workflows, or use the workflow engine to orchestrate calls to non-Mastra services. That flexibility is one reason teams adopt it over heavier orchestrators.

Setting Up Your First Mastra Project

The fastest way to start is the official scaffold. It generates a TypeScript project, installs dependencies, and wires up the local playground.

npx create-mastra@latest my-mastra-app
cd my-mastra-app

The CLI asks which provider to configure (OpenAI, Anthropic, Groq, or Google), which example to include (agent, workflow, or both), and whether to enable telemetry. For this walkthrough, pick Anthropic and the agent example. Drop your API key into .env:

ANTHROPIC_API_KEY=sk-ant-...
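A missing or blank key usually surfaces later as an opaque provider error, so it can help to check for it at startup. The following is a minimal sketch in plain TypeScript; requireEnv and the Env type are hypothetical helpers for illustration, not part of Mastra.

```typescript
// Hypothetical helper, not part of Mastra: read a required variable from an
// env-like object and fail with a clear message when it is missing or blank.
type Env = Record<string, string | undefined>;

function requireEnv(env: Env, key: string): string {
  const value = env[key];
  if (value === undefined || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${key}`);
  }
  return value;
}

// In a real project you would pass process.env instead of this example object.
const exampleEnv: Env = { ANTHROPIC_API_KEY: 'sk-ant-example' };
const anthropicKey = requireEnv(exampleEnv, 'ANTHROPIC_API_KEY');
```

Passing the environment in as an argument keeps the helper easy to test without touching the real process environment.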

Now start the dev server:

npm run dev

The Mastra playground opens at http://localhost:4111. It is a browser UI for chatting with agents, replaying workflow runs, inspecting tool calls, and editing the system prompt without restarting the process. For teams used to Python notebooks, this is the closest TypeScript equivalent for iterating on prompts.

The generated project structure looks like this:

src/
  mastra/
    index.ts          # Mastra instance, wires everything together
    agents/
      index.ts        # Agent definitions
    tools/
      index.ts        # Tool definitions
    workflows/
      index.ts        # Workflow definitions

The src/mastra/index.ts file is the entry point. Every agent, tool, and workflow you build gets registered here so the playground can find them.

Building Your First Agent

A Mastra agent is a configuration object, not a class hierarchy. You declare the model, system prompt, tools, and memory in one place. The framework handles the call loop, tool execution, and response formatting.

Here is a minimal customer-support agent in src/mastra/agents/support.ts:

import { Agent } from '@mastra/core/agent';
import { anthropic } from '@ai-sdk/anthropic';
import { z } from 'zod';

export const supportAgent = new Agent({
  name: 'support-agent',
  instructions: `You are a customer support agent for a SaaS billing platform.
Use the lookup tool to fetch customer records before answering account questions.
If the customer asks about a refund over $500, hand off to a human by calling escalate.`,
  model: anthropic('claude-sonnet-4-5'),
});

Then register the agent on the Mastra instance in src/mastra/index.ts:

import { Mastra } from '@mastra/core/mastra';
import { supportAgent } from './agents/support';

export const mastra = new Mastra({
  agents: { supportAgent },
});

Restart the dev server and the agent appears in the playground sidebar. You can chat with it directly, see token counts, and copy the underlying prompt that was sent to Anthropic. This visibility matters because most agent bugs come from the system prompt not behaving the way you assumed it would.

To call the agent from code, import the Mastra instance and use generate or stream:

import { mastra } from './mastra';

const result = await mastra.getAgent('supportAgent').generate(
  'My invoice for March looks wrong, customer ID is acc_42',
);

console.log(result.text);

The stream method returns a streaming result that plugs into Next.js Server Actions or any SSE endpoint. For an end-to-end pattern on streaming in React, see our building AI chatbots with streaming responses guide, which works the same way under the hood.

Adding Tools to Agents

Tools are how agents touch the outside world. In Mastra, a tool is a typed function with a Zod schema for inputs and outputs. The framework converts the schema to the JSON schema format the LLM expects and validates results before passing them back to the model.

Define a customer lookup tool in src/mastra/tools/customer.ts:

import { createTool } from '@mastra/core/tools';
import { z } from 'zod';

export const lookupCustomer = createTool({
  id: 'lookup-customer',
  description: 'Fetch a customer record by ID. Use before answering account questions.',
  inputSchema: z.object({
    customerId: z.string().describe('The customer account ID, format acc_<id>'),
  }),
  outputSchema: z.object({
    customerId: z.string(),
    plan: z.enum(['free', 'pro', 'enterprise']),
    monthlySpend: z.number(),
    status: z.enum(['active', 'past_due', 'cancelled']),
  }),
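  execute: async ({ context }) => {
    const { customerId } = context;
    // `db` is your own data-access client (for example a Prisma instance); Mastra does not provide it.
    const customer = await db.customers.findUnique({ where: { id: customerId } });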
    if (!customer) {
      throw new Error(`Customer ${customerId} not found`);
    }
    return {
      customerId: customer.id,
      plan: customer.plan,
      monthlySpend: customer.monthlySpend,
      status: customer.status,
    };
  },
});

Attach the tool to the agent:

import { lookupCustomer } from '../tools/customer';

export const supportAgent = new Agent({
  name: 'support-agent',
  instructions: '...',
  model: anthropic('claude-sonnet-4-5'),
  tools: { lookupCustomer },
});

The agent can now decide on its own when to call lookupCustomer. In the playground, every tool call shows up in the trace with inputs, outputs, and timing. In production, the same traces can flow into Langfuse or LangSmith once you wire up an exporter. Our getting started with Claude API guide covers the underlying request shape.

One pattern that catches newcomers: tools defined with createTool are reusable across agents and workflows. Build a tools/ library and import the same lookupCustomer into both your support agent and a billing workflow. The schema becomes a contract, which is exactly the kind of guarantee you give up when you write agent code in untyped Python.
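The validate, execute, validate cycle behind that contract can be sketched without any framework. The code below is plain TypeScript and is not Mastra's implementation: defineTool, ToolSpec, and Validator are hypothetical names, and the hand-written validators stand in for Zod schemas.

```typescript
// Hypothetical stand-in for a schema: parse an unknown value or throw.
type Validator<T> = (value: unknown) => T;

interface ToolSpec<I, O> {
  id: string;
  parseInput: Validator<I>;
  parseOutput: Validator<O>;
  execute: (input: I) => unknown; // treated as untrusted until parsed
}

// Wrap a tool so inputs are checked before execute and outputs after it.
function defineTool<I, O>(spec: ToolSpec<I, O>): (raw: unknown) => O {
  return (raw) => spec.parseOutput(spec.execute(spec.parseInput(raw)));
}

type Plan = 'free' | 'pro' | 'enterprise';
interface LookupInput { customerId: string }
interface LookupOutput { customerId: string; plan: Plan }

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null;
}

function isPlan(value: unknown): value is Plan {
  return value === 'free' || value === 'pro' || value === 'enterprise';
}

const parseLookupInput: Validator<LookupInput> = (value) => {
  const id = isRecord(value) ? value['customerId'] : undefined;
  if (typeof id !== 'string') throw new Error('customerId must be a string');
  return { customerId: id };
};

const parseLookupOutput: Validator<LookupOutput> = (value) => {
  const id = isRecord(value) ? value['customerId'] : undefined;
  const plan = isRecord(value) ? value['plan'] : undefined;
  if (typeof id !== 'string') throw new Error('customerId must be a string');
  if (!isPlan(plan)) throw new Error('plan must be free, pro, or enterprise');
  return { customerId: id, plan };
};

// In-memory table standing in for a real database.
const fakeCustomers: Record<string, string> = { acc_42: 'pro' };

const lookupCustomerSketch = defineTool<LookupInput, LookupOutput>({
  id: 'lookup-customer-sketch',
  parseInput: parseLookupInput,
  parseOutput: parseLookupOutput,
  execute: (input) => ({
    customerId: input.customerId,
    plan: fakeCustomers[input.customerId],
  }),
});
```

Because the wrapper owns both checks, any caller (an agent, a workflow step, or a unit test) gets the same guarantees from the same definition.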

Workflows for Multi-Step Orchestration

Agents are great when the LLM should decide what to do next. Workflows are for when you already know the steps and want them to run reliably even if a node crashes mid-execution.

A Mastra workflow is a graph of typed steps. Each step has an input schema, an output schema, and an execute function. Steps can run sequentially, in parallel, or conditionally. The runtime persists state between steps, so a workflow can suspend on a human approval and resume hours later without losing context.
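To make the suspend-and-resume idea concrete, here is a conceptual sketch in plain TypeScript. It is not how Mastra's runtime is implemented; Snapshot, SketchStep, and runFrom are hypothetical names, and a string variable stands in for durable storage.

```typescript
interface RefundState { amount: number; approved?: boolean; refundId?: string }

type StepOutcome =
  | { kind: 'next'; state: RefundState }
  | { kind: 'suspend'; state: RefundState };

type SketchStep = (state: RefundState) => StepOutcome;

// What gets persisted: which step runs next and the data it will receive.
interface Snapshot { stepIndex: number; state: RefundState }

// Run steps from a snapshot, saving a serialized snapshot after every step so
// a different process could continue from the last saved point.
function runFrom(
  steps: readonly SketchStep[],
  start: Snapshot,
  save: (json: string) => void,
): { snapshot: Snapshot; suspended: boolean } {
  let state = start.state;
  for (let i = start.stepIndex; i < steps.length; i++) {
    const step = steps[i];
    if (step === undefined) break;
    const outcome = step(state);
    if (outcome.kind === 'suspend') {
      const paused: Snapshot = { stepIndex: i, state: outcome.state };
      save(JSON.stringify(paused));
      return { snapshot: paused, suspended: true };
    }
    state = outcome.state;
    save(JSON.stringify({ stepIndex: i + 1, state }));
  }
  return { snapshot: { stepIndex: steps.length, state }, suspended: false };
}

// Step 1 pauses for a human decision on large refunds; step 2 issues the refund.
const checkLimit: SketchStep = (state) => {
  if (state.approved === undefined && state.amount > 500) {
    return { kind: 'suspend', state };
  }
  return { kind: 'next', state: { ...state, approved: state.approved ?? true } };
};

const issueRefund: SketchStep = (state) => ({
  kind: 'next',
  state: state.approved ? { ...state, refundId: 're_sketch' } : state,
});

const refundSteps: readonly SketchStep[] = [checkLimit, issueRefund];

let storedSnapshot = '';
const firstRun = runFrom(
  refundSteps,
  { stepIndex: 0, state: { amount: 900 } },
  (json) => { storedSnapshot = json; },
);

// Later, possibly in another process: load the snapshot, record the approval,
// and continue from the step that suspended.
const loaded = JSON.parse(storedSnapshot) as Snapshot;
loaded.state.approved = true;
const secondRun = runFrom(refundSteps, loaded, (json) => { storedSnapshot = json; });
```

The point of the sketch is that resumability comes from serializing the step position together with the step input, not from keeping a process alive.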

Here is a refund workflow with three steps — lookup, approval check, and process — in src/mastra/workflows/refund.ts:

import { createWorkflow, createStep } from '@mastra/core/workflows';
import { z } from 'zod';

const lookupStep = createStep({
  id: 'lookup',
  inputSchema: z.object({ customerId: z.string(), amount: z.number() }),
  outputSchema: z.object({ plan: z.string(), monthlySpend: z.number(), amount: z.number() }),
  execute: async ({ inputData }) => {
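    // `db` is your own data-access client; Mastra does not provide it.
    const customer = await db.customers.findUnique({
      where: { id: inputData.customerId },
    });
    // Fail the step explicitly instead of crashing on a null customer below.
    if (!customer) {
      throw new Error(`Customer ${inputData.customerId} not found`);
    }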
    return {
      plan: customer.plan,
      monthlySpend: customer.monthlySpend,
      amount: inputData.amount,
    };
  },
});

const approvalStep = createStep({
  id: 'approval',
  inputSchema: z.object({ plan: z.string(), monthlySpend: z.number(), amount: z.number() }),
  outputSchema: z.object({ approved: z.boolean(), reason: z.string() }),
  execute: async ({ inputData }) => {
    const autoApprove = inputData.amount <= 500 && inputData.plan !== 'free';
    return {
      approved: autoApprove,
      reason: autoApprove ? 'Within auto-approval limit' : 'Requires manual review',
    };
  },
});

const processStep = createStep({
  id: 'process',
  inputSchema: z.object({ approved: z.boolean(), reason: z.string() }),
  outputSchema: z.object({ refundId: z.string().nullable() }),
  execute: async ({ inputData }) => {
    if (!inputData.approved) return { refundId: null };
    // `stripe` is an initialized Stripe client; the refund parameters are elided here.
    const refund = await stripe.refunds.create({ /* ... */ });
    return { refundId: refund.id };
  },
});

export const refundWorkflow = createWorkflow({
  id: 'refund-workflow',
  inputSchema: z.object({ customerId: z.string(), amount: z.number() }),
  outputSchema: z.object({ refundId: z.string().nullable() }),
})
  .then(lookupStep)
  .then(approvalStep)
  .then(processStep)
  .commit();

Register the workflow on the Mastra instance (workflows: { refundWorkflow }), then start a run:

const run = await mastra.getWorkflow('refundWorkflow').createRun();
const result = await run.start({
  inputData: { customerId: 'acc_42', amount: 250 },
});

Because every step has an explicit schema, refactoring is cheap. Rename a field in lookupStep and TypeScript flags every downstream step that depends on it. That is the kind of safety net Python workflow engines like LangGraph or CrewAI cannot offer without significant boilerplate — for a side-by-side look, our LangGraph stateful cyclic agents post covers the equivalent Python patterns.
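One way to keep steps like approvalStep easy to test is to pull the decision rule into a pure function and call it from execute. The sketch below reuses the rule from the workflow above; decideRefund and AUTO_APPROVE_LIMIT are hypothetical names introduced for illustration.

```typescript
interface ApprovalDecision {
  approved: boolean;
  reason: string;
}

// Same threshold as the approval step above.
const AUTO_APPROVE_LIMIT = 500;

// Pure function: no I/O, so it can be unit tested without a workflow runtime.
function decideRefund(plan: string, amount: number): ApprovalDecision {
  const autoApprove = amount <= AUTO_APPROVE_LIMIT && plan !== 'free';
  return {
    approved: autoApprove,
    reason: autoApprove ? 'Within auto-approval limit' : 'Requires manual review',
  };
}

const proDecision = decideRefund('pro', 250);
const freeDecision = decideRefund('free', 250);
```

With the rule isolated, the step's execute function reduces to a single call, and boundary cases such as a refund of exactly 500 can be covered by ordinary unit tests.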

RAG with Mastra

Mastra ships a built-in RAG pipeline for documents you want the agent to query. The pipeline has three parts: chunk the source, embed each chunk, and store the embeddings in a vector store. At query time, Mastra retrieves the top-k chunks and injects them into the agent context.

Here is a minimal RAG setup using OpenAI embeddings and pgvector:

import { MDocument } from '@mastra/rag';
import { embedMany } from 'ai';
import { openai } from '@ai-sdk/openai';
import { PgVector } from '@mastra/pg';

const doc = MDocument.fromText(productDocsText);
const chunks = await doc.chunk({
  strategy: 'recursive',
  size: 512,
  overlap: 50,
});

const { embeddings } = await embedMany({
  model: openai.embedding('text-embedding-3-small'),
  values: chunks.map((c) => c.text),
});

const store = new PgVector({ connectionString: process.env.DATABASE_URL! });
await store.upsert({
  indexName: 'product-docs',
  vectors: embeddings,
  metadata: chunks.map((c) => ({ text: c.text, source: c.metadata?.source })),
});
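The size and overlap options used above are easier to reason about with a small example. The sketch below is a plain character-based sliding window, which is simpler than the recursive strategy; chunkWithOverlap is a hypothetical helper, not a Mastra API.

```typescript
// Split text into windows of `size` characters where each window starts
// `size - overlap` characters after the previous one.
function chunkWithOverlap(text: string, size: number, overlap: number): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error('Expected size > 0 and 0 <= overlap < size');
  }
  const chunks: string[] = [];
  const stride = size - overlap;
  for (let start = 0; start < text.length; start += stride) {
    chunks.push(text.slice(start, start + size));
    // Stop once a window has reached the end of the text.
    if (start + size >= text.length) break;
  }
  return chunks;
}

const sampleChunks = chunkWithOverlap('abcdefghij', 4, 1);
// sampleChunks is ['abcd', 'defg', 'ghij']
```

Each chunk repeats the last character of the previous one, which is the overlap at work: a sentence that straddles a boundary still appears intact in at least one chunk when the overlap is large enough.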

To query, build a tool that searches the vector store and returns the top matches:

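import { createTool } from '@mastra/core/tools';
import { embed } from 'ai';
import { z } from 'zod';
// `openai` and `store` are the same values created in the indexing snippet above.

export const searchDocs = createTool({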
  id: 'search-docs',
  description: 'Search product documentation for relevant context.',
  inputSchema: z.object({ query: z.string() }),
  outputSchema: z.object({
    results: z.array(z.object({ text: z.string(), score: z.number() })),
  }),
  execute: async ({ context }) => {
    const { embedding } = await embed({
      model: openai.embedding('text-embedding-3-small'),
      value: context.query,
    });
    const results = await store.query({
      indexName: 'product-docs',
      queryVector: embedding,
      topK: 4,
    });
    return {
      results: results.map((r) => ({ text: r.metadata.text, score: r.score })),
    };
  },
});

Attach searchDocs to your agent and it will consult the documentation before answering. For deeper coverage of when each chunking strategy works best, see our guide on RAG chunking strategies. For an end-to-end pgvector setup, the pgvector with Postgres for RAG post walks through the schema and index choices Mastra uses under the hood.
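What a top-k vector query computes can be shown with a brute-force version. This is a conceptual sketch only: a real vector store does the ranking inside the database, usually with an index, and the names cosineSimilarity, topKChunks, and StoredChunk are hypothetical.

```typescript
interface StoredChunk { text: string; vector: number[] }
interface ScoredChunk { text: string; score: number }

// Cosine similarity: dot product divided by the product of vector lengths.
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length) throw new Error('Vectors must have equal length');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk against the query and keep the k best matches.
function topKChunks(
  queryVector: readonly number[],
  chunks: readonly StoredChunk[],
  k: number,
): ScoredChunk[] {
  return chunks
    .map((chunk) => ({
      text: chunk.text,
      score: cosineSimilarity(queryVector, chunk.vector),
    }))
    .sort((left, right) => right.score - left.score)
    .slice(0, k);
}

// Toy two-dimensional vectors; real embeddings have hundreds of dimensions.
const toyChunks: StoredChunk[] = [
  { text: 'plans', vector: [0, 1] },
  { text: 'refunds', vector: [1, 0] },
  { text: 'invoices', vector: [1, 1] },
];
const bestMatches = topKChunks([1, 0], toyChunks, 2);
```

Here the query points in the same direction as the refunds vector, so that chunk ranks first, followed by invoices, which shares one of the two directions.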

Memory and Evals

Mastra has a built-in memory module that handles conversation history, working memory across turns, and semantic recall over past interactions. Enable it on an agent by passing a memory instance:

import { Memory } from '@mastra/memory';
import { PostgresStore, PgVector } from '@mastra/pg';
import { openai } from '@ai-sdk/openai';

const memory = new Memory({
  storage: new PostgresStore({ connectionString: process.env.DATABASE_URL! }),
  vector: new PgVector({ connectionString: process.env.DATABASE_URL! }),
  embedder: openai.embedding('text-embedding-3-small'),
  options: {
    lastMessages: 10,
    semanticRecall: { topK: 3, messageRange: 2 },
  },
});

export const supportAgent = new Agent({
  name: 'support-agent',
  instructions: '...',
  model: anthropic('claude-sonnet-4-5'),
  tools: { lookupCustomer, searchDocs },
  memory,
});

When you call generate, pass a resourceId and threadId so Mastra knows which conversation to load:

const result = await mastra.getAgent('supportAgent').generate(
  'Did we resolve that refund I asked about yesterday?',
  { resourceId: 'user_42', threadId: 'thread_99' },
);
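The lastMessages and semanticRecall options shown earlier are easier to picture with indices. The sketch below illustrates one plausible reading of those options, namely that each recalled message brings messageRange neighbors on either side; it is not Mastra's internal logic, and selectContextIndices is a hypothetical helper.

```typescript
// Given a thread of `total` messages, return the indices that would be loaded:
// the most recent `lastMessages`, plus every semantic match with
// `messageRange` neighbors before and after it.
function selectContextIndices(
  total: number,
  lastMessages: number,
  matchIndices: readonly number[],
  messageRange: number,
): number[] {
  const picked = new Set<number>();
  for (let i = Math.max(0, total - lastMessages); i < total; i++) {
    picked.add(i);
  }
  for (const match of matchIndices) {
    const from = Math.max(0, match - messageRange);
    const to = Math.min(total - 1, match + messageRange);
    for (let i = from; i <= to; i++) {
      picked.add(i);
    }
  }
  // Return in chronological order with duplicates removed.
  return Array.from(picked).sort((a, b) => a - b);
}

// 20 messages, keep the last 3, one semantic match at index 5, range of 2.
const contextIndices = selectContextIndices(20, 3, [5], 2);
// contextIndices is [3, 4, 5, 6, 7, 17, 18, 19]
```

Seen this way, raising topK or messageRange grows the prompt quickly, so both are worth tuning against your token budget.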

Evals close the loop. Mastra ships metrics for hallucination, answer relevance, toxicity, and content similarity, plus a custom scorer interface for your own checks. Attach evals to an agent and they run automatically against every response:

import { HallucinationMetric, AnswerRelevancyMetric } from '@mastra/evals/llm';

export const supportAgent = new Agent({
  // ...
  evals: {
    hallucination: new HallucinationMetric(anthropic('claude-sonnet-4-5'), {
      context: ['...known facts about the customer...'],
    }),
    relevance: new AnswerRelevancyMetric(anthropic('claude-sonnet-4-5')),
  },
});

Scores appear in the playground after each run and can be exported to your CI pipeline. Treat these like unit tests for prompts — they catch regressions when you tweak system instructions or swap models.
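A CI gate over eval scores can be a few lines of plain TypeScript. The sketch below assumes scores are numbers between 0 and 1 and that a lower hallucination score is better while a higher relevance score is better; those assumptions, the threshold values, and the names findEvalFailures and Threshold are illustrative rather than part of Mastra.

```typescript
interface Threshold {
  limit: number;
  direction: 'atLeast' | 'atMost';
}

// Compare reported scores to thresholds and describe every violation.
function findEvalFailures(
  scores: Record<string, number>,
  thresholds: Record<string, Threshold>,
): string[] {
  const failures: string[] = [];
  for (const metric of Object.keys(thresholds)) {
    const threshold = thresholds[metric];
    if (threshold === undefined) continue;
    const score = scores[metric];
    if (score === undefined) {
      failures.push(`${metric}: no score reported`);
      continue;
    }
    const passed =
      threshold.direction === 'atLeast'
        ? score >= threshold.limit
        : score <= threshold.limit;
    if (!passed) {
      failures.push(`${metric}: ${score} violates ${threshold.direction} ${threshold.limit}`);
    }
  }
  return failures;
}

// Assumed thresholds for illustration only.
const evalThresholds: Record<string, Threshold> = {
  hallucination: { limit: 0.2, direction: 'atMost' },
  relevance: { limit: 0.7, direction: 'atLeast' },
};

const evalFailures = findEvalFailures(
  { hallucination: 0.4, relevance: 0.9 },
  evalThresholds,
);
```

In a CI script you would print the failures and exit with a non-zero code when the list is not empty, which is what turns a prompt regression into a failed build.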

Real-World Scenario

Picture a mid-sized B2B SaaS team adding a billing copilot to their internal admin dashboard. Their stack is Next.js on Vercel with a Postgres database. They want an agent that can answer customer-specific questions, run refund workflows with human approval, and search a 400-page internal runbook.

Without Mastra, the team would stitch together the Vercel AI SDK for the chat loop, a custom state machine for the refund process, a separate RAG service for the runbook, and a homegrown eval script. Each piece needs its own tracing, its own deploy story, and its own typing layer.

With Mastra, the same team scaffolds a project with npx create-mastra, defines one agent with three tools, builds the refund workflow with createWorkflow, indexes the runbook with the RAG helpers, and deploys to Vercel as a Next.js API route. The local playground replaces ad-hoc Postman scripts during development. The Zod schemas catch most refactor breaks at compile time. The eval scores show up in the CI dashboard for every PR that touches the agent.

The trade-off is lock-in. If the team later wants to move the workflow engine to Temporal or the agent loop to LangGraph, they will rewrite. For a one-team product, that is usually a fair trade for the velocity.

When to Use Mastra

Your stack is TypeScript and you want types across agents, tools, and workflows
You need agents, workflows, RAG, and evals in one framework rather than four
You want a local playground for prompt iteration without spinning up infrastructure
You deploy to Vercel, Cloudflare Workers, or any Node/Hono runtime
You are early enough in the project that adopting a framework is cheaper than building one

When NOT to Use Mastra

Your team is Python-first and your existing tooling is LangGraph, CrewAI, or LlamaIndex
You need a battle-tested durable workflow engine like Temporal or Inngest for non-AI workloads too
You only need a single agent call with no tools, memory, or workflow — the raw Vercel AI SDK is lighter
You require fine-grained control over the call loop that an opinionated framework cannot provide
Your AI logic is tightly coupled to a specific provider’s SDK features that Mastra does not surface

Common Mistakes with Mastra

Skipping Zod schemas on tool inputs, which removes the type safety that is the whole reason to pick Mastra over plain SDK code
Treating workflows like agents — if the LLM does not need to choose the next step, a workflow is the safer abstraction
Ignoring the playground during development and debugging only through console logs, which slows iteration significantly
Forgetting to register agents or workflows in the Mastra instance, leading to confusing “not found” errors at runtime
Running evals only locally — wire them into CI so prompt regressions fail builds the same way unit tests do
Storing memory and vectors in separate databases when one Postgres instance with pgvector can hold both

Deploying Mastra

Mastra deploys cleanly to Vercel, Cloudflare Workers, AWS Lambda, and any Node host. For Vercel, the framework provides an adapter that turns your agents and workflows into serverless functions:

npm install @mastra/deployer-vercel

Then configure the deployer on the Mastra instance:

import { Mastra } from '@mastra/core/mastra';
import { VercelDeployer } from '@mastra/deployer-vercel';

export const mastra = new Mastra({
  agents: { supportAgent },
  workflows: { refundWorkflow },
  deployer: new VercelDeployer({
    teamSlug: 'your-team',
    projectName: 'mastra-billing',
    token: process.env.VERCEL_TOKEN!,
  }),
});

Run mastra deploy and the framework bundles your project, uploads it, and wires up the routes. For Cloudflare, swap the deployer and you are done — the same agent code runs on the edge with no rewrites.

Watch the Postgres connection count if you ship on Vercel without connection pooling. Workflows that suspend and resume can exhaust the pool fast. Use PgBouncer or Neon’s pooled connection string to avoid surprise 500s in production. Our database connection pooling guide covers the safe defaults.
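Alongside an external pooler, it often helps to create at most one pool per connection string in each running instance, so warm invocations reuse connections instead of opening new ones. The sketch below shows that caching pattern in plain TypeScript; PoolLike and createPool are hypothetical stand-ins for a real pool, such as the one from the pg package, and the default of 5 connections is an assumption.

```typescript
// Hypothetical stand-in for a real database pool.
interface PoolLike {
  connectionString: string;
  max: number;
}

let poolsCreated = 0;

// Stand-in factory; a real version would construct the driver's pool here.
function createPool(connectionString: string, max: number): PoolLike {
  poolsCreated += 1;
  return { connectionString, max };
}

// Module-level cache: it survives across warm invocations of the same instance.
const poolCache = new Map<string, PoolLike>();

function getPool(connectionString: string, max = 5): PoolLike {
  let pool = poolCache.get(connectionString);
  if (pool === undefined) {
    pool = createPool(connectionString, max);
    poolCache.set(connectionString, pool);
  }
  return pool;
}

const poolA = getPool('postgres://example/db');
const poolB = getPool('postgres://example/db');
// poolA and poolB are the same object, and only one pool was created.
```

A small per-instance maximum matters here because the total connection count is roughly that maximum multiplied by the number of concurrent instances.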

Conclusion

The Mastra TypeScript AI framework is the closest the JavaScript ecosystem has come to a one-stop kit for agents, workflows, RAG, and evals. The local playground, Zod-typed schemas, and Vercel-native deployment make it a strong choice for teams already shipping in TypeScript.

Start by scaffolding a project with npx create-mastra and porting one feature — a single agent with one tool — to see how the typing and tracing feel in your codebase. From there, layer in workflows when the LLM should not drive the step order, and RAG when answers need to ground in your own data. For a broader look at when agent frameworks are the right call at all, our building AI agents tutorial explains the planning and execution patterns Mastra encodes for you.
