Investigate GitHub issues with HarnessAgent and Sandbox

Use HarnessAgent to run a coding agent harness such as Claude Code or Codex against untrusted code inside Vercel Sandbox. Reproduce GitHub issues with the harness and generate a report for the maintainer.

15 min read
Last updated June 25, 2026

Build a GitHub issue triage agent that reproduces the issue, investigates it in a safe environment, and generates a report for the maintainer to speed up their work. It uses AI SDK's HarnessAgent to drive a real coding harness (e.g., Claude Code, Codex, Pi, or OpenCode) inside a Vercel Sandbox, through a single uniform agent interface, with AI Gateway routing every model call. AI Gateway makes updating to a different harness a one-line change, and Vercel Sandbox keeps the untrusted code off the maintainer's machine.

In this guide, you’ll build a Next.js app that uses the HarnessAgent primitive to triage GitHub issues from their public issue URL, validate the steps to reproduce the issue, and return a structured report for the maintainer.

Deploy the template now, or read on for a deeper look at how it all works.

GitHub issue triage agent with HarnessAgent

Securely triage GitHub issues using a coding agent harness in an isolated sandbox.

Deploy Template
AI Assistance

I want to build a sandboxed issue triage agent using the sandboxed issue triage agent template. Read the setup instructions at https://agent-resources.dev/sandboxed-issue-triage-agent-template.md and follow them. They will cover deploying the template, building with HarnessAgent and Vercel Sandbox, how everything works overall, and more.

Turn your agent into a Vercel expert with this plugin. It gives your coding agent current knowledge of the Vercel products this template uses, including AI Gateway and Vercel Sandbox. The plugin is optional; it is not required to use this template or for this guide.

The issue triage flow combines a Next.js app, a streaming API route, AI SDK 7 HarnessAgent, Vercel AI Gateway, and Vercel Sandbox. A run follows this path:

  1. A maintainer enters a public GitHub issue URL.
  2. The UI sends the issue URL, failing command, and harness adapter to /api/triage.
  3. The API route validates the issue URL and fetches issue context from GitHub.
  4. The route creates a HarnessAgent with the selected harness adapter.
  5. The agent creates an isolated Vercel Sandbox session.
  6. The prompt asks the harness to inspect the repository and run the failing command.
  7. The API route streams debug events, tool activity, and report text as newline-delimited JSON.
  8. Any frontend, CLI, or webhook consumer can read the stream and render or store the report.

This keeps untrusted repository code inside the sandbox while still giving the maintainer an actionable report.

Before you begin, make sure you have:

HarnessAgent drives a real coding harness inside a sandbox provider, behind a uniform session-and-stream interface. You do not script the harness’ I/O or manage the sandbox lifecycle by hand. You construct the agent with a harness, a sandbox, and a skill, open a session, stream a prompt, and consume the result.

  • Harness adapters for coding agents: @ai-sdk/harness-claude-code and @ai-sdk/harness-codex wrap real agentic coding tools, Claude Code and OpenAI's Codex respectively. Each adapter is constructed with the same signature, so the only difference between them at the call site is which factory you call.
  • Isolation with sandbox: @ai-sdk/sandbox-vercel binds the agent to Vercel Sandbox through createVercelSandbox, so the harness and any code it executes run inside an isolated microVM, not on the host. This makes investigating untrusted code defensible.
  • Access coding agents via AI Gateway: The adapters take a gateway auth object rather than a provider-specific API key, so model access authenticates through one credential path. There are no Anthropic or OpenAI keys in the application.

Create a new Next.js app with create-next-app. The --yes flag applies the recommended defaults, including TypeScript, ESLint, App Router, and the @/* import alias.

Terminal
pnpm create next-app@latest sandboxed-issue-triage-agent --yes
cd sandboxed-issue-triage-agent

Then add the AI SDK harness packages and the Vercel Sandbox adapter used by the route.

Terminal
pnpm add ai @ai-sdk/harness @ai-sdk/harness-claude-code @ai-sdk/harness-codex @ai-sdk/sandbox-vercel @vercel/sandbox

Next.js should keep the harness and sandbox packages external to the server bundle. Add the external package list to next.config.ts.

next.config.ts
import type { NextConfig } from "next";
const nextConfig: NextConfig = {
serverExternalPackages: [
"@ai-sdk/harness",
"@ai-sdk/harness-claude-code",
"@ai-sdk/harness-codex",
"@ai-sdk/sandbox-vercel",
"@vercel/sandbox"
]
};
export default nextConfig;

For local development, use the Vercel CLI to link the project and pull environment variables:

Terminal
vercel link
vercel env pull

AI SDK uses VERCEL_OIDC_TOKEN to authenticate with Vercel AI Gateway through OIDC.

Terminal
VERCEL_OIDC_TOKEN=...

The route accepts a GitHub issue URL, an optional failing command, and the harness adapter. Start with the shared types in lib/types.ts.

lib/types.ts
export type HarnessKey = "claude-code" | "codex";
export type TriageRequest = {
issueUrl: string;
failingCommand?: string;
harness?: HarnessKey;
};

The validated request includes the parsed GitHub owner, repository, issue number, and repository URL. The route uses this parsed shape for all downstream work.

lib/types.ts
export type ValidatedTriageRequest = Required<TriageRequest> & {
owner: string;
repo: string;
issueNumber: number;
repoUrl: string;
};

The validation result is explicit so the route can return a 400 response without throwing when user input is invalid.

lib/types.ts
export type TriageValidationResult =
| { ok: true; value: ValidatedTriageRequest }
| { ok: false; error: string };
const harnesses = new Set<HarnessKey>(["claude-code", "codex"]);

Validate the request body and apply defaults. Missing failing commands default to npm test, and unknown harness values default to Claude Code.

lib/types.ts
export function validateTriageRequest(input: unknown): TriageValidationResult {
if (!input || typeof input !== "object") {
return { ok: false, error: "Expected a JSON request body." };
}
const body = input as Partial<TriageRequest>;
const issueUrl = typeof body.issueUrl === "string" ? body.issueUrl.trim() : "";
const failingCommand =
typeof body.failingCommand === "string" && body.failingCommand.trim()
? body.failingCommand.trim()
: "npm test";
const harness =
typeof body.harness === "string" && harnesses.has(body.harness as HarnessKey)
? (body.harness as HarnessKey)
: "claude-code";
if (!issueUrl) {
return { ok: false, error: "Enter a public GitHub issue URL." };
}

Parse the issue URL defensively. This lets bad URLs fail with a useful validation message instead of an unhandled route error.

lib/types.ts
let parsedUrl: URL;
try {
parsedUrl = new URL(issueUrl);
} catch {
return { ok: false, error: "The issue URL is not a valid URL." };
}
if (parsedUrl.protocol !== "https:" || parsedUrl.hostname !== "github.com") {
return {
ok: false,
error: "Use an https://github.com/{owner}/{repo}/issues/{number} issue URL."
};
}

Extract the GitHub path parts and make sure the URL points to an issue, not an arbitrary GitHub page.

lib/types.ts
const segments = parsedUrl.pathname.split("/").filter(Boolean);
const [owner, repo, issueSegment, issueNumberSegment] = segments;
const issueNumber = Number(issueNumberSegment);
if (
segments.length !== 4 ||
!owner ||
!repo ||
issueSegment !== "issues" ||
!Number.isInteger(issueNumber) ||
issueNumber < 1
) {
return {
ok: false,
error: "Use an https://github.com/{owner}/{repo}/issues/{number} issue URL."
};
}
const repoUrl = `https://github.com/${owner}/${repo}`;
return {
ok: true,
value: {
issueUrl,
repoUrl,
owner,
repo,
issueNumber,
failingCommand,
harness
}
};
}
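
To see how the URL checks behave on different inputs, the sketch below restates them as a small standalone function. The name parseIssueUrl and the ParsedIssue type are hypothetical and are not part of the template; the checks themselves mirror the ones in validateTriageRequest.

```typescript
// Hypothetical standalone restatement of the guide's URL checks.
type ParsedIssue = { owner: string; repo: string; issueNumber: number };

function parseIssueUrl(raw: string): ParsedIssue | null {
  let url: URL;
  try {
    url = new URL(raw.trim());
  } catch {
    return null; // not parseable as a URL
  }
  if (url.protocol !== "https:" || url.hostname !== "github.com") return null;

  // "/owner/repo/issues/12" -> ["owner", "repo", "issues", "12"]
  const segments = url.pathname.split("/").filter(Boolean);
  if (segments.length !== 4) return null;

  const [owner = "", repo = "", kind, rawNumber] = segments;
  const issueNumber = Number(rawNumber);
  if (kind !== "issues" || !Number.isInteger(issueNumber) || issueNumber < 1) {
    return null; // pull requests, discussions, and non-numeric IDs land here
  }
  return { owner, repo, issueNumber };
}

const samples = [
  "https://github.com/vercel/next.js/issues/12",
  "https://github.com/vercel/next.js/pull/12",
  "http://github.com/vercel/next.js/issues/12",
  "not a url",
];
for (const sample of samples) {
  console.log(sample, "->", parseIssueUrl(sample));
}
```

Only the first sample parses; the other three return null, which corresponds to the { ok: false } branch in the template.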

The template streams debug events to the UI, but it also redacts sensitive metadata before logging. Start lib/debug.ts with the metadata type and redaction pattern.

lib/debug.ts
type DebugMetadata = Record<string, unknown>;
const DEBUG_PREFIX = "[triage-debug]";
const SENSITIVE_KEY_PATTERN = /(api|auth|bearer|cookie|key|password|secret|token)/i;

Use a recursive helper so nested metadata objects and arrays are redacted consistently.

lib/debug.ts
function redactValue(key: string, value: unknown): unknown {
if (SENSITIVE_KEY_PATTERN.test(key)) {
return "[redacted]";
}
if (Array.isArray(value)) {
return value.map((item, index) => redactValue(String(index), item));
}
if (value && typeof value === "object") {
return Object.fromEntries(
Object.entries(value as Record<string, unknown>).map(([entryKey, entryValue]) => [
entryKey,
redactValue(entryKey, entryValue)
])
);
}
return value;
}

Expose a sanitizer for route events and UI stream metadata.

lib/debug.ts
export function sanitizeTriageMetadata(metadata: DebugMetadata) {
return Object.fromEntries(
Object.entries(metadata).map(([key, value]) => [key, redactValue(key, value)])
);
}

Error metadata is normalized so both server logs and stream consumer diagnostics have the same shape. Also, add the createRunId() helper function:

lib/debug.ts
export function getTriageErrorMetadata(error: unknown) {
return error instanceof Error
? {
errorName: error.name,
errorMessage: error.message,
errorStack: error.stack
}
: { errorMessage: String(error) };
}
export function createRunId() {
return `${Date.now().toString(36)}-${Math.random().toString(36).slice(2, 8)}`;
}

Finally, wrap console logging so all triage logs share the same prefix and redaction behavior.

lib/debug.ts
export function logTriageDebug(event: string, metadata: DebugMetadata = {}) {
console.info(DEBUG_PREFIX, event, sanitizeTriageMetadata(metadata));
}
export function logTriageError(event: string, error: unknown, metadata: DebugMetadata = {}) {
console.error(
DEBUG_PREFIX,
event,
sanitizeTriageMetadata({ ...metadata, ...getTriageErrorMetadata(error) })
);
}
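
The following sketch shows what the redaction produces for a nested sample. It restates the recursive helper in compact form so it runs on its own; the sample object is made up for illustration. Note that the pattern matches substrings of key names, so a harmless key such as apiUrl is redacted as well.

```typescript
// Same pattern as the guide; the sample data below is hypothetical.
const SENSITIVE = /(api|auth|bearer|cookie|key|password|secret|token)/i;

function redact(key: string, value: unknown): unknown {
  if (SENSITIVE.test(key)) return "[redacted]";
  if (Array.isArray(value)) {
    return value.map((item, index) => redact(String(index), item));
  }
  if (value && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value as Record<string, unknown>).map(([k, v]) => [k, redact(k, v)])
    );
  }
  return value;
}

const sample = {
  harness: "codex",
  apiUrl: "https://api.github.com/repos/o/r/issues/1", // key contains "api"
  request: { headers: { authorization: "Bearer abc" }, attempts: [1, 2] },
};

const cleaned = JSON.stringify(redact("", sample));
console.log(cleaned);
// {"harness":"codex","apiUrl":"[redacted]","request":{"headers":{"authorization":"[redacted]"},"attempts":[1,2]}}
```

If a key you want to keep visible in diagnostics gets redacted, renaming the key is usually simpler than loosening the pattern.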

The agent needs issue context before it starts the sandbox run. Define the GitHub response shapes in lib/github.ts.

lib/github.ts
import type { ValidatedTriageRequest } from "./types";
type GitHubIssueResponse = {
title?: string;
body?: string | null;
state?: string;
user?: { login?: string };
html_url?: string;
comments?: number;
comments_url?: string;
pull_request?: unknown;
};
type GitHubCommentResponse = {
body?: string | null;
user?: { login?: string };
html_url?: string;
};
const MAX_BODY_CHARS = 12_000;
const MAX_COMMENT_CHARS = 2_000;
const MAX_COMMENTS = 10;
function truncate(value: string, maxLength: number) {
if (value.length <= maxLength) {
return value;
}
return `${value.slice(0, maxLength)}\n\n[truncated ${value.length - maxLength} chars]`;
}

Fetch JSON from GitHub with explicit headers and throw a clear error when the API response is not successful.

lib/github.ts
async function fetchGithubJson<T>(url: string): Promise<T> {
const response = await fetch(url, {
headers: {
Accept: "application/vnd.github+json",
"User-Agent": "sandboxed-issue-triage-agent"
}
});
if (!response.ok) {
throw new Error(
`GitHub API request failed (${response.status} ${response.statusText}) for ${url}`
);
}
return (await response.json()) as T;
}

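Now create the issue-context helper. It starts with the issue endpoint derived from validated input. GitHub's issues endpoint also returns pull requests, so the helper rejects any response that carries a pull_request field. It then fetches up to ten comments when the issue has comments, which keeps the prompt useful while staying bounded.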

lib/github.ts
export async function fetchGithubIssueContext(issue: ValidatedTriageRequest) {
const apiUrl = `https://api.github.com/repos/${issue.owner}/${issue.repo}/issues/${issue.issueNumber}`;
const issueResponse = await fetchGithubJson<GitHubIssueResponse>(apiUrl);
if (issueResponse.pull_request) {
throw new Error("The provided URL is a pull request, not a GitHub issue.");
}
const commentsCount = issueResponse.comments ?? 0;
// Fall back to the conventional comments endpoint; per_page is appended once below.
const commentsUrl = issueResponse.comments_url ?? `${apiUrl}/comments`;
const comments =
commentsCount > 0
? await fetchGithubJson<GitHubCommentResponse[]>(
`${commentsUrl}${commentsUrl.includes("?") ? "&" : "?"}per_page=${MAX_COMMENTS}`
)
: [];
return [
`Title: ${issueResponse.title ?? "(untitled)"}`,
`State: ${issueResponse.state ?? "unknown"}`,
`Author: ${issueResponse.user?.login ?? "unknown"}`,
`URL: ${issueResponse.html_url ?? issue.issueUrl}`,
"",
"Body:",
truncate(issueResponse.body?.trim() || "(no body)", MAX_BODY_CHARS),
"",
"Comments:",
comments.length
? comments
.map((comment, index) =>
[
`Comment ${index + 1} by ${comment.user?.login ?? "unknown"}:`,
truncate(comment.body?.trim() || "(no body)", MAX_COMMENT_CHARS),
comment.html_url ? `URL: ${comment.html_url}` : ""
]
.filter(Boolean)
.join("\n")
)
.join("\n\n")
: "(no comments fetched)"
].join("\n");
}
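
The helper appends per_page by checking whether the URL already contains a "?". As an alternative sketch, the standard URL API can set the parameter without string concatenation and replaces any existing value. The function name withPerPage is hypothetical and not part of the template.

```typescript
// Hypothetical helper: set (or replace) the per_page query parameter.
function withPerPage(url: string, perPage: number): string {
  const parsed = new URL(url);
  parsed.searchParams.set("per_page", String(perPage));
  return parsed.toString();
}

const base = "https://api.github.com/repos/o/r/issues/1/comments";
console.log(withPerPage(base, 10));
console.log(withPerPage(`${base}?page=2`, 10));
console.log(withPerPage(`${base}?per_page=30`, 10));
```

Because searchParams.set overwrites an existing per_page value, the result never carries the parameter twice.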

The skill gives the harness a narrow operating contract: treat the repo as untrusted, avoid remote side effects, and return a predictable maintainer report.

lib/reproduction-triage-skill.ts
export const reproductionTriageSkill = {
name: "reproduction-triage",
description:
"Investigate third-party reproduction repositories inside an isolated sandbox and return a concise maintainer-ready report.",
content: `
You are using the Reproduction Triage Skill.
Safety rules:
- Treat the repository as untrusted code.
- Prefer read-only investigation unless the user explicitly asks for a patch.
- Do not open pull requests, post comments, push branches, or modify remote state.
- Do not dump full logs. Quote only the shortest useful excerpts.
Investigation rules:
- Clone or inspect the repository in the sandboxed workspace.
- Run the provided failing command first.
- If the failing command is missing, inspect package.json and infer the safest likely command.
- Capture the exact commands you ran.
- Read the smallest set of files needed to explain the failure.
- Stop after diagnosis for this v1 demo.
Return the report in this exact format:
## Summary
## Reproduction status
Use one of: reproduced, not_reproduced, blocked.
## Commands run
## Relevant error output
## Likely cause
## Suggested next step
## Files worth inspecting
`.trim(),
examples: [
{
issue:
"The project fails on npm test after upgrading Next. The error mentions an async server component.",
report:
"A report that lists npm test, marks the status as reproduced, quotes the failing stack frame, and points at the smallest relevant file."
}
]
} as const;
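
Because the skill fixes the report format, a consumer can read the reproduction status without involving a model. The sketch below is one possible way to do that; extractReproductionStatus is a hypothetical helper, and it assumes the status appears on the first non-empty line under the "## Reproduction status" heading.

```typescript
type ReproductionStatus = "reproduced" | "not_reproduced" | "blocked";
const STATUSES: readonly ReproductionStatus[] = ["reproduced", "not_reproduced", "blocked"];

// Hypothetical consumer-side helper; returns null when the status is absent or unrecognized.
function extractReproductionStatus(report: string): ReproductionStatus | null {
  const lines = report.split(/\r?\n/);
  const headingIndex = lines.findIndex(
    (line) => line.trim().toLowerCase() === "## reproduction status"
  );
  if (headingIndex === -1) return null;

  for (const line of lines.slice(headingIndex + 1)) {
    const text = line.trim();
    if (text.startsWith("## ")) break; // reached the next section
    if (!text) continue; // skip blank lines
    // Tolerate light markdown decoration such as **reproduced** or `blocked`.
    const normalized = text.toLowerCase().replace(/[`*.]/g, "");
    return STATUSES.find((status) => status === normalized) ?? null;
  }
  return null;
}

const exampleReport = [
  "## Summary",
  "",
  "Tests fail after the upgrade.",
  "",
  "## Reproduction status",
  "**not_reproduced**",
  "",
  "## Commands run",
].join("\n");

console.log(extractReproductionStatus(exampleReport)); // not_reproduced
```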

The prompt combines the validated repository URL, issue URL, fetched issue text, and the command the maintainer wants the harness to try first.

lib/prompt.ts
import type { ValidatedTriageRequest } from "./types";
export function createTriagePrompt({
issueUrl,
repoUrl,
issueText,
failingCommand
}: ValidatedTriageRequest & { issueText: string }) {
return `
Investigate this issue report in an isolated sandbox.
Repository:
${repoUrl}
Issue URL:
${issueUrl}
Issue report:
${issueText}
Failing command to try first:
${failingCommand}
Your task:
1. Clone or inspect the repository.
2. Run the failing command first.
3. If setup is required, choose the smallest safe setup path and explain it.
4. Determine whether the failure reproduces.
5. Return the report using the Reproduction Triage Skill format.
Do not edit files or open a pull request.
`.trim();
}

The agent setup uses Vercel AI Gateway for model access and Vercel Sandbox for isolated execution.

lib/harness-agent.ts
import { HarnessAgent } from "@ai-sdk/harness/agent";
import { createClaudeCode } from "@ai-sdk/harness-claude-code";
import { createCodex } from "@ai-sdk/harness-codex";
import { createVercelSandbox } from "@ai-sdk/sandbox-vercel";
import { logTriageDebug } from "./debug";
import { reproductionTriageSkill } from "./reproduction-triage-skill";
import type { HarnessKey } from "./types";
function getGatewayAuth() {
return {
apiKey: process.env.VERCEL_OIDC_TOKEN,
};
}

Resolve the selected harness adapter. The rest of the app does not need to know whether the run uses Claude Code or Codex.

lib/harness-agent.ts
function resolveHarnessAdapter(harness: HarnessKey) {
const gateway = getGatewayAuth();
if (harness === "claude-code") {
return createClaudeCode({
auth: { gateway },
model: "claude-sonnet-4.5",
});
}
return createCodex({
auth: { gateway },
model: "gpt-5-codex",
});
}

Create the triage agent for each run. The sandbox option is the core safety boundary: unknown repository code runs in Vercel Sandbox instead of on the maintainer machine.

lib/harness-agent.ts
export async function createTriageAgent(harness: HarnessKey) {
logTriageDebug("harness.create-agent.start", {
harness,
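// Avoid "auth", "key", or "token" in this key name: the debug sanitizer would redact the value.
gatewayCredentialSource: process.env.VERCEL_OIDC_TOKEN
? "VERCEL_OIDC_TOKEN"
: "missing",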
sandboxRuntime: "node24",
sandboxPorts: [3000],
skillNames: [reproductionTriageSkill.name],
});
const adapter = resolveHarnessAdapter(harness);
const agent = new HarnessAgent({
harness: adapter,
sandbox: createVercelSandbox({
runtime: "node24",
ports: [3000],
}),
instructions:
"You are a maintainer assistant that safely investigates bug reproduction repositories for issue triage.",
tools: {},
skills: [reproductionTriageSkill],
});
logTriageDebug("harness.create-agent.complete", { harness });
return agent;
}

The API route bridges the caller, GitHub, the harness, and the sandbox. It returns newline-delimited JSON so any consumer can render or store report text and diagnostics as they arrive.

app/api/triage/route.ts
import { createTriageAgent } from "@/lib/harness-agent";
import { createTriagePrompt } from "@/lib/prompt";
import { fetchGithubIssueContext } from "@/lib/github";
import {
createRunId,
getTriageErrorMetadata,
logTriageDebug,
logTriageError,
sanitizeTriageMetadata,
} from "@/lib/debug";
import { validateTriageRequest } from "@/lib/types";
import type { TextStreamPart, ToolSet } from "ai";
import type { HarnessAgentSession } from "@ai-sdk/harness/agent";

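Define the stream metadata type and a fallback report builder. When a run cannot start or complete, the route streams this report-shaped message so the caller still receives output in the expected format.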

app/api/triage/route.ts
type StreamLogMetadata = Record<string, unknown>;
function createBlockedReport(message: string) {
return [
"## Summary",
"",
"The triage run could not start or complete.",
"",
"## Reproduction status",
"blocked",
"",
"## Relevant error output",
"```",
message,
"```",
"",
"## Suggested next step",
"Check the GitHub issue URL, harness package installation, and Vercel Sandbox credentials.",
].join("\n");
}

Small helpers keep adapter stream handling and metadata summaries consistent.

app/api/triage/route.ts
function getStreamPartText(part: TextStreamPart<ToolSet>) {
return part.type === "text-delta" ? part.text : "";
}
function getErrorMessage(error: unknown) {
return error instanceof Error ? error.message : String(error);
}
function truncate(value: string, maxLength = 220) {
return value.length > maxLength ? `${value.slice(0, maxLength)}...` : value;
}

Summarize tool inputs and outputs before sending them to diagnostics. Narrow on part.type before reading tool fields so TypeScript enforces the actual stream part shape.

app/api/triage/route.ts
function summarizeValue(value: unknown) {
if (typeof value === "undefined") return undefined;
if (typeof value === "string") return truncate(value);
try {
return truncate(JSON.stringify(value));
} catch {
return String(value);
}
}
function getStreamPartMetadata(part: TextStreamPart<ToolSet>) {
const metadata: StreamLogMetadata = {
type: part.type,
textLength: getStreamPartText(part).length,
hasError: part.type === "error" && typeof part.error !== "undefined",
};
if (part.type === "tool-call") {
return {
...metadata,
toolName: part.toolName,
toolCallId: part.toolCallId,
input: summarizeValue(part.input),
};
}
if (part.type === "tool-result") {
return {
...metadata,
toolName: part.toolName,
toolCallId: part.toolCallId,
input: summarizeValue(part.input),
output: summarizeValue(part.output),
};
}
return metadata;
}
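
As a quick illustration of what the summaries look like, the sketch below restates truncate and summarizeValue so it runs on its own and applies them to three made-up values: a small object, an over-long string, and an object with a circular reference that JSON.stringify cannot serialize.

```typescript
function truncate(value: string, maxLength = 220): string {
  return value.length > maxLength ? `${value.slice(0, maxLength)}...` : value;
}

function summarizeValue(value: unknown): string | undefined {
  if (typeof value === "undefined") return undefined;
  if (typeof value === "string") return truncate(value);
  try {
    return truncate(JSON.stringify(value));
  } catch {
    return String(value); // e.g. circular structures
  }
}

// Hypothetical sample values.
const circular: Record<string, unknown> = { name: "loop" };
circular.self = circular;

console.log(summarizeValue({ command: "npm test" })); // {"command":"npm test"}
console.log(summarizeValue("x".repeat(300))?.length); // 223 (220 characters plus "...")
console.log(summarizeValue(circular)); // [object Object]
```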

The POST() handler is intentionally one orchestration function in the template. It validates input, creates an NDJSON stream, fetches GitHub context, creates a sandbox-backed harness session, forwards result.stream, recovers final text when needed, streams blocked reports for failures, destroys the session, and returns the response with application/x-ndjson.

The route expects a JSON body with the issue URL, optional failing command, and optional harness adapter. A caller can also pass x-triage-run-id to correlate caller logs with server logs.

curl -N http://localhost:3000/api/triage \
-H "Content-Type: application/json" \
-H "x-triage-run-id: local-run-001" \
-d '{"issueUrl":"https://github.com/vercel-labs/harness-triage-repro-bugs/issues/2","failingCommand":"npm test","harness":"claude-code"}'
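
The guide does not show how the route reads x-triage-run-id. One possible approach, sketched here under that assumption, is to accept the caller's value only when it looks like a safe identifier and otherwise fall back to a generated ID. The resolveRunId name, the allowed character set, and the 64-character limit are all hypothetical choices.

```typescript
// Hypothetical: accept only short, log-safe identifiers from the caller.
const RUN_ID_PATTERN = /^[A-Za-z0-9._-]{1,64}$/;

function resolveRunId(headerValue: string | null, createRunId: () => string): string {
  const candidate = headerValue?.trim() ?? "";
  return RUN_ID_PATTERN.test(candidate) ? candidate : createRunId();
}

// Stand-in generator for the demo; the template's createRunId() would be passed here.
const generated = () => "generated-id";

console.log(resolveRunId("local-run-001", generated)); // local-run-001
console.log(resolveRunId(null, generated)); // generated-id
console.log(resolveRunId("bad id\nwith newline", generated)); // generated-id
```

Restricting the character set keeps caller-supplied IDs from injecting line breaks or control characters into server logs.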

The response is newline-delimited JSON. report events contain text to append to the final maintainer report.

{ "type": "report", "text": "## Summary\n\n..." }

debug, error, and activity events contain progress and diagnostics that a frontend, CLI, or background worker can render separately.

{
"type": "activity",
"text": "Calling tool",
"runId": "local-run-001",
"elapsedMs": 1200
}
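
The event shapes above can be captured in a small discriminated union with a one-line encoder. This is a minimal sketch: the TriageEvent type and encodeEvent function are hypothetical names, and the optional fields are inferred from the two sample events rather than taken from the template's source.

```typescript
// Hypothetical event model inferred from the sample events in this guide.
type TriageEvent =
  | { type: "report"; text: string }
  | {
      type: "debug" | "error" | "activity";
      text: string;
      runId?: string;
      elapsedMs?: number;
    };

// One JSON document per line. JSON.stringify escapes newlines inside strings,
// so the only raw "\n" in the output is the record separator.
function encodeEvent(event: TriageEvent): string {
  return `${JSON.stringify(event)}\n`;
}

const encodedReport = encodeEvent({ type: "report", text: "## Summary\n\nReproduced." });
const encodedActivity = encodeEvent({
  type: "activity",
  text: "Calling tool",
  runId: "local-run-001",
  elapsedMs: 1200,
});

console.log(encodedReport + encodedActivity);
```

On the server, each encoded string would typically be converted to bytes with TextEncoder before being written to the response stream.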

You now have a GitHub issue triage agent ready to investigate issues. The complete template adds a simple Next.js UI on top of this route: a form for the GitHub issue URL, failing command, and harness adapter; a report panel that appends report events; and a diagnostics panel that renders debug, error, and activity events. That UI is not required for the sandboxed-agent pattern. It is just one consumer of the newline-delimited JSON stream.

To get started, clone the repository, configure credentials, and run the local app:

Terminal
pnpm install
pnpm dev

Open http://localhost:3000 and submit:

Terminal
Issue URL: https://github.com/vercel-labs/harness-triage-repro-bugs/issues/2
Failing command: npm test
Harness: Claude Code

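A successful run streams a report with the sections defined by the Reproduction Triage Skill: Summary, Reproduction status, Commands run, Relevant error output, Likely cause, Suggested next step, and Files worth inspecting.

[Figure: GitHub issue triage with HarnessAgent and sandbox]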

This project generalizes beyond issue triaging. The same primitives apply to more sophisticated use cases like a coding interview platform or vibe coding tool. The important boundary stays the same across all of these: untrusted repository code runs in Vercel Sandbox, the harness adapter is swappable, and the route emits a report-shaped stream that the product surface can decide how to present.

Cause: The issue URL is empty, malformed, points to a pull request, or points to another GitHub page. The route only accepts public GitHub issue URLs in the form https://github.com/{owner}/{repo}/issues/{number}.

Fix: Send a public issue URL. If you want to support pull requests later, add a separate validation branch and prompt contract for PR context.

Cause: Missing Vercel Sandbox credentials, missing AI Gateway credentials, an unavailable GitHub issue, or a harness adapter startup failure. The template intentionally converts startup and adapter failures into report-shaped output so the caller still receives a useful response.

Fix: Run vercel link, pull environment variables with vercel env pull .env.local, and confirm VERCEL_OIDC_TOKEN is available locally.

Cause: Some adapters may finish without emitting text-delta events, or the run may end before the model produces report text.

Fix: Keep the final-text recovery branch in the POST() handler. If result.text is also empty, return a blocked report that points the caller to diagnostics. Use the streamed debug events to inspect the last adapter activity.

Cause: NDJSON events can arrive split across network chunks. A single reader.read() call is not guaranteed to contain a complete JSON line.

Fix: Buffer text until a newline is available, parse one line at a time, and ignore empty lines. Treat report events as append-only text and render debug, error, and activity events separately.
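
The buffering described in this fix can be sketched as a small incremental parser. The names createNdjsonParser and StreamEvent are hypothetical; the parser takes already-decoded text chunks, so when reading bytes from a fetch response you would first decode each chunk with TextDecoder using the stream option.

```typescript
type StreamEvent = { type: string; text?: string; [key: string]: unknown };

// Hypothetical incremental NDJSON parser: feed it text chunks of any size.
function createNdjsonParser(onEvent: (event: StreamEvent) => void) {
  let buffer = "";
  return {
    push(chunk: string): void {
      buffer += chunk;
      let newlineIndex = buffer.indexOf("\n");
      while (newlineIndex !== -1) {
        const line = buffer.slice(0, newlineIndex).trim();
        buffer = buffer.slice(newlineIndex + 1);
        if (line) onEvent(JSON.parse(line) as StreamEvent); // ignore empty lines
        newlineIndex = buffer.indexOf("\n");
      }
    },
    // Call once the stream ends, in case the last line had no trailing newline.
    flush(): void {
      const line = buffer.trim();
      buffer = "";
      if (line) onEvent(JSON.parse(line) as StreamEvent);
    },
  };
}

let report = "";
const diagnostics: StreamEvent[] = [];
const parser = createNdjsonParser((event) => {
  if (event.type === "report") report += event.text ?? ""; // append-only report text
  else diagnostics.push(event); // debug, error, and activity events
});

// The second event is split across two chunks on purpose.
parser.push('{"type":"activity","text":"Calling tool"}\n{"type":"rep');
parser.push('ort","text":"## Summary\\n"}\n');
parser.push('{"type":"report","text":"Reproduced."}');
parser.flush();

console.log(report);
console.log(diagnostics.length); // 1
```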

Cause: The raw harness stream type and the translated AI SDK stream type are not the same surface. Redeclaring a local union can drift as the harness adapters evolve.

Fix: Use TextStreamPart<ToolSet> from ai and narrow on part.type before reading fields like text, toolName, toolCallId, input, or output.

The reproduction repository is untrusted code. Vercel Sandbox gives the harness an isolated environment for installing dependencies, running commands, and inspecting files without asking a maintainer to run unknown code on their laptop.

Yes. The route is built around HarnessAgent, so the harness adapter is swappable. This guide uses Claude Code and Codex as examples, but the same pattern can support Pi, OpenCode, and other adapters as they become available.

Reproduction triage can take time. NDJSON lets the route stream tool activity, debug events, errors, and report text as they happen, while still ending with a report that a maintainer can copy or store.

No. The UI is one consumer of the stream. You can also consume the same route from a CLI, GitHub App, Slack bot, Linear workflow, dashboard, or background job.

No. The harness runs inside Vercel Sandbox using AI Gateway, so the agent environment belongs to the sandbox, not your laptop.
