
Prompt Engineering for Agentic Development: Getting Exactly What You Want

August 18, 2025

Agent coding tools like Copilot aren’t just autocomplete for single files anymore; they can now build full-stack applications. They can reason across files, propose refactors, draft tests, even argue about architecture. But none of that matters if your asks are mush. The difference between “meh, plausible code” and “wow, exactly what I meant” comes down to prompt craftsmanship, and it’s quickly becoming a necessary skill.

This is a practical, opinionated guide to prompting for agentic project development. It leans on patterns you’ll also see echoed in community gems like awesome-copilot, with a special focus on chatmodes (the “hats” you ask your AI to wear).

Why Prompting Structure Matters

Natural language is imprecise; code is not. Even though coding agents are getting smarter by the month, agentic systems fill in the gaps with their own assumptions, which can lead to unexpected results:

  • If you don’t bound scope, agents invent scope.
  • If you don’t name constraints, they average the internet.
  • If you don’t share context, they’ll build generic, "just okay" solutions.

Rule 0: Treat language as an interface. Your prompt is the API contract. You state inputs, outputs, invariants, and success criteria.

The Durable Prompt Pattern

Use this spine for 90% of dev asks:

ROLE -> CONTEXT -> TASK -> CONSTRAINTS -> FORMAT -> ACCEPTANCE

  • ROLE (chatmode): the hat it should wear.
  • CONTEXT: code, file tree, conventions, dependencies.
  • TASK: action, not vibes. (“Implement”, “Refactor”, “Diagnose”.)
  • CONSTRAINTS: frameworks, patterns, performance, security, style.
  • FORMAT: how you want the result (code block, diff, checklist).
  • ACCEPTANCE: tests, examples, or explicit “done when” criteria.

Example:

Role: You are a refactoring assistant.
Context: Node 18, no 3rd-party CSV libs. Current function below.
Task: Reduce cognitive complexity of parseCsv() without changing behavior.
Constraints: Keep streaming interface, preserve error messages, time complexity ≤ current.
Format: Return a unified diff patch.
Acceptance: Include 3 table-driven Jest tests that pass on quoted fields, empty lines, and BOM.

This template prevents “model improv hour” and drives the agent toward the exact deliverable you need.
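
For reference, the kind of table-driven Jest test that prompt asks for might look roughly like this (the parseCsv signature, return shape, and expected outputs are assumptions for illustration):

// parseCsv.test.ts (illustrative sketch; assumes parseCsv(input: string): string[][])
import { parseCsv } from "./parseCsv";

// Each case: [description, raw CSV input, expected rows]
const cases: Array<[string, string, string[][]]> = [
  ["quoted fields", 'a,"b,c"\n', [["a", "b,c"]]],
  ["empty lines", "a,b\n\nc,d\n", [["a", "b"], ["c", "d"]]],
  ["a BOM prefix", "\uFEFFa,b\n", [["a", "b"]]],
];

test.each(cases)("parseCsv handles %s", (_name, input, expected) => {
  expect(parseCsv(input)).toEqual(expected);
});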

Chatmodes: Switch Hats, Change Outcomes

Currently with Copilot, you can define a chatmodes folder inside .github at the root of your project, right next to the workflows folder. From there, you define chatmodes by creating .md files that describe a role for Copilot to take on. Once created, reload VS Code and you can select the chatmode from the Copilot chat interface.

Per awesome-copilot

Define chat behavior, available tools, and codebase interaction patterns within specific boundaries for each request
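
For example, a Refactorer mode might live at .github/chatmodes/refactorer.chatmode.md and look roughly like this (the .chatmode.md suffix and the frontmatter fields follow the current VS Code docs; double-check them against your version):

---
description: Refactoring assistant that improves clarity without changing behavior
tools: ['codebase', 'search']
---
You are a refactoring assistant. Propose changes as unified diffs, preserve
existing behavior and error messages, explain each change with a short
rationale bullet, and never introduce new dependencies.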

The same model behaves differently when you name the mode. Five high-leverage modes for dev work:

  1. Implementer: “Write the thing.” Great for scaffolds and boilerplate. Ask for: code + minimal notes.

  2. Explainer: “Teach me what this does (and where it breaks).” Ask for: concise summary, failure cases, complexity, example I/O.

  3. Refactorer: “Improve clarity/perf without behavior change.” Ask for: a diff + rationale bullets.

  4. Bug Hunter: “Skeptically review for edge cases and defects.” Ask for: numbered findings + repro snippets.

  5. Spec Writer: “Write the contract before code.” Ask for: structured requirements, API signatures, acceptance tests.
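
For instance, a Bug Hunter request following the durable prompt pattern might read like this (the project details here are purely illustrative):

You are a Bug Hunter.
Context: Payment webhook handler below; Node 20, Stripe events.
Task: Review for edge cases and defects before we ship.
Constraints: Focus on idempotency, retries, and malformed payloads.
Format: Numbered findings, each with a repro snippet.
Acceptance: Every finding tied to a specific line or code path.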

Pro tip: If an answer feels off, don’t just “ask again.” Switch modes. Explainer -> reveals misunderstanding. Bug Hunter -> pressure tests. Spec Writer -> aligns on the target before touching code.

Do’s and Don’ts (a.k.a. Prompt Hygiene)

Do

  • Bullet your asks. Models parse lists better than paragraphs.
  • Bound the canvas. “One function” vs. “new service” is a critical difference.
  • Show the neighborhood. Include file tree, key interfaces, or conventions.
  • Ask for a plan first when tasks are multi-step: “Give a 5-bullet plan, then wait.”
  • Prefer diffs over blobs for refactors; they’re easier to review and revert.
  • Pin success. “Done when these tests pass” beats “looks good.”

Don’t

  • Don’t say “optimize” without an axis (latency, memory, readability).
  • Don’t stack 10 asks in one go; chain them.
  • Don’t rely on memory. Restate constraints in each thread/session.
  • Don’t copy-paste to prod. Run, test, and profile; treat AI code as code.

Prompt Smells (And What To Say Instead)

  • Vague Verb: “Improve this.” -> Specific Verb: “Reduce allocations, target ≤1 allocation per loop.”
  • Unbounded Scope: “Add analytics.” -> Scoped: “Emit pageview + timing via track() only on client routes.”
  • Context Starvation: “Write tests.” -> Contexted: “Jest + React Testing Library, existing jest.setup.ts below, mock network with MSW.”
  • Style Drift: “Build auth middleware.” -> “Match existing Result<T, E> pattern, no exceptions, return Unauthorized sentinel.”

Before -> Better -> Best (Concrete Examples)

Example A: API Endpoint

  • Before: “Write a POST /weather endpoint.”

  • Better: “Express TS POST /weather accepts { city }, calls OpenWeatherMap, returns { temp, conditions }, handles network errors.”

  • Best:

    • Role: Implementer
    • Task: Add POST /weather
    • Context: Node 20, zod for validation, fetch available, error style uses ProblemDetails.
    • Constraints: Timeout 2s, no retries, log X-Request-Id.
    • Format: Code block + 3 Jest tests (happy path, timeout, 400 invalid city).
    • Acceptance: Tests pass, function pure except I/O.
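
A first pass at that deliverable might look something like the sketch below. The ProblemDetails-style helper, the OpenWeatherMap URL, the OWM_API_KEY environment variable, and the response field mapping are all assumptions, so treat this as an illustration of the spec rather than a drop-in implementation.

// weather.ts (illustrative sketch of the “Best” spec above)
import express from "express";
import { z } from "zod";

const app = express();
app.use(express.json());

const WeatherRequest = z.object({ city: z.string().min(1) });

// Assumed ProblemDetails-style error helper (shape is a guess)
const problem = (status: number, title: string) => ({ status, title });

app.post("/weather", async (req, res) => {
  const parsed = WeatherRequest.safeParse(req.body);
  if (!parsed.success) {
    return res.status(400).json(problem(400, "Invalid city"));
  }

  // Log the request ID, per the constraints
  const requestId = req.header("X-Request-Id");
  console.log(`[weather] request ${requestId ?? "unknown"}`);

  try {
    // 2s timeout, no retries, per the constraints
    const upstream = await fetch(
      `https://api.openweathermap.org/data/2.5/weather?q=${encodeURIComponent(parsed.data.city)}&appid=${process.env.OWM_API_KEY}`,
      { signal: AbortSignal.timeout(2000) }
    );
    if (!upstream.ok) {
      return res.status(502).json(problem(502, "Upstream error"));
    }
    const data = (await upstream.json()) as {
      main?: { temp?: number };
      weather?: Array<{ main?: string }>;
    };
    return res.json({ temp: data.main?.temp, conditions: data.weather?.[0]?.main });
  } catch {
    return res.status(504).json(problem(504, "Weather lookup timed out"));
  }
});

app.listen(3000);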

Example B: Refactor Without Behavior Change

  • Before: “Clean this file.”
  • Better: “Refactor for readability.”
  • Best: “Refactor priceCalculator.ts to lower cognitive complexity (≤10 per function). Don’t change numerical results. Return a unified diff, then list any renamed helpers.”
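
To make that concrete, here is a rough before-and-after of the kind of change a cognitive-complexity refactor usually produces; the pricing rules are invented for illustration, but the numerical results are identical on both sides.

// Before: nested conditionals pile up cognitive complexity
function discount(order: { total: number; coupon?: string; vip: boolean }): number {
  if (order.total > 0) {
    if (order.vip) {
      if (order.coupon === "SAVE10") {
        return order.total * 0.2;
      } else {
        return order.total * 0.1;
      }
    } else if (order.coupon === "SAVE10") {
      return order.total * 0.1;
    }
  }
  return 0;
}

// After: guard clause plus flat rate composition, same numerical results
function discountRefactored(order: { total: number; coupon?: string; vip: boolean }): number {
  if (order.total <= 0) return 0;
  const vipRate = order.vip ? 0.1 : 0;
  const couponRate = order.coupon === "SAVE10" ? 0.1 : 0;
  return order.total * (vipRate + couponRate);
}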

Iteration as a Feature, Not a Failure

Agentic systems thrive on tight feedback loops. Treat the model like an eager junior dev:

  1. Draft: “Give me a minimal first pass.”
  2. Constrain: “Great—consolidate error branches and add input guards.”
  3. Harden: “Now property-based tests for edge cases.”
  4. Polish: “Docstring each public function with examples.”

Each turn adds a constraint or acceptance check. You’re not “wasting prompts”; you’re converging.
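
Step 3 of that loop, for example, might produce something like the sketch below. fast-check is one option for property-based testing in a Jest project, and the parseCsv import is carried over from the earlier example; both are assumptions about your setup.

// parseCsv.property.test.ts (sketch; assumes parseCsv(input: string): string[][])
import fc from "fast-check";
import { parseCsv } from "./parseCsv";

test("parseCsv never throws and always returns an array of rows", () => {
  fc.assert(
    fc.property(fc.string(), (input) => {
      // Property: arbitrary input never crashes the parser
      expect(Array.isArray(parseCsv(input))).toBe(true);
    })
  );
});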

Make It Verifiable

Ask for artifacts that prove alignment, not paragraphs that claim it:

  • Diffs, not dumps.
  • Tests, not promises.
  • Bench scripts, not anecdotes.
  • Threat notes, not “we considered security.”

If it matters, make it machine-checkable: have the agent verify endpoints with curl commands, or have it run the unit tests it just wrote.

A Mini Cheat Sheet You Can Paste

ROLE: Implementer / Explainer / Refactorer / Bug Hunter / Spec Writer
CONTEXT: File tree, key snippets, conventions
TASK: One clear verb and object
CONSTRAINTS: Frameworks, patterns, perf/security bounds
FORMAT: Code block, diff, checklist, table
ACCEPTANCE: Tests/examples/metrics that define “done”

Example skeleton:

You are a <MODE>.
Context: <FILES/CONVENTIONS/SNIPPETS>.
Task: <ACTION>.
Constraints: <RULES/BOUNDS>.
Format: <DIFF | CODE + TESTS | CHECKLIST>.
Acceptance: <TESTS/CRITERIA>.

Final Word

Agentic development isn’t “let the robot figure it out.” It’s you steering a powerful collaborator with crisp contracts. Name the hat (chatmode), bound the canvas, feed real context, and anchor to verifiable acceptance. Do that, and your copilot stops guessing—and starts delivering exactly what you asked for. If you're interested in reading more of my thoughts, check out my blog at The Glitched Goblet.