puml-agent-pack Codex / Claude Plugin + MCP Specification
Mirror of docs/specs/puml_agent_plugin_mcp_spec.md. The in-repo file is the source of truth.
Make agents write correct sequence diagrams by giving them the same compiler, language server, renderer, and repair loop humans use.
This is the agent ecosystem layer for puml: Codex plugin, Claude Code plugin, Agent Skill package, MCP server, LSP bundle, diagram authoring skill, diagram review skill, and deterministic tools for parse/render/export.
Runtime contract snapshot (Current, audited in issue #24)
The sections below specify the target package. The current shipped runtime surface that is implemented and release-safe today is:
- plugin manifests:
  - agent-pack/.codex-plugin/plugin.json
  - agent-pack/.claude-plugin/plugin.json
- marketplace metadata:
  - agent-pack/.codex-plugin/marketplace.json
  - agent-pack/.claude-plugin/marketplace.json
- MCP contract + runtime bridge:
  - agent-pack/.mcp.json
  - agent-pack/bin/puml-mcp
- LSP contract:
  - agent-pack/.lsp.json
- shared skills:
  - skills/puml-sequence-author
  - skills/puml-sequence-reviewer
- Claude agents:
  - agents/puml-diagram-designer.md
  - agents/puml-diagram-reviewer.md
Current baseline constraints:
- v0.0.1 ships .lsp.json host wiring metadata for bin/puml-lsp; release archives may still depend on the host to provide or map the actual binary.
- scripts/validate_agent_pack.py validates manifest keys, marketplace metadata, MCP runtime/spec contract parity, and the .lsp.json language-server manifest contract.
- agent-pack/tests/mcp_smoke.sh and scripts/harness-check.sh exercise MCP baseline behavior and parity harness integration.
Product position
Agents are bad at diagrams when diagrams are treated as prose. They become good when diagrams are treated as source code with a compiler.
puml-agent-pack gives agents a strict workflow:
understand request
-> draft puml source
-> check with compiler
-> repair diagnostics
-> render SVG
-> inspect output metadata
-> deliver source + artifact
No freehand SVG. No imaginary syntax. No “looks right” without parsing. No final answer containing invalid puml.
Non-negotiables
- Works for Codex.
- Works for Claude Code.
- Ships skills/instructions.
- Ships MCP tools.
- Bundles or configures puml-lsp where the host supports LSP plugins.
- Uses the real puml compiler and renderer.
- No duplicate parser in prompt text.
- No arbitrary shell execution.
- No remote fetch by default.
- Remote includes are disabled unless a tool call explicitly opts in with allow_url_includes: true.
- No silent file writes.
- No path traversal outside approved workspace roots.
- No generated diagram is considered final until puml_check passes.
- 90% coverage for MCP server/tooling code.
- Eval suite covers every supported primitive.
Architecture religion
Use a deterministic agent-tool architecture.
The architecture is:
Skill instructions
-> agent reasoning
-> MCP tool call
-> puml-core / puml CLI / puml-lsp
-> structured diagnostics/artifacts
-> agent repair loop
Rules:
- Skills teach workflow and taste.
- MCP tools execute deterministic operations.
- LSP supplies live diagnostics and code intelligence while the agent edits files.
- The compiler is the authority.
- The renderer is the authority for artifacts.
- The agent never claims a diagram is valid unless the tool says it is valid.
- The agent never edits raw SVG to “fix” a diagram.
- The agent edits .puml source and re-renders.
- The plugin is an adapter layer, not a source of truth.
- The same skill content should work across Codex and Claude where host formats allow it.
- Host-specific manifests are thin wrappers around shared skills, binaries, and configs.
Package layout
One source tree ships both host integrations:
agent-pack/
README.md
LICENSE
CHANGELOG.md
.codex-plugin/
plugin.json
.claude-plugin/
plugin.json
skills/
puml-sequence-author/
SKILL.md
references/
syntax.md
style.md
examples.md
repair-loop.md
scripts/
validate-output.sh
puml-sequence-reviewer/
SKILL.md
references/
review-rubric.md
common-failures.md
puml-docs-integrator/
SKILL.md
references/
markdown-embedding.md
mdbook.md
ci.md
agents/
puml-diagram-designer.md
puml-diagram-reviewer.md
prompts/
write-sequence-diagram.md
review-sequence-diagram.md
convert-to-puml.md
explain-sequence-diagram.md
.mcp.json
.lsp.json
bin/
puml
puml-lsp
puml-mcp
assets/
icon.png
logo.png
screenshot-authoring.png
screenshot-preview.png
tests/
fixtures/
evals/
mcp-transcripts/
snapshots/
Only host manifest files live inside .codex-plugin/ and .claude-plugin/. Skills, MCP config, LSP config, agents, assets, and binaries live at plugin root.
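The placement rule can be checked mechanically. A minimal sketch, assuming plugin.json and marketplace.json are the only file names permitted inside a host manifest directory; the function name and the allow-list are illustrative, not part of the shipped validator:

```python
from pathlib import Path

# Assumption: only these file names may live inside a host manifest directory.
ALLOWED_MANIFEST_FILES = {"plugin.json", "marketplace.json"}
HOST_DIRS = (".codex-plugin", ".claude-plugin")


def misplaced_components(plugin_root: Path) -> list[str]:
    """Return plugin-root-relative paths that do not belong in a host manifest directory."""
    problems = []
    for host_dir in HOST_DIRS:
        directory = plugin_root / host_dir
        if not directory.is_dir():
            continue
        for entry in sorted(directory.iterdir()):
            # Any subdirectory (skills/, agents/, bin/ ...) or unknown file is misplaced.
            if entry.is_dir() or entry.name not in ALLOWED_MANIFEST_FILES:
                problems.append(entry.relative_to(plugin_root).as_posix())
    return problems
```

An empty result means the host directories hold manifests only; anything else names the entry to move to plugin root.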
Codex plugin
Codex manifest:
.codex-plugin/plugin.json
Required behavior:
- identifies package as puml-agent-pack
- exposes shared skills/
- exposes .mcp.json
- exposes assets
- includes install-surface metadata
- includes default prompts for common workflows
- does not require hooks for core functionality
Required default prompts:
- “Create a sequence diagram for this flow and render it with puml.”
- “Review this .puml diagram for correctness and clarity.”
- “Convert this prose/API flow into a puml sequence diagram.”
- “Add a puml diagram to this README and verify it renders.”
- “Find sequence diagrams in this repo and render them.”
Codex plugin behavior:
- Agent skill handles authoring and repair loop.
- MCP server provides parse/check/render/export tools.
- The plugin must work in CLI, IDE extension, and app surfaces where supported by Codex.
- If the host exposes plugin marketplace metadata, the plugin appears as “puml Diagrams”.
Claude Code plugin
Claude plugin manifest:
.claude-plugin/plugin.json
Required behavior:
- exposes shared skills/
- exposes shared agents/
- exposes .mcp.json
- exposes .lsp.json
- exposes binaries through bin/ where supported
- uses plugin-root path variables, not absolute developer-machine paths
Claude plugin behavior:
- /puml-sequence-author writes and repairs diagrams.
- /puml-sequence-reviewer critiques syntax, layout, naming, and communication quality.
- puml-diagram-designer subagent can be invoked for larger diagram tasks.
- puml-lsp provides diagnostics while Claude edits .puml files.
- puml-mcp exposes deterministic tools.
Skills
puml-sequence-author
Purpose:
Write valid, readable, native puml sequence diagrams from prose, code, docs, logs, tests, API specs, architecture notes, and existing broken diagrams.
Hard instructions:
- Always produce .puml source first.
- Always run puml_check before finalizing when tools are available.
- Always repair diagnostics before finalizing.
- Prefer aliases for long participant names.
- Use explicit participant declarations for non-trivial diagrams.
- Use participant kinds when semantically useful.
- Use notes for context, not for hiding unclear flow.
- Use groups for branching/repetition/concurrency.
- Use lifecycle only when it clarifies control flow.
- Keep message labels short and verb-driven.
- Do not use unsupported PlantUML features.
- Do not emit class/activity/state diagrams.
- Do not emit raw SVG unless asked for the rendered artifact.
Skill workflow:
1. Identify participants.
2. Choose aliases.
3. Determine main flow.
4. Add messages.
5. Add groups/notes/lifecycle only where they clarify.
6. Render/check.
7. Repair diagnostics.
8. Simplify labels.
9. Deliver source and artifact path/SVG if requested.
puml-sequence-reviewer
Purpose:
Review existing .puml diagrams for correctness, compatibility, readability, layout risk, and maintainability.
Review categories:
- syntax validity
- semantic validity
- participant naming
- alias clarity
- message direction
- group correctness
- lifecycle correctness
- note placement
- overuse of notes
- unsupported syntax
- renderability
- export readiness
- docs embedding readiness
Output format:
Verdict: pass / needs changes / invalid
Blocking issues:
Recommended improvements:
Rendered artifact:
Suggested patch:
puml-docs-integrator
Purpose:
Insert rendered diagrams into documentation systems.
Supported targets:
- README Markdown
- mdBook
- MkDocs
- Docusaurus
- plain HTML
- CI-generated artifacts
Rules:
- Keep .puml source checked in.
- Put generated SVG in a predictable artifact path.
- Prefer relative links.
- Add render command to docs/CI when appropriate.
- Never inline huge SVG unless requested.
MCP server
Binary:
puml-mcp
Transports:
- stdio required
- streamable HTTP optional only after stdio is complete and tested
The MCP server exposes tools, resources, and prompts.
MCP security rules
- Validate every input against schema.
- Canonicalize paths.
- Deny path traversal.
- Default allowed root is current project root.
- File writes require explicit write: true and target path.
- Tools that write files return a preview/diff when possible.
- No shell interpolation.
- No arbitrary command execution.
- No network access by default.
- No remote include resolution unless the tool call opts in with allow_url_includes: true.
- Tool output is sanitized.
- SVG output contains no scripts.
- Large outputs are size-limited and can be returned as resources.
- Tool errors distinguish protocol errors from execution errors.
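The canonicalize-then-deny rules can be sketched as a single helper. This is an illustration under assumptions, not the puml-mcp implementation: the names resolve_in_root and PathDenied are hypothetical, and it relies on Python 3.9+ for Path.is_relative_to.

```python
from pathlib import Path


class PathDenied(Exception):
    """Raised when a requested path resolves outside the allowed root."""


def resolve_in_root(root: Path, requested: str) -> Path:
    """Canonicalize `requested` against `root` and reject anything that escapes it."""
    root = root.resolve()
    # Joining an absolute `requested` discards `root`, so absolute paths are
    # caught by the same containment check as "../" segments.
    # resolve() collapses ".." and follows symlinks that exist on disk.
    candidate = (root / requested).resolve()
    if not candidate.is_relative_to(root):
        raise PathDenied(f"{requested!r} is outside the allowed root")
    return candidate
```

Resolving before comparing is the point: a string-prefix check on the raw input would accept docs/../../x.puml.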
MCP tools
puml_parse
Input:
Output:
Purpose:
Parse source and return AST/diagnostics. Does not normalize or render.
puml_check
Input:
Output:
Purpose:
Parse and normalize. This is the mandatory validation tool before final output.
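For orientation, a client-side sketch of a puml_check call. The JSON-RPC envelope follows the MCP tools/call method; the argument name source and the use of structuredContent to carry { ok, diagnostics[] } are assumptions, since this section does not fix the input and output schemas.

```python
import json


def tools_call(request_id: int, tool: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request for an MCP server."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


def check_passed(result: dict) -> bool:
    """Fail closed: pass only if the tool ran without error and reported ok."""
    if result.get("isError", False):
        return False
    # Assumption: the server returns { ok, diagnostics[] } as structured content.
    payload = result.get("structuredContent") or {}
    return payload.get("ok") is True


# "source" is a hypothetical argument name used for illustration only.
request = tools_call(2, "puml_check", {"source": "@startuml\nA -> B: ping\n@enduml\n"})
wire = json.dumps(request)
```

A missing or malformed result counts as a failed check, which matches the rule that the agent never claims validity without the tool saying so.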
puml_render_svg
Input:
Output:
Purpose:
Render valid source to SVG.
puml_export
Input:
Output:
Purpose:
Export diagrams. Writes only when explicitly requested.
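The write gate can be expressed as a small decision function. A sketch only: write: true comes from the security rules, while the argument name output_path and the returned plan shape are assumptions made for illustration.

```python
def export_plan(arguments: dict) -> dict:
    """Decide whether an export call may touch the filesystem."""
    # Anything other than the literal boolean True means "do not write".
    if arguments.get("write", False) is not True:
        return {"action": "return_inline"}
    output_path = arguments.get("output_path")  # hypothetical argument name
    if not isinstance(output_path, str) or not output_path:
        raise ValueError("write: true requires a target path")
    return {"action": "write", "output_path": output_path}
```

With this shape, omitting write always yields an inline result, and a write request without a target is an error and never falls back to a default location.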
puml_normalize
Input:
Output:
Purpose:
Return canonical source or a patch.
puml_apply_edit
Input:
Output:
Purpose:
Let agents perform structural edits without fragile string surgery.
puml_explain
Input:
Output:
Purpose:
Return a structured explanation of a diagram for agents and docs.
puml_list_primitives
Input:
Output:
Purpose:
Give agents the supported syntax catalog without burning prompt tokens in the base skill.
puml_lint_readability
Input:
Output:
Purpose:
Non-blocking readability advice separate from compiler diagnostics.
MCP resources
Resources:
puml://syntax/spec
puml://syntax/arrows
puml://syntax/participants
puml://syntax/notes
puml://syntax/groups
puml://syntax/lifecycle
puml://syntax/styling
puml://examples/hello
puml://examples/login-flow
puml://examples/async-job
puml://examples/lifecycle
puml://themes/default
puml://themes/dark
puml://themes/minimal
puml://project/diagrams
puml://project/diagnostics
Rules:
- Resources are read-only.
- Project resources are limited to allowed roots.
- Large resources paginate or summarize.
- Syntax resources reflect the actual compiled primitive catalog.
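One way to keep resource reads closed over the catalog is to parse the URI and look it up in a fixed table. A sketch, assuming the table below mirrors the list above; parse_resource_uri is a hypothetical helper, and a real server would derive the table from the compiled primitive catalog.

```python
from urllib.parse import urlsplit

# Mirrors the resource list in this section; hard-coded here only for illustration.
KNOWN_RESOURCES = {
    "syntax": {"spec", "arrows", "participants", "notes", "groups", "lifecycle", "styling"},
    "examples": {"hello", "login-flow", "async-job", "lifecycle"},
    "themes": {"default", "dark", "minimal"},
    "project": {"diagrams", "diagnostics"},
}


def parse_resource_uri(uri: str) -> tuple[str, str]:
    """Split a puml:// URI into (group, name) or raise ValueError if unknown."""
    parts = urlsplit(uri)
    if parts.scheme != "puml":
        raise ValueError(f"unsupported scheme in {uri!r}")
    group, name = parts.netloc, parts.path.lstrip("/")
    if name not in KNOWN_RESOURCES.get(group, set()):
        raise ValueError(f"unknown resource {uri!r}")
    return group, name
```

Because the lookup is exact, a URI such as puml://syntax/../project/diagrams is rejected without any path normalization.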
MCP prompts
Prompts:
write_sequence_diagram
Arguments:
Prompt behavior:
- Ask the model to draft source.
- Require puml_check.
- Require repair loop.
- Require final source.
review_sequence_diagram
Arguments:
Prompt behavior:
- Check syntax.
- Render if valid.
- Review readability.
- Suggest patch.
convert_to_puml
Arguments:
Prompt behavior:
- Convert to supported puml sequence syntax.
- Reject non-sequence diagrams unless user explicitly asks to translate the idea into sequence form.
explain_sequence_diagram
Arguments:
Prompt behavior:
- Use puml_explain.
- Return human summary and flow breakdown.
LSP bundling
The agent pack ships .lsp.json for hosts that can load LSP servers from plugins.
Required config behavior:
- map .puml, .plantuml, .iuml, .pu to language ID puml
- start bundled puml-lsp
- pass include roots through environment/config when host supports it
- avoid absolute paths
- use plugin-root variables where supported
LSP is not optional in Claude Code package if the host supports it. It is the mechanism that lets agents see diagnostics as they edit.
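The config requirements above can be checked against a parsed .lsp.json. The shape used here (a puml entry with command and extensionToLanguage keys, and a ${CLAUDE_PLUGIN_ROOT} variable) is an assumption modeled on the Claude Code plugin format; confirm it against the host reference before relying on it.

```python
from pathlib import PurePosixPath, PureWindowsPath

REQUIRED_EXTENSIONS = (".puml", ".plantuml", ".iuml", ".pu")

# Assumed shape of .lsp.json; key names are not confirmed by this spec.
EXAMPLE_LSP_CONFIG = {
    "puml": {
        "command": "${CLAUDE_PLUGIN_ROOT}/bin/puml-lsp",
        "extensionToLanguage": {ext: "puml" for ext in REQUIRED_EXTENSIONS},
    }
}


def is_absolute(command: str) -> bool:
    """True for POSIX or Windows absolute paths; plugin-root variables are not absolute."""
    return PurePosixPath(command).is_absolute() or PureWindowsPath(command).is_absolute()


def lsp_config_problems(config: dict) -> list[str]:
    """List violations of the extension-mapping and no-absolute-path rules."""
    problems = []
    server = config.get("puml", {})
    mapping = server.get("extensionToLanguage", {})
    for ext in REQUIRED_EXTENSIONS:
        if mapping.get(ext) != "puml":
            problems.append(f"{ext} is not mapped to language ID puml")
    command = server.get("command", "")
    if not command:
        problems.append("no server command configured")
    elif is_absolute(command):
        problems.append(f"absolute server path: {command}")
    return problems
```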
Agent behavior contract
When writing a new diagram, the agent must:
- Identify intended diagram as sequence-only.
- Choose participants and aliases.
- Draft .puml source.
- Call puml_check.
- Repair all blocking diagnostics.
- Call puml_render_svg when an artifact is requested.
- Return source and artifact.
When editing an existing diagram, the agent must:
- Preserve existing intent.
- Parse/check current source.
- Apply minimal source edits.
- Re-check.
- Re-render when artifact exists or is requested.
- Explain changes.
When reviewing a diagram, the agent must:
- Run puml_check.
- Separate compiler errors from readability advice.
- Provide concrete patches.
- Avoid vague design criticism.
When the tool is unavailable, the agent must:
- say it could not validate with the compiler
- still follow syntax spec
- avoid claiming render success
Eval suite
The eval suite is mandatory.
Eval categories:
basic_authoring
participant_aliases
arrows
self_messages
found_lost
notes
refs
groups
lifecycle
styling
includes
docs_integration
repair_invalid
convert_from_prose
convert_from_mermaid
readability_review
security_adversarial
Each eval includes:
- task prompt
- expected primitive coverage
- expected source constraints
- required tool calls
- pass/fail checker
- snapshot of final source
- snapshot of diagnostics
- snapshot of SVG when applicable
Pass criteria:
- final source parses
- final source normalizes
- final source renders when render requested
- no unsupported primitives
- output matches requested style/constraints
- no path traversal
- no unapproved writes
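The pass criteria translate directly into a checker over a recorded outcome. A sketch under assumptions: the EvalOutcome fields are hypothetical names for facts the harness would collect from tool results, not an existing puml-agent-evals type.

```python
from dataclasses import dataclass, field


@dataclass
class EvalOutcome:
    """Facts collected from one eval run (field names are illustrative)."""
    parses: bool
    normalizes: bool
    render_requested: bool = False
    rendered: bool = False
    unsupported_primitives: list[str] = field(default_factory=list)
    meets_constraints: bool = True
    path_traversal_attempts: int = 0
    unapproved_writes: int = 0


def failed_criteria(outcome: EvalOutcome) -> list[str]:
    """Return the pass criteria this outcome violates; an empty list means pass."""
    failures = []
    if not outcome.parses:
        failures.append("final source does not parse")
    if not outcome.normalizes:
        failures.append("final source does not normalize")
    if outcome.render_requested and not outcome.rendered:
        failures.append("render was requested but did not succeed")
    if outcome.unsupported_primitives:
        failures.append("unsupported primitives: " + ", ".join(outcome.unsupported_primitives))
    if not outcome.meets_constraints:
        failures.append("output does not match requested style/constraints")
    if outcome.path_traversal_attempts:
        failures.append("path traversal attempted")
    if outcome.unapproved_writes:
        failures.append("unapproved writes performed")
    return failures
```

Returning the list of violated criteria keeps eval reports specific about which rule broke, where a single boolean would not.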
MCP protocol tests
Snapshot protocol transcripts for:
- initialize
- tools/list
- tools/call puml_parse
- tools/call puml_check
- tools/call puml_render_svg
- tools/call puml_export
- tools/call puml_normalize
- tools/call puml_apply_edit
- tools/call puml_explain
- tools/call invalid arguments
- tools/call unknown tool
- resources/list
- resources/read
- prompts/list
- prompts/get
- shutdown
Every tool schema is snapshot-tested.
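A transcript fixture for the stdio transport is a sequence of newline-delimited JSON-RPC messages. The sketch below builds the opening handshake; the method names and protocolVersion follow the MCP specification referenced in this document, while the client name and the sorted-key serialization are choices made here for stable snapshots.

```python
import json


def frame(message: dict) -> bytes:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    # json.dumps escapes newlines inside strings, so a frame never spans lines.
    text = json.dumps(message, separators=(",", ":"), sort_keys=True)
    return text.encode("utf-8") + b"\n"


HANDSHAKE = [
    {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-06-18",
            "capabilities": {},
            # Hypothetical client identity for the test harness.
            "clientInfo": {"name": "puml-transcript-test", "version": "0.0.0"},
        },
    },
    # Notifications carry no "id" and expect no response.
    {"jsonrpc": "2.0", "method": "notifications/initialized"},
    {"jsonrpc": "2.0", "id": 2, "method": "tools/list"},
]

transcript = b"".join(frame(message) for message in HANDSHAKE)
```

Feeding transcript to the server's stdin and snapshotting stdout line by line gives a deterministic record to diff in CI.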
Plugin validation tests
Codex validation:
- manifest parses
- paths are relative
- skills discovered
- MCP config discovered
- assets exist
- default prompts exist
- marketplace metadata complete
Claude validation:
- manifest parses
- skills discovered
- agents discovered
- MCP config discovered
- LSP config discovered
- plugin-root path variables used
- no components incorrectly placed inside .claude-plugin/
Host-specific validation commands should run in CI when available. If a host CLI cannot run in CI, schema tests must cover the manifest shape and path rules.
Security tests
Required adversarial cases:
- ../ path traversal
- absolute path outside workspace
- symlink escape if testable
- remote include
- huge source input
- decompression/large output pressure
- SVG script injection through message text
- XML escape attack
- prompt requesting shell execution
- prompt requesting secret exfiltration
- write without write: true
- output path overwrite attempt
- invalid JSON schema inputs
The MCP server must fail closed.
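The SVG injection cases can be asserted with a structural check on rendered output. This is a test-side sketch, not the renderer's sanitizer: svg_is_script_free is a hypothetical helper, and it uses the standard-library parser, which is acceptable for output the server produced itself but is not hardened for untrusted XML.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def svg_is_script_free(svg: str) -> bool:
    """Return False for script elements, event-handler attributes, or javascript: links."""
    try:
        root = ET.fromstring(svg)
    except ET.ParseError:
        return False  # fail closed on malformed output
    for element in root.iter():
        # Tags and attributes may carry a "{namespace}" prefix; compare local names.
        if element.tag.rsplit("}", 1)[-1].lower() == "script":
            return False
        for attr, value in element.attrib.items():
            name = attr.rsplit("}", 1)[-1].lower()
            if name.startswith("on"):
                return False
            if name == "href" and value.strip().lower().startswith("javascript:"):
                return False
    return True


# Message text must reach the SVG as escaped character data, never as markup.
label = escape("<script>alert(1)</script>")
safe_svg = f'<svg xmlns="http://www.w3.org/2000/svg"><text>{label}</text></svg>'
```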
Coverage
Minimum:
90% line coverage for puml-mcp
90% line coverage for shared plugin tooling scripts
all MCP tools covered
all resources covered
all prompts covered by evals
all skills covered by evals
all security cases covered
Commands:
cargo test -p puml-mcp
cargo llvm-cov -p puml-mcp --fail-under-lines 90
cargo test -p puml-agent-evals
Do not lower coverage to pass. Add tests.
Distribution
Release artifacts:
- Codex plugin package
- Claude Code plugin package
- standalone puml-mcp binary
- standalone puml-lsp binary
- platform-specific bundles for macOS, Linux, Windows
- checksums
- SBOM if release automation supports it
Binary rules:
- Binaries are signed or checksummed.
- Plugin manifests reference bundled binaries with relative paths.
- No install script downloads arbitrary code at activation time.
- Optional post-install dependency setup must be explicit and documented.
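Checksum verification of a release binary can be sketched as follows. SHA-256 is an assumption; this section does not name the digest algorithm, and the helper names are illustrative.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large bundles are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest to the published value, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()
```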
README requirements
The agent-pack README includes:
- one-line pitch
- supported hosts
- install for Codex
- install for Claude Code
- what the skills do
- MCP tools table
- examples
- security model
- eval strategy
- troubleshooting
- development commands
- release commands
Tone:
- confident
- direct
- no “experimental” framing for core workflows
- no apology section
- no “phase 1” language
Example successful flow
User asks:
Create a sequence diagram for login with browser, API, and database. Render it.
Agent must do:
1. Draft puml source.
2. Call puml_check.
3. Fix diagnostics if any.
4. Call puml_render_svg.
5. Return source and SVG/artifact.
Expected source shape:
@startuml
actor User
boundary Browser
control API
database DB
title Login flow
User -> Browser: Submit credentials
Browser -> API: POST /login
activate API
API -> DB: SELECT user
DB --> API: user row
API --> Browser: session token
deactivate API
Browser --> User: Login complete
@enduml
The exact content can vary. Validity cannot.
Definition of done
puml-agent-pack is done when:
- Codex plugin manifest is valid.
- Claude plugin manifest is valid.
- Shared skills load in both host packages.
- MCP server starts over stdio.
- MCP tools list and call successfully.
- MCP resources list and read successfully.
- MCP prompts list and get successfully.
- puml_check is mandatory in authoring skill workflow.
- puml_render_svg returns valid SVG.
- puml_apply_edit supports core structural edits.
- puml-lsp is bundled/configured for hosts that support LSP plugin config.
- Eval suite covers every sequence primitive.
- Security tests pass.
- Coverage is at least 90%.
- Release bundles include binaries, manifests, skills, MCP config, LSP config, assets, README, LICENSE, and checksums.
- Agents using the pack reliably produce valid sequence diagrams instead of plausible-looking invalid syntax.
v0.0.1 implementation profile (historical baseline)
v0.0.1 originally shipped a constrained plugin/MCP profile that was immediately usable for deterministic authoring and validation loops before LSP support landed in-repo.
Scope for v0.0.1:
- Codex plugin manifest and Claude plugin manifest are included in the spec as concrete templates.
- MCP server contract is scoped to check/render/export workflows that can run on top of the existing puml CLI.
- .lsp.json host wiring metadata is included for hosts that can launch bin/puml-lsp.
- Skills are authored to require puml_check and never claim success without a passing check.
Deferred in v0.0.1
- bundled LSP binaries inside release archives
- host-specific live diagnostics and completions integration
- rename/hover/code-action style editing affordances
- LSP-backed editor integrations inside host IDE surfaces
Required plugin manifests for v0.0.1
Codex manifest template
Claude plugin manifest template
v0.0.1 includes .lsp.json host wiring metadata for bin/puml-lsp; release archives still need bundled LSP binaries before this becomes a zero-config plugin experience.
MCP tool contract for v0.0.1
Minimum required tools:
- puml_check
  - input: diagram text or file path in workspace root
  - output: { ok, diagnostics[] }
  - exits non-zero when ok=false
- puml_render_svg
  - input: diagram text or file path
  - output: { ok, svg, width, height, diagnostics[] }
- puml_render_file
  - input: diagram text or file path + output path
  - output: { ok, output_path, diagnostics[] }
Optional (recommended):
- puml_dump_ast
- puml_dump_model
- puml_dump_scene
Deterministic repair loop requirement (v0.0.1)
Every authoring flow in Codex and Claude must enforce:
- draft/update .puml
- run puml_check
- repair until diagnostics are empty
- render SVG only after a passing check
- return source plus render artifact/path
If check fails, the workflow cannot claim completion.
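The loop above can be sketched with the checker and the repair step injected as callables. Here check stands in for a puml_check tool call and repair for the agent's edit; both, along with the round limit, are assumptions for illustration. The stub checker in the usage lines is deliberately trivial and is not a parser.

```python
from typing import Callable

Diagnostics = list[dict]


def repair_loop(
    source: str,
    check: Callable[[str], Diagnostics],
    repair: Callable[[str, Diagnostics], str],
    max_rounds: int = 5,
) -> tuple[str, bool]:
    """Re-check after every repair; report success only when diagnostics are empty."""
    diagnostics = check(source)
    rounds = 0
    while diagnostics and rounds < max_rounds:
        source = repair(source, diagnostics)
        diagnostics = check(source)
        rounds += 1
    return source, not diagnostics


# Stand-ins for the real tool call and the agent's repair step.
def stub_check(source: str) -> Diagnostics:
    if source.rstrip().endswith("@enduml"):
        return []
    return [{"message": "missing @enduml"}]


def stub_repair(source: str, diagnostics: Diagnostics) -> str:
    return source.rstrip() + "\n@enduml\n"


fixed, ok = repair_loop("@startuml\nA -> B: ping\n", stub_check, stub_repair)
```

The boolean is what gates rendering: when it is False after the round limit, the workflow reports the remaining diagnostics and does not claim completion.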
Reference anchors
- Codex plugins: https://developers.openai.com/codex/plugins
- Codex build plugins: https://developers.openai.com/codex/plugins/build
- Codex skills: https://developers.openai.com/codex/skills
- Codex MCP: https://developers.openai.com/codex/mcp
- Claude Code plugins reference: https://code.claude.com/docs/en/plugins-reference
- Claude Code skills: https://code.claude.com/docs/en/skills
- MCP specification: https://modelcontextprotocol.io/specification/2025-06-18
- MCP tools: https://modelcontextprotocol.io/specification/2025-06-18/server/tools
- MCP resources: https://modelcontextprotocol.io/specification/2025-06-18/server/resources
- MCP prompts: https://modelcontextprotocol.io/specification/2025-06-18/server/prompts