mcp-workshop

by DGitHub · Updated May 4, 2026

Pure-compute utilities for AI agents: chunk text by tokens, extract JSON from messy LLM output, validate against JSON Schema, analyze logs, convert timestamps across timezones, compute hashes and encodings. No API keys, no setup, no external dependencies — every tool runs on pure compute. Built to compose: extract JSON, validate it, chunk the result, hash the chunks. The tool list grows based on what users actually request via request_tool.
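The extract → validate → chunk → hash composition can be sketched locally with stdlib stand-ins. The helper names, character-based chunking, and minimal key check below are illustrative only, not the server's actual tool signatures:

```python
import hashlib
import json

def extract_json(text: str):
    """Return the first valid JSON object or array found in messy text."""
    decoder = json.JSONDecoder()
    for i, ch in enumerate(text):
        if ch in "{[":
            try:
                value, _ = decoder.raw_decode(text, i)
                return value
            except json.JSONDecodeError:
                continue
    return None

def chunk_text(text: str, size: int = 200):
    """Split text into fixed-size character chunks (a rough token stand-in)."""
    return [text[i:i + size] for i in range(0, len(text), size)]

# extract -> validate -> chunk -> hash
messy = 'Model said: ```json\n{"name": "mcp-workshop", "tools": 11}\n``` done.'
data = extract_json(messy)
assert isinstance(data, dict) and "name" in data  # minimal validation stand-in
chunks = chunk_text(json.dumps(data), size=16)
digests = [hashlib.sha256(c.encode()).hexdigest() for c in chunks]
```

Each step consumes the previous step's output, which is the composition property the server is designed around.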

text-processing
data-extraction
string-manipulation

Overview

mcp-workshop is an MCP server exposing 11 utility tools for text processing, data validation, log analysis, cryptographic operations, and developer workflow support. Its capabilities span diffing, chunking, extraction, truncation, schema validation, regex matching, JSON log parsing, hash computation, and timestamp conversion. A standout feature is request_tool — a built-in feedback primitive that lets any agent request new tools directly from within a workflow, with submissions logged and prioritized by demand. This makes mcp-workshop not just a static utility server but a living, agent-driven platform that grows based on real usage. It enables programmatic operations without custom scripting, integrating directly into AI workflows or data pipelines via the MCP protocol. All tools are pure compute — no API key required.

Key Capabilities

  • text_diff: Generates a unified diff highlighting changes between two input strings, including added/removed line counts.
  • text_chunk: Splits long text into token-sized chunks with configurable size and optional overlap between consecutive chunks.
  • text_extract_json: Identifies and extracts the first valid JSON object or array from messy or mixed text, such as LLM output wrapped in markdown fences or prose.
  • text_regex_extract: Applies a regular expression to capture all matches from text, returning each match with its position and capture groups.
  • text_truncate: Trims text to fit within a token budget without cutting mid-word, with control over which end to truncate from.
  • validate_json_schema: Validates a JSON value against a JSON Schema (drafts 4, 7, 2019-09, 2020-12), returning structured errors with path and reason for each violation.
  • parse_json_logs: Analyzes JSON or NDJSON log data and returns a structured summary including level counts, top error messages, timeline span, and anomalies.
  • hash_compute: Computes cryptographic hashes (SHA-256, SHA-384, SHA-512, SHA-1) or performs encode/decode operations on input data.
  • time_convert: Converts a timestamp between timezones and formats for cross-environment compatibility.
  • request_tool: Lets agents submit requests for tools the server doesn't yet have; submissions are logged and prioritized by demand.
  • list_requests: Lists all tool requests submitted via request_tool, newest first, for visibility into agent-driven development feedback.
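As a rough illustration of the kind of summarization parse_json_logs performs, here is a minimal stdlib sketch over NDJSON input. The field names (`level`, `msg`) are assumptions about the log schema, not the server's actual contract:

```python
import json
from collections import Counter

def summarize_ndjson(raw: str) -> dict:
    """Count log levels and surface top error messages from NDJSON text."""
    levels, errors = Counter(), Counter()
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON lines rather than failing the whole batch
        level = str(entry.get("level", "unknown")).lower()
        levels[level] += 1
        if level == "error":
            errors[entry.get("msg", "")] += 1
    return {"levels": dict(levels), "top_errors": errors.most_common(3)}

logs = "\n".join([
    '{"level": "info",  "msg": "started"}',
    '{"level": "error", "msg": "timeout"}',
    '{"level": "error", "msg": "timeout"}',
    'not json at all',
])
summary = summarize_ndjson(logs)
# summary["levels"] -> {"info": 1, "error": 2}; top error is ("timeout", 2)
```

Tolerating malformed lines instead of aborting is what makes this kind of summarizer usable on real, messy server logs.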

Use Cases

  1. Code Review Automation: Use text_diff to compare file versions in CI/CD pipelines, feeding diffs to LLMs for change summaries or review flagging.
  2. Document Processing for RAG: Apply text_chunk to split large PDFs or articles into LLM-friendly segments before vectorization, with overlap to preserve context at boundaries.
  3. Data Parsing from Logs: Combine parse_json_logs, text_extract_json, and text_regex_extract to pull structured data and surface anomalies from unstructured server logs.
  4. LLM Output Handling: Use text_extract_json paired with validate_json_schema for the standard extract → validate → branch pipeline pattern when working with model-generated structured output.
  5. Context Window Management: Employ text_truncate to enforce token limits on dynamic inputs before sending to an LLM, ensuring prompts fit within budget without mid-word cuts.
  6. Security & Data Integrity: Use hash_compute to verify file integrity, fingerprint content, or detect duplicate payloads within agent pipelines.
  7. Multi-Timezone Pipelines: Apply time_convert to normalize timestamps from disparate sources before logging, comparing, or displaying event data.
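A minimal sketch of the truncation idea in use case 5, using whitespace-separated words as a stand-in for tokens. The real tool counts model tokens, and the `side` parameter name here is an assumption:

```python
def truncate_words(text: str, budget: int, side: str = "end") -> str:
    """Trim text to a word budget without cutting mid-word.

    side="end" drops words from the end; side="start" drops from the start.
    """
    words = text.split()
    if len(words) <= budget:
        return text
    kept = words[:budget] if side == "end" else words[-budget:]
    return " ".join(kept)

prompt = "one two three four five six"
truncate_words(prompt, 3)                # -> "one two three"
truncate_words(prompt, 3, side="start")  # -> "four five six"
```

Enforcing the budget on whole words (or whole tokens, in the real tool) is what keeps truncated prompts readable at the cut point.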

Who This Is For

Backend developers building text-heavy apps, data engineers cleaning and validating datasets, AI engineers managing LLM context and output parsing, and platform builders embedding utility operations into agent workflows. It suits scenarios requiring reliable, on-demand compute primitives without installing libraries or managing infrastructure — callable directly over MCP from any compatible client. Because any agent can call request_tool to flag missing functionality, the answer to "who this is for" is expanding over time: the server's capabilities are shaped by the agents and developers that use it, with the most-requested tools getting built first.