Product · CLI

The agent CLI that doesn't need the internet.

Meshly CLI runs the Claude Code workflow against the model server your team operates. A shared vLLM cluster, an in-house inference gateway, workstation Ollama if you're solo. The agent loop and your code stay inside your network. Nothing leaves the perimeter. The agent inside is called Ilves. Finnish for lynx. Local, watchful, doesn't need permission to hunt.
~/code/ledger-svc · meshly
AIR-GAPPED
● ilves · qwen2.5-coder:32b · inference.gpu.internal
Plan a refactor of apps/api/src/auth/refresh.ts
↳ reading 8 files · proposing 3 edits · 0 calls outside the perimeter
Edit · apps/api/src/auth/refresh.ts
↳ +28 −12
Bash · pnpm test --filter @meshly/auth
↳ 84 passing · 0 failing
✓ done · all traffic stayed inside the perimeter
01 · Your model. Your endpoint.

Whatever model server your team already runs.

Most teams have a centralized inference server on their own GPUs: vLLM or TGI behind an internal endpoint, sometimes routed through a governance gateway. Meshly CLI points at that endpoint. Workstation Ollama works too if you're solo. Anything that speaks OpenAI-compatible HTTP. The CLI handles the agent loop; you choose the brain and where it lives.

  • Shared inference: vLLM, TGI, or any OpenAI-compatible server on your GPU cluster.
  • Through your gateway: route via your existing model proxy or governance layer.
  • Workstation: Ollama or llama.cpp on a laptop for solo / offline work.
LOCAL MODEL RUNTIME
vLLM / TGI
shared GPU cluster
MOST COMMON
Any OpenAI-compatible endpoint
internal gateway, private proxy
Ollama
workstation or LAN
llama.cpp
embedded server
$ meshly --model qwen2.5-coder:32b "fix the test for expired tokens"
↳ connects to inference.gpu.internal · no API key · stays in your network
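Under the hood, "OpenAI-compatible" means the server accepts a standard chat-completions request. A minimal sketch of that request shape in Python, reusing the endpoint and model from the example above; the exact fields Meshly sends are an assumption, this is just the common wire format that vLLM, TGI, and Ollama all accept:

```python
import json

# Internal endpoint from the example above; no API key needed inside the
# perimeter. Any OpenAI-compatible server exposes /v1/chat/completions.
ENDPOINT = "https://inference.gpu.internal/v1/chat/completions"

def build_request(model: str, prompt: str) -> dict:
    """Build a standard OpenAI-compatible chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,  # agent loops typically stream tokens back
    }

payload = build_request("qwen2.5-coder:32b", "fix the test for expired tokens")
print(json.dumps(payload, indent=2))
```

Because the payload is the same for every backend, swapping vLLM for Ollama is a config change, not a code change.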
02 · Same workflow as Claude Code

Same muscle memory. Different brain.

Meshly CLI mirrors the Claude Code workflow: same tool surface (Read / Edit / Write / Bash / Glob / Grep), same task-oriented agent patterns, same MCP server extension model. It does not call api.anthropic.com. It does not speak the Anthropic Messages API. It simply lets your engineers apply the muscle memory they already have to the model server you point it at.

  • Workflow-compatible, not protocol-compatible. The Anthropic API isn't in the picture.
  • Same tool surface as Claude Code. Same prompts, same patterns, same MCP servers.
  • Engineers who use Claude Code switch over in a session, not a sprint.
CLAUDE CODE
cloud model
calls api.anthropic.com
MESHLY CLI
your model
calls your inference endpoint
SAME WORKFLOW · SAME TOOLS
Read / Edit / Write / Glob / Grep tools
Bash execution with permission modes
MCP servers for extension
Project-scoped system prompts
Task / agent / subagent patterns
Tool allowlists per session
03 · Air-gapped by design

No phone-home. No public-internet calls.

The only outbound call Meshly CLI makes is to the inference endpoint you configure. No telemetry, no auto-update pings, no anonymous usage stats, no 'check for new release' beacons. Bring it in on a USB drive, run it on a closed network, sleep at night.

  • Zero public-internet requirement. All traffic stays inside your perimeter.
  • No telemetry, no metrics export, no crash reporting to anyone but you.
  • Reproducible offline installs: vendored dependencies, signed binaries, deterministic builds.
Meshly CLI
workstation, CI runner, dev box
HTTPS
Inference endpoint
vLLM / TGI / Ollama, on your GPU cluster or LAN
inside the perimeter
AIR GAP
No public-internet calls. No telemetry. No phone-home. The CLI only talks to the endpoint you configure.
Your code stays where you put it.
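The air-gap guarantee reduces to one checkable property: the only host the process ever connects to is the configured endpoint. A minimal sketch of an egress guard that enforces that property; the guard itself is an illustration of the invariant, not Meshly's actual code:

```python
from urllib.parse import urlparse

# Illustrative egress guard: every outbound URL must match the scheme and
# hostname of the single configured inference endpoint. A sketch of the
# air-gap invariant, not Meshly's implementation.
CONFIGURED_ENDPOINT = "https://inference.gpu.internal/v1"

def egress_allowed(url: str) -> bool:
    want = urlparse(CONFIGURED_ENDPOINT)
    got = urlparse(url)
    return (got.scheme, got.hostname) == (want.scheme, want.hostname)

print(egress_allowed("https://inference.gpu.internal/v1/chat/completions"))
print(egress_allowed("https://api.anthropic.com/v1/messages"))
```

Anything that isn't the configured endpoint, telemetry, update checks, crash reporting, fails the same check.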
04 · Pairs with Server when you want

Works standalone. Better together.

Meshly CLI is a complete coding agent on its own. When you also use Meshly Build Server, the CLI dispatches tasks from the shared queue, streams output back to the dashboard, and pushes architecture scans for the team. Same role as Station, just terminal-native.

  • Standalone mode: your inference endpoint, your repo, no Server required.
  • Connected mode: same task / agent / audit surface as Station, driven from a terminal.
  • One config file. Switch between standalone and connected without reinstalling.
CONFIGURATION · ~/.meshly/config.toml
# your inference endpoint
model = "qwen2.5-coder:32b"
endpoint = "https://inference.gpu.internal/v1"
# optional · pair with Meshly Build Server
[server]
url = "https://server.meshly.build"
auth = "sso"
STANDALONE
Your inference endpoint + your repo. No Server needed.
WITH SERVER
Dispatch tasks, push results to the shared workspace.
Who needs this

For the teams that can't ship code to cloud AI.

Cloud-AI coding agents work great until your compliance officer reads the data processing addendum. Meshly CLI is the answer for environments where the cloud isn't an option. Bring the agent to the data, not the other way around.

Defence + government
Classified networks. Code and prompts cannot leave the perimeter.
Finance + banking
Data sovereignty. Strict outbound-network policies. Audit-bound.
Healthcare
PHI must not reach cloud AI. HIPAA-bound environments.
Regulated R&D
Pre-release IP, patent-pending code, legal-sensitive work.
On-prem development teams
Locked-down workstations, controlled internet egress, managed dev environments.
Three pieces. One platform.

CLI runs the agents where the cloud can't go.

CLI · AIR-GAPPED
Local-AI runner
Terminal-native, model-agnostic, internet-optional. For environments where cloud AI isn't an option.
STATION · DESKTOP
GUI runner
Tauri app on each operator's machine. Spawns Claude Code agents, scans architecture, streams output to Server. Read →
SERVER · MANAGED
Control plane
The shared workspace. Boards, agents, review, audit, knowledge, operated by Meshly. Read →

Available by request.

Meshly CLI isn't a public download. Tell us about your environment, what models you run, what restrictions you operate under, and we'll work out whether it fits and get you a build with the right configuration.