Product · CLI

The agent CLI that doesn't need the internet.

Meshly CLI runs the Claude Code workflow against the model server your team operates. A shared vLLM cluster, an in-house inference gateway, workstation Ollama if you're solo. The agent loop and your code stay inside your network. Nothing leaves the perimeter. The agent inside is called Ilves. Finnish for lynx. Local, watchful, doesn't need permission to hunt.
~/code/ledger-svc · meshly
AIR-GAPPED
● ilves · qwen2.5-coder:32b · inference.gpu.internal
Plan a refactor of apps/api/src/auth/refresh.ts
↳ reading 8 files · proposing 3 edits · 0 calls outside the perimeter
Edit · apps/api/src/auth/refresh.ts
↳ +28 −12
Bash · pnpm test --filter @meshly/auth
↳ 84 passing · 0 failing
✓ done · all traffic stayed inside the perimeter
01 · Your model. Your endpoint.

Whatever model server your team already runs.

Most teams have a centralized inference server on their own GPUs: vLLM or TGI behind an internal endpoint, sometimes routed through a governance gateway. Meshly CLI points at that endpoint. Workstation Ollama works too if you're solo. Anything that speaks OpenAI-compatible HTTP. The CLI handles the agent loop; you choose the brain and where it lives.

  • Shared inference: vLLM, TGI, or any OpenAI-compatible server on your GPU cluster.
  • Through your gateway: route via your existing model proxy or governance layer.
  • Workstation: Ollama or llama.cpp on a laptop for solo / offline work.
LOCAL MODEL RUNTIME
vLLM / TGI
shared GPU cluster
MOST COMMON
Any OpenAI-compatible endpoint
internal gateway, private proxy
Ollama
workstation or LAN
llama.cpp
embedded server
$ meshly --model qwen2.5-coder:32b "fix the test for expired tokens"
↳ connects to inference.gpu.internal · no API key · stays in your network
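Under the hood, "OpenAI-compatible" means the server accepts a standard chat-completions request. A minimal sketch of that request shape in Python, reusing the endpoint and model from the example above; the exact fields Meshly sends are an assumption, this is just the common wire format that vLLM, TGI, and Ollama all accept:

```python
import json

# Internal endpoint from the example above; no API key needed inside the
# perimeter. Any OpenAI-compatible server exposes /v1/chat/completions.
ENDPOINT = "https://inference.gpu.internal/v1/chat/completions"

def build_request(model: str, prompt: str) -> dict:
    """Build a standard OpenAI-compatible chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,  # agent loops typically stream tokens back
    }

payload = build_request("qwen2.5-coder:32b", "fix the test for expired tokens")
print(json.dumps(payload, indent=2))
```

Because the payload is the same for every backend, swapping vLLM for Ollama is a config change, not a code change.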
02 · Same workflow as Claude Code

Same muscle memory. Different brain.

Meshly CLI mirrors the Claude Code workflow: same tool surface (Read / Edit / Write / Bash / Glob / Grep), same task-oriented agent patterns, same MCP server extension model. It does not call api.anthropic.com. It does not speak the Anthropic Messages API. It simply lets your engineers apply the muscle memory they already have to the model server you point it at.

  • Workflow-compatible, not protocol-compatible. The Anthropic API isn't in the picture.
  • Same tool surface as Claude Code. Same prompts, same patterns, same MCP servers.
  • Engineers who use Claude Code switch over in a session, not a sprint.
CLAUDE CODE
cloud model
calls api.anthropic.com
MESHLY CLI
your model
calls your inference endpoint
SAME WORKFLOW · SAME TOOLS
Read / Edit / Write / Glob / Grep tools
Bash execution with permission modes
MCP servers for extension
Project-scoped system prompts
Task / agent / subagent patterns
Tool allowlists per session
03 · Air-gapped by design

No phone-home. No public-internet calls.

The only outbound call Meshly CLI makes is to the inference endpoint you configure. No telemetry, no auto-update pings, no anonymous usage stats, no 'check for new release' beacons. Bring it in on a USB drive, run it on a closed network, sleep at night.

  • Zero public-internet requirement. All traffic stays inside your perimeter.
  • No telemetry, no metrics export, no crash reporting to anyone but you.
  • Reproducible offline installs: vendored dependencies, signed binaries, deterministic builds.
Meshly CLI
workstation, CI runner, dev box
HTTPS
Inference endpoint
vLLM / TGI / Ollama, on your GPU cluster or LAN
inside the perimeter
AIR GAP
No public-internet calls. No telemetry. No phone-home. The CLI only talks to the endpoint you configure.
Your code stays where you put it.
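The air-gap guarantee reduces to one checkable property: the only host the process ever connects to is the configured endpoint. A minimal sketch of an egress guard that enforces that property; the guard itself is an illustration of the invariant, not Meshly's actual code:

```python
from urllib.parse import urlparse

# Illustrative egress guard: every outbound URL must match the scheme and
# hostname of the single configured inference endpoint. A sketch of the
# air-gap invariant, not Meshly's implementation.
CONFIGURED_ENDPOINT = "https://inference.gpu.internal/v1"

def egress_allowed(url: str) -> bool:
    want = urlparse(CONFIGURED_ENDPOINT)
    got = urlparse(url)
    return (got.scheme, got.hostname) == (want.scheme, want.hostname)

print(egress_allowed("https://inference.gpu.internal/v1/chat/completions"))
print(egress_allowed("https://api.anthropic.com/v1/messages"))
```

Anything that isn't the configured endpoint, telemetry, update checks, crash reporting, fails the same check.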
04 · Pairs with Server when you want

Works standalone. Better together.

Meshly CLI is a complete coding agent on its own. When you also use Meshly Build Server, the CLI dispatches tasks from the shared queue, streams output back to the dashboard, and pushes architecture scans for the team. Same role as Station, just terminal-native.

  • Standalone mode: your inference endpoint, your repo, no Server required.
  • Connected mode: same task / agent / audit surface as Station, driven from a terminal.
  • One config file. Switch between standalone and connected without reinstalling.
CONFIGURATION · ~/.meshly/config.toml
# your inference endpoint
model = "qwen2.5-coder:32b"
endpoint = "https://inference.gpu.internal/v1"
# optional · pair with Meshly Build Server
[server]
url = "https://server.meshly.build"
auth = "sso"
STANDALONE
Your inference endpoint + your repo. No Server needed.
WITH SERVER
Dispatch tasks, push results to the shared workspace.
Who needs this

For the teams that can't ship code to cloud AI.

Cloud-AI coding agents work great until your compliance officer reads the data processing addendum. Meshly CLI is the answer for environments where the cloud isn't an option. Bring the agent to the data, not the other way around.

Defence + government
Classified networks. Code and prompts cannot leave the perimeter.
Finance + banking
Data sovereignty. Strict outbound-network policies. Audit-bound.
Healthcare
PHI must not reach cloud AI. HIPAA-bound environments.
Regulated R&D
Pre-release IP, patent-pending code, legal-sensitive work.
On-prem development teams
Locked-down workstations, controlled internet egress, managed dev environments.
Three pieces. One platform.

CLI runs the agents where the cloud can't go.

CLI · AIR-GAPPED
Local-AI runner
Terminal-native, model-agnostic, internet-optional. For environments where cloud AI isn't an option.
STATION · DESKTOP
GUI runner
Tauri app on each operator's machine. Spawns Claude Code agents, scans architecture, streams output to Server. Read →
SERVER · MANAGED
Control plane
The shared workspace. Boards, agents, review, audit, knowledge, operated by Meshly. Read →

Available by request.

Meshly CLI isn't a public download. Tell us about your environment, what models you run, what restrictions you operate under, and we'll work out whether it fits and get you a build with the right configuration.