Cli-Over-Https on network-notes

CLI Over HTTPS Part 4: Where Do We Go from Here?

brett@network-notes.com (Brett Lykins) — Thu, 07 May 2026 09:00:00 +0000

This is the last post in the series. Part 1 explained why SSH is slow for automation. Part 2 measured it. HTTPS batch is up to 17x faster at real-world latencies. Part 3 showed that an edge proxy and a transparent tunnel capture most of that improvement even when devices don’t speak HTTPS natively.

This post is the practical takeaway: when to use what, how to deploy it, and what the industry should build next.

The Decision Framework

Not every network needs a proxy. Not every automation run is latency-sensitive. Here’s how to think about it:

SSH Direct Is Fine When

Your automation server is co-located with the devices (same DC, same site)
Round-trip time to the devices is under ~5ms
You’re managing fewer than a few hundred devices
You’re already using SSH multiplexing (ControlMaster) or persistent connections

At local latency, SSH overhead is measured in single-digit milliseconds. The protocol tax from Part 1 is real but negligible. Don’t add architectural complexity to save 3ms per device.

Deploy a Proxy When

Your automation runs over a WAN (30ms+ RTT to the devices)
You manage devices across multiple sites, regions, or continents
Automation run time is operationally painful (the Rackspace problem from Part 1)
You’re already running regional infrastructure (jump hosts, bastion servers, Ansible Tower nodes)
You can modify your automation to speak HTTPS instead of SSH

The proxy pattern from Part 3 showed a 5.3x improvement with a new connection per request and 14.7x with connection reuse at 150ms RTT. If you already have bastion hosts in each region, you’re halfway there. The proxy is a bastion that speaks HTTPS instead of (or in addition to) SSH.

Deploy a Tunnel When

You need WAN optimization but can’t change your automation tooling
Your team has years of Ansible playbooks, Nornir scripts, or Netmiko wrappers that must speak SSH
You want a migration path: tunnel first, proxy later

The tunnel from Part 3 is transparent to both sides: automation speaks SSH to the headend, the device sees SSH from the site proxy. With command batching, it hits 12.7x speedup at intercontinental latency. Without batching, it’s still 3.0x faster than SSH direct.

Deploy NAAS When

You want a production-ready proxy with multi-vendor support out of the box
You need connection pooling, async jobs, and circuit breakers
You manage devices across 100+ platforms (everything Netmiko supports)

NAAS (Netmiko as a Service) implements the proxy pattern with production concerns handled. Deploy an instance per region, point your automation at it, and the SSH stays local. More on NAAS below.

Push for Native HTTPS When

You’re evaluating new platforms or vendors
You have influence over vendor roadmaps (large deployments, design partners)
You’re building internal tooling that could expose CLI over HTTPS

Native HTTPS eliminates the proxy entirely. The ~17x batch improvement from Part 2 is the ceiling. No intermediate hop, no backend SSH overhead. If a vendor offers it, use it.
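
The framework above can be condensed into a small function. This is only a sketch: the type and field names are hypothetical, the 5ms and 30ms thresholds come from the lists above, and the "measure first" answer for the range in between is an assumption, since the framework does not cover it. NAAS is not a separate branch because it is one implementation of the proxy option.

```go
package main

import "fmt"

// Env holds the questions the framework asks. Field names are hypothetical.
type Env struct {
	RTTMs               float64 // round-trip time from automation server to devices, in ms
	NativeHTTPS         bool    // devices expose CLI over HTTPS themselves
	CanChangeAutomation bool    // automation can be changed to speak HTTPS
}

// Recommend maps an environment to one of the options discussed above.
func Recommend(e Env) string {
	switch {
	case e.NativeHTTPS:
		return "native HTTPS"
	case e.RTTMs < 5:
		return "SSH direct"
	case e.RTTMs < 30:
		// Assumption: the framework is silent between ~5ms and 30ms.
		return "measure first"
	case e.CanChangeAutomation:
		return "proxy"
	default:
		return "tunnel"
	}
}

func main() {
	fmt.Println(Recommend(Env{RTTMs: 150, CanChangeAutomation: true}))
	fmt.Println(Recommend(Env{RTTMs: 150}))
}
```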

NAAS: The Production Proxy

NAAS (Netmiko as a Service) is what the proxy pattern looks like when you build it for production. Written in Python, it wraps Netmiko behind a REST API, which means it supports 100+ device platforms: Cisco IOS, NX-OS, ASA, Juniper Junos, Arista EOS, Palo Alto, and everything else Netmiko handles.

You POST a JSON payload with the device address, platform, credentials, and commands. NAAS opens the SSH session, runs the commands, and returns the output in the HTTP response:

curl -k -X POST https://naas.dc1.example.com:8443/v1/send_command \
 -u "automation:token" \
 -H "Content-Type: application/json" \
 -d '{"host": "10.1.1.1", "platform": "cisco_ios", "commands": ["show version", "show ip route"]}'
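
The same request can be built from Go. This is a minimal sketch that mirrors the curl example: the endpoint path and JSON fields are taken from it, while the type and function names are hypothetical. It only constructs the request; sending it needs an http.Client whose TLS configuration trusts the NAAS certificate (the curl example skips verification with -k).

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// sendCommandRequest mirrors the JSON body in the curl example above.
type sendCommandRequest struct {
	Host     string   `json:"host"`
	Platform string   `json:"platform"`
	Commands []string `json:"commands"`
}

// newSendCommand builds (but does not send) the POST request.
func newSendCommand(baseURL, user, token string, body sendCommandRequest) (*http.Request, error) {
	payload, err := json.Marshal(body)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/v1/send_command", bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	req.SetBasicAuth(user, token)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newSendCommand("https://naas.dc1.example.com:8443", "automation", "token",
		sendCommandRequest{
			Host:     "10.1.1.1",
			Platform: "cisco_ios",
			Commands: []string{"show version", "show ip route"},
		})
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL)
}
```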

What NAAS handles that a minimal proxy doesn’t:

Multi-vendor connection pooling. Persistent SSH connections with health checks and automatic reconnection.
Async job queue. Long-running commands (show tech-support, bulk config pushes) run in a Redis-backed queue. Your automation gets a job ID back immediately and polls for results.
Circuit breaker and observability. Stops hammering unreachable devices, exposes Prometheus metrics for connection pool health and per-device latency.

Deploy a NAAS instance in each data center or region, and your automation talks HTTPS to the nearest one. The SSH sessions stay local. The architecture is the same as what the benchmarks in Part 3 measured: HTTPS over the WAN, SSH on the last hop.

git clone https://github.com/lykinsbd/naas.git && cd naas
docker compose up -d
curl -k -X POST https://localhost:8443/v1/send_command \
 -u "username:password" \
 -H "Content-Type: application/json" \
 -d '{"host": "10.1.1.1", "platform": "cisco_ios", "commands": ["show version"]}'

See the NAAS getting started guide for full setup and configuration.

What Exists and What’s Missing

Some of the pieces are already in place.

Arista’s eAPI accepts CLI commands via JSON-RPC over HTTPS. It wraps everything in JSON, but the core pattern is there: send commands over HTTPS, get output back. The ASA interface and eAPI have been in production for years. NAAS (described above) brings the proxy pattern to the 100+ platforms Netmiko supports. The clibench tunnel mode demonstrates the transparent SSH-to-HTTP approach for teams that can’t change their automation tooling.

What’s missing:

A standard CLI-over-HTTPS interface. Not RESTCONF, not gNMI. Those are structured data interfaces for a different use case. A simple, standardized way to send CLI commands over HTTPS and get text output back. The ASA pattern is a reasonable starting point: GET /cli/exec/{command} for show commands, POST /cli/config for configuration. Basic auth or token auth over TLS. Content-Type: text/plain. No JSON wrapping unless the client asks for it. Arista’s eAPI is the closest thing to this, but it’s vendor-specific and JSON-only.
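
To make "no JSON wrapping unless the client asks for it" concrete, here is one way a server could negotiate the response format. It is a sketch of the proposed behavior, not any vendor's implementation: the function names, the use of the Accept header, and the JSON field name "output" are all assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// writeCLIOutput returns plain text by default and JSON only when the
// client asks for it through the Accept header.
func writeCLIOutput(w http.ResponseWriter, r *http.Request, output string) {
	if strings.Contains(r.Header.Get("Accept"), "application/json") {
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(map[string]string{"output": output})
		return
	}
	w.Header().Set("Content-Type", "text/plain")
	fmt.Fprint(w, output)
}

// render exercises writeCLIOutput in memory and returns the response
// content type and body.
func render(accept, output string) (string, string) {
	req := httptest.NewRequest(http.MethodGet, "/cli/exec/show+version", nil)
	if accept != "" {
		req.Header.Set("Accept", accept)
	}
	rec := httptest.NewRecorder()
	writeCLIOutput(rec, req, output)
	return rec.Header().Get("Content-Type"), rec.Body.String()
}

func main() {
	ct, body := render("", "uptime is 1 week")
	fmt.Println(ct, body)
	ct, body = render("application/json", "uptime is 1 week")
	fmt.Println(ct, body)
}
```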

Proxy support in the automation ecosystem. Ansible could ship a connection plugin that talks HTTPS to a proxy like NAAS instead of SSH to the device. Nornir could support an HTTP transport alongside Paramiko and Netmiko. NAAS works today as a standalone API, and a native connection plugin would make adoption even easier: a configuration option instead of a code change.

Broader vendor adoption. Every network OS already has an HTTPS server for its web UI. Exposing the CLI through that same server is not a large engineering effort. The ASA proves the concept. A plain-text CLI endpoint alongside the structured API would cover both use cases.

None of this requires abandoning SSH. SSH remains the right tool for interactive sessions, for out-of-band recovery, for environments where HTTPS infrastructure doesn’t exist. The argument isn’t “replace SSH.” It’s “stop using SSH for the thing it’s worst at.”

The Numbers, One More Time

For reference, here’s the speedup picture from the series. Most automation tools (Netmiko, Ansible, Scrapli) use PTY mode, so SSH PTY is the realistic baseline:

Scenario	Speedup vs SSH PTY
HTTPS batch (native)	~17x
Proxy (reused connection)	~14.7x
Tunnel batch	~12.7x
Proxy (new connection)	~5.3x
HTTPS keep-alive (native)	~3.4x
Tunnel per-command	~3.0x
SSH with ControlMaster	~1.7x

The proxy with connection reuse gets you most of the native HTTPS improvement without requiring any changes to the devices. The tunnel with batching is close behind and requires no changes to the devices or to your automation tooling.

Try It Yourself

The benchmark tool supports all scenarios:

# All transports (SSH, HTTPS, HTTP/3, proxy, tunnel)
sudo ./bin/clibench bench --latency regional --iterations 20 --commands 5

# Proxy pattern (HTTPS + HTTP/3 variants)
sudo ./bin/clibench bench --latency regional --iterations 20 --commands 5 --transport proxy

# Tunnel (transparent SSH-to-HTTP WAN optimization)
sudo ./bin/clibench bench --latency intercontinental --iterations 20 --commands 5 --transport tunnel-https

The code is MIT licensed. Run it on your own infrastructure, with your own latency profiles, and see what the numbers look like for your network.

My take: The network automation community has treated SSH as a given for fifteen years. It was the right default when automation meant one engineer scripting against a handful of devices. At the scale most organizations operate today, SSH’s protocol overhead is a measurable, avoidable cost. Native HTTPS CLI is the right long-term direction. The proxy and tunnel patterns are deployable today. I built NAAS so you can start today. Contributions welcome.

CLI Over HTTPS Part 3: The Proxy Pattern

brett@network-notes.com (Brett Lykins) — Tue, 05 May 2026 09:00:00 +0000

In Part 1 I showed that SSH burns 10-15 round trips before delivering a single byte of command output. In Part 2 I proved it. HTTPS batch is ~17x faster than SSH at real-world latencies when the device supports it natively. Even HTTPS keep-alive, with no batching, is 3.4x faster.

The obvious objection: most devices don’t support it natively. Your Cisco IOS switches, your Juniper routers, your Arista leaf nodes, they speak SSH. And while some of them have other interfaces, SSH is not changing anytime soon.

So the question isn’t “how do I get my switches to speak HTTPS.” The question is: where does the SSH happen?

There are two answers. The proxy requires your automation to speak HTTPS, but it’s architecturally simple: one hop, one translation. The tunnel keeps SSH on both ends and optimizes only the WAN segment, so existing tooling works unchanged. Both relocate the expensive SSH round trips to a local link where they cost almost nothing.

The Proxy: Replace SSH on the WAN

SSH is slow because of round trips. Round trips are slow because of distance. If you move the SSH session closer to the device, the round trips get cheap.

A proxy co-located with the devices, in the same data center or local network, talks SSH to the devices over a 1-2ms link where the protocol overhead is negligible. Your automation platform talks HTTPS to the proxy over the WAN, where the round-trip savings from Part 1 actually matter.

The device never knows the difference: it sees an SSH session from a local IP. Your automation never touches SSH directly: it sends an HTTP request and gets CLI output back in the response body.

The Architecture

The proxy is the only component that touches SSH. Everything upstream is HTTPS: connection pooling, TLS 1.3, request batching, proper Content-Length framing. Everything downstream is SSH, but over a link where it doesn’t matter.

Proving It

I added a proxy mode to the benchmark tool from Part 2. The proxy is an HTTPS server that receives commands via the same ASA-style endpoints (/admin/exec/, /admin/config), then opens an SSH session to a backend device and returns the output.

The test setup:

Backend device: SSH listener with 2ms RTT (local latency)
Proxy: HTTPS frontend with WAN latency, SSH client to backend
Benchmark client: Talks HTTPS to the proxy, same as it would to a native HTTPS device

Four proxy modes tested with a new WAN connection per request (cold start, first request of an automation run):

fresh-ssh: New WAN connection + new SSH to backend per request
pooled-ssh: New WAN connection, reuses one SSH connection on the backend
h3-fresh-ssh: Same as fresh-ssh, but the WAN leg uses HTTP/3 (QUIC)
h3-pooled-ssh: QUIC on WAN, pooled SSH on backend

Plus two connection-reuse modes (steady state, what a running automation platform does):

keep-alive: Persistent HTTPS connection to proxy, pooled SSH on backend
h3-keep-alive: Persistent QUIC connection to proxy, pooled SSH on backend

The proxy’s WAN-facing listener gets the same latency injection as the direct SSH and HTTPS tests from Part 2. The backend SSH link gets a fixed 2ms RTT. All transports experience the same WAN conditions. The only difference is what happens on the last hop.

Results

All runs: 20 iterations, 5 commands per iteration (batched in one POST). SSH direct numbers from Part 2 for comparison. The “SSH direct” column uses PTY/shell mode, what Netmiko and Ansible actually do.

WAN RTT	SSH direct (PTY)	Proxy (new conn)	Proxy (reused conn)	Speedup (reused vs SSH PTY)
30ms	528ms	124ms	56ms	9.4x
70ms	1,208ms	248ms	95ms	12.7x
150ms	2,571ms	489ms	175ms	14.7x

The new-connection proxy (full TLS handshake per request) is 4.3-5.3x faster than SSH direct. With connection reuse, the proxy hits 9.4-14.7x, and the advantage grows with latency because reusing the connection eliminates the TLS handshake entirely, paying only 1 round trip per request.

At 150ms RTT (a US NOC managing devices in Hong Kong), SSH direct (PTY) takes 2.6 seconds per device. The proxy with a persistent connection does it in 175ms.

Why It Works

SSH direct (PTY mode) at 150ms RTT pays the full protocol tax on every round trip over the WAN:

TCP handshake: 1 RT × 150ms
SSH version exchange: 1 RT × 150ms
Key exchange: 2 RT × 150ms
Auth + channel + PTY + shell: 4 RT × 150ms
Session prep: 2 RT × 150ms
5 commands with echo verification: 5 RT × 150ms

That’s ~15 round trips × 150ms = ~2,250ms of protocol overhead, plus processing time.

The proxy with a new connection splits that cost across two links:

WAN leg (HTTPS, new connection): TCP + TLS 1.3 + HTTP request = ~3 RT × 150ms = ~450ms
Local leg (SSH): The same ~15 SSH round trips, but at 2ms = ~30ms

Total: ~480ms. That’s the cold-start cost when your automation opens a new connection to the proxy.

The proxy with a reused connection eliminates the handshake entirely:

WAN leg (HTTPS, reused connection): HTTP request/response = ~1 RT × 150ms = ~150ms
Local leg (SSH): Same ~30ms

Total: ~180ms. Measured: 175ms.

This is the steady-state performance. Once your automation has an open connection to the proxy (which any HTTP client maintains by default), every subsequent request costs exactly one WAN round trip plus the local SSH work. The SSH overhead is still there. It’s just happening on a link where 15 round trips cost 30ms instead of 2,250ms.
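
The arithmetic above is just round trips multiplied by RTT. A small sketch of that model, using the round-trip counts from this section (15 for SSH PTY, about 3 for a new HTTPS connection, 1 for a reused one); the function names are hypothetical and processing time is ignored:

```go
package main

import "fmt"

const sshRoundTrips = 15 // 1 TCP + 1 version + 2 kex + 4 auth/channel/PTY/shell + 2 prep + 5 commands

// costMs models transport time as round trips multiplied by RTT (ms).
func costMs(roundTrips int, rttMs float64) float64 {
	return float64(roundTrips) * rttMs
}

// sshDirect pays every SSH round trip over the WAN.
func sshDirect(wanMs float64) float64 { return costMs(sshRoundTrips, wanMs) }

// proxyNew pays TCP + TLS 1.3 + HTTP on the WAN and SSH on the local link.
func proxyNew(wanMs, localMs float64) float64 {
	return costMs(3, wanMs) + costMs(sshRoundTrips, localMs)
}

// proxyReused pays one WAN round trip plus SSH on the local link.
func proxyReused(wanMs, localMs float64) float64 {
	return costMs(1, wanMs) + costMs(sshRoundTrips, localMs)
}

func main() {
	fmt.Println(sshDirect(150))      // 2250
	fmt.Println(proxyNew(150, 2))    // 480
	fmt.Println(proxyReused(150, 2)) // 180
}
```

The model lands close to the measured 2,571ms, 489ms, and 175ms, which suggests round trips account for most of the difference.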

Fresh vs Pooled: Does It Matter?

At local latency, not much. The gap between fresh-ssh (132ms) and pooled-ssh (119ms) at 30ms WAN RTT is 13ms, the cost of one SSH handshake at 2ms RTT. In production you’d pool connections anyway for resource efficiency, but the performance argument for pooling is modest when the backend latency is low.

The operational argument matters more. A pooled connection means fewer SSH sessions on the device, and network devices have finite session limits. An ASA might handle 5 concurrent SSH sessions; a Catalyst might allow 16. If your proxy is serving 50 requests per second, fresh connections will exhaust those limits almost immediately. Pooling keeps one session open per device and multiplexes commands through it.

The pooling logic in clibench is simple. getSSH() returns an existing connection if one is pooled, or dials a new one:

func (s *Server) getSSH() (*ssh.Client, bool, error) {
	if !s.pooled {
		c, err := ssh.Dial("tcp", s.backendAddr, s.sshCfg)
		return c, false, err
	}
	if s.pool != nil {
		return s.pool, true, nil
	}
	c, err := ssh.Dial("tcp", s.backendAddr, s.sshCfg)
	if err != nil {
		return nil, false, err
	}
	s.pool = c
	return c, true, nil
}

The tradeoff is stale connections: devices reboot, sessions time out, firewalls drop idle flows. The proxy needs to detect dead connections and reconnect, the same problem as HTTP connection pooling or database connection pooling. In clibench, a failed session operation clears the pool so the next request gets a fresh connection. In production, you’d add periodic health checks and a circuit breaker for unreachable devices, which is what NAAS does.
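
The "clear the pool on failure" behavior can be sketched as follows. This is not the clibench code: the conn interface is a hypothetical stand-in for *ssh.Client so the example runs without a device, and the mutex is an addition for concurrent requests. Health checks and a circuit breaker would sit on top of this.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// conn is a hypothetical stand-in for an SSH client connection.
type conn interface {
	Run(cmd string) (string, error)
	Close() error
}

// pool keeps at most one connection and drops it when an operation fails.
type pool struct {
	mu   sync.Mutex
	dial func() (conn, error)
	c    conn
}

// get returns the pooled connection, dialing a new one if none exists.
func (p *pool) get() (conn, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.c != nil {
		return p.c, nil
	}
	c, err := p.dial()
	if err != nil {
		return nil, err
	}
	p.c = c
	return c, nil
}

// invalidate clears the pool, but only if c is still the pooled connection.
func (p *pool) invalidate(c conn) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.c == c {
		p.c.Close()
		p.c = nil
	}
}

// run executes cmd and clears the pool on failure so the next call redials.
func (p *pool) run(cmd string) (string, error) {
	c, err := p.get()
	if err != nil {
		return "", err
	}
	out, err := c.Run(cmd)
	if err != nil {
		p.invalidate(c)
	}
	return out, err
}

// fakeConn simulates a device session that can go stale.
type fakeConn struct{ fail bool }

func (f *fakeConn) Run(cmd string) (string, error) {
	if f.fail {
		return "", errors.New("session closed")
	}
	return "ok: " + cmd, nil
}

func (f *fakeConn) Close() error { return nil }

func main() {
	p := &pool{dial: func() (conn, error) { return &fakeConn{}, nil }}
	fmt.Println(p.run("show version"))
}
```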

The Tunnel: Keep SSH on Both Ends

The proxy requires changing your automation client. What if you can’t?

Many teams have years of Ansible playbooks, Nornir scripts, and Netmiko wrappers that all speak SSH. Rewriting them to speak HTTPS is a project, not a config change. The tunnel solves this: both your automation and the device speak SSH. The WAN segment in between uses HTTPS or HTTP/3, but neither endpoint knows or cares.

Architecture

The headend sits near your automation server. It accepts SSH connections, parses the exec command, and forwards it as an HTTP request over the WAN to the site proxy. The site proxy is the same component from the proxy pattern above. It receives the HTTP request and talks SSH to the device on a local link.

Your automation runs ssh headend "show version" and gets back the device output. Under the hood, the WAN segment used HTTPS with 2-3 round trips instead of SSH’s 15+.
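
The headend’s translation step might look roughly like this. It is a sketch under assumptions: the function name is hypothetical, commands are assumed to arrive newline-delimited in the exec payload, and the request targets the /admin/config endpoint because Part 2 describes it as taking a newline-delimited body. The actual clibench headend may differ.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// buildForward turns an SSH exec payload into the HTTP request that the
// headend sends to the site proxy over the WAN.
func buildForward(siteProxyURL, execCmd string) (*http.Request, error) {
	var cmds []string
	for _, line := range strings.Split(execCmd, "\n") {
		if c := strings.TrimSpace(line); c != "" {
			cmds = append(cmds, c)
		}
	}
	if len(cmds) == 0 {
		return nil, fmt.Errorf("empty exec command")
	}
	body := strings.Join(cmds, "\n")
	req, err := http.NewRequest(http.MethodPost, siteProxyURL+"/admin/config", strings.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "text/plain")
	return req, nil
}

func main() {
	req, err := buildForward("https://site-proxy.example.com", "show version\nshow ip route\n")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL, req.ContentLength)
}
```

All five commands of a batch travel in one request body, which is why batch mode pays a single WAN round trip once the connection is open.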

Results

WAN RTT	SSH direct (PTY)	Tunnel (per-cmd)	Tunnel (batch)	Speedup (batch vs SSH PTY)
30ms	528ms	228ms	82ms	6.4x
70ms	1,208ms	429ms	121ms	10.0x
150ms	2,571ms	856ms	202ms	12.7x

Two things jump out.

Without batching, the tunnel is slower than the proxy. The per-command tunnel mode (ssh-https-ssh) pays SSH overhead on both ends: the automation-to-headend SSH handshake, plus the site proxy-to-device SSH handshake. That’s two sets of SSH round trips at campus latency (~2ms each), plus the WAN HTTP request per command. At 150ms, 856ms is still 3.0x faster than SSH direct, but much worse than the proxy’s 175ms.

With batching, the tunnel approaches proxy performance. The batch mode sends all 5 commands in a single SSH exec payload to the headend. The headend forwards them as one HTTP POST. The site proxy runs them all in one SSH session. At 150ms, that’s 202ms vs the proxy’s 175ms. The tunnel pays a small penalty for the extra SSH hop on the automation side, but it’s close.

The tunnel’s value isn’t raw speed. It’s that you get 12.7x improvement with zero changes to your automation code or your devices.

Proxy vs Tunnel: When to Use Which

Use the proxy when you can modify your automation to speak HTTPS. It’s faster (14.7x with connection reuse vs 12.7x for the tunnel), simpler (one hop instead of two), and has lower per-request overhead.

Use the tunnel when you can’t change the automation client. If your tooling must speak SSH (because of connection plugins, credential management, or organizational inertia), the tunnel gives you WAN optimization transparently. The batch mode requires that your SSH client sends multiple commands in one exec call (which tools like ssh host "cmd1 && cmd2" do naturally), but even per-command mode is 3.0x faster than SSH direct at high latency.

Use both in a migration. Deploy the tunnel first for immediate wins with no code changes, then migrate automation to speak HTTPS to the proxy directly as you refactor.

What This Looks Like in Practice

If you have an internal API that accepts “run this command on this device” requests and returns the output, you’re already running a version of this.

Examples in the ecosystem:

Salt proxy minions with NAPALM behind the Salt REST API
AWX execution environments co-located with devices
Oxidized’s web interface
The Rackspace Go microservices from Part 1
NAAS (Netmiko as a Service): wraps Netmiko behind a REST API with connection pooling, async jobs, and circuit breakers

Most of these co-locate SSH with the devices (good), but don’t expose a clean HTTPS interface upstream, or they bury it under job queues, inventory sync, and YAML sprawl. The core pattern is simpler than any of those tools. The proxy in clibench proves the concept in ~180 lines of Go. A production deployment adds multi-vendor support, health checks, and credential management on top.

Security

The proxy doesn’t make things more or less secure. It changes the trust model.

With SSH direct, your automation server holds the SSH keys and authenticates directly to every device. With the proxy pattern, the trust boundary splits in two: your automation authenticates to the proxy (over HTTPS, using API tokens, mTLS, or whatever your org uses for service-to-service auth), and the proxy authenticates to the devices (over SSH, using keys that are available to the proxy itself).

What actually changes:

Where the SSH keys live. They move from the automation server to the proxy. The private keys never cross the WAN in either model (SSH public key auth sends a signature, not the key), but the proxy pattern puts the keys physically closer to the devices they unlock.
The WAN-side auth mechanism. Your automation no longer speaks SSH to devices. It speaks HTTPS to the proxy. That’s not inherently better or worse. It’s a different credential type (API token or client cert vs SSH key) managed through whatever system your organization already runs for service authentication.
The blast radius of a compromised proxy. The proxy has access to the SSH keys for every device it manages. Compromise the proxy, and you have access to the fleet. This is the same risk profile as an SSH bastion host, which most organizations already operate and already know how to harden: minimal attack surface, restricted network access, key rotation, session logging, and monitoring. The proxy deserves the same care you’d give a bastion.
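
As one illustration of the WAN-side auth, a proxy could check a bearer token before touching any SSH credential. This is a sketch, not how clibench or NAAS authenticates: the middleware name, the token value, and the choice of bearer tokens over mTLS or Basic auth are assumptions.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// requireToken rejects requests whose Authorization header does not carry
// the expected bearer token. The comparison is constant-time.
func requireToken(token string, next http.Handler) http.Handler {
	want := []byte("Bearer " + token)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		got := []byte(r.Header.Get("Authorization"))
		if subtle.ConstantTimeCompare(got, want) != 1 {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// statusFor exercises the middleware in memory and returns the HTTP status.
func statusFor(authHeader string) int {
	h := requireToken("s3cret", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	}))
	req := httptest.NewRequest(http.MethodGet, "/admin/exec/show+version", nil)
	if authHeader != "" {
		req.Header.Set("Authorization", authHeader)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor("Bearer s3cret"))
	fmt.Println(statusFor("Bearer wrong"))
}
```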

When the Proxy Doesn’t Help

The proxy pattern assumes the WAN latency between your automation and the devices is the bottleneck. If your automation server is already co-located with the devices (same rack, same DC), there’s no WAN leg to optimize. SSH at 1-2ms RTT is fast enough.

It also doesn’t help if your bottleneck is device processing time rather than transport overhead. If a show tech-support takes 30 seconds to generate on the device, the transport saves you a few hundred milliseconds on a 30-second operation. Still worth it at scale, but the relative improvement is smaller.

And the proxy adds operational complexity. It’s another service to deploy, monitor, and maintain. For a team managing 50 devices in one location, the overhead isn’t justified. For a team managing thousands of devices across multiple continents, which is where SSH overhead actually hurts, the proxy pays for itself on the first automation run.

Try It

The benchmark code includes all modes. Run it yourself:

# All transports at 150ms WAN
sudo ./bin/clibench bench --latency intercontinental --iterations 20 --commands 5

# Proxy only (HTTPS + HTTP/3 variants)
sudo ./bin/clibench bench --latency intercontinental --iterations 20 --commands 5 --transport proxy

# Tunnel mode (SSH-to-HTTPS transparent WAN tunnel)
sudo ./bin/clibench bench --latency intercontinental --iterations 20 --commands 5 --transport tunnel-https

In Part 4, I’ll lay out a decision framework for choosing between SSH direct, a proxy, a tunnel, and native HTTPS, and dig into NAAS as a production deployment of the proxy pattern.

My take: The proxy pattern isn’t a workaround. It’s the right architecture for managing geographically distributed network infrastructure. SSH is fine for the last hop. HTTPS (or QUIC) is better for everything upstream.

CLI Over HTTPS Part 2: Proving It

brett@network-notes.com (Brett Lykins) — Thu, 30 Apr 2026 09:00:00 +0000

In Part 1, I argued that SSH is a slow transport for network automation at scale and that HTTPS is fundamentally faster. Round-trip analysis and back-of-napkin math are useful, but they’re not proof. This post is the proof.

I built clibench, a dual-protocol network device emulator and benchmark client that measures the difference at realistic latencies sourced from Verizon’s published backbone measurements.

Design Constraints

For the comparison to mean anything, the test has to be fair. That means:

Same device, same commands, same output. Both transports hit the same device.Device struct. The only variable is how the command arrives and how the response leaves.
Same latency for both. Delay is injected at the TCP connection level, not the application level. Both SSH and HTTPS experience identical network conditions.
Realistic latency values. No made-up numbers. Every profile is sourced from published measurements.
Multiple modes per transport. SSH gets tested with fresh connections and with connection reuse (ControlMaster-style). HTTPS gets tested with fresh connections, keep-alive, multi-command batching, and config push. Each mode represents a real-world usage pattern.

Architecture

clibench is written in Go. The project has nine packages:

internal/
  bench/       Benchmark runner: SSH, HTTPS, proxy, and PTY modes
  device/      Shared command engine: prefix matching, transcript loading
  sshserver/   crypto/ssh listener, CiSSHGo patterns
  httpserver/  net/http + TLS, ASA-style /admin/exec/ and /admin/config
  proxy/       HTTPS→SSH edge proxy (fresh + pooled modes)
  netem/       tc netem latency injection (Linux, requires sudo)
  latency/     Userspace delay injection (fallback, no root)
  stats/       Benchmark statistics: percentile, summarize, parallel runner
  tlsutil/     Shared self-signed TLS config generator

The benchmark client embeds its own server. No separate process needed. Latency is injected at the kernel level using Linux tc netem, applied per-port on the loopback interface so both SSH and HTTPS experience identical network conditions.

The Shared Command Engine

Both servers use the same device.Device:

type Device struct {
	Hostname      string
	Username      string
	Password      string
	commands      map[string]string // "show version" -> transcript text
	transcriptDir string
}

func (d *Device) Exec(input string) string {
	input = strings.TrimSpace(input)
	if input == "" {
		return ""
	}
	// exact match first
	if resp, ok := d.commands[input]; ok {
		return resp
	}
	// prefix match ("sh ver" -> "show version")
	// returns "% Ambiguous command" or "% Unknown command" on miss
}
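
The prefix-matching step is only summarized in the excerpt above. One way it could work is word-by-word prefix comparison, sketched below. The function name and the rule that the abbreviated input must have the same number of words as the full command are assumptions, not the clibench implementation.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// matchCommand resolves abbreviated input ("sh ver") against the known
// commands, comparing word by word.
func matchCommand(input string, known []string) (string, error) {
	words := strings.Fields(input)
	var hits []string
	for _, k := range known {
		kw := strings.Fields(k)
		if len(kw) != len(words) {
			continue
		}
		ok := true
		for i := range words {
			if !strings.HasPrefix(kw[i], words[i]) {
				ok = false
				break
			}
		}
		if ok {
			hits = append(hits, k)
		}
	}
	switch len(hits) {
	case 0:
		return "", errors.New("% Unknown command")
	case 1:
		return hits[0], nil
	default:
		return "", errors.New("% Ambiguous command")
	}
}

func main() {
	known := []string{"show version", "show vlan", "show ip route"}
	fmt.Println(matchCommand("sh ver", known))
	fmt.Println(matchCommand("sh v", known))
}
```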

Command transcripts are plain text files loaded from a directory. The filename convention maps to the command: show_version.txt becomes show version. Templates support {{.Hostname}} substitution. This follows the same pattern as CiSSHGo, which I wrote about recently.
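
The filename convention and the hostname substitution can be sketched like this. The helper names are hypothetical, and using text/template is an assumption based on the {{.Hostname}} syntax.

```go
package main

import (
	"bytes"
	"fmt"
	"path/filepath"
	"strings"
	"text/template"
)

// commandFromFilename maps "show_version.txt" to "show version".
func commandFromFilename(name string) string {
	base := strings.TrimSuffix(filepath.Base(name), filepath.Ext(name))
	return strings.ReplaceAll(base, "_", " ")
}

// render substitutes {{.Hostname}} in a transcript.
func render(transcript, hostname string) (string, error) {
	t, err := template.New("transcript").Parse(transcript)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	err = t.Execute(&buf, struct{ Hostname string }{hostname})
	return buf.String(), err
}

func main() {
	fmt.Println(commandFromFilename("transcripts/show_version.txt"))
	out, err := render("{{.Hostname}} uptime is 1 week", "core-sw1")
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```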

The SSH Server

The SSH side uses Go’s crypto/ssh package with an ed25519 host key generated at startup. It supports both exec mode (ssh host "show version") and interactive shell sessions with prompt rendering and command matching. The benchmark client tests both, since real-world tools are split: libraries like Go’s x/crypto/ssh use exec mode, while Netmiko, Ansible, and Scrapli use PTY/shell.

// Exec mode: split newline-delimited payloads for batch support
case "exec":
	if len(req.Payload) > 4 {
		execCmd = string(req.Payload[4:])
	}
	req.Reply(true, nil)
	if execCmd != "" {
		var out strings.Builder
		for _, line := range strings.Split(execCmd, "\n") {
			cmd := strings.TrimSpace(line)
			if cmd != "" {
				out.WriteString(s.dev.Exec(cmd))
			}
		}
		io.WriteString(ch, out.String())
	}
	ch.SendRequest("exit-status", false, []byte{0, 0, 0, 0})

The HTTPS Server

The HTTPS side generates a self-signed P-256 ECDSA certificate at startup (negotiating TLS 1.3 with TLS_AES_128_GCM_SHA256) and exposes the same endpoints as the Cisco ASA HTTP interface:

GET /admin/exec/show+version. Single command, URL-encoded
GET /admin/exec/cmd1/cmd2/cmd3. Multiple commands, slash-separated
POST /admin/config. Bulk commands, newline-delimited body

Authentication is HTTP Basic over TLS, matching the ASA’s behavior.

func (s *Server) handleExec(w http.ResponseWriter, r *http.Request) {
	path := strings.TrimPrefix(r.URL.Path, "/admin/exec/")
	parts := strings.Split(path, "/")
	var out strings.Builder
	for _, p := range parts {
		cmd := strings.ReplaceAll(p, "+", " ")
		cmd = strings.TrimSpace(cmd)
		if cmd == "" {
			continue
		}
		out.WriteString(s.dev.Exec(cmd))
	}
	w.Header().Set("Content-Type", "text/plain")
	io.WriteString(w, out.String())
}
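
From the client side, the URL convention that handleExec parses can be produced with a small helper. This is a sketch with a hypothetical function name; it assumes commands contain no "/" or "+" characters, which this simple encoding cannot represent.

```go
package main

import (
	"fmt"
	"strings"
)

// execURL encodes commands in the ASA style: spaces become "+",
// commands are separated by "/".
func execURL(addr string, commands []string) string {
	parts := make([]string, 0, len(commands))
	for _, c := range commands {
		c = strings.TrimSpace(c)
		if c == "" {
			continue
		}
		parts = append(parts, strings.Join(strings.Fields(c), "+"))
	}
	return "https://" + addr + "/admin/exec/" + strings.Join(parts, "/")
}

func main() {
	fmt.Println(execURL("10.1.1.1:8443", []string{"show version", "show ip route"}))
}
```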

The Benchmark Client

The client calls both transports with the same commands and measures wall-clock time. The key difference is visible in the code: SSH requires connection setup, auth, channel open, and per-command exec requests. HTTPS is a single HTTP call:

// Error handling is elided for brevity.

// SSH: connect + auth + exec per command
client, _ := ssh.Dial("tcp", addr, sshConfig)
for _, cmd := range commands {
	session, _ := client.NewSession()
	output, _ := session.CombinedOutput(cmd)
	session.Close()
}

// HTTPS: one request, all commands
url := "https://" + addr + "/admin/exec/" + strings.Join(commands, "/")
resp, _ := httpClient.Get(url)
defer resp.Body.Close()
body, _ := io.ReadAll(resp.Body)

Latency Injection

Latency is injected using Linux tc netem on the loopback interface, configured entirely via the vishvananda/netlink library, the same netlink library used by Docker and Kubernetes. The tool sets up a prio qdisc with per-port u32 filters so that traffic to the SSH and HTTPS server ports gets the configured one-way delay, while other loopback traffic is unaffected. This requires root or CAP_NET_ADMIN, the same requirement as most raw-socket networking tools.

Netlink qdisc setup code:

// Qdisc setup via netlink - no shell-out to tc
prio := netlink.NewPrio(netlink.QdiscAttrs{
	LinkIndex: loopbackIndex,
	Handle:    netlink.MakeHandle(1, 0),
	Parent:    netlink.HANDLE_ROOT,
})
prio.Bands = 4
prio.PriorityMap = [16]uint8{} // unmatched traffic → band 0 (no delay)
netlink.QdiscAdd(prio)

netem := netlink.NewNetem(
	netlink.QdiscAttrs{Parent: netlink.MakeHandle(1, 2)},
	netlink.NetemQdiscAttrs{Latency: uint32(delay.Microseconds())},
)
netlink.QdiscAdd(netem)

// Per-port u32 filter: match dport, classify to delayed band
netlink.FilterAdd(&netlink.U32{
	FilterAttrs: netlink.FilterAttrs{Parent: netlink.MakeHandle(1, 0), Protocol: 0x0800},
	ClassId:     netlink.MakeHandle(1, 2),
	Sel: &netlink.TcU32Sel{
		Flags: netlink.TC_U32_TERMINAL,
		Keys:  []netlink.TcU32Key{{Val: uint32(port), Mask: 0xffff, Off: 20}},
	},
})

Because netem operates inside the kernel’s network stack, it captures real TCP behavior: Nagle’s algorithm, delayed ACKs, TCP window scaling, and proper per-packet delay. Every packet in both directions, client-to-server and server-to-client, experiences the configured delay. This is more accurate than userspace delay injection, which can’t distinguish between logically separate protocol exchanges that happen to be coalesced into a single write.

A -userspace flag is available as a fallback for environments where root isn’t available, but the published numbers all use tc netem.

Latency Profiles

Each profile corresponds to a real network path, sourced from Verizon Enterprise’s monthly IP latency statistics (March 2026). The simulated RTT values are rounded for readability; the Verizon measured column shows the exact source data:

Profile	Simulated RTT	Real-world path	Verizon measured RTT
`local`	0ms	Co-located	Baseline
`campus`	2ms	Same data center	AWS/Prisma: 1-2ms
`regional`	30ms	US backbone	29.9ms
`continental`	70ms	NYC ↔ London	70.2ms
`intercontinental`	150ms	US ↔ Hong Kong	145.5ms
`transpacific`	175ms	NA ↔ Taiwan	175.2ms
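
The profile table maps naturally onto a lookup. In this sketch the RTT values come from the table, while the variable and function names are hypothetical. Halving the RTT to get the per-direction netem delay is an assumption that follows from the delay being applied to packets in both directions.

```go
package main

import (
	"fmt"
	"time"
)

// profiles maps a latency profile name to its simulated RTT.
var profiles = map[string]time.Duration{
	"local":            0,
	"campus":           2 * time.Millisecond,
	"regional":         30 * time.Millisecond,
	"continental":      70 * time.Millisecond,
	"intercontinental": 150 * time.Millisecond,
	"transpacific":     175 * time.Millisecond,
}

// oneWayDelay returns the delay to apply in each direction so that the
// round trip adds up to the profile's RTT.
func oneWayDelay(profile string) (time.Duration, bool) {
	rtt, ok := profiles[profile]
	return rtt / 2, ok
}

func main() {
	d, ok := oneWayDelay("intercontinental")
	fmt.Println(d, ok)
}
```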

Benchmark Modes

The client tests these scenarios across both transports (plus a multi-command GET mode when running more than one command per iteration):

SSH Exec Modes

SSH exec mode opens a channel, sends a command, and reads the output. This is what Go’s x/crypto/ssh, Paramiko’s exec_command(), and OpenSSH’s ssh host "cmd" use. Each command gets its own channel.

Mode	What it measures
`ssh/fresh-conn`	Full SSH lifecycle per iteration: TCP + handshake + auth + channel + exec
`ssh/reuse-conn`	One SSH connection shared across all iterations (ControlMaster-style)
`ssh/batch-exec`	Multi-line command string over a single exec session

SSH PTY/Shell Modes

SSH PTY mode opens an interactive shell with a pseudo-terminal, sends commands as keystrokes, and detects the prompt after each command. This is what Netmiko, Ansible network_cli, Scrapli, and most real-world network automation tools use. Many network devices don’t support exec mode properly, and automation tools need prompt detection, pagination control, and mode transitions. (Part 1 called this the “screen-scraping tax”, the cost of parsing an unstructured byte stream.)

The PTY benchmark includes session preparation (sending terminal length 0 and terminal width 511 before the first command) and per-command echo verification (reading until the echoed command appears, then reading until the prompt), matching the protocol-level behavior common to all major tools.

Mode	What it measures
`ssh/pty-fresh`	Full SSH lifecycle + PTY + shell + session prep + commands with echo verification
`ssh/pty-reuse`	Shared connection, new PTY/shell per iteration with session prep

HTTPS Modes

Mode	What it measures
`https/fresh-conn`	New TCP + TLS handshake per iteration (`DisableKeepAlives: true`)
`https/keep-alive`	Single TCP + TLS connection reused across all iterations (default HTTP behavior)
`https/batch-post`	All commands in one POST body (`/admin/config`)
`https/multi-cmd`	All commands in one GET request (ASA `/admin/exec/cmd1/cmd2` syntax)

Each mode runs N iterations, each executing 5 show version commands. The client reports min, max, average, p50, p95, and standard deviation.
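For readers who want to post-process their own timings, here is a minimal sketch of how those six summary statistics could be computed from a list of per-iteration durations. It is not clibench's code: the `summarize` name, the nearest-rank percentile method, and the use of sample standard deviation are all assumptions made for illustration.

```python
import statistics

def summarize(samples_ms):
    """Return min, max, average, p50, p95, and standard deviation for one mode."""
    ordered = sorted(samples_ms)
    n = len(ordered)

    def percentile(p):
        # Nearest-rank percentile (assumed method): the smallest sample with
        # at least p percent of the data at or below it. Integer ceiling division.
        rank = max(1, (p * n + 99) // 100)
        return ordered[rank - 1]

    return {
        "min": ordered[0],
        "max": ordered[-1],
        "avg": statistics.fmean(ordered),
        "p50": percentile(50),
        "p95": percentile(95),
        # Sample standard deviation (assumed); needs at least two samples.
        "stddev": statistics.stdev(ordered) if n > 1 else 0.0,
    }

# Hypothetical timings for 20 iterations, in milliseconds.
print(summarize([float(ms) for ms in range(1, 21)]))
```

With only 20 iterations, p95 is effectively the second-slowest run, so a single outlier can move it noticeably.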

Running It Yourself

git clone https://github.com/lykinsbd/clibench.git
cd clibench
go build -o bin/bench ./cmd/bench/

# Baseline - no added latency (no root needed)
./bin/bench -latency local -iterations 20 -commands 5

# US backbone - 30ms RTT (requires root for tc netem)
sudo ./bin/bench -latency regional -iterations 20 -commands 5

# US to Hong Kong - 150ms RTT
sudo ./bin/bench -latency intercontinental -iterations 20 -commands 5

# Fallback: userspace delay injection (no root, less accurate)
./bin/bench -latency regional -iterations 20 -commands 5 -userspace

clibench embeds its own server. No separate process needed. Non-local profiles require root (or CAP_NET_ADMIN) for tc netem on the loopback interface. Output is JSON.

Results

5 commands per iteration, all times in milliseconds (average of 20 iterations). Latency injected via tc netem on the loopback interface.

At zero latency, SSH exec mode is the fastest fresh-connection option. There’s no round-trip penalty, and SSH’s binary framing has less per-message overhead than HTTP headers + TLS.

The moment real network latency enters the picture, the result flips. At 30ms RTT, a US backbone path per Verizon’s March 2026 measurements, HTTPS batch is 16.8x faster than SSH PTY fresh and 15.9x faster than SSH exec fresh-conn:

At intercontinental distances (US ↔ Hong Kong, 150ms RTT), SSH PTY fresh takes 2.6 seconds for 5 commands. SSH exec fresh takes 2.4 seconds. HTTPS batch does it in 151ms:

Exec Mode vs PTY/Shell Mode

The PTY overhead comes from session preparation (terminal length 0, terminal width 511) and per-command echo verification. At higher latencies, this adds up:

Profile	RTT	SSH exec fresh	SSH PTY fresh	PTY overhead
local	0ms	3.9ms	4.3ms	+0.4ms
campus	2ms	40ms	42ms	+2ms
regional	30ms	494ms	522ms	+28ms
continental	70ms	1,144ms	1,213ms	+69ms
intercontinental	150ms	2,412ms	2,565ms	+153ms

The PTY overhead scales linearly with RTT because the session prep commands add roughly one extra round trip of overhead before the first real command runs. At 150ms RTT, that’s ~150ms of pure protocol overhead. And this is the best case. Real devices add processing time, ANSI escape codes, and prompt detection regex that the emulator doesn’t capture.

Speedup vs SSH PTY fresh (what most tools actually use)

Profile	RTT	SSH exec fresh	SSH reuse	HTTPS keep-alive	HTTPS batch
local	0ms	1.1x	3.9x	7.2x	19.1x
campus	2ms	1.1x	1.8x	3.3x	16.3x
regional	30ms	1.1x	1.7x	3.3x	16.8x
continental	70ms	1.1x	1.7x	3.4x	16.3x
intercontinental	150ms	1.1x	1.7x	3.4x	17.0x

What the Numbers Say

All results are from 20 iterations per profile. Variance was low: at regional (30ms), SSH exec fresh-conn p50 was 492ms with p95 at 508ms.

At zero latency, SSH exec wins. When there’s no network delay, TLS handshake overhead dominates. SSH exec fresh-conn takes 3.9ms; HTTPS fresh-conn takes 12.0ms. But PTY mode is already slower at 4.3ms due to session prep overhead. The reuse modes tell a different story: SSH exec reuse (1.1ms) and HTTPS keep-alive (0.6ms) both finish in about a millisecond or less. Once the handshake is amortized, both protocols are fast.

Most automation tools don’t use exec mode. As covered above, they use PTY/shell mode for prompt detection, pagination control, and mode transitions. The PTY numbers are what your automation actually experiences.

SSH reuse helps, but not enough. Sharing one SSH connection (the ControlMaster pattern) eliminates the handshake cost, but each command still requires its own round trips. The improvement is consistent at ~1.7x. Real, but modest.

HTTPS keep-alive is ~3.4x faster at any real latency. Every HTTP client library does connection pooling by default. You don’t have to configure anything special. Just reuse the http.Client. At 30ms RTT, that’s 158ms vs 522ms (PTY fresh).

HTTPS batch is ~17x faster. Batching all commands into a single HTTP request eliminates per-command round trips entirely: the whole exchange costs one round trip regardless of command count, whereas keep-alive still pays one round trip per command. At 150ms RTT, that’s 151ms vs 2,565ms (PTY fresh).

For per-command modes, the gap grows with command count. At 30ms RTT with 50 commands, SSH exec fresh-conn takes 3,253ms. SSH PTY fresh takes 1,912ms (PTY avoids per-command channel overhead but pays per-command echo verification). HTTPS keep-alive takes 1,548ms. HTTPS batch takes just 33ms, a ~99x improvement over exec fresh and ~58x over PTY fresh. SSH batch-exec shows the same flat scaling (~250ms regardless of command count), confirming this is a property of batching, not the transport.
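One way to read those 50-command figures is as a marginal cost per extra command, expressed in round trips. The sketch below does that arithmetic from the 5-command and 50-command timings quoted in this post. The `marginal_rtts` helper is hypothetical, and the 31ms figure for a 5-command HTTPS batch is derived from the 100-device table (3.1s / 100), not taken from a separate measurement.

```python
RTT_MS = 30  # the "regional" profile

# (time for 5 commands, time for 50 commands) in milliseconds at 30ms RTT.
# The 5-command HTTPS batch value is derived, as noted in the text above.
measured = {
    "ssh/exec fresh": (494, 3253),
    "ssh/pty fresh": (522, 1912),
    "https/keep-alive": (158, 1548),
    "https/batch": (31, 33),
}

def marginal_rtts(t5_ms, t50_ms, rtt_ms=RTT_MS):
    """Round trips paid per additional command, inferred from two data points."""
    per_command_ms = (t50_ms - t5_ms) / (50 - 5)
    return per_command_ms / rtt_ms

for mode, (t5, t50) in measured.items():
    print(f"{mode:18s} ~{marginal_rtts(t5, t50):.2f} RTT per extra command")
```

Under this reading, exec fresh-conn pays roughly two round trips per added command, PTY fresh and HTTPS keep-alive pay roughly one, and batch pays essentially none.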

What This Means at Scale

If you’re managing 100 devices serially (worst case, no concurrency), using PTY mode (what Netmiko/Ansible actually do):

Profile	RTT	SSH PTY fresh (total)	HTTPS batch (total)	Time saved
regional	30ms	52s	3.1s	49s
continental	70ms	121s (2.0 min)	7.4s	114s
intercontinental	150ms	257s (4.3 min)	15s	242s (4.0 min)

Concurrency shrinks the wall time, but the per-device cost stays the same. At 150ms RTT with 10 concurrent workers against 1,000 devices, SSH PTY takes ~4.3 minutes of wall time. HTTPS batch takes ~15 seconds.
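The wall-time figures follow from simple division. Here is a small sketch of that arithmetic, assuming each worker processes its share of devices strictly one after another and that every device costs the same measured per-device time (the `wall_time_s` helper is illustrative, not part of clibench).

```python
import math

def wall_time_s(devices, per_device_ms, workers=1):
    """Wall time in seconds if each worker handles its devices sequentially."""
    per_worker = math.ceil(devices / workers)
    return per_worker * per_device_ms / 1000

# Per-device times at 150ms RTT, from the tables above.
ssh_pty = wall_time_s(1000, 2565, workers=10)
https_batch = wall_time_s(1000, 151, workers=10)
print(f"SSH PTY fresh: {ssh_pty / 60:.1f} min, HTTPS batch: {https_batch:.0f} s")
```

Doubling the workers halves both numbers, but the ratio between them stays the same.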

Limitations

This benchmark measures transport overhead, not device processing time. Real network devices add their own latency to command execution: parsing the command, generating output, writing to the terminal. That cost is the same regardless of transport, so it doesn’t affect the relative comparison.

The HTTPS server uses a self-signed certificate with no session resumption. TLS 1.3 0-RTT resumption would make the HTTPS numbers even better on repeated connections, but I didn’t implement it because most device management scenarios don’t maintain long-lived TLS sessions.

A -userspace flag is available as a fallback for environments where root isn’t available, but it under-counts SSH round trips due to write coalescing in Go’s crypto/ssh. The published numbers all use tc netem.

What’s Next

In Part 3, I’ll look at what happens when you can’t change the device: the proxy pattern. Move SSH to the edge, talk HTTPS over the WAN, and capture most of the improvement without touching a single device config.

The benchmark code already supports proxy mode. Try it yourself and see what your numbers look like.

CLI Over HTTPS Part 1: The Protocol Tax

brett@network-notes.com (Brett Lykins) — Tue, 28 Apr 2026 09:00:00 +0000

During my six years at Rackspace, we spent a lot of time thinking about how to interact with network devices faster. We had tens of thousands of them: firewalls, load balancers, switches, routers and more, all spread across multiple data centers on four continents.

In the early days, managing devices across these data centers meant running shell scripts, Expect, or Perl from local machines and centralized bastions over the WAN. It was operationally painful. SSH connections to devices on other continents were slow enough that teams scheduled automation runs around maintenance windows not just because of the change itself, but because of how long it took to deliver the changes.

An Erlang-based platform solved this by co-locating the SSH connections with the devices. It ran inside each data center, talking SSH to devices over local links where the protocol overhead was negligible. Phil Toland presented the architecture at Erlang Factory 2012, detailing Erlang and Ruby managing backups and automation for 20,000+ network devices across 8 data centers. My team later supplemented it with a Go microservices architecture to provide API-driven access to device CLIs. Both systems were effective not just because of language capabilities; crucially, they were fast because SSH stayed local.

But even with co-located endpoints, the Cisco ASA fleet was a special case that tested our capabilities because of the extreme size of some of its access lists. That’s when someone discovered that the ASA has an HTTP interface we could use. Not the ineffective Java-based ASA REST API, but an actual CLI-over-HTTPS endpoint used by the ASDM client. There was a URL on the device where you could send the same commands you’d type into an SSH session, but over an HTTPS request. We tried it as an experiment, and it was remarkably faster than SSH. What started as a curiosity became a production lifesaver for our ASA fleet.

I’ve been thinking about that experience ever since, and I finally decided to quantify it properly. In this series of posts, I will measure the performance difference between SSH and HTTPS as CLI transports and explain why the gap exists.

The SSH Tax

For a deeper look at SSH’s protocol layers, channel types, and how NETCONF and RESTCONF fit in, see What Actually Happens When You SSH Into a Router.

When your automation tool opens an SSH connection to a network device, here’s what actually happens on the wire before a single byte of command output comes back. (Throughout this series, “RT” means a round trip: one message out, one message back. “RTT” is the round-trip time in milliseconds for a given network path.)

1. TCP three-way handshake. SYN, SYN-ACK, ACK. One round trip. (The ACK can piggyback on the first data segment, but the connection isn’t usable until the handshake completes.)

2. Protocol version exchange. Client and server each send an identification string (SSH-2.0-OpenSSH_9.6). Another round trip.

3. Key exchange. The client and server negotiate algorithms (encryption, MAC, compression) via KEXINIT, then perform a Diffie-Hellman key exchange. This takes 1-3 round trips depending on whether the KEXINIT messages cross in flight and which DH group is negotiated.

4. Service request. The client requests the ssh-userauth service. One round trip.

5. User authentication. Password or public key auth. 1-3 round trips depending on how many methods the server probes (GSSAPI, publickey, then password, each one a round trip).

6. Channel open. SSH multiplexes channels over a single connection. Opening a session channel is another round trip.

That’s 6-10 round trips just to get an authenticated channel. If you’re running a single exec-style command, add one more round trip for the request/response and you’re done.

But most automation tools don’t use exec mode. Netmiko, Ansible, and Scrapli open a PTY/shell channel, which adds:

7. PTY request. Ask the server for a pseudo-terminal. One round trip.

8. Shell request. Start an interactive shell on that PTY. One round trip.

9. Session prep. Send terminal length 0, terminal width 511, and wait for each prompt. Two to three more round trips.

Add it up: 10-15 round trips before you see the output of show version, with most real-world automation sessions landing at 10-12. The SSH3 project cites similar overhead in their motivation for building SSH over HTTP/3.
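The tally is easy to check. This short sketch just sums the minimum and maximum round trips for each step listed above; the step names and the `total` helper are only labels for illustration.

```python
# (step, minimum round trips, maximum round trips), as enumerated above.
SETUP = [
    ("TCP handshake", 1, 1),
    ("protocol version exchange", 1, 1),
    ("key exchange", 1, 3),
    ("service request", 1, 1),
    ("user authentication", 1, 3),
    ("channel open", 1, 1),
]
PTY_EXTRA = [
    ("PTY request", 1, 1),
    ("shell request", 1, 1),
    ("session prep", 2, 3),
]

def total(steps):
    """Return (min, max) round trips for a list of steps."""
    return sum(lo for _, lo, _ in steps), sum(hi for _, _, hi in steps)

print("authenticated channel:", total(SETUP))               # (6, 10)
print("PTY session ready:    ", total(SETUP + PTY_EXTRA))   # (10, 15)
```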

SSH multiplexing (ControlMaster) amortizes the connection setup cost across sessions, but the first connection still pays the full overhead. It helps for repeated connections to the same device, but doesn’t solve the problem at scale across thousands of hosts.

At zero latency (localhost), nobody cares. At real-world distances, it compounds fast.

What This Costs at Real Distances

Verizon Enterprise publishes monthly backbone latency measurements from their global network. Here’s what SSH connection setup costs at those measured round-trip times, assuming 10-15 round trips for a typical PTY/shell automation session:

Path	Measured RTT	SSH setup (10-15 RT)	Source
US backbone (intra-region)	30ms	300-450ms	Verizon Mar 2026: 29.9ms
Transatlantic (NYC ↔ London)	70ms	700-1,050ms	Verizon Mar 2026: 70.2ms
US ↔ Hong Kong	146ms	1,460-2,190ms	Verizon Mar 2026: 145.5ms
US ↔ New Zealand	174ms	1,740-2,610ms	Verizon Mar 2026: 174.2ms

That’s just connection setup. You haven’t sent a command yet. And if your automation opens a fresh SSH connection per device, or per task, which is Ansible’s default behavior without ControlMaster, you pay this cost repeatedly.

A thousand devices at 30ms RTT without concurrency: 300-450 seconds of pure SSH handshake overhead. At 150ms RTT: 1,500-2,250 seconds. Concurrency reduces wall time, but every device still pays the full per-connection cost.

The Screen-Scraping Tax

SSH’s round-trip overhead is only half the story. The other half is what happens after you connect.

SSH gives you a byte stream. A pseudo-terminal. Your automation tool has to:

Detect the prompt to know when command output is complete. Netmiko does this with regex matching on every chunk of bytes received, waiting for a pattern like hostname#. If the output is large or arrives in small TCP segments, this means multiple read cycles.
Handle pagination. --More-- prompts. Most tools send terminal length 0 first, which is another command round trip.
Navigate mode transitions. enable, configure terminal, interface GigabitEthernet0/0. Each one is a command, a prompt change, and a round trip.
Wait for command completion. Netmiko’s default read_timeout adds deliberate delays to avoid reading partial output.

None of this is Netmiko’s fault. It’s doing the best it can with what SSH gives it: an unstructured byte stream with no framing, no content-length, no end-of-message delimiter. The tool has to infer when the device is done talking. Compare that to HTTPS (Content-Length), NETCONF (]]>]]> delimiter), or even SSH exec mode (exit codes). PTY is the only mode where the client has to guess when the response is complete.
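To make the guessing concrete, here is a minimal sketch of prompt detection over a chunked byte stream. It is not Netmiko's implementation: the `read_until_prompt` function, the `PROMPT` pattern, and the fake three-chunk session are all hypothetical, and real tools handle far more edge cases (ANSI escapes, banners, changing prompts).

```python
import re

# Hypothetical pattern: a line break, a hostname, then "#" or ">" at the very end.
PROMPT = re.compile(rb"[\r\n][\w.\-]+[#>] ?$")

def read_until_prompt(read_chunk, prompt=PROMPT, max_reads=1000):
    """Accumulate chunks until the buffer ends with something that looks like a prompt.

    read_chunk is any zero-argument callable returning the next bytes from the
    session. Returns (buffer, number_of_reads).
    """
    buffer = b""
    for reads in range(1, max_reads + 1):
        buffer += read_chunk()
        # The whole tail of the buffer is re-checked after every chunk,
        # because nothing in the stream marks the end of the output.
        if prompt.search(buffer):
            return buffer, reads
    raise TimeoutError("prompt not seen; output may be incomplete")

# A fake session that delivers one command's output in three TCP-sized pieces.
chunks = iter([b"show version\r\nCisco Adap", b"tive Security Appliance\r\n", b"router1# "])
output, reads = read_until_prompt(lambda: next(chunks))
print(reads, "reads to find the end of the output")
```

A line of output that happens to end in `#` at a chunk boundary would fool this pattern, which is exactly the class of problem a Content-Length header avoids.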

The HTTPS Alternative

Now consider what happens when you send the same command over HTTPS:

1. TCP three-way handshake. Same as SSH. One round trip.

2. TLS 1.3 handshake. 1 round trip. (TLS 1.2 was 2 round trips; TLS 1.3 cut it in half. With 0-RTT session resumption, it’s 0 round trips.)

3. HTTP request/response. Send the command, get the output. 1 round trip.

Total: about 3 round trips for a fresh connection. With connection reuse (HTTP keep-alive, which every HTTP client library does by default): 1 round trip per command after the first.

Side by side, the difference is stark. Here’s the same operation (5 commands at 30ms RTT) over both protocols:

And here’s the part that changes the math entirely: you can batch commands. The Cisco ASA HTTP interface accepts multiple commands in a single request, either slash-separated in the URL or newline-delimited in a POST body. Ten commands, one request, one response. That’s 1 round trip for 10 commands, vs SSH where each command requires its own channel-open and exec round trips.

No prompt detection. No pagination handling. No mode transitions. The response body is the command output, with a proper HTTP Content-Length header. Your client knows exactly when the response is complete.
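As a back-of-the-envelope check on the round-trip counts in this post, the sketch below estimates 5 commands at 30ms RTT under a latency-only model. It assumes every round trip costs exactly one RTT, that each command costs one round trip after setup, and that device processing and serialization time are zero. These are estimates from the counts above, not measurements, and `estimate_ms` is a hypothetical helper.

```python
def estimate_ms(setup_rts, per_command_rts, commands, rtt_ms):
    """Latency-only estimate: each round trip costs one RTT and nothing else."""
    return (setup_rts + per_command_rts * commands) * rtt_ms

RTT_MS, COMMANDS = 30, 5

estimates = {
    # 10-15 setup round trips for a PTY/shell session, then one per command.
    "ssh_pty_fresh_low": estimate_ms(10, 1, COMMANDS, RTT_MS),
    "ssh_pty_fresh_high": estimate_ms(15, 1, COMMANDS, RTT_MS),
    # TCP handshake + TLS 1.3 handshake, then one per command.
    "https_fresh": estimate_ms(2, 1, COMMANDS, RTT_MS),
    # Connection already open: one round trip per command.
    "https_keep_alive": estimate_ms(0, 1, COMMANDS, RTT_MS),
    # Connection already open, all commands in a single request.
    "https_batch": estimate_ms(0, 1, 1, RTT_MS),
}

for name, ms in estimates.items():
    print(f"{name:20s} {ms:4d} ms")
```

Only the batch case is independent of the number of commands; every other row grows by one RTT per command.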

Prior Art: The Cisco ASA HTTP Interface

This isn’t a theoretical proposal. Cisco has shipped an HTTP-based CLI interface on the ASA for years. Their own documentation opens with this:

One way to interface with most network appliances including ASAs is via CLI. An automated tool could Telnet or SSH into a device, authenticate and execute commands, one at a time. This method has a number of drawbacks, however. The tool must maintain the state of the Telnet and SSH connection, and if that connection is broken, the login process has to be repeated. Using CLI, it is only possible to send one command at a time, so administering many firewalls would be time consuming, especially when the firewalls are some latency away from the management station.

Cisco identified the exact problem and shipped a solution. The interface is straightforward:

# Single command
curl -sk -u admin:admin https://asa.example.com/admin/exec/show+version

# Multiple commands in one request
curl -sk -u admin:admin https://asa.example.com/admin/exec/show+version/show+ip+interface+brief

# Bulk config push
curl -sk -u admin:admin -X POST --data-binary @config.txt \
 https://asa.example.com/admin/config

Basic auth over TLS. URL-encoded commands in the path. Newline-delimited commands in a POST body. No SDK, no client library, no special protocol. Just HTTPS.
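The same multi-command GET can be sketched with nothing but the Python standard library. This mirrors the curl examples above rather than any official client: the `exec_url` and `run_commands` names are hypothetical, and how the device treats a percent-encoded `/` inside a command is an assumption to verify on your own hardware.

```python
import base64
import ssl
import urllib.parse
import urllib.request

def exec_url(host, commands):
    """Build an /admin/exec/ URL with one path segment per command."""
    # quote_plus turns "show version" into "show+version", matching the curl examples.
    # It also percent-encodes "/" inside a command (assumed acceptable; unverified).
    segments = "/".join(urllib.parse.quote_plus(cmd) for cmd in commands)
    return f"https://{host}/admin/exec/{segments}"

def run_commands(host, commands, username, password, timeout=30):
    """Send all commands in one GET and return the response body as text."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(
        exec_url(host, commands),
        headers={"Authorization": f"Basic {token}"},
    )
    # Equivalent of curl -k: certificate verification is disabled. Lab use only.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, timeout=timeout, context=context) as response:
        return response.read().decode(errors="replace")

print(exec_url("asa.example.com", ["show version", "show ip interface brief"]))
```

Calling `run_commands("asa.example.com", ["show version"], "admin", "admin")` would issue the request itself; in production you would verify the certificate instead of disabling the check.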

Aaron Hackney, another principal engineer at Rackspace at the time who was dealing with the same fleet of ASAs, went deep on this interface in a BRKSEC-2031 session at Cisco Live Orlando 2018. The work involved reverse-engineering the ASDM client’s HTTP calls and building Python tooling to script against the same interface. The two-part Cisco community blog series that followed is still one of the best references for anyone looking to automate ASAs over HTTPS instead of SSH.

“Great, but My Switches Don’t Have an HTTPS CLI”

The network automation community has spent a decade building tooling around SSH. Netmiko, Paramiko, Ansible’s network_cli, Nornir, NAPALM. All SSH-based for CLI interaction. That investment is real and valuable. NETCONF and gNMI exist as alternatives, but they require device support for structured data models. A different paradigm entirely. Many organizations have thousands of CLI commands, templates, and playbooks that work. They don’t need a new data model. They need a faster pipe.

And the ASA’s HTTP interface is the exception, not the rule. Most network platforms (IOS, NX-OS, EOS, Junos) don’t expose their CLI over HTTPS. Realistically, that’s not going to change across the industry without a major shift in how vendors think about management plane interfaces. I’m not holding my breath.

But the protocol overhead problem doesn’t go away just because the devices don’t support the better transport natively. So what do you do?

You move the expensive SSH transactions to the edge. Put a lightweight proxy, an API gateway, a microservice, whatever you want to call it, close to the devices it manages. That proxy talks SSH to the devices over a low-latency local network where the round-trip overhead is negligible. Your automation platform talks HTTPS to the proxy over the WAN, where the round-trip savings actually matter.

The pattern looks like this: your centralized automation (Ansible, Nornir, custom tooling) sends an HTTPS request to a proxy co-located with the devices. The proxy opens an SSH session to the device on a 1-2ms local link, executes the commands, and returns the output in the HTTP response. Your automation never touches SSH directly. It gets the speed of HTTPS over the WAN and the compatibility of SSH on the last hop.
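To show the shape of that pattern, here is a deliberately tiny sketch using Python's built-in HTTP server. It is not NAAS and not the proxy from this series: `run_batch`, `fake_ssh`, and `ProxyHandler` are hypothetical names, the SSH hop is a stub, and TLS, authentication, and device selection are all left out.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

def run_batch(body, run_command):
    """Run each non-empty line of the request body as one command, in order."""
    commands = [line.strip() for line in body.splitlines() if line.strip()]
    return "\n".join(f"{cmd}\n{run_command(cmd)}" for cmd in commands)

def fake_ssh(command):
    # Stand-in for the local SSH hop. A real proxy would run the command
    # against the device over the low-latency local link.
    return f"(output of {command!r})"

class ProxyHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # The WAN side: one request carries every command for this device.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode()
        payload = run_batch(body, fake_ssh).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

# To try it locally over plain HTTP (illustration only):
#   HTTPServer(("127.0.0.1", 8080), ProxyHandler).serve_forever()
print(run_batch("show version\nshow clock\n", fake_ssh))
```

The design point is that the per-command round trips happen inside `run_batch`, next to the device, while the automation platform pays a single request and response across the WAN.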

This isn’t hypothetical. It’s the architecture behind tools like my NAAS (Netmiko as a Service) application, which wraps Netmiko’s SSH sessions behind a REST API. Deploy a NAAS instance in each region or data center, and your automation talks HTTPS to the nearest one. The SSH overhead stays local. The WAN traffic is pure HTTPS.

In Part 2, I’ll detail a dual-protocol device emulator I built in Go that serves the same commands over both SSH and HTTPS, and a benchmark client that measures the difference at realistic latencies. The code is already public if you want to run ahead.

My take: SSH is a fine protocol for interactive terminal sessions. It’s a poor protocol for automation at scale. The round-trip overhead is baked into the protocol design, and no amount of connection pooling or multiplexing fully eliminates it. HTTPS with TLS 1.3 is a strictly better transport for the “send command, get output” pattern that defines CLI automation. The industry should be building toward it.