Wasm Autoscaling on Kubernetes, wasmCloud 2.3.0, and the WASI Preview 3 Vote

The June 10, 2026 wasmCloud community call demonstrates Wasm runtime autoscaling on Kubernetes. Jeremy Fleitz scales a single Wasm workload from one instance to 38 at roughly 460 requests per second using KEDA and the standard HPA scale subresource on the workload deployment CRD. Bailey Hayes then demos multi-subscription workloads with wash dev, which lets one workload host multiple components subscribed to distinct NATS topics with per-component config. The team discusses wasmCloud 2.3.0 with Helm-configurable NATS, a Kubernetes-native ingress story, and a new egress-control example, and looks forward to the WASI P3 launch vote scheduled for the following day, with JCO joining Wasmtime as the second reference implementation.

Key Takeaways

  • Wasm horizontal pod autoscaling on Kubernetes lands — the workload deployment CRD now exposes the standard /scale subresource, so the Horizontal Pod Autoscaler and KEDA can drive replica count off any Prometheus metric, including request volume
  • A 460 req/s demo scales the Hello World workload from 1 to 38 instances — KEDA polls an OTel-emitted invocation gauge, targets 10 req/s per instance, and reconciles the workload deployment every 10 seconds without disturbing the host pods
  • Workload-level scaling preserves wasmCloud's high-density story — host pods stay fixed per node while the scale subresource adds and removes Wasm component instances inside them, the opposite of one-pod-per-workload container scheduling
  • Helm chart enhancements expose every hard-coded NATS knob — separate control-plane and data-plane NATS URLs, BYO NATS for Synadia or external clusters, plus arbitrary container args and environment variables for custom-host images
  • wash dev now supports multiple components and per-component config — one workload can run an HTTP front-end plus several messaging handlers, each with its own subscriptions config (task.leet, task.reverse) and per-component environment variables
  • The runtime's intercomponent-linking cycle check was relaxed — duplicate exports across sibling components are no longer flagged as ambiguous when no other component in the workload imports that export, unlocking multi-handler workloads in a single wash dev session for both Rust and TypeScript templates
  • wasmCloud 2.3.0 shipped last week with workload env config and secrets in wash config YAML, OTel fixes, a new loud example host demonstrating egress control from workload components, and updated Kubernetes-native ingress docs replacing the deprecated runtime gateway
  • The WASI P3 launch vote is tomorrow — JCO is now the second reference implementation alongside Wasmtime; on a pass, wasmCloud will enable WASI P3 by default, the component-model async feature flags land in Wasmtime 47, and OCI artifacts will bundle the WIT interfaces from WASI 0.3.0

Meeting Notes

Auto-Scaling Wasm Workloads on Kubernetes

Jeremy Fleitz opened the call with the autoscaling PR that closes one of the Q2 roadmap items: bringing Wasm runtime autoscaling to Kubernetes through the standard Horizontal Pod Autoscaler contract. The change adds the /scale subresource to the workload deployment CRD that the wasmCloud runtime operator uses to schedule a Wasm workload onto a host. With that subresource in place (a replica path plus a label selector), HPA can scale the workload exactly the way it scales a Deployment. So can KEDA, the CNCF project that lets HPA target external metric sources such as Prometheus.
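As a rough illustration of what the /scale contract involves, the sketch below models the scale stanza of a CustomResourceDefinition as a Python dict and resolves its paths against a sample object. The field names specReplicasPath, statusReplicasPath, and labelSelectorPath are the standard Kubernetes CRD ones. The sample resource, its kind name, and the specific paths chosen are assumptions for illustration and are not taken from the wasmCloud operator.

```python
# Scale stanza as it would appear under `subresources.scale` in a
# Kubernetes CustomResourceDefinition version. The paths are assumed,
# not copied from the wasmCloud operator.
SCALE_SUBRESOURCE = {
    "specReplicasPath": ".spec.replicas",
    "statusReplicasPath": ".status.replicas",
    "labelSelectorPath": ".status.selector",
}


def resolve(obj, json_path):
    """Follow a simple dotted path such as '.spec.replicas' through nested dicts."""
    node = obj
    for key in json_path.strip(".").split("."):
        node = node[key]
    return node


def read_scale(resource):
    """Project a custom resource onto the fields an autoscaler reads via /scale."""
    return {
        "desired": resolve(resource, SCALE_SUBRESOURCE["specReplicasPath"]),
        "current": resolve(resource, SCALE_SUBRESOURCE["statusReplicasPath"]),
        "selector": resolve(resource, SCALE_SUBRESOURCE["labelSelectorPath"]),
    }


def write_scale(resource, replicas):
    """Apply an autoscaler decision: only the spec replica count changes."""
    resource["spec"]["replicas"] = replicas  # corresponds to specReplicasPath
    return resource


# Hypothetical workload deployment object (all names are illustrative).
workload = {
    "kind": "WorkloadDeployment",
    "spec": {"replicas": 1},
    "status": {"replicas": 1, "selector": "app=hello-world"},
}

print(read_scale(workload))
write_scale(workload, 38)
print(read_scale(workload))
```

The point of the sketch is that the autoscaler never needs to understand the rest of the resource: it reads three fields and writes one, which is why the same controller can drive a Deployment or a custom workload type.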

The architectural point Jeremy emphasised is that the host pod count stays fixed. Containers scale by adding pods because each container is its own scheduling unit. wasmCloud hosts intentionally pack many Wasm component instances into one host pod per node — that is most of the density story. So the scale subresource here adds and removes workload instances inside the existing host pods, not host pods. HPA literally stands for horizontal pod autoscaling, but the wasmCloud workload deployment CRD reuses that contract to scale workload components, which is the unit the platform actually cares about.

The demo ran the stock Hello World workload component on a Kind cluster following the operator quickstart, with Prometheus and KEDA installed via their Helm charts and .NET Aspire running as the OTel dashboard. KEDA was pointed at a workload-emitted Prometheus gauge with a target of 10 requests per second per instance; a local script then drove ~450 requests per second at localhost, which the Kind port-forward routed through the Kubernetes Service into the host pods. Within seconds the workload scaled out: Jeremy watched three hosts pick up 12, 10, and 7 instances on the way to a stable 38 instances handling close to 460 req/s. KEDA polls the OTel gauge every five seconds, the operator updates the deployment every ten, and a stabilization window keeps the controller from flapping. The test script and a runnable harness will land in the public Cosmonic Labs repo so anyone can replay this on their own cluster.
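The replica arithmetic behind a per-instance target can be sketched as follows. This is a simplified model of how the Horizontal Pod Autoscaler treats an average-value target, not a reconstruction of the demo: the 0.1 tolerance mirrors the documented HPA default, while the replica bounds and the sample numbers are assumptions. The stabilization window is not modeled.

```python
import math


def desired_replicas(current, total_metric, target_avg,
                     tolerance=0.1, min_replicas=1, max_replicas=100):
    """Simplified HPA-style calculation for an average-value target.

    current       -- replicas running now
    total_metric  -- metric summed across the workload (e.g. req/s)
    target_avg    -- desired metric value per replica
    """
    usage_ratio = total_metric / (target_avg * current)
    # Inside the tolerance band the autoscaler leaves the count alone.
    if abs(1.0 - usage_ratio) <= tolerance:
        return current
    wanted = math.ceil(total_metric / target_avg)
    return max(min_replicas, min(max_replicas, wanted))


# Illustrative inputs only; they are not measurements from the call.
print(desired_replicas(current=1, total_metric=50, target_avg=10))    # scale out
print(desired_replicas(current=5, total_metric=20, target_avg=10))    # scale in
print(desired_replicas(current=38, total_metric=400, target_avg=10))  # within tolerance
```

The third call shows why a controller can sit at a replica count that is not exactly total divided by target: once the per-replica average is within the tolerance band, no change is made.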

The natural production shape is to target an ingress back-pressure metric rather than a self-reported request gauge: scale on rising request latency or queue depth, not on the workloads' own report of how busy they are. The demo used the simpler signal so the scaling behaviour was easy to read off the dashboard.

Helm Chart Enhancements: Bring Your Own NATS and Custom Host

The second item came directly out of Mike's question on the wasmCloud Slack about pointing wasmCloud at a Synadia cluster instead of the NATS instance the chart installs by default. The current escape hatch is a Kustomize external-service rewrite: workable, but ugly enough that it really should not be the answer. So Jeremy and Bailey reviewed the chart with an "anything hard-coded should be a default value, not a permanent value" lens and opened up the rest of the surface.

The new knobs land soon: a data-plane NATS URL that workload components talk on, distinct from a control-plane / scheduler NATS URL that the host and runtime operator use for orchestration — so you can isolate subjects, run different NATS topologies for each, or point one side at Synadia and keep the in-cluster NATS for the other. The chart also accepts arbitrary environment variables and extra container args passed through to the host containers, which mostly matters when you are running a custom-built host image: a pre-host-plugin world where you compile in your extensions and need to flip a flag at start time. Host component plugins are coming separately, but the args path is the bridge for now. The whole set targets the next release.

Multi-Subscription Workloads with wash dev

Bailey Hayes then demoed the matching developer-experience side — the multi-subscription PR that lets a single workload run several components with per-component config, each subscribed to a different message topic. The demo workload had an HTTP front-end (a small UI plus a task API) plus two messaging handlers behind it: task-leet (l33t-speak the input) and task-reverse (reverse the words). The front-end posts to task.leet or task.reverse on the in-memory message broker; each handler component declares its own subscription in its dev config; the host routes each subject to the matching component instance. Bailey also expanded the bundled UI template to include a small architecture diagram so demo viewers can see the shape of the workload at a glance — a pattern she wants to standardize across the templates.
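The routing shape described above can be sketched as a toy in-memory broker that maps each subject to the one component subscribed to it. The class, its methods, and the two handler functions are hypothetical stand-ins written for this illustration; they are not the wasmCloud host plugin or the demo components, and the one-subscriber-per-subject rule is an assumption.

```python
class InMemoryBroker:
    """Toy subject router: each subject is bound to exactly one component."""

    def __init__(self):
        self._routes = {}

    def subscribe(self, subject, component, handler):
        # Assumption: a subject may have only one subscriber in this sketch.
        if subject in self._routes:
            raise ValueError(f"subject {subject!r} already has a subscriber")
        self._routes[subject] = (component, handler)

    def publish(self, subject, payload):
        """Deliver the payload to the component bound to the subject."""
        component, handler = self._routes[subject]
        return component, handler(payload)


def leet(text):
    """Replace a few letters with look-alike digits."""
    return text.translate(str.maketrans("aeiot", "43107"))


def reverse_words(text):
    """Reverse the order of whitespace-separated words."""
    return " ".join(reversed(text.split()))


broker = InMemoryBroker()
broker.subscribe("task.leet", "task-leet", leet)
broker.subscribe("task.reverse", "task-reverse", reverse_words)

print(broker.publish("task.leet", "hello world"))
print(broker.publish("task.reverse", "hello wasm world"))
```

Both handlers expose the same callable shape, and the broker alone decides which one runs. That mirrors the idea that the subject, not the caller, selects the component.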

Two configuration pieces unblock this. First, wash dev now reads per-component config — a workload can declare subscriptions: [...] (and other config) at the component level instead of only at the workload level. Second, local environment variables now flow into a component's runtime context — the same ${VAR} flow that landed for env in 2.3.0 now reaches per-component config. The template is mirrored across Rust and TypeScript repos with the same HTML and the same component contract, so language choice is transparent to the user.
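A minimal sketch of the ${VAR} substitution idea, using Python's standard string.Template. The config layout, the component names as dictionary keys, and the LOG_LEVEL variable are assumptions made for illustration; the actual wash config schema and its handling of missing variables are not described in the call.

```python
from string import Template


def expand_config(config, env):
    """Recursively replace ${VAR} placeholders in every string value."""
    if isinstance(config, str):
        # substitute() raises KeyError when a referenced variable is missing.
        return Template(config).substitute(env)
    if isinstance(config, dict):
        return {key: expand_config(value, env) for key, value in config.items()}
    if isinstance(config, list):
        return [expand_config(value, env) for value in config]
    return config


# Hypothetical per-component config; the real schema may differ.
components = {
    "task-leet": {
        "subscriptions": ["task.leet"],
        "env": {"LOG_LEVEL": "${LOG_LEVEL}"},
    },
    "task-reverse": {
        "subscriptions": ["task.reverse"],
        "env": {"LOG_LEVEL": "debug"},
    },
}

resolved = expand_config(components, {"LOG_LEVEL": "info"})
print(resolved["task-leet"]["env"])
print(resolved["task-reverse"]["env"])
```

Failing loudly on a missing variable is a choice made for this sketch; a tool could equally leave the placeholder untouched or fall back to a default.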

Removing the Duplicate-Export Cycle Check

The deeper change underneath that demo was relaxing a long-standing intercomponent-linking check. Previously the runtime rejected workloads where multiple sibling components exported the same interface (the messaging handler export, in this case) because there was no way to tell which instance to invoke. That is genuinely ambiguous if another component inside the workload imports that export. But if no component imports it, and the host decides which component instance to call based on the subscription topic, the workload is not ambiguous, just multiply-exported. The PR keeps the cycle check on imports and stops rejecting duplicate exports when nothing in the workload imports them. The result: a workload can host three components that all export wasi:messaging/handler, each bound to a different subject, and the host's routing layer picks the right one per message. The host-side in-memory messaging plugin was updated alongside this to do that subject routing locally. If a WORKLOAD_NATS_URL is set, the workload automatically loads the NATS plugin instead, but the in-memory backend stays the default so first-run demos do not need any external infrastructure.
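The relaxed rule can be sketched as a small validation function: a duplicated export is only a problem when some sibling component imports that interface. This models the ambiguity rule as described in the call, not the runtime's actual code, and it does not attempt the import cycle check. The component names and the data layout are assumptions.

```python
from collections import defaultdict


def ambiguous_exports(components):
    """Return interfaces that are exported more than once AND imported by a sibling.

    components maps a component name to {"exports": [...], "imports": [...]}.
    """
    exporters = defaultdict(list)
    imported = set()
    for name, spec in components.items():
        for interface in spec.get("exports", []):
            exporters[interface].append(name)
        imported.update(spec.get("imports", []))
    return {
        interface: sorted(names)
        for interface, names in exporters.items()
        if len(names) > 1 and interface in imported
    }


# Host-routed case: two handlers export the same interface, nobody imports it.
workload = {
    "http-frontend": {"exports": ["wasi:http/incoming-handler"], "imports": []},
    "task-leet": {"exports": ["wasi:messaging/handler"], "imports": []},
    "task-reverse": {"exports": ["wasi:messaging/handler"], "imports": []},
}
print(ambiguous_exports(workload))

# Add a hypothetical sibling that imports the duplicated export.
workload["dispatcher"] = {"exports": [], "imports": ["wasi:messaging/handler"]}
print(ambiguous_exports(workload))
```

In the first case the result is empty, so the workload is accepted and the host routes by subject. In the second, the importer has no way to choose between the two exporters, which is the situation the wac composition or split-deployment workarounds address.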

For a developer who runs into the multiple-exports check elsewhere, the workarounds are still the same: compose the components together with wac so each external import is bound to a specific instance, or split the components into separate workload deployments. The change makes the in-workload, host-routed case work without forcing either.

wasmCloud 2.3.0 and Kubernetes-Native Ingress

Jeremy then handed Eric Gregory the documentation slot with two updates from the wasmCloud 2.3.0 release blog that shipped the week before. The release includes the workload env config and secrets surface in wash config YAML for wash dev, OpenTelemetry fixes, and a new example host (loud) that demonstrates how a workload component's egress can be observed and controlled — a small pattern useful for socket-isolation walkthroughs.

The bigger documentation update is the Kubernetes-native ingress architecture page under the operator manual. The previous Wasm-on-K8s ingress path went through a wasmCloud-specific runtime gateway; that gateway is deprecated in favor of using plain Kubernetes Services and Endpoint Slices. The architecture page now walks the whole request shape: an Ingress front-ends a Service, kube-proxy resolves the Service to the Endpoint Slices that the runtime operator registered for the workload's host pods, and each host pod routes incoming requests to the right Wasm component based on header and path matching. This is the same plumbing every other Kubernetes workload uses and it composes cleanly with cluster-standard ingress controllers, cert-manager, etc.
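The last hop, where a host pod picks a component for an incoming request, can be sketched as a match on the Host header and the request path. The route table shape, the longest-prefix tie-break, and all names here are assumptions for illustration; the call does not spell out the host's actual matching rules.

```python
def pick_component(routes, host, path):
    """Choose a component by Host header and longest matching path prefix.

    routes is a list of {"host": ..., "path_prefix": ..., "component": ...}.
    Returns None when nothing matches.
    """
    candidates = [
        route for route in routes
        if route["host"] == host and path.startswith(route["path_prefix"])
    ]
    if not candidates:
        return None
    # Assumption: the most specific (longest) prefix wins.
    best = max(candidates, key=lambda route: len(route["path_prefix"]))
    return best["component"]


# Hypothetical route table for one host pod.
routes = [
    {"host": "example.local", "path_prefix": "/", "component": "ui"},
    {"host": "example.local", "path_prefix": "/api/tasks", "component": "task-api"},
]

print(pick_component(routes, "example.local", "/api/tasks/42"))
print(pick_component(routes, "example.local", "/index.html"))
print(pick_component(routes, "other.local", "/"))
```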

WASI P3 Launch Vote

The last and biggest item: the WASI P3 launch vote is scheduled for the day after the call. Bailey walked through the current state. Eric has a wasi.dev refresh PR in review that reorganises the Preview 3 messaging and calls out the WASI test suite. Victor Adossi and Tomas Hribernik got JCO test-suite compliance over the line, which makes JCO the second reference implementation alongside Wasmtime. Anyone can pull main of the WASI test suite, run it against JCO, and see the pass. Two reference implementations are exactly what the subgroup needs for the vote, and if the vote passes, WASI P3 launches.

What that unlocks downstream:

  • wasmCloud will enable WASI P3 by default, shortly after the vote.
  • Alex Crichton's Wasmtime PR to flip the component-model async feature flags on by default lands in Wasmtime 47. Wasmtime 46 already branched and is in continuous fuzzing, so 47 is the realistic landing window for async-by-default.
  • Bailey will cut a wasmCloud release that bundles WASI 0.3.0 WIT interfaces into the published OCI artifacts so guests have one canonical place to pull the new world from.

She also called out Aditya Salunkhe's PR on sharing values inside the same component-model store as a great piece of work, with a follow-up conversation about the multi-store design that will land on a future call.

WebAssembly News and Updates

This call is a snapshot of the WebAssembly ecosystem putting the final pieces in place for the WASI P3 launch. With JCO joining Wasmtime as the second reference implementation, the Bytecode Alliance and the WASI subgroup have a clear path to a passing launch vote. On the runtime side, the wasi.dev refresh, the imminent Wasmtime 47 async-by-default flip, and OCI artifacts pre-bundled with WASI 0.3.0 WIT interfaces all converge into a single moment when component-model async, WASI P3, and native async I/O land in production runtimes at the same time. On the platform side, Wasm runtime autoscaling on Kubernetes now uses the standard HPA contract through KEDA, and the deprecated runtime gateway is replaced by plain Kubernetes Services and Endpoint Slices, making this the cleanest Wasm-on-Kubernetes story to date. For ongoing WebAssembly news, follow the Bytecode Alliance and the wasmCloud blog.

What is wasmCloud?

wasmCloud is a CNCF project for building applications out of WebAssembly components and deploying them across cloud, edge, and Kubernetes clusters. The Wasm component model lets you write business logic in Rust, Go, Python, TypeScript, C#, Java, and more, while the platform handles capabilities like HTTP, messaging, key-value storage, and observability through a pluggable host plugin architecture backed by Wasmtime. wash is the developer shell (build, run, deploy, debug), and the runtime operator schedules Wasm workloads on Kubernetes the same way you schedule container workloads, with built-in OpenTelemetry observability, HPA-aligned autoscaling, and native WASI P3 support on the way. The result is the production substrate for WebAssembly on Kubernetes and the edge.

Topic Deep Dive: Wasm Runtime Autoscaling on Kubernetes

The headline change in this call is the alignment between Wasm runtime autoscaling and the Kubernetes Horizontal Pod Autoscaler. Container platforms scale by adding pods because each container is the unit Kubernetes schedules. wasmCloud workloads are different: a single host pod runs many Wasm component instances, and the platform earns its density by not spinning up a fresh pod per workload. So the autoscaling problem becomes: how do we adopt the standard HPA contract — which assumes "scale this resource to N replicas" — without losing that density story?

The answer landing in this PR is that the workload deployment CRD now implements the /scale subresource directly. The HPA sees a standard resource with spec.replicas and a label selector and treats it like any other scalable thing. KEDA layers on top by reading any Prometheus metric — request volume, queue depth, ingress back-pressure — and feeding the HPA an external metric. What scales up and down is the Wasm workload instance count inside the existing host pods, which is exactly the right knob: the platform reuses host capacity, OTel-emitted metrics drive the controller, and the user gets the familiar HPA observability surface. Pair it with the Kubernetes-native ingress story and the upcoming Helm-configurable NATS and you have the first end-to-end Wasm autoscaling story that looks indistinguishable from container autoscaling to a platform engineer — which is, in turn, the foundation for running mainstream services as components at production scale.

Who Should Watch This

This call is especially valuable for three audiences. Platform engineers evaluating Wasm autoscaling on Kubernetes (0:12) get a complete walkthrough of the HPA contract on a real Kind cluster. SREs and infrastructure teams running their own NATS clusters can see how to plug Synadia or external NATS into the wasmCloud Helm chart (11:36). Component-model developers building multi-handler workloads can watch the multiple-subscription pattern running in a single wash dev session (13:14). Anyone tracking the WASI P3 launch should jump straight to Bailey's status update at 24:21; the vote that follows it changes the WebAssembly roadmap for the rest of 2026.

Up Next

The next community calls are framed by the WASI P3 launch vote the morning after this one. Watch for wasmCloud enabling WASI P3 by default, an OCI release with the WASI 0.3.0 WIT interfaces bundled in, the component-model async feature flags going on by default in Wasmtime 47, and a deeper conversation with Aditya Salunkhe on multi-store value sharing inside the component model. The Helm-chart NATS overhaul and wash dev's per-component config both ship in the next release, and Jeremy's autoscaling test harness should land in Cosmonic Labs so anyone can replay the scale-to-38 demo on their own cluster.

Get Involved

wasmCloud is a CNCF project and contributions are welcome.
