Structurizing gotos


Problem

WebAssembly’s control flow is structured. Branches (br, br_if, br_table) only target enclosing block or loop constructs. There is no general indirect jump.

C’s goto is not structured. A goto can target any label in the same function, in any direction, across any control-flow boundary. break and continue are structured at the source level, but they have to be re-encoded along with goto once the function is rewritten as a state machine.

compiler.js handles this in two passes:

  • GOTO_NORMALIZER — an IIFE in compiler.js exposing { optimize, collectTraces, … }. Source-level AST rewrite that tries to make every (goto, label) pair resolvable by structured wasm scopes — hoisting labels and inserting fall-through plugs as needed. Best-effort.
  • IRREDUCIBLE_LOWERING — the next IIFE in the file. Whole-function CFG rewrite. Invoked by the codegen as a per-function retry when structured emit hits an unresolvable goto. Total: any function it touches emerges as plain structured C the existing codegen handles unchanged.

This post is about the second pass.

When pass 2 runs

Not as a discrete pass over the unit. The codegen tries structured emit on every function and falls back to IRREDUCIBLE_LOWERING only on functions whose structured emit failed.

The structured codegen accumulates a gotoErrors array as it walks each function — every goto whose target it cannot resolve to a structured wasm scope produces an entry. Right before each function’s emit, the driver snapshots the lengths of the wasm body, locals, source-map entries, local-name entries, and gotoErrors. After the emit, if gotoErrors grew, the driver:

  1. Rolls back the appended bytes, locals, source-map entries, and local-name entries to the snapshot.
  2. Clears the new gotoErrors.
  3. Calls IRREDUCIBLE_LOWERING.lower(fdef) to rewrite the function body.
  4. Re-emits. The lowered body is plain structured C, so the second emit cannot produce goto errors.

Consequences:

  • Functions whose structured emit succeeds pay no extra cost beyond five length-snapshots and one comparison. The fast path is the default.
  • The classifier is the codegen. Whatever the structured emit cannot resolve, the retry rewrites — no separate analysis is needed, and there is no opportunity to misclassify a function.
  • --no-irreducible-lowering skips the retry. On affected functions, the gotoErrors are surfaced as diagnostics and the compile fails.

An earlier version of the IIFE exposed a pre-classifier interface (needsLowering(funcDef) and a per-unit optimize(unit, options) driver) that walked the unit deciding which functions to rewrite. That interface was strictly worse than the codegen retry — it paid analysis cost on every function and risked misclassification — and was removed once the retry path was in place. The current public surface is just { lower }.

The lowering, end to end

IRREDUCIBLE_LOWERING.lower(funcDef) runs three sub-passes:

  1. hoistDeclarations(funcDef) — move every local declaration to a flat list at function entry. Replace each in-place declaration with an assignment at the original site.
  2. buildSegments(rewrittenBody) — walk the AST and split it at every control-flow boundary into a flat array of segments (the AST equivalent of basic blocks), each ending in one of six terminators.
  3. synthesizeWrapper(funcDef, decls, segments) — build the new function body: hoisted decls, a state variable, and a while (1) switch (__state) whose cases are the segments.

The output of pass 2 is plain structured C and feeds back through the existing codegen unchanged. No new codegen path is added; the lowering is purely AST→AST.

Sub-pass 1: hoist declarations

The reason hoisting is necessary: the state machine breaks every scope. Once a declaration’s lifetime is split across multiple case bodies of a switch, the original block-scoped storage no longer matches the visibility the program needs. The fix is to give every local declaration function-wide scope and keep its name unique.

hoistDeclarations produces { decls, rewrittenBody }:

  • Walk the body. For each SDecl containing DVars (skip static and extern locals), strip the initExpr and push the DVar onto the hoisted list.
  • If the original declaration had an initializer, emit an SExpr of name = initExpr at the original site. The DVar is now plain, but the initialization still runs in source order.
  • Aggregate, array, and EInitList initializers are an exception — keep the original initExpr on the hoisted DVar; the codegen runs them at the function-entry declaration point.
  • Maintain a Set<string> of names seen so far. When a nested-scope DVar shadows an outer one, α-rename it (xx__2) by mutating DVar.name. EIdent nodes hold the DVar by reference, so every use site updates automatically.

Output invariant: every DVar in the function has a unique name within the function, lives in decls, and is initialized (if at all) by an explicit assignment at its original source position.

Sub-pass 2: build segments (the partial CFG)

This is the heart of the lowering. The output is a flat array of segments, where each segment is a maximal run of straight-line statements bounded by one terminator. Together they form a graph: terminators reference segments by integer id.

It is a “partial” CFG in the sense that it carries only the structure the rewrite needs — no dominance, no SSA, no def–use chains. Each segment knows what to execute and what to do next; nothing more.

Data structures

// Segment ≈ basic block. Sorted by id; id becomes the switch-case value.
function newSegment(id) {
  return { id, stmts: [], term: null };
}

// One of six terminator shapes.
const Term = {
  fallthrough: (nextId)              => ({ kind: "fallthrough", nextId }),
  goto:        (targetId)            => ({ kind: "goto", targetId }),
  branch:      (cond, thenId, elseId)=> ({ kind: "branch", cond, thenId, elseId }),
  switchDispatch: (scrutinee, cases, defaultTarget)
                                     => ({ kind: "switch", scrutinee, cases, defaultTarget }),
  return:      (expr)                => ({ kind: "return", expr }),
  halt:        ()                    => ({ kind: "halt" }),
};

Six terminator kinds, all the function-local control-flow shapes that survive past the AST level: an unconditional jump (fallthrough/goto — same encoding, separate kinds for diagnostics), a two-way conditional (branch), an N-way dispatch (switch), a function exit (return), and a “this segment is unreachable / end of function” marker (halt).

The visitor

buildSegments opens the entry segment and recursively visits the rewritten body. State carried during the walk:

  • curSeg — the segment currently being filled, or null if a terminator was just emitted.
  • nextId — counter for fresh segment ids.
  • labelToIdMap<SLabel, int>, so a forward goto L can request the id before the label is visited.
  • loopStack — stack of { breakId, continueId } records. The top entry resolves SBreak and SContinue.
  • caseIdsStack — stack of Map<SCase, int>. Each SSwitch pre-allocates one id per SCase in its body and pushes the map; the SCase handler consults the top map to find its own id. This works regardless of how deeply the case label is nested inside compound statements.

The visitor’s job per node:

AST nodeWhat the visitor does
SEmpty, SCompoundNothing / recurse.
SLabel(L)Close current with fallthrough(L_id). Open L_id.
SGoto(L)Close current with goto(L_id). If unresolved, close with halt.
SIf(cond, then, else?)Allocate thenId, elseId (if else exists), joinId. Close with branch(cond, thenId, elseId ?? joinId). Visit each branch, falling through to joinId at the end. Open joinId.
SWhile(cond, body)Allocate hdrId, bodyId, exitId. Close current with fallthrough(hdrId). Open header, close with branch(cond, bodyId, exitId). Push {breakId: exitId, continueId: hdrId}, visit body, fall through to hdrId. Pop, open exitId.
SDoWhile(body, cond)Allocate bodyId, testId, exitId. Fall into body. Push {breakId: exitId, continueId: testId}, visit. Fall to testId. Test closes with branch(cond, bodyId, exitId).
SFor(init, cond, inc, body)Lower init into the current segment. Allocate hdrId, bodyId, contId, exitId. contId is the increment segment, not the header — that’s why for has a distinct continue target. continue jumps to contId, which contains inc and falls through to hdrId.
SSwitch(expr, body)Pre-allocate one id per SCase in body.caseBag; build dispatchCases and defaultTarget. Close current with switchDispatch(expr, cases, defaultTarget). Push {breakId: postId, continueId: null}continueId: null is how the loopStack signals “switches don’t catch continue.”
SCase(v)Close with fallthrough(id_from_top_of_caseIdsStack). Open that id.
SBreakClose with goto(topBreakTarget()).
SContinueClose with goto(topLoop().continueId), where topLoop() walks the loopStack from the top, skipping entries with continueId === null (those are switches).
SReturn(e)Close with return(e).
Other (SExpr, surviving SDecl, …)Append to curSeg.stmts.

After the walk, any still-open segment is closed with halt (function fall-through).

The loopStack is doing all the work for break/continue. A for loop pushes a different continueId than its breakId because they target different segments (the increment block vs. the exit). A switch pushes continueId: null so a continue inside a switch inside a loop walks past the switch entry and finds the enclosing loop’s continue target — matching C semantics exactly.

Shape of the output

For a tiny function with a goto, a while, an if, and a return, the segment graph might look like this:

CFG of a small example function after buildSegments: six segments connected by fallthrough, branch, goto, and return edges.

The same six terminator kinds — and only those — flow into the wrapper synthesis. Any C control-flow construct that survives parsing has been reduced to a combination of straight-line statements plus one of these six edges.

Sub-pass 3: synthesize the wrapper

The output shape:

{
  /* hoisted DVar declarations */
  int __irreducible_state;
  __irreducible_state = 0;
  while (1) {
    switch (__irreducible_state) {
      case 0: /* seg 0 stmts */ /* terminator */
      case 1: /* seg 1 stmts */ /* terminator */
      /* … */
      default: break;          /* exit while on unexpected state */
    }
  }
}

Each segment becomes one case. Inside the case body, the segment’s stmts come first; then a translation of its terminator:

TerminatorStatements emitted
fallthrough(n)__state = n; break;
goto(n)__state = n; break;
branch(cond, t, e)__state = (cond) ? t : e; break;
switch(scrut, cases, def)inner switch(scrut) whose body assigns __state for each case and default, then break;
return(e)return e;
haltreturn <zero of return type>; for non-void; return; for void

The break in fallthrough/goto/branch/switch is the inner break — it exits the switch (__state) block. Control flows back to the while (1), which iterates and re-enters the switch on the new state.

There’s a small bit of footwork around the break keyword serving two roles. The wrapper uses an SBreak AST node (rendered as break; in C) inside each case body to exit the inner switch. That same SBreak would, if found outside the wrapper’s synthesized switch, target the enclosing while (1). The wrapper ensures the inner switch is always the immediately enclosing breakable, so every emitted break targets the switch as intended.

The halt terminator is the unreachable / fall-off-the-end case. It fires when:

  • The function source body ends without an explicit return (well-defined for void, undefined behavior for non-void in C — but the lowering still has to emit valid C). Returning a zero of the function’s return type is safe and predictable.
  • A goto target is unresolved, or some code path exits a segment without ever transferring control elsewhere. These cases shouldn’t reach runtime; emitting a return is a conservative placeholder.

For pointer returns the wrapper fabricates an integer zero and relies on the codegen’s existing implicit-cast machinery. For aggregate / struct returns it emits a bare return; with no value, accepting the UB — the codegen may diagnose.

Cost

Static:

  • One segment per basic block boundary in the source body. Branching and loops fan out: an if/else creates 3 segments (then, else, join), a for creates 4 (hdr, body, cont, exit), an n-way switch creates n + 1.
  • One case per segment, one br_table arm per case, one nesting level of block per case (wasm’s standard switch encoding).

Runtime:

  • Every former jump is now a state-variable assignment plus an inner-switch break plus a br_table dispatch on the next iteration. Where the structured codegen would emit a single br of constant depth, the lowered version emits three operations plus a memory-resident state read.
  • Branch prediction degrades because every jump in the function now flows through the same br_table.

Functions that hit pass 2 in practice: Duff’s device, hand-written state machines, irreducible control-flow graphs from generated code, computed gotos (not supported here, but would land in the same place). Idiomatic hand-written C using the common goto patterns (goto cleanup, goto retry, goto error) almost always stays on pass 1.

Limitations (from the source comment)

  • VLAs are rejected at parse time; their hoisting would need dynamic alloca.
  • setjmp/longjmp interaction inside an irreducible function is not supported. Structured-mode functions still work.
  • Computed goto (GCC extension) is not supported generally.

References

  • Implementation: the IRREDUCIBLE_LOWERING IIFE in compiler.js, sitting between the GOTO_NORMALIZER IIFE (the pass-1 normalizer) and the LEB128 encoding utilities used by the wasm emitter. Public surface: { lower }. Invoked from the per-function retry block inside the codegen’s emitFunctionBody driver. Sub-passes are nested functions inside the IIFE: hoistDeclarations, buildSegments (with the Term factory and newSegment helper), synthesizeWrapper.
  • Pre-pass: the GOTO_NORMALIZER IIFE in the same file. Article touches it only briefly; the hoist-and-plug algorithm there is its own subject.
  • Disable flag: --no-irreducible-lowering, parsed in the compiler’s command-line option handler.
  • Prior art: Emscripten’s relooper (replaced by stackifier in modern Emscripten). LLVM’s WebAssemblyCFGStackify pass. Cheerp’s structurizer.
  • Reducibility analysis: Hecht & Ullman, “Flow Graph Reducibility,” 1972.