Introducing Azoth: A Deterministic EVM Bytecode Obfuscator
We're excited to introduce Azoth, an open-source EVM bytecode obfuscator. It preserves contract logic while dissolving recognizable patterns, making reverse engineering at scale costly and impractical.
Preface
I first came across the term program obfuscation in this blog post: Programmable Cryptography I around November 2024, and I was hell-bent on practically proving the concept. Next thing I know, Mirage
Here's an analogy for Mirage: say Alice wants to make an on-chain transaction such that an outside observer cannot easily tell if Alice's on-chain action is a privacy-seeking transaction or just business as usual. Mirage is how Alice gets that executed. In Mirage, users fulfill their transactions by executing contracts whose runtime bytecode has been obfuscated on the EVM.
Wait, why is bytecode obfuscation needed?
Ethereum smart contracts are public by design. Once deployed, anyone can inspect their bytecode and reverse engineer the logic, function selectors, and control-flow structure. For most projects, this transparency is part of the trust model. But for Mirage, it's a liability. Mirage's privacy guarantees depend on preventing outside observers from distinguishing a privacy-seeking transaction from an ordinary one.
The danger is simple: if Mirage bytecode looks recognizable, an adversary can tag it and trace every interaction. That breaks privacy before it starts. Obfuscation flips the economics. Reverse engineering one contract is possible, but at scale it's a losing game.
This is where bytecode obfuscation becomes essential: it makes reverse engineering and signature-based detection materially harder while maintaining functionality. It ensures that function dispatchers, control flow, and constants are concealed, so a Mirage contract looks statistically similar to unrelated contracts on-chain. And because Mirage's contracts are open-source and auditable, there is no new trust assumption. You can enjoy the benefits of obfuscation while keeping it fully trustless.
What is Azoth?
In alchemy, Azoth was the legendary universal solvent, said to heal, transform, and protect. In Ethereum, we take up the role of alchemists, and Azoth becomes our solvent: an obfuscator that transforms EVM bytecode to protect privacy.
Azoth's primary goal is to produce multiple, deterministic variants of the same bytecode that are hard to recognize and fingerprint at scale. It is not about protecting a single deployed contract (which can be deobfuscated once) or providing security guarantees. Instead, its power comes from hiding the same underlying logic across many obfuscated deployments. To reverse engineer them, an adversary must first detect that a contract is obfuscated, deobfuscate it, and then repeat that process for every contract instance. At that point, the effort becomes prohibitively expensive in both time and computation.
While Mirage is our main use case, Azoth is open source and available to anyone. It works by surgically obfuscating runtime bytecode, preserving semantics but disrupting structural patterns.
The Alchemy of Azoth
Azoth works in three stages: dissolve, transform, reconstitute. First it peels off the clean runtime code. Then it scrambles selectors, control flow, and constants so the usual bytecode fingerprints vanish. Finally, it stitches the contract back together, fully functional but unrecognizable. The result is the same logic in many unique disguises.
1. Dissolution
Azoth starts by decoding the raw bytes into an instruction stream with program counters and immediates. It then performs sectioning: it separates the creation code from the persistent runtime and peels off auxdata and other trailing metadata. When the creation scaffold is present, its boundary is used directly. Here, "boundary" means the precise split between the creation program, which executes once, and the runtime blob it returns. Concretely, the creation program copies the embedded runtime into memory and returns it: the RETURN (0xF3) marks the end of creation execution, and the CODECOPY source/length operands delimit the runtime slice (often visible as ... f3 fe 60 80 60 40 ..., with runtime starting right after f3 fe). The scaffold looks like:
; creation scaffold (PUSHn -> PUSH1/2/3/4 depending on size)
PUSHn <len> PUSHn <src> PUSH* <dst> CODECOPY
PUSHn <len> PUSHn <dst> RETURN ; 0xF3
; often followed by an 0xFE sentinel from solc
When the scaffold is missing or ambiguous, Azoth actively probes for a Solidity‑style dispatcher over sliding windows of the instruction stream and uses the first positive hit to anchor the start of runtime. The probe looks for selector extraction followed by comparison blocks. Common extraction shapes include:
; standard selector extraction (Solidity style)
[PUSH0 | PUSH1 0x00] CALLDATALOAD PUSH1 0xe0 SHR
; tolerated variants that appear in the wild
CALLDATALOAD PUSH1 0xe0 SHR
Once a candidate extraction is seen, the detector seeks repeated comparison/jump blocks of the form:
DUP1 PUSH4 <imm> (EQ|GT) PUSH{1..4} <addr> JUMPI
The earliest convincing occurrence of this sequence in a plausible window anchors the runtime start. With the boundary fixed, a strip step removes everything that isn't runtime and produces a clean runtime blob together with a compact report describing what was removed, the exact byte/PC ranges, and a PC remapping for round‑trip reconstruction. Over this cleaned slice Azoth builds a control‑flow graph and IR to make subsequent rewrites safe.
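To make the decode-and-probe step concrete, here is a minimal Python sketch. It is not Azoth's code: the names `Instr`, `decode`, `find_extraction`, and `compare_blocks` are made up for illustration, and the probe only covers the plain `CALLDATALOAD PUSH1 0xe0 SHR` shape and the comparison block shown above, without any windowing or plausibility scoring.

```python
from typing import List, NamedTuple, Optional, Tuple

# Opcode values from the EVM instruction set.
CALLDATALOAD, SHR, DUP1, JUMPI = 0x35, 0x1C, 0x80, 0x57
EQ, GT, PUSH1, PUSH4 = 0x14, 0x11, 0x60, 0x63


class Instr(NamedTuple):
    pc: int      # byte offset of the opcode
    op: int      # opcode byte
    imm: bytes   # immediate data (non-empty only for PUSH1..PUSH32)


def decode(code: bytes) -> List[Instr]:
    """Decode raw bytes into instructions, skipping over PUSH immediates."""
    out, pc = [], 0
    while pc < len(code):
        op = code[pc]
        n = op - 0x5F if 0x60 <= op <= 0x7F else 0  # PUSH1..PUSH32 carry 1..32 bytes
        out.append(Instr(pc, op, code[pc + 1:pc + 1 + n]))
        pc += 1 + n
    return out


def find_extraction(instrs: List[Instr]) -> Optional[int]:
    """Index of the first `CALLDATALOAD PUSH1 0xe0 SHR` sequence, or None."""
    for i in range(len(instrs) - 2):
        a, b, c = instrs[i:i + 3]
        if a.op == CALLDATALOAD and b.op == PUSH1 and b.imm == b"\xe0" and c.op == SHR:
            return i
    return None


def compare_blocks(instrs: List[Instr], start: int) -> List[Tuple[str, int]]:
    """Collect (selector hex, jump target) from DUP1 PUSH4 (EQ|GT) PUSHn JUMPI blocks."""
    found, i = [], start
    while i + 4 < len(instrs):
        dup, sel, cmp_, addr, jmp = instrs[i:i + 5]
        if (dup.op == DUP1 and sel.op == PUSH4 and cmp_.op in (EQ, GT)
                and PUSH1 <= addr.op <= PUSH4 and jmp.op == JUMPI):
            found.append((sel.imm.hex(), int.from_bytes(addr.imm, "big")))
            i += 5
        else:
            i += 1
    return found


# A two-entry Solidity-style dispatcher, hand-assembled for the demo.
demo = bytes.fromhex("60003560e01c8063a9059cbb14603057" "80637ff36ab514601a57")
instrs = decode(demo)
hit = find_extraction(instrs)
selectors = compare_blocks(instrs, hit + 3) if hit is not None else []
```

On the demo bytes, `selectors` holds the two selector/target pairs in dispatcher order. A real detector would also need to decide how many comparison blocks count as "convincing".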
2. Transformation
Azoth currently applies four transforms:
Function Dispatcher (major transform)
Typical Solidity bytecode exposes its selectors through a tell‑tale dispatcher table, so anyone can recognize which function is being called:
PUSH1 0x00 CALLDATALOAD PUSH1 0xe0 SHR
DUP1 PUSH4 0xa9059cbb EQ PUSH1 0x30 JUMPI
DUP1 PUSH4 0x7ff36ab5 EQ PUSH1 0x1a JUMPI
Fingerprinting this at scale is trivial. So what does Azoth do?
- Tokenizes selectors: we take the original 4-byte function selector s to obtain a 4-byte token t using a keyed Keccak hash with a secret seed; mathematically, t = Keccak(seed || s)[..4].
- Disguises extraction: avoids the signature CALLDATALOAD; SHR 0xe0 by:
  - Generating a zero value on the stack using non-obvious ops (e.g., PUSH x; PUSH x; SUB or XOR), or briefly storing/loading from memory with MSTORE/MLOAD.
  - Loading calldata with CALLDATALOAD at a position/order that doesn't match Solidity's clean sequence.
  - Masking the result with AND and a token-width mask (e.g., 0xffffffff for 4-byte tokens) instead of shifting 224 bits. This preserves semantics but breaks the classic dispatcher signature.
- Comparison style: the rewriter currently emits EQ comparisons and may reorder the check order. (GT pivot chains are recognized by detection but not emitted.)
For example:
; disguise 0 (SUB val val) | or XOR or MSTORE/MLOAD
PUSH1 0x37
PUSH1 0x37
SUB
; load calldata and mask to token width
CALLDATALOAD
PUSH4 0xffffffff
AND
DUP1
; compare against tokenized selector
PUSH4 0x04cad84c ; token for original selector (example)
EQ
PUSH1 0x29
JUMPI
; ... next comparisons ...
Given the same seed, tokenization is deterministic.
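The token formula t = Keccak(seed || s)[..4] can be sketched in a few lines of Python. One loud caveat: Python's standard library has no Ethereum-style Keccak-256, so this sketch swaps in `hashlib.sha3_256` (NIST SHA3-256, which pads differently and produces different digests) purely to show the shape of the derivation. The function name `tokenize_selector` and the seed values are hypothetical.

```python
import hashlib


def tokenize_selector(seed: bytes, selector: bytes) -> bytes:
    """Derive a 4-byte token from a secret seed and a 4-byte selector.

    STAND-IN HASH: sha3_256 is NOT Ethereum's Keccak-256. Swap in a real
    Keccak-256 implementation to follow the formula in the text exactly.
    """
    if len(selector) != 4:
        raise ValueError("selector must be exactly 4 bytes")
    digest = hashlib.sha3_256(seed + selector).digest()  # hash(seed || s)
    return digest[:4]                                    # [..4]


seed = b"example-seed"                        # hypothetical secret seed
transfer = bytes.fromhex("a9059cbb")          # selector from the dispatcher above
token_a = tokenize_selector(seed, transfer)
token_b = tokenize_selector(seed, transfer)   # same seed, same input
```

Because the derivation is a pure function of the seed and the selector, `token_a` and `token_b` are identical, which is what makes each obfuscated variant reproducible from its seed.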
Opaque Predicate
Injects always-true (or always-false) predicates built from cheap arithmetic or constant equality (e.g., XOR + ISZERO, or EQ on identical constants). This adds dummy control flow that never influences observable behavior but complicates the shape of the CFG.
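A quick way to convince yourself that such predicates are constant is to build them and run them through a toy stack machine. The sketch below is illustrative only: `opaque_true` and `evaluate` are hypothetical helpers, the evaluator understands just the four opcodes involved, and a real injected predicate would be followed by a jump target and a JUMPI.

```python
PUSH1, EQ, ISZERO, XOR = 0x60, 0x14, 0x15, 0x18


def opaque_true(c: int, style: str = "xor") -> bytes:
    """Emit bytecode that always leaves 1 on the stack, for any byte constant c."""
    if not 0 <= c <= 0xFF:
        raise ValueError("constant must fit in one byte")
    if style == "xor":   # c XOR c == 0, and ISZERO(0) == 1
        return bytes([PUSH1, c, PUSH1, c, XOR, ISZERO])
    if style == "eq":    # EQ on identical constants == 1
        return bytes([PUSH1, c, PUSH1, c, EQ])
    raise ValueError(f"unknown style: {style}")


def evaluate(code: bytes) -> int:
    """Run PUSH1/XOR/ISZERO/EQ on a toy stack and return the top of the stack."""
    stack, pc = [], 0
    while pc < len(code):
        op = code[pc]
        if op == PUSH1:
            stack.append(code[pc + 1])
            pc += 2
            continue
        if op == XOR:
            stack.append(stack.pop() ^ stack.pop())
        elif op == ISZERO:
            stack.append(int(stack.pop() == 0))
        elif op == EQ:
            stack.append(int(stack.pop() == stack.pop()))
        else:
            raise ValueError(f"unsupported opcode 0x{op:02x}")
        pc += 1
    return stack.pop()


# Every constant and both styles evaluate to 1 (always true).
always_true = all(
    evaluate(opaque_true(c, s)) == 1 for c in range(256) for s in ("xor", "eq")
)
```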
Shuffle (block‑level)
This is a simple block-level randomization that changes program layout without affecting execution. For example, 0x60015b6002 -> 0x5b60026001. It does not yet shuffle instructions inside a block (that's planned).
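The layout step on its own can be sketched as follows. This is a simplification under stated assumptions: blocks are split only at JUMPDEST, the helper names `split_blocks` and `shuffle_blocks` are invented, and the sketch does not patch jump targets or preserve fall-through between blocks, which any semantics-preserving shuffle has to handle once JUMPDEST offsets move.

```python
import random
from typing import List

JUMPDEST = 0x5B


def split_blocks(code: bytes) -> List[bytes]:
    """Split bytecode into chunks that each begin at a JUMPDEST (except possibly the first)."""
    blocks, start, pc = [], 0, 0
    while pc < len(code):
        op = code[pc]
        if op == JUMPDEST and pc > start:
            blocks.append(code[start:pc])
            start = pc
        # Skip PUSH immediates so a 0x5b data byte is not mistaken for JUMPDEST.
        pc += 1 + (op - 0x5F if 0x60 <= op <= 0x7F else 0)
    if start < len(code):
        blocks.append(code[start:])
    return blocks


def shuffle_blocks(code: bytes, seed: int) -> bytes:
    """Reorder blocks with a seeded RNG so the same seed gives the same layout."""
    blocks = split_blocks(code)
    random.Random(seed).shuffle(blocks)
    return b"".join(blocks)


original = bytes.fromhex("60015b6002")
blocks = split_blocks(original)               # [PUSH1 01] and [JUMPDEST PUSH1 02]
swapped = b"".join(reversed(blocks))          # the reordering from the example above
```

Here `swapped.hex()` is `5b60026001`, matching the example in the text.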
Jump Address Transformer
Rewrites direct jump immediates into small arithmetic expressions that recompute the same target at runtime. For example, it replaces PUSH1 0x42 JUMP with PUSH1 0x20 PUSH1 0x22 ADD JUMP.
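As a rough illustration of the rewrite, the sketch below splits a one-byte target into two addends and emits the replacement sequence. The function `rewrite_jump` is hypothetical, it only handles targets that fit in PUSH1, and it ignores the fact that the longer sequence shifts later offsets.

```python
PUSH1, ADD, JUMP = 0x60, 0x01, 0x56


def rewrite_jump(target: int, a: int) -> bytes:
    """Replace `PUSH1 target JUMP` with `PUSH1 a PUSH1 (target - a) ADD JUMP`."""
    if not 0 <= a <= target <= 0xFF:
        raise ValueError("need 0 <= a <= target <= 0xFF")
    b = target - a                      # a + b recomputes the target at runtime
    return bytes([PUSH1, a, PUSH1, b, ADD, JUMP])


rewritten = rewrite_jump(0x42, 0x20)    # the example from the text
```

`rewritten.hex()` is `602060220156`, i.e. PUSH1 0x20 PUSH1 0x22 ADD JUMP. Choosing `a` from a seeded RNG would give a different but equivalent expression per variant.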
3. Reconstitution
Once the transforms have run, Azoth reassembles the obfuscated runtime bytecode with the non-runtime sections (auxdata, trailing metadata, etc.). How does it do this?
- Encode: first, Azoth encodes the transformed IR back into its bytecode representation.
- Reassemble: it then rejoins the obfuscated runtime with the stripped non-runtime code, according to the strip report.
- Verification (in progress): the verification stack includes:
  - Practical testing via EVM simulation (state/output/event/gas equivalence).
  - Formal verification (semantic extraction, SMT/Z3 integration, and proof objects), designed to prove behavior preservation for all inputs. The framework and types are in place; proofs and property catalogs are being built out. The pipeline's goal is to fail-shut if any divergence is found.
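The strip-then-reassemble round trip can be pictured with a deliberately small model. Everything here is an assumption for illustration: the real strip report also records byte/PC ranges and a PC remapping, whereas this hypothetical `StripReport` keeps only the bytes before and after the runtime. The sketch also leaves the creation code untouched, so it presumably would need extra work whenever the obfuscated runtime changes length and the scaffold's CODECOPY length no longer matches.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class StripReport:
    """Hypothetical, minimal report: what was cut off around the runtime."""
    prefix: bytes   # bytes before the runtime (e.g. creation code)
    suffix: bytes   # bytes after the runtime (e.g. auxdata / metadata)


def strip(code: bytes, start: int, end: int) -> Tuple[bytes, StripReport]:
    """Cut out code[start:end] as the runtime and remember the rest."""
    if not 0 <= start <= end <= len(code):
        raise ValueError("invalid runtime range")
    return code[start:end], StripReport(code[:start], code[end:])


def reassemble(runtime: bytes, report: StripReport) -> bytes:
    """Put a (possibly rewritten) runtime back between the stripped sections."""
    return report.prefix + runtime + report.suffix


full = bytes.fromhex("aaaa" "60015b6002" "bbbb")   # fake prefix, runtime, suffix
runtime, report = strip(full, 2, 7)
round_trip = reassemble(runtime, report)
patched = reassemble(bytes.fromhex("5b60026001"), report)
```

With an untouched runtime, `round_trip` equals `full`; with a rewritten runtime, only the middle section changes.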
Check it out!
We invite you to explore the Azoth repository, try it out, and share feedback. If you find a way to break Azoth's obfuscation or prove semantic inequivalence, we want to talk.
