Bytecode & .class Files
Bytecode is the intermediate representation the Java compiler produces — platform-neutral instructions that any JVM can interpret or JIT-compile to native code. Reading bytecode with
javapdemystifies how generics erasure works, what lambdas really compile to, and why+on Strings is not as naive as it looks.
What Problem Does It Solve?
Java source code compiles to .class files containing JVM bytecode rather than to native machine code. This solves the write-once-run-anywhere problem: the same .class files run on Windows, Linux, macOS, ARM, and x86 — the JVM handles the translation to native instructions per platform.
From a developer's perspective, understanding bytecode answers questions that source code alone cannot:
- Why do two lambda expressions compile to different shapes?
- What does type erasure actually look like at the instruction level?
- Why does
String a = "foo" + "bar"compile to a constant, butString a = s1 + s2allocates aStringBuilder?
javap — the JDK's bytecode disassembler — is the practical tool for all of these questions.
The .class File Structure
A .class file is a precisely-structured binary format defined in the JVM Specification (JVMS §4):
Caption: The .class file is a structured binary format — the constant pool is its backbone, referenced by almost every other section via integer indices.
The Constant Pool
The constant pool is the most important part of the .class file. It is an indexed table that stores all string literals, class names, method signatures, and field descriptors used by the class. Instructions reference the pool by index (e.g., ldc #5 — load constant at index 5).
This indirection keeps bytecode instructions compact (a 2-byte index rather than a full string) and enables efficient symbol resolution during class loading.
Method Code Attribute
Each method's body lives in a Code attribute that contains:
max_stack— maximum operand stack depth the method needsmax_locals— number of local variable slots (index 0 =thisfor instance methods)code— the actual array of bytecode instructions- Exception table — which bytecode ranges are covered by which
catchhandlers
Key Bytecode Instructions
JVM bytecode is a typed, stack-based instruction set. Instructions push values onto the operand stack, pop them off, and push results back.
| Category | Examples | What they do |
|---|---|---|
| Load/store | iload, aload, istore, astore | Move values between local variables and operand stack |
| Arithmetic | iadd, dmul, isub, idiv | Pop operands, push result; prefix = type (i=int, d=double, l=long...) |
| Object | new, getfield, putfield, invokevirtual | Heap allocation, field access, method dispatch |
| Control | ifeq, goto, tableswitch | Conditional/unconditional jumps by bytecode offset |
| Return | ireturn, areturn, return | Return value (or void) to caller |
| Stack | dup, pop, swap | Manipulate the operand stack directly |
Method invocation has four opcodes reflecting Java's dispatch rules:
| Opcode | Invokes |
|---|---|
invokevirtual | Instance method (virtual dispatch through vtable) |
invokeinterface | Interface method (dispatch through itable) |
invokespecial | Constructor, private method, or super call |
invokestatic | Static method |
invokedynamic | Dynamically-linked call site (lambdas, string concat since Java 9) |
How It Works — Using javap
javap is the JDK tool for disassembling .class files. The most useful flags are:
| Flag | Output |
|---|---|
javap MyClass | Public API (field names, method signatures) |
javap -c MyClass | Disassembled bytecode for each method |
javap -verbose MyClass | Full detail: constant pool, stack sizes, exception table, attributes |
javap -p MyClass | Include private members |
Code Examples
Hello World bytecode walkthrough
// Source
public class Hello {
public static void main(String[] args) {
System.out.println("Hello, JVM!");
}
}
javac Hello.java
javap -c Hello
public static void main(java.lang.String[]);
Code:
0: getstatic #7 // Field java/lang/System.out:Ljava/io/PrintStream;
// ← pushes System.out reference onto operand stack
3: ldc #13 // String "Hello, JVM!"
// ← pushes string constant onto operand stack
5: invokevirtual #15 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
// ← pops PrintStream + String, calls println, pushes nothing (void)
8: return // ← method returns void
Generics erasure in bytecode
// Source: generic method
public <T extends Comparable<T>> T max(T a, T b) {
return a.compareTo(b) >= 0 ? a : b;
}
javap -c -p MyClass # ← -p shows all members including bridge methods
public java.lang.Comparable max(java.lang.Comparable, java.lang.Comparable);
Code:
0: aload_1 // ← load 'a' — type is Comparable (the erasure of T extends Comparable<T>)
1: aload_2 // ← load 'b'
2: invokeinterface #24, 2 // InterfaceMethod java/lang/Comparable.compareTo ...
// ← T erased to Comparable; no trace of T in bytecode
7: iflt 15
10: aload_1
11: goto 16
14: aload_2
16: areturn
Type parameter T is completely absent. The bytecode uses Comparable (the upper bound). A cast is inserted at the call site by the compiler when the return value is used as a specific type.
Lambda compilation: invokedynamic
// Source
Runnable r = () -> System.out.println("run");
r.run();
javap -c LambdaExample
0: invokedynamic #9, 0 // InvokeDynamic #0:run:()Ljava/lang/Runnable;
// ← first call: LambdaMetafactory bootstrap creates Runnable impl
// ← subsequent calls: return the cached implementation
5: astore_1
6: aload_1
7: invokeinterface #13, 1 // InterfaceMethod java/lang/Runnable.run:()V
12: return
The lambda body (System.out.println("run")) is compiled into a private synthetic method in the same class, and invokedynamic at runtime links it to a Runnable implementation via LambdaMetafactory. No anonymous inner class .class file is generated; the class is created and cached in memory the first time the bootstrap method runs.
String concatenation: invokedynamic (Java 9+)
// Source
String name = "Alice";
int age = 30;
String msg = "Hello " + name + ", age " + age;
# Java 8 — old: creates StringBuilder, calls append() 4 times
# Java 9+ — new: invokedynamic with StringConcatFactory
invokedynamic #5,0 // InvokeDynamic #0:makeConcatWithConstants:(Ljava/lang/String;I)Ljava/lang/String;
// ← StringConcatFactory generates an optimal concatenation strategy at linkage time
Java 9+ replaced the StringBuilder chain with a single invokedynamic call to StringConcatFactory. The factory selects the most efficient strategy (often a StringTemplate or intrinsified routine) for the specific types involved — and can be updated in a future JDK without changing source or bytecode.
Reading verbose output — the constant pool
javap -verbose Hello | head -30
minor version: 0
major version: 65 ← Java 21; 61=Java 17, 55=Java 11, 52=Java 8
flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
#1 = Methodref #2.#3 // java/lang/Object."<init>":()V
#2 = Class #4 // java/lang/Object
#3 = NameAndType #5:#6 // "<init>":()V
#4 = Utf8 java/lang/Object
#5 = Utf8 <init>
#6 = Utf8 ()V
...
#13 = String #14 // Hello, JVM!
#14 = Utf8 Hello, JVM!
Best Practices
- Use
javap -verbosewhen debugging framework magic — Spring AOP's CGLIB proxies, Lombok-generated code, and Kotlin coroutines all look different in bytecode.javapshows exactly what class files were actually generated. - Check major version number to verify a JAR was compiled for the target Java version: major 65 = Java 21, 61 = Java 17, 52 = Java 8.
UnsupportedClassVersionErrormeans a class was compiled for a newer version than the running JVM. - Use
jclasslib(IntelliJ plugin) orASMifierfor interactive exploration — much friendlier than rawjavapoutput for complex classes. - Don't hand-write bytecode unless using ASM for a library (e.g., Mockito, ByteBuddy). Manipulate Java source or use a proper bytecode library with a well-tested API.
Common Pitfalls
- Assuming generics survive to runtime: They don't. Type parameters are erased during compilation and replaced by their bounds (
Objectby default, or the declared upper bound). Runtime casts (checkcastopcode) enforce correct usage at the call site, not inside the generic method. - Thinking each lambda creates an inner class file: Java 7
invokedynamic+LambdaMetafactorymeans lambdas generate a class in memory at runtime, not a.classfile. The$anonymous-class files you see for pre-Java 8 anonymous inner classes do not appear for lambdas. - Confusing
invokespecialwithinvokevirtual:this.method()on a regular instance method usesinvokevirtual(polymorphic).super.method()usesinvokespecial(bypass dynamic dispatch). Private method calls also useinvokespecial— important to know when debugging viajavap. major versionmismatch crashes apps: If your build produces Java 21 bytecode (major 65) but the runtime is Java 17 (supports up to major 61), every class load throwsUnsupportedClassVersionError. This often happens in multi-stage Docker builds where the compile stage uses a newer JDK than the runtime image.
Interview Questions
Beginner
Q: What is bytecode, and why does Java compile to it instead of native machine code?
A: Bytecode is a compact, platform-neutral instruction set for the JVM stack machine. Compiling to bytecode rather than native code means the same .class files run unchanged on any JVM on any OS or CPU architecture. The JVM (specifically its JIT compiler) handles the final translation to native instructions per platform. This is the "write once, run anywhere" promise.
Q: What does javap do?
A: javap is the JDK's bytecode disassembler. Given a .class file, it prints a human-readable representation of its bytecode, constant pool, and metadata. javap -c shows bytecode instructions; javap -verbose shows the full class file structure including the constant pool, stack size requirements, and version numbers.
Intermediate
Q: How are Java generics represented at the bytecode level?
A: Type parameters are erased during compilation — a process called type erasure. In bytecode, all generic type parameters are replaced by their upper bound: Object for unbounded <T>, or the declared bound for <T extends Comparable<T>> (replaced by Comparable). The compiler inserts checkcast instructions at call sites to maintain type safety. The Signature attribute in the .class file preserves the original generic signature for reflection and IDEs, but the JVM runtime does not use it for dispatch or memory layout.
Q: How do lambda expressions compile to bytecode?
A: Lambda bodies compile to private synthetic methods in the class that defines them. At the lambda's use site, the compiler emits an invokedynamic instruction referencing LambdaMetafactory. On the first call, the JVM bootstrap mechanism uses LambdaMetafactory to generate and cache a class implementing the target functional interface, wired to the synthetic method. Subsequent calls reuse the cached implementation. This is faster and more flexible than generating anonymous inner class files at compile time.
Advanced
Q: What is invokedynamic and why was it added in Java 7?
A: invokedynamic (JSR 292, Java 7) adds a call site that delegates its first dispatch to a user-defined bootstrap method. The bootstrap method links the call site to a MethodHandle which is cached for all subsequent calls. This enables dynamic languages on the JVM (Groovy, JRuby) to use the same efficient dispatch mechanism as static Java. Java itself adopted invokedynamic for lambdas (via LambdaMetafactory), string concatenation (via StringConcatFactory), and records. The key advantage over reflection is that linked method handles are JIT-compiled just like regular invocations.
Follow-up: What is the difference between invokedynamic and reflection for dynamic dispatch?
A: Reflection (Method.invoke()) bypasses the JIT's inlining because the call target is not statically known. invokedynamic with a MethodHandle provides the same runtime flexibility but with a stable call target once linked — the JIT can inline the target method after warmup, achieving near-static-dispatch performance. This is why Java switched from generating StringBuilder bytecode to invokedynamic for string concatenation in Java 9.
Further Reading
- JVMS §4 — The class File Format — the authoritative specification for every field in the
.classbinary format - Oracle javap Reference — complete flag reference for the bytecode disassembler
- Baeldung: View Bytecode of a Class in Java — practical guide using javap and ASMifier
Related Notes
- JIT Compilation — the JIT compiler reads bytecode as its input; understanding bytecode sheds light on what the JIT is actually optimising
- Java Type System — Generics & Type Erasure — type erasure is a bytecode-level choice; the generic signatures visible to
javap -verboseare metadata only - Functional Programming — Lambdas — lambda compilation via
invokedynamicis explained from the bytecode perspective in this note