⭐ Merged · Deduped · Advanced · 10 YOE Interview Ready

Collections, Generics & Concurrency

The definitive merged guide — JVM memory internals, API semantics, concurrency primitives, Java 8–21 features, Spring/Kafka/Hibernate patterns, and 90+ senior-level Q&As. All duplicates removed. New advanced questions added throughout.

📦

Part I — Collections & Generics

Memory layout · Internals · API semantics · Type system · Concurrent collections

Q1–Q22
ArrayList vs LinkedList — Memory & Cache Q1
Describe heap-level memory layout of ArrayList vs LinkedList. Why is ArrayList faster for iteration despite both being O(n)?

ArrayList uses a single contiguous Object[] elementData block on the heap. Every element slot is a 4-byte (compressed OOP) or 8-byte reference. On resize, a new array of size × 1.5 is allocated and System.arraycopy runs — O(n) amortized but O(1) per append. The contiguous layout is CPU cache-friendly: sequential access loads a full 64-byte cache line at a time, so prefetching works perfectly.

LinkedList allocates each Node object separately on the heap. Node layout: 16B object header + 8B item ref + 8B next + 8B prev = ~40B per element. Nodes are scattered across the heap, causing cache misses on every pointer hop. Iteration over 1M elements: ArrayList ~5–10× faster due to cache locality.

Java — memory layout
// ArrayList: single contiguous block
transient Object[] elementData; // heap: [ref0, ref1, ref2, ...]
// grow: Arrays.copyOf(elementData, oldCapacity + (oldCapacity >> 1)) — 1.5× growth

// LinkedList: scattered nodes
private static class Node<E> {
    E item;          // 8 bytes ref
    Node<E> next;   // 8 bytes ref — points ANYWHERE in heap
    Node<E> prev;   // 8 bytes ref
    // + 16 bytes object header = ~40 bytes per element
}

// Benchmark: iterating 10M ints
// ArrayList: ~8ms  (prefetcher loads 8 refs per cache line)
// LinkedList: ~80ms (cache miss on every node → main memory)
🧠 Memory-Level Rule Choose ArrayList for everything except: (a) frequent insertion/removal at arbitrary positions with existing iterator, (b) implementing a true double-ended queue (use ArrayDeque instead). Never use LinkedList as a Queue — use ArrayDeque.
HashMap — Hash, Bucket, Treeify, Resize, equals/hashCode Q2
Walk through HashMap.put() at source level: hash spreading, index computation, collision chaining, treeify threshold, resize, and why equals+hashCode contract is critical.

Step 1 – Hash spreading: h ^ (h >>> 16) XORs the high 16 bits into the low 16 bits, spreading entropy across all bits. Without this, objects whose hashCodes differ only in high bits would all land in bucket 0.

Step 2 – Index: (n-1) & hash — bitwise AND, not modulo. Works because capacity is always a power of 2. Cost: O(1).

Step 3 – Collision: Linked list in the bucket. equals() scans for key match. Complexity degrades to O(n) with pathological hash functions.

Step 4 – Treeify: When bucket list length ≥ 8 AND total table size ≥ 64 → convert to red-black TreeNode (O(log n)). Untreeified back to a list when the bucket shrinks to ≤ 6 during resize. Table < 64 → resize instead.

Step 5 – Resize: Triggered when size > capacity × 0.75. New table = 2×. In Java 8+, each bucket splits into "low" and "high" sub-lists based on a single hash bit (hash & oldCap), without recomputing any hash — very efficient.

equals/hashCode contract: Equal objects MUST have equal hash codes. Violating this means put() and get() use different buckets → silent data loss.

Java — putVal internals (JDK 17+)
// Hash spreading — prevents high-bit clustering
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

// putVal: plain writes — NOT thread-safe!
if ((p = tab[i = (n - 1) & hash]) == null)
    tab[i] = newNode(hash, key, value, null); // plain store
else {
    for (int binCount = 0;; ++binCount) {
        if ((e = p.next) == null) {
            p.next = newNode(hash, key, value, null);
            if (binCount >= TREEIFY_THRESHOLD - 1) // 8
                treeifyBin(tab, hash); // convert to TreeNode
            break;
        }
    }
}
if (++size > threshold) resize(); // load factor 0.75

// Correct equals + hashCode (Java 16+ records do this automatically)
public boolean equals(Object o) {
    if (!(o instanceof User u)) return false;
    return age == u.age && Objects.equals(name, u.name);
}
public int hashCode() { return Objects.hash(name, age); }
⚠ Thread Safety HashMap.put() uses plain (non-volatile) writes. Two threads concurrently putting can: (a) Java 7 — create circular linked list → infinite loop on get(); (b) Java 8+ — silent data loss when both threads resize and one overwrites the table reference. Always use ConcurrentHashMap in concurrent code.
ConcurrentHashMap — CAS, Bucket-Lock, CounterCell Q3
Deep dive into ConcurrentHashMap.put() memory ordering. How does it avoid a global lock? Explain CAS on empty bucket, synchronized on non-empty, and how size() works via CounterCell striping.

Java 8+ dropped segments (Java 7's 16-segment approach) in favour of per-bucket locking:

  • Volatile table array: Node<K,V>[] table is volatile — every read sees latest pointer after a write.
  • Empty bucket → CAS: casTabAt(tab, i, null, newNode) does an atomic compare-and-set on the array slot (Unsafe.compareAndSwapObject in Java 8, Unsafe.compareAndSetReference in recent JDKs) — atomic, no lock needed.
  • Non-empty bucket → synchronized on head: Only the first node is the lock object. Concurrency = number of buckets. Two threads in different buckets never block each other.
  • size() via CounterCells: LongAdder-style striped counters marked @Contended (each on its own cache line). size() sums all cells — not an atomic snapshot; may lag briefly.
Java — ConcurrentHashMap putVal (JDK 17)
final V putVal(K key, V value, boolean onlyIfAbsent) {
    for (Node<K,V>[] tab = table;;) { // volatile read each iteration
        Node<K,V> f; int n, i, fh;
        if (tab == null) tab = initTable();
        else if ((f = tabAt(tab, i=(n-1)&hash)) == null) {
            // EMPTY: lock-free CAS — no contention path
            if (casTabAt(tab, i, null, new Node<>(hash,key,value)))
                break;
        } else {
            // COLLISION: lock on first node only
            synchronized (f) {
                // traverse list or tree, update or insert
            }
        }
    }
    addCount(1L, binCount); // LongAdder-style striped counter
}

// CounterCell — padded to avoid false sharing between CPUs
@jdk.internal.vm.annotation.Contended
static final class CounterCell {
    volatile long value; // each on its own cache line
}
💡 Key Interview Insight Use mappingCount() instead of size() on large maps — it returns long and avoids overflow. size() is not atomic — it's the sum of CounterCell values at one moment in time.
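A quick illustration of the tip above (map contents are illustrative):

```java
import java.util.concurrent.ConcurrentHashMap;

public class MappingCountDemo {
    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> hits = new ConcurrentHashMap<>();
        hits.put("home", 1);
        hits.put("login", 2);
        long n = hits.mappingCount(); // returns long, safe beyond Integer.MAX_VALUE
        System.out.println(n);        // an estimate, not an atomic snapshot
    }
}
```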
TreeMap · LinkedHashMap · PriorityQueue — Internals & APIs Q4
Describe TreeMap's Red-Black tree memory layout and NavigableMap API. How does LinkedHashMap implement LRU? Explain PriorityQueue heap structure and peek/poll/element differences.

TreeMap: Red-black BST — each Entry node holds key, value, left, right, parent, color (boolean). No array — just root reference. O(log n) for all ops. NavigableMap methods: floorKey(k) (largest ≤ k), ceilingKey(k) (smallest ≥ k), headMap(k), tailMap(k), subMap(lo, hi), pollFirstEntry(). Each node ≈ 48 bytes.

LinkedHashMap LRU: Extends HashMap, adds doubly-linked list through all entries. Constructor param accessOrder=true triggers afterNodeAccess() on every get/put — moves accessed entry to tail. Override removeEldestEntry() to auto-evict head.

PriorityQueue: Binary min-heap stored in array. Parent of index i is at (i-1)/2. Children at 2i+1, 2i+2. peek()/element() return root O(1) — peek returns null, element throws. poll()/remove() remove root, siftDown O(log n). remove(Object) is O(n) linear scan.

Java — all three in production patterns
// TreeMap — range queries (e.g. Snowflake partition metadata)
TreeMap<Long, Partition> partitions = new TreeMap<>();
NavigableMap<Long, Partition> range =
    partitions.subMap(startTs, true, endTs, true); // O(log n) view

// LinkedHashMap LRU Cache
class LRUCache<K,V> extends LinkedHashMap<K,V> {
    private final int cap;
    LRUCache(int cap) { super(cap, 0.75f, true); this.cap=cap; }
    protected boolean removeEldestEntry(Map.Entry<K,V> e) {
        return size() > cap; // evict LRU when over capacity
    }
}

// PriorityQueue — rate limiter sorted by expiry window
PriorityQueue<Request> pq =
    new PriorityQueue<>(Comparator.comparingLong(Request::getExpiry));
pq.peek();     // earliest expiry, null if empty
pq.element(); // earliest expiry, throws NoSuchElementException
pq.poll();     // removes+returns earliest, null if empty
pq.remove();  // removes+returns earliest, throws if empty
WeakHashMap — ReferenceQueue & GC Interaction Q5
Explain WeakHashMap internal mechanics: how does it use ReferenceQueue for stale entry cleanup? Why are keys weak but values strong? Give a production use case.

Each key is wrapped in WeakReference<K> registered with a shared ReferenceQueue. When GC collects a key (no strong reference exists), the WeakReference is enqueued on the queue. On subsequent map operations (get, put, size), expungeStaleEntries() polls the queue and removes matching bucket entries.

Why keys weak, values strong: Values should live as long as keys. Making values weak too would evict entries even when the key is still live — defeating the purpose.

Production use: Spring's ReflectionUtils method caches, ClassLoader metadata caches — entries should not prevent GC of class objects.

Java — WeakHashMap cleanup cycle
// Simplified stale entry cleanup (runs on map operations)
private void expungeStaleEntries() {
    for (Object x; (x = queue.poll()) != null; ) {
        synchronized (queue) {
            Entry<K,V> e = (Entry<K,V>) x;
            int i = indexFor(e.hash, table.length);
            // unlink entry from bucket chain
        }
    }
}

// ⚠ Not thread-safe! Wrap with synchronizedMap if needed
Map<Key, Metadata> cache =
    Collections.synchronizedMap(new WeakHashMap<>());

// Alternative: Use Guava Cache or Caffeine with weakKeys()
Cache<Key, Value> c = Caffeine.newBuilder()
    .weakKeys().softValues().build();
CopyOnWriteArrayList — Snapshot Semantics & Memory Q6
Explain CopyOnWriteArrayList memory model. Why does its iterator never throw ConcurrentModificationException? When should you use it vs synchronizedList?

Every mutative operation (add, set, remove) acquires a lock, copies the full underlying array, applies the change, then performs a volatile write of the array reference — establishing happens-before for subsequent readers.

Iterators capture the array reference at creation time. They always iterate the snapshot array, ignoring subsequent changes. No modCount check → never throws CME. But may return stale data.

Use when: reads vastly outnumber writes (listener lists, Spring event multicaster, read-heavy configuration maps). Avoid when: high write frequency — every add() is O(n) array copy.

Java — add() and volatile array write
public boolean add(E e) {
    synchronized (lock) {
        Object[] es = getArray();           // volatile read
        int len = es.length;
        Object[] newEs = Arrays.copyOf(es, len + 1); // O(n) copy
        newEs[len] = e;
        setArray(newEs);                  // volatile write → HB guarantee
        return true;
    }
}

// Spring ApplicationEventMulticaster uses it internally:
// private final Set<ApplicationListener<?>> applicationListeners
//    = new CopyOnWriteArraySet<>();   (backed by COWAL)
🧠 Memory For 100k-element list, each add() allocates + copies 100k refs (~800KB). GC pressure is significant. Prefer ConcurrentHashMap or ReadWriteLock pattern for write-heavier scenarios.
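A minimal sketch of the ReadWriteLock alternative mentioned above (the RwList class name is hypothetical):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Write-heavier alternative to CopyOnWriteArrayList: one shared ArrayList,
// copied only when a reader explicitly asks for a snapshot.
class RwList<E> {
    private final List<E> list = new ArrayList<>();
    private final ReadWriteLock rw = new ReentrantReadWriteLock();

    void add(E e) {
        rw.writeLock().lock();            // exclusive; no O(n) copy per add
        try { list.add(e); } finally { rw.writeLock().unlock(); }
    }

    List<E> snapshot() {
        rw.readLock().lock();             // shared; concurrent readers allowed
        try { return List.copyOf(list); } finally { rw.readLock().unlock(); }
    }
}
```

Writes now cost O(1) amortized instead of O(n), at the price of readers taking a (cheap, shared) lock.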
BlockingQueue — ArrayBlockingQueue vs LinkedBlockingQueue Q7
Compare ArrayBlockingQueue vs LinkedBlockingQueue from memory, locking, and throughput perspectives. How does DelayQueue work?
Property        | ArrayBlockingQueue                       | LinkedBlockingQueue
Backing store   | Fixed Object[] preallocated              | Linked Node objects (dynamic)
Capacity        | Bounded (required at construction)       | Optional (Integer.MAX_VALUE default)
Locking         | Single ReentrantLock (put + take share)  | Two locks: putLock + takeLock
Throughput      | Lower — single lock contention           | Higher — producers/consumers don't block each other
Memory          | Predictable, preallocated                | GC pressure from node allocation
Fairness option | Yes (fair=true constructor)              | No

DelayQueue: Unbounded BlockingQueue backed by PriorityQueue. Elements must implement Delayed.getDelay(). take() blocks until head element's delay ≤ 0. Used for: session expiry, scheduled retries, cache TTL eviction.

Java — BlockingQueue + DelayQueue pattern
// Producer-consumer with LinkedBlockingQueue
BlockingQueue<Task> queue = new LinkedBlockingQueue<>(1000);
queue.put(task);               // blocks if full
Task t = queue.take();         // blocks until available
queue.offer(task, 1, TimeUnit.SECONDS); // timeout version

// DelayQueue for session expiry
class SessionEvent implements Delayed {
    private final long expiryNanos;
    public long getDelay(TimeUnit u) {
        return u.convert(expiryNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
    }
    public int compareTo(Delayed o) {
        return Long.compare(expiryNanos, ((SessionEvent)o).expiryNanos);
    }
}
DelayQueue<SessionEvent> expiry = new DelayQueue<>();
SessionEvent expired = expiry.take(); // blocks until delay expires
Fail-Fast — modCount, CME, Safe Alternatives Q8
Explain modCount and expectedModCount. How is fail-fast implemented? Is it guaranteed? What are the safe ways to remove during iteration?

Every AbstractList has int modCount incremented on every structural modification (add, remove, clear, sort). When an Iterator is created it snapshots expectedModCount = modCount. Each next() call invokes checkForComodification().

Not guaranteed: Another thread modifying without synchronization means the JMM doesn't guarantee visibility of the updated modCount. The iterator may or may not see the change — CME is best-effort, not a thread-safety mechanism.

Java — safe removal patterns
// ✅ Safe: iterator.remove() (syncs expectedModCount)
Iterator<String> it = list.iterator();
while (it.hasNext())
    if (shouldRemove(it.next())) it.remove();

// ✅ Safe: removeIf (Java 8, uses modCount internally)
list.removeIf(s -> s.startsWith("X"));

// ✅ Safe: CopyOnWriteArrayList (snapshot iterator)
new CopyOnWriteArrayList<>(list).iterator();

// ✅ Safe: collect to new list then remove
List<String> toRemove = list.stream()
    .filter(this::shouldRemove).toList();
list.removeAll(toRemove);

// ❌ Broken: modifying list during forEach
list.forEach(s -> { if (shouldRemove(s)) list.remove(s); }); // CME!
ConcurrentSkipListMap — Lock-Free Sorted Map Q9
How does ConcurrentSkipListMap achieve thread safety without locks for reads? When to prefer it over TreeMap wrapped in synchronizedSortedMap?

A skip list is a probabilistic multilevel linked list. The base layer is a sorted linked list; each higher layer is a "fast lane" — each node promoted with probability 0.25. Average height: O(log n).

Thread safety: node next pointers are volatile. Insert uses CAS to update links. Reads are fully lock-free — they traverse volatile next references and always see a consistent state (may briefly see stale due to in-progress insertions).

vs synchronizedSortedMap(TreeMap): synchronizedSortedMap uses a single mutex — one thread blocks all others. ConcurrentSkipListMap allows concurrent reads and fine-grained write concurrency. Use ConcurrentSkipListMap when you need a concurrent NavigableMap.

Java — ConcurrentSkipListMap range queries
// Concurrent event timeline (used in Kafka consumer internals)
ConcurrentSkipListMap<Long, Event> timeline = new ConcurrentSkipListMap<>();
timeline.put(System.currentTimeMillis(), event);

// Thread-safe range query — no external lock needed
ConcurrentNavigableMap<Long,Event> lastMinute =
    timeline.tailMap(System.currentTimeMillis() - 60_000);

// ConcurrentSkipListSet for sorted deduplication
ConcurrentSkipListSet<String> tags = new ConcurrentSkipListSet<>();
tags.add("java"); tags.add("kafka");
tags.higher("j"); // "java" — lock-free O(log n)
Generics — Type Erasure, Bridge Methods, PECS Rule Q10
Explain type erasure and bridge methods. What is the PECS rule? Why can't you add to List<? extends Number>? What is heap pollution?

Type erasure: Generic parameters are erased at compile time. List<String> and List<Integer> are the same class at runtime. Bounds erase to their leftmost bound (T extends Comparable erases to Comparable). Compiler inserts casts at use sites.

Bridge methods: When a generic subclass overrides a method, the compiler generates a synthetic bridge method with the erased signature that delegates to the concrete method — maintains polymorphism.

PECS — Producer Extends, Consumer Super: ? extends T = you can READ (produces T), cannot WRITE. ? super T = you can WRITE T values, cannot read specific type (only Object).

Heap pollution: Variable of parameterized type holds object of wrong type — ClassCastException fires at unexpected location, not at assignment.

Java — generics deep dive
// Bridge method — compiler generated
class Base<T> { void set(T t) {} }
class Sub extends Base<String> {
    void set(String s) { /* override */ }
    // compiler adds: void set(Object o){set((String)o);} — BRIDGE
}

// PECS: Collections.copy uses both wildcards
public static <T> void copy(
    List<? super T> dest,    // consumer: write T into it
    List<? extends T> src)   // producer: read T from it
{ ... }

// Why can't add to ? extends Number?
List<? extends Number> nums = new ArrayList<Integer>();
nums.add(1.5); // COMPILE ERROR — could be List<Integer>, adding Double is wrong

// Heap pollution
List[] raw = new List[1];
List<String>[] polluted = raw;      // unchecked
raw[0] = List.of(42);               // Integer list in String[] slot
String s = polluted[0].get(0);     // CCE here, not at assignment!
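The compiler-generated bridge described above can be observed via reflection; a quick check using the Base/Sub classes from the snippet:

```java
import java.lang.reflect.Method;

class Base<T> { void set(T t) {} }
class Sub extends Base<String> { @Override void set(String s) {} }

public class BridgeDemo {
    public static void main(String[] args) {
        // Sub declares both set(String) (your code) and a synthetic
        // set(Object) bridge that the compiler added to preserve polymorphism.
        for (Method m : Sub.class.getDeclaredMethods())
            System.out.println(m.getName() + "("
                + m.getParameterTypes()[0].getSimpleName()
                + ") bridge=" + m.isBridge());
    }
}
```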
EnumMap & IdentityHashMap — Specialised Maps Q11 🆕
When would you use EnumMap instead of HashMap? How does IdentityHashMap differ? Give production use cases for each.

EnumMap: Keys must be enum values. Internally a plain Object[] indexed by key.ordinal() — zero hashing, zero collision, zero resizing. O(1) by array index. Faster and more memory-efficient than HashMap for enum keys.

IdentityHashMap: Uses == (reference equality) and System.identityHashCode() instead of equals()/hashCode(). Open addressing with linear probing (not separate chaining). Two distinct objects with same equals() are treated as different keys.

Java — EnumMap + IdentityHashMap
// EnumMap — state machine or permission flags
enum State { PENDING, PROCESSING, DONE, FAILED }
EnumMap<State, Color> stateColors = new EnumMap<>(State.class);
stateColors.put(State.DONE, Color.GREEN);
// Internal: Object[4] indexed by State.ordinal() — fastest possible Map

// IdentityHashMap — serialization cycle detection
// (e.g. Java ObjectOutputStream uses IdentityHashMap internally)
IdentityHashMap<Object, Integer> visited = new IdentityHashMap<>();
void serialize(Object obj) {
    if (visited.containsKey(obj)) return; // cycle detected by identity
    visited.put(obj, nextId++);
    // serialize obj...
}

// Spring AOP uses IdentityHashMap for proxy caching:
// two proxied objects wrapping same target should be distinct
ArrayDeque — Circular Array, Stack/Queue Best Practice Q12 🆕
Why is ArrayDeque preferred over LinkedList for Stack/Queue? How does its circular array work internally?

ArrayDeque uses a resizable circular array with head and tail int pointers. In JDK 8 capacity is always a power of 2, enabling bitwise modulo: addFirst does elements[head = (head - 1) & (length - 1)] = e; addLast does elements[tail] = e; tail = (tail + 1) & (length - 1). (The JDK 9+ rewrite drops the power-of-two constraint and wraps indices with explicit checks instead.) No per-element allocation. Cache-friendly. Null elements prohibited (null is used as an internal sentinel).

vs LinkedList as Stack/Queue: No per-element Node allocation = zero GC pressure per operation. Better cache locality. Java's own Deque javadoc states: "This class is likely to be faster than Stack when used as a stack, and faster than LinkedList when used as a queue."

Java — ArrayDeque internals
// Circular buffer — head/tail wrap around
// addFirst: head moves left (wrapping), addLast: tail moves right
//  [_, _, 3, 2, 1, _, _, _]  head=2, tail=5  (capacity 8, mask 7)
//  addFirst(4): head=(2-1)&7=1  →  [_, 4, 3, 2, 1, _, _, _]

// Use as Stack (LIFO)
Deque<Integer> stack = new ArrayDeque<>();
stack.push(1); stack.push(2);
stack.pop();   // 2  (LIFO)
stack.peek();  // 1  (no remove)

// Use as Queue (FIFO)
Deque<Integer> queue = new ArrayDeque<>();
queue.offer(1); queue.offer(2);
queue.poll();  // 1  (FIFO)

// BFS/DFS algorithm template
Deque<Node> frontier = new ArrayDeque<>();
frontier.offerLast(root);     // BFS: poll from front
// frontier.offerFirst(root); // DFS: poll from front too
🧵

Part II — Multithreading & Concurrency

JMM · volatile · synchronized · AQS · Locks · Atomics · CompletableFuture · Thread pool

C1–C20
volatile — Memory Barriers, Happens-Before, Not Atomic C1
What does volatile guarantee at the JMM level? List all happens-before rules. Why doesn't volatile int counter++ provide thread safety?

volatile guarantees:

  • Visibility: A write to a volatile field happens-before every subsequent read of that field by any thread. In practice the JIT emits a barrier that drains the store buffer, so the value becomes visible to other cores (not necessarily a literal flush to main memory).
  • No reordering: StoreStore, LoadLoad, StoreLoad barriers prevent surrounding instructions from crossing the volatile access.

Happens-before (JMM §17.4.5) — complete list:

  • Program order: Each action in a thread HB its next action
  • Monitor unlock → lock: Unlock of monitor HB subsequent lock of same monitor
  • Volatile write → read: Write to volatile field HB subsequent reads
  • Thread.start(): start() call HB any action in the started thread
  • Thread.join(): All actions in thread T HB t.join() returning
  • Object finalizer: Constructor end HB finalize() start
  • Transitivity: A HB B, B HB C → A HB C

volatile not atomic: counter++ is READ→INCREMENT→WRITE. Two threads reading the same value before either writes will both compute +1 → lost update. Use AtomicInteger.incrementAndGet().

Java — volatile visibility guarantee
// Volatile flag with non-volatile data — safe publication pattern
class Publisher {
    int data = 0;          // plain field
    volatile boolean ready = false;

    // Thread A
    void publish() {
        data = 42;           // plain write, but...
        ready = true;        // volatile write = StoreStore barrier above
        // data=42 happens-before ready=true happens-before Thread B read
    }

    // Thread B
    void consume() {
        if (ready) {          // volatile read
            assert data == 42; // guaranteed: HB chain intact
        }
    }
}

// NOT thread-safe — not atomic
volatile int counter = 0;
counter++;  // READ(0) ... WRITE(1) — race condition window!
// Fix: AtomicInteger.incrementAndGet() — single CAS instruction
synchronized — ObjectMonitor, Lock Inflation, Biased Locking C2
Describe JVM-level synchronized: biased locking, thin lock, lock inflation. What changed in Java 15+? Differentiate object lock vs class lock.

Every Java object has a mark word (64-bit header field). Lock state encoded in mark word bits:

  • Biased locking (Java <15 with -XX:+UseBiasedLocking): Mark word stores ThreadID. Acquire = single mark-word check if same thread. Zero cost if uncontended. On first contention, bias must be revoked (stop-the-world) — expensive for frequently contested objects.
  • Lightweight (thin) lock: Thread CAS copies displaced mark word to its stack frame, writes frame pointer into mark word. Fast if not contended. If another thread tries, it inflates.
  • Heavyweight (fat/inflated): ObjectMonitor in OS — mutex + condition queue. Threads call park() and block. Wake on notify.

Java 15+: Biased locking deprecated and disabled by default in Java 15 (JEP 374), removed entirely in Java 18. Lock path: unlocked → lightweight → heavyweight. Simpler JVM, no STW revocation pauses.

Object vs Class lock: Instance method synchronized locks on this. Static method synchronized locks on the Class object. These are completely independent monitors — two threads can hold one each simultaneously.

Java — synchronized semantics
class Foo {
    synchronized void        a() {} // lock = this (Foo instance)
    static synchronized void b() {} // lock = Foo.class

    void c() {
        synchronized(this)       {} // same as a()
        synchronized(Foo.class) {} // same as b()
        synchronized(lockObj)    {} // private lock object — best practice
    }
}
// Thread states under synchronized:
// Waiting for monitor entry → BLOCKED
// In Object.wait() → WAITING
// In Object.wait(timeout) → TIMED_WAITING
AQS — AbstractQueuedSynchronizer, CLH Queue, State C3
Explain AQS internals: volatile state, CLH wait queue, node states. How does ReentrantLock implement fair vs unfair acquisition? How do Condition objects work?

AQS is the backbone of all java.util.concurrent locks. It maintains:

  • volatile int state — represents lock state (0=free, 1=held, N=held N times for reentrant)
  • CLH queue (Craig-Landin-Hagersten variant) — a virtual doubly-linked list of Node objects. Each Node has: waitStatus (CANCELLED, SIGNAL, CONDITION, PROPAGATE), thread, prev, next.

Unfair acquisition: CAS state 0→1 immediately, without checking the queue. Barging — new thread may jump ahead of waiting threads. Higher throughput, possible starvation.

Fair acquisition: Checks hasQueuedPredecessors() first. If any thread is waiting, enqueue instead of barging. FIFO ordering. Lower throughput but starvation-free.

Condition: lock.newCondition() creates a ConditionObject with its own linked list of waiting threads. await() releases the lock and moves thread to condition queue. signal() moves thread back to AQS CLH queue.

Java — AQS tryAcquire (ReentrantLock NonfairSync)
protected final boolean tryAcquire(int acquires) {
    final Thread current = Thread.currentThread();
    int c = getState(); // volatile read
    if (c == 0) {
        // UNFAIR: no queue check — barge in
        if (compareAndSetState(0, acquires)) {
            setExclusiveOwnerThread(current);
            return true;
        }
    } else if (current == getExclusiveOwnerThread()) {
        // REENTRANT: same thread holds lock
        setState(c + acquires); // plain write — already own the lock
        return true;
    }
    return false; // enqueue and park
}

// Condition example — bounded buffer
ReentrantLock lock = new ReentrantLock();
Condition notFull  = lock.newCondition();
Condition notEmpty = lock.newCondition();

// producer
lock.lock();
try {
    while (isFull()) notFull.await(); // releases lock, waits
    buf.add(item);
    notEmpty.signal(); // wake one consumer
} finally { lock.unlock(); }
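The fair/unfair choice described above is exposed only through the ReentrantLock constructor; a minimal sketch:

```java
import java.util.concurrent.locks.ReentrantLock;

public class FairnessDemo {
    public static void main(String[] args) {
        ReentrantLock unfair = new ReentrantLock();     // NonfairSync: CAS-first barging
        ReentrantLock fair   = new ReentrantLock(true); // FairSync: checks hasQueuedPredecessors()
        System.out.println(unfair.isFair()); // false
        System.out.println(fair.isFair());   // true
    }
}
```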
StampedLock — 3 Modes, Optimistic Read, Non-Reentrant C4
Explain StampedLock's three modes. How does optimistic read avoid blocking? What is the fallback pattern? Why is it not reentrant and what's the consequence?

3 modes: (1) Write lock — exclusive, blocks all. (2) Read lock — shared, multiple concurrent readers. (3) Optimistic read — tryOptimisticRead() returns a non-zero stamp without acquiring any lock.

Optimistic read pattern: Get stamp → read data into locals → validate(stamp). If valid (no write occurred), proceed. If invalid, fall back to full read lock. The validate() is a cheap volatile read of the internal state version counter.

Not reentrant: Attempting to acquire the same lock twice from the same thread will deadlock. Design around this — extract methods that don't re-acquire. If reentrance needed, use ReentrantReadWriteLock.

When to use over ReadWriteLock: Read-mostly data (99% reads). Optimistic read has ~zero overhead for the common case vs acquiring a read lock.

Java — StampedLock optimistic + fallback pattern
class Point {
    double x, y;
    final StampedLock sl = new StampedLock();

    double distanceFromOrigin() { // called very frequently
        long stamp = sl.tryOptimisticRead(); // no lock — just a version
        double cx = x, cy = y;              // read into locals

        if (!sl.validate(stamp)) {          // write happened? re-read
            stamp = sl.readLock();         // now hold shared lock
            try { cx = x; cy = y; }
            finally { sl.unlockRead(stamp); }
        }
        return Math.hypot(cx, cy);
    }

    void move(double dx, double dy) {     // rare
        long stamp = sl.writeLock();
        try { x += dx; y += dy; }
        finally { sl.unlockWrite(stamp); }
    }
}
ThreadLocal — WeakKey Memory Leak, MDC, InheritableThreadLocal C5
How does ThreadLocal work internally? Why does it cause memory leaks in thread pools? How does MDC use it? What is InheritableThreadLocal and its limitation with thread pools?

Each Thread object holds a ThreadLocalMap. Map entries use WeakReference<ThreadLocal> as the key but a strong reference as the value. If the ThreadLocal variable goes out of scope (no strong reference), the key becomes null at GC. But the value remains strongly reachable through Thread → ThreadLocalMap → Entry.value. In thread pools (threads reused), this accumulates.

Fix: Always call tl.remove() in a finally block.

MDC (Mapped Diagnostic Context): Logback/SLF4J uses ThreadLocal<Map<String,String>> to store per-request context (correlation ID, user ID). Every log call reads from this map to append structured fields.

InheritableThreadLocal: Child threads inherit parent's value at creation time (copy). Limitation: thread pool threads are created once — new tasks don't inherit the submitting thread's values. Use TransmittableThreadLocal from Alibaba or use MDC.getCopyOfContextMap() + explicit propagation.

Java — MDC correlation ID + safe cleanup
@Component
public class CorrelationFilter implements Filter {
    public void doFilter(ServletRequest req, ...) {
        String id = Optional
            .ofNullable(((HttpServletRequest)req).getHeader("X-Correlation-ID"))
            .orElse(UUID.randomUUID().toString());
        try {
            MDC.put("correlationId", id);
            chain.doFilter(req, res);
        } finally {
            MDC.clear(); // MUST clear — thread returns to pool!
        }
    }
}

// Propagating MDC to async tasks (thread pool)
Map<String,String> ctx = MDC.getCopyOfContextMap();
executor.submit(() -> {
    try {
        if (ctx != null) MDC.setContextMap(ctx); // restore in new thread
        doWork();
    } finally { MDC.clear(); }
});
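A minimal demo of the inheritance-at-creation semantics described above:

```java
public class InheritDemo {
    static final InheritableThreadLocal<String> CTX = new InheritableThreadLocal<>();

    public static void main(String[] args) throws InterruptedException {
        CTX.set("request-42");
        Thread child = new Thread(() ->
            System.out.println(CTX.get())); // inherited copy, made at construction
        child.start();
        child.join();
        // A pooled thread created BEFORE set() would print null instead:
        // inheritance happens exactly once, when the Thread object is built.
    }
}
```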
Deadlock — 4 Conditions, Detection, Prevention C6
State the 4 Coffman conditions for deadlock. Show programmatic detection via ThreadMXBean. Give two prevention strategies with code.

4 Coffman conditions (ALL must hold simultaneously):

  1. Mutual exclusion: Resource held non-shareably (only one thread at a time)
  2. Hold-and-wait: Thread holds at least one resource while waiting to acquire another
  3. No preemption: Resources cannot be forcibly taken away; only voluntarily released
  4. Circular wait: T1 waits for T2, T2 waits for T1 (or longer cycle)
Java — detection + prevention
// Detection: ThreadMXBean
ThreadMXBean tmx = ManagementFactory.getThreadMXBean();
long[] ids = tmx.findDeadlockedThreads(); // monitors + synchronizers
if (ids != null) {
    ThreadInfo[] infos = tmx.getThreadInfo(ids, true, true);
    for (ThreadInfo ti : infos)
        log.error("DEADLOCK: {} blocked on {}", ti.getThreadName(), ti.getLockName());
}

// Prevention 1: consistent lock ordering (break circular wait)
void transfer(Account a, Account b) {
    Account first  = a.getId() < b.getId() ? a : b; // always lower id first
    Account second = a.getId() < b.getId() ? b : a;
    synchronized(first) { synchronized(second) { /* transfer */ } }
}

// Prevention 2: tryLock with timeout (break hold-and-wait)
if (lockA.tryLock(100, TimeUnit.MILLISECONDS)) {
    try {
        if (lockB.tryLock(100, TimeUnit.MILLISECONDS)) {
            try { /* critical section */ }
            finally { lockB.unlock(); }
        }
    } finally { lockA.unlock(); }
}
CAS, AtomicInteger, ABA Problem, LongAdder C7
Explain CAS at CPU level. How does AtomicInteger use it? What is the ABA problem and fix? When should you use LongAdder instead of AtomicLong?

CAS (Compare-And-Swap): A single x86 instruction, LOCK CMPXCHG — atomically: if memory[addr] == expected, write newVal and return true; else return false. No lock object is held. Java exposes it via Unsafe.compareAndSwapInt (Java 8) or VarHandle.compareAndSet (Java 9+).

ABA problem: Thread reads value A. Another thread changes A→B→A. Thread's CAS sees A and succeeds — but state was transiently changed. For counters: harmless. For pointer-based lock-free structures (stacks, queues): dangerous corruption. Fix: AtomicStampedReference<V> — pairs value with monotonic stamp; CAS must match both.

LongAdder vs AtomicLong: AtomicLong has one volatile long — all threads CAS-retry on the same memory location. Under high contention: many retries, CPU waste. LongAdder uses Striped64 — a Cell[] array where each thread maps to its own cell (padded with @Contended to avoid false sharing). sum() adds all cells. 5–20× faster under contention. Trade-off: sum() is not atomic — use AtomicLong when you need read-modify-write atomicity.

Java — CAS + ABA + LongAdder
// AtomicInteger CAS retry loop (conceptual incrementAndGet —
// the JDK actually delegates to Unsafe.getAndAddInt)
public final int incrementAndGet() {
    for (;;) {
        int cur = get();          // volatile read
        int next = cur + 1;
        if (compareAndSet(cur, next)) // CAS: atomic read+write
            return next;
        // retry if CAS failed (another thread incremented)
    }
}

// ABA fix: AtomicStampedReference
AtomicStampedReference<Node> head =
    new AtomicStampedReference<>(node, 0);
int[] stampHolder = new int[1];
Node curr = head.get(stampHolder);
int stamp = stampHolder[0];
head.compareAndSet(curr, newNode, stamp, stamp + 1);
// Now A→B→A is detected: stamp incremented each change

// LongAdder for high-throughput counters (metrics, rate limiters)
LongAdder hitCount = new LongAdder();
hitCount.increment(); // writes to thread-local Cell — no contention
long total = hitCount.sum(); // aggregate — NOT atomic snapshot
ThreadPoolExecutor — Lifecycle, Queue Strategy, Spring @Async C8
Walk through ThreadPoolExecutor task submission lifecycle: when are new threads created vs queued? Explain all 4 rejection policies. How should Spring @Async be configured?

Task submission decision tree:

  1. Running threads < corePoolSize → create new thread (even if idle threads exist)
  2. Running threads ≥ corePoolSize → try to enqueue in workQueue
  3. Queue full AND threads < maxPoolSize → create new thread (beyond core)
  4. Queue full AND threads == maxPoolSize → invoke RejectedExecutionHandler

4 Rejection policies: AbortPolicy (default, throws RejectedExecutionException), CallerRunsPolicy (calling thread executes — backpressure), DiscardPolicy (silently drop), DiscardOldestPolicy (drop queue head, retry submit).

Spring @Async: With no executor bean configured, plain Spring falls back to SimpleAsyncTaskExecutor — a new thread per task, terrible for production (Spring Boot auto-configures a ThreadPoolTaskExecutor, but size it explicitly). Configure a proper ThreadPoolTaskExecutor.

Java — production ThreadPoolExecutor + Spring @Async
// Raw ThreadPoolExecutor with bounded queue
ThreadPoolExecutor executor = new ThreadPoolExecutor(
    10,                              // corePoolSize
    50,                              // maximumPoolSize
    60L, TimeUnit.SECONDS,           // keepAlive (non-core threads)
    new ArrayBlockingQueue<>(200), // bounded queue (backpressure)
    new ThreadPoolExecutor.CallerRunsPolicy() // backpressure via caller
);

// Spring @Async production configuration
@Configuration @EnableAsync
public class AsyncConfig {
    @Bean public Executor asyncExecutor() {
        ThreadPoolTaskExecutor ex = new ThreadPoolTaskExecutor();
        ex.setCorePoolSize(10);
        ex.setMaxPoolSize(50);
        ex.setQueueCapacity(200);
        ex.setThreadNamePrefix("async-worker-");
        ex.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
        ex.setWaitForTasksToCompleteOnShutdown(true);
        ex.initialize(); return ex;
    }
}
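The 4-step decision tree can be observed directly. A minimal sketch (pool sizes and names are illustrative): a 1-thread pool with a 1-slot queue, where the third submission finds the worker busy, the queue full, and maxPoolSize reached, so the default AbortPolicy rejects it.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class RejectionDemo {
    // Returns true if the third task was rejected (AbortPolicy threw)
    static boolean demo() throws InterruptedException {
        ThreadPoolExecutor tiny = new ThreadPoolExecutor(
            1, 1, 0L, TimeUnit.SECONDS,
            new ArrayBlockingQueue<>(1)); // default handler = AbortPolicy

        CountDownLatch release = new CountDownLatch(1);
        boolean rejected = false;
        // step 1: threads < core → worker created, blocks on latch
        tiny.execute(() -> { try { release.await(); } catch (InterruptedException ignored) {} });
        // step 2: threads >= core → enqueued (fills the single slot)
        tiny.execute(() -> {});
        try {
            // step 4: queue full AND threads == max → RejectedExecutionHandler
            tiny.execute(() -> {});
        } catch (RejectedExecutionException e) {
            rejected = true;
        }
        release.countDown();
        tiny.shutdown();
        tiny.awaitTermination(5, TimeUnit.SECONDS);
        return rejected;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("rejected = " + demo()); // rejected = true
    }
}
```

Swapping in CallerRunsPolicy instead of the default would make the third execute() run on the submitting thread — backpressure rather than failure.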
CompletableFuture — thenApply vs thenCompose, allOf, Timeout, Memory C9
What is the difference between thenApply and thenCompose? How do you aggregate 3 parallel calls with allOf? Handle timeout (Java 9+)? What volatile field ensures memory visibility between stages?

thenApply(fn): synchronous transform T→U. Runs in completing thread (or caller if already done). If fn itself returns a CF, you get CF<CF<U>> — nested, wrong.

thenCompose(fn): flatMap equivalent. fn returns CompletionStage<U>, result is CF<U> — flattened. Use when the next step is itself async.

thenApplyAsync: same as thenApply but forces execution on ForkJoinPool.commonPool() (or provided executor) — frees the completing thread.

Memory visibility: CF has a volatile Object result. CAS sets it on completion — establishes happens-before. Any thread that reads result via subsequent stage is guaranteed to see all writes made before complete() was called.

Java — CompletableFuture production patterns
// thenApply vs thenCompose (CF = CompletableFuture, abbreviated for brevity)
CF<String> step1 = CF.supplyAsync(() -> "hello");
// thenApply — sync transform (no new async step)
CF<Integer> r1 = step1.thenApply(String::length);  // CF<Integer> ✓
// thenCompose — flatten when next step is async
CF<User> r2 = step1.thenCompose(name ->
    userService.lookupAsync(name));  // CF<User> ✓, not CF<CF<User>>

// allOf — 3 parallel calls, aggregate
CF<User>    cfU  = getUserAsync(id);
CF<Orders> cfO  = getOrdersAsync(id);
CF<Account> cfAcc = getAccountAsync(id);

CF.allOf(cfU, cfO, cfAcc)
  .orTimeout(2, TimeUnit.SECONDS)  // Java 9+
  .thenApply(v ->
      new Dashboard(cfU.join(), cfO.join(), cfAcc.join())) // safe, done
  .exceptionally(ex -> Dashboard.empty())  // on timeout/error
  .handle((res, ex) -> {                   // always runs (try-finally)
      metrics.record(res, ex);
      return res != null ? res : Dashboard.empty();
  });

// completeOnTimeout — return default instead of throwing
cfU.completeOnTimeout(User.anonymous(), 500, TimeUnit.MILLISECONDS);
ForkJoinPool — Work-Stealing, RecursiveTask, Parallel Streams C10
Explain ForkJoinPool work-stealing algorithm. How are task deques organized? Why does compute() call right.compute() inline (not fork)? When should you use a custom FJP for parallel streams?

Each worker thread owns a double-ended deque. fork() pushes task to deque tail (LIFO). Worker pops from its own tail (LIFO — better cache locality for recursive tasks). Idle workers steal from other workers' heads (FIFO — reduces contention with owner).

Why compute-right-inline + fork-left pattern: By computing right subtask directly (not forking), you avoid creating an extra queued task for work you're about to do anyway. You only fork left — which another idle thread can steal. This halves task creation overhead.

Custom FJP for parallel streams: Stream.parallel() uses ForkJoinPool.commonPool() (shared JVM-wide). If your parallel stream does blocking IO, it starves common pool for other tasks. Wrap in custom FJP to isolate.

Java — RecursiveTask + custom FJP
class SumTask extends RecursiveTask<Long> {
    static final int THRESHOLD = 1_000;
    final int[] arr; final int lo, hi;
    SumTask(int[] arr, int lo, int hi) { this.arr = arr; this.lo = lo; this.hi = hi; }

    @Override
    protected Long compute() {
        if (hi - lo <= THRESHOLD) {
            long sum = 0;
            for (int i = lo; i < hi; i++) sum += arr[i];
            return sum;
        }
        int mid = (lo + hi) / 2;
        SumTask left  = new SumTask(arr, lo, mid);
        SumTask right = new SumTask(arr, mid, hi);
        left.fork();                        // push left to deque tail
        return right.compute() + left.join(); // compute right inline
    }
}

// Custom FJP to isolate parallel stream from commonPool
ForkJoinPool custom = new ForkJoinPool(8);
long result = custom.submit(() ->
    bigList.parallelStream().mapToLong(Item::getValue).sum()
).get();
custom.shutdown();
ReadWriteLock — Writer Starvation, Lock Downgrade C11
How does ReentrantReadWriteLock work? Explain writer starvation and how fair mode addresses it. What is lock downgrade and when is it useful?

AQS state is split: high 16 bits = shared (read) count, low 16 bits = exclusive (write) count. Multiple readers can hold simultaneously; writer needs all read counts to reach 0.

Writer starvation (unfair mode): While readers continuously arrive and acquire the lock, a waiting writer never gets in. Unfair mode allows new read acquisitions to proceed as long as no writer currently holds.

Fair mode: A FIFO queue is maintained. Once a writer is waiting, subsequent read requests are queued behind it. Eliminates starvation but reduces throughput.

Lock downgrade: Holding write lock, acquire read lock, then release write lock — you atomically move from exclusive to shared without a gap where another writer could sneak in and modify data you already computed.

Java — ReadWriteLock + lock downgrade
class CachedData {
    volatile boolean cacheValid = false;
    final ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();
    Object[] data;

    void processCachedData() {
        rwl.readLock().lock();
        try {
            if (!cacheValid) {
                rwl.readLock().unlock(); // drop read before acquiring write
                rwl.writeLock().lock();
                try {
                    if (!cacheValid) {    // double-check after write lock
                        data = loadData();
                        cacheValid = true;
                    }
                    rwl.readLock().lock(); // DOWNGRADE: acquire read before releasing write
                } finally { rwl.writeLock().unlock(); } // release write
                // now holding read lock only — safe to proceed
            }
            use(data); // process under read lock
        } finally { rwl.readLock().unlock(); }
    }
}
CountDownLatch · CyclicBarrier · Phaser · Semaphore C12
Compare CountDownLatch, CyclicBarrier, and Phaser. What memory visibility does each guarantee? How does Semaphore implement rate limiting?
Feature        | CountDownLatch                | CyclicBarrier                                    | Phaser
Reusable       | No — one-shot                 | Yes — auto-resets                                | Yes — multiple phases
Party count    | Fixed                         | Fixed                                            | Dynamic register/deregister
Barrier action | None                          | Optional Runnable                                | Override onAdvance()
HB guarantee   | countDown() HB await() return | await() HB barrier action HB subsequent awaiters | arrive() HB phase advance HB next phase start

Semaphore: AQS state = permit count. acquire() decrements (CAS). release() increments (CAS + unpark waiters). Use for connection pools, rate limiters, bounded resources.

Java — coordination primitives
// CountDownLatch — await all services to start
CountDownLatch ready = new CountDownLatch(3);
// each service calls ready.countDown() after init
ready.await(); // main waits — all writes before countDown() are visible

// CyclicBarrier — parallel batch processing
CyclicBarrier barrier = new CyclicBarrier(4,
    () -> log.info("Batch complete, merging results"));
// each worker calls barrier.await() when done with its shard
// auto-resets once all 4 parties trip; reset() breaks it for current waiters

// Semaphore — connection pool rate limiter
Semaphore semaphore = new Semaphore(10, true); // 10 permits, fair
semaphore.acquire(); // blocks if 10 already acquired
try { useResource(); } finally { semaphore.release(); }

// Phaser — dynamic multi-phase simulation
Phaser ph = new Phaser(1); // 1 initial party (orchestrator)
ph.register();                // add party dynamically
ph.arriveAndAwaitAdvance(); // participate and wait for phase N
ph.arriveAndDeregister();   // leave phaser
VarHandle — Access Modes, Acquire/Release, vs Unsafe C13
What is VarHandle (Java 9)? What are its access modes? How does acquire/release differ from volatile? Where is VarHandle used in JDK internals?

VarHandle is the safe, typed replacement for sun.misc.Unsafe. Created via MethodHandles.lookup().findVarHandle(...). Provides fine-grained memory ordering:

  • Plain: No ordering guarantee (same as regular field)
  • Opaque: Atomicity for that single variable only
  • Acquire: Read + LoadLoad/LoadStore fence after. Subsequent reads/writes not moved before this read. Used for "acquire the lock" semantics.
  • Release: Write + StoreStore/LoadStore fence before. Prior writes not moved after this write. Used for "release the lock" semantics.
  • Volatile: Full fence — same as volatile field access
  • compareAndSet, getAndAdd, etc.: Atomic RMW operations

Acquire/Release is cheaper than Volatile — only one-sided fence. Sufficient for lock implementations (lock acquire = getAcquire, lock release = setRelease).

JDK uses VarHandle in: ConcurrentHashMap (tabAt, casTabAt), AbstractQueuedSynchronizer, CompletableFuture, ForkJoinPool, StampedLock.

Java — VarHandle access modes
class MyClass {
    volatile int state = 0;
    static final VarHandle STATE;
    static {
        STATE = MethodHandles.lookup()
            .findVarHandle(MyClass.class, "state", int.class);
    }

    void demonstrate() {
        STATE.set(this, 1);                        // plain write
        int v = (int) STATE.getAcquire(this);      // acquire read
        STATE.setRelease(this, 2);                // release write
        STATE.compareAndSet(this, 0, 1);           // CAS volatile
        STATE.getAndAdd(this, 1);                  // atomic increment
        STATE.compareAndExchangeAcquire(this,0,1); // CAS + acquire fence
    }
}
False Sharing & @Contended — Cache-Line Isolation C14
What is false sharing and why does it destroy performance? How does @Contended solve it? Show how ConcurrentHashMap CounterCell uses it. How do you diagnose false sharing?

CPUs cache data in 64-byte cache lines. If thread A updates variable X and thread B updates variable Y, but X and Y occupy the same cache line, every update by A invalidates B's copy of the line (and vice versa) — even though they're logically independent. This "false sharing" can reduce throughput by 10–40× and causes cache coherency traffic.

@jdk.internal.vm.annotation.Contended pads the annotated field with ~128 bytes, ensuring it occupies its own cache line. Outside java.base the annotation is ignored by default — run with -XX:-RestrictContended (and, on Java 9+, export jdk.internal.vm.annotation to your code); on Java 8 the equivalent was @sun.misc.Contended.

Diagnosis: Linux perf stat -e cache-misses,L1-dcache-load-misses. Java Flight Recorder CPU sample stacks. High LOCK XCHG instructions under perf. Async-profiler with --event cache-misses.

Java — false sharing + @Contended
// ❌ False sharing: a and b may share same 64-byte cache line
class Counters {
    volatile long a = 0; // Thread 1 writes
    volatile long b = 0; // Thread 2 writes — same cache line!
}

// ✅ Fix with @Contended (adds ~128 bytes padding around each)
class Counters {
    @jdk.internal.vm.annotation.Contended
    volatile long a = 0; // own 128-byte padded cache line
    @jdk.internal.vm.annotation.Contended
    volatile long b = 0; // own 128-byte padded cache line
}

// JDK ConcurrentHashMap uses this for its striped counter
@Contended static final class CounterCell {
    volatile long value; // each CounterCell on its own cache line
    // Without @Contended: 16 threads → 16 writes → all invalidate same line
    // With @Contended: each thread hits its own cache line → linear throughput
}

// Manual padding (pre-Java 9, ugly but works)
long p1,p2,p3,p4,p5,p6,p7; // 56 bytes prefix
volatile long value;         // 8 bytes on its own line
long q1,q2,q3,q4,q5,q6,q7; // 56 bytes suffix
LockSupport — park/unpark, Permit Semantics, AQS Foundation C15
What is LockSupport? Why is unpark-before-park valid (unlike notify-before-wait)? What memory visibility does unpark establish? How does AQS use it?

LockSupport provides per-thread permit-based blocking (max 1 permit per thread). park() blocks if permit unavailable; consumes permit if available. unpark(t) grants permit to thread t; if t is parked, it wakes immediately.

Why unpark-before-park is safe: The permit is stored. When park() is called later, it sees the permit and returns immediately. This is unlike Object.notify() which is lost if no thread is waiting — LockSupport has no such lost-wakeup problem.

Memory visibility: unpark(t) happens-before park() returns in thread t. Everything written before unpark is visible to the parked thread after it wakes.

Java — LockSupport mechanics
// park/unpark — AQS uses this for all lock implementations
Thread waiter = new Thread(() -> {
    System.out.println("parking...");
    LockSupport.park();          // blocks — may have spurious wakeups
    System.out.println("unparked");
});
waiter.start();
Thread.sleep(100);
// data write before unpark — waiter will see it
LockSupport.unpark(waiter);    // wake waiter — HB established

// unpark BEFORE park — still works (the permit is stored)
LockSupport.unpark(Thread.currentThread()); // permit granted to self
LockSupport.park();                         // returns immediately — permit consumed

// Spurious wakeup guard (always check condition in loop)
while (!condition) {
    LockSupport.park(this);     // 'this' = blocker object for jstack
    if (Thread.interrupted())
        throw new InterruptedException();
}
Double-Checked Locking — Correct Singleton Patterns C16
Write correct double-checked locking. Why is volatile essential? Show two safer alternatives: Initialization-on-Demand Holder and enum singleton.

Without volatile: new Singleton() is 3 steps: allocate memory, call constructor, write reference. JIT/CPU can reorder to: allocate, write reference, call constructor. Thread B sees non-null reference before constructor finishes → reads partially constructed object.

volatile on instance prevents that reorder via StoreStore barrier. Constructor completion happens-before volatile write, which happens-before every subsequent read.

Java — 3 singleton patterns
// Pattern 1: DCL with volatile (acceptable)
public class Singleton {
    private static volatile Singleton instance; // volatile REQUIRED
    public static Singleton get() {
        if (instance == null) {              // fast path (no lock)
            synchronized (Singleton.class) {
                if (instance == null)        // re-check under lock
                    instance = new Singleton();
            }
        }
        return instance;
    }
}

// Pattern 2: Initialization-on-Demand Holder (preferred)
// Class loading is thread-safe by JVM spec (§12.4.2)
public class Singleton2 {
    private static class Holder {
        static final Singleton2 INSTANCE = new Singleton2(); // lazy+safe
    }
    public static Singleton2 get() { return Holder.INSTANCE; }
}

// Pattern 3: Enum singleton (simplest, serialization-safe)
public enum Config {
    INSTANCE;
    public void doWork() { ... } // guaranteed single instance
}
Config.INSTANCE.doWork();
Exchanger · ConcurrentLinkedQueue · Flow API C17
How does Exchanger work? Give a double-buffer production use case. How does ConcurrentLinkedQueue achieve lock-freedom? What is the Flow API and backpressure?

Exchanger<V>: Rendezvous point for exactly two threads — each calls exchange(v) and receives the other's value. Thread A blocks until thread B arrives (or timeout). Internally: CAS on a slot node with volatile item and waiter thread. Supports arenas for high-concurrency pair matching.

ConcurrentLinkedQueue: Michael-Scott algorithm. head and tail are volatile. Enqueue: CAS tail.next from null to newNode; update tail lazily. Dequeue: CAS head to head.next. Lock-free but not wait-free (CAS can retry). size() is O(n) — avoid.
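The Michael-Scott idea can be sketched in a few lines — a simplified illustration, not the real ConcurrentLinkedQueue (which also relaxes head updates and self-links dequeued nodes):

```java
import java.util.concurrent.atomic.AtomicReference;

// Simplified Michael-Scott lock-free queue (illustrative only)
class LockFreeQueue<E> {
    static final class Node<E> {
        final E item;
        final AtomicReference<Node<E>> next = new AtomicReference<>();
        Node(E item) { this.item = item; }
    }
    private final AtomicReference<Node<E>> head, tail;

    LockFreeQueue() {
        Node<E> dummy = new Node<>(null);    // sentinel — head always points at it
        head = new AtomicReference<>(dummy);
        tail = new AtomicReference<>(dummy);
    }

    void enqueue(E e) {
        Node<E> n = new Node<>(e);
        for (;;) {
            Node<E> t = tail.get(), next = t.next.get();
            if (next == null) {
                if (t.next.compareAndSet(null, n)) { // CAS tail.next: null → new node
                    tail.compareAndSet(t, n);        // swing tail; failure OK — others help
                    return;
                }
            } else {
                tail.compareAndSet(t, next);         // help a lagging tail forward
            }
        }
    }

    E dequeue() {
        for (;;) {
            Node<E> h = head.get(), first = h.next.get();
            if (first == null) return null;          // empty
            if (head.compareAndSet(h, first))        // CAS head to head.next
                return first.item;                   // first node becomes new sentinel
        }
    }
}
```

Note the "helping" in enqueue: a thread that observes a lagging tail advances it for the original enqueuer — this is what makes the algorithm lock-free rather than merely non-blocking on the happy path.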

Flow API (Java 9): Reactive Streams standard — Publisher, Subscriber, Subscription, Processor. Backpressure: Subscriber calls subscription.request(n) to pull n items. Publisher must not emit more than requested. Prevents fast producer from overwhelming slow consumer. Used by Spring WebFlux (Project Reactor) and Kafka Streams.

Java — Exchanger double-buffer
// Exchanger: zero-copy double buffer pattern
Exchanger<List<Event>> ex = new Exchanger<>();

// Producer
new Thread(() -> {
    List<Event> buf = new ArrayList<>();
    while(true) {
        buf.add(nextEvent());
        if (buf.size() == 100) {
            buf = ex.exchange(buf); // swap full for empty list
            buf.clear();
        }
    }
});

// Consumer
new Thread(() -> {
    List<Event> buf = new ArrayList<>();
    while(true) {
        buf = ex.exchange(buf);    // get full, return empty
        buf.forEach(this::process);
        buf.clear();
    }
});
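Flow API backpressure is pull-based — a minimal sketch using the JDK's SubmissionPublisher, where the subscriber requests one item at a time (class and item names are illustrative):

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

public class FlowDemo {
    // Publishes three items; subscriber pulls them one at a time via request(1)
    static List<String> run() throws InterruptedException {
        List<String> received = new CopyOnWriteArrayList<>();
        CountDownLatch done = new CountDownLatch(1);

        Flow.Subscriber<String> sub = new Flow.Subscriber<>() {
            Flow.Subscription s;
            public void onSubscribe(Flow.Subscription subscription) {
                s = subscription;
                s.request(1);                 // pull the first item — backpressure
            }
            public void onNext(String item) {
                received.add(item);
                s.request(1);                 // ready for exactly one more
            }
            public void onError(Throwable t) { done.countDown(); }
            public void onComplete()         { done.countDown(); }
        };

        try (SubmissionPublisher<String> pub = new SubmissionPublisher<>()) {
            pub.subscribe(sub);
            pub.submit("a"); pub.submit("b"); pub.submit("c"); // submit blocks if buffer full
        } // close() queues onComplete after the pending items
        done.await();
        return received;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run()); // [a, b, c]
    }
}
```

The publisher may never emit more than the outstanding request count — that contract is what lets a slow consumer throttle a fast producer.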
AtomicReferenceFieldUpdater vs AtomicReference — Zero GC Overhead C18
When do you use AtomicReferenceFieldUpdater instead of wrapping fields in AtomicReference? Where does the JDK use this pattern?

AtomicReference<V> is a separate heap object (~24+ bytes per instance). In a data structure with millions of nodes, using one AtomicReference per node can add gigabytes of overhead.

AtomicReferenceFieldUpdater performs CAS directly on a volatile field using a single shared updater instance (reflection/VarHandle). Zero additional allocation per node — only the volatile field cost.

JDK usage: ConcurrentHashMap.Node (next field), FutureTask (waiters), AbstractQueuedSynchronizer (head, tail, state), SynchronousQueue.

Java — ARFU vs AtomicReference
// ❌ AtomicReference: 1 extra heap object per node
class Node<T> {
    T value;
    AtomicReference<Node<T>> next = new AtomicReference<>();
    // 24+ bytes per node just for the AtomicReference wrapper
}

// ✅ ARFU: one shared updater, zero per-node overhead
class Node<T> {
    T value;
    volatile Node<T> next;   // just a volatile reference

    static final AtomicReferenceFieldUpdater<Node, Node> NEXT =
        AtomicReferenceFieldUpdater.newUpdater(Node.class, Node.class, "next");

    boolean casNext(Node<T> expect, Node<T> update) {
        return NEXT.compareAndSet(this, expect, update);
    }
}

// VarHandle alternative (Java 9+, preferred)
static final VarHandle NEXT_VH = MethodHandles.lookup()
    .findVarHandle(Node.class, "next", Node.class);
Virtual Threads — Java 21 Project Loom 🆕 C19 Java 21
What are Virtual Threads (Java 21)? How do they differ from platform threads? What is carrier thread pinning and when does it occur? How does Spring Boot 3.2+ use them?

Virtual threads are lightweight JVM-managed threads. Thousands of virtual threads are multiplexed onto a small pool of OS platform threads ("carrier threads"). When a virtual thread blocks (IO, sleep, lock), the JVM unmounts it from its carrier thread — the carrier is freed to run another virtual thread.

Key differences from platform threads: Stack is heap-allocated (grows/shrinks, starts tiny ~1KB). Creating millions is cheap. Ideal for blocking IO workloads — no need for reactive/callback style.

Carrier thread pinning: When a virtual thread is blocked inside a synchronized block, it cannot unmount — the carrier thread is pinned. This can exhaust the carrier pool. Fix: replace synchronized with ReentrantLock inside VT-heavy code.

Spring Boot 3.2+: spring.threads.virtual.enabled=true switches Tomcat/Jetty/Undertow to use virtual thread per request — replaces thread-per-request pool model. Each HTTP request gets a fresh virtual thread; blocking DB/IO calls are cheap.

Java 21 — Virtual Threads
// Create virtual threads
Thread vt = Thread.ofVirtual().name("vt-1").start(() -> {
    // Blocking IO here is fine — carrier freed while blocked
    var result = dbService.queryBlocking(id); // unmounts, doesn't block carrier
});

// ExecutorService with virtual threads — no pooling needed!
try (var exec = Executors.newVirtualThreadPerTaskExecutor()) {
    for (var req : requests)
        exec.submit(() -> handleRequest(req)); // 1M concurrent = fine
}

// ⚠ PINNING: synchronized + blocking = carrier pinned!
synchronized(this) {
    Thread.sleep(1000); // carrier thread blocked — pinning!
}
// ✅ Fix: use ReentrantLock
lock.lock();
try { Thread.sleep(1000); } // carrier freed — no pinning
finally { lock.unlock(); }

// Spring Boot 3.2+ application.yml
// spring.threads.virtual.enabled: true
✅ Virtual Thread Best Practices Don't pool virtual threads — create them freely. Don't use ThreadLocal for large values (memory multiplied by thread count). Use Semaphore to limit concurrency to downstream services. Virtual threads are not faster for CPU-bound tasks — use ForkJoinPool for those.
Structured Concurrency & Scoped Values — Java 21 Preview 🆕 C20 Java 21
What is Structured Concurrency (JEP 453)? How does it differ from CompletableFuture for parallel subtasks? What are ScopedValues (JEP 446) and why are they better than ThreadLocal for virtual threads?

Structured Concurrency: Groups related virtual threads into a "scope" with clear owner lifetime semantics. When the scope closes, all forked threads are joined. If one subtask fails, the scope cancels others. Solves the problem of leaked threads and scattered error handling in CompletableFuture chains.

vs CompletableFuture: CF chains are not tied to a single owner thread — completion can happen on any thread, making cancellation/shutdown complex. StructuredTaskScope enforces parent-child lifetime hierarchy.

ScopedValues (JEP 446): Immutable per-scope binding — one value per scope, shared (read-only) across all child virtual threads. Unlike ThreadLocal: not mutable, not inherited by all pools, no remove() needed, no memory leak risk. Perfect for request context in virtual-thread servers.

Java 21 — StructuredTaskScope + ScopedValue
// StructuredTaskScope: ShutdownOnFailure pattern
try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
    Subtask<User>    userTask = scope.fork(() -> fetchUser(id));
    Subtask<Orders> orderTask = scope.fork(() -> fetchOrders(id));

    scope.join(); // wait for all tasks
    scope.throwIfFailed(); // propagate first failure, cancels others

    return new Dashboard(userTask.get(), orderTask.get());
} // scope closes: all forked threads guaranteed complete or cancelled

// ScopedValue: immutable per-request context
static final ScopedValue<RequestContext> CTX = ScopedValue.newInstance();

// Set at request entry point
ScopedValue.where(CTX, new RequestContext(correlationId))
    .run(() -> handleRequest());

// Read anywhere in the call tree (including child virtual threads)
void deepMethod() {
    String id = CTX.get().correlationId(); // no ThreadLocal leak possible
}

Part III — Java 8–21 Key Features

Streams · Collectors · Optional · Records · Sealed classes · Pattern matching

J1–J8
Streams — Lazy Evaluation, Short-Circuit, Spliterator J1
Explain lazy evaluation in streams. Which operations are short-circuit? How does Spliterator enable parallelism? When does a stream throw ConcurrentModificationException?

Lazy evaluation: Intermediate operations (filter, map, flatMap, sorted, peek) build a pipeline but execute nothing. The pipeline fires only when a terminal operation (collect, forEach, findFirst, count, reduce) is invoked. This means the pipeline iterates the source once — all operations fused per element.

Short-circuit: findFirst(), findAny(), anyMatch(), allMatch(), noneMatch(), limit(n) — can stop processing before all elements are consumed.

Spliterator: The strategy for traversal and splitting. trySplit() splits data for parallel processing. Characteristics: ORDERED, DISTINCT, SORTED, SIZED, NONNULL, IMMUTABLE, CONCURRENT, SUBSIZED. Parallel streams automatically split via Spliterator and submit to ForkJoinPool.commonPool().

CME in streams: Structurally modifying the source collection during stream pipeline execution causes CME (same modCount mechanism as iterator).

Java — stream lazy + short-circuit
// Only 4 filter calls — stops at first match ≥ 10
List.of(1,3,5,10,20).stream()
    .filter(n -> { System.out.print(n + " "); return n >= 10; })
    .findFirst(); // prints: 1 3 5 10  (not 20!)

// Intermediate ops are STATELESS vs STATEFUL:
// Stateless: filter, map, flatMap, peek — no ordering dependency
// Stateful: sorted, distinct, limit — need to see data before outputting

// Custom Spliterator for custom data source
class RangeSpliterator implements Spliterator.OfInt {
    int current; final int end;
    RangeSpliterator(int start, int end) { this.current = start; this.end = end; }
    public boolean tryAdvance(IntConsumer action) {
        if (current < end) { action.accept(current++); return true; }
        return false;
    }
    public RangeSpliterator trySplit() {
        int mid = (current + end) >>> 1;
        if (mid == current) return null; // too small to split
        int lo = current; current = mid;
        return new RangeSpliterator(lo, mid); // left half to new thread
    }
    public long estimateSize() { return end - current; }
    public int characteristics() { return ORDERED | SIZED | SUBSIZED | NONNULL | IMMUTABLE; }
}
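The CME case can be reproduced with an ArrayList source — the fail-fast check is best-effort per the spec, but ArrayList's spliterator reliably detects a modCount change made during the pipeline:

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

public class StreamCme {
    static boolean demo() {
        List<Integer> src = new ArrayList<>(List.of(1, 2, 3));
        try {
            // structural modification while the pipeline consumes the source
            src.stream().forEach(n -> { if (n == 2) src.add(99); });
            return false;
        } catch (ConcurrentModificationException e) {
            return true; // ArrayList spliterator saw modCount change
        }
    }

    public static void main(String[] args) {
        System.out.println("CME thrown: " + demo()); // CME thrown: true
    }
}
```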
Collectors — groupingBy, teeing, Custom Collector, Parallel Pitfalls J2
Demonstrate groupingBy with downstream collector, partitioningBy, toMap with merge function, teeing (Java 12), and implement a custom Collector. When should you avoid parallel streams?

Custom Collector requires implementing Collector<T, A, R>: supplier (create mutable accumulator), accumulator (fold one element), combiner (merge parallel accumulators), finisher (transform A to R), characteristics (IDENTITY_FINISH, CONCURRENT, UNORDERED).

Avoid parallel streams when: source is small (<10K elements), operations have side effects or shared mutable state, source is ordered with unordered-sensitive operations, IO-bound operations (use virtual threads instead), tasks are unequal in size (work stealing won't help).

Java — advanced Collectors
// groupingBy with downstream: dept → avg salary
Map<String, Double> avgSalary = employees.stream()
    .collect(Collectors.groupingBy(
        Employee::getDept,
        Collectors.averagingDouble(Employee::getSalary)));

// toMap with merge (avoid key collision exception)
Map<String, Long> wordCount = words.stream()
    .collect(Collectors.toMap(w -> w, w -> 1L, Long::sum));

// teeing (Java 12): two collectors in one pass
record Stats(double avg, Optional<Employee> highest) {}
Stats stats = employees.stream().collect(Collectors.teeing(
    Collectors.averagingDouble(Employee::getSalary),
    Collectors.maxBy(Comparator.comparingDouble(Employee::getSalary)),
    Stats::new));

// Custom Collector: sum and average in one pass
Collector<Integer, ?, Map<String,Object>> custom =
    Collector.of(
        () -> new int[]{0, 0},                      // supplier: [sum, count]
        (arr, n) -> { arr[0] += n; arr[1]++; },     // accumulate
        (a, b) -> new int[]{a[0]+b[0], a[1]+b[1]},  // combine (parallel)
        arr -> Map.of("sum", arr[0],
                      "avg", arr[1] > 0 ? (double) arr[0] / arr[1] : 0.0) // finisher
    );
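The shared-mutable-state pitfall from the list above is easy to demonstrate — a sketch (element counts are illustrative): the racy version may lose elements or throw, while letting the stream accumulate thread-confined results is always correct.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.IntStream;

public class ParallelPitfall {
    public static void main(String[] args) {
        // ❌ Shared mutable state: ArrayList is not thread-safe
        List<Integer> unsafe = new ArrayList<>();
        try {
            IntStream.range(0, 100_000).parallel().forEach(unsafe::add);
        } catch (RuntimeException e) {
            // a resize race can surface as ArrayIndexOutOfBoundsException
        }
        System.out.println("unsafe size (often wrong): " + unsafe.size());

        // ✅ Thread-confined accumulation, merged by the framework
        List<Integer> safe = IntStream.range(0, 100_000).parallel().boxed().toList();
        System.out.println("safe size: " + safe.size()); // always 100000
    }
}
```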
Java 11 — String methods, var in lambda, Files, HTTP Client J3 🆕
What are the most useful Java 11 additions? Cover String methods, var in lambda parameters, Files.readString/writeString, and the new HttpClient API.
Java 11 — key additions
// String methods
"  hello  ".strip();         // "hello" — Unicode-aware (vs trim)
"  ".isBlank();             // true — whitespace check
"a\nb\nc".lines().toList(); // ["a","b","c"]
"ab".repeat(3);             // "ababab"
"".stripLeading();           // strip only from left/right

// var in lambda parameters (Java 11)
(var s, var t) -> s + t      // allows @NotNull var s — for annotations

// Files.readString / writeString
String content = Files.readString(Path.of("config.json"));
Files.writeString(Path.of("out.txt"), "hello", StandardOpenOption.CREATE);

// New HttpClient (non-blocking)
HttpClient client = HttpClient.newBuilder()
    .connectTimeout(Duration.ofSeconds(5))
    .build();
HttpRequest req = HttpRequest.newBuilder()
    .uri(URI.create("https://api.example.com/data"))
    .header("Authorization", "Bearer " + token)
    .GET().build();

// Async HTTP call
CompletableFuture<String> body = client
    .sendAsync(req, HttpResponse.BodyHandlers.ofString())
    .thenApply(HttpResponse::body);

// Collection.toArray with type
String[] arr = list.toArray(String[]::new); // no reflection needed
Java 14–17 — Records, Sealed Classes, Pattern Matching J4 🆕
What do Records (Java 16) provide? What are Sealed classes (Java 17) and how do they pair with pattern matching switch? How does Text Blocks (Java 15) improve readability?

Records (Java 16): Immutable data carriers. Compiler auto-generates: constructor, accessors (no get prefix), equals, hashCode, toString. Cannot extend other classes (implicitly extends Record). Can implement interfaces.

Sealed classes (Java 17): Restrict which classes can extend/implement. Paired with pattern matching switch — compiler can verify exhaustiveness.

Java 16/17 — Records + Sealed + Pattern Matching
// Record — immutable data carrier
record Point(double x, double y) {
    // compact canonical constructor for validation
    Point { if (x < 0) throw new IllegalArgumentException(); }
    double distance() { return Math.hypot(x, y); } // extra methods OK
}
Point p = new Point(3, 4);
p.x(); p.y(); p.distance(); // accessors — no getX()

// Sealed class + exhaustive pattern matching switch
sealed interface Shape
    permits Circle, Rectangle, Triangle {}

record Circle(double r) implements Shape {}
record Rectangle(double w, double h) implements Shape {}
record Triangle(double base, double h) implements Shape {}

double area(Shape s) {
    return switch (s) { // Java 21: exhaustive — no default needed
        case Circle(var r)         -> Math.PI * r * r;
        case Rectangle(var w, var h) -> w * h;
        case Triangle(var b, var h)  -> 0.5 * b * h;
    };
}

// Text blocks (Java 15)
String json = """
    {
        "name": "Bheemesh",
        "role": "Senior Java Developer"
    }
    """; // indentation stripped automatically
Java 21 — Pattern Matching, Switch Expressions, SequencedCollection 🆕 J5 Java 21
What are the finalized features in Java 21? Cover pattern matching for switch, guard clauses (when), SequencedCollection, and String Templates (preview).
Java 21 — finalized features
// Pattern Matching for switch (JEP 441 — finalized)
Object obj = getResult();
String desc = switch (obj) {
    case Integer i    -> "int: " + i;
    case String s     -> "str: " + s;
    case Long l when l > 1000 -> "big long"; // guard clause
    case null         -> "null";
    default           -> "unknown";
};

// SequencedCollection (JEP 431 — finalized)
// New interface for ordered collections with defined first/last
SequencedCollection<String> seq = new ArrayList<>(List.of("a","b","c"));
seq.getFirst();           // "a" — unified API across List/Deque/LinkedHashSet
seq.getLast();            // "c"
seq.addFirst("z");        // ["z","a","b","c"]
seq.reversed();           // reversed view ["c","b","a","z"]

// SequencedMap — firstEntry / lastEntry
SequencedMap<String,Integer> sm = new LinkedHashMap<>();
sm.put("a", 1); sm.put("b", 2); // insert explicitly — Map.of has no defined order
sm.firstEntry();  sm.lastEntry(); // a=1 / b=2

// Record Patterns in switch (JEP 440 — finalized)
if (shape instanceof Circle(var r) && r > 10) {
    // r is in scope here — pattern match + destructuring
}

// Unnamed Classes and Instance Main (Java 21 preview)
// void main() { System.out.println("Hello"); } — no class needed
Optional — Correct Usage, Anti-Patterns, Java 9–11 Additions J6
What are correct and incorrect usages of Optional? Explain the difference between map/flatMap/orElse/orElseGet/orElseThrow. What did Java 9–11 add?

Rule 1: Never use Optional as a field, parameter, or in collections — only as a method return type to signal "this method may not return a value."

map vs flatMap: map(fn) wraps result: if fn returns T, you get Optional<T>. flatMap(fn) flattens: fn must return Optional<T>, result is Optional<T> (no nesting).

orElse vs orElseGet: orElse(default) always evaluates the default expression, even when the Optional has a value. orElseGet(() -> expensive()) is lazy — the supplier runs only when the Optional is empty.

Java — Optional patterns
// ✅ Correct — return type only
public Optional<User> findById(Long id) { ... }

// map vs flatMap
Optional<String> name = findUser(id)
    .map(User::getName);            // Optional<String>
Optional<Address> addr = findUser(id)
    .flatMap(User::getOptionalAddr); // flatten Optional<Optional<Address>>

// orElse: ALWAYS evaluates even if present
String s = opt.orElse(expensiveDefault()); // called unconditionally!
// orElseGet: LAZY — only calls if empty
String s2 = opt.orElseGet(() -> expensiveDefault()); // only if empty

// Java 9+: ifPresentOrElse
opt.ifPresentOrElse(v -> log.info("found {}", v),
                    () -> log.warn("not found"));

// Java 9+: or — supply alternative Optional
Optional<User> user = findInDB(id)
    .or(() -> findInCache(id));   // fallback chain

// Java 9+: stream — bridge to Stream pipeline
List<User> users = ids.stream()
    .map(this::findById)          // Stream<Optional<User>>
    .flatMap(Optional::stream)   // Java 9: unwrap non-empty
    .toList();
🧩

Part IV — Code Puzzles & Edge Cases

Tricky interview gotchas · Memory corruption · Concurrent bugs

P1–P6
HashMap Concurrent Resize — Java 7 vs Java 8 Bugs P1
What happens when two threads simultaneously put into HashMap and trigger resize? Explain the Java 7 infinite loop bug and the Java 8+ data loss bug.

Java 7 — infinite loop bug: Resize transferred entries using a reverse-order loop. Thread A starts resize, reverses entries in a bucket. Thread B starts resize concurrently. When both complete, one bucket can contain a circular link: Node A → Node B → Node A. Any subsequent get() on that bucket loops forever at 100% CPU.

Java 8+ — data loss: Java 8 fixed the circular link but introduced a different race. Both threads allocate new tables independently. One thread's table = newTable write overwrites the other's → all entries inserted by the losing thread vanish silently. No exception, no log — entries just gone.

Java — concurrent HashMap danger
// Java 7 circular list (simplified transfer logic)
void transfer(Entry[] newTable) {
    for (Entry<K,V> e : table) {
        while (e != null) {
            Entry<K,V> next = e.next;
            e.next = newTable[i]; // Thread A reverses here
            newTable[i] = e;      // Thread B does same → A.next=B, B.next=A
            e = next;             // → CIRCULAR → infinite loop on get()
        }
    }
}

// Java 8: table reference overwrite data loss
// Thread A: newTab = new Node[32]; ... table = newTab; ✓
// Thread B: newTab = new Node[32]; ... table = newTab; ← overwrites A!
// → A's entries inserted before B's resize are LOST

// ✅ Always use ConcurrentHashMap
Map<K,V> safe = new ConcurrentHashMap<>();
// CAS + per-bucket sync: zero data loss, no infinite loops
Volatile Puzzle — What Happens Without volatile? P2
In this code, what are two bugs without volatile? Trace the exact memory ordering issue.
Java — volatile puzzle
class BrokenSingleton {
    private static BrokenSingleton instance; // NOT volatile
    private int value = 42;

    public static BrokenSingleton get() {
        if (instance == null) {
            synchronized(BrokenSingleton.class) {
                if (instance == null)
                    instance = new BrokenSingleton(); // BUG 1
            }
        }
        return instance; // BUG 2
    }

    void use() {
        BrokenSingleton s = get();
        assert s.value == 42; // Can FAIL!
    }
}

// BUG 1: new BrokenSingleton() may be reordered by JIT:
//   a) allocate memory
//   b) write reference to instance  ← Thread B sees non-null here
//   c) initialize fields (value=42)  ← NOT YET DONE
// Thread B reads instance != null, skips sync, reads value = 0!

// BUG 2: Without volatile, Thread B may read stale null
// from its CPU cache even after Thread A wrote it — visibility!

// FIX: private static volatile BrokenSingleton instance;
// volatile write establishes happens-before: constructor end HB volatile write HB every subsequent read
CompletableFuture — thenApply vs thenCompose Puzzle P3
What is the TYPE of r1 and r2? Which is correct for chaining an async lookup? Which is the "nested future" mistake?
Java — CF composition puzzle
CompletableFuture<Integer> idFuture =
    CompletableFuture.supplyAsync(() -> 42);

// Q1: What is the type of r1?
var r1 = idFuture.thenApply(id ->
    CompletableFuture.supplyAsync(() -> "User:" + id));
// A: CompletableFuture<CompletableFuture<String>> — NESTED! Wrong.
// You'd need r1.join().join() — bad.

// Q2: What is the type of r2?
var r2 = idFuture.thenCompose(id ->
    CompletableFuture.supplyAsync(() -> "User:" + id));
// A: CompletableFuture<String> — FLATTENED. Correct!
// thenCompose = flatMap in Stream / Mono.flatMap in Reactor

// Rule: thenApply when fn returns T (not CF)
//       thenCompose when fn returns CompletionStage<T>

// Q3: What thread executes thenApply?
var r3 = idFuture.thenApply(id -> id * 2);
// A: The thread that completed idFuture (ForkJoinPool worker)
//    OR the calling thread if already complete at registration time
// thenApplyAsync: always re-dispatched to the common pool (or the executor you pass)
Parallel Stream — Shared Mutable State Bug P4
Why does this parallel stream produce wrong results? Show the fix with a thread-safe collector.
Java — parallel stream race condition
// ❌ BUG: ArrayList not thread-safe — concurrent add causes corruption
List<Integer> results = new ArrayList<>();
IntStream.range(0, 1000).parallel()
    .forEach(results::add);
// size() may come out 983 or 995, or add may throw ArrayIndexOutOfBoundsException
// Two threads expand the array simultaneously → corrupted state

// ❌ BUG: shared counter — race condition
int[] count = {0};
IntStream.range(0, 1000).parallel()
    .forEach(i -> count[0]++); // lost updates!

// ✅ FIX 1: use collect() — thread-safe combiner
List<Integer> safe = IntStream.range(0, 1000).parallel()
    .boxed()
    .collect(Collectors.toList()); // always correct — thread-safe merge

// ✅ FIX 2: thread-safe collection (correct but slow — copies the array on every add)
List<Integer> safe2 = new CopyOnWriteArrayList<>();
// or Vector, or ConcurrentLinkedQueue; prefer FIX 1 for bulk collection

// ✅ FIX 3: LongAdder for counting
LongAdder cnt = new LongAdder();
IntStream.range(0, 1000).parallel().forEach(i -> cnt.increment());
cnt.sum(); // always 1000
Stream Reuse & Stateful Lambda Puzzle P5
What happens when you reuse a stream? Why is this stateful lambda dangerous in parallel mode? What is the output and why?
Java — stream gotchas
// Puzzle 1: Stream reuse — throws IllegalStateException
Stream<String> s = Stream.of("a","b","c");
s.forEach(System.out::println);
s.forEach(System.out::println); // IllegalStateException: stream already operated
// Streams are single-use pipelines

// Puzzle 2: stateful lambda in parallel (non-deterministic)
List<Integer> seen = new ArrayList<>();
IntStream.range(0,10).parallel()
    .filter(i -> { seen.add(i); return i%2==0; }) // STATEFUL SIDE EFFECT
    .count();
// seen.size() may be 8, 9, or 10 — lost updates from concurrent add
// (the ArrayList may even be corrupted); filter runs in any order, possibly concurrently

// Puzzle 3: sorted + parallel order
List.of(3,1,4,1,5).parallelStream()
    .sorted()
    .forEachOrdered(System.out::println); // [1,1,3,4,5] — ordered despite parallel
// sorted() is stateful — must see all before outputting
// forEachOrdered preserves encounter order in parallel

// Puzzle 4: peek() in parallel
long count = List.of(1,2,3).parallelStream()
    .peek(System.out::println)  // may print in any order
    .count();                   // count = 3, always correct
ConcurrentHashMap — compute, merge, atomic word-count P6 🆕
What is the difference between putIfAbsent, compute, and merge in ConcurrentHashMap? Why is compute atomic? Implement thread-safe word counting using merge.

putIfAbsent(k, v): atomically puts only if key absent. Returns old value (or null). Does not compute lazily.

compute(k, fn): atomically applies fn to key+current value. The entire read-compute-write is done under the bucket lock — no race condition. Returns new value. If fn returns null, removes the key.

merge(k, v, fn): if key absent, puts v; if present, fn(oldVal, v) determines new value. Same atomic guarantee as compute. Perfect for aggregation.

Java — CHM atomic operations
// Thread-safe word counter via merge
ConcurrentHashMap<String, Long> wordCount = new ConcurrentHashMap<>();

// merge: if absent insert 1L; if present add 1L (atomic)
words.parallelStream()
    .forEach(w -> wordCount.merge(w, 1L, Long::sum));

// compute — atomic read-modify-write for complex state
wordCount.compute("hello", (k, v) -> v == null ? 1L : v + 1L);

// computeIfAbsent — lazy initialization pattern
ConcurrentHashMap<String, List<Event>> groups = new ConcurrentHashMap<>();
groups.computeIfAbsent(key, k -> new CopyOnWriteArrayList<>())
      .add(event); // thread-safe group creation

// ⚠ DON'T do this — check-then-act is not atomic: race between containsKey and put
if (!map.containsKey(k)) map.put(k, new ArrayList<>()); // RACE!

// frequency — Java 8 reduce + CHM
Map<String, Long> freq = words.stream()
    .collect(Collectors.groupingByConcurrent(
        w -> w, Collectors.counting()));  // parallel-safe
🏭

Part V — Framework & Production Patterns

Spring · Hibernate · Kafka · Snowflake · Real collection choices in production

F1–F4
Spring · Hibernate · Kafka — Internal Map Choices F1
Where do Spring, Hibernate, and Kafka use specific collection types internally and why? This tests whether you understand real production code, not just theory.

Spring:

  • Bean registry (DefaultSingletonBeanRegistry): ConcurrentHashMap<String, Object> singletonObjects — concurrent reads during initialization
  • Event listener lists: CopyOnWriteArraySet via AbstractApplicationEventMulticaster — read-heavy, rarely written after context start
  • AOP proxy cache: ConcurrentHashMap mapping target class → proxy class
  • MVC handler mappings: LinkedHashMap to preserve registration order of URL patterns
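
The singleton-registry idea above can be sketched in a few lines (an illustrative simplification, not Spring's actual code — real DefaultSingletonBeanRegistry uses explicit synchronization to handle circular references and creation callbacks):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Simplified echo of the singletonObjects lookup path
class SingletonRegistry {
    private final Map<String, Object> singletonObjects = new ConcurrentHashMap<>();

    // lock-free read on the hot path; atomic create on first access
    Object getSingleton(String name, Supplier<Object> factory) {
        return singletonObjects.computeIfAbsent(name, n -> factory.get());
    }
}
```

The point of ConcurrentHashMap here is the read path: after context startup, getSingleton is effectively a volatile read with no lock contention.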

Hibernate:

  • 1st level cache (Session): HashMap<EntityKey, Object> — per-session, not shared, no concurrency needed
  • 2nd level cache: ConcurrentHashMap via Ehcache/Caffeine (region-based)
  • Metadata cache: WeakHashMap for ClassLoader-sensitive metadata
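
The WeakHashMap pattern from the metadata-cache bullet, sketched generically (MetadataCache and describe are hypothetical names for illustration, not Hibernate code):

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

// Keys are Class objects: when a ClassLoader and its classes are unloaded,
// the weak keys let the GC reclaim the cached metadata — no leak on redeploy
class MetadataCache {
    private final Map<Class<?>, String> cache =
        Collections.synchronizedMap(new WeakHashMap<>());

    String describe(Class<?> type) {
        return cache.computeIfAbsent(type,
            t -> t.getName() + " fields=" + t.getDeclaredFields().length);
    }
}
```

A strong-keyed HashMap here would pin every cached Class (and therefore its ClassLoader) in memory forever — the classic PermGen/Metaspace leak on hot redeploy.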

Kafka:

  • Consumer offset tracking: HashMap<TopicPartition, OffsetAndMetadata> — single consumer thread per partition, no sync needed
  • Leader state (Broker): ConcurrentHashMap<TopicPartition, Partition>
  • Producer metadata: ConcurrentSkipListMap for sorted partition/topic navigation

Snowflake (JDBC / Spring): Partition metadata range queries use TreeMap<Long, Partition> for timestamp-based subMap() / floorKey() lookups when pruning micro-partitions.
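
The floorKey/subMap pruning idea can be sketched like this (PartitionIndex, register, and candidates are hypothetical names for illustration, not Snowflake internals):

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical index: partition start-timestamp → partition id
class PartitionIndex {
    private final TreeMap<Long, String> byStartTs = new TreeMap<>();

    void register(long startTs, String partitionId) {
        byStartTs.put(startTs, partitionId);
    }

    // partition covering a point in time: greatest start <= ts
    String partitionFor(long ts) {
        var e = byStartTs.floorEntry(ts);
        return e == null ? null : e.getValue();
    }

    // prune to partitions that may overlap [from, to]
    NavigableMap<Long, String> candidates(long from, long to) {
        Long start = byStartTs.floorKey(from); // partition containing 'from'
        return byStartTs.subMap(start == null ? from : start, true, to, true);
    }
}
```

floorKey finds the partition whose range contains the query point in O(log n); subMap returns a live view, so the pruning itself allocates nothing.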

Production Concurrency — Rate Limiter, Circuit Breaker, Caching F2
How do you implement a thread-safe rate limiter using Semaphore? A simple concurrent cache using ConcurrentHashMap.computeIfAbsent? And the token bucket pattern?
Java — production concurrency patterns
// Rate limiter with Semaphore (N requests/sec)
class SemaphoreRateLimiter {
    private final Semaphore sem;
    SemaphoreRateLimiter(int rps) {
        sem = new Semaphore(0);
        Executors.newSingleThreadScheduledExecutor()
            .scheduleAtFixedRate(() -> {
                sem.drainPermits();  // cap carry-over — otherwise idle seconds
                sem.release(rps);    // accumulate into an unbounded burst
            }, 0, 1, TimeUnit.SECONDS);
    }
    void acquire() throws InterruptedException { sem.acquire(); }
}

// Concurrent cache with computeIfAbsent + expiry
class TimedCache<K,V> {
    record Entry<V>(V value, long expiryMs) {}
    private final ConcurrentHashMap<K, Entry<V>> map = new ConcurrentHashMap<>();

    V get(K key, Supplier<V> loader, long ttlMs) {
        long now = System.currentTimeMillis();
        return map.compute(key, (k, e) -> {
            if (e != null && e.expiryMs() > now) return e; // fresh
            return new Entry<>(loader.get(), now + ttlMs);  // load + refresh
        }).value();
    }
}

// Token bucket rate limiter (AtomicLong based)
class TokenBucket {
    private final long capacity;
    private final AtomicLong tokens;
    private long lastRefill = System.nanoTime(); // guarded by synchronized

    TokenBucket(long capacity) {
        this.capacity = capacity;
        this.tokens = new AtomicLong(capacity); // start full
    }
    synchronized boolean tryConsume() {
        refill();
        if (tokens.get() > 0) { tokens.decrementAndGet(); return true; }
        return false;
    }
    private void refill() {
        long now = System.nanoTime();
        long add = (now - lastRefill) / 1_000_000; // refill 1 token per ms elapsed
        if (add > 0) {
            tokens.set(Math.min(capacity, tokens.get() + add));
            lastRefill = now;
        }
    }
}
Spring @Async + MDC + Virtual Thread Migration F3
How do you propagate MDC context to @Async methods? What Spring Boot 3.2 configuration enables virtual threads? What ThreadLocal-related caveats exist?
Java / YAML — Spring @Async MDC + Virtual Threads
// MDC-aware TaskDecorator for @Async propagation
@Bean
public Executor asyncExecutor() {
    ThreadPoolTaskExecutor ex = new ThreadPoolTaskExecutor();
    ex.setTaskDecorator(runnable -> {
        Map<String,String> ctx = MDC.getCopyOfContextMap(); // capture
        return () -> {
            try {
                if (ctx != null) MDC.setContextMap(ctx); // restore in new thread
                runnable.run();
            } finally { MDC.clear(); } // always clean up
        };
    });
    ex.setCorePoolSize(10);
    ex.initialize();
    return ex;
}

// Spring Boot 3.2+: enable virtual threads for Tomcat
// application.yml:
// spring:
//   threads:
//     virtual:
//       enabled: true
// This switches Tomcat's request handler from platform thread pool
// to Executors.newVirtualThreadPerTaskExecutor()

// Spring Boot 3.2+: virtual thread bean config
@Bean
public Executor virtualThreadExecutor() {
    return Executors.newVirtualThreadPerTaskExecutor();
}

// ⚠ ThreadLocal caveat with virtual threads:
// Each VT has its own ThreadLocalMap — value not shared, but
// with 100k concurrent VTs, 100k copies of ThreadLocal data in heap
// Use ScopedValue instead for read-only per-request context
Kafka — Concurrency, Partition Assignment, Offset Management F4
How does Kafka consumer concurrency work with Spring Kafka? Why is a KafkaConsumer not thread-safe? How does Spring @KafkaListener manage concurrent partition processing?

KafkaConsumer thread-safety: Kafka's KafkaConsumer is NOT thread-safe — all calls must happen on the same polling thread. Spring Kafka wraps each consumer in a KafkaMessageListenerContainer that runs a dedicated polling-loop thread per consumer instance.

Concurrency in Spring: ConcurrentKafkaListenerContainerFactory with concurrency=3 creates 3 KafkaMessageListenerContainer instances, each with their own thread, consuming from assigned partitions.

Java / YAML — Spring Kafka concurrency
// Factory configuration
@Bean
public ConcurrentKafkaListenerContainerFactory<String,Order> factory() {
    var factory = new ConcurrentKafkaListenerContainerFactory<String,Order>();
    factory.setConsumerFactory(consumerFactory);
    factory.setConcurrency(3); // 3 threads = 3 partition consumers
    factory.getContainerProperties()
           .setAckMode(ContainerProperties.AckMode.MANUAL_IMMEDIATE);
    return factory;
}

// Listener with manual offset commit
@KafkaListener(topics = "orders", concurrency = "3")
public void onOrder(ConsumerRecord<String,Order> record,
                     Acknowledgment ack) {
    try {
        processOrder(record.value());
        ack.acknowledge(); // manual commit after success
    } catch (Exception e) {
        log.error("Processing failed: {}", record.offset(), e);
        // don't ack → message retried on next poll
    }
}

// Offset tracking: HashMap<TopicPartition, OffsetAndMetadata>
// Per-thread (per container) — no synchronization needed
// ConcurrentHashMap used at the KafkaConsumer coordinator level