Senior Java Interview
Master Prep Guide
Every question from your interviews — answered in depth
CI (Continuous Integration) is the practice of automatically building and testing code on every push. CD (Continuous Delivery/Deployment) automates the release to staging or production.
In my project: We used GitLab CI/CD. Every merge request triggered a pipeline with stages: build → test → SonarQube scan → Docker build → deploy to staging. On merge to main, it auto-deployed to production via a Kubernetes rolling update.
```yaml
# .gitlab-ci.yml example
stages:
  - build
  - test
  - quality
  - docker
  - deploy

build:
  stage: build
  script: mvn clean package -DskipTests

test:
  stage: test
  script: mvn test

sonar:
  stage: quality
  script: mvn sonar:sonar -Dsonar.host.url=$SONAR_URL
```
VM: Runs a full OS on top of a hypervisor. Heavy — each VM has its own kernel, GBs of disk, slow startup.
Docker: Shares the host OS kernel. Containers are lightweight processes — start in seconds, MBs of overhead.
- Isolation: VMs = hardware-level; Docker = process-level (namespaces + cgroups)
- Portability: Docker images run identically on dev/staging/prod
- Use case: Microservices → Docker. Legacy monolith requiring OS isolation → VM
```dockerfile
# Dockerfile for Spring Boot
FROM eclipse-temurin:21-jre-alpine
WORKDIR /app
COPY target/*.jar app.jar
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "app.jar"]
```
Three strategies depending on risk tolerance:
- Blue-Green: Two identical environments (blue=live, green=new). Switch load balancer after green is healthy. Instant rollback by flipping back.
- Canary: Route 5% of traffic to new version. Monitor error rates. Gradually increase to 100%.
- Rolling (Kubernetes default): Replace pods one at a time. New pod must pass readiness probe before old one terminates.
```yaml
# Kubernetes rolling update strategy
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1        # 1 extra pod during update
    maxUnavailable: 0  # never take pods down first
```
Logging Stack: SLF4J + Logback → structured JSON logs → shipped to ELK (Elasticsearch + Logstash + Kibana) or Loki + Grafana.
Monitoring Stack: Micrometer (metrics) → Prometheus (scrape) → Grafana (dashboards). Spring Boot Actuator exposes /actuator/prometheus endpoint.
Distributed Tracing: Micrometer Tracing (Spring Boot 3.x) with Zipkin or Tempo. Every request gets a traceId propagated across services via HTTP headers.
```yaml
# application.yml
management:
  endpoints:
    web:
      exposure:
        include: health,info,prometheus,metrics
  tracing:
    sampling:
      probability: 1.0
```
Yes. Key Kubernetes concepts used in production:
- Deployment: Manages pod replicas and rolling updates
- Service: ClusterIP for internal, LoadBalancer for external traffic
- ConfigMap/Secret: Externalize config and credentials
- HPA (Horizontal Pod Autoscaler): Scale pods based on CPU/memory
- Ingress: Route external HTTP/S traffic to services (NGINX Ingress)
- Namespace: Logical isolation per team/environment
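The core objects above can be sketched as a minimal manifest — names, image, and replica count here are illustrative, not from a real deployment:

```yaml
# Hypothetical order-service: Deployment plus internal ClusterIP Service
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: order-service
  template:
    metadata:
      labels:
        app: order-service
    spec:
      containers:
        - name: order-service
          image: registry.example.com/order-service:1.0.0
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: order-service
spec:
  type: ClusterIP          # internal only; use LoadBalancer/Ingress for external traffic
  selector:
    app: order-service
  ports:
    - port: 80
      targetPort: 8080
```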
Multi-stage builds separate the build environment from the runtime image. The final image only has what's needed to run — no JDK, no Maven, no source code. Result: smaller, safer image.
```dockerfile
# Stage 1: Build
FROM maven:3.9-eclipse-temurin-21 AS builder
WORKDIR /build
COPY pom.xml .
RUN mvn dependency:go-offline                  # cache deps layer
COPY src ./src
RUN mvn package -DskipTests

# Stage 2: Runtime only
FROM eclipse-temurin:21-jre-alpine
RUN addgroup -S app && adduser -S app -G app   # non-root user
USER app
WORKDIR /app
COPY --from=builder /build/target/*.jar app.jar
EXPOSE 8080
ENTRYPOINT ["java", "-XX:+UseContainerSupport", "-jar", "app.jar"]
```
Multiple layers — outer to inner:
- Spring Profiles: application-dev.yml, application-prod.yml. Activated via SPRING_PROFILES_ACTIVE env var.
- Config Server: Central Git-backed config repo. Services fetch their config at startup. Supports per-environment directories.
- Secrets: Never in Git. Injected via Kubernetes Secrets, AWS Secrets Manager, or HashiCorp Vault. Referenced as env vars in pod spec.
- CI/CD pipeline variables: GitLab/Jenkins masked variables per environment. Pipeline passes them at deploy time.
```yaml
# Kubernetes Secret injected as env var
env:
  - name: DB_PASSWORD
    valueFrom:
      secretKeyRef:
        name: db-secret
        key: password
```
Helm is the package manager for Kubernetes. A Helm chart is a collection of YAML templates for all K8s resources (Deployment, Service, Ingress, ConfigMap, etc.) parameterized with a values.yaml.
- Why Helm: Without it, you maintain dozens of separate YAML files per service per environment. Helm lets you template once, override values per environment.
- Workflow: Developer pushes code → CI builds Docker image, tags with git SHA → CD updates values.yaml image tag → helm upgrade --install deploys to cluster.
- Rollback: helm rollback <release> <revision> — instant, auditable.
```shell
# Helm deploy in CI/CD pipeline
helm upgrade --install order-service ./charts/order-service \
  --set image.tag=$CI_COMMIT_SHA \
  --set replicaCount=3 \
  -f values-prod.yaml \
  --namespace production
```
Containers have fixed CPU/memory. JVM must be aware of container limits — not host machine resources.
- -XX:+UseContainerSupport (default Java 10+): JVM reads cgroup limits, not host RAM. Critical in containers.
- Heap sizing: -XX:MaxRAMPercentage=75.0 — use 75% of container memory for heap, leave rest for off-heap (Metaspace, thread stacks, NIO buffers).
- GC choice: G1GC (default Java 9+) is good for most. ZGC for low-latency requirements (<1ms pauses). Shenandoah for consistent pause time.
- Thread pool: The default ForkJoinPool sizes itself from Runtime.availableProcessors(), which respects cgroup CPU limits when container support is active; override explicitly with -XX:ActiveProcessorCount=N if needed.
```dockerfile
ENTRYPOINT ["java",
  "-XX:+UseContainerSupport",
  "-XX:MaxRAMPercentage=75.0",
  "-XX:+UseZGC",
  "-Xlog:gc:file=/logs/gc.log:time,uptime:filecount=5,filesize=20m",
  "-jar", "app.jar"]
```
IaC treats infrastructure (VPCs, ECS clusters, RDS, S3) as versioned code — same Git workflow, code review, and rollback as application code.
- Terraform: Cloud-agnostic HCL. Plan → Apply workflow. State file tracks actual infrastructure. Used for AWS VPC, EKS cluster, RDS, ElastiCache provisioning.
- AWS CDK: Define infra in Java/TypeScript. Compiles to CloudFormation. Good if team is Java-heavy.
- Ansible: Configuration management. Used for server bootstrapping, installing agents, config drift remediation.
```hcl
# Terraform RDS example
resource "aws_db_instance" "orders_db" {
  engine            = "postgres"
  engine_version    = "15"
  instance_class    = "db.t3.medium"
  multi_az          = true
  storage_encrypted = true
}
```
Rollback strategy must be decided BEFORE deployment, not during incident.
- Kubernetes: kubectl rollout undo deployment/order-service — rolls back to previous ReplicaSet in seconds. Kubernetes keeps last 10 revisions by default.
- Blue-Green: Flip load balancer back to blue. Zero downtime, instant.
- Canary (Argo Rollouts): Abort rollout — traffic automatically reverts to stable version.
- Database rollback: This is the hard part. Liquibase rollback must be pre-scripted. Use expand-contract migrations to avoid needing DB rollback.
- Feature flags: Kill switch in LaunchDarkly/Unleash — disable the new feature without redeployment.
```shell
# Kubernetes instant rollback
kubectl rollout undo deployment/order-service -n production
kubectl rollout status deployment/order-service   # verify
```
GitOps: Git is the single source of truth for both application code AND infrastructure/deployment state. Any cluster change must go through Git — no manual kubectl apply in production.
ArgoCD: Watches a Git repo (Helm charts/manifests). If the cluster drifts from the Git state, ArgoCD detects it and auto-syncs (or alerts). Pull-based model — cluster pulls from Git, not CI pushing to cluster.
- Benefits: Full audit trail (every deployment is a Git commit), easy rollback (git revert), self-healing clusters.
- Workflow: CI builds image + updates image tag in Git → ArgoCD detects change → syncs to cluster.
- vs Push-based: Jenkins/GitLab pushing to cluster requires cluster credentials in CI. ArgoCD only needs cluster-internal credentials.
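A minimal ArgoCD Application resource for this workflow might look like the sketch below — the repo URL, chart path, and namespace are illustrative:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: order-service
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://gitlab.example.com/platform/deploy-configs.git
    targetRevision: main
    path: charts/order-service
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true      # delete resources that were removed from Git
      selfHeal: true   # revert manual drift back to the Git state
```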
Gitflow: Long-lived branches (feature, develop, release, hotfix, main). Good for scheduled release cycles but creates merge hell in large teams.
Trunk-based: Everyone commits to main/trunk frequently (at least daily). Feature flags hide incomplete features. Preferred in CI/CD-heavy shops.
I prefer trunk-based for microservices — it eliminates integration debt, forces small commits, and keeps CI feedback fast. We used feature flags (LaunchDarkly) to safely deploy incomplete features to prod.
Poll SCM: Jenkins periodically checks the repo on a cron schedule (e.g., every 5 minutes). Wasteful — mostly does nothing.
Webhook: GitHub/GitLab pushes an HTTP event to Jenkins the moment a commit/PR happens. Instant trigger, zero wasted polling.
Always prefer Webhooks in production. Poll SCM is only used when Jenkins is behind a firewall and can't receive inbound requests.
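For reference, a sketch of the polling trigger in a Declarative Jenkinsfile — the webhook variant needs no `triggers` entry beyond the plugin's hook registration, so it is shown as a comment (assumes the GitHub plugin):

```groovy
pipeline {
  agent any
  triggers {
    // Poll SCM: check the repo every ~5 minutes (H spreads load) — mostly wasted work
    pollSCM('H/5 * * * *')
    // Webhook alternative (GitHub plugin): trigger on pushed hook events instead
    // githubPush()
  }
  stages {
    stage('Build') { steps { sh 'mvn -B package' } }
  }
}
```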
Steps: Install SonarQube Scanner plugin in Jenkins → Add SonarQube server URL in Jenkins global config → Store token as Jenkins credential → Add sonar stage to Jenkinsfile.
```groovy
stage('SonarQube Analysis') {
  steps {
    withSonarQubeEnv('SonarQube-Server') {
      sh 'mvn sonar:sonar'
    }
  }
}
stage('Quality Gate') {
  steps {
    timeout(time: 2, unit: 'MINUTES') {
      waitForQualityGate abortPipeline: true
    }
  }
}
```
Jenkins has a built-in Credentials Store (encrypted at rest). Never hardcode secrets in Jenkinsfile.
```groovy
// Reference credentials in Declarative Pipeline
environment {
  DB_PASS   = credentials('db-password-secret-id')
  AWS_CREDS = credentials('aws-access-key')
}

// For username+password type — pipe via --password-stdin so the
// password never appears in the process list:
withCredentials([usernamePassword(
    credentialsId: 'dockerhub-creds',
    usernameVariable: 'USER',
    passwordVariable: 'PASS')]) {
  sh 'echo $PASS | docker login -u $USER --password-stdin'
}
```
For enterprise: integrate Jenkins with HashiCorp Vault or AWS Secrets Manager via plugins for dynamic secret injection.
Parallel stages run multiple steps simultaneously, reducing pipeline duration. Ideal when steps are independent — e.g., unit tests + integration tests + security scan can all run at the same time.
```groovy
stage('Parallel Tests') {
  parallel {
    stage('Unit Tests') {
      steps { sh 'mvn test -Dgroups=unit' }
    }
    stage('Integration Tests') {
      steps { sh 'mvn test -Dgroups=integration' }
    }
    stage('SAST Scan') {
      steps { sh './run-security-scan.sh' }
    }
  }
}
```
Authentication: Verifying WHO you are. (Login with username/password, JWT, OAuth2 token)
Authorization: Verifying WHAT you can do. (Role check — can this user access /admin?)
In Spring Security: Authentication is handled by AuthenticationManager → UserDetailsService. Authorization is handled by SecurityFilterChain with .hasRole() or @PreAuthorize.
```java
// 1. JWT Utility (jjwt 0.12.x API)
@Component
public class JwtUtil {

    // Demo only — load the key from config/Vault in production.
    // HS256 requires a key of at least 256 bits (32 bytes).
    private final String SECRET = "mySecretKey256bits";

    public String generateToken(String username) {
        return Jwts.builder()
                .subject(username)
                .issuedAt(new Date())
                .expiration(new Date(System.currentTimeMillis() + 86400000)) // 24h
                .signWith(Keys.hmacShaKeyFor(SECRET.getBytes()))
                .compact();
    }

    public String extractUsername(String token) {
        return Jwts.parser()
                .verifyWith(Keys.hmacShaKeyFor(SECRET.getBytes()))
                .build()
                .parseSignedClaims(token)
                .getPayload()
                .getSubject();
    }
}
```
```java
// 2. JWT Filter
@Component
public class JwtFilter extends OncePerRequestFilter {

    @Autowired JwtUtil jwtUtil;
    @Autowired UserDetailsService userDetailsService;

    @Override
    protected void doFilterInternal(HttpServletRequest req, HttpServletResponse res,
                                    FilterChain chain) throws IOException, ServletException {
        String header = req.getHeader("Authorization");
        if (header != null && header.startsWith("Bearer ")) {
            String token = header.substring(7);
            String username = jwtUtil.extractUsername(token);
            if (username != null
                    && SecurityContextHolder.getContext().getAuthentication() == null) {
                UserDetails user = userDetailsService.loadUserByUsername(username);
                UsernamePasswordAuthenticationToken auth =
                        new UsernamePasswordAuthenticationToken(user, null, user.getAuthorities());
                SecurityContextHolder.getContext().setAuthentication(auth);
            }
        }
        chain.doFilter(req, res);
    }
}
```
```java
// 3. Security Config
@Configuration
@EnableWebSecurity
public class SecurityConfig {

    @Bean
    public SecurityFilterChain filterChain(HttpSecurity http) throws Exception {
        return http
                .csrf(AbstractHttpConfigurer::disable)
                .sessionManagement(s -> s.sessionCreationPolicy(STATELESS))
                .authorizeHttpRequests(auth -> auth
                        .requestMatchers("/api/auth/**").permitAll()
                        .requestMatchers("/api/admin/**").hasRole("ADMIN")
                        .anyRequest().authenticated())
                .addFilterBefore(jwtFilter, UsernamePasswordAuthenticationFilter.class)
                .build();
    }
}
```
- Header (Bearer token): Stored in JS memory or localStorage. Vulnerable to XSS if stored in localStorage. SPAs typically use this.
- HttpOnly Cookie: JS cannot read it — XSS-proof. But vulnerable to CSRF unless SameSite=Strict is set.
- Most secure: HttpOnly + Secure + SameSite=Strict cookie. Combine with CSRF token for form submissions.
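A response header following that recommendation would look like this — the cookie name and lifetime are illustrative:

```http
Set-Cookie: access_token=<jwt>; HttpOnly; Secure; SameSite=Strict; Path=/; Max-Age=900
```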
```java
// DTO with validation annotations
public class UserRequest {
    @NotBlank(message = "Name is required")
    private String name;

    @Email(message = "Invalid email")
    private String email;

    @Min(18) @Max(100)
    private int age;
}

// Controller
@PostMapping("/users")
public ResponseEntity<?> createUser(@Valid @RequestBody UserRequest req) { ... }

// Global exception handler
@RestControllerAdvice
public class GlobalExceptionHandler {
    @ExceptionHandler(MethodArgumentNotValidException.class)
    public ResponseEntity<Map<String, String>> handleValidation(MethodArgumentNotValidException ex) {
        Map<String, String> errors = ex.getBindingResult().getFieldErrors().stream()
                // merge function: a field can have several violations — keep the first
                .collect(toMap(FieldError::getField, FieldError::getDefaultMessage, (a, b) -> a));
        return ResponseEntity.badRequest().body(errors);
    }
}
```
SSO Flow (OAuth2/OIDC):
- User clicks Login → redirected to Identity Provider (Okta, Keycloak, Azure AD)
- IdP authenticates user → returns Authorization Code
- App exchanges code for Access Token + ID Token
- App validates token, creates session
SAML (older enterprise SSO): XML-based. IdP returns a signed SAML Assertion (XML document) containing user attributes. SP validates the signature using IdP's public key. Spring Security SAML extension handles this. Key: SAML assertions are Base64-encoded XML, signed with IdP's private key — any tampering breaks the signature.
```java
List<String> words = List.of("java", "spring", "java", "kafka", "spring", "java");

// Approach 1: Collectors.groupingBy + counting
Map<String, Long> freq = words.stream()
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
// {java=3, spring=2, kafka=1}

// Approach 2: toMap with merge function
Map<String, Integer> freq2 = words.stream()
        .collect(Collectors.toMap(Function.identity(), w -> 1, Integer::sum));
```
```java
// Classic trick question
Map<Integer, String> map = new HashMap<>();
map.put(1, "one");
map.put(1, "ONE");               // duplicate key
System.out.println(map.size());  // 1 (not 2)
System.out.println(map.get(1));  // "ONE" (overwritten)
```
Internals: HashMap uses an array of buckets. The key's hashCode() determines the bucket index. If two keys hash to the same bucket, entries form a linked list; since Java 8 a bucket converts to a Red-Black Tree once it exceeds 8 entries (and the table has at least 64 buckets). Default capacity=16, load factor=0.75 → resizes at 12 entries.
null key: HashMap allows one null key (always goes to bucket 0). Hashtable does NOT allow null keys.
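The null-key difference can be demonstrated directly; a small self-contained check:

```java
import java.util.HashMap;
import java.util.Hashtable;

public class NullKeyDemo {
    public static void main(String[] args) {
        HashMap<String, String> hm = new HashMap<>();
        hm.put(null, "allowed");              // null key is stored in bucket 0
        System.out.println(hm.get(null));     // retrieves the value for the null key

        Hashtable<String, String> ht = new Hashtable<>();
        try {
            ht.put(null, "boom");             // Hashtable rejects null keys
        } catch (NullPointerException e) {
            System.out.println("Hashtable threw NullPointerException");
        }
    }
}
```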
A race condition occurs when multiple threads access shared mutable state concurrently, producing unpredictable results depending on thread scheduling.
```java
// Race condition: counter++ is NOT atomic (read-modify-write)
int counter = 0;
// Thread 1 reads counter=5, Thread 2 reads counter=5
// Both increment to 6, both write 6 → lost update!

// Fix 1: AtomicInteger — CAS operation, atomic
AtomicInteger counter = new AtomicInteger(0);
counter.incrementAndGet();

// Fix 2: synchronized
synchronized (this) { counter++; }

// Fix 3: ReentrantLock
lock.lock();
try { counter++; } finally { lock.unlock(); }
```
Object class methods: equals(), hashCode(), toString(), clone(), wait(), notify(), notifyAll(), getClass(), finalize() (deprecated since Java 9)
wait() / notify(): Used for inter-thread communication. wait() releases the monitor lock and pauses the thread. notify() wakes one waiting thread. Must be called inside a synchronized block.
```java
// Producer-Consumer with wait/notify

// Consumer
synchronized (lock) {
    while (queue.isEmpty()) {
        lock.wait();       // releases lock, waits
    }
    process(queue.poll());
}

// Producer
synchronized (lock) {
    queue.add(item);
    lock.notifyAll();      // wake waiting consumers
}
```
- Stream API: Synchronous (blocking), sequential/parallel data processing pipeline. Operations like map, filter, reduce on collections.
- CompletableFuture: Asynchronous, non-blocking. Represents a future result. Chains async operations, handles failures, combines multiple futures.
```java
// Stream — synchronous processing
List<Long> result = orders.stream()
        .filter(o -> o.getAmount() > 1000)
        .map(Order::getId)
        .collect(toList());

// CompletableFuture — async chaining
CompletableFuture.supplyAsync(() -> fetchUser(id))
        .thenApplyAsync(user -> fetchOrders(user))
        .thenAcceptAsync(orders -> sendEmail(orders))
        .exceptionally(ex -> {
            log.error("async pipeline failed", ex);
            return null;
        });
```
- WebClient (Spring WebFlux): Reactive, non-blocking. Best for high-throughput async calls. Can be used in servlet apps too.
- RestClient (Spring Boot 3.2+): Synchronous, fluent API. Replacement for RestTemplate. Simple and modern for blocking calls.
- HttpClient (Java 11+): Standard library, no Spring dependency. Supports sync and async. Good for non-Spring projects.
```java
// RestClient (Spring Boot 3.2+) — preferred for simple blocking calls
RestClient client = RestClient.create();
User user = client.get()
        .uri("https://api.example.com/users/{id}", userId)
        .retrieve()
        .body(User.class);

// WebClient — reactive/async
webClient.get()
        .uri("/users/{id}", id)
        .retrieve()
        .bodyToMono(User.class)
        .subscribe(user -> process(user));
```
- Heap: Object instances, arrays. Divided into Young (Eden + Survivor S0/S1) and Old Generation. GC manages this.
- Metaspace (Java 8+): Class metadata, method bytecode. Replaced PermGen. Grows dynamically — set -XX:MaxMetaspaceSize to cap it.
- Stack: Per-thread. Stores method frames, local primitives, references. StackOverflowError = infinite recursion.
- Off-heap: Direct ByteBuffers, NIO. Used by Kafka, Netty. Not GC-managed.
OOM types:
- Java heap space: Memory leak or heap too small
- Metaspace: Class loader leak (CGLIB proxies, JSP engines)
- GC overhead limit exceeded: JVM spending >98% time in GC — classic leak sign
- Unable to create native thread: Too many threads, OS limit hit
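To make any of these diagnosable after the fact, containerized JVMs are commonly started with flags like the following — the dump path is illustrative and would be a mounted volume in Kubernetes:

```text
-XX:+HeapDumpOnOutOfMemoryError   # write a .hprof heap dump on OOM for offline analysis
-XX:HeapDumpPath=/dumps           # mount a persistent volume here
-XX:MaxMetaspaceSize=256m         # cap Metaspace so a classloader leak fails fast
```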
Memory leak = objects kept alive by references even though no longer needed. GC cannot collect them — heap grows until OOM.
Common causes (10-year level):
- Static collections accumulating data never cleared
- Unclosed resources — DB connections, InputStream not in try-with-resources
- ThreadLocal not removed in pooled threads — thread never dies, value lives forever
- Event listeners registered but never deregistered
- Cache without eviction (Guava/Caffeine without maximumSize or expireAfterWrite)
```java
// Dangerous: ThreadLocal leak in pooled threads
static ThreadLocal<UserContext> ctx = new ThreadLocal<>();

// Fix: always clean up in a finally block
try {
    ctx.set(userContext);
    doWork();
} finally {
    ctx.remove();   // CRITICAL — never skip this
}
```
Diagnosis: 1) Monitor heap growth in Grafana. 2) jmap -dump:live,format=b,file=heap.hprof <pid>. 3) Open in Eclipse MAT → Dominator Tree → find the GC root holding the leak chain.
Deadlock: Thread A holds lock1, waits for lock2. Thread B holds lock2, waits for lock1. Both wait forever.
```java
// Classic deadlock — different lock order
// Thread 1: lockA → lockB
// Thread 2: lockB → lockA — DEADLOCK

// Prevention 1: always acquire locks in the same order
synchronized (lockA) {
    synchronized (lockB) {
        doWork();          // both threads use the same order
    }
}

// Prevention 2: tryLock with timeout (ReentrantLock).
// Note: lock1 must be released if lock2 cannot be acquired,
// so each lock gets its own try/finally.
if (lock1.tryLock(1, TimeUnit.SECONDS)) {
    try {
        if (lock2.tryLock(1, TimeUnit.SECONDS)) {
            try { doWork(); } finally { lock2.unlock(); }
        }
    } finally {
        lock1.unlock();
    }
}
```
Detection: jstack <pid> — JVM prints deadlock cycle with thread dump. Also visible in VisualVM/JConsole. ThreadMXBean.findDeadlockedThreads() in code.
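The ThreadMXBean check mentioned above can be sketched as a self-contained snippet (in a healthy JVM it reports no deadlock):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class DeadlockCheck {
    public static void main(String[] args) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads();   // returns null when no deadlock exists
        if (ids == null) {
            System.out.println("no deadlock");
        } else {
            for (ThreadInfo info : mx.getThreadInfo(ids)) {
                System.out.println("deadlocked: " + info.getThreadName());
            }
        }
    }
}
```

In production this check would typically run on a schedule and feed an alert, rather than print to stdout.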
Platform threads = 1:1 mapped to OS threads. ~1MB stack, costly to create. Thread pools required. Blocking wastes an OS thread.
Virtual threads = JVM-managed, ~few KB. Millions can exist. When VT blocks (I/O, sleep), JVM parks it and reuses the carrier OS thread for another VT. Blocking is cheap.
```java
// One virtual thread per task — no pool tuning needed
try (var exec = Executors.newVirtualThreadPerTaskExecutor()) {
    IntStream.range(0, 100_000).forEach(i ->
            exec.submit(() -> callDatabase(i)));
}
```

```properties
# Spring Boot 3.2+ — one line to enable
spring.threads.virtual.enabled=true
```
- Best for: I/O-bound work (DB, HTTP, file). Each request gets its own VT.
- Not for: CPU-intensive computation — use platform threads + ForkJoinPool.
- Pinning trap: synchronized blocks inside VT pin the carrier thread — negates the benefit. Replace with ReentrantLock.
- Observe pinning: -Djdk.tracePinnedThreads=full
- volatile: Visibility only. Every read sees latest write. No atomicity. Use for: flags, state fields read by multiple threads without compound operations.
- synchronized: Visibility + atomicity + mutual exclusion. Use for: compound actions on shared state, methods that must be atomic as a whole.
- AtomicInteger/AtomicReference: Lock-free atomic operations via CAS (Compare-And-Swap). Faster than synchronized for single-variable updates under high contention.
```java
// volatile — correct: simple flag
volatile boolean shutdown = false;

// volatile — WRONG: compound action
volatile int count = 0;
count++;                    // read-modify-write is NOT atomic!

// Correct: use Atomic
AtomicInteger count = new AtomicInteger(0);
count.incrementAndGet();    // atomic CAS — no lock

// Double-checked locking — volatile required
private volatile Singleton instance;

if (instance == null) {
    synchronized (this) {
        if (instance == null) instance = new Singleton();
    }
}
```
- Java 8: Streams, Lambda, Optional, CompletableFuture, LocalDate/Time API
- Java 9: List.of(), Map.of() immutable factories; module system (rarely used in practice)
- Java 11: String.isBlank(), strip(), lines(); HttpClient; var in lambda params
- Java 15: Text blocks — multiline strings for JSON/SQL inline
- Java 16: Records (GA), instanceof pattern matching
- Java 17 (LTS): Sealed classes (switch expressions were already standard since Java 14)
- Java 21 (LTS): Virtual Threads (GA), Sequenced Collections, pattern matching in switch, record patterns
```java
// Record — Java 16+ (replaces Lombok @Data for simple DTOs)
record OrderSummary(Long id, String status, BigDecimal amount) {}

// Sealed interface + pattern switch — Java 21
sealed interface PaymentResult permits Success, Failure {}
record Success(String txId) implements PaymentResult {}
record Failure(String reason) implements PaymentResult {}

String message = switch (result) {
    case Success s -> "Paid: " + s.txId();
    case Failure f -> "Failed: " + f.reason();
};
```
- S — Single Responsibility: UserService doing auth + email + reporting → split into AuthService, NotificationService, ReportService.
- O — Open/Closed: PaymentService with if-else per payment type → PaymentStrategy interface + UpiStrategy, CardStrategy implementations.
- L — Liskov Substitution: Square extends Rectangle but breaks setWidth/setHeight — a square can't be a rectangle behaviorally. Fix: no inheritance, separate classes.
- I — Interface Segregation: Animal interface forces Dog to implement swim(). Fix: separate Swimmable, Flyable interfaces.
- D — Dependency Inversion: OrderService directly instantiates EmailSender. Fix: inject NotificationService interface via Spring @Autowired.
```java
// O — Open/Closed violation vs fix

// Bad
void pay(String type, double amt) {
    if (type.equals("UPI")) { ... }
    else if (type.equals("CARD")) { ... }
}

// Good — add new types without touching existing code
interface PaymentStrategy { void pay(double amount); }
class UpiStrategy implements PaymentStrategy { ... }
class CardStrategy implements PaymentStrategy { ... }
```
Hashtable: All methods synchronized on the whole object. One lock for the entire map — complete bottleneck under concurrency.
ConcurrentHashMap (Java 8+): No global lock. Uses CAS (Compare-And-Swap) for most operations and locks only the head node of an individual bucket on collision. Java 7 used 16 lock segments by default; Java 8 replaced segments with this per-bucket locking, which is even finer-grained.
```java
// Thread-safe compute — no external sync needed
ConcurrentHashMap<String, Long> wordCount = new ConcurrentHashMap<>();
wordCount.merge(word, 1L, Long::sum);                       // atomic merge
wordCount.compute(word, (k, v) -> v == null ? 1L : v + 1);  // atomic compute

// Wrong: NOT atomic even with CHM — check-then-act race
if (!map.containsKey(key)) map.put(key, val);

// Use putIfAbsent instead — atomic check-and-put
map.putIfAbsent(key, val);
```
CQRS (Command Query Responsibility Segregation): Separate the write model (Commands) from the read model (Queries). Commands mutate state; Queries only read. They use different databases optimized for each purpose.
- Write side: PostgreSQL with normalized schema. Handles order creation, updates.
- Read side: Elasticsearch or MongoDB with denormalized documents. Optimized for fast queries.
- Sync: Domain events published to Kafka → event handler updates read store.
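The projection step can be sketched in plain Java — the event and class names are hypothetical, and the in-memory map stands in for the real read store (Elasticsearch/MongoDB) that a Kafka listener would update:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical domain event published on the write side
record OrderCreated(String orderId, String status) {}

// Event handler that maintains the denormalized read model
class OrderReadModelProjector {
    private final Map<String, String> readStore = new HashMap<>(); // stand-in for the read DB

    // Apply the event: upsert the denormalized view of the order
    void on(OrderCreated event) {
        readStore.put(event.orderId(), event.status());
    }

    // Query side reads only from the projection, never the write DB
    String query(String orderId) {
        return readStore.get(orderId);
    }
}
```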
Classic double-submit problem. Solutions:
- Idempotency Key: Client sends a unique key (UUID) per request. Server stores processed keys in Redis with TTL. Duplicate requests return cached response.
- Optimistic Locking (JPA): @Version field — a concurrent update throws OptimisticLockException, and the client retries.
- Database UNIQUE constraint: A unique constraint on (userId, idempotencyKey) ensures only one insert succeeds.
- Distributed Lock (Redisson): Acquire Redis lock before processing, release after.
```java
@Version
private Long version;   // JPA optimistic lock

// Redis idempotency check
String key = "order:" + idempotencyKey;
Boolean isNew = redis.setIfAbsent(key, "processing", 10, MINUTES);
if (Boolean.FALSE.equals(isNew)) return getCachedResponse(key);  // null-safe duplicate check
```
Every incoming request gets a traceId (unique per user request) and a spanId (unique per service hop). Both are propagated via HTTP headers (W3C TraceContext standard).
```text
# Headers propagated between services (W3C TraceContext)
traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
#            version-traceId-parentSpanId-flags
```
In Spring Boot 3.x: Micrometer Tracing auto-propagates traceId via WebClient/RestClient/Kafka headers. Add traceId to MDC for structured logs:
```xml
<!-- logback-spring.xml pattern -->
<pattern>%d [%X{traceId}] [%X{spanId}] %-5level %logger - %msg%n</pattern>
```
Communication options:
- Synchronous: REST (RestClient/WebClient) or gRPC (proto-based, faster)
- Asynchronous: Kafka events (decoupled, resilient)
Authorization between services:
- mTLS: Each service has a client certificate. Mutual authentication — both sides verify identity. Used in service meshes (Istio).
- Service account JWT: Each service gets its own JWT (client_credentials OAuth2 flow) from the IdP. Sent as Bearer token in service calls.
- API Gateway: Gateway validates external tokens. Internal services trust gateway and verify internal JWTs only.
Spring Cloud Config Server: Centralizes externalized configuration. Reads from Git repo. Each microservice fetches config at startup (and on refresh via /actuator/refresh).
API Gateway (Spring Cloud Gateway): Single entry point. Handles: routing, load balancing, authentication, rate limiting, CORS, request/response transformation.
```yaml
# gateway application.yml
spring:
  cloud:
    gateway:
      routes:
        - id: order-service
          uri: lb://ORDER-SERVICE        # load balanced via Eureka
          predicates:
            - Path=/api/orders/**
          filters:
            - AuthFilter                 # custom JWT validation filter
            - name: RequestRateLimiter
              args:
                redis-rate-limiter.replenishRate: 10
                redis-rate-limiter.burstCapacity: 20
```
Traditional CRUD: Store current state. UPDATE overwrites the previous value. No history.
Event Sourcing: Never update. Append immutable events to an event store. Current state = replay of all events. Complete audit trail.
```java
// Traditional: UPDATE orders SET status='SHIPPED'

// Event Sourcing: append event
record OrderShipped(String orderId, Instant shippedAt, String carrier) {}

// State rebuilt by replaying events
Order rebuildState(List<OrderEvent> events) {
    Order order = new Order();
    events.forEach(order::apply);
    return order;
}
```
- Benefits: Full audit trail, temporal queries (state at any point in time), event replay for projections, debugging.
- Combined with CQRS: Event store = write side. Events project to read models (Elasticsearch, Redis).
- Drawback: Eventual consistency, complex replay logic, schema evolution of old events.
Saga manages distributed transactions across microservices without 2PC. Each step publishes an event; on failure, compensating transactions undo previous steps.
Choreography: No central coordinator. Each service reacts to events and publishes its own. Decoupled but hard to track overall state.
Orchestration: Central orchestrator (e.g., Saga orchestrator service or Temporal) tells each service what to do. Easier to track, debug, and add steps. More coupling to orchestrator.
```java
// Orchestration saga — order placement
// 1. Reserve inventory → 2. Charge payment → 3. Ship order
// On payment failure: compensate → release inventory
class OrderSaga {
    void execute(Order order) {
        inventoryService.reserve(order);        // step 1
        try {
            paymentService.charge(order);       // step 2
            shippingService.ship(order);        // step 3
        } catch (PaymentException e) {
            inventoryService.release(order);    // compensate step 1
        }
    }
}
```
This is a backpressure + scaling problem. Layered approach:
- Scale consumers: Add more consumer instances (up to partition count). Each partition is consumed by exactly one consumer in a group.
- Increase partitions: More partitions = more parallelism. Plan this upfront — repartitioning is disruptive.
- Batch processing: Set max.poll.records higher, process in micro-batches instead of one-by-one.
- Async consumer processing: Don't do heavy work in poll loop. Offload to thread pool, commit offsets after processing.
- Back-pressure monitoring: Alert on consumer lag (kafka_consumer_lag metric in Prometheus). Set threshold alerts.
- Dead Letter Topic: Failed messages go to DLT for retry/manual review — don't block the main partition.
```java
@KafkaListener(topics = "orders", containerFactory = "batchFactory")
public void processBatch(List<ConsumerRecord<String, Order>> records) {
    records.parallelStream().forEach(r -> processOrder(r.value()));
}
```
- CompletableFuture with thread pool: Submit each record as a future, join all at the end.
- @Async (Spring): Simple annotation-based async — uses TaskExecutor.
- Virtual Threads (Java 21): One virtual thread per record — massively scalable with simple blocking code.
- Reactive Streams (WebFlux): Flux.fromIterable(records).flatMap(r -> processAsync(r)).
- Spring Batch: For large-scale batch jobs with retry, skip, restart capabilities.
```java
// CompletableFuture batch processing
List<CompletableFuture<Void>> futures = records.stream()
        .map(r -> CompletableFuture.runAsync(() -> process(r), executor))
        .collect(toList());
CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();

// Virtual Threads (Java 21)
try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
    records.forEach(r -> executor.submit(() -> process(r)));
}
```
Three delivery guarantees: at-most-once (fire and forget), at-least-once (retry on failure, may duplicate), exactly-once (no loss, no duplicate). Only exactly-once guarantees correctness for financial transactions.
- Producer side: enable.idempotence=true (retries don't duplicate) + transactional.id for atomic multi-partition writes.
- Consumer side: isolation.level=read_committed — only read committed messages from transactional producers.
- Application side: Idempotent consumer — store processed messageId in DB. Before processing, check if already handled.
```properties
# Producer config for exactly-once
enable.idempotence=true
transactional.id=order-producer-1
acks=all
retries=2147483647

# Consumer config
isolation.level=read_committed
```
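The application-side idempotent consumer can be sketched in plain Java — here an in-memory set stands in for the Redis/DB store of processed message IDs that a real service would use:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Dedup guard: a message is handled at most once even if Kafka redelivers it
class IdempotentConsumer {
    private final Set<String> processed = ConcurrentHashMap.newKeySet();

    // Returns true only the first time a messageId is seen;
    // Set.add is atomic, so concurrent duplicates are also rejected
    boolean processOnce(String messageId, Runnable handler) {
        if (!processed.add(messageId)) {
            return false;   // duplicate delivery — skip
        }
        handler.run();
        return true;
    }
}
```

In production the processed-id check and the business write should share one transaction (or a DB unique constraint), so a crash between the two cannot cause a lost or doubled message.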
Rebalancing = partition reassignment across consumer group members. During rebalance, ALL consumers stop processing (stop-the-world by default).
Triggers: Consumer joins/leaves group, consumer crashes (heartbeat timeout), partitions added, session.timeout.ms exceeded.
Minimize disruption:
- Static group membership: group.instance.id — consumer rejoins with same ID, avoids full rebalance on restart. Ideal for Kubernetes rolling updates.
- Cooperative Sticky Assignor: Only moves partitions that need to move (vs the eager assignor, which revokes ALL). Set partition.assignment.strategy to org.apache.kafka.clients.consumer.CooperativeStickyAssignor.
- Tune heartbeat: heartbeat.interval.ms=3000, session.timeout.ms=45000. Don't set too aggressively.
- Commit offsets before shutdown: Call consumer.commitSync() in shutdown hook.
- Kafka strengths: High throughput (millions/sec), message replay (retention period), multiple independent consumer groups, event streaming, exactly-once semantics, Kafka Streams for stream processing.
- SQS strengths: Fully managed (no broker to maintain), auto-scaling, dead letter queue built-in, visibility timeout for safe processing, FIFO queues for ordering, native AWS integration (Lambda triggers). Max message size 256KB.
- Choose Kafka when: You need replay, multiple consumers reading the same events independently, high throughput, event sourcing, or stream processing (Kafka Streams/Flink).
- Choose SQS when: Simple task queuing on AWS, you want zero ops burden, Lambda-driven processing, or loose coupling between AWS services.
Framework: CACHING → ASYNC → SHARDING → CDN → OPTIMIZE
- Caching: Redis L2 cache for hot data. Cache-aside pattern. TTL based on data volatility.
- Async processing: Kafka for write-heavy operations. Return 202 Accepted immediately, process in background.
- DB read replicas: Route read queries to replicas, writes to primary.
- Connection pooling: HikariCP (Spring Boot default) — tune pool size = (core_count * 2) + effective_spindle_count.
- Horizontal scaling + Load balancer: Keep services stateless and externalize session state to Redis, so any instance can serve any request — no sticky sessions needed.
- CDN: Static assets + edge caching for geographically distributed users.
- DB indexes: Cover query patterns, use EXPLAIN ANALYZE, avoid N+1.
Partitioning: Splitting a table within the same DB server (horizontal partitions by range/hash/list). Transparent to application.
Sharding: Splitting data across multiple DB servers. Each shard is an independent DB. Application must know which shard to query (shard key). More complex but truly distributed.
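A minimal sketch of the shard-key routing the application must do, assuming a fixed shard count (real systems often use consistent hashing so that adding a shard doesn't remap every key):

```java
// Hash-based shard routing: the application picks the shard from the shard key.
class ShardRouter {
    private final int shardCount;

    ShardRouter(int shardCount) { this.shardCount = shardCount; }

    /** Maps a shard key (e.g. userId) to a shard index 0..shardCount-1. */
    public int shardFor(String shardKey) {
        // floorMod avoids negative indices when hashCode() is negative
        return Math.floorMod(shardKey.hashCode(), shardCount);
    }
}
```

The routing must be deterministic — every service instance has to send the same key to the same shard.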
Read-heavy design:
- Multiple read replicas, route reads via read-only datasource
- Aggressive caching (Redis) with write-through or cache-aside
- CQRS — separate read model (Elasticsearch)
- Materialized views for complex queries
Write-heavy design:
- Async writes via Kafka — decouple write pressure from DB
- Batch inserts instead of row-by-row
- Write to append-only log (Event Sourcing)
- Time-series DB for metrics (InfluxDB, TimescaleDB)
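The batch-insert point above is about reducing round trips: group rows into fixed-size chunks and issue one multi-row INSERT (or JDBC executeBatch) per chunk instead of one statement per row. A hedged sketch of the chunking step:

```java
import java.util.ArrayList;
import java.util.List;

// Batching writes: split rows into fixed-size chunks; each chunk becomes
// one multi-row INSERT or one JDBC executeBatch() call.
class BatchWriter {
    public static <T> List<List<T>> chunk(List<T> rows, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < rows.size(); i += batchSize) {
            batches.add(rows.subList(i, Math.min(i + batchSize, rows.size())));
        }
        return batches;
    }
}
```

Batch sizes of a few hundred rows are a common starting point; the right number depends on row width and DB driver limits.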
Rate limiting controls how many requests a client can make in a time window.
Algorithms:
- Token Bucket: N tokens refilled per second. Each request consumes one token. Allows bursts.
- Fixed Window Counter: Count per window (e.g., 100/min). Simple but boundary spike issue.
- Sliding Window Log: Most accurate. Tracks request timestamps in Redis sorted set.
// Spring Cloud Gateway rate limiter (Redis-backed)
@Bean
public KeyResolver userKeyResolver() {
    return exchange -> Mono.just(
        exchange.getRequest().getHeaders().getFirst("X-User-Id")
    );
}
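For intuition, the Token Bucket algorithm itself fits in a few lines. A minimal single-node sketch (a distributed limiter would keep the bucket state in Redis instead of in memory):

```java
// Token bucket: up to `capacity` tokens, refilled at `refillPerSecond`.
// Each request consumes one token; bursts up to `capacity` are allowed.
class TokenBucket {
    private final long capacity;
    private final double refillPerSecond;
    private double tokens;
    private long lastRefillNanos;

    TokenBucket(long capacity, double refillPerSecond) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
        this.tokens = capacity;
        this.lastRefillNanos = System.nanoTime();
    }

    public synchronized boolean tryAcquire() {
        long now = System.nanoTime();
        double elapsedSeconds = (now - lastRefillNanos) / 1_000_000_000.0;
        // refill proportionally to elapsed time, capped at capacity
        tokens = Math.min(capacity, tokens + elapsedSeconds * refillPerSecond);
        lastRefillNanos = now;
        if (tokens >= 1) {
            tokens -= 1;
            return true;
        }
        return false;
    }
}
```

In an interview, the key point is that refill is computed lazily from elapsed time — no background thread needed.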
Liquibase manages DB migrations as versioned changesets. On app startup, it runs pending changesets in order. Applied changesets are tracked in DATABASECHANGELOG table.
# db/changelog/v1.0/create-orders-table.yaml
databaseChangeLog:
  - changeSet:
      id: 1
      author: bheemesh
      changes:
        - createTable:
            tableName: orders
            columns:
              - column:
                  name: id
                  type: BIGINT
                  autoIncrement: true
Zero-downtime migration rules: Only additive changes (new columns with defaults, new tables). Never rename columns directly. Use expand-contract pattern: add new column → migrate data → remove old column (in separate deployments).
Requirements: Shorten URL, redirect, handle ~100M URLs, 10:1 read-write ratio, 99.99% availability.
- Short key generation: Base62 encoding of auto-increment ID or MD5 hash of URL (first 7 chars). Base62 (a-z A-Z 0-9) gives 62^7 = 3.5 trillion unique keys.
- Storage: MySQL/Postgres — schema: (short_key, original_url, created_at, user_id, expiry). Index on short_key.
- Read path (redirect): GET /{key} → Redis cache lookup → DB fallback → 301/302 redirect. Cache hit rate ~99%.
- Write path: POST /shorten → generate key → write to DB → cache it → return short URL.
- Scale: Read replicas for DB. Redis cluster for cache. CDN at edge for globally popular URLs.
- 301 vs 302: 301 = permanent (browser caches, no future tracking). 302 = temporary (every redirect hits your server → analytics possible).
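The Base62 key generation above can be sketched as follows (the alphabet order here is illustrative — any fixed 62-character alphabet works as long as encode and decode agree):

```java
// Base62 encoding of an auto-increment ID for a URL shortener.
class Base62 {
    private static final String ALPHABET =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

    public static String encode(long id) {
        if (id == 0) return "a";
        StringBuilder sb = new StringBuilder();
        while (id > 0) {
            sb.append(ALPHABET.charAt((int) (id % 62))); // least-significant digit first
            id /= 62;
        }
        return sb.reverse().toString();
    }

    public static long decode(String key) {
        long id = 0;
        for (char c : key.toCharArray()) {
            id = id * 62 + ALPHABET.indexOf(c);
        }
        return id;
    }
}
```

Because the ID is unique, no collision handling is needed — unlike the hash-of-URL approach, which must handle collisions on the first 7 chars.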
Requirements: Send notifications across channels (push, email, SMS, in-app). High volume, low latency. Retry on failure. User preferences respected.
- Producer services publish NotificationEvent to Kafka topic
- Notification Router consumes events, checks user preferences (DB), routes to channel-specific topics
- Channel workers (email-worker, sms-worker, push-worker) consume their topic, call third-party providers (SES, Twilio, FCM/APNs)
- Retry: Exponential backoff via Spring Retry or DLT in Kafka. Provider failures → retry up to 3x, then DLT for alert
- Template service: Handlebars/Thymeleaf templates per notification type, per language
- Delivery tracking: Webhook callbacks from providers update notification status in DB → exposed via API
// NotificationEvent on Kafka
record NotificationEvent(
    String userId,
    NotificationType type,
    Map<String, String> payload,
    Instant createdAt
) {}
CAP theorem: A distributed system can guarantee only 2 of these 3 properties simultaneously:
- Consistency (C): Every read gets the most recent write (or an error)
- Availability (A): Every request gets a response (not necessarily latest data)
- Partition Tolerance (P): System continues operating despite network partitions
In practice: network partitions WILL happen. So the real choice is CP vs AP.
- CP systems: HBase, MongoDB (in strong consistency mode), Zookeeper. During partition, refuse to serve stale reads → consistent but potentially unavailable.
- AP systems: Cassandra, CouchDB, DynamoDB (eventual consistency). During partition, serve possibly stale data → always available.
- CA systems: Traditional RDBMS (PostgreSQL, MySQL) — only works if no network partition (single-node or synchronous replication).
Micrometer is Spring Boot's metrics facade — like SLF4J for metrics. It records metrics and exports to various backends (Prometheus, Datadog, CloudWatch).
# pom.xml
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>

# application.yml
management.endpoints.web.exposure.include: prometheus
management.metrics.export.prometheus.enabled: true
// Custom metric example
@Autowired MeterRegistry registry;

Counter orderCounter = Counter.builder("orders.created")
    .tag("type", "premium")
    .register(registry);
orderCounter.increment();
Commonly used: EC2 (compute), ECS/EKS (containers), RDS (managed DB), S3 (storage), SQS (queue), SNS (notifications), API Gateway, Lambda, ElastiCache (Redis), CloudWatch (logs/metrics), IAM (access), Secrets Manager.
RDS Read Replication:
- Primary handles all writes. Read replicas async-replicate from primary.
- Replicas can serve SELECT queries — reduce primary load.
- Replication lag is typically <1 second but can spike under write pressure.
- Multi-AZ = synchronous replication for failover (high availability). Read Replica = async replication for read scaling. These are different!
- Spring: configure separate DataSource beans for read/write routing using AbstractRoutingDataSource.
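The routing key that AbstractRoutingDataSource consults is typically held in a ThreadLocal. A minimal sketch of that context holder (class and method names are illustrative) — determineCurrentLookupKey() in the routing datasource would simply return DbContext.current():

```java
import java.util.function.Supplier;

// Per-thread read/write routing context for AbstractRoutingDataSource.
final class DbContext {
    enum Route { PRIMARY, REPLICA }

    private static final ThreadLocal<Route> CURRENT =
        ThreadLocal.withInitial(() -> Route.PRIMARY);

    public static Route current() { return CURRENT.get(); }

    /** Run a read-only block against a replica, then restore primary routing. */
    public static <T> T onReplica(Supplier<T> query) {
        CURRENT.set(Route.REPLICA);
        try {
            return query.get();
        } finally {
            CURRENT.set(Route.PRIMARY); // writes always go back to the primary
        }
    }
}
```

One caveat worth mentioning in an interview: because of replication lag, a read routed to a replica immediately after a write may not see that write.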
The Sidecar pattern deploys a secondary container alongside your main service container in the same Kubernetes pod. The sidecar handles cross-cutting concerns so the main service doesn't have to.
Observability sidecar examples:
- Envoy proxy: Captures all in/out traffic metrics (requests, latency, errors) transparently
- Fluentd/Fluent Bit: Tails log files, ships to ELK/Loki — no logging SDK needed in app
- Istio (service mesh): Injects Envoy sidecar automatically — gives you mTLS, distributed tracing, and metrics for free
The Four Golden Signals (Google SRE) — these are the most critical metrics for any service:
- Latency: p50, p95, p99 response times. Alert when p99 > SLA threshold (e.g., 500ms).
- Traffic: Requests per second. Sudden drop = possible outage. Sudden spike = potential DDoS.
- Errors: HTTP 5xx rate, exception rate. Alert when error rate > 1% of traffic.
- Saturation: CPU, heap usage, DB connection pool usage, Kafka consumer lag.
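For intuition on the latency signal: p95/p99 are just nearest-rank percentiles over a window of samples. Metrics libraries like Micrometer compute this for you; the sketch below only shows the math:

```java
import java.util.Arrays;

// Nearest-rank percentile over latency samples (how p95/p99 are derived).
class Percentile {
    public static double of(double[] samples, double pct) {
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        // nearest-rank method: rank = ceil(pct/100 * N), 1-based
        int rank = (int) Math.ceil(pct / 100.0 * sorted.length);
        return sorted[Math.max(0, rank - 1)];
    }
}
```

This is also why p99 is far more sensitive to a handful of slow requests than the average is.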
Java-specific alerts (Prometheus + AlertManager):
- jvm_memory_used_bytes / jvm_memory_max_bytes > 0.85 → heap pressure
- hikaricp_connections_pending > 5 → connection pool exhaustion
- kafka_consumer_lag > 10000 → consumer falling behind
- GC pause time > 500ms → GC tuning needed
# Prometheus AlertManager rule example
- alert: HighErrorRate
  expr: rate(http_server_requests_seconds_count{status=~"5.."}[5m])
        / rate(http_server_requests_seconds_count[5m]) > 0.01
  for: 2m
  labels:
    severity: critical
ELK (Elasticsearch + Logstash + Kibana): Full-text search on logs. High storage cost. Complex to operate. Best when you need powerful ad-hoc log querying, log analytics, or regex search across millions of logs.
Loki + Grafana: Label-based log querying (like Prometheus for logs). Much cheaper storage (logs stored compressed, not indexed by content). Best when you already use Grafana for metrics — single pane of glass.
Structured logging setup in Spring Boot:
# application.yml — JSON structured logs (Spring Boot 3.4+)
logging.structured.format.console: ecs
# Or configure via logback-spring.xml
# Output: {"@timestamp":"...","level":"INFO","traceId":"abc","message":"..."}
// Add custom fields to MDC (appears in every log line)
MDC.put("userId", userId);
MDC.put("orderId", orderId);
log.info("Order processed"); // MDC fields auto-included
MDC.clear();
Spring Cloud Function + AWS Lambda Adapter lets you run Spring Boot logic serverlessly. The function is the handler; API Gateway triggers it via HTTP.
// Spring Cloud Function handler
@Bean
public Function<Order, OrderConfirmation> processOrder() {
    return order -> {
        // business logic
        return new OrderConfirmation(order.getId(), "CONFIRMED");
    };
}

# application.properties
spring.cloud.function.definition=processOrder
Cold start problem: JVM startup time (2-3s) causes first Lambda invocation to be slow. Solutions:
- Provisioned concurrency: Keep N Lambdas warm — costs money
- GraalVM native compilation: Spring Boot 3 + native image — startup in <100ms. No JVM overhead.
- SnapStart: AWS takes a snapshot after JVM init and restores invocations from it in ~1s. Available for Java 11 and newer Lambda runtimes.
Redis is an in-memory data structure store used as: cache, session store, rate limiter, pub/sub message broker, distributed lock.
Use cases in my project:
- L2 cache for frequently read data (product catalog, user profile)
- JWT blacklist for logout/token revocation
- Rate limiting counters (API Gateway)
- Distributed locks (Redisson) for idempotency
- Refresh-token and session storage complementing stateless JWT auth
Sensitive data in Redis: Encrypt before storing. Use Redis AUTH password + TLS in transit. Never store raw PII — hash or encrypt it. Set TTL to minimize exposure window.
-- Average salary per department
SELECT department_id, AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id
HAVING AVG(salary) > 50000
ORDER BY avg_salary DESC;
GROUP BY: Collapses multiple rows into summary groups. Used with aggregate functions (SUM, AVG, COUNT). Executed before HAVING.
ORDER BY: Sorts the result set. Applied last — after GROUP BY, HAVING, SELECT. Does not reduce rows, just reorders them.
Execution order: FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY → LIMIT
- Partition Key (PK): Determines which partition stores the item. Must be unique if used alone.
- Sort Key (SK): Enables range queries within a partition. PK+SK combo must be unique.
- GSI (Global Secondary Index): Query on non-key attributes. Has its own PK/SK.
- Single-table design: Store multiple entity types in one table using PK/SK patterns (e.g., PK=USER#123, SK=ORDER#456).
- Capacity modes: On-demand (pay per request) vs Provisioned (set RCU/WCU).
- DynamoDB Streams: Change data capture — trigger Lambda on insert/update/delete.
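A tiny sketch of the single-table key pattern above (helper names are illustrative; matches() mirrors a DynamoDB begins_with key-condition expression on the sort key):

```java
// Single-table design keys: PK/SK strings encode entity type + id,
// so one partition can hold a user item plus all of that user's orders.
class DynamoKeys {
    public static String key(String type, long id) {
        return type + "#" + id;
    }

    /** Mirrors a begins_with(SK, prefix) key condition. */
    public static boolean matches(String sortKey, String prefix) {
        return sortKey.startsWith(prefix);
    }
}
```

Querying PK = USER#123 with begins_with(SK, "ORDER#") then returns all orders for that user in one request.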
N+1 problem: 1 query to fetch N entities, then N additional queries to fetch their associations. 100 orders → 101 queries. Invisible in dev, catastrophic in prod under load.
// N+1 trap — LAZY fetch in a loop
List<Order> orders = orderRepo.findAll(); // 1 query
orders.forEach(o -> System.out.println(o.getCustomer().getName())); // N queries!

// Fix 1: JOIN FETCH
@Query("SELECT o FROM Order o JOIN FETCH o.customer")
List<Order> findAllWithCustomer();

// Fix 2: @EntityGraph
@EntityGraph(attributePaths = "customer")
List<Order> findAll();

// Fix 3: batch fetching (Hibernate)
@BatchSize(size = 50)
private List<OrderItem> items;
Detection: hibernate.show_sql=true in dev. In prod: enable Hibernate statistics, alert when query count per request exceeds threshold. Datasource-proxy logs all queries with call stack.
Cache stampede: A hot cache key expires. Suddenly 1000 concurrent requests all miss the cache simultaneously, all hit the DB to rebuild it, the DB gets overwhelmed.
Prevention strategies:
- Mutex/distributed lock: Only one thread rebuilds cache. Others wait and then read the rebuilt value.
- Probabilistic early expiration: Randomly start refreshing cache slightly before TTL — distributes the rebuild load.
- Stale-while-revalidate: Serve stale cache while asynchronously refreshing in background. No thundering herd, slight staleness acceptable.
- TTL jitter: Add random offset to TTL (e.g., 300s ± 30s) so not all keys expire simultaneously.
// Mutex approach with Redisson
RLock lock = redisson.getLock("rebuild-lock:" + key);
if (lock.tryLock(0, 30, SECONDS)) {
    try {
        Object val = cache.get(key); // double-check after acquiring the lock
        if (val == null) cache.set(key, db.fetch(key), 300, SECONDS);
    } finally {
        lock.unlock();
    }
} else {
    return waitAndGetCache(key);
}
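TTL jitter from the list above is the cheapest mitigation to add — a hedged sketch:

```java
import java.util.concurrent.ThreadLocalRandom;

// TTL jitter: randomize expirations so hot keys don't all expire at once.
class TtlJitter {
    /** Base TTL plus/minus jitter, e.g. 300s ± 30s -> somewhere in [270, 330]. */
    public static long jittered(long baseSeconds, long jitterSeconds) {
        long offset = ThreadLocalRandom.current()
            .nextLong(-jitterSeconds, jitterSeconds + 1);
        return baseSeconds + offset;
    }
}
```

Used at cache-write time: cache.set(key, value, TtlJitter.jittered(300, 30), SECONDS) — keys written together no longer expire together.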
Window functions perform calculations across a set of rows related to the current row — without collapsing rows like GROUP BY does.
-- Rank employees by salary within each department
SELECT name, department_id, salary,
       RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS dept_rank,
       LAG(salary) OVER (PARTITION BY department_id ORDER BY salary DESC) AS prev_salary,
       SUM(salary) OVER (PARTITION BY department_id) AS dept_total,
       salary * 100.0 / SUM(salary) OVER (PARTITION BY department_id) AS pct_of_dept
FROM employees;

-- Running total (cumulative sum)
SELECT order_date, amount,
       SUM(amount) OVER (ORDER BY order_date
                         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total
FROM orders;
Common window functions: ROW_NUMBER() (unique rank), RANK() (ties get same rank, gap after), DENSE_RANK() (no gap), LAG/LEAD (previous/next row value), NTILE() (divide into buckets).
Auto-configuration is the magic behind Spring Boot. How it works:
- @SpringBootApplication includes @EnableAutoConfiguration
- Spring reads META-INF/spring/org.springframework.boot.autoconfigure.AutoConfiguration.imports (Boot 3.x) to find all auto-configuration classes
- Each class is guarded by conditions such as @ConditionalOnClass and @ConditionalOnMissingBean
- Only conditions that evaluate to true result in beans being registered
// Example: How DataSource auto-config works
@Configuration
@ConditionalOnClass(DataSource.class)        // only if JDBC is on the classpath
@ConditionalOnMissingBean(DataSource.class)  // only if no custom bean is defined
@EnableConfigurationProperties(DataSourceProperties.class)
public class DataSourceAutoConfiguration { ... }
Industry standard: 200–400 lines per PR. Research shows review quality degrades sharply above 400 lines — reviewers stop carefully reading and just approve.
Best practices:
- One PR = one concern (single feature, single bug fix)
- Large features = stacked PRs (feature branch → intermediate → main)
- Generated code (migrations, DTOs) can exceed limits — annotate clearly
- Include: what changed, why, how to test, screenshots if UI
- Saga Pattern: Manage distributed transactions across microservices. Each step publishes an event; compensating transactions on failure.
- Outbox Pattern: Atomically write to DB + publish event. Avoid dual-write problem. Write to outbox table in same transaction; relay reads and publishes to Kafka.
- Circuit Breaker (Resilience4j): Stop calling a failing service. States: CLOSED → OPEN (on failure threshold) → HALF_OPEN (test recovery).
- Builder: Complex object creation (RequestDTO, configuration objects).
- Strategy: Payment processing — different strategies for UPI/card/wallet without if-else chains.
- Factory: Notification service — EmailNotification vs SMSNotification via NotificationFactory.
- Decorator: Layered caching — CachingOrderRepository wraps JpaOrderRepository.
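The Strategy example above can be sketched in plain Java — a lookup map replaces the if-else chain (the payment classes here are illustrative stubs):

```java
import java.util.Map;

// Strategy: each payment method is its own class; a map replaces if-else chains.
interface PaymentStrategy {
    String pay(long amount);
}

class UpiPayment implements PaymentStrategy {
    public String pay(long amount) { return "UPI:" + amount; }
}

class CardPayment implements PaymentStrategy {
    public String pay(long amount) { return "CARD:" + amount; }
}

class PaymentProcessor {
    private final Map<String, PaymentStrategy> strategies = Map.of(
        "upi", new UpiPayment(),
        "card", new CardPayment()
    );

    public String process(String method, long amount) {
        PaymentStrategy s = strategies.get(method);
        if (s == null) throw new IllegalArgumentException("Unknown method: " + method);
        return s.pay(amount); // dispatch without if-else
    }
}
```

In Spring, the same shape falls out naturally by injecting a Map<String, PaymentStrategy> of all strategy beans keyed by bean name.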
Spring Boot 2.x → 3.x migration steps:
- Upgrade to Java 17+ first (Spring Boot 3 requires minimum Java 17)
- Use OpenRewrite migration recipes: mvn rewrite:run -Drewrite.activeRecipes=org.openrewrite.java.spring.boot3.UpgradeSpringBoot_3_0
- javax.* → jakarta.* package rename (biggest breaking change)
- Spring Security config: SecurityFilterChain bean replaces WebSecurityConfigurerAdapter
- Actuator endpoint changes — verify /actuator paths
- Test with feature flags enabled on a canary environment first
// Controller layer: mock the Service, not the Repository
@WebMvcTest(OrderController.class) // loads only the web layer
class OrderControllerTest {

    @Autowired MockMvc mockMvc;
    @MockBean OrderService orderService; // mock the service

    @Test
    void shouldCreateOrder() throws Exception {
        Order mockOrder = new Order(1L, "PENDING");
        given(orderService.create(any())).willReturn(mockOrder);

        mockMvc.perform(post("/api/orders")
                .contentType(APPLICATION_JSON)
                .content("{\"amount\": 500}"))
            .andExpect(status().isCreated())
            .andExpect(jsonPath("$.status").value("PENDING"));
    }
}
Integration tests spin up real infrastructure (DB, Redis, Kafka) using TestContainers. Docker containers start before tests, torn down after. Tests run against real implementations — not mocks.
@SpringBootTest
@Testcontainers
class OrderServiceIntegrationTest {

    @Container
    static PostgreSQLContainer<?> postgres = new PostgreSQLContainer<>("postgres:15");

    @Container
    static GenericContainer<?> redis = new GenericContainer<>("redis:7").withExposedPorts(6379);

    @DynamicPropertySource
    static void configureProperties(DynamicPropertyRegistry registry) {
        registry.add("spring.datasource.url", postgres::getJdbcUrl);
        registry.add("spring.data.redis.host", redis::getHost);
        registry.add("spring.data.redis.port", redis::getFirstMappedPort);
    }

    @Test
    void shouldCreateOrderAndPersistToDb() {
        // test against real Postgres + Redis
    }
}
Breaking API changes in a microservices world hurt dependent consumers. Strategies to maintain backward compatibility:
- URI versioning: /api/v1/orders vs /api/v2/orders. Simple, explicit. Old clients continue working on v1.
- Header versioning: Accept: application/vnd.myapp.v2+json. Cleaner URLs but harder to test in browser.
- Additive changes only: New fields are added, never removed. Consumers ignore unknown fields (Jackson default).
- Deprecation header: Sunset: Sat, 31 Dec 2025 23:59:59 GMT — tells consumers when v1 will be removed.
- Consumer-driven contracts (Pact): Each consumer defines a contract (expected request/response). Provider runs Pact tests to verify it still satisfies all contracts before release.
// Jackson — add a new field safely (no breaking change)
public class OrderResponse {
    private Long id;
    private String status;

    @JsonInclude(NON_NULL)
    private String trackingCode; // new field — old clients ignore it
}
Resilience4j provides fault tolerance primitives for microservices. All composable via annotations.
// Circuit Breaker — stop calling a failing service
@CircuitBreaker(name = "paymentService", fallbackMethod = "paymentFallback")
@Retry(name = "paymentService")
public PaymentResult processPayment(PaymentRequest req) {
    return paymentClient.pay(req);
}

public PaymentResult paymentFallback(PaymentRequest req, Exception ex) {
    log.warn("Payment service down, returning pending result", ex);
    return PaymentResult.pending(req.getOrderId());
}
# application.yml — CB config
resilience4j.circuitbreaker.instances.paymentService:
  slidingWindowSize: 10
  failureRateThreshold: 50               # open after 50% failures
  waitDurationInOpenState: 10s           # wait before half-open
  permittedNumberOfCallsInHalfOpenState: 3
resilience4j.retry.instances.paymentService:
  maxAttempts: 3
  waitDuration: 500ms
  retryExceptions: [java.io.IOException, java.util.concurrent.TimeoutException]
Bulkhead: Limit concurrent calls to a service — prevents one slow dependency from exhausting all threads and taking down everything else.
Load testing tools:
- JMeter: GUI and CLI based. HTTP load testing. Thread groups simulate concurrent users. Assertions on response time.
- Gatling: Scala DSL, CI-friendly, beautiful HTML reports. Better for complex scenarios.
- k6: JS scripting, cloud-native, excellent for microservices API testing.
Profiling in production (low overhead):
- Async Profiler: CPU + memory sampling. Can attach to running JVM. Output: flamegraph. Zero instrumentation overhead.
- JFR (Java Flight Recorder): Built into JVM since Java 11. Continuous profiling with <1% overhead. View in JDK Mission Control.
- VisualVM / JConsole: Heap, threads, GC in dev/staging. Too invasive for prod.
# Attach async-profiler to a running JVM
./profiler.sh -d 30 -f flamegraph.html <pid>

# Enable JFR via production JVM args
-XX:StartFlightRecording=duration=60s,filename=recording.jfr
Spring AOP intercepts method calls by wrapping beans in proxies at startup. Two proxy types:
- JDK Dynamic Proxy: Bean must implement an interface. Proxy implements the same interface. Intercepts interface method calls.
- CGLIB Proxy: Subclasses the target class. Used when bean has no interface (Spring Boot default). Cannot proxy final classes or final methods.
// Self-invocation bug — THIS IS THE #1 AOP TRAP
@Service
public class OrderService {

    @Transactional
    public void createOrder(Order o) {
        saveOrder(o);
        sendNotification(o); // calls internal method
    }

    @Transactional(propagation = REQUIRES_NEW)
    public void sendNotification(Order o) {
        // THIS @Transactional is IGNORED!
        // Because: this.sendNotification() bypasses the proxy
    }
}

// Fix: inject a self reference or extract to a separate bean
@Autowired
private OrderService self; // injects the proxy

self.sendNotification(o); // goes through the proxy → AOP applies
Propagation levels (most important):
- REQUIRED (default): Join existing tx if present, else create new one.
- REQUIRES_NEW: Always create a new tx. Suspend existing one. Used for audit logging — must persist even if outer tx rolls back.
- SUPPORTS: Join tx if exists, else run non-transactionally.
- NOT_SUPPORTED: Suspend active tx, run non-transactionally. For read-only operations on unreliable resources.
- NEVER: Throw exception if a transaction exists.
- NESTED: Savepoint within existing tx. Partial rollback possible.
Isolation levels:
- READ_COMMITTED (Postgres default): Cannot read uncommitted data. Prevents dirty reads.
- REPEATABLE_READ: Same row read twice = same value. Prevents non-repeatable reads.
- SERIALIZABLE: Full isolation. Prevents phantom reads. Highest contention.
// Common trap: @Transactional only works on public methods
@Transactional
protected void internalMethod() {
    // IGNORED — not public!
}

// Common trap: RuntimeException rolls back, checked exception does NOT
@Transactional
public void process() throws IOException {
    // IOException thrown → transaction COMMITS (not rolled back!)
}
// Fix: @Transactional(rollbackFor = IOException.class)
@Aspect
@Component
public class PerformanceAspect {

    // Pointcut: all methods in the service layer
    @Around("execution(* com.app.service.*.*(..))")
    public Object logExecutionTime(ProceedingJoinPoint pjp) throws Throwable {
        long start = System.currentTimeMillis();
        String method = pjp.getSignature().toShortString();
        try {
            Object result = pjp.proceed();
            long elapsed = System.currentTimeMillis() - start;
            if (elapsed > 500) log.warn("SLOW method {} took {}ms", method, elapsed);
            return result;
        } catch (Exception e) {
            log.error("Exception in {} : {}", method, e.getMessage());
            throw e;
        }
    }
}

// Custom annotation-based pointcut
@Target(METHOD)
@Retention(RUNTIME)
public @interface Audited {}

@Around("@annotation(com.app.Audited)")
public Object audit(ProceedingJoinPoint pjp) throws Throwable {
    auditLog.record(pjp.getSignature().getName(), getCurrentUser());
    return pjp.proceed();
}
Full Spring Bean lifecycle in order:
- 1. Instantiation: Constructor called (or factory method)
- 2. Dependency injection: @Autowired fields/setters injected
- 3. BeanNameAware / BeanFactoryAware: Aware callbacks if implemented
- 4. BeanPostProcessor.postProcessBeforeInitialization(): All registered BPPs run (this is where AOP proxies are created)
- 5. @PostConstruct / InitializingBean.afterPropertiesSet(): Init logic
- 6. BeanPostProcessor.postProcessAfterInitialization(): Final processing
- 7. Bean is ready for use
- 8. @PreDestroy / DisposableBean.destroy(): On context close
@Component
public class CacheLoader {

    @PostConstruct
    public void init() {
        // runs after all dependencies are injected — safe to use @Autowired fields
        loadInitialCacheData();
    }

    @PreDestroy
    public void cleanup() {
        // runs on application shutdown — close connections, flush data
        cache.clear();
    }
}
- useState: Local component state. Re-renders component on state change.
- useEffect: Side effects (API calls, subscriptions, timers). Runs after render. Cleanup function handles unmount.
- useCallback: Memoize a function reference. Prevents child re-renders when parent re-renders but function hasn't changed.
- useMemo: Memoize an expensive computed value. Only recomputes when dependencies change.
- useRef: Mutable ref that doesn't trigger re-render. DOM access or storing interval IDs.
- useContext: Consume React context without prop drilling.
// useEffect with cleanup
useEffect(() => {
  const sub = webSocket.subscribe(userId, setMessages);
  return () => sub.unsubscribe(); // cleanup on unmount
}, [userId]); // re-run only when userId changes

// useCallback — stable function reference for a child component
const handleSubmit = useCallback((data) => {
  submitOrder(data);
}, [submitOrder]); // only recreated if submitOrder changes
React re-renders a component when its state, props, or context changes. But it also re-renders ALL children by default — even if their props didn't change.
Causes of unnecessary re-renders:
- Inline object/array props: style={{color:'red'}} creates a new object on every render
- Inline callback props: onClick={() => doSomething()} — new function reference every render
- Context: any context update re-renders all consumers, even if relevant value didn't change
- Parent re-render cascading to all children unconditionally
Fixes:
- React.memo: Wrap child component — only re-render if props actually changed
- useCallback: Stable function references passed as props
- useMemo: Stable object/array references passed as props
- Context split: Separate frequently-changing context from stable context
- React DevTools Profiler: Identify which components re-render and why
- React Context: Good for low-frequency global state (theme, auth user, locale). NOT good for high-frequency updates — any context update re-renders all consumers.
- Redux Toolkit: Industry standard for complex global state with many slices, time-travel debugging, middleware (RTK Query for API caching). Verbose but powerful.
- Zustand: Lightweight (~1KB). Simple API. No providers needed. Selective subscription — components only re-render when their specific slice changes. Great for medium complexity apps.
- React Query / TanStack Query: Server state management (API calls, caching, background refetching). Not a replacement for client state.
// Zustand — simple global store
const useOrderStore = create((set) => ({
  orders: [],
  addOrder: (order) =>
    set((state) => ({ orders: [...state.orders, order] })),
}));

// Component — only re-renders when orders changes
const orders = useOrderStore((state) => state.orders);
CORS (Cross-Origin Resource Sharing): Browser blocks JS from making requests to a different origin (domain/port/protocol) than the page itself. This is enforced by the browser — not the server or network.
When React app at localhost:3000 calls Spring Boot at localhost:8080 — different ports = different origin → CORS error.
// Spring Boot 3 — global CORS config
@Bean
public WebMvcConfigurer corsConfigurer() {
    return new WebMvcConfigurer() {
        @Override
        public void addCorsMappings(CorsRegistry registry) {
            registry.addMapping("/api/**")
                .allowedOrigins("https://myapp.com", "http://localhost:3000")
                .allowedMethods("GET", "POST", "PUT", "DELETE")
                .allowedHeaders("*")
                .allowCredentials(true)
                .maxAge(3600);
        }
    };
}

// Or per-controller
@CrossOrigin(origins = "https://myapp.com")
@RestController
public class OrderController { ... }