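In 2023 we moved a Java 17 service from virtual machines to containers (Docker + K8s). The first image was 1.2GB and took 90 seconds to start; after optimization the image is 180MB and startup takes 15 seconds. We hit five pitfalls along the way: the JVM not recognizing container memory, an oversized base image, build cache invalidation, slow startup, and inaccurate health checks. This post is a retrospective on the full set of JVM containerization optimizations.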
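Background
Service: order API (Spring Boot 2.7 + JDK 17)
Old deployment: CentOS 7 virtual machines, 32 cores / 64GB, 3 nodes
New deployment: K8s 1.27, 4 cores / 8GB per Pod, 10 replicas
Migration goals:
- Image size < 500MB
- Startup time < 30s (for fast rolling updates)
- Right-sized resources (no wasted CPU or memory)
- Accurate health checks (no false kills)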
First-version Dockerfile:
FROM openjdk:17
WORKDIR /app
COPY target/app.jar /app/app.jar
ENTRYPOINT ["java", "-jar", "/app/app.jar"]
Results:
- 1.2GB image (openjdk:17 is too large)
- 90s startup (slow Spring initialization plus badly tuned GC)
- Pods frequently OOMKilled (the JVM's heap sizing did not respect the cgroup limit)
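Pitfall 1: The JVM does not recognize container memory
The Pod limit is 8GB. With no -Xmx set, the default max heap is 1/4 of the memory the JVM detects, and a JVM that is not container-aware detects the host's memory.
Host with 256GB → JVM sizes the heap at 64GB → exceeds the 8GB Pod limit → OOMKilled
-XX:+UseContainerSupport arrived in JDK 10, where it is on by default, and was backported to JDK 8u191 (8u131 only had the experimental -XX:+UseCGroupMemoryLimitForHeap)
JDK 17 reads the cgroup limit correctly by default
We still set the heap percentages explicitly to avoid ambiguity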
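# Add JVM flags in the Dockerfile
# (comments are not allowed inside the JSON array, so the flags are explained here)
#   MaxRAMPercentage=75       max heap = 75% of the Pod memory limit
#   InitialRAMPercentage=50   initial heap = 50%
#   +UseContainerSupport      already the default, stated explicitly
#   +ExitOnOutOfMemoryError   exit on OOM so that K8s restarts the Pod
ENTRYPOINT ["java", \
  "-XX:MaxRAMPercentage=75", \
  "-XX:InitialRAMPercentage=50", \
  "-XX:+UseContainerSupport", \
  "-XX:+ExitOnOutOfMemoryError", \
  "-XX:+HeapDumpOnOutOfMemoryError", \
  "-XX:HeapDumpPath=/tmp/heapdump.hprof", \
  "-jar", "/app/app.jar"]
# Better than a hard-coded -Xmx: the heap adapts when the Pod limit changes
# e.g. 8GB Pod → 6GB heap; 4GB Pod → 3GB heap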
# Verify (jcmd comes from the jdk.jcmd module, so a jlink-trimmed runtime needs that module added)
$ kubectl exec pod -- jcmd 1 VM.flags | grep -i ram
-XX:MaxRAMPercentage=75.000000
-XX:InitialRAMPercentage=50.000000
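To make the percentage arithmetic concrete, here is a minimal Python sketch of how a heap ceiling follows from the container limit. The function name `max_heap_bytes` is made up for illustration, and the real JVM also aligns the heap and has special cases for very small containers, so treat the results as approximations.

```python
GIB = 1024 ** 3

def max_heap_bytes(container_limit_bytes: int, max_ram_percentage: float = 25.0) -> int:
    """Approximate the max heap derived from a container memory limit.

    The 25.0 default mirrors the JVM's default MaxRAMPercentage (1/4 of memory).
    """
    return int(container_limit_bytes * max_ram_percentage / 100)

# The two cases quoted in the Dockerfile comments: 8GB and 4GB Pods at 75%.
for limit_gib in (8, 4):
    heap = max_heap_bytes(limit_gib * GIB, 75)
    non_heap = limit_gib * GIB - heap
    print(f"limit {limit_gib} GiB -> heap {heap / GIB:.1f} GiB, "
          f"{non_heap / GIB:.1f} GiB left for metaspace, threads, and buffers")
```

Whatever the heap does not take has to cover all non-heap memory, which is why the percentage should leave some headroom rather than approach 100.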
Pitfall 2: The base image is too large
# Image optimization: multi-stage build + JRE only + Alpine
# Bad: openjdk:17 (700MB)
FROM openjdk:17
# OK: eclipse-temurin:17-jre (300MB)
FROM eclipse-temurin:17-jre
# Good: eclipse-temurin:17-jre-alpine (180MB)
FROM eclipse-temurin:17-jre-alpine
# Best: a custom JRE trimmed with jlink (80MB)
FROM eclipse-temurin:17-jdk AS builder
WORKDIR /app
COPY target/*.jar app.jar
RUN jdeps --print-module-deps --multi-release 17 --recursive --ignore-missing-deps app.jar > deps.txt
RUN jlink \
--add-modules $(cat deps.txt) \
--strip-debug \
--no-man-pages \
--no-header-files \
--compress=2 \
--output /opt/jre-min
FROM debian:bookworm-slim
COPY --from=builder /opt/jre-min /opt/jre
COPY --from=builder /app/app.jar /app/app.jar
ENV PATH=/opt/jre/bin:$PATH
ENTRYPOINT ["java", "-XX:MaxRAMPercentage=75", "-jar", "/app/app.jar"]
# Image sizes:
# openjdk:17 → 1.2GB
# temurin:17-jre → 350MB
# temurin:17-jre-alpine → 180MB
# custom jlink JRE → 90MB
# distroless + jlink → 60MB
Multi-stage Dockerfile
# Full multi-stage version
FROM maven:3.9.6-eclipse-temurin-17 AS builder
WORKDIR /build
# Cache dependencies (code changes do not trigger a re-download)
COPY pom.xml .
RUN mvn dependency:go-offline -B
COPY src ./src
RUN mvn package -DskipTests -B
# jlink stage
FROM eclipse-temurin:17-jdk AS jlink
COPY --from=builder /build/target/*.jar /tmp/app.jar
# Unpack the nested libraries and hand them to jdeps as a class path
# (a colon-joined list passed as a plain argument is not a valid jdeps input)
RUN cd /tmp && \
    jar -xf app.jar BOOT-INF/lib && \
    jdeps --multi-release 17 --print-module-deps --ignore-missing-deps --recursive \
      --class-path 'BOOT-INF/lib/*' /tmp/app.jar > /tmp/deps.txt
RUN jlink \
    --add-modules java.base,java.logging,java.naming,java.management,java.security.jgss,java.sql,java.net.http,java.instrument,jdk.crypto.ec,jdk.unsupported,$(cat /tmp/deps.txt) \
    --strip-debug --no-man-pages --no-header-files --compress=2 \
    --output /opt/jre
# Final runtime image
FROM gcr.io/distroless/java-base-debian12:nonroot
COPY --from=jlink /opt/jre /opt/jre
COPY --from=builder /build/target/*.jar /app/app.jar
WORKDIR /app
USER nonroot
ENV PATH="/opt/jre/bin:$PATH" \
JAVA_TOOL_OPTIONS="-XX:MaxRAMPercentage=75 \
-XX:InitialRAMPercentage=50 \
-XX:+UseG1GC \
-XX:+ExitOnOutOfMemoryError \
-XX:+HeapDumpOnOutOfMemoryError \
-XX:HeapDumpPath=/tmp \
-Djava.security.egd=file:/dev/./urandom"
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "/app/app.jar"]
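Pitfall 3: Build cache invalidation
# Wrong: every source change re-downloads all dependencies
COPY . /build
RUN mvn package
# Right: copy pom.xml first to cache dependencies, then copy the source
COPY pom.xml /build/
# This layer stays cached until pom.xml changes
RUN mvn dependency:go-offline -B
# Source changes do not invalidate the layers above
COPY src /build/src
RUN mvn package -DskipTests
# Use the BuildKit cache in CI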
# .github/workflows/build.yml
- uses: docker/setup-buildx-action@v3
- uses: docker/build-push-action@v5
  with:
    context: .
    cache-from: type=registry,ref=registry/app:cache
    cache-to: type=registry,ref=registry/app:cache,mode=max
    push: true
    tags: registry/app:${{ github.sha }}
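The ordering rule is easier to see with a toy model of layer caching. The sketch below is not how Docker is implemented; it only assumes that a layer's cache key depends on its parent layer, the instruction text, and the content of the files it copies. All names and file contents are illustrative.

```python
import hashlib

def layer_keys(steps, files):
    """Return one cache key per step.

    steps: list of (instruction, [input file names]) tuples.
    files: dict mapping file name -> content.
    """
    keys, parent = [], ""
    for instruction, inputs in steps:
        digest = hashlib.sha256()
        digest.update(parent.encode())
        digest.update(instruction.encode())
        for name in inputs:
            digest.update(files[name].encode())
        parent = digest.hexdigest()
        keys.append(parent)
    return keys

def reused_layers(steps, before, after):
    """Count the leading layers whose key is identical in both builds."""
    count = 0
    for old, new in zip(layer_keys(steps, before), layer_keys(steps, after)):
        if old != new:
            break
        count += 1
    return count

NAIVE = [("COPY . /build", ["pom.xml", "src"]), ("RUN mvn package", [])]
LAYERED = [
    ("COPY pom.xml /build/", ["pom.xml"]),
    ("RUN mvn dependency:go-offline -B", []),
    ("COPY src /build/src", ["src"]),
    ("RUN mvn package -DskipTests", []),
]

# Same pom.xml, one source edit between the two builds.
before = {"pom.xml": "<deps v1>", "src": "class A {}"}
after = {"pom.xml": "<deps v1>", "src": "class A { int x; }"}

print("naive layers reused:", reused_layers(NAIVE, before, after))
print("layered layers reused:", reused_layers(LAYERED, before, after))
```

In the layered variant the dependency download sits above the first changed input, so a source-only edit never reaches it.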
Pitfall 4: Slow startup
Where the 60-90s of Spring Boot startup goes:
1. Class loading
2. Bean initialization
3. AspectJ proxy creation
4. Hibernate metadata scanning
Optimization options:
1. AOT compilation (Spring Native)
2. CDS (Class Data Sharing)
3. CRaC (Coordinated Restore at Checkpoint)
4. Lazy initialization
5. Removing unused starters
# 1. CDS: AppCDS archives class data
$ java -XX:ArchiveClassesAtExit=app-cds.jsa -jar app.jar &
$ sleep 30 && kill %1   # let the app start once; the archive is written on exit
$ java -XX:SharedArchiveFile=app-cds.jsa -jar app.jar
# Startup time: 60s → 30s
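# 2. Default CDS and Project Leyden
# JDK 12+ ships a default CDS archive for JDK classes, and JDK 19+ can create the application archive automatically with -XX:+AutoCreateSharedArchive
# Project Leyden is not part of JDK 21; its first feature (JEP 483, ahead-of-time class loading and linking) shipped in JDK 24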
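# 3. Spring AOT (Native Image)
# Note: this command is the Spring Boot 3 workflow; on Spring Boot 2.7 it requires the experimental Spring Native project
$ mvn -Pnative native:compile
# Produces a native binary: 0.2s startup, about 1/5 of the memory
# But compilation is slow (15-30 minutes) and reflection needs explicit configuration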
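# 4. Lazy initialization
spring:
  main:
    lazy-initialization: true   # initialize beans on demand
# 5. Switch from Tomcat to Undertow (lighter)
spring:
  main:
    web-application-type: servlet
# The switch itself happens in the build file: exclude the Tomcat starter and add the Undertow starter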
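Startup numbers like the ones in this section are only comparable if they are measured the same way each time. Here is a minimal sketch of one way to do that, assuming the service exposes the Actuator readiness endpoint on localhost:8080 (an assumption, adjust to your setup). `http_ready` and `time_until_ready` are illustrative helper names.

```python
import time
import urllib.request

def http_ready(url: str = "http://localhost:8080/actuator/health/readiness") -> bool:
    """Return True once the endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=2) as response:
            return response.status == 200
    except OSError:
        # Connection refused, timeout, or an error status: not ready yet.
        return False

def time_until_ready(is_ready=http_ready, timeout_s=180.0, interval_s=1.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll is_ready() and return the seconds elapsed until it first succeeds.

    Returns None if the timeout passes first. clock and sleep are injectable
    so the loop can be exercised without waiting in real time.
    """
    start = clock()
    while clock() - start < timeout_s:
        if is_ready():
            return clock() - start
        sleep(interval_s)
    return None

# Typical use: start the JVM, then immediately call time_until_ready()
# and compare the result across runs with and without each optimization.
```

Polling from outside the process measures what a rolling update actually waits for, which is usually longer than the "Started in N seconds" line in the application log.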
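Pitfall 5: Inaccurate health checks
# Wrong: slow startup combined with a strict liveness probe
livenessProbe:
  httpGet: { path: /health, port: 8080 }
  initialDelaySeconds: 10   # too short, startup has not finished
  periodSeconds: 5
  timeoutSeconds: 2
  failureThreshold: 3       # 3 consecutive failures → kill, roughly 25s after container start
# Result: Spring Boot needs 30s or more to start, K8s keeps killing it, and the Pod ends up in a crash loop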
# Right: three probes, each with its own job
startupProbe:               # startup only
  httpGet: { path: /actuator/health/liveness, port: 8080 }
  initialDelaySeconds: 10
  periodSeconds: 5
  timeoutSeconds: 3
  failureThreshold: 30      # 30 × 5s = up to 150s of probing after the initial delay
livenessProbe:              # liveness check once started
  httpGet: { path: /actuator/health/liveness, port: 8080 }
  periodSeconds: 10
  timeoutSeconds: 3
  failureThreshold: 3
readinessProbe:             # ready for traffic
  httpGet: { path: /actuator/health/readiness, port: 8080 }
  initialDelaySeconds: 15
  periodSeconds: 5
  timeoutSeconds: 3
  failureThreshold: 2
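The probe settings reduce to a small piece of arithmetic. The sketch below uses a rough upper bound (initial delay plus period times failure threshold) and ignores probe timeouts and kubelet scheduling jitter; the function names are illustrative.

```python
def probe_budget_seconds(initial_delay: int, period: int, failure_threshold: int) -> int:
    """Rough upper bound on how long a container has before the probe gives up."""
    return initial_delay + period * failure_threshold

def survives(startup_seconds: int, initial_delay: int, period: int,
             failure_threshold: int) -> bool:
    """True if the application finishes starting within the probe budget."""
    return startup_seconds <= probe_budget_seconds(initial_delay, period, failure_threshold)

# Values taken from the two probe configurations discussed in this section.
strict_liveness = dict(initial_delay=10, period=5, failure_threshold=3)
startup_probe = dict(initial_delay=10, period=5, failure_threshold=30)

for name, probe in (("strict liveness", strict_liveness), ("startup probe", startup_probe)):
    print(f"{name}: budget {probe_budget_seconds(**probe)}s, "
          f"survives a 30s startup: {survives(30, **probe)}")
```

Sizing the startup budget against the slowest startup you have observed, rather than the typical one, keeps a cold node or a busy cluster from triggering restarts.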
# Dedicated liveness / readiness groups in Spring Boot
management:
  endpoint:
    health:
      probes:
        enabled: true
      show-details: always
      group:
        liveness:
          include: livenessState   # keep external dependencies out: with db here, a database outage would restart every Pod
        readiness:
          include: readinessState, db, redis, kafka
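Which indicators belong in which group matters because a group reports the most severe status among its members. The sketch below models that aggregation using Spring Boot's default severity order and compares a liveness group with and without the db indicator during a database outage. The function and variable names are illustrative.

```python
# Most severe first, matching Spring Boot's default status order.
STATUS_ORDER = ["DOWN", "OUT_OF_SERVICE", "UP", "UNKNOWN"]

def group_status(members, indicator_status):
    """Return the most severe status among the group's member indicators."""
    return min((indicator_status[m] for m in members), key=STATUS_ORDER.index)

# Hypothetical snapshot: the application is healthy, the database is not.
db_outage = {
    "livenessState": "UP",
    "readinessState": "UP",
    "db": "DOWN",
    "redis": "UP",
    "kafka": "UP",
}

print("liveness with db:   ", group_status(["livenessState", "db"], db_outage))
print("liveness without db:", group_status(["livenessState"], db_outage))
print("readiness:          ",
      group_status(["readinessState", "db", "redis", "kafka"], db_outage))
```

A failing readiness group only takes the Pod out of rotation, while a failing liveness group restarts it, and a restart does not repair an external database.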
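JVM Pod resource configuration
resources:
  requests:
    cpu: 2          # 2 cores guaranteed
    memory: 6Gi     # 6GB guaranteed
  limits:
    cpu: 4          # at most 4 cores
    memory: 6Gi     # 6GB cap (JVM heap at 75% = 4.5GB)
# Notes:
# 1. memory limit = memory request (the node never overcommits this Pod's memory, which avoids OOM kills under node memory pressure)
# 2. The cpu request drives scheduling (no placement on a node without enough free CPU)
# 3. The cpu limit is the ceiling (usage above it is throttled)
# 4. Inside the container the JVM derives its available core count from cpu_quota / cpu_period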
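Note 4 can be illustrated with a short calculation. The sketch assumes the usual mapping of a K8s CPU limit onto a CFS quota with a 100ms period, and rounds the ratio up the way HotSpot does. Some JDK builds also take CPU shares into account, which is ignored here. `jvm_cpu_count` is an illustrative name.

```python
import math

def jvm_cpu_count(cfs_quota_us: int, cfs_period_us: int, host_cpus: int) -> int:
    """Approximate the processor count a container-aware JVM reports.

    A quota of -1 (or 0) means no CPU limit is set, so the host count applies.
    """
    if cfs_quota_us <= 0:
        return host_cpus
    return min(host_cpus, math.ceil(cfs_quota_us / cfs_period_us))

# cpu limit 4 -> quota 400000us per 100000us period on a hypothetical 32-core node.
print(jvm_cpu_count(400_000, 100_000, host_cpus=32))
# cpu limit 2500m rounds up to 3 processors.
print(jvm_cpu_count(250_000, 100_000, host_cpus=32))
# No limit: the JVM sees every core on the node.
print(jvm_cpu_count(-1, 100_000, host_cpus=32))
```

The reported count feeds the default sizes of GC threads and common thread pools, so a missing CPU limit on a large node can create far more threads than the Pod's request can serve.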
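JVM tuning (K8s environment)
# Recommended JDK 17 settings for K8s
JAVA_TOOL_OPTIONS="
-XX:MaxRAMPercentage=75
-XX:InitialRAMPercentage=50
-XX:+UseG1GC
-XX:MaxGCPauseMillis=100
-XX:+ExitOnOutOfMemoryError
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/tmp/heapdump.hprof
-XX:+UnlockDiagnosticVMOptions
-XX:+LogVMOutput
-Xlog:gc*:file=/tmp/gc.log:time,uptime,level,tags
-Dsun.net.inetaddr.ttl=60
-Dsun.net.inetaddr.negative.ttl=10
-Djava.security.egd=file:/dev/./urandom
-Dfile.encoding=UTF-8
"
# DNS cache TTLs use the sun.net.inetaddr.* system properties: networkaddress.cache.ttl is a security property and has no effect when passed with -D
# Large heaps: switch to ZGC (bash; replaces -XX:+UseG1GC, because the JVM refuses to start with two collectors selected)
JAVA_TOOL_OPTIONS="${JAVA_TOOL_OPTIONS/-XX:+UseG1GC/-XX:+UseZGC}"
# JDK 21+: generational ZGC (use instead of the line above)
JAVA_TOOL_OPTIONS="${JAVA_TOOL_OPTIONS/-XX:+UseG1GC/-XX:+UseZGC -XX:+ZGenerational}"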
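Switching collectors by editing an options string is error-prone, because HotSpot fails at startup when more than one collector flag is enabled. Here is a minimal sketch of a helper that swaps the collector instead of appending to it. `select_gc` and the flag list are illustrative and cover only the common HotSpot collectors.

```python
GC_FLAGS = (
    "-XX:+UseSerialGC",
    "-XX:+UseParallelGC",
    "-XX:+UseG1GC",
    "-XX:+UseZGC",
    "-XX:+UseShenandoahGC",
)

def select_gc(options: str, gc_flag: str) -> str:
    """Return options with every existing collector flag replaced by gc_flag."""
    if gc_flag not in GC_FLAGS:
        raise ValueError(f"unknown collector flag: {gc_flag}")
    kept = [opt for opt in options.split() if opt not in GC_FLAGS]
    return " ".join(kept + [gc_flag])

base = "-XX:MaxRAMPercentage=75 -XX:+UseG1GC -XX:MaxGCPauseMillis=100"
print(select_gc(base, "-XX:+UseZGC"))
```

The same idea applies if the options are assembled in a Helm template or a shell script: derive the final string from one collector choice rather than layering flags.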
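Graceful shutdown
# Pod terminationGracePeriodSeconds
spec:
  terminationGracePeriodSeconds: 60   # K8s waits up to 60s before SIGKILL
  containers:
    - name: app
      lifecycle:
        preStop:
          exec:
            # Gives the LB time to drain traffic; needs a shell in the image (distroless images have none)
            command: ["/bin/sh", "-c", "sleep 10"]
# Spring Boot graceful shutdown
server:
  shutdown: graceful
spring:
  lifecycle:
    timeout-per-shutdown-phase: 30s
# After the application receives SIGTERM:
# 1. Reject new requests (the LB has already drained traffic)
# 2. Wait for in-flight requests to finish (up to 30s)
# 3. Close thread pools / connection pools
# 4. Exit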
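The three timeouts above have to fit inside each other: the grace period starts counting when the preStop hook begins, so the hook's sleep and the application's shutdown phase share the same budget. A minimal sketch of that check, with an illustrative function name:

```python
def shutdown_slack_seconds(termination_grace: int, pre_stop_sleep: int,
                           shutdown_phase_timeout: int) -> int:
    """Seconds left over before SIGKILL; a negative value means requests may be cut off."""
    return termination_grace - (pre_stop_sleep + shutdown_phase_timeout)

# Values from the configuration in this section: 60s grace, 10s preStop, 30s shutdown phase.
slack = shutdown_slack_seconds(60, 10, 30)
print(f"slack before SIGKILL: {slack}s")
```

Keeping some positive slack leaves room for closing thread pools and connection pools after the last request completes.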
Observability integration
# Micrometer + Prometheus
management:
  endpoints:
    web:
      exposure:
        include: prometheus, health, metrics, info
  metrics:
    distribution:
      percentiles-histogram:
        http.server.requests: true
# JVM metrics exposed
- jvm_memory_used_bytes
- jvm_gc_pause_seconds
- jvm_threads_live_threads
- process_cpu_usage
- system_load_average_1m
- http_server_requests_seconds
# Inject a JMX exporter sidecar into the Pod
- name: jvm-exporter
  image: bitnami/jmx-exporter:0.20.0
  args:
    - "9404"
    - "/etc/jmx-exporter/config.yml"
  ports:
    - { containerPort: 9404, name: metrics }
  volumeMounts:
    - { name: config, mountPath: /etc/jmx-exporter }
# ServiceMonitor scraping
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: app-jvm
spec:
  selector:
    matchLabels: { app: order-api }
  endpoints:
    - port: actuator
      path: /actuator/prometheus
      interval: 15s
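For a quick check without a full Prometheus setup, the text exposed at /actuator/prometheus can be read directly. The sketch below sums one metric across its label sets. The sample payload and its values are invented for illustration, and `sum_metric` is a hypothetical helper, not part of Micrometer or Prometheus.

```python
# Made-up sample in the Prometheus text exposition format.
SAMPLE = """\
# HELP jvm_memory_used_bytes The amount of used memory
# TYPE jvm_memory_used_bytes gauge
jvm_memory_used_bytes{area="heap",id="G1 Eden Space"} 1.2E8
jvm_memory_used_bytes{area="heap",id="G1 Old Gen"} 3.0E8
jvm_memory_used_bytes{area="nonheap",id="Metaspace"} 9.0E7
process_cpu_usage 0.25
"""

def sum_metric(text: str, name: str, label_filter: str = "") -> float:
    """Sum every sample of a metric whose series text contains label_filter."""
    total = 0.0
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The value is the last space-separated field; label values may contain spaces.
        series, _, value = line.rpartition(" ")
        metric_name = series.split("{", 1)[0]
        if metric_name == name and label_filter in series:
            total += float(value)
    return total

heap_used = sum_metric(SAMPLE, "jvm_memory_used_bytes", 'area="heap"')
print(f"heap used: {heap_used / 1e6:.0f} MB")
```

Comparing the summed heap figure with the Pod memory limit shows how much of the limit is left for non-heap memory.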
Results after optimization
Metric                    Before            After    Change
===========================================================
Image size                1.2GB             180MB    -85%
Startup time              90s               15s      -83%
Pod memory usage          8GB (OOM)         4.5GB    stable
GC pause p99              500ms             50ms     -90%
Rolling update duration   15min (10 Pods)   3min     -80%
Image pull time           60s               12s      -80%
Cold start to Pod ready   120s              25s      -79%
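The change column can be reproduced from the before and after values. A small sketch, treating 1.2GB as 1200MB (an assumption about the unit) and using illustrative names:

```python
def change_pct(before: float, after: float) -> int:
    """Relative change in percent, rounded to the nearest integer."""
    return round(100 * (after - before) / before)

ROWS = {
    "image size (MB)": (1200, 180),
    "startup time (s)": (90, 15),
    "GC pause p99 (ms)": (500, 50),
    "rolling update (min)": (15, 3),
    "image pull time (s)": (60, 12),
    "cold start to ready (s)": (120, 25),
}

for metric, (before, after) in ROWS.items():
    print(f"{metric:<26}{before:>6} -> {after:<6}{change_pct(before, after):+d}%")
```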
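Business impact:
- Faster deployments and a shorter iteration cycle
- Right-sized Pod resources, cluster cost down 30%
- Faster elastic scale-out during big promotions (cold start < 30s)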
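Pitfall checklist
- Size the JVM heap with MaxRAMPercentage; do not hard-code Xmx
- Use a jre-alpine base or a jlink-trimmed runtime, not a full JDK
- Order Dockerfile layers for caching: dependencies → source → compile
- Give slow startups room with a startupProbe; split liveness and readiness duties
- Set memory limit = memory request to avoid OOM kills under node memory pressure
- Combine a preStop sleep with graceful shutdown for clean termination
- If JVM startup is slow, try AppCDS or Spring Native
- Mount an emptyDir at the heap dump path so the dump can be retrieved with kubectl cp when something goes wrong
- Unify JVM monitoring through Micrometer + Prometheus
- Balance Pod count against JVM memory; avoid oversized single Pods (wasteful)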
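Summary
JVM containerization is standard practice for moving Java services to the cloud, but it is not just a matter of building an image. Image, startup, memory, health checks, and shutdown each have their own pitfalls. This round of optimization took the image from 1.2GB to 180MB and startup from 90s to 15s; the direct payoff is faster elastic scale-out during big promotions and more stable rolling updates. The biggest change in our thinking: the room for optimization lies in getting three layers to work together (image, JVM flags, K8s configuration), and a mistake in any one of them causes problems. JDK 17/21 already behave very well in containers. If you are still on JDK 8, upgrading in 2024 is worth it: the gain is not only new language features but also baseline fitness for cloud-native environments.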
—— 别看了 · 2026