Apache SkyWalking – Performance

Zh: How AI Coding Is Reshaping the Way Software Architects Work

Sun, 15 Mar 2026 00:00:00 +0000

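Using the SkyWalking GraalVM Distro as a case study, this post looks at how AI Coding turned a batch of exploratory PoCs into a repeatable migration pipeline.

The biggest lesson I took from this project is not how much code AI can write. It is that AI Coding changes the cost of trial and error in architecture design. When an idea can quickly become a PoC, be run and validated, and be thrown away and redone if it fails, architects have a better chance of getting close to the design they actually want, instead of stopping early at a compromise that "the team can build today".

This shift matters especially in mature open source systems. Apache SkyWalking OAP has long been a powerful, production-proven observability backend, but it has every problem a large Java platform is expected to have: runtime bytecode generation, reflection-heavy initialization, classpath scanning, SPI-based module wiring, and dynamic DSL execution. These mechanisms make extension easy, but every one of them is an obstacle for GraalVM Native Image.

SkyWalking GraalVM Distro exists because we treated this challenge as an architecture design problem rather than a one-off porting project. The goal was not only to let OAP run as a native binary, but to turn the GraalVM migration itself into a repeatable, automated pipeline that can keep up with upstream evolution.

For the full technical design, benchmark data, and getting-started instructions, read the companion post: SkyWalking GraalVM Distro: Design and Benchmarks.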

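From a Stalled Idea to a Running System

This effort actually began many years ago. Shortly after this repository was created, yswdqz spent several months exploring a migration path. Only by doing the work did it become clear that the project was far more complex than the individual limitations listed in the GraalVM documentation, and the work was eventually shelved for years.

That stall matters. What was missing was not ideas. Experienced maintainers are rarely short of ideas; what is truly scarce is the time, people, and energy to build them. Even when the architect can already see several promising routes, limited development resources force an earlier trade-off: pick the option that is cheapest to implement, rather than the one that is cleaner, more reusable, and more resilient to future change.

This situation is very common, not exceptional. In open source communities, much of the work depends on volunteers or limited corporate sponsorship. In commercial products the constraint takes a different form but is essentially the same: roadmap commitments, team size, and delivery pressure keep engineering resources tight. In both environments, many good ideas are abandoned not because they are wrong, but because validating them properly and implementing them completely costs too much.

There is another constraint that matters just as much: the architect is usually also a very senior engineer, not someone who can work full time on implementation details. Personal coding energy is limited, time is highly fragmented, and design intent has to be explained to other senior engineers again and again before any code exists. Traditionally, that explanation happens through diagrams, documents, and conversation. It is slow, lossy, and full of uncertainty. We have all played the Telephone Game: even a simple message is easily misunderstood, and by the time the misunderstanding surfaces, a lot of time has already passed.

By late 2025, AI Coding finally made it realistic to "try several routes at the same time". We no longer had to accept a compromise early because implementation capacity was scarce. We could switch back and forth between designs, validate them in code, eliminate weak options quickly, and keep iterating until the architecture itself was solid, practical, and efficient enough.

That design freedom is critical. The GraalVM documentation explains each individual limitation clearly, but a mature OSS platform runs into a whole set of interlocking, systemic problems. Patching a single dynamic mechanism is nowhere near enough. To make native image work in practice, we had to convert entire categories of runtime behavior into build-time artifacts and automatically generated metadata.

Early in this journey there was also a very concrete mountain. At that time upstream SkyWalking still relied heavily on Groovy for LAL, MAL, and Hierarchy scripts. In theory, this was just one more example of "unsupported runtime dynamic behavior". In practice, Groovy was the biggest obstacle on the whole path. It meant not only script execution, but an entire dynamic model that is extremely convenient on the JVM and extremely unfriendly to native image.

To get past it, we redesigned OAP's core engines around an AOT-first model. Early experiments had to confront Groovy-era runtime behavior directly and tried different script-compilation approaches to work around it. The final design went further: align with the upstream compiler pipeline, move dynamic generation forward to build time, and add automation so the migration path stays controllable as upstream keeps evolving. Concretely, the generation of OAL, MAL, LAL, and Hierarchy became the output of build-time precompilers instead of remaining startup-time dynamic behavior.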

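How AI Coding Rewrites Architecture Iteration

The key to this shift is not just that "code gets written faster". What AI really changed is the speed of iterating back and forth between idea, prototype, validation, and redesign. Around a single problem we could quickly build several runnable PoCs, rapidly eliminate directions that did not hold up, and gradually settle the abstractions worth keeping into a coherent migration system.

This does not diminish the value of human architecture; it amplifies it. Which behavior should move to build time, where configurability should be kept, where same-FQCN replacements should be introduced, how to keep upstream sync controllable, and which abstractions are worth preserving at any cost: these judgments can still only be made by people. The difference is that AI's speed finally gives us the chance to actually build these better designs, instead of retreating too early to a simpler and worse compromise.

This is where the way software architects work really changes. In the past, architects often already knew where the cleaner direction lay, but limited engineering capacity forced that vision back to a cheaper compromise. Now the architect becomes, in a sense, someone who "can get hands-on quickly" again: sketching ideas directly in code, turning high-level abstractions into interfaces, and proving the design with an implementation that actually runs.

This changes communication as well as implementation. In open source we often say: talk is cheap, show me the code. In the age of AI Coding, showing the code has become much easier. Design no longer depends so heavily on a slow, top-down translation process from idea to document, to explanation, to implementation. Code can appear earlier, and it can run earlier.

Other senior engineers benefit too. They do not have to reconstruct the whole design from diagrams, meetings, or long explanations alone. They can review the abstractions directly, read real code, run it, challenge it, and refine it together against a concrete implementation. That makes architectural collaboration faster and clearer, with far fewer communication errors.

This is also why I feel much of today's AI discussion is somewhat off track. Many projects are genuinely interesting and fun, and there is nothing wrong with trying them out, but advanced engineering work does not automatically improve just because "an agent was attached to the codebase". What really matters is not which demo looks flashiest, but which engineering capabilities are actually amplified, and whether the discipline of software development itself is preserved.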

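For architects and senior engineers, the capabilities that really matter here include:

Fast comparative prototyping: instead of arguing for an idea with slides and documents alone, build several options as runnable code and compare them.
Large-scale code comprehension: read quickly across many modules while keeping a global view of the whole system.
Systematic refactoring: systematically convert paths that rely on reflection and runtime dynamic behavior into designs that fit AOT constraints.
Building automation: when a migration step has to be redone on every upstream sync, handling it by hand is time-consuming, and it only gets more tiring over time. AI made it genuinely feasible to invest in generators, inventories, consistency checks, and drift detection, turning repeated manual labor into repeatable automation.
Review at breadth: check edge cases, compatibility constraints, and whether a solution holds up under repeated execution across a very large code surface.

These capabilities all show up in the final design. Same-FQCN replacements establish a clear, controlled boundary for GraalVM-specific behavior. Reflection metadata no longer depends on a hand-maintained guess list but is generated directly from build outputs. The inventories and drift detection turn the previously vague "upstream sync risk" into an explicit engineering workflow.

For junior engineers, I think the lesson is just as important. AI does not make fundamentals such as architecture design, system constraints, interface design, testing, and maintainability unimportant. On the contrary, these skills only become more important, because they determine whether "accelerated implementation" ultimately produces a system that can keep evolving, or merely more code produced faster. The real leverage comes from engineering judgment, not novelty.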

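Claude Code and Gemini AI acted as engineering accelerators throughout. In the GraalVM Distro project, they specifically helped us with several things:

Turning migration ideas directly into runnable code: instead of arguing about which direction might work, we built multiple real prototypes, ran them, compared them, and eliminated the directions that did not hold up.
Refactoring reflection-heavy, highly dynamic code paths: systematically replacing runtime-hostile patterns with AOT-friendly implementations.
Making upstream sync truly sustainable: every time the distro pulls changes from upstream SkyWalking, metadata scanning, config regeneration, and recompilation must all happen again. AI helped us build these steps into a pipeline, so each sync is a controlled, largely automated process rather than manual rework that grows longer each time.
Reviewing logic and edge cases at breadth: especially where feature parity matters more than raw implementation speed.

The end result is not just one big rewrite but a repeatable system: precompilers, manifest-driven loading, reflection-config generation, replacement boundaries, and drift detection that makes upstream migration reviewable and automatable.

For the broader background behind this development approach, read: Practicing Agentic Vibe Coding in a Mature, Large Open Source Project: Software Engineering and Engineering Cybernetics Continue. This post is the next step in that story: not only enhancing features in a mature codebase, but reviving work that had stalled and turning it into a running system.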

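What Actually Changed

The most important outcome of this project is not a benchmark table. The benchmark data belongs to the distro itself, and it matters because it proves the system really runs. But for this post, the deeper change is methodological: AI Coding changed how we explore, validate, and refine architectural options.

In the past, architecture was often a mostly document-driven activity trailed by a long and expensive implementation. Now we can move much faster between idea, prototype, comparison, and redesign. That gives us a real chance to pursue solutions at a higher level of abstraction, preserve cleaner boundaries, and build the automation that keeps the migration maintainable.

The technical evidence for this work is the SkyWalking GraalVM Distro itself: not only a running system, but a migration pipeline made of precompilers, automatically generated reflection metadata, controlled replacement boundaries, and drift checks. The benchmark data matters because it proves the system holds up in practice; from an architectural perspective, though, the real result is that the migration is no longer a one-time port but a repeatable piece of systems engineering. For the full test methodology, raw data, and technical design, read the companion post: SkyWalking GraalVM Distro: Design and Benchmarks.

The project repository is at apache/skywalking-graalvm-distro. We welcome community members to test this new distribution, file issues, and help it move toward production readiness.

For me, the deeper lesson goes beyond this distribution. AI Coding does not make architecture less important; it makes architecture more worth pursuing seriously. Once implementation speed rises far enough, we finally have the chance to validate more ideas in real code, keep the abstractions that are truly good, and actually build the systems that used to end in a halfway compromise because the investment was too large.

For senior engineers, the bottleneck is shifting from sheer implementation speed toward taste, system judgment, and the ability to define stable boundaries. For junior engineers, the right path is not to chase every exciting-looking AI workflow, but to strengthen the fundamentals so that acceleration actually compounds: understanding requirements, reading unfamiliar systems, questioning assumptions, and identifying the parts that must stay correct while the system changes quickly. AI Coding lowered the cost of validating good designs, but it did not lower the bar for engineering judgment itself.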

Zh: SkyWalking GraalVM Distro: Design and Benchmarks

Sun, 15 Mar 2026 00:00:00 +0000

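This post gives a complete account of how we migrated Apache SkyWalking OAP to GraalVM Native Image. The goal was not a one-off port, but a process that can keep up with upstream evolution.

For the broader context of this work, and how AI Coding made the project achievable, read: How AI Coding Is Reshaping the Way Software Architects Work.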

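Why GraalVM Is a Hard Requirement Here

GraalVM Native Image compiles Java applications ahead of time (AOT) into standalone executables. For an observability backend like SkyWalking OAP, this is not a "nice-to-have" performance optimization but a clear engineering necessity.

An observability platform must be the most reliable part of the infrastructure. It has to stay alive through the very failures it is supposed to observe. In cloud-native environments, workloads constantly scale, migrate, and restart, so the backend that observes everything cannot itself remain a heavyweight process that starts slowly, idles large, and recovers slowly.

Our benchmark results make this concrete:

Startup time: about 5 ms versus about 635 ms. In a Kubernetes cluster, when an OAP Pod is evicted or rescheduled, a 635 ms gap means telemetry arriving in that window may be lost. At 5 ms, the new Pod is often receiving data again before most clients have noticed the interruption.
Idle memory: about 41 MiB versus about 1.2 GiB. Observability backends run 24/7. In multi-tenant or edge deployments, a 97% drop in baseline RSS lets the backend fit on a smaller node instead of requiring a dedicated machine.
Memory under load: about 629 MiB versus about 2.0 GiB at 20 RPS. A 70% memory reduction under production-like load translates directly into fewer nodes, a lower cloud bill, and more headroom before the backend itself becomes the scaling bottleneck.
No warm-up penalty: peak throughput is available earlier. The JVM's JIT compiler often needs minutes of traffic to finish optimizing hot paths, and during that time tail latency is worse and data processing lags. A native binary has no equivalent phase.
Smaller attack surface: a full JDK runtime is no longer needed, so there are far fewer CVEs to track and patch. For a component that receives data from every service in the cluster, that matters.

None of these are minor tweaks. They directly change which deployment shapes become feasible: serverless-style observability backends, sidecar collection models, and edge nodes with extremely tight memory budgets. These options only become practical when the backend is light enough and fast enough.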

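The Challenge: A Mature Java Platform with Many Dynamic Features

SkyWalking OAP has all the typical problems of a large Java platform: runtime bytecode generation, reflection-heavy initialization, classpath scanning, SPI-based module wiring, and dynamic DSL execution. These mechanisms make extension easy, but every one of them is an obstacle for GraalVM native image.

The limitations listed in the GraalVM documentation are only the beginning. In a mature OSS platform, they are deeply entangled with runtime design decisions accumulated over many years. A standard GraalVM native image struggles with runtime class generation, reflection, dynamic discovery, and script execution, and in SkyWalking OAP none of these are isolated occurrences; they are part of the system's design.

In the early history of this distribution there was also a very concrete mountain. At that time upstream SkyWalking still relied heavily on Groovy for LAL, MAL, and Hierarchy scripts. In theory, it was just one more component that "does not support runtime dynamism". In practice, Groovy was the biggest obstacle on the whole path. It was not merely a script-execution problem; it represented an entire dynamic model that is extremely convenient in the JVM world and extremely unfriendly in the native image world.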

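The Design Goal: Make the Migration Repeatable

The design goal was not to "get native-image working once and be done", but to build a migration system that can be reused and maintained over the long term:

Move runtime-generated artifacts to build time. OAL, MAL, LAL, and Hierarchy rules, along with meter-related generated classes, are compiled and packaged during the build instead of being generated dynamically at startup.
Replace dynamic discovery with deterministic loading. Classpath scanning and runtime registration paths are converted to manifest-based loading.
Reduce runtime reflection and generate native metadata at build time. Reflection configuration no longer depends on a hand-maintained guess list; it is generated from the actual manifests and scan results.
Keep the upstream sync boundary clear. Same-FQCN replacements are explicitly packaged, inventoried, and guarded by staleness checks.
Surface change immediately. As soon as an upstream provider, a rule file, or a replaced source file changes, tests fail and force an explicit review.

This is the architectural shift that matters most. Good abstraction and foresight have not become less important in the AI era. They have become more important, because they determine whether the speed AI brings ultimately produces a maintainable system or a pile of code that simply bloats faster.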

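Turning Runtime Dynamic Behavior into Build-Time Artifacts

SkyWalking OAP contains several dynamic subsystems that are natural in the JVM world but troublesome in native image:

OAL generates classes at runtime.
LAL, MAL, and Hierarchy were historically tied to a large amount of Groovy-based runtime behavior, which was one of the hardest blockers in the early distro work.
MAL, LAL, and Hierarchy rules depend on runtime compilation behavior.
Guava-based classpath scanning discovers annotations, dispatchers, decorators, and meter functions.
SPI-based module and provider discovery depends on a more dynamic runtime environment.
YAML/config initialization and framework integrations depend on reflective access.

In SkyWalking GraalVM Distro, these problems were not fixed one by one with scattered patches. They were consolidated into a single build-time pipeline.

The precompiler runs the DSL engines during the build, exports the generated classes, writes manifests, serializes config data, and generates native-image metadata. As a result, startup only needs to load and register classes; no runtime code generation is required. The runtime became simpler because the complexity was moved forward to build time.

This is also why the project is more than a performance optimization. Our design goal was to move complexity to a place where it is easier to verify, easier to automate, and easier to repeat.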

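Same-FQCN Replacements: A Controlled Boundary

One of the most practical design choices in this distribution is the use of same-FQCN replacement classes. We do not rely on vague startup tricks or on undocumented load-order assumptions. Instead, we repackage the GraalVM-specific jars, exclude the original upstream classes, and let the replacement classes occupy exactly the same fully-qualified class name.

This is critical for maintainability because it establishes a very clear boundary:

the upstream class still defines the behavior contract;
the GraalVM-side replacement provides a compatible implementation strategy;
the packaging process makes the swap explicitly visible.

For example, OAL loading changes from runtime compilation to manifest-based loading of precompiled classes. Similar replacements handle MAL and LAL DSL loading, module wiring, config initialization, and several reflection-sensitive paths. The goal is not to fork everything, but to replace only the parts whose runtime model is fundamentally unsuited to native image.

This boundary is then guarded by tests that hash the upstream source files corresponding to each replacement. When upstream changes any of those files, the build fails and tells us exactly which replacement needs to be reviewed again. "How to keep up with upstream" stops being an abstract, anxiety-inducing question and becomes explicit, actionable engineering work.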

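Reflection Config Is Generated, Not Guessed

In many GraalVM migration projects, reflect-config.json ends up as an artifact that accumulates through trial and error. It keeps growing and going stale, until nobody really knows whether it is complete or why each entry exists. That pattern does not scale in a large OSS platform that keeps evolving.

In this distribution, reflection metadata is generated directly from build outputs and scan results, including:

manifests for OAL, MAL, LAL, Hierarchy, and meter-generated classes;
classes found by annotation scanning;
Armeria HTTP handlers;
GraphQL resolvers and schema-mapped types;
accepted ModuleConfig classes.

This is a much healthier model. We no longer rely on people remembering every path that could trigger reflective access; the system derives reflection metadata from the actual migration pipeline. The build itself becomes the source of truth.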

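Making Upstream Sync Practical

If this distribution were only a one-time engineering sprint, it would mean far less. The truly hard part is keeping it maintainable while upstream SkyWalking continues to evolve.

That is why the repository contains a full set of explicit inventories and drift detection mechanisms:

provider inventories, which force new upstream providers to be categorized;
rule-file inventories, which force new DSL inputs to be explicitly acknowledged;
SHA watchers for precompiled YAML inputs;
SHA watchers for upstream source files that have GraalVM-specific replacements.

Good abstraction is not only about elegant code structure. It is about whether you chose a migration design that still holds up in the face of future change.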

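Benchmark Results

We compared the standard JVM OAP with the GraalVM Distro on an Apple M3 Max (macOS, Docker Desktop, 10 CPUs / 62.7 GB), both connected to BanyanDB.

Boot Test (Docker Compose, no traffic, median of 3 runs)

Metric	JVM OAP	GraalVM OAP	Delta
Cold boot startup	635 ms	5 ms	~127x faster
Warm boot startup	630 ms	5 ms	~126x faster
Idle RSS	~1.2 GiB	~41 MiB	~97% reduction

Boot time is measured from the timestamp of OAP's first application log line to the appearance of the listening on 11800 log line (gRPC server ready).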
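
The measurement described above can be sketched as a small log parser. This is only an illustration in Python: the log line layout (a "YYYY-MM-DD HH:MM:SS,mmm" prefix) and the sample lines are assumptions, not the format of real OAP logs; only the "listening on 11800" marker comes from the text.

```python
from datetime import datetime, timedelta

READY_MARKER = "listening on 11800"  # marker named in the text above


def parse_ts(line: str) -> datetime:
    # Assumed layout: "YYYY-MM-DD HH:MM:SS,mmm ..." (hypothetical; adjust to the real logs).
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")


def boot_time_ms(log_lines) -> int:
    """Milliseconds from the first log line to the first line containing the ready marker."""
    first = parse_ts(log_lines[0])
    for line in log_lines:
        if READY_MARKER in line:
            return (parse_ts(line) - first) // timedelta(milliseconds=1)
    raise ValueError("ready marker not found")


# Hypothetical sample lines, made up for illustration.
sample = [
    "2026-03-15 10:00:00,000 INFO  starting OAP",
    "2026-03-15 10:00:00,005 INFO  gRPC server listening on 11800",
]
elapsed = boot_time_ms(sample)
```

Measuring from the first application log line rather than from container start keeps image pull and container runtime overhead out of the number.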

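Under Sustained Load (Kind + Istio 1.25.2 + Bookinfo, about 20 RPS, 2 OAP replicas)

After a 60-second warm-up, one sample was taken every 10 seconds, for 30 samples in total.

Metric	JVM OAP	GraalVM OAP	Delta
CPU median (millicores)	101	68	-33%
CPU avg (millicores)	107	67	-37%
Memory median (MiB)	2068	629	-70%
Memory avg (MiB)	2082	624	-70%

Both variants reported the same entry-service CPM, which indicates that under this test load they handled the same amount of traffic.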
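
As a quick cross-check, the Delta column can be recomputed from the two measured columns. The snippet below is a plain arithmetic sketch in Python; the dictionary keys are labels chosen here, and the numbers are copied from the table.

```python
def pct_delta(jvm: float, native: float) -> int:
    """Relative change from the JVM value to the native value, rounded to a whole percent."""
    return round((native - jvm) / jvm * 100)


# (JVM OAP, GraalVM OAP) pairs copied from the sustained-load table.
rows = {
    "cpu_median_millicores": (101, 68),
    "cpu_avg_millicores": (107, 67),
    "mem_median_mib": (2068, 629),
    "mem_avg_mib": (2082, 624),
}

deltas = {name: pct_delta(jvm, native) for name, (jvm, native) in rows.items()}
```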

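Every 30 seconds we collected these metrics via swctl for all discovered services: service_cpm, service_resp_time, service_sla, service_apdex, service_percentile.

The full benchmark scripts and raw data are in the benchmark/ directory of the distribution repository.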

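Current Status

The project is already a runnable, experimental distribution hosted in its own repository: apache/skywalking-graalvm-distro.

The current distribution deliberately focuses on one modern, high-performance operating mode:

Storage: BanyanDB
Cluster mode: Standalone and Kubernetes
Configuration: none, or Kubernetes ConfigMap
Runtime model: fixed module set, precompiled artifacts, and AOT-friendly wiring

This focus is intentional. To turn the migration into a repeatable system, the first step has to be to draw the boundary tightly and produce a version that really runs, and only then to expand gradually without losing control.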

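Quick Start

Because SkyWalking GraalVM Distro is designed for maximum performance, it is currently best paired with the BanyanDB storage backend. The released image is already available on Docker Hub, and you can start the whole system with the docker-compose.yml below.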

version: '3.8'

services:
  banyandb:
    image: ghcr.io/apache/skywalking-banyandb:e1ba421bd624727760c7a69c84c6fe55878fb526
    container_name: banyandb
    restart: always
    ports:
      - "17912:17912"
      - "17913:17913"
    command: standalone --stream-root-path /tmp/stream-data --measure-root-path /tmp/measure-data --measure-metadata-cache-wait-duration 1m --stream-metadata-cache-wait-duration 1m
    healthcheck:
      test: ["CMD", "sh", "-c", "nc -nz 127.0.0.1 17912"]
      interval: 5s
      timeout: 10s
      retries: 120

  oap:
    image: apache/skywalking-graalvm-distro:0.1.1
    container_name: oap
    depends_on:
      banyandb:
        condition: service_healthy
    restart: always
    ports:
      - "11800:11800"
      - "12800:12800"
    environment:
      SW_STORAGE: banyandb
      SW_STORAGE_BANYANDB_TARGETS: banyandb:17912
      SW_HEALTH_CHECKER: default
    healthcheck:
      test: ["CMD-SHELL", "nc -nz 127.0.0.1 11800 || exit 1"]
      interval: 5s
      timeout: 10s
      retries: 120

  ui:
    image: ghcr.io/apache/skywalking/ui:10.3.0
    container_name: ui
    depends_on:
      oap:
        condition: service_healthy
    restart: always
    ports:
      - "8080:8080"
    environment:
      SW_OAP_ADDRESS: http://oap:12800

Then simply run:

docker compose up -d
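
Once the stack is up, a TCP probe equivalent to the nc -nz healthchecks in the compose file can confirm that the OAP ports are reachable from the host. This is a minimal sketch in Python; the function name and the one-second timeout are choices made here, while the ports 11800 and 12800 come from the compose file.

```python
import socket


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

For example, port_open("127.0.0.1", 11800) probes the gRPC port and port_open("127.0.0.1", 12800) probes the HTTP port that the UI connects to.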

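We invite the community to test this new distribution, file issues, and help us move it toward production readiness.

Special thanks to the GraalVM team for the technical foundation.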

Blog: How AI Changed the Economics of Architecture

Fri, 13 Mar 2026 00:00:00 +0000

SkyWalking GraalVM Distro: A case study in turning runnable PoCs into a repeatable migration pipeline.

The most important lesson from this project is not that AI can generate a large amount of code. It is that AI changes the economics of architecture. When runnable PoCs become cheap to build, compare, discard, and rebuild, architects can push further toward the design they actually want instead of stopping early at a compromise they can afford to implement.

That shift matters a lot in mature open source systems. Apache SkyWalking OAP has long been a powerful and production-proven observability backend, but it also carries all the realities of a large Java platform: runtime bytecode generation, reflection-heavy initialization, classpath scanning, SPI-based module wiring, and dynamic DSL execution. These patterns are friendly to extensibility but hostile to GraalVM native image.

SkyWalking GraalVM Distro is the result of treating that challenge as a system-design problem instead of a one-off porting exercise. The goal was not only to make OAP run as a native binary, but to turn GraalVM migration itself into a repeatable automation pipeline that can stay aligned with upstream evolution.

For the full technical design, benchmark data, and getting-started guide, see the companion post: SkyWalking GraalVM Distro: Design and Benchmarks.

From Paused Idea to Runnable System

This journey actually began years ago. Shortly after this repository was created, yswdqz spent several months exploring the transition. The project proved much harder in practice than the individual GraalVM limitations sounded on paper, and the work eventually paused for years.

That pause is important. The missing ingredient was not ideas. Mature maintainers usually have more ideas than time. The real constraint was implementation economics. Even when the architect can see several promising directions, limited developer resources force an earlier trade-off: choose the path that is cheapest to implement, not necessarily the path that is cleanest, most reusable, or most future-proof.

This is a very common reality, not an exceptional one. In open source communities, much of the work depends on volunteers or limited company sponsorship. In commercial products, the pressure is different but the constraint is still real: roadmap commitments, staffing limits, and delivery deadlines keep engineering resources tight. In both worlds, good ideas are often abandoned not because they are wrong, but because they are too expensive to validate and implement thoroughly.

There is another constraint that matters just as much: the architect is usually also a very senior engineer, not a full-time implementation machine. That means limited personal coding energy, fragmented time, and a constant need to explain ideas to other senior engineers before the code exists. Traditionally, that explanation happens through diagrams, documents, and conversations. It is slow, lossy, and unpredictable. We all know some version of the Telephone Game: even simple words are easy to misunderstand, and by the time the misunderstanding becomes visible, a lot of time has already passed.

What changed in late 2025 was that AI engineering made multiple runnable ideas affordable. Instead of picking an early compromise because implementation capacity was scarce, we could switch repeatedly between designs, validate them with code, discard weak directions quickly, and keep iterating until the architecture became solid, practical, and efficient enough to hold.

That design freedom was critical. GraalVM documentation gives clear guidance on isolated limitations, but a mature OSS platform hits them as a connected system. Fixing only one dynamic mechanism is not enough. To make native image practical, we had to turn whole categories of runtime behavior into build-time artifacts and automated metadata generation.

There was also a very concrete mountain in front of us in the early history of this distro. In the first several commits of the repository, upstream SkyWalking still relied heavily on Groovy for LAL, MAL, and Hierarchy scripts. In theory, that was just one more unsupported runtime-heavy component. In practice, Groovy was the biggest obstacle in the whole path. It represented not only script execution, but a whole dynamic model that was deeply convenient on the JVM side and deeply unfriendly to native image.

To bridge the gap, we re-architected the core engines of OAP around an AOT-first model. Earlier experiments had to confront Groovy-era runtime behavior directly and explore alternative script-compilation approaches to get around it. The finalized direction went further: align with the upstream compiler pipeline, move dynamic generation to build time, and add automation so the migration stays controllable as upstream keeps moving. Concretely, that meant turning OAL, MAL, LAL, and Hierarchy generation into build-time precompiler outputs instead of leaving them as startup-time dynamic behavior.

AI Speed Changed the Design Loop

The scale of this transformation was not only about coding faster. AI changed the loop between idea, prototype, validation, and redesign. We could build runnable PoCs for different approaches, throw away weak ones quickly, and preserve the promising abstractions until they formed a coherent migration system.

That does not reduce the role of human architecture. It raises the value of it. Human judgment was still required to decide what should become build-time, what should stay configurable, where to introduce same-FQCN replacements, how to keep upstream sync controllable, and which abstractions were worth preserving. But AI speed made it realistic to pursue those better designs instead of settling for a simpler compromise too early.

This is the real change in the economics of architecture. In the past, an architect might already know the cleaner direction, but limited engineering capacity often forced that vision back toward a cheaper compromise. Now the architect can return much closer to being a fast developer again: building code, shaping high-abstraction interfaces, and using design patterns to prove the vision directly in the real world.

That changes communication as much as implementation. In open source, we often say, talk is cheap, show me the code. With AI engineering, showing the code becomes much more straightforward. The design no longer depends so heavily on a slow top-down translation from idea to documents to interpretation to implementation. The code can appear earlier, and it can run earlier.

Other senior engineers benefit from this too. They do not need to reconstruct the whole design only from diagrams, meetings, or long explanations. They can review the actual abstraction, see the behavior in code, run it, challenge it, and refine it from something concrete. That makes architectural collaboration faster, clearer, and less lossy.

This is also where I think the current AI discussion is often noisy. Many projects are fun, surprising, and worth exploring, but advanced engineering work is not improved merely by attaching an agent to a codebase. The important question is not which demo looks most magical. The important question is which engineering capabilities are actually being accelerated without losing the discipline of software development itself.

For architects and senior engineers, the capabilities that mattered most here were:

Fast comparative prototyping: Building several runnable approaches in code instead of defending one idea with slides and documents.
Large-scale code comprehension: Reading across many modules quickly enough to keep the whole system in view.
Systematic refactoring: Converting reflection-heavy or runtime-dynamic paths into designs that fit AOT constraints.
Automation construction: When a migration step must be repeated every upstream sync, doing it manually once is already expensive. Doing it manually again next time is even more expensive. AI made it practical to invest in generators, inventories, consistency checks, and drift detectors that turn repeated manual work into repeatable automation.
Review at breadth: Checking edge cases, compatibility boundaries, and repeatability across a large surface area.

Those capabilities were visible in the resulting design. Same-FQCN replacements created a controlled boundary for GraalVM-specific behavior. Reflection metadata was generated from build outputs instead of maintained as a hand-written guess list. Inventories and drift detectors turned upstream sync from a vague maintenance risk into an explicit engineering workflow.

For junior engineers, I think the lesson is equally important. AI does not remove the need to learn architecture, invariants, interfaces, testing, or maintenance. It makes those skills more valuable, because they determine whether accelerated implementation produces a durable system or just more code faster. The leverage comes from engineering judgment, not from novelty.

Claude Code and Gemini AI acted as engineering accelerators throughout this process. In the GraalVM Distro specifically, they helped us:

Explore migration strategies as running code: Instead of debating which approach might work, we built and compared multiple real prototypes, discarded the weak ones, and kept what held up.
Refactor reflection-heavy and dynamic code paths: Replace runtime-hostile patterns with AOT-friendly alternatives across the codebase.
Make upstream sync sustainable: Every time the distro pulls from upstream SkyWalking, metadata scanning, config regeneration, and recompilation must happen again. AI helped build the pipeline so that each sync is a controlled, largely automated process rather than a fresh manual effort that grows longer each time.
Review logic and edge cases at scale: Especially in places where feature parity mattered more than raw implementation speed.

The result was not just a large rewrite. It was a repeatable system: precompilers, manifest-driven loading, reflection-config generation, replacement boundaries, and drift detectors that make upstream migration reviewable and automatable.

For the broader methodology behind this style of development, see Agentic Vibe Coding in a Mature OSS Project. This post is the next step in that story: not only enhancing an active mature codebase, but reviving a paused effort and making it actually runnable.

What Actually Changed

The most important outcome of this project is not a benchmark table. The benchmark results belong to the distro itself, and they matter because they prove the system is real. But for this post, the deeper result is methodological: AI engineering changed how architecture could be explored, validated, and refined.

Instead of treating architecture as a mostly document-driven activity followed by a long and expensive implementation phase, we were able to move much faster between idea, prototype, comparison, and redesign. That made it realistic to pursue higher-abstraction solutions, preserve cleaner boundaries, and build the automation needed to keep the migration maintainable over time.

The technical evidence for that work is the SkyWalking GraalVM Distro itself: not only a runnable system, but a migration pipeline expressed as precompilers, generated reflection metadata, controlled replacement boundaries, and drift checks. The benchmark data matter because they prove the system works in practice, but the architectural result is that the migration became a repeatable system rather than a one-time port. For detailed benchmark methodology, per-pod data, and the full technical design, see SkyWalking GraalVM Distro: Design and Benchmarks.

The project is hosted at apache/skywalking-graalvm-distro. We invite the community to test it, report issues, and help move it toward production readiness.

For me, the deeper takeaway is broader than this distro. AI engineering does not make architecture less important. It makes architecture more worth pursuing. When implementation speed rises enough, we can afford to test more ideas in code, keep the good abstractions, and build systems that would previously have been judged too expensive to finish well.

For senior engineers, that means the bottleneck shifts away from raw typing speed and toward taste, system judgment, and the ability to define stable boundaries. For junior engineers, it means the path forward is not to chase every exciting AI workflow, but to become stronger at the fundamentals that let acceleration compound: understanding requirements, reading unfamiliar systems, questioning assumptions, and recognizing what must remain correct as everything around it changes. AI changed the economics of architecture because it lowered the cost of validating better designs without lowering the bar for engineering judgment.

Blog: SkyWalking GraalVM Distro: Design and Benchmarks

Fri, 13 Mar 2026 00:00:00 +0000

A technical deep-dive into how we migrated Apache SkyWalking OAP to GraalVM Native Image — not as a one-off port, but as a repeatable pipeline that stays aligned with upstream.

For the broader story of how AI engineering made this project economically viable, see How AI Changed the Economics of Architecture.

Why GraalVM Is Not Optional

GraalVM Native Image compiles Java applications Ahead-of-Time (AOT) into standalone executables. For an observability backend like SkyWalking OAP, this is not a performance optimization — it is an operational necessity.

An observability platform must be the most reliable component in the infrastructure. It has to survive the failures it is supposed to observe. In cloud-native environments where workloads scale, migrate, and restart constantly, the backend that watches everything cannot itself be the slow, heavy process that takes seconds to recover and gigabytes to idle.

Our benchmarks make the case concrete:

Startup: ~5 ms vs ~635 ms. In a Kubernetes cluster where an OAP pod gets evicted or rescheduled, a 635 ms gap can mean lost telemetry: traces, metrics, and logs that arrive during that window may be dropped. At 5 ms, the new pod is receiving data before most clients even notice the disruption.
Idle memory: ~41 MiB vs ~1.2 GiB. Observability backends run 24/7. In a multi-tenant or edge deployment, a 97% reduction in baseline RSS is the difference between fitting the observability stack on a small node and needing a dedicated one.
Memory under load: ~629 MiB vs ~2.0 GiB at 20 RPS. A 70% reduction at production-like traffic means fewer nodes, lower cloud bills, and more headroom before the backend itself becomes a scaling bottleneck.
No warm-up penalty: Peak throughput is available from the first request. The JVM’s JIT compiler needs minutes of traffic before it optimizes hot paths — during that window, tail latency is worse and data processing lags behind. A native binary has no such phase.
Smaller attack surface: No JDK runtime means fewer CVEs to track and patch. For a component that ingests data from every service in the cluster, that matters.

These are not incremental improvements. They change what deployment topologies are practical. Serverless observability backends, sidecar-model collectors, edge nodes with tight memory budgets — all become realistic when the backend is this light and this fast.

The Challenge: A Mature, Dynamic Java Platform

SkyWalking OAP carries all the realities of a large Java platform: runtime bytecode generation, reflection-heavy initialization, classpath scanning, SPI-based module wiring, and dynamic DSL execution. These patterns are friendly to extensibility but hostile to GraalVM native image.

The documented GraalVM limitations are only the beginning. In a mature OSS platform, those limitations are deeply entangled with years of runtime design decisions. Standard GraalVM native images struggle with runtime class generation, reflection, dynamic discovery, and script execution — all of which had deep roots in SkyWalking OAP.

There was also a very concrete mountain in the early history of this distro. Upstream SkyWalking relied heavily on Groovy for LAL, MAL, and Hierarchy scripts. In theory, that was just one more unsupported runtime-heavy component. In practice, Groovy was the biggest obstacle in the whole path. It represented not only script execution, but a whole dynamic model that was deeply convenient on the JVM side and deeply unfriendly to native image.

The Design Goal: Make Migration Repeatable

The final design is not just “run native-image successfully.” It is a system that keeps migration work repeatable:

Pre-compile runtime-generated assets at build time. OAL, MAL, LAL, Hierarchy rules, and meter-related generated classes are compiled during the build and packaged as artifacts instead of being generated at startup.
Replace dynamic discovery with deterministic loading. Classpath scanning and runtime registration paths are converted into manifest-driven loading.
Reduce runtime reflection and generate native metadata from the build. Reflection configuration is produced from actual manifests and scanned classes instead of being maintained as a hand-written guess list.
Keep the upstream sync boundary explicit. Same-FQCN replacements are intentionally packaged, inventoried, and guarded with staleness checks.
Make drift visible immediately. If upstream providers, rule files, or replaced source files change, tests fail and force explicit review.

That is the architectural shift that matters most. Reusable abstraction and foresight did not become less important in the AI era. They became more important, because they determine whether AI speed produces a maintainable system or just a fast-growing pile of code.

Turning Runtime Dynamism into Build-Time Assets

SkyWalking OAP has several dynamic subsystems that are natural in a JVM world but problematic for native image:

OAL generates classes at runtime.
LAL, MAL, and Hierarchy were historically tied to Groovy-heavy runtime behavior, which became one of the biggest practical blockers in the early distro work.
MAL, LAL, and Hierarchy rules depend on runtime compilation behavior.
Guava-based classpath scanning discovers annotations, dispatchers, decorators, and meter functions.
SPI-based module/provider discovery expects a more dynamic runtime environment.
YAML/config initialization and framework integrations depend on reflective access.

In SkyWalking GraalVM Distro, these are not solved one by one as isolated patches. They are pulled into a build-time pipeline.

The precompiler runs the DSL engines during the build, exports generated classes, writes manifests, serializes config data, and generates native-image metadata. That means startup becomes class loading and registration, not runtime code generation. The runtime path is simpler because the build path became richer.
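
The split between a build step that writes a manifest and a startup step that only loads what the manifest names can be sketched in a few lines. This is an illustration in Python, not the distro's Java implementation: the one-name-per-line manifest format and the file name are assumptions, and standard-library classes stand in for generated classes.

```python
import importlib
import tempfile
from pathlib import Path


def write_manifest(path: Path, qualified_names) -> None:
    # Build-time step: record exactly which generated classes exist.
    path.write_text("\n".join(sorted(qualified_names)) + "\n", encoding="utf-8")


def load_from_manifest(path: Path) -> dict:
    # Startup step: no scanning and no code generation, just load what the manifest names.
    registry = {}
    for name in path.read_text(encoding="utf-8").splitlines():
        if not name.strip():
            continue
        module_name, _, attr = name.rpartition(".")
        registry[name] = getattr(importlib.import_module(module_name), attr)
    return registry


with tempfile.TemporaryDirectory() as tmp:
    manifest = Path(tmp) / "generated-classes.txt"  # hypothetical file name
    # Placeholder names; a real manifest would list the precompiled classes.
    write_manifest(manifest, ["json.JSONDecoder", "collections.OrderedDict"])
    registry = load_from_manifest(manifest)
```

Because the set of loadable classes is fixed by a file produced during the build, the same file can also feed other build-time checks and metadata generation.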

This is also why the project is more than a performance exercise. The design goal was to move complexity into a place where it is easier to verify, easier to automate, and easier to repeat.

Same-FQCN Replacements as a Controlled Boundary

One of the most practical design choices in this distro is the use of same-FQCN replacement classes. We do not rely on vague startup tricks or undocumented ordering assumptions. Instead, the GraalVM-specific jars are repackaged so the original upstream classes are excluded and the replacement classes occupy the exact same fully-qualified names.

This matters for maintainability. It creates a very clear boundary:

the upstream class still defines the behavior contract,
the GraalVM replacement provides a compatible implementation strategy,
and the packaging makes that swap explicit.

For example, OAL loading changes from runtime compilation into manifest-driven loading of precompiled classes. Similar replacements handle MAL and LAL DSL loading, module wiring, config initialization, and several reflection-sensitive paths. The goal is not to fork everything. The goal is to replace only the places where the runtime model is fundamentally unfriendly to native image.

That boundary is then guarded by tests that hash the upstream source files corresponding to the replacements. When upstream changes one of those files, the build fails and tells us exactly which replacement needs review. This is what turns “keeping up with upstream” from an anxiety problem into a visible engineering task.
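
A hash-based staleness check of this kind can be sketched as follows. The snippet is a Python illustration under stated assumptions: the recorded-hash dictionary, the helper names, and the file name UpstreamLoader.java are hypothetical, and the distro's actual tests are not shown in this post.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_drift(recorded: dict, root: Path) -> list:
    """Return the relative paths whose current hash differs from the recorded one."""
    drifted = []
    for rel_path, expected in sorted(recorded.items()):
        target = root / rel_path
        # A missing file counts as drift too: upstream may have moved or deleted it.
        if not target.exists() or sha256_of(target) != expected:
            drifted.append(rel_path)
    return drifted


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    src = root / "UpstreamLoader.java"  # hypothetical upstream source file
    src.write_text("class UpstreamLoader {}\n", encoding="utf-8")
    recorded = {"UpstreamLoader.java": sha256_of(src)}  # hash captured at review time

    clean = find_drift(recorded, root)
    src.write_text("class UpstreamLoader { int added; }\n", encoding="utf-8")  # simulated upstream change
    drifted = find_drift(recorded, root)
```

A test that asserts the drift list is empty fails with the exact file names, which is what points the reviewer at the replacement that needs attention.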

Reflection Config Is Generated, Not Guessed

In many GraalVM migrations, reflect-config.json becomes a manually accumulated artifact. It grows over time, gets stale, and nobody is fully sure whether it is complete or why each entry exists. That approach does not scale well for a large, evolving OSS platform.

In this distro, reflection metadata is generated from the build outputs and scanned classes:

manifests for OAL, MAL, LAL, Hierarchy, and meter-generated classes,
annotation-scanned classes,
Armeria HTTP handlers,
GraphQL resolvers and schema-mapped types,
and accepted ModuleConfig classes.

This is a much healthier model. Instead of asking people to remember every reflective access path, the system derives reflection metadata from the actual migration pipeline. The build becomes the source of truth.
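To make the idea of derived reflection metadata concrete, here is a small sketch that turns a list of class names into reflect-config.json entries. The class names are hypothetical and the real generator in the distro is not shown in this article; only the JSON keys follow the GraalVM reflection configuration format.

```python
import json


def build_reflect_config(class_names):
    """Build reflect-config.json entries for fully-qualified class names.

    Duplicates are dropped and the output is sorted so that the generated
    file is reproducible from one build to the next.
    """
    return [
        {
            "name": name,
            "allDeclaredConstructors": True,
            "allDeclaredMethods": True,
            "allDeclaredFields": True,
        }
        for name in sorted(set(class_names))
    ]


# Hypothetical manifest content: one generated class name per entry.
manifest = [
    "org.example.generated.ServiceRespTimeMetrics",
    "org.example.generated.ServiceCpmMetrics",
    "org.example.generated.ServiceCpmMetrics",
]
print(json.dumps(build_reflect_config(manifest), indent=2))
```

Because the input is whatever the build actually produced, an entry exists only if a generated or scanned class needs it.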

Keeping Upstream Sync Practical

If this distro were only a one-time engineering sprint, it would be much less interesting. The real challenge is keeping it alive while upstream SkyWalking continues to evolve.

That is why the repo includes explicit inventories and drift detectors:

provider inventories that force new upstream providers to be categorized,
rule-file inventories that force new DSL inputs to be acknowledged,
SHA watchers for precompiled YAML inputs,
and SHA watchers for upstream source files with GraalVM-specific replacements.
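An inventory check of this kind boils down to a set comparison. The sketch below assumes two plain lists, one discovered from upstream and one maintained by hand; the names are invented for illustration and do not come from the repository.

```python
def check_inventory(discovered, categorized):
    """Compare what upstream ships with what the inventory acknowledges.

    Returns (uncategorized, stale):
      uncategorized - present upstream but missing from the inventory
      stale         - listed in the inventory but no longer present upstream
    """
    discovered, categorized = set(discovered), set(categorized)
    return sorted(discovered - categorized), sorted(categorized - discovered)


# Hypothetical provider names.
uncategorized, stale = check_inventory(
    discovered=["storage-banyandb", "cluster-kubernetes", "receiver-new"],
    categorized=["storage-banyandb", "cluster-kubernetes", "receiver-old"],
)
print(uncategorized, stale)
```

A test would fail whenever either list is non-empty, which forces every new upstream provider or rule file to be categorized explicitly.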

Good abstraction is not only about elegant code structure. It is about choosing a migration design that can survive contact with future change.

Benchmark Results

We benchmarked the standard JVM OAP against the GraalVM Distro on an Apple M3 Max (macOS, Docker Desktop, 10 CPUs / 62.7 GB), both connecting to BanyanDB.

Boot Test (Docker Compose, no traffic, median of 3 runs)

Metric	JVM OAP	GraalVM OAP	Delta
Cold boot startup	635 ms	5 ms	~127x faster
Warm boot startup	630 ms	5 ms	~126x faster
Idle RSS	~1.2 GiB	~41 MiB	~97% reduction

Boot time is measured from OAP’s first application log timestamp to the listening on 11800 log line (gRPC server ready).

Under Sustained Load (Kind + Istio 1.25.2 + Bookinfo at ~20 RPS, 2 OAP replicas)

30 samples at 10s intervals after 60s warmup.

Metric	JVM OAP	GraalVM OAP	Delta
CPU median (millicores)	101	68	-33%
CPU avg (millicores)	107	67	-37%
Memory median (MiB)	2068	629	-70%
Memory avg (MiB)	2082	624	-70%

Both variants reported identical entry-service CPM, confirming equivalent traffic processing capability.

Service metrics collected every 30s via swctl for all discovered services: service_cpm, service_resp_time, service_sla, service_apdex, service_percentile.

Full benchmark scripts and raw data are in the benchmark/ directory of the distro repository.
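The Delta columns follow directly from the reported numbers. The short calculation below reproduces them; the only assumption is that the ~1.2 GiB idle RSS is converted to MiB as 1.2 × 1024.

```python
def speedup(baseline, candidate):
    """How many times faster the candidate is than the baseline."""
    return baseline / candidate


def percent_change(baseline, candidate):
    """Relative change of the candidate against the baseline, in percent."""
    return (candidate - baseline) / baseline * 100


print(speedup(635, 5))                         # cold boot, JVM vs GraalVM
print(speedup(630, 5))                         # warm boot
print(round(percent_change(1.2 * 1024, 41)))   # idle RSS in MiB
print(round(percent_change(101, 68)))          # CPU median
print(round(percent_change(107, 67)))          # CPU avg
print(round(percent_change(2068, 629)))        # memory median
print(round(percent_change(2082, 624)))        # memory avg
```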

Current Status

The project is a runnable experimental distribution, hosted in its own repository: apache/skywalking-graalvm-distro.

The current distro intentionally focuses on a modern, high-performance operating model:

Storage: BanyanDB
Cluster modes: Standalone and Kubernetes
Configuration: none or Kubernetes ConfigMap
Runtime model: fixed module set, precompiled assets, and AOT-friendly wiring

This focus is deliberate. A repeatable migration system starts by making a clear scope runnable, then expanding without losing control.

Getting Started

Because the SkyWalking GraalVM Distro is designed for peak performance, it is optimized to work with BanyanDB as its storage backend. The current published image is available on Docker Hub, and you can boot the stack using the following docker-compose.yml.

version: '3.8'

services:
  banyandb:
    image: ghcr.io/apache/skywalking-banyandb:e1ba421bd624727760c7a69c84c6fe55878fb526
    container_name: banyandb
    restart: always
    ports:
      - "17912:17912"
      - "17913:17913"
    command: standalone --stream-root-path /tmp/stream-data --measure-root-path /tmp/measure-data --measure-metadata-cache-wait-duration 1m --stream-metadata-cache-wait-duration 1m
    healthcheck:
      test: ["CMD", "sh", "-c", "nc -nz 127.0.0.1 17912"]
      interval: 5s
      timeout: 10s
      retries: 120

  oap:
    image: apache/skywalking-graalvm-distro:0.1.1
    container_name: oap
    depends_on:
      banyandb:
        condition: service_healthy
    restart: always
    ports:
      - "11800:11800"
      - "12800:12800"
    environment:
      SW_STORAGE: banyandb
      SW_STORAGE_BANYANDB_TARGETS: banyandb:17912
      SW_HEALTH_CHECKER: default
    healthcheck:
      test: ["CMD-SHELL", "nc -nz 127.0.0.1 11800 || exit 1"]
      interval: 5s
      timeout: 10s
      retries: 120

  ui:
    image: ghcr.io/apache/skywalking/ui:10.3.0
    container_name: ui
    depends_on:
      oap:
        condition: service_healthy
    restart: always
    ports:
      - "8080:8080"
    environment:
      SW_OAP_ADDRESS: http://oap:12800

Simply run:

docker compose up -d

We invite the community to test this new distribution, report issues, and help us move it toward a production-ready state.

Special thanks to the GraalVM team for the technology foundation.

Blog: Diagnose Service Mesh Network Performance with eBPF

Tue, 27 Sep 2022 00:00:00 +0000

Background

This article will show how to use Apache SkyWalking with eBPF to make network troubleshooting easier in a service mesh environment.

Apache SkyWalking is an application performance monitoring tool for distributed systems. It observes metrics, logs, traces, and events in the service mesh environment and uses that data to generate a dependency graph of your pods and services. This dependency graph can provide quick insights into your system, especially when there’s an issue.

However, when troubleshooting network issues in SkyWalking’s service topology, it is not always easy to pinpoint where the error actually is. There are two reasons for the difficulty:

Traffic through the Envoy sidecar is not easy to observe. Data from Envoy’s Access Log Service (ALS) shows traffic between services (sidecar-to-sidecar), but not metrics on communication between the Envoy sidecar and the service it proxies. Without that information, it is more difficult to understand the impact of the sidecar.
There is a lack of data from transport layer (OSI Layer 4) communication. Since services generally use application layer (OSI Layer 7) protocols such as HTTP, observability data is generally restricted to application layer communication. However, the root cause may actually be in the transport layer, which is typically opaque to observability tools.

Access to metrics from Envoy-to-service and transport layer communication can make it easier to diagnose service issues. To this end, SkyWalking needs to collect and analyze transport layer metrics between processes inside Kubernetes pods - a task well suited to eBPF. We investigated using eBPF for this purpose and present our results and a demo below.

Monitoring Kubernetes Networks with eBPF

With its origins as the Extended Berkeley Packet Filter, eBPF is a general-purpose mechanism for injecting your own code into the Linux kernel and running it there, and it is an excellent tool for monitoring network traffic in Kubernetes Pods. In the next few sections, we'll provide an overview of how to use eBPF for network monitoring as background for introducing SkyWalking Rover, a metrics collector and profiler powered by eBPF that diagnoses CPU and network performance.

How Applications and the Network Interact

Interactions between the application and the network can generally be divided into the following steps from higher to lower levels of abstraction:

User Code: Application code uses high-level network libraries in the application stack to exchange data across the network, like sending and receiving HTTP requests.
Network Library: When the network library receives a network request, it interacts with the language API to send the network data.
Language API: Each language provides an API for operating the network, system, etc. When a request is received, it interacts with the system API. In Linux, this API is called syscalls.
Linux API: When the Linux kernel receives the request through the API, it communicates with the socket to send the data, which is usually closer to an OSI Layer 4 protocol, such as TCP, UDP, etc.
Socket Ops: Sending or receiving the data to/from the NIC.

Our hypothesis is that eBPF can monitor the network. There are two ways to implement the interception: User space (uprobe) or Kernel space (kprobe). The table below summarizes the differences.

	Pros	Cons
uprobe	• Get more application-related contexts, such as whether the current request is HTTP or HTTPS. • Requests and responses can be intercepted by a single method	• Data structures can be unstable, so it is more difficult to get the desired data. • Implementation may differ between language/library versions. • Does not work in applications without symbol tables.
kprobe	• Available for all languages. • The data structure and methods are stable and do not require much adaptation. • Easier correlation with underlying data, such as getting the destination address of TCP, OSI Layer 4 protocol metrics, etc.	• A single request and response may be split into multiple probes. • Contextual information is not easy to get for stateful requests. For example header compression in HTTP/2.

For the general network performance monitor, we chose to use the kprobe (intercept the syscalls) for the following reasons:

It’s available for applications written in any programming language, and it’s stable, so it saves a lot of development/adaptation costs.
It can be correlated with metrics from the system level, which makes it easier to troubleshoot.
Although a single request and response are split across multiple probes, the probe events can be correlated afterwards.
Contextual information is mostly needed for OSI Layer 7 protocol analysis, so it can be ignored when we only monitor network performance.

Kprobes and network monitoring

Following the Linux documentation for network syscalls, we can implement network monitoring by intercepting two types of methods: socket operations and send/receive methods.

Socket Operations

When accepting or connecting with another socket, we can get the following information:

Connection information: Includes the remote address from the connection which helps us to understand which pod is connected.
Connection statistics: Includes basic metrics from sockets, such as round-trip time (RTT), lost packet count in TCP, etc.
Socket and file descriptor (FD) mapping: Includes the relationship between the Linux file descriptor and socket object. It is useful when sending and receiving data through a Linux file descriptor.

Send/Receive

The interface related to sending or receiving data is the focus of performance analysis. It mainly contains the following parameters:

Socket file descriptor: The file descriptor of the current operation corresponding to the socket.
Buffer: The data sent or received, passed as a byte array.

Based on the above parameters, we can analyze the following data:

Bytes: The size of the packet in bytes.
Protocol: The protocol analysis according to the buffer data, such as HTTP, MySQL, etc.
Execution Time: The time it takes to send/receive the data.
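As an illustration of protocol analysis on buffer data, the sketch below guesses whether a buffer starts an HTTP/1.x message by looking at its first line. It is a simplified user-space example under stated assumptions, not the detection logic SkyWalking Rover actually uses; the function name and the method list are chosen for the example.

```python
HTTP_METHODS = (b"GET ", b"POST ", b"PUT ", b"DELETE ", b"HEAD ", b"OPTIONS ", b"PATCH ")


def guess_protocol(buf: bytes) -> str:
    """Return "http" if the buffer looks like the start of an HTTP/1.x message."""
    first_line = buf.split(b"\r\n", 1)[0]
    if first_line.startswith(b"HTTP/1."):
        return "http"  # response status line, e.g. "HTTP/1.1 200 OK"
    if first_line.startswith(HTTP_METHODS) and b" HTTP/1." in first_line:
        return "http"  # request line, e.g. "GET / HTTP/1.1"
    return "unknown"


print(guess_protocol(b"GET /productpage HTTP/1.1\r\nHost: example\r\n\r\n"))
print(guess_protocol(b"\x16\x03\x01\x02\x00"))  # a TLS record is not readable as HTTP
```

The second call shows why encrypted traffic needs separate handling: the bytes seen at the syscall boundary carry no recognizable application protocol.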

At this point (Figure 1) we can analyze the following steps for the whole lifecycle of the connection:

Connect/Accept: When the connection is created.
Transfer: Sending and receiving data on the connection.
Close: When the connection is closed.

Figure 1
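The lifecycle above can be modeled by folding probe events into per-connection records keyed by process and file descriptor. This is a user-space sketch with an invented event format (timestamp, pid, fd, kind, size); real probe data is richer and arrives from the kernel.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConnectionStats:
    opened_at: int
    closed_at: Optional[int] = None
    bytes_sent: int = 0
    bytes_received: int = 0


def summarize(events):
    """Fold (timestamp, pid, fd, kind, size) events into connection records.

    A file descriptor can be reused after close, so a finished connection is
    moved to `finished` and its (pid, fd) key becomes free again.
    """
    active, finished = {}, []
    for ts, pid, fd, kind, size in events:
        key = (pid, fd)
        if kind in ("connect", "accept"):
            active[key] = ConnectionStats(opened_at=ts)
        elif key in active:
            conn = active[key]
            if kind == "write":
                conn.bytes_sent += size
            elif kind == "read":
                conn.bytes_received += size
            elif kind == "close":
                conn.closed_at = ts
                finished.append(active.pop(key))
    return active, finished
```

Events for an unknown (pid, fd) pair are ignored here; a real collector would have to decide how to handle connections that were opened before monitoring started.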

Protocol and TLS

The previous section described how to analyze connections using send or receive buffer data, for example by parsing the buffer according to the HTTP/1.1 message specification. However, this does not work for TLS requests/responses.

Figure 2

When TLS is in use, the Linux kernel only sees data that has already been encrypted in user space. As shown in the figure above, the application usually transmits SSL data through a third-party library (such as OpenSSL). In this case, the Linux API can only get the encrypted data, so it cannot recognize any higher-layer protocol. To get the unencrypted data inside eBPF, we need to follow these steps:

Read unencrypted data through uprobe: To stay compatible with multiple languages, use a uprobe to capture the data while it is still unencrypted, that is, before sending or after receiving. In this way, we can get the original data.
Associate with socket: Associate the unencrypted data with the socket it is sent or received on.

OpenSSL Use case

For example, the most common way to send/receive SSL data is to use OpenSSL as a shared library, specifically the SSL_read and SSL_write methods to submit the buffer data with the socket.

Following the documentation, we can intercept these two methods, which are almost identical to the API in Linux. The source code of the SSL structure in OpenSSL shows that the Socket FD exists in the BIO object of the SSL structure, and we can get it by the offset.

In summary, with knowledge of how OpenSSL works, we can read unencrypted data in an eBPF function.

Introducing SkyWalking Rover, an eBPF-based Metrics Collector and Profiler

SkyWalking Rover introduces the eBPF network profiling feature into the SkyWalking ecosystem. It’s currently supported in a Kubernetes environment, so it must be deployed inside a Kubernetes cluster. Once the deployment is complete, SkyWalking Rover can monitor the network for all processes inside a given Pod. Based on the monitoring data, SkyWalking can generate the topology relationship diagram and metrics between processes.

Topology Diagram

The topology diagram can help us understand the network access between processes inside the same Pod, and between a process and the external environment (another Pod or service). Additionally, it can identify the data direction of traffic based on the line flow direction.

In Figure 3 below, all nodes within the hexagon are the internal processes of a Pod, and nodes outside the hexagon are externally associated services or Pods. Nodes are connected by lines, which indicate the direction of requests or responses between nodes (client or server). The protocol is indicated on the line, and it’s either HTTP(S), TCP, or TCP(TLS). Also, we can see in this figure that the line between Envoy and Python applications is bidirectional because Envoy intercepts all application traffic.

Figure 3

Metrics

Once we recognize the network call relationship between processes through the topology, we can select a specific line and view the TCP metrics between the two processes.

The diagram below (Figure 4) shows the metrics of network monitoring between two processes. There are four metrics in each line. Two on the left side are on the client side, and two on the right side are on the server side. If the remote process is not in the same Pod, only one side of the metrics is displayed.

Figure 4

The following two metric types are available:

Counter: Records totals over a certain period. Each counter contains the following data: (a) Count: execution count; (b) Bytes: packet size in bytes; (c) Execution time: execution duration.
Histogram: Records the distribution of data in the buckets.
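The two metric types can be pictured with a small sketch. The field names and bucket boundaries are assumptions for illustration only and are not Rover's actual data model.

```python
from bisect import bisect_left
from dataclasses import dataclass, field


@dataclass
class Counter:
    count: int = 0          # number of executions
    total_bytes: int = 0    # sum of packet sizes in bytes
    exe_time: int = 0       # sum of execution durations

    def record(self, size, duration):
        self.count += 1
        self.total_bytes += size
        self.exe_time += duration


@dataclass
class Histogram:
    bounds: list                                  # sorted, inclusive upper bounds
    buckets: list = field(default_factory=list)   # one extra bucket for overflow

    def __post_init__(self):
        self.buckets = [0] * (len(self.bounds) + 1)

    def record(self, value):
        # bisect_left finds the first bound that is >= value.
        self.buckets[bisect_left(self.bounds, value)] += 1


writes = Counter()
writes.record(size=512, duration=3)
latency = Histogram(bounds=[1, 5, 10])
for sample in (0.5, 1, 3, 10, 50):
    latency.record(sample)
print(writes, latency.buckets)
```

A counter answers "how much in total", while the histogram keeps the shape of the distribution, which is why several metrics in the table are exposed as both.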

Based on the above data types, the following metrics are exposed:

Name	Type	Unit	Description
Write	Counter and histogram	Millisecond	The socket write counter.
Read	Counter and histogram	Millisecond	The socket read counter.
Write RTT	Counter and histogram	Microsecond	The socket write round trip time (RTT) counter.
Connect	Counter and histogram	Millisecond	The socket connect/accept counter for connections with another server/client.
Close	Counter and histogram	Millisecond	The socket close counter.
Retransmit	Counter	Millisecond	The socket packet retransmission counter.
Drop	Counter	Millisecond	The socket packet drop counter.

Demo

In this section, we demonstrate how to perform network profiling in the service mesh. To follow along, you will need a running Kubernetes environment.

NOTE: All commands and scripts are available in this GitHub repository.

Install Istio

Istio is the most widely deployed service mesh, and comes with a complete demo application that we can use for testing. To install Istio and the demo application, follow these steps:

Install Istio using the demo configuration profile.
Label the default namespace, so Istio automatically injects Envoy sidecar proxies when we deploy the application.
Deploy the bookinfo application to the cluster.
Deploy the traffic generator to generate some traffic to the application.

export ISTIO_VERSION=1.13.1

# install istio
istioctl install -y --set profile=demo
kubectl label namespace default istio-injection=enabled

# deploy the bookinfo applications
kubectl apply -f https://raw.githubusercontent.com/istio/istio/$ISTIO_VERSION/samples/bookinfo/platform/kube/bookinfo.yaml
kubectl apply -f https://raw.githubusercontent.com/istio/istio/$ISTIO_VERSION/samples/bookinfo/networking/bookinfo-gateway.yaml
kubectl apply -f https://raw.githubusercontent.com/istio/istio/$ISTIO_VERSION/samples/bookinfo/networking/destination-rule-all.yaml
kubectl apply -f https://raw.githubusercontent.com/istio/istio/$ISTIO_VERSION/samples/bookinfo/networking/virtual-service-all-v1.yaml

# generate traffic
kubectl apply -f https://raw.githubusercontent.com/mrproliu/skywalking-network-profiling-demo/main/resources/traffic-generator.yaml

Install SkyWalking

The following will install the storage, backend, and UI needed for SkyWalking:

git clone https://github.com/apache/skywalking-kubernetes.git
cd skywalking-kubernetes
cd chart
helm dep up skywalking
helm -n istio-system install skywalking skywalking \
 --set fullnameOverride=skywalking \
 --set elasticsearch.minimumMasterNodes=1 \
 --set elasticsearch.imageTag=7.5.1 \
 --set oap.replicas=1 \
 --set ui.image.repository=apache/skywalking-ui \
 --set ui.image.tag=9.2.0 \
 --set oap.image.tag=9.2.0 \
 --set oap.envoy.als.enabled=true \
 --set oap.image.repository=apache/skywalking-oap-server \
 --set oap.storageType=elasticsearch \
 --set oap.env.SW_METER_ANALYZER_ACTIVE_FILES='network-profiling'

Install SkyWalking Rover

SkyWalking Rover is deployed on every node in Kubernetes, and it automatically detects the services in the Kubernetes cluster. The network profiling feature was released in version 0.3.0 of SkyWalking Rover. When a network monitoring task is created, SkyWalking Rover sends the data to the SkyWalking backend.

kubectl apply -f https://raw.githubusercontent.com/mrproliu/skywalking-network-profiling-demo/main/resources/skywalking-rover.yaml

Start the Network Profiling Task

Once all deployments are completed, we must create a network profiling task for a specific instance of the service in the SkyWalking UI.

To open SkyWalking UI, run:

kubectl port-forward svc/skywalking-ui 8080:80 --namespace istio-system

Currently, we can select the specific instances that we wish to monitor by clicking the Data Plane item in the Service Mesh panel and the Service item in the Kubernetes panel.

In the figure below, we have selected an instance with a list of tasks in the network profiling tab. When we click the start button, the SkyWalking Rover starts monitoring this instance’s network.

Figure 5

Done!

After a few seconds, you will see the process topology appear on the right side of the page.

Figure 6

When you click on the line between processes, you can see the TCP metrics between the two processes.

Figure 7

Conclusion

In this article, we detailed a problem that makes troubleshooting service mesh architectures difficult: lack of context between layers in the network stack. These are the cases where eBPF can help with debugging when the existing service mesh and Envoy data cannot. Then, we researched how eBPF could be applied to common communication, such as TLS. Finally, we demonstrated the implementation of this process with SkyWalking Rover.

For now, we have completed the performance analysis for OSI layer 4 (mostly TCP). In the future, we will also introduce the analysis for OSI layer 7 protocols like HTTP.

Get Started with Istio

To get started with service mesh today, Tetrate Istio Distro is the easiest way to install, manage, and upgrade Istio. It provides a vetted upstream distribution of Istio that’s tested and optimized for specific platforms by Tetrate plus a CLI that facilitates acquiring, installing, and configuring multiple Istio versions. Tetrate Istio Distro also offers FIPS certified Istio builds for FedRAMP environments.

For enterprises that need a unified and consistent way to secure and manage services and traditional workloads across complex, heterogeneous deployment environments, we offer Tetrate Service Bridge, our flagship edge-to-workload application connectivity platform built on Istio and Envoy.


Blog: Scaling with Apache SkyWalking

Mon, 24 Jan 2022 00:00:00 +0000

Background

In the Apache SkyWalking ecosystem, the OAP obtains metrics, traces, logs, and event data through SkyWalking Agent, Envoy, or other data sources. Under the gRPC protocol, each data source transmits data over a connection to a single server node. Only when that connection is broken is the reconnect policy, which is based on DNS round-robin, applied. When new services are added at runtime, or the OAP load stays high because the observed services receive more traffic, the OAP cluster needs to scale out. The new OAP nodes would receive less load, because all existing agents are already connected to the previous nodes. Even without scaling, the load on the OAP nodes would be unbalanced, because each agent keeps the connection it picked at random at the booting stage. In these cases, it becomes a challenge to keep all nodes healthy and to scale out when needed.

In this article, we mainly discuss how to solve this challenge in SkyWalking.

How to Load Balance

SkyWalking mainly uses the gRPC protocol for data transmission, so this article mainly introduces load balancing in the gRPC protocol.

Proxy Or Client-side

Based on the gRPC official Load Balancing blog, there are two approaches to load balancing:

Client-side: The client perceives multiple back-end services and uses a load-balancing algorithm to select a back-end service for each RPC.
Proxy: The client sends the message to the proxy server, and the proxy server load balances the message to the back-end service.

From the perspective of observability system architecture:

	Pros	Cons
Client-side	High performance because of the elimination of extra hop	Complex client (cluster awareness, load balancing, health check, etc.); every data source to be connected must provide these complex client capabilities
Proxy	Simple Client	Higher latency

We choose Proxy mode for the following reasons:

Observability data is not very time-sensitive, so the small latency added by an extra hop is acceptable, and there is no impact on the client side.
As an observability platform, we cannot/should not ask clients to change. They make their own tech decisions and may have their own commercial considerations.

Transmission Policy

In the proxy mode, we should determine the transmission path between downstream and upstream.

Different data protocols require different processing policies. There are two transmission policies:

Synchronous: Suitable for protocols that require data exchange in the client, such as SkyWalking Dynamic Configuration Service. This type of protocol provides real-time results.
Asynchronous batch: Used when the client doesn’t care about the upstream processing results, but only the transmitted data (e.g., trace report, log report, etc.)

The synchronous policy requires that the proxy send the message to the upstream server when it receives the client message, and synchronously return the response data to the downstream client. Usually, only a few protocols need to use the synchronous policy.

As shown below, after the client sends the request to the proxy, the proxy sends the message to the server synchronously. When the proxy receives the result, it returns it to the client.

The asynchronous batch policy means that the data is sent to the upstream server in batches asynchronously. This policy is more common because most protocols in SkyWalking are primarily based on data reporting. We think a queue works well as a buffer here. The asynchronous batch policy is executed according to the following steps:

The proxy receives the data and wraps it as an Event object.
An event is added into the queue.
When the cycle time is reached or the queue reaches a fixed number of elements, the elements in the queue are consumed in parallel and sent to the OAP.
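The steps above can be sketched as a small batching buffer. To keep the example deterministic it is single-threaded and takes the current time as a parameter, whereas the proxy described here consumes the queue in parallel; the class name, method names, and parameters are assumptions for illustration.

```python
from collections import deque


class BatchQueue:
    """Buffer events and flush when the batch is full or the interval elapses."""

    def __init__(self, send, max_batch=3, flush_interval=1.0):
        self.send = send                  # callable that receives a list of events
        self.max_batch = max_batch
        self.flush_interval = flush_interval
        self.events = deque()
        self.last_flush = 0.0

    def offer(self, event, now):
        """Add one event; flush immediately if the size threshold is reached."""
        self.events.append(event)
        if len(self.events) >= self.max_batch:
            self.flush(now)

    def tick(self, now):
        """Call periodically; flushes pending events once the interval has passed."""
        if self.events and now - self.last_flush >= self.flush_interval:
            self.flush(now)

    def flush(self, now):
        batch = list(self.events)
        self.events.clear()
        self.last_flush = now
        self.send(batch)


sent = []
queue = BatchQueue(send=sent.append, max_batch=3, flush_interval=1.0)
for i, name in enumerate(["a", "b", "c", "d"]):
    queue.offer(name, now=0.1 * (i + 1))
queue.tick(now=1.5)
print(sent)
```

The first three events leave as one batch because the size threshold is hit, and the fourth leaves later when the timer fires.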

The advantages of using queues are:

Separate data receiving and sending to reduce the mutual influence.
The interval quantization mechanism can be used to combine events, which helps to speed up sending events to the OAP.
Using multi-threaded consumption queue events can make fuller use of network IO.

As shown below, after the proxy receives the message, it wraps the message as an event and pushes it to the queue. The message sender takes batches of events from the queue and sends them to the upstream OAP.

Routing

Routing algorithms are used to route messages to a single upstream server node.

The Round-Robin algorithm selects nodes in order from the list of upstream service nodes. The advantage of this algorithm is that every node is selected equally often. When the data items are close to the same size, each upstream node handles about the same amount of data.

With the Weight Round-Robin, each upstream server node has a corresponding routing weight. The difference from Round-Robin is that each upstream node is selected in proportion to its weight. This algorithm is more suitable when the upstream server nodes do not all have the same machine configuration.
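Both selection strategies can be shown in a few lines. The weighted variant below uses the simplest possible approach, repeating each node according to its weight, which is an assumption made for brevity; production implementations usually interleave nodes more smoothly. Node names are hypothetical.

```python
from itertools import cycle, islice


def round_robin(nodes):
    """Yield nodes in order, forever."""
    return cycle(nodes)


def weighted_round_robin(weights):
    """Yield nodes in proportion to their positive integer weights.

    `weights` maps node name -> weight; a node with weight 2 appears twice
    in every cycle.
    """
    expanded = [node for node, weight in weights.items() for _ in range(weight)]
    return cycle(expanded)


print(list(islice(round_robin(["oap-0", "oap-1", "oap-2"]), 5)))
print(list(islice(weighted_round_robin({"oap-big": 2, "oap-small": 1}), 6)))
```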

The Fixed algorithm is a hybrid algorithm. It ensures that the same data is routed to the same upstream server node, and when the upstream servers scale out, it keeps routing to the same node; it reroutes only when that upstream node no longer exists. This algorithm is mainly used in the SkyWalking Meter protocol because this protocol needs to ensure that the metrics of the same service instance are sent to the same OAP node. The routing steps are as follows:

Generate a unique identification string based on the data content, as short as possible, so that the amount of stored data stays controllable.
Look up the upstream node for the identification string in the LRU cache, and use it if it exists.
Otherwise, generate a hash value from the identification string, and use it to find the upstream server node in the upstream list.
Save the mapping between the identification string and the upstream server node to the LRU cache.

The advantage of this algorithm is that it binds data to an upstream server node as much as possible, so the upstream server can better process continuous data. The disadvantage is that it takes up a certain amount of memory to store the mapping.
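The routing steps above can be sketched as follows. This is an illustration rather than Satellite's implementation: the class name, the SHA-256 hash, the cache size, and the node names are all assumptions.

```python
import hashlib
from collections import OrderedDict


class FixedRouter:
    def __init__(self, upstreams, cache_size=1024):
        self.upstreams = list(upstreams)
        self.cache_size = cache_size
        self.cache = OrderedDict()  # identity -> upstream node, in LRU order

    def route(self, identity: str) -> str:
        node = self.cache.get(identity)
        if node in self.upstreams:
            # The cached node still exists: keep using it, even after scale-out.
            self.cache.move_to_end(identity)
            return node
        # No usable cache entry: hash the identity and take the remainder.
        digest = hashlib.sha256(identity.encode()).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.upstreams)
        node = self.upstreams[index]
        self.cache[identity] = node
        self.cache.move_to_end(identity)
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # evict the least recently used entry
        return node


router = FixedRouter(["oap-0", "oap-1"])
first = router.route("service-a@instance-1")
router.upstreams.append("oap-2")  # scale out
print(first == router.route("service-a@instance-1"))
```

The cache is what keeps the binding stable when the upstream list grows; without it, the remainder would change for many identities as soon as a node is added.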

As shown below, the image is divided into two parts:

The left side shows that the same data content is always routed to the same server node.
The right side shows the data routing algorithm: derive a number from the data, and use the remainder (modulo) operation to obtain the position in the upstream list.

We choose to use a combination of the Round-Robin and Fixed algorithms for routing:

The Fixed routing algorithm is suitable for specific protocols, and is mainly used when passing metrics data through the SkyWalking Meter protocol.
The Round-Robin algorithm is used by default. When the SkyWalking OAP cluster is deployed, the nodes should be configured as similarly as possible, so there is no need to use the Weight Round-Robin algorithm.

How to balance the load balancer itself?

The proxy still needs to deal with load balancing from the clients to itself, especially when a proxy cluster is deployed in a production environment.

There are three ways to solve this problem:

Connection management: Use the max_connection config on the client-side to specify the maximum connection duration of each connection. For more information, please read the proposal.
Cluster awareness: The proxy has cluster awareness, and actively closes connections when the load is unbalanced so that the client picks a proxy again.
Resource limit+HPA: Limit the connection resources of each proxy, and stop accepting new connections when the resource limit is reached. Use the HPA mechanism of Kubernetes to dynamically scale out the number of proxies.

	Connection management	Cluster awareness	Resource Limit+HPA
Pros	Simple to use	Ensure that the number of connections in each proxy is relatively balanced	Simple to use
Cons	Each client needs to ensure that data is not lost; the client is required to accept GOAWAY responses	May cause a sudden increase in traffic on some nodes; each client needs to ensure that data is not lost	Traffic will not be particularly balanced in each instance

We choose Resource Limit+HPA for these reasons:

The proxy is easy to configure and use, and its behavior is easy to understand from basic metrics.
No data is lost due to broken connections. There is no need for the client to implement any other protocols to prevent data loss, especially when the client is a commercial product.
The connections do not need to be particularly balanced across the nodes in the proxy cluster, as long as each proxy node itself is high-performance.

SkyWalking-Satellite

We have implemented this Proxy in the SkyWalking-Satellite project. It’s used between Client and SkyWalking OAP, effectively solving the load balancing problem.

After the system is deployed, Satellite accepts the traffic from the client, discovers all the OAP nodes through a Kubernetes label selector or manual configuration, and load balances the traffic to the upstream OAP nodes.

As shown below, a single client still maintains a connection with a single Satellite, while Satellite establishes a connection with each OAP and load balances messages to the OAP nodes.

When scaling Satellite, we need to deploy the SWCK adapter and configure the HPA in Kubernetes. SWCK is a platform for SkyWalking users that provisions, upgrades, and maintains SkyWalking-related components, and makes them work natively on Kubernetes.

After deployment is finished, the following steps are performed:

Read metrics from OAP: HPA requests the SWCK metrics adapter to dynamically read the metrics in the OAP.
Scaling the Satellite: When Kubernetes HPA detects that the metric values meet the configured threshold, Satellite is scaled automatically.

As shown below, the dotted line divides the diagram into two parts. HPA uses the SWCK adapter to read the metrics in the OAP. When the threshold is met, HPA scales the Satellite deployment.
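For intuition about the scaling decision, the sketch below applies the replica formula documented for the Kubernetes Horizontal Pod Autoscaler: desired = ceil(current replicas × current metric / target metric), skipped when the ratio is within a tolerance of 1. The metric values, bounds, and function name are hypothetical, and the real controller has further rules (stabilization windows, pod readiness) that are left out.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Return the replica count suggested by the basic HPA formula."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to the target: do not scale
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


# Hypothetical: 2 Satellite replicas, metric at 150 against a target of 100.
print(desired_replicas(2, current_metric=150, target_metric=100))
```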

Example

In this section, we will demonstrate two cases:

SkyWalking Scaling: After SkyWalking OAP scales, the traffic is automatically load balanced through Satellite.
Satellite Scaling: Satellite’s own traffic load balancing.

NOTE: All commands are available on GitHub.

SkyWalking Scaling

We will use the bookinfo application to demonstrate how to integrate Apache SkyWalking 8.9.1 with Apache SkyWalking-Satellite 0.5.0, and observe the service mesh through the Envoy ALS protocol.

Before starting, please make sure that you already have a Kubernetes environment.

Install Istio

Istio provides a very convenient way to configure the Envoy proxy and enable the access log service. Follow these steps:

Install istioctl locally to help manage the Istio mesh.
Install Istio into the Kubernetes environment with the demo configuration profile, enable the Envoy ALS, and send the ALS messages to the Satellite, which we will deploy later.
Add the label to the default namespace so Istio automatically injects Envoy sidecar proxies when you deploy your application later.

# install istioctl
export ISTIO_VERSION=1.12.0
curl -L https://istio.io/downloadIstio | sh - 
sudo mv $PWD/istio-$ISTIO_VERSION/bin/istioctl /usr/local/bin/

# install istio
istioctl install -y --set profile=demo \
	--set meshConfig.enableEnvoyAccessLogService=true \
	--set meshConfig.defaultConfig.envoyAccessLogService.address=skywalking-system-satellite.skywalking-system:11800

# enable envoy proxy in default namespace
kubectl label namespace default istio-injection=enabled

Install SWCK

SWCK makes it convenient for users to deploy and upgrade SkyWalking-related components on Kubernetes. The automatic scaling of Satellite also relies mainly on SWCK. For more information, refer to the official documentation.

# Install cert-manager
kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.3.1/cert-manager.yaml

# Deploy SWCK
mkdir -p skywalking-swck && cd skywalking-swck
wget https://dlcdn.apache.org/skywalking/swck/0.6.1/skywalking-swck-0.6.1-bin.tgz
tar -zxvf skywalking-swck-0.6.1-bin.tgz
cd config
kubectl apply -f operator-bundle.yaml

Deploy Apache SkyWalking And Apache SkyWalking-Satellite

We have provided a simple script to deploy the SkyWalking OAP, UI, and Satellite.

# Create the skywalking components namespace
kubectl create namespace skywalking-system
kubectl label namespace skywalking-system swck-injection=enabled
# Deploy components
kubectl apply -f https://raw.githubusercontent.com/mrproliu/sw-satellite-demo-scripts/5821a909b647f7c8f99c70378e197630836f45f7/resources/sw-components.yaml

Deploy Bookinfo Application

export ISTIO_VERSION=1.12.0
kubectl apply -f https://raw.githubusercontent.com/istio/istio/$ISTIO_VERSION/samples/bookinfo/platform/kube/bookinfo.yaml
kubectl wait --for=condition=Ready pods --all --timeout=1200s
kubectl port-forward service/productpage 9080

Next, please open your browser and visit http://localhost:9080. You should be able to see the Bookinfo application. Refresh the webpage several times to generate enough access logs.

Then, you can see the topology and metrics of the Bookinfo application on the SkyWalking Web UI. At this point, you can see that the Satellite is working!

Deploy Monitor

We need to install the OpenTelemetry Collector to collect the metrics of the OAPs and analyze them.

# Add OTEL collector
kubectl apply -f https://raw.githubusercontent.com/mrproliu/sw-satellite-demo-scripts/5821a909b647f7c8f99c70378e197630836f45f7/resources/otel-collector-oap.yaml

kubectl port-forward -n skywalking-system  service/skywalking-system-ui 8080:80

Next, please open your browser and visit http://localhost:8080/ and create a new item on the dashboard. The SkyWalking Web UI pictured below shows how the data content is applied.

Scaling OAP

Scale the number of OAP instances through the deployment.

kubectl scale --replicas=3 -n skywalking-system deployment/skywalking-system-oap

Done!

After a period of time, you will see that the number of OAPs becomes 3, and the ALS traffic is balanced across the OAPs.

Satellite Scaling

Now that we have completed the SkyWalking Scaling demo, we will carry out the Satellite Scaling demo.

Deploy SWCK HPA

SWCK provides an adapter that implements the Kubernetes external metrics API, so that the HPA can act on metrics read from the SkyWalking OAP. We expose the metrics service in Satellite to the OAP and configure an HPA resource to auto-scale the Satellite.

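Install the SWCK adapter into the Kubernetes environment. The path below is relative to the directory in which skywalking-swck was created, so run the command from there: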

kubectl apply -f skywalking-swck/config/adapter-bundle.yaml

Create the HPA resource, and limit each Satellite to handle a maximum of 10 connections:

kubectl apply -f https://raw.githubusercontent.com/mrproliu/sw-satellite-demo-scripts/5821a909b647f7c8f99c70378e197630836f45f7/resources/satellite-hpa.yaml

Then, you can see that there are 9 connections in one Satellite. One Envoy proxy may establish multiple connections to the Satellite.

$ kubectl get HorizontalPodAutoscaler -n skywalking-system
NAME       REFERENCE                                TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
hpa-demo   Deployment/skywalking-system-satellite   9/10      1         3         1          5m18s

Scaling Application

Scaling the application establishes more connections to the Satellite, which lets us verify whether the HPA takes effect.

kubectl scale --replicas=3 deployment/productpage-v1 deployment/details-v1

Done!

By default, Satellite is deployed as a single instance, and a single instance will only accept 11 connections. The HPA resource limits one Satellite to handling 10 connections and uses a stabilization window to make the Satellite scale up stably. In this case, the Bookinfo application runs in 10+ instances after scaling, which means that 10+ connections will be established to the Satellite.

So after the HPA resource is running, the Satellite is automatically scaled up to 2 instances. You can learn about the replica calculation algorithm from the official documentation. Run the following command to view the running status:

$ kubectl get HorizontalPodAutoscaler -n skywalking-system --watch
NAME       REFERENCE                                TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
hpa-demo   Deployment/skywalking-system-satellite   11/10     1         3         1          3m31s
hpa-demo   Deployment/skywalking-system-satellite   11/10     1         3         1          4m20s
hpa-demo   Deployment/skywalking-system-satellite   11/10     1         3         2          4m38s
hpa-demo   Deployment/skywalking-system-satellite   11/10     1         3         2          5m8s
hpa-demo   Deployment/skywalking-system-satellite   6/10      1         3         2          5m23s
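
The jump from 1 to 2 replicas is consistent with the basic formula in the Kubernetes HPA documentation, desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue). The following is a minimal shell sketch of that arithmetic only. The function name desired_replicas is an illustrative assumption, and the tolerance and stabilization-window behavior of the real controller are deliberately left out.

```shell
# Hypothetical helper (not part of SWCK or kubectl).
# Usage: desired_replicas CURRENT_REPLICAS CURRENT_METRIC TARGET_METRIC
# Computes ceil(CURRENT_REPLICAS * CURRENT_METRIC / TARGET_METRIC)
# with integer arithmetic: (a + b - 1) / b rounds a / b up.
desired_replicas() {
  echo $(( ($1 * $2 + $3 - 1) / $3 ))
}

desired_replicas 1 9 10    # 9 connections against a target of 10: prints 1
desired_replicas 1 11 10   # 11 connections against a target of 10: prints 2
```

With 9 connections the single Satellite is kept, while 11 connections push the result to 2, which matches the REPLICAS column above once the stabilization window has passed.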

By observing the “number of connections” metric, we can see that when the number of gRPC connections on a Satellite exceeds 10, the Satellite is automatically scaled through the HPA rule. As a result, the connection number drops back to a normal level (in this example, less than 10).

swctl metrics linear --name satellite_service_grpc_connect_count --service-name satellite::satellite-service

Blog: SkyWalking performance in Service Mesh scenario

Fri, 25 Jan 2019 00:00:00 +0000

Author: Hongtao Gao, Apache SkyWalking & ShardingSphere PMC
GitHub, Twitter, Linkedin

Service mesh receiver was first introduced in Apache SkyWalking 6.0.0-beta. It is designed to provide a common entrance for receiving telemetry data from service mesh frameworks, for instance, Istio, Linkerd, Envoy, etc. What is a service mesh? According to Istio’s explanation:

The term service mesh is used to describe the network of microservices that make up such applications and the interactions between them.

As a PMC member of Apache SkyWalking, I have tested the trace receiver and understand the performance of collectors in the trace scenario well. I would also like to figure out the performance of the service mesh receiver.

Difference between trace and service mesh

The following chart presents a typical trace map:

You can find a variety of elements in it, such as web services, local methods, databases, caches, MQ, and so on. A service mesh, however, currently collects only service network telemetry data, which contains the entrance and exit data of a service (more elements, such as databases, will be added soon). As a result, a smaller quantity of data is sent to the service mesh receiver than to the trace receiver.

Using sidecars is a little different, though. When a client requests “A”, a segment is sent to the service mesh receiver from “A”’s sidecar. If “A” depends on “B”, another segment is sent from “A”’s sidecar. In a trace system, by contrast, the collector receives only one segment. The sidecar model splits one segment into several small segments, which increases the network overhead of the service mesh receiver.

Deployment Architecture

In this test, I pick two different backend deployments. One is called the mini unit, which consists of one collector and one Elasticsearch instance. The other is a standard production cluster, which contains three collectors and three Elasticsearch instances.

The mini unit is a suitable architecture for a dev or test environment. It saves time and VM resources, and speeds up the deployment process.

The standard cluster provides good performance and HA for a production scenario. Although you will pay more money and need to take care of the cluster carefully, its reliability is a good reward.

I picked VMs with 8 CPUs and 16GB of memory to set up the test environment. This test targets the performance of normal usage scenarios, so that choice is reasonable. The cluster is built on Google Kubernetes Engine (GKE), and the nodes are connected to each other through a VPC network. Because running the collector is a CPU-intensive task, the resource request of the collector deployment should be 8 CPUs, which means every collector instance occupies a whole VM node.

Testing Process

The number of mesh fragments received per second (MPS) depends on the following variables.

Ingress queries per second (QPS)
The topology of a microservice cluster
Service mesh mode (proxy or sidecar)

In this test, I use Bookinfo app as a demo cluster.

So every request touches at most 4 nodes. In sidecar mode, every node sends two telemetry records per request, so the MPS will be QPS * 4 * 2.
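
The formula can be checked with a few lines of shell arithmetic. This is only a sketch: the function name mps_from_qps and the sample QPS value of 1000 are illustrative assumptions and are not taken from the test setup.

```shell
# Hypothetical helper: estimate mesh fragments per second (MPS) from ingress QPS.
# Assumes every request touches 4 Bookinfo nodes and, in sidecar mode,
# every node reports 2 telemetry records per request.
mps_from_qps() {
  echo $(( $1 * 4 * 2 ))
}

mps_from_qps 1000   # prints 8000
```

For a different topology or for proxy mode, the two factors would need to be replaced with the values that apply to that system.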

There are also some important metrics that should be explained:

Client Query Latency: GraphQL API query response time heatmap.
Client Mesh Sender: mesh segments sent per second. The total line represents the total amount sent, and the error line is the total number of failed sends.
Mesh telemetry latency: service mesh receiver handling data heatmap.
Mesh telemetry received: received mesh telemetry data per second.

Mini Unit

You can see that the collector processes up to 25k telemetry records per second. The CPU usage is about 4 cores. Most of the query latency is less than 50ms. After logging in to the VM on which the collector instance is running, I found that the system load was reaching the limit (the max is 8).

According to the previous formula, a single collector instance can process about 3k QPS of Bookinfo traffic.

Standard Cluster

Compared to the mini unit, the cluster’s throughput increases linearly. Three instances provide a total processing power of 80k per second. Query latency increases slightly, but it is still very small (less than 500ms). I also checked the system load of every collector instance, and all of them reached the limit. The cluster can process 10k QPS of Bookinfo telemetry data.
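
The QPS figures for both deployments follow from inverting MPS = QPS * 4 * 2. The sketch below reproduces that arithmetic; the function name qps_from_mps is an illustrative assumption, and the inputs are the throughput numbers reported above.

```shell
# Hypothetical helper: convert a measured MPS back to Bookinfo ingress QPS
# using MPS = QPS * 4 * 2 (integer division, so the result is rounded down).
qps_from_mps() {
  echo $(( $1 / (4 * 2) ))
}

qps_from_mps 25000   # mini unit: prints 3125, roughly 3k QPS
qps_from_mps 80000   # standard cluster: prints 10000, that is 10k QPS
```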

Conclusion

Let’s wrap them up. There are some important things you could get from this test.

QPS varies with the three variables. The specific test results in this blog are not what matters; users should choose values appropriate to their own systems.
The collector cluster’s processing power can scale out.
The collector is a CPU-intensive application, so you should provide sufficient CPU resources to it.

This blog gives people a common method to evaluate the throughput of the Service Mesh Receiver. Users can use it to design their Apache SkyWalking backend deployment architecture.