
Zipkin VS Pinpoint

The background


Pinpoint is an open-source application performance management (APM) tool developed by Naver, the South Korean search portal (as of May 2016, Alexa ranked it 58th in the world and first in Korea). The project started in July 2012 and was open-sourced in January 2015. The latest stable version at the time of writing is 1.5.2. Like Zipkin, its theoretical basis is Google’s Dapper paper.

The difference between Pinpoint and Zipkin


There are obvious differences between Pinpoint and Zipkin, which are mainly reflected in the following aspects:

  1. Pinpoint is a complete performance monitoring solution, covering probes, collectors, storage, and a web interface, whereas Zipkin focuses only on the collector and storage services. Zipkin does have a user interface, but its functionality does not compare with Pinpoint’s. Instead, Zipkin provides a Query interface on which a more powerful user interface and system integrations can be built.
  2. Zipkin officially provides an interface based on the Finagle framework (Scala), while interfaces for other frameworks are contributed by the community and currently cover mainstream languages and frameworks such as Java, Scala, Node, Go, Python, Ruby, and C#. Pinpoint currently provides only the official Java Agent probe; probes for other languages are still at the stage of requesting community support (see #1759 and #1760).
  3. Pinpoint provides a Java Agent probe that intercepts calls and collects data through bytecode injection, so it is truly non-intrusive to application code: you only need to add some parameters when starting the server to deploy the probe. Zipkin’s Java interface implementation, Brave, provides only a basic operational API. To integrate it with a framework or project, you need to add configuration files or code by hand.
  4. Pinpoint’s backend storage is based on HBase, while Zipkin is based on Cassandra.

The similarity between Pinpoint and Zipkin

Pinpoint and Zipkin are both based on Google’s Dapper paper, so their theoretical foundations are roughly the same. Both split a service call into a number of Spans with cascading relationships, linked to one another through SpanId and ParentSpanId. All the Spans along the entire call chain are then aggregated into one Trace and reported to the server-side collector, which collects and stores them.
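The Span and Trace relationship described here can be sketched in a few lines of Python. This is only an illustration of Dapper-style parent/child linkage: the `Span` class, its field names, and the `build_trace` and `render` helpers are hypothetical and are not taken from Zipkin or Pinpoint.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Span:
    """One unit of work in a call chain (hypothetical field names)."""
    trace_id: str
    span_id: str
    parent_span_id: Optional[str]  # None marks the root span
    name: str


def build_trace(spans: List[Span]) -> Dict[Optional[str], List[Span]]:
    """Group the spans of one trace by parent id, which yields the call tree."""
    children: Dict[Optional[str], List[Span]] = defaultdict(list)
    for span in spans:
        children[span.parent_span_id].append(span)
    return dict(children)


def render(children, parent_id=None, depth=0) -> List[str]:
    """Walk the tree depth-first and return one indented line per span."""
    lines = []
    for span in children.get(parent_id, []):
        lines.append("  " * depth + span.name)
        lines.extend(render(children, span.span_id, depth + 1))
    return lines


# Spans may arrive from different services; the ids alone restore the tree.
spans = [
    Span("t1", "a", None, "GET /orders"),
    Span("t1", "b", "a", "order-service"),
    Span("t1", "c", "b", "mysql query"),
    Span("t1", "d", "a", "auth-service"),
]
tree_lines = render(build_trace(spans))
print("\n".join(tree_lines))
```

The point of the sketch is that no span needs to know about the whole chain: each one carries only its own id and its parent’s id, and the collector side can rebuild the tree afterwards.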

Even here, the concepts Pinpoint adopts are not entirely consistent with the paper. For example, it uses TransactionId in place of TraceId, and its actual TraceId is a structure containing TransactionId, SpanId, and ParentSpanId. Pinpoint also adds a SpanEvent structure under Span to record the details of calls inside a Span (such as a specific method call), so by default Pinpoint records more trace data than Zipkin. In theory, however, nothing limits the granularity of a Span: a service call can be a Span, and so can each method call within a service. In that sense Brave could also trace down to the method-call level; its concrete implementations simply do not do so.
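The structural difference can be pictured roughly as follows. The class and field names are illustrative only and are not Pinpoint’s actual classes; the `-1` sentinel for “no parent” and the `record` helper are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class TraceId:
    """Pinpoint-style identifier as described above (illustrative names)."""
    transaction_id: str   # plays the role of Dapper's trace id
    span_id: int
    parent_span_id: int   # -1 means "no parent" here (an assumption)


@dataclass
class SpanEvent:
    """A detail recorded inside a span, such as one method call."""
    sequence: int
    method: str
    elapsed_ms: int


@dataclass
class Span:
    trace_id: TraceId
    service: str
    events: List[SpanEvent] = field(default_factory=list)

    def record(self, method: str, elapsed_ms: int) -> None:
        """Append a span event, numbering events in call order."""
        self.events.append(SpanEvent(len(self.events), method, elapsed_ms))


# One service call is one Span; method calls inside it become SpanEvents.
span = Span(TraceId("tx-1", 100, -1), "order-service")
span.record("OrderController.list", 12)
span.record("OrderDao.findAll", 9)
print([(e.sequence, e.method) for e in span.events])
```

In a pure Dapper-style model, the two method calls would either go unrecorded or each become a child Span of its own.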

Compare

The following compares the advantages and disadvantages of the two as a reference for choosing between them.

Bytecode injection Vs API call

As noted above, this is the biggest difference between the two. Pinpoint implements a Java Agent probe based on bytecode injection, while Zipkin’s Brave framework provides only an application-level API. But the matter is far from that simple. Bytecode injection is a blunt but effective approach: in theory, any method call can be intercepted by injecting code, so there is nothing that cannot be implemented, only things one does not know how to implement. Brave is different. Its application-level API needs support from the underlying framework or driver to achieve interception. For example, MySQL’s JDBC driver provides a way to plug in an interceptor, so you only need to implement the StatementInterceptor interface and configure it in the connection string to intercept statements. By contrast, older versions of the MongoDB drivers and of Spring Data MongoDB have no such interface, which makes intercepting query statements much harder.

So on this point Brave has a hard limitation. With bytecode injection, however difficult the work is, it can at least be done. With Brave, there may be nowhere to start: whether interception is possible, and how much can be intercepted, depends more on the framework’s API than on Brave’s own capabilities. (The next section discusses another possibility for this problem.)
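The contrast can be illustrated with a loose Python analogy. Python has nothing directly comparable to a Java Agent, so replacing a method at runtime stands in for bytecode injection here; `HookedDriver`, `PlainDriver`, and `patch_query` are hypothetical names invented for this sketch and do not correspond to any real driver.

```python
from typing import Callable, List

recorded: List[str] = []  # stands in for a tracing backend


class HookedDriver:
    """A driver that exposes an interceptor hook, like the JDBC example."""

    def __init__(self) -> None:
        self.interceptors: List[Callable[[str], None]] = []

    def query(self, sql: str) -> str:
        for interceptor in self.interceptors:
            interceptor(sql)  # the driver itself invokes the hook
        return "rows for " + sql


class PlainDriver:
    """A driver with no hook: an API-level tracer has nothing to attach to."""

    def query(self, sql: str) -> str:
        return "rows for " + sql


def patch_query(driver_cls: type) -> None:
    """Injection-style approach: replace the method from outside the class."""
    original = driver_cls.query

    def traced(self, sql: str) -> str:
        recorded.append("patched: " + sql)
        return original(self, sql)

    driver_cls.query = traced


# API style: works only because HookedDriver offers the hook.
hooked = HookedDriver()
hooked.interceptors.append(lambda sql: recorded.append("hook: " + sql))
hooked.query("SELECT 1")

# Injection style: works without any cooperation from PlainDriver.
patch_query(PlainDriver)
PlainDriver().query("SELECT 2")

print(recorded)
```

Note the trade-off visible even in this toy version: the hook needs the driver’s cooperation but no knowledge of its internals, while the patch needs no cooperation but must know exactly which method to replace and what its arguments mean.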

Difficulty and cost

A brief reading of the code of the Pinpoint and Brave plugins shows that the two are implemented very differently. With no development documentation to rely on, Brave is easier to pick up than Pinpoint. Brave’s code base is small and its core functionality is concentrated in the brave-core module; a mid-level developer can read through it in a day and come away with a clear understanding of the API’s structure.

Pinpoint’s code is also well encapsulated, and the upper-level API it wraps around bytecode injection is particularly good. Even so, it requires the reader to have some understanding of bytecode injection. The core API used for injecting code is not large, but to understand it thoroughly you will probably have to dig into the relevant Agent code. For example, the difference between addInterceptor and addScopedInterceptor is hard to grasp at a glance, and both methods are defined on types inside the Agent.

Because Brave’s interception relies on interfaces provided by the underlying framework, there is no need to understand the framework comprehensively. You only need to know where to hook in and what data is available at that point. In the example above, we do not need to know how MySQL’s JDBC driver implements its ability to intercept SQL. Pinpoint is different: because it can inject almost any code almost anywhere, developers need a deep understanding of the implementation of the libraries they inject into. You can see this by looking at the implementation of its MySQL and Http Client plugins. Of course, this also shows how powerful Pinpoint can be, and many of the plugins it ships by default perform very fine-grained interception.

When the underlying framework exposes no API, Brave is not entirely helpless either. We can use AOP to weave the relevant interception into the specified code, and AOP is clearly much simpler than bytecode injection.

All of this bears directly on the cost of implementing monitoring, and Pinpoint’s official technical documentation gives reference figures. To integrate one system, developing a Pinpoint plug-in costs 100 and integrating that plug-in into the system costs 0; for Brave, developing a plug-in costs only 20 and integrating it costs 10. On plug-in development alone, then, the official reference ratio is 5:1. The documentation stresses, however, that if 10 systems need to be integrated, Brave’s total cost is 10 * 10 + 20 = 120, which exceeds Pinpoint’s development cost of 100, and the more services that need to be integrated, the wider the gap.
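The arithmetic is easy to reproduce. The sketch below uses only the relative cost figures quoted above (100 and 0 for Pinpoint, 20 and 10 for Brave); the function and variable names are our own, and the linear cost model is a simplifying assumption.

```python
def total_cost(plugin_dev: int, per_system: int, systems: int) -> int:
    """One-off plug-in development plus a per-system integration cost."""
    return plugin_dev + per_system * systems


# Relative cost units quoted from Pinpoint's documentation in the text.
PINPOINT = {"plugin_dev": 100, "per_system": 0}
BRAVE = {"plugin_dev": 20, "per_system": 10}

for n in (1, 5, 8, 10, 20):
    p = total_cost(systems=n, **PINPOINT)
    b = total_cost(systems=n, **BRAVE)
    print(f"{n:>2} systems: Pinpoint={p:>3}  Brave={b:>3}")

# Smallest number of systems at which Brave's total exceeds Pinpoint's.
first_more_expensive = next(
    n for n in range(1, 1000)
    if total_cost(systems=n, **BRAVE) > total_cost(systems=n, **PINPOINT)
)
print("Brave becomes more expensive from", first_more_expensive, "systems")
```

Under these figures the two approaches cost the same at 8 systems, and Brave is the more expensive option from the 9th system onward.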

Versatility and scalability

Pinpoint is clearly at a disadvantage here, as the range of Zipkin integrations developed by the community shows.

Pinpoint’s data interface lacks documentation and is not very standardized (see the forum discussion posts), so you need to read a lot of code to implement a probe of your own (for Node or PHP, say). In addition, for performance reasons the team chose Thrift as its data transfer protocol, which is considerably harder to work with than HTTP and JSON.

Community support

Zipkin was developed by Twitter, which can be considered a star team, while Naver’s team is small and little known (as can be seen from the discussion in #1759). Although the project is unlikely to disappear or stop being updated in the short term, it is not as safe a bet as Zipkin. Without more plug-ins developed by the community, it will be hard for Pinpoint to integrate with many frameworks on the strength of its own team alone, and the team’s current focus is still on improving performance and stability.

Other

Pinpoint took performance into account from the start of its implementation. Some back-end services of www.naver.com handle more than 20 billion requests per day, so the team chose Thrift’s binary variable-length encoding and UDP as the transport. When transmitting constants, it also tries to use a data reference dictionary, passing a number instead of the string itself. These optimizations add to the complexity of the system: the difficulty of using the Thrift interface, the problems of transferring data over UDP, and the need to register entries in the constant dictionary.
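The dictionary idea can be sketched as follows. This is a rough illustration of replacing repeated strings with numeric ids, not Pinpoint’s actual mechanism; `ConstantDictionary` and its methods are hypothetical names.

```python
from typing import Dict, List, Tuple


class ConstantDictionary:
    """Maps repeated strings (e.g. method names) to small integer ids."""

    def __init__(self) -> None:
        self._ids: Dict[str, int] = {}
        # (id, text) pairs that must be registered with the collector once.
        self.registrations: List[Tuple[int, str]] = []

    def encode(self, text: str) -> int:
        """Return the id for text, registering it on first use."""
        if text not in self._ids:
            new_id = len(self._ids) + 1
            self._ids[text] = new_id
            self.registrations.append((new_id, text))
        return self._ids[text]


dictionary = ConstantDictionary()
calls = ["OrderDao.findAll", "OrderDao.findAll", "UserDao.get", "OrderDao.findAll"]
encoded = [dictionary.encode(name) for name in calls]

print(encoded)                   # what travels with every span
print(dictionary.registrations)  # what travels only once per constant
```

The saving comes from sending each string only once, but it also shows where the extra complexity arises: the collector can decode a span only if the matching registration has already arrived, which is harder to guarantee over a lossy transport such as UDP.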

In contrast, Zipkin uses the familiar RESTful interface plus JSON, with almost no learning cost or integration difficulty. As long as you know the data transfer structure, you can easily develop an interface for a new framework.
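To give a feel for how little is involved, the sketch below builds a span payload using only the standard library. It assumes the field names of Zipkin’s v2 JSON format as we understand them (`traceId`, `id`, `parentId`, `name`, `timestamp`, `duration`, `localEndpoint`), which may differ from the API version this comparison was written against. The `make_span` helper, the ids, and the service names are made up, and nothing is actually sent over the network.

```python
import json
import time


def make_span(trace_id, span_id, parent_id, name, service, start_us, duration_us):
    """Build one span as a plain dict in the assumed Zipkin v2 JSON layout."""
    span = {
        "traceId": trace_id,
        "id": span_id,
        "name": name,
        "timestamp": start_us,     # microseconds since the epoch
        "duration": duration_us,   # microseconds
        "localEndpoint": {"serviceName": service},
    }
    if parent_id is not None:      # the root span carries no parentId
        span["parentId"] = parent_id
    return span


now_us = int(time.time() * 1_000_000)
payload = json.dumps([
    make_span("463ac35c9f6413ad", "463ac35c9f6413ad", None,
              "get /orders", "gateway", now_us, 25_000),
    make_span("463ac35c9f6413ad", "72485a3953bb6124", "463ac35c9f6413ad",
              "select orders", "order-service", now_us + 2_000, 9_000),
])
print(payload)
```

A real reporter would POST this body to the collector’s span endpoint with a JSON content type; check the exact path and fields against the Zipkin API documentation for the version in use.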

In addition, Pinpoint’s sampling of requests is less flexible. In a high-traffic production environment it is clearly impractical to record every request, so requests must be sampled to decide which ones to record. Both Pinpoint and Brave support a sampling percentage, that is, the proportion of requests to record. Beyond that, however, Brave also provides a Sampler interface for customizing the sampling strategy, which is especially useful when doing A/B testing.
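The difference between a fixed percentage and a pluggable strategy can be sketched like this. The `Sampler` base class below is a hypothetical stand-in written for this example, not Brave’s actual interface, and the sketch assumes trace ids are integers that are roughly uniformly distributed.

```python
from abc import ABC, abstractmethod
from typing import Set


class Sampler(ABC):
    """Hypothetical sampling strategy: decide per trace whether to record it."""

    @abstractmethod
    def is_sampled(self, trace_id: int) -> bool: ...


class PercentageSampler(Sampler):
    """Records roughly `rate` of all traces, decided from the trace id."""

    def __init__(self, rate: float) -> None:
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rate must be between 0 and 1")
        self._threshold = round(rate * 10_000)

    def is_sampled(self, trace_id: int) -> bool:
        return trace_id % 10_000 < self._threshold


class ExperimentSampler(Sampler):
    """Custom strategy: always record traces in an experiment, else fall back."""

    def __init__(self, experiment_traces: Set[int], fallback: Sampler) -> None:
        self._experiment_traces = experiment_traces
        self._fallback = fallback

    def is_sampled(self, trace_id: int) -> bool:
        return (trace_id in self._experiment_traces
                or self._fallback.is_sampled(trace_id))


# Record 1% of ordinary traffic, plus every trace tagged for the experiment.
sampler = ExperimentSampler({9_999}, PercentageSampler(0.01))
sampled = [tid for tid in range(10_000) if sampler.is_sampled(tid)]
print(len(sampled))
```

With only a percentage setting, trace 9999 would be dropped along with 99% of the traffic; the custom strategy keeps it because it belongs to the group under test.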

Summary

In the short term, Pinpoint does have an overwhelming advantage: probes can be deployed without any change to the project code, trace data is fine-grained down to the method-call level, the user interface is powerful, and support for Java frameworks is almost comprehensive. In the long run, though, the cost of learning Pinpoint’s development interface and of implementing interfaces for other frameworks in the future is still unknown. Mastering Brave, by contrast, is relatively easy, and Zipkin’s community is stronger and more likely to develop further interfaces. In the worst case, we can add monitoring code that suits our own needs through AOP, without introducing many new technologies and concepts. Moreover, when the business changes, it is hard to say whether the reports Pinpoint provides officially will still meet our requirements, and adding new reports would bring an unpredictable amount of difficulty and work.

This Post Has 2 Comments

  1. I’m one of the developers of Team Pinpoint.
    Your research very well explains the merits and demerits of both projects. It’s hard to find such a high-quality article.
    With your permission, I’d like to twit this article as a link on our twitter (https://twitter.com/Pinpoint_APM). Will it be Okay?

    Due to our lack of documents, you may not have heard recent updates.
    We have adapted gRPC to replace Thrift. It will be included in v2.0.0 (which will probably be released in a few weeks)
    Also, we already have a PHP agent as an open-source (https://github.com/naver/pinpoint-c-agent) and node.js agent is on the final test stage.

    1. Sure of course and I would glad you put the original link on your tweet.
