With the popularity of microservices, many companies have split their systems into many services along business boundaries. Because a single business request now flows through many microservice nodes, locating the logs of a particular request, together with the related upstream and downstream logs, has become very difficult when troubleshooting.
Distributed tracing frameworks based on the OpenTracing specification, such as SkyWalking and Pinpoint, were created to solve this. However, building such a system, becoming familiar with it, and rolling it out across the whole company takes considerable time and learning cost, and it raises the question of how to store span data: do you collect everything or only a sample? With full collection, taking SkyWalking's storage as an example, you need an Elasticsearch cluster of at least 5 nodes, which increases server costs. Moreover, with many microservice nodes it is perfectly normal to generate tens or even hundreds of gigabytes of data per day, and retaining it for longer means paying for even more disk.
This article solves the log-tracking problem in the simplest possible way: it collects no logs and needs no extra storage. It simply labels your logs automatically, generating a TraceId that runs through the entire microservice call chain, with no intrusion into existing code. This quickly solves distributed log tracking and makes day-to-day locating and troubleshooting of business logs much easier.
Implementation ideas and difficulties
Track every request with a unique identifier that appears in the logs across the whole chain, without modifying the existing logging calls (no code intrusion)
Using Logback's MDC mechanism, add a traceId field to the log pattern; the value is referenced as %X{trace_id}
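As an illustration, a minimal logback.xml using the %X{trace_id} conversion word in its pattern (this pattern is an example, not the article's actual production config):

```xml
<configuration>
  <appender name="CONSOLE" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <!-- %X{trace_id} pulls the trace_id key from the MDC; it renders blank if unset -->
      <pattern>%d{yyyy-MM-dd HH:mm:ss.SSS} [%thread] %-5level [%X{trace_id}] %logger{36} - %msg%n</pattern>
    </encoder>
  </appender>
  <root level="INFO">
    <appender-ref ref="CONSOLE"/>
  </root>
</configuration>
```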
Since MDC is backed by ThreadLocal internally, the value is only visible to the current thread; it is lost in child threads and in downstream services' MDC. The main difficulty of the solution is therefore propagating the value, which breaks down into the following parts:
How to pass MDC data from the Spring gateway to downstream services
How to pass the value when one microservice calls another remote service, and how the callee receives it
How to pass it to child threads in asynchronous scenarios (thread pools)
How message-queue producers and consumers, such as Kafka, send and receive it
How to tag the logs of each run of a scheduled task
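All of these cases reduce to the same pattern: read trace_id from the incoming carrier (HTTP header, message header, and so on) or generate a fresh one, put it into the MDC, and copy it into the outgoing carrier before calling downstream. A minimal sketch of that pattern (the class and method names here are my own, not from the original code):

```java
import java.util.Map;
import java.util.UUID;

import org.slf4j.MDC;

public class TraceIdSupport {

    public static final String TRACE_ID = "trace_id";

    // Reuse the upstream traceId if the carrier already has one,
    // otherwise start a new trace; either way, store it in the MDC.
    public static String extractOrGenerate(Map<String, String> carrier) {
        String traceId = carrier.get(TRACE_ID);
        if (traceId == null || traceId.isEmpty()) {
            traceId = UUID.randomUUID().toString().replace("-", "");
        }
        MDC.put(TRACE_ID, traceId);
        return traceId;
    }

    // Copy the current thread's traceId into an outgoing carrier
    // before calling a downstream service.
    public static void inject(Map<String, String> carrier) {
        String traceId = MDC.get(TRACE_ID);
        if (traceId != null) {
            carrier.put(TRACE_ID, traceId);
        }
    }
}
```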
The implementation below is based on the technology stack currently used by our company:
```java
import org.springframework.context.annotation.Configuration;
import org.springframework.web.servlet.config.annotation.InterceptorRegistry;
import org.springframework.web.servlet.config.annotation.WebMvcConfigurer;

@Configuration
public class WebConfig implements WebMvcConfigurer {

    @Override
    public void addInterceptors(InterceptorRegistry registry) {
        // Register the interceptor that puts the traceId into the MDC for every request
        TraceIdInterceptor traceIdInterceptor = new TraceIdInterceptor();
        registry.addInterceptor(traceIdInterceptor);
    }
}
```
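For the thread-pool case listed above, a common approach (not necessarily the exact one used here) is to snapshot the submitting thread's MDC when a task is created and restore it inside the pooled thread; a minimal sketch:

```java
import java.util.Map;

import org.slf4j.MDC;

public class MdcRunnable implements Runnable {

    private final Runnable delegate;
    private final Map<String, String> parentContext;

    private MdcRunnable(Runnable delegate) {
        this.delegate = delegate;
        // Snapshot the submitting thread's MDC (may be null if the MDC is empty)
        this.parentContext = MDC.getCopyOfContextMap();
    }

    public static Runnable wrap(Runnable delegate) {
        return new MdcRunnable(delegate);
    }

    @Override
    public void run() {
        if (parentContext != null) {
            MDC.setContextMap(parentContext);
        }
        try {
            delegate.run();
        } finally {
            // Clear the MDC so the context does not leak into the pool's next task
            MDC.clear();
        }
    }
}
```

Usage: submit `MdcRunnable.wrap(task)` to the executor instead of `task`, or subclass `ThreadPoolExecutor` to wrap every submission automatically.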
7. Add interceptors to support passing the traceId through Kafka
The producer interceptor copies the current thread's traceId into the record headers (the fragment below is completed with the remaining `ProducerInterceptor` methods so it compiles):

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;

import lombok.extern.slf4j.Slf4j;
import org.apache.kafka.clients.producer.ProducerInterceptor;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.slf4j.MDC;

@Slf4j
public class TraceIdKafkaProducerInterceptor implements ProducerInterceptor<String, String> {

    private static final String TRACE_ID = "trace_id";

    @Override
    public ProducerRecord<String, String> onSend(ProducerRecord<String, String> record) {
        // Stamp the current thread's traceId onto the outgoing record's headers
        String traceId = MDC.get(TRACE_ID);
        if (traceId != null) {
            record.headers().add(TRACE_ID, traceId.getBytes(StandardCharsets.UTF_8));
        }
        return record;
    }

    @Override
    public void onAcknowledgement(RecordMetadata metadata, Exception exception) {
    }

    @Override
    public void close() {
    }

    @Override
    public void configure(Map<String, ?> configs) {
    }
}
```
The consumer interceptor restores the traceId from the record headers into the consuming thread's MDC (again completed with the remaining `ConsumerInterceptor` methods so it compiles):

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;

import lombok.extern.slf4j.Slf4j;
import org.apache.kafka.clients.consumer.ConsumerInterceptor;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.header.Header;
import org.slf4j.MDC;

@Slf4j
public class TraceIdKafkaConsumerInterceptor implements ConsumerInterceptor<String, String> {

    private static final String TRACE_ID = "trace_id";

    @Override
    public ConsumerRecords<String, String> onConsume(ConsumerRecords<String, String> records) {
        // Restore the traceId carried in the record headers into this thread's MDC
        for (ConsumerRecord<String, String> record : records) {
            Header header = record.headers().lastHeader(TRACE_ID);
            if (header != null) {
                MDC.put(TRACE_ID, new String(header.value(), StandardCharsets.UTF_8));
            }
        }
        return records;
    }

    @Override
    public void onCommit(Map<TopicPartition, OffsetAndMetadata> offsets) {
    }

    @Override
    public void close() {
    }

    @Override
    public void configure(Map<String, ?> configs) {
    }
}
```
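For the interceptors to take effect they must be registered with the Kafka clients via the standard `interceptor.classes` setting. A sketch of the two client configs (the package `com.example.trace` is a placeholder for wherever the classes actually live):

```properties
# Producer config: stamp the traceId header on every outgoing record
interceptor.classes=com.example.trace.TraceIdKafkaProducerInterceptor

# Consumer config: restore the traceId header into the MDC
interceptor.classes=com.example.trace.TraceIdKafkaConsumerInterceptor
```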