1.concept

The Timing Wheel is the result of George Varghese and Tony Lauck’s 1996 paper [Hashed and Hierarchical Timing Wheels: data structures to efficiently implement a timer facility], which is widely used in the Linux kernel and is one of the implementations and foundations of Linux kernel timers, and is present in components such as Netty, Akka, Quartz, ZooKeeper, Kafka, and others.
Time wheel is an efficient algorithm for implementing delay functions (timers). If a system has a large number of tasks to schedule, time wheels can efficiently utilise thread resources for batch scheduling. Bind all the scheduling tasks in a large batch to the time wheel, and manage, trigger, and run all the tasks through the time wheel. It can efficiently manage all kinds of delayed tasks, periodic tasks, notification tasks and so on.
However, the time accuracy of the time wheel scheduler may not be very high, for the scheduling tasks that require particularly high accuracy may not be suitable, because the accuracy of the time wheel algorithm depends on the minimum granularity of the time period ‘pointer’ cell size. For example, if the grid of the time wheel jumps once a second, then tasks with scheduling accuracy less than one second cannot be scheduled by the time wheel.

2.theory

There are two types of task scheduling at regular intervals:
Relative time: implementation after a certain period of time
Absolute time: specifies a definite time for execution
Of course, the two can be converted to each other, for example, the current time is 12 o’clock, timed to be executed after 5 minutes, in fact, the absolute time is: 12:05; timed to be executed at 12:05, the relative time is executed after 5 minutes.

  • tickMs (basic time span): the time wheel consists of several time frames, each time frame represents the basic time span (tickMs) of the current time wheel.
  • wheelSize (number of time units): the number of time frames of the time wheel is fixed, which can be expressed by (wheelSize), then the overall time span of the whole time wheel (interval) can be calculated by the formula tickMs × wheelSize.
  • currentTime: the timer wheel also has a dial pointer (currentTime), which indicates the current time of the timer wheel, currentTime is an integer multiple of tickMs. currentTime can divide the whole timer wheel into expired and non-expired part, the currentTime is also the expired part. The time frame pointed to by currentTime belongs to the expiration part, which means it is expiring and needs to process all the tasks in the TimerTaskList corresponding to this time frame.
  • scale: the smallest unit above the time wheel.
    As time goes by, the pointer currentTime keeps advancing, and every time it advances, the tasks in the TaskList corresponding to the currentTime are processed, and if there is a newly inserted task, it will be split into the TaskList corresponding to the wheel after the currentTime according to the execution time of the task. If there is a new task inserted, it will be split into the TaskList corresponding to the wheel after the currentTime according to its execution time.
    If the time span checks out past the wheelSize of the time wheel, such as adding a task that will be executed 12 seconds from now, a single wheel won’t be able to meet our needs. We will have to consider using another solution.

3.extension

3.1 multi-wheel

The idea of a multi-stage time wheel is closer to that of a clock, which is divided into hour, minute and second hands. Every 60s, the minute hand advances one frame, and every 60min, the hour hand advances one frame.
The idea of multi-level time wheel is to have two or even more levels of wheel, when the first level wheel goes round, the second level wheel advances by one frame, the second level wheel goes round, the third level wheel advances by one frame, and so on.
In a single round, we assume that there are m scales, forming an array (the bottom of the ring queue is represented by an array), the time interval of each scale is t, then the time range of each scale is m * t, to get the taskList on the wheel, you can use the scale[i] to represent the scale, i is the subscript of the scale array.
If it is a multi-level wheel, we can understand it as multiple ring queues, if a two-level wheel is wheel1 and wheel2, if the same wheel2 is assigned m scales, the time interval t of each scale is 1s.
At this time, the time range corresponding to each scale in the wheel should be m * t (the time range represented by wheel1), and the time range represented by the whole two-stage time wheel is m * m * t.

3.2 task associated round

Each task adds a round parameter. If the time wheel scale is not exceeded then round = 0, if it is exceeded then the value of round is calculated.
For example, if a task needs to be executed in 12s, then round = 1 for that task, and the calculated dial position scale = 4, which means that the task needs to be executed at position 4 after the dial has travelled once. If the dial hand advances to that scale, the rounds in the list of tasks at that location are reduced by 1, and the task is not executed until the task round = 0 in the task chain with scale = 4.

4.issues

Suppose we have a wheel with tickMs of 1s and wheelSize of 1000, now we write two delay tasks into this wheel. The first one will be executed after 200s and the second one after 850s. So in the first delayed task, you need to let the ticker drive 200 times to execute the first task, and the first 199 out of these 200 drive advances are ‘empty advances’. When the second delayed task is executed, another 649 empty advances are needed. This is a huge drain on the machine’s performance resources.

How to solve ‘empty propulsion’ in kafka
For the above ‘empty push’ problem, Kafka’s delayed queue uses a JDK (JAVA language) queue called DelayQueue to help push the wheel of time. delayQueue, as the name suggests, is a delayed queue, with all the characteristics of a queue. The strategy is as follows:

  • Kafka adds all the task lists corresponding to the TimerTaskEntity to the DelayQueue. Each TimerTaskList has an expiration time, which is the currentTime of the timer + the delay time of the delayed task (TimerTaskEntity.DelayTime). Based on this time to do a sort, the shortest expiration (expiration time) of the task list (TimerTaskList) in the queue head of the DelayQueue, the other task lists in accordance with the timeout time in turn into the queue. In this way, each element in the DelayQueue queue is a task list (TimerTaskList) sorted according to expiration.
  • Then open a separate thread ‘ExpiredOperationReaper’ (expired reaper thread?). Then you can get the expired tasks in the DelayQueue queue by queue header out (time complexity of getting the queue header element is O(1), after getting the queue header element, you will switch to a new queue header element).
  • Finally, based on the ExpiredOperationReaper, we get the TimerTaskList, and we can advance the time wheel based on its expiration, and we can do the corresponding actions (demote or execute) on the TimerTaskEntity in the TimerTaskList. task entries in the TimerTaskList and do the corresponding actions (demote or execute) on the TimerTaskEntity.
    To sum up, we use TimeWheel to complete the insertion and deletion of task items (TimerTaskEntity), and use DelayQueue to complete the ‘precise advancement’ of TimeWheel.