Abstract: Current deep-learning-based driver fatigue detection algorithms require large numbers of parameters and high computational cost, which makes them difficult to deploy on low-compute devices. To address this problem, a YOLOv5-based driver fatigue detection algorithm with a lightweight backbone is proposed. Fatigue is judged from the proportion of time occupied by three detected labels: closed eyes, open mouth, and lowered head. The EfficientViT network is used as the backbone of the model, which reduces both its parameter count and its computational cost. A contextual transformer module is integrated into the bottleneck network section of the model, and the Normalized Wasserstein Distance is adopted as the new loss function, in order to improve accuracy and offset the accuracy loss introduced by the lightweight backbone. Experimental results show that the improved algorithm reaches an accuracy of 97.9%. Compared with YOLOv5, YOLOv7, and YOLOv8, its parameter count is reduced by factors of 3.4, 17.7, and 5.4, and its computational cost by factors of 4.5, 29.5, and 8.2, respectively. Single-image inference on a CPU takes 76.4 ms, so the algorithm can effectively perform real-time detection.
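The time-proportion criterion described in the abstract can be illustrated with a minimal sketch. It assumes that the detector emits a set of labels for each frame and that fatigue is flagged when any label's share of a sliding window of recent frames reaches a threshold. The label names, the window length, and the threshold values below are hypothetical placeholders and are not taken from the paper.

```python
from collections import deque

# Hypothetical label names for the three fatigue cues named in the abstract.
FATIGUE_LABELS = ("closed_eyes", "open_mouth", "head_down")


class FatigueMonitor:
    """Tracks the share of recent frames in which each fatigue label appears."""

    def __init__(self, window_size=150, thresholds=None):
        # window_size and the default thresholds are assumed values for illustration.
        self.window = deque(maxlen=window_size)
        self.thresholds = thresholds or {
            "closed_eyes": 0.4,
            "open_mouth": 0.3,
            "head_down": 0.3,
        }

    def update(self, detected_labels):
        """Record the labels detected in one frame; the oldest frame drops out when full."""
        self.window.append(frozenset(detected_labels))

    def ratios(self):
        """Return, for each label, the fraction of frames in the window that contain it."""
        n = len(self.window)
        if n == 0:
            return {label: 0.0 for label in FATIGUE_LABELS}
        return {
            label: sum(label in frame for frame in self.window) / n
            for label in FATIGUE_LABELS
        }

    def is_fatigued(self):
        """Flag fatigue when any label's share reaches its threshold."""
        current = self.ratios()
        return any(current[label] >= self.thresholds[label] for label in FATIGUE_LABELS)
```

In use, `update` would be called once per frame with the labels returned by the detector, and `is_fatigued` would be polled after each update. Counting frames approximates the proportion of time only if the frame rate is roughly constant, which is a further assumption of this sketch.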