Abstract: To detect instruments at multiple scales in complex scenes, a video multi-scale instrument detection algorithm based on an attention mechanism is proposed. First, a feature extraction network based on a spatial attention mechanism models long-range dependencies among features and strengthens their representational power. Second, an Adaptive Feature Selection Module (AFSM) adjusts the weights of feature maps from different stages to improve multi-scale target detection. Experiments were carried out on a self-built instrument dataset. The results show that the proposed method improves detection accuracy by 7.6% over the original Faster RCNN method. In comparison with the other methods tested, its detection accuracy reaches 95.4%. In tests on actual instrument monitoring video, both the detection results and the speed meet practical requirements. By improving the feature extraction network and the feature selection operation, the proposed method enhances feature representation, effectively reduces false alarms, and improves the network's multi-scale target detection performance.
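The idea of reweighting feature maps from different stages can be illustrated with a minimal sketch. This is not the authors' AFSM, whose internal design is not given in the abstract. It assumes that the stage feature maps have already been resized to a common shape, that each stage is scored from its globally average-pooled descriptor, and that the scores are normalized with a softmax. The function name `adaptive_feature_fusion` and the projection vector `score_weights` are hypothetical.

```python
import numpy as np


def adaptive_feature_fusion(feature_maps, score_weights):
    """Fuse same-shaped feature maps using softmax-normalized stage weights.

    feature_maps:  list of arrays of shape (C, H, W), one per stage
                   (assumed already resized to a common resolution).
    score_weights: array of shape (C,), a hypothetical learned projection
                   that maps a pooled descriptor to a scalar score.
    """
    stacked = np.stack(feature_maps, axis=0)       # (S, C, H, W)
    descriptors = stacked.mean(axis=(2, 3))        # global average pooling, (S, C)
    scores = descriptors @ score_weights           # one scalar score per stage, (S,)
    scores = scores - scores.max()                 # shift for numerical stability
    stage_weights = np.exp(scores) / np.exp(scores).sum()  # softmax over stages
    # Weighted sum over the stage axis gives a fused map of shape (C, H, W).
    fused = np.tensordot(stage_weights, stacked, axes=1)
    return fused, stage_weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    maps = [rng.standard_normal((4, 5, 5)) for _ in range(3)]
    fused, weights = adaptive_feature_fusion(maps, rng.standard_normal(4))
    print(fused.shape, weights.sum())
```

In a trained detector the projection would be learned jointly with the rest of the network; here it is random and serves only to show the data flow.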