Abstract: To address the limitations of traditional convolutional neural network (CNN) methods in spatiotemporal feature extraction, an improved joint model combining Inception and bi-directional long short-term memory (BiLSTM) is proposed to learn both spatial and temporal information from vibration signals. First, an Inception module with multi-scale receptive fields is constructed to adaptively extract spatial features at different scales. Second, the features are processed as a sequence by a BiLSTM to mine their temporal correlations in depth. Finally, damage identification of steel frame structures is achieved through global average pooling and a Softmax classifier. To evaluate the model's robustness to noise, Gaussian white noise is added to the signals as interference. In addition, a transfer learning strategy is employed to assess the model's generalization ability under different excitation intensities and small sample sizes, and hence its applicability to diverse damage identification tasks. The results show that, in comparison with the traditional CNN method, the model maintains 100% recognition accuracy both under noise-free conditions and when the signal-to-noise ratio exceeds 25 dB. By fine-tuning the parameters of the pre-trained model, knowledge transfer and generalization across different excitation intensities and small sample sizes are achieved. The approach thus addresses the practical challenges of insufficient samples and varying excitation intensities in civil engineering applications and improves the model's practical applicability.
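The noise-robustness test mentioned in the abstract relies on adding Gaussian white noise at a prescribed signal-to-noise ratio. A minimal sketch of that step is given below, assuming the usual definition SNR(dB) = 10 log10(P_signal / P_noise) with power taken as the mean squared amplitude. The function names `add_white_noise` and `measured_snr_db` are hypothetical and chosen for illustration; the abstract does not describe the authors' actual implementation, and a real pipeline would more likely operate on arrays than on Python lists.

```python
import math
import random


def add_white_noise(signal, snr_db, seed=None):
    """Return a copy of `signal` with zero-mean Gaussian white noise at `snr_db`.

    Assumption: SNR is defined from mean signal power (mean squared amplitude).
    """
    if not signal:
        raise ValueError("signal must be non-empty")
    rng = random.Random(seed)
    # Mean power of the clean signal.
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR_dB = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR_dB / 10)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    # Independent Gaussian samples give a flat (white) spectrum on average.
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB between a clean signal and its noisy version."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(signal_power / noise_power)


if __name__ == "__main__":
    # Synthetic 5 Hz sine sampled at 1 kHz, standing in for a vibration record.
    clean = [math.sin(2 * math.pi * 5 * t / 1000.0) for t in range(20000)]
    noisy = add_white_noise(clean, snr_db=25.0, seed=0)
    print(f"target 25.0 dB, measured {measured_snr_db(clean, noisy):.2f} dB")
```

Because the noise is random, the measured SNR of a finite record only approximates the target. Sweeping `snr_db` over a range of values is one way to build the noisy test sets for this kind of robustness study.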