
Mathematics, Vol. 14, Pages 1989: RSD-YOLOv8n: A Lightweight PCB Defect Detection Algorithm

Prometheus Redaktion

In industrial PCB defect detection, current object detection algorithms suffer from low detection accuracy and are difficult to deploy on edge devices. The RSD-YOLOv8n algorithm is proposed to optimize accuracy, speed, and compactness for PCB defect detection. In the design, the C2f layer in the feature extraction network is improved through feature reuse and structural reparameterization, and the resulting C2f-RepGhost module is introduced to strengthen the feature extraction capability of the backbone network. The SPDConv module is also adopted. This module converts spatial information in the feature maps into depth information while maintaining the resolution of the feature maps, further enhancing the feature extraction capabilities. Meanwhile, the C2f-DWR module is used. By partitioning the input feature maps and fusing feature information across different scales, it enhances the network's ability to integrate feature maps of varying scales. Ablation and comparative experiments show that the RSD-YOLOv8n algorithm increases mAP@50 from 93.4% to 95.9% compared to the baseline model, while also reducing the number of network parameters. Generalization experiments demonstrate that the RSD-YOLOv8n algorithm has high generalization ability. Our method provides a solution for PCB defect detection.

1. Introduction

With the rapid advancement of artificial intelligence and the growing demand for industrial automation and smart transformation, the quality of printed circuit boards (PCBs), a core component of electronic products, directly impacts the performance of the final products. Defect detection for this critical hardware has therefore become an essential step before manufacturing proceeds, and it is of great significance for enhancing product competitiveness and market acceptance. In recent years, the rapid development of deep learning has provided new solutions for intelligent manufacturing [5, 6], including PCB defect detection.
Deep learning-based object detection methods can be categorized into two-stage and single-stage object detection algorithms based on differences in network architecture. Among these, two-stage object detection algorithms are primarily represented by RCNN, Fast-RCNN, and Faster-RCNN. These algorithms process input feature maps by first segmenting regions of interest (ROIs), then extracting features, and finally detecting defects through classification and regression. Single-stage object detection algorithms are represented by SSD [ 7] and the YOLO series. These algorithms do not require region selection. Instead, they achieve end-to-end classification and localization of objects through an adaptive anchor box algorithm. Therefore, compared to single-stage object detection algorithms, two-stage object detection algorithms offer higher detection accuracy but are slower in processing speed. Building upon the previously proposed RCNN object detection algorithm, Girshick et al. [ 8] presented the Fast-RCNN object detection algorithm at the renowned IEEE International Conference on Computer Vision. This algorithm aims to address issues such as slow processing speeds, high memory requirements, and structural complexity by optimizing network performance through the construction of a unified network architecture and the redesign of the network’s loss function training strategy. In experiments on the PASCAL VOC 2012 dataset, the Fast-RCNN algorithm achieved the highest detection accuracy at the time and improved training speed by approximately nine times compared to the RCNN algorithm. However, it still faced significant performance bottlenecks compared to single-stage object detection algorithms. Subsequently, Hu et al. [ 9] building upon the Faster-RCNN framework, replaced the original VGG backbone with ResNet-101, significantly enhancing the network’s ability to detect small defects and complex texture features. 
To improve the model's feature extraction capabilities for irregular defects, they introduced deformable convolutions. Finally, by adopting a joint spatial-channel attention mechanism to further enhance the discriminative power of feature representations, they achieved an average accuracy improvement of 4.5% in experiments on the PCB dataset. Zeng et al. [10] focused on the detection of micro-defects and proposed a multi-scale feature fusion strategy tailored to this class of defects. By constructing a more refined feature pyramid structure, they effectively mitigated the feature loss caused by network depth. For the YOLO series [11], researchers have optimized the backbone, feature fusion, and loss functions to improve accuracy for small defects. To address the poor model robustness caused by imbalanced class distributions in industrial production data, Li et al. [12] designed an adaptive tail attention module to enhance the model's ability to represent the features of low-frequency defect categories. They also introduced a dynamic multi-scale fusion architecture, which adaptively adjusts the fusion weights according to the actual scale of the input defect. As a result, they achieved a detection accuracy of 99.1% on their self-built PCB defect dataset. Tao et al. [13] proposed an enhanced YOLO-based industrial small object detection algorithm by integrating receptive-field attention and multi-scale feature fusion, achieving a mean average precision of 94.8%, a 6.3% improvement over the baseline YOLOv8 model, at a detection speed of 107.2 frames per second. At the feature fusion level, Zhang et al. [14] addressed the significant variation in defect scales on PCBs. They designed adaptive feature focusing modules to enhance the interaction and fusion of multi-scale features within the feature fusion network, thereby effectively preserving the fine-grained information of small objects.
With the emergence of advanced versions such as YOLOv8, more innovative improvements have been integrated. Vinod Kumar Ancha et al. [ 15] proposed a lightweight yet high-performance detection model TRSBi-YOLO by integrating a C3TR module into the backbone for enhanced multi-scale feature extraction, incorporating a parameter-free SimAM attention mechanism [ 16] to improve feature representation without additional computational cost. It achieves a mean average precision of 98.1% with only 1.79 million parameters and 4.4 G FLOPs, alongside an inference speed of 135.27 frames per second. Wang et al. [ 17] addressed the issues of difficulty in identifying small targets and high model complexity in PCB defect detection. Based on YOLOv8n, they proposed a lightweight improved algorithm. This method introduces the C2f_SHSA attention mechanism in the backbone network to enhance the feature expression of small defects, adopts the C2f_IdentityFormer structure in the neck network to improve the multi-scale feature fusion ability, and replaces the CIoU loss function with the PIoU to improve the positioning accuracy and convergence speed. Experimental results show that this method achieved a Recall of 94.0%, an mAP of 96.1%, and an F1-score of 94.35% on the PCB_DATASET, with a model size of only 5.2 MB. Compared with YOLOv8n, the mAP increased by 1.8%, the F1 score increased by 1.9%, and the parameter quantity and computation volume were reduced by 18.27% and 12.35% respectively, achieving model lightweighting while maintaining accuracy. In summary, this paper proposes a network model (RSD-YOLOv8n) that optimizes both detection accuracy and speed based on YOLOv8n. The main innovations of this work are as follows: 1. To enhance backbone feature extraction capability, a C2f-RepGhost module is designed by improving the C2f layer through feature reuse and structural reparameterization. 2. 
The SPDConv module replaces traditional stride convolution and pooling operations for spatial downsampling. This module converts spatial information into channel information while preserving feature map resolution, thereby preventing the loss of spatial information. 3. A C2f-DWR module is introduced for multi-scale feature fusion by partitioning input feature maps and effectively integrating information across different scales. 2. Design Methodology for the RSD-YOLOv8n Network 2.1. YOLOv8 Model The YOLOv8 model, developed by the Ultralytics team, is a state-of-the-art object detection architecture that has achieved breakthroughs in multiple key areas, including speed, accuracy, and user-friendliness, thereby redefining the performance boundaries in the field of object detection. YOLO stands for “You Only Look Once,” and its core design philosophy lies in simultaneously predicting all object bounding boxes through a single-pass network architecture. This single-stage detection mechanism significantly enhances algorithmic efficiency, making it particularly well-suited for real-time object detection tasks, where it offers a natural efficiency advantage over traditional multi-stage detection methods. As an evolved version of the YOLO series of models, YOLOv8 builds upon the strengths of the classic YOLO architecture while incorporating systematic optimizations. Its network structure is similar to that of YOLOv5 and can be broadly divided into three components: the backbone, the neck, and the detection head. In addition, YOLOv8’s core technological innovation lies in its adoption of an anchor-free detection mechanism. By eliminating the reliance on predefined anchor boxes, this design not only simplifies the model training process but also significantly improves the efficiency of the Non-Maximum Suppression (NMS) algorithm during the post-processing stage. This improvement enables the model to maintain high detection accuracy while further enhancing inference speed. 
The YOLOv8 network architecture is shown in Figure 1. The main network serves as the core component for feature extraction, employing hierarchical convolution and deconvolution operations to construct a feature pyramid. Additionally, residual connections and bottleneck structures are introduced to achieve lightweight model design and enhance feature representation capabilities. In YOLOv8, the Backbone mainly uses the C2f module as the basic building block. Compared to the C3 module in YOLOv5, C2f significantly reduces the number of parameters through structural optimization and enhances multi-scale feature extraction capabilities. Specifically, the C2f module employs a streamlined topological design and efficient feature fusion to reduce computational redundancy without sacrificing feature expression accuracy, thereby boosting inference efficiency. The structure of the C2f module is shown in Figure 2. 2.2. The C2f-RepGhost Module Enhanced with Re-Parameterization Techniques GhostNet [ 18] is a highly efficient, lightweight network architecture proposed by Huawei’s Noah’s Ark team. Its core concept is feature reuse. It first uses standard convolutions to extract basic features from feature maps, then employs depth-wise convolutions to generate additional feature maps from the existing ones. Finally, the original features and the generated feature maps are concatenated to reduce the number of parameters and computational load. The DenseNet [ 19] network similarly leverages the concept of feature reuse, passing features extracted by earlier layers to subsequent layers via a Concat operation to reuse existing features. While both of these networks employ the Concat operation for feature reuse, the Add operation serves a similar functional purpose. However, unlike the Concat operation, the Add operation not only enables feature reuse but also performs feature fusion. 
Moreover, the feature fusion process takes place in the weight space and does not introduce additional inference time. The implementation of the Add operation, which achieves both feature reuse and feature fusion, can be expressed by Equation (1).

$$y = \mathrm{Add}\left(\left[x, \phi_1(x), \cdots, \phi_{s-1}(x)\right]\right) = \phi^{*}(x) \tag{1}$$

Here, $x$ represents the input and $y$ represents the output. $\phi_i(x)$ denotes the other layers of the neural network applied to the input $x$, and $\phi^{*}$ denotes the single fused operation obtained in the weight space.

RepGhost [20] is a high-efficiency module that achieves implicit reuse of feature information by combining GhostNet modules with structural re-parameterization techniques. The core design philosophy of the reparameterized module is multi-branch fusion. The module uses different branch structures during training and inference: during training, multiple low-cost operations run in parallel and their outputs are fused to generate the final features, while during inference the branches are integrated into a single convolutional kernel. This reduces the number of network parameters and computational load, thereby lowering hardware requirements. The improvement process from the original Ghost module to the RepGhost module is shown in Figure 3. Figure 3a shows the original Ghost module structure, while Figure 3b is obtained by replacing the Concat module with an Add module in Figure 3a to achieve higher efficiency without increasing computational cost. Figure 3c further refines the previous improvement by moving the ReLU activation function after the Add operation. This modification adheres to the principles of structural reparameterization, thereby enabling fast inference. Building upon the structure in Figure 3c, Figure 3d is obtained by adding a batch normalization (BN) operation to the identity mapping branch. This enhances the network's nonlinear processing capabilities and accelerates feature fusion.
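The branch merging that structural reparameterization relies on can be illustrated with a small numeric sketch. The code below is not the authors' implementation; it assumes a single-channel 3 × 3 convolution followed by BN in one branch and an identity mapping followed by BN in the other, with made-up weights and BN statistics, and shows that the two training-time branches collapse into one 3 × 3 kernel plus a bias that produces the same output.

```python
import math

def conv3x3(img, kernel, bias=0.0):
    """Single-channel 3x3 convolution with zero padding and stride 1."""
    h, w = len(img), len(img[0])
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            acc = bias
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:
                        acc += kernel[di + 1][dj + 1] * img[y][x]
            row.append(acc)
        out.append(row)
    return out

def bn_affine(gamma, beta, mean, var, eps=1e-5):
    """Inference-mode BN is the affine map v -> scale * v + shift."""
    scale = gamma / math.sqrt(var + eps)
    return scale, beta - mean * scale

def relu(img):
    return [[max(0.0, v) for v in row] for row in img]

def training_forward(img, kernel, bn_conv, bn_id):
    """Two branches: conv + BN and identity + BN, added, then ReLU."""
    s1, t1 = bn_affine(*bn_conv)
    s2, t2 = bn_affine(*bn_id)
    conv_out = conv3x3(img, kernel)
    summed = [[s1 * c + t1 + s2 * v + t2 for c, v in zip(crow, vrow)]
              for crow, vrow in zip(conv_out, img)]
    return relu(summed)

def reparameterize(kernel, bn_conv, bn_id):
    """Fold both BN layers and the identity branch into one kernel and bias."""
    s1, t1 = bn_affine(*bn_conv)
    s2, t2 = bn_affine(*bn_id)
    merged = [[s1 * k for k in row] for row in kernel]
    merged[1][1] += s2  # identity + BN equals a 3x3 kernel with only a centre tap
    return merged, t1 + t2

def inference_forward(img, merged_kernel, merged_bias):
    """Single convolution followed by ReLU."""
    return relu(conv3x3(img, merged_kernel, merged_bias))

# Hypothetical input, weights, and BN statistics (gamma, beta, mean, var).
image = [[1.0, -2.0, 0.5], [0.0, 3.0, -1.0], [2.0, 1.0, -0.5]]
kernel = [[0.1, -0.2, 0.0], [0.3, 0.5, -0.1], [0.0, 0.2, -0.3]]
bn_conv = (1.2, 0.1, 0.05, 0.9)
bn_id = (0.8, -0.2, 0.1, 1.1)

merged_kernel, merged_bias = reparameterize(kernel, bn_conv, bn_id)
train_out = training_forward(image, kernel, bn_conv, bn_id)
infer_out = inference_forward(image, merged_kernel, merged_bias)
max_diff = max(abs(a - b) for ra, rb in zip(train_out, infer_out)
               for a, b in zip(ra, rb))
```

The merge is exact only because everything before the ReLU is linear, which is why placing the activation after the Add operation matters for reparameterization.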
Figure 3e depicts the RepGhost architecture during inference; the structures in Figure 3c,d can be merged into Figure 3e through structural reparameterization. Based on structural reparameterization and the concept of feature reuse, this paper designs the bottleneck structure in the C2f module as a RepGhostBottleneck. This approach allows the network to retain multiple branches during training and merge them into a single convolutional operation during inference, thereby enabling more efficient extraction of information from feature maps while also keeping the model lightweight. The network architecture of C2f-RepGhost is shown in Figure 4.

2.3. The SPDConv Module Based on Spatial Coding Technology

In convolutional neural network architectures, stride convolutions and pooling operations are commonly used to reduce the dimensionality of feature maps. In stride convolutions, the default stride is 1, meaning the convolution kernel moves one step at a time across the feature map to extract features. This approach yields more complete and detailed spatial information. To achieve spatial dimensionality reduction in feature maps, convolution with a stride of 2 is commonly used. This amounts to selective sampling, which may result in the loss of continuous spatial information within the feature map. Pooling operations, in turn, are not only a means of dimensionality reduction but also a core method for maintaining spatial translation invariance. The dimension reduction mechanism of pooling is similar to that of stride convolution: it compresses information within a local region by sliding a pooling window (e.g., average pooling and max pooling). Maintaining spatial translation invariance is the most important property of pooling. Max pooling, for example, retains only the strongest response within the pooling window and ignores its precise location within the window, which yields translation invariance.
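The difference between stride-2 sampling and the space-to-depth rearrangement that SPDConv builds on can be sketched in a few lines. The code below is only an illustration on a single-channel map stored as nested lists; the function names, the scale factor of 2, and the order of the output channels are assumptions, and the non-strided convolution that follows the rearrangement in SPDConv is omitted.

```python
def space_to_depth(fmap, scale=2):
    """Split an H x W map into scale*scale sub-maps of size (H/scale) x (W/scale).

    Every input value ends up in exactly one sub-map, so nothing is discarded.
    """
    h, w = len(fmap), len(fmap[0])
    if h % scale or w % scale:
        raise ValueError("height and width must be divisible by scale")
    channels = []
    for dy in range(scale):
        for dx in range(scale):
            channels.append([[fmap[y][x] for x in range(dx, w, scale)]
                             for y in range(dy, h, scale)])
    return channels

def strided_sample(fmap, stride=2):
    """Keep one value per stride x stride block, as a stride-2 sampling grid does."""
    h, w = len(fmap), len(fmap[0])
    return [[fmap[y][x] for x in range(0, w, stride)]
            for y in range(0, h, stride)]

# A 4 x 4 map holding the values 0..15.
feature_map = [[4 * r + c for c in range(4)] for r in range(4)]

sub_maps = space_to_depth(feature_map)   # four 2 x 2 channels
sampled = strided_sample(feature_map)    # one 2 x 2 map

kept_by_spd = sorted(v for ch in sub_maps for row in ch for v in row)
kept_by_stride = sorted(v for row in sampled for v in row)
```

In this toy case the rearrangement keeps all 16 values across four channels, whereas the strided grid visits only 4 of them; the spatial size is halved in both cases.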
Traditional convolutional downsampling and pooling simply compress feature size, easily losing fine texture and edge details and weakening spatial features of tiny defects. Different from them, SPDConv realizes downsampling through spatial splitting, rearrangement and convolution fusion. It avoids discarding pixel-level information, effectively retains fine-grained spatial details such as defect contours and textures, and greatly reduces information loss, showing superior performance in capturing subtle PCB defects. However, in tasks involving low-resolution images and the detection of small-scale defects, spatial dimension reduction using stride convolution and pooling operations can lead to the loss of fine-grained information, thereby affecting the network’s final detection performance. Therefore, this section introduces the SPDConv module [ 21], which is based on spatial encoding techniques, to reduce the network’s computational complexity without losing details in the feature maps. The specific procedure is as follows: first, pixel blocks in the spatial dimension are rearranged into the depth dimension (also known as the channel dimension) to increase the number of spatial channels while simultaneously reducing the resolution in the spatial dimension. Next, the rearranged feature maps are further processed using non-strided convolution to ensure that more effective information is retained in the feature maps. The structure of the SPDConv module is shown in Figure 5. 2.4. C2f-DWR Improved with an Expansive Residual Module To capture multi-scale contextual information in the network and enhance the model’s ability to extract features at different scales, a single-step multi-scale deep dilated convolutional layer is typically employed. 
However, the use of single-step multi-scale deep dilated convolutions can make it difficult for the network to acquire contextual multi-scale information due to unreasonable network architecture design and overly complex input feature representations. To address these issues, Wei et al. proposed a more efficient multi-scale feature extraction method—the Dilation-wise Residual (DWR) module [ 22], which is achieved by dividing the original single-step path feature extraction method into two steps. To extract more context-critical information from complex feature maps, we first segment the input feature map into distinct regions, generating concise feature regions that contain varying levels of complexity. This step is referred to as regional residualization. Next, to address the differing receptive field requirements across various network stages, we employ dilated convolutions to design receptive fields of varying sizes for extracting multi-scale contextual features. This step is referred to as semantic residualization. The specific differences between traditional multi-scale feature extraction and the expansive residual method are illustrated in Figure 6a, and the structure of the expansive residual module is shown in Figure 6b. Assuming the input feature map has 2c channels, as shown in Figure 7, the DWR module first applies a standard 3 × 3 convolution to the input feature map to obtain an output feature map with 3c channels. It then divides the feature map—after normalization and activation function processing—into three feature maps, each with c channels, thereby simplifying the complex feature map. This is the concrete manifestation of regional residualization. Second, to extract semantic information at different scales from the feature map, multi-scale dilated convolutions with varying dilation ratios are applied to the three feature maps, each with c channels. This step represents semantic residualization. 
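The link between a dilation rate and its equivalent receptive field follows from the standard relation for a single dilated convolution, k_eff = k + (k − 1)(d − 1). The short sketch below applies it to a 3 × 3 kernel with the dilation rates of 1, 3, and 5 reported for the DWR module; the function name is hypothetical.

```python
def effective_kernel_size(kernel_size, dilation):
    """Side length of the window covered by one dilated convolution."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)

# Dilation rates of the three DWR branches, each using a 3 x 3 kernel.
dilation_rates = (1, 3, 5)
receptive_fields = {d: effective_kernel_size(3, d) for d in dilation_rates}
```

The three branches use the same number of weights and differ only in how far apart the kernel taps are placed.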
Among these, the convolution with a dilation ratio of 1 has an equivalent receptive field of 3 × 3 and is used to capture local details in the feature map. The convolution with a dilation ratio of 3 has an equivalent receptive field of 7 × 7 and is used to capture medium-range information in the feature maps. The convolution with a dilation ratio of 5 has the largest equivalent receptive field, 11 × 11, and is primarily used to capture global semantic information in the feature maps and feature information within the overall scene. This enables the network to process multi-scale feature information simultaneously at every stage, enhancing its ability to represent and learn the features of the dataset. Inspired by the DWR module, this paper improves the C2f module in YOLOv8, which is used to enhance feature representation capabilities and increase network depth, and proposes the C2f-DWR module. The standard C2f also employs residual connections to reuse features and increase network depth, thereby extracting multi-scale and deep-level feature information. In C2f, the Bottleneck residual module processes the same local features repeatedly, and the network's receptive field size remains constant, allowing for multi-scale fusion only at specific layers. In contrast, C2f-DWR replaces the Bottleneck module in C2f with the Dilation-wise Residual (DWR) module. This allows each layer of the network to have a distinct receptive field, enabling the capture of multi-scale feature information across the entire feature map, rather than performing multi-scale fusion only at specific layers as in FPN. The C2f-DWR module is shown in Figure 8. The improved network architecture is illustrated in Figure 9.

3. Experimental Setup and Simulation Analysis

3.1. Experimental Environment

The experimental environment used in this paper is shown in below. The training parameters are shown in .

3.2. Data Preparation

3.3.
Evaluation Criteria

Common metrics used to evaluate the performance of object detection models include precision (P), recall (R), average precision (AP), intersection over union (IoU), and mean average precision (mAP). This paper primarily uses average precision and mean average precision to evaluate model performance. The metrics are defined as follows.

Precision (P): The proportion of samples correctly identified as positive out of all samples predicted as positive. It is given by Equation (2).

$$P = \frac{TP}{TP + FP} \tag{2}$$

Recall (R): The proportion of positive samples correctly identified as positive out of all actual positive samples. It is given by Equation (3).

$$R = \frac{TP}{TP + FN} \tag{3}$$

Here, $TP$ refers to the number of positive samples correctly identified, $FP$ refers to the number of negative samples incorrectly identified as positive, and $FN$ refers to the number of positive samples incorrectly identified as negative.

Average Precision (AP): The area under the precision-recall curve. Since both precision and recall range from 0 to 1, AP also ranges from 0 to 1. It is given by Equation (4).

$$AP = \int_{0}^{1} P(R)\, dR \tag{4}$$

Mean Average Precision (mAP): The mean of the AP values over all defect categories. It provides a better indication of the trained model's generalization ability and robustness. It is given by Equation (5).

$$mAP = \frac{\sum_{i=1}^{n} AP_i}{n} \tag{5}$$

Here, $AP_i$ represents the average precision for the $i$-th defect category. Since there are six defect categories in the dataset used in this experiment, $n = 6$.

3.4.
Ablation Experiment To validate the effectiveness of the lightweight improved model proposed in this paper, we conduct ablation experiments to analyze the model’s performance in terms of accuracy, recall, mean average precision, number of parameters, and model size. In these ablation experiments, the original YOLOv8n model is defined as the Baseline. After sequentially introducing the C2f-RepGhost, SPDConv, and C2f-DWR modules into the Baseline, the resulting models are designated as Model 1, Model 2, and Model 3, respectively. The model resulting from the simultaneous introduction of the C2f-RepGhost and SPDConv modules into the Baseline is designated as Model 4. The model incorporating both C2f-RepGhost and C2f-DWR is designated as Model 5. The model incorporating both SPDConv and C2f-DWR is designated as Model 6. Finally, the model integrating all three improvement methods into the Baseline is designated as Model 7. The configuration of the ablation experiment is shown in . All eight experiments were conducted under identical experimental environments and training parameters to ensure the standardization and accuracy of the experiments. The experimental results provide a detailed presentation of the precision (AP) values for each defect category, the mean average precision (mAP@50) across all defect categories, model size, and the number of model parameters. The specific results are shown in . As shown in , the ablation experiments revealed that introducing the C2f-RepGhost module based on the reparameterization method reduced the network model size and number of parameters by 1.6 MB and 0.42 M, respectively, while the model’s mAP@50 only decreased by 0.2%, verifying that the structural reparameterization method can effectively reduce the number of network parameters and the model size. 
Introducing the SPDConv module based on spatial coding technology significantly improved the AP values of five of the six PCB defect types compared to the baseline model, the exception being short-circuit defects (which decreased by 0.1%), with mAP@50 increasing by 2.1%. This indicates that using stride convolution and pooling operations for spatial dimensionality reduction leads to the loss of fine-grained information. After introducing the Dilation-wise Residual (DWR) module, the AP of all six defect classes improved compared to YOLOv8n, and mAP@50 increased from 93.4% to 94.3%. The number of model parameters and the model size were also reduced, verifying the effectiveness of extracting feature information at each layer by designing different receptive fields within the residual structure. The results of Model 4 show that combining the C2f-RepGhost module and the SPDConv module achieves a synergistic effect, not only improving the AP of all defect classes but also simplifying the model. The results of Model 5 show that mAP@50 decreased by only 0.3%, while the number of model parameters decreased from 3.01 M to 2.53 M, a reduction of 15.95%. This indicates that the C2f-RepGhost and C2f-DWR modules are key to achieving network lightweighting in this algorithm. The results of Model 6 show that the simultaneous introduction of the SPDConv module and the C2f-DWR module produced the largest AP improvements for copper excess and burr defects, which increased by 1.6% and 8.4%, respectively. The AP of the other defect types also improved to varying degrees, indicating that the SPDConv module mainly plays a role in balancing accuracy.
The final Model 7 experiment shows that the combination of the three improvement methods proposed in this chapter not only reduced the model size and number of parameters, but also improved the accuracy of other types of defects. Among them, the AP of short-circuit defects reached the highest of 98%, and mAP@50 also increased from 93.4% of the baseline model to 95.9%, with an improvement of 2.5%. To visually illustrate the impact of the three improvement methods proposed in this chapter on precision (P), recall (R), mean precision (AP), and mean average precision (mAP@50), the following visualization is provided, as shown in Figure 13. To further analyze the detection performance of each model, an experimental analysis was conducted from three aspects: precision, recall rate, and F1 score. The experimental results are shown in . As shown in , the baseline model has the highest accuracy rate, but the recall rate is relatively low, and the comprehensive F1 score is 93.8%. In the improved model, Model 6 significantly increases the recall rate without sacrificing the accuracy rate, achieving the optimal F1 score of 95.0% and the best overall performance. Models 2, 4, and 7 also effectively improve the recall rate by slightly reducing the accuracy rate, and their overall performance is superior to the baseline. The three indicators of Model 1 and Model 5 have declined, indicating that the C2f-RepGhost module failed to achieve the expected gain in accuracy. The performance of Model 3 is the same as the baseline, with limited improvement. Overall, the improved scheme of Model 6 achieves the best balance between accuracy and recall rate, verifying the effectiveness of the SPDConv and C2f-DWR modules, and providing a direction for subsequent optimization. 
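The precision, recall, and F1 comparison above rests on the standard definitions of these metrics, which can be sketched as follows. This is a minimal illustration rather than the evaluation code used in the experiments: the function names and the counts in the example are hypothetical, F1 is taken as the harmonic mean of precision and recall, and AP is approximated with a plain trapezoidal rule over precision-recall points instead of any particular interpolation scheme.

```python
def precision(tp, fp):
    """Share of predicted positives that are correct."""
    return tp / (tp + fp) if tp + fp else 0.0

def recall(tp, fn):
    """Share of actual positives that are found."""
    return tp / (tp + fn) if tp + fn else 0.0

def f1_score(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if p + r else 0.0

def average_precision(recalls, precisions):
    """Trapezoidal approximation of the area under the P(R) curve."""
    points = sorted(zip(recalls, precisions))
    area = 0.0
    for (r0, p0), (r1, p1) in zip(points, points[1:]):
        area += (r1 - r0) * (p0 + p1) / 2
    return area

def mean_average_precision(ap_values):
    """Mean of the per-class AP values."""
    return sum(ap_values) / len(ap_values)

# Hypothetical counts for one defect class.
p = precision(tp=90, fp=10)
r = recall(tp=90, fn=30)
f1 = f1_score(p, r)

# Hypothetical precision-recall points for the same class.
ap = average_precision([0.0, 0.5, 1.0], [1.0, 1.0, 0.0])
```

Because F1 is a harmonic mean, a model that trades a small amount of precision for a larger gain in recall can still end up with a higher F1 score, which is the pattern reported for Models 2, 4, and 7.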
To intuitively reflect the variations of each parameter loss function curve during network training, the loss changes throughout the full training epochs of the original baseline model and the improved RSD-YOLOv8n model are visualized. The variations of loss functions for YOLOv8n are presented in Figure 14, and those for RSD-YOLOv8n are illustrated in Figure 15. By comparing the training curves of the two groups, it can be observed that the improved model not only maintains stable convergence during training but also achieves a significant improvement in overall performance. From the perspective of the loss function, the box_loss, cls_loss, and dfl_loss of the improved model in the training set and validation set show a smoother downward trend. Although the initial loss value is slightly higher than that of the original model, the final convergence value is comparable to that of the original model, without any overfitting or oscillation phenomena. This indicates that the improvement scheme did not disrupt the optimization process of the model but instead enhanced the training stability. In terms of accuracy indicators, the convergence speed of the precision, recall, mAP50, and mAP50-95 curves of the improved model is faster, and the final indicator values are also higher than those of the original model. Among them, the improvement in mAP50-95 is particularly significant, indicating that the improved model has stronger generalization ability and better detection performance at different intersection-over-union thresholds. Overall, the improvement scheme effectively enhances the target detection accuracy and robustness of the model without introducing training instability issues, and verifies the effectiveness of the improvement strategy. 3.6. Generalization Experiment The ablation and comparison experiments described above have demonstrated the robustness of the RSD-YOLOv8n algorithm. 
Next, we conduct experiments to evaluate the network's generalization ability and verify the algorithm's performance on another dataset. We selected the VisDrone2019 drone vision dataset [24]. Its targets resemble PCB defects in that both datasets consist largely of small targets. For comparison in the generalization experiments, we selected the OSD-YOLOv10 and C4D-YOLOv8 algorithms. The experimental results are shown in . As shown in , the proposed RSD-YOLOv8n algorithm improves the mAP@50 and mAP@50:95 metrics by 10.7% and 7.7%, respectively, compared to the YOLOv8n baseline model, while also reducing the number of parameters by 0.04 M. It also achieves the highest accuracy among the compared algorithms, including OSD-YOLOv10 and C4D-YOLOv8, demonstrating the effectiveness and generalization ability of the proposed RSD-YOLOv8n algorithm.

3.7. Visualization of Experimental Results

Building on the quantitative results, this section visually compares the actual detection performance of the original model and the improved model on representative scenarios selected from the test set. This qualitative comparison examines the effect of the improvement strategy on target positioning accuracy, small target recognition, and robustness to interference in complex backgrounds. The visualization of the experimental results is shown in Figure 17.

4. Conclusions

To address the real-time performance and lightweighting challenges encountered when deploying printed circuit board (PCB) defect detection models on embedded devices, this paper proposes the RSD-YOLOv8n lightweight detection algorithm based on YOLOv8n. First, the C2f module is improved through feature reuse and structural reparameterization, and C2f-RepGhost is proposed to enhance the backbone feature extraction capability.
Second, SPDConv is introduced to replace conventional strided convolution and pooling, which avoids the loss of spatial information while maintaining feature resolution and further strengthens feature learning. Finally, the C2f-DWR module is designed to improve the model's adaptability to defects of different scales through partition-based multi-scale feature fusion. Experimental results show that the mAP@50 of RSD-YOLOv8n increases from 93.4% to 95.9%, while the number of parameters decreases from 3.01 M to 2.97 M, which verifies the effectiveness of the proposed method. Cross-dataset generalization experiments also demonstrate its good generalization capability.

However, the current research still has several limitations. First, the model's robustness under extreme lighting conditions and target occlusion in complex industrial scenarios needs to be further improved. Second, there is still room to better balance lightweighting against detection accuracy, and the actual inference speed when deployed on edge devices has not been fully verified. Finally, the model's detection performance on small-sample and long-tail defect categories still has significant shortcomings, and targeted improvements are needed.

Future work can address small-sample defect detection by integrating semi-supervised learning, explore more efficient model compression and quantization methods to further improve real-time performance on edge devices, and investigate multi-modal information fusion to improve the generalization ability of the model in complex industrial environments.

Author Contributions: Conceptualization, J.L.; methodology, J.L., X.J. and Y.Z.; software, J.L.; validation, J.M.; formal analysis, J.L. and J.M.; investigation, J.L.; resources, X.J.; data curation, J.L.; writing (original draft preparation), J.L.; writing (review and editing), X.J. and Y.Z.; visualization, J.L. and J.M.; supervision, X.J.
and Y.Z.; project administration, X.J. and Y.Z.; funding acquisition, X.J. and Y.Z. All authors have read and agreed to the published version of the manuscript.

Funding: This work was supported in part by the Fujian Provincial Natural Science Foundation of China under Grant 2026J0011094, in part by the Putian Science and Technology Plan Project under Grants 2023GZ2001PTXY19 and 2023GJGZ003, in part by the Fujian Province Social Sciences Fund Project under Grant FJ2024C058, and in part by the Startup Fund for Advanced Talents of Putian University under Grant 2023015.

Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.

Data Availability Statement: The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest: The authors declare no conflicts of interest.

Figure 1. YOLOv8 Network Architecture.
Figure 2. C2f Structure.
Figure 3. The evolution from the Ghost module to the RepGhost module.
Figure 4. C2f-RepGhost module.
Figure 5. SPDConv Module.
Figure 6. Traditional multiscale feature extraction methods and the expanded residual method.
Figure 7. DWR Module Structure.
Figure 8. C2f-DWR Module.
Figure 9. Improved network architecture.
Figure 10. Actual Number of Defects by Type.
Figure 11. Images of various defect types; the red boxes indicate the marked defect locations.
Figure 12. Enhanced images after data augmentation.
Figure 13. Effects of three improvement methods.
Figure 14. Changes in loss functions during the training stage of YOLOv8n.
Figure 15. Changes in loss functions during the training stage of RSD-YOLOv8n.
Figure 16. Precision, recall, mAP@0.5, and mAP@0.5:0.95 curves during the training of different models.
Figure 17. Visualization of experimental results of YOLOv8n and RSD-YOLOv8n.

Experimental environment configuration.

| Item | Content |
|---|---|
| Operating System | Ubuntu 22.04.5 |
| CPU | AMD Zen 1 |
| GPU | NVIDIA GeForce RTX 4090D |
| Memory | 24 GB |
| CUDA Version | CUDA 12.1 |
| Deep Learning Framework | PyTorch |

Training parameters.

| Parameter | Value |
|---|---|
| Epoch | 300 |
| Batch Size | 32 |
| Workers | 8 |
| Learning Rate | 0.001 |
| Optimizer | Adam |
| Loss | CIoU |

Number of defects by category before and after data augmentation.

| Defect Type | Number Before Augmentation | Number After Augmentation |
|---|---|---|
| Open circuit | 116 | 580 |
| Short circuit | 116 | 580 |
| Mouse_bite | 115 | 580 |
| Missing_hole | 115 | 605 |
| Spurious_copper | 116 | 580 |
| Spur | 115 | 575 |
| Total | 693 | 3500 |

Configuration instructions for ablation experiment.

| Model | Configuration |
|---|---|
| Baseline | YOLOv8n |
| Model 1 | +C2f_RepGhost |
| Model 2 | +SPDConv |
| Model 3 | +C2f_DWR |
| Model 4 | +C2f_RepGhost+SPDConv |
| Model 5 | +C2f_RepGhost+C2f_DWR |
| Model 6 | +SPDConv+C2f_DWR |
| Model 7 | RSD-YOLOv8n |

Ablation experiment of different models.

| Model | Open circuit | Short circuit | Mouse bite | Missing hole | Spurious copper | Spur | mAP@50 | Model size | Number of parameters |
|---|---|---|---|---|---|---|---|---|---|
| Baseline | 93.1 | 96.9 | 90.6 | 98.4 | 96.0 | 85.6 | 93.4 | 11.48 | 3.01 |
| Model 1 | 93.5 | 95.8 | 89.2 | 99.2 | 94.8 | 86.5 | 93.2 | 9.88 | 2.59 |
| Model 2 | 96.6 | 96.8 | 93.6 | 99.0 | 96.6 | 90.5 | 95.5 | 13.28 | 3.48 |
| Model 3 | 93.8 | 97.4 | 91.2 | 99.4 | 96.2 | 87.7 | 94.3 | 11.25 | 2.95 |
| Model 4 | 96.7 | 97.0 | 93.8 | 99.2 | 97.1 | 90.3 | 95.7 | 11.33 | 2.97 |
| Model 5 | 92.4 | 97.1 | 88.8 | 99.1 | 94.5 | 86.1 | 93.1 | 9.65 | 2.53 |
| Model 6 | 96.0 | 97.2 | 93.5 | 99.4 | 97.5 | 92.8 | 96.1 | 12.93 | 3.39 |
| Model 7 | 96.6 | 98.0 | 93.2 | 99.2 | 97.3 | 91.1 | 95.9 | 11.33 | 2.97 |

Verification of supplementary parameters.

| Model | P (%) | R (%) | F1 (%) |
|---|---|---|---|
| Baseline | 97.8 | 90.2 | 93.8 |
| Model 1 | 96.8 | 88.4 | 92.4 |
| Model 2 | 96.8 | 92.1 | 94.4 |
| Model 3 | 96.9 | 90.8 | 93.8 |
| Model 4 | 96.9 | 91.9 | 94.3 |
| Model 5 | 96.6 | 88.2 | 92.2 |
| Model 6 | 97.6 | 92.5 | 95.0 |
| Model 7 | 97.5 | 91.9 | 94.6 |

Comparative experiment of different models.

| Model | Precision (%) | Recall (%) | mAP@50 (%) | mAP@50:95 (%) | Parameters (M) | FPS |
|---|---|---|---|---|---|---|
| YOLOv3 | 98.3 | 90.8 | 95.9 | 73.4 | 103.7 | 41.6 |
| YOLOv5n | 97.6 | 86.7 | 92.4 | 60.1 | 2.51 | 47.4 |
| YOLOv5s | 97.9 | 91.1 | 94.9 | 67.9 | 9.12 | 46.1 |
| YOLOv6n | 96.2 | 85.5 | 91.4 | 55.4 | 4.24 | 52.6 |
| YOLOv8n | 96.9 | 88.8 | 93.4 | 63.3 | 3.01 | 49.1 |
| YOLOv9t | 97.1 | 87.5 | 93.3 | 61.8 | 2.01 | 27.2 |
| YOLOv9s | 98.6 | 90.6 | 95.2 | 69.7 | 7.29 | 26.2 |
| YOLOv10n | 97.3 | 87.5 | 92.9 | 63.8 | 2.71 | 44.4 |
| YOLOv11n | 96.8 | 89.9 | 94.3 | 68.3 | 2.59 | 41.3 |
| RSD-YOLOv8n | 97.5 | 91.9 | 95.9 | 69.5 | 2.97 | 36.0 |

Generalization experiment of different models.

| Model | mAP@50 (%) | mAP@50:95 (%) | Parameters (M) |
|---|---|---|---|
| YOLOv8n | 35.2 | 20.5 | 3.01 |
| OSD-YOLOv10 [25] | 43.4 | 19.1 | 1.6 |
| C4D-YOLOv8 [26] | 45.8 | - | - |
| RSD-YOLOv8n | 45.9 | 28.2 | 2.97 |

Disclaimer/Publisher's Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s).
MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.

Citation (Chicago/Turabian style): Jin, Xianli, Jinqiang Li, Jie Ma, and Yangyang Zhao. 2026. "RSD-YOLOv8n: A Lightweight PCB Defect Detection Algorithm" Mathematics 14, no. 11: 1989. https://doi.org/10.3390/math14111989
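As a cross-check on the supplementary-parameter table above, the sketch below recomputes the F1 column from the P and R columns. It assumes the standard harmonic-mean definition F1 = 2PR / (P + R); the article does not state the formula it used, so this is an assumption. The values are copied from the table, and the names `ROWS`, `f1_score`, and `recomputed_f1` are illustrative only.

```python
# (precision %, recall %, reported F1 %) copied from the
# "Verification of supplementary parameters" table of this article.
ROWS = {
    "Baseline": (97.8, 90.2, 93.8),
    "Model 1": (96.8, 88.4, 92.4),
    "Model 2": (96.8, 92.1, 94.4),
    "Model 3": (96.9, 90.8, 93.8),
    "Model 4": (96.9, 91.9, 94.3),
    "Model 5": (96.6, 88.2, 92.2),
    "Model 6": (97.6, 92.5, 95.0),
    "Model 7": (97.5, 91.9, 94.6),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (assumed F1 definition).
    Works on percentages as well as fractions, since F1 scales linearly."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def recomputed_f1():
    """Recompute F1 for every row, rounded to one decimal like the table."""
    return {name: round(f1_score(p, r), 1) for name, (p, r, _) in ROWS.items()}


if __name__ == "__main__":
    for name, (p, r, reported) in ROWS.items():
        print(f"{name}: recomputed {f1_score(p, r):.2f}, reported {reported}")
```

With this definition, all eight recomputed values round to the F1 figures reported in the table.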
