Synthetic Data Generation for Object Recognition and Pose Estimation

Type
Master's Thesis
Program
Data science and AI (MPDSC), MSc
Published
2024
Authors
Zhang, Desheng
Kong, Xiangbo
Abstract
The automotive industry is looking for ways to enable flexible robotic assembly of objects that are difficult for conventional, preprogrammed robotic systems to handle. Wire harness assembly onto vehicles is particularly hard to automate, both because current robotic assembly is limited to preprogrammed tasks and because wire harnesses are deformable. Flexible wire harness assembly requires a well-developed vision system: robots must perceive the spatial information of wire harnesses and adapt their actions to handle them. Computer vision techniques such as object recognition and pose estimation have been widely adopted in robotics to improve the perception capabilities of robots, and advances in deep learning have substantially accelerated computer vision research. However, collecting and annotating datasets for training deep learning models takes a vast amount of time and human effort. In recent years, researchers have applied synthetic datasets to computer vision tasks, which requires less human effort and promises good annotation quality. In this thesis, synthetic datasets of connectors are created using the physically based rendering tool BlenderProc, and the data creation procedures are provided for further research and investigation. The synthetic datasets are then evaluated with object detection models (YOLOv5 [22] and YOLOv8 [23]) and a pose estimation model (Wide Depth Range [20]). The influence of domain randomization methods (e.g. adding distractors to the synthetic dataset) is also discussed and evaluated. The experimental results show that similar objects cause misclassification in connector detection tasks, that the domain gap leads to poor performance on real data, and that adding distractors to the synthetic dataset can improve the robustness of the detectors.
The study concludes with recommendations for future research, such as using Generative Adversarial Networks (GANs) to transfer the overall color and texture from the source images to the target images, or applying a bilevel optimization approach. Such methods could reduce the domain gap between synthetic and real data, thereby improving the performance of models trained on synthetic datasets.
Subject/keywords
Synthetic dataset, object recognition, pose estimation, computer vision, machine learning, neural network, domain randomization, domain gap