Object detection

Objects detected with OpenCV's Deep Neural Network module (dnn) using a YOLOv3 model trained on the COCO dataset, which can detect objects of 80 common classes.

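Object detection is a computer technology related to computer vision and image processing that deals with detecting instances of semantic objects of a certain class (such as humans, buildings, or cars) in digital images and videos.^[1] Well-researched domains of object detection include face detection and pedestrian detection. Object detection has applications in many areas of computer vision, including image retrieval and video surveillance.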

Uses

Detection of objects on a road

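Object detection is widely used in computer vision tasks such as image annotation,^[2] vehicle counting,^[3] activity recognition,^[4] face detection, face recognition, and video object co-segmentation. It is also used for tracking objects, for example tracking a ball during a football match, tracking the movement of a cricket bat, or tracking a person in a video.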

Concept

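Every object class has its own special features that help in classifying it. For example, all circles are round. Object class detection uses these special features. When looking for circles, for instance, one seeks objects whose points lie at a particular distance from a single point (the center). Similarly, when looking for squares, one needs objects that are perpendicular at the corners and have equal side lengths. A similar approach is used for face identification, where the eyes, nose, and lips can be located and features such as skin color and the distance between the eyes can be measured.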
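
The geometric tests described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names `is_circle` and `is_square`, the tolerance parameter, and the assumption that points are given as (x, y) tuples (with the square's corners listed in order) are hypothetical and not taken from any detection library.

```python
import math


def is_circle(points, center, tol=1e-6):
    """Return True if every point lies at (nearly) the same distance from center."""
    distances = [math.dist(p, center) for p in points]
    return max(distances) - min(distances) <= tol


def is_square(corners, tol=1e-6):
    """Return True if four corners, given in order, form a square.

    Checks the two properties named in the text: equal side lengths
    and perpendicular sides at every corner.
    """
    if len(corners) != 4:
        return False
    # Side vectors between consecutive corners, wrapping around.
    sides = [
        (corners[(i + 1) % 4][0] - corners[i][0],
         corners[(i + 1) % 4][1] - corners[i][1])
        for i in range(4)
    ]
    lengths = [math.hypot(dx, dy) for dx, dy in sides]
    if lengths[0] <= tol:
        return False  # degenerate: all corners coincide
    equal_sides = max(lengths) - min(lengths) <= tol
    # Adjacent sides are perpendicular when their dot product is zero.
    right_angles = all(
        abs(sides[i][0] * sides[(i + 1) % 4][0]
            + sides[i][1] * sides[(i + 1) % 4][1]) <= tol
        for i in range(4)
    )
    return equal_sides and right_angles
```

Under these assumptions, `is_square([(0, 0), (1, 0), (1, 1), (0, 1)])` holds, while a rhombus such as `[(0, 0), (2, 1), (3, 3), (1, 2)]` is rejected because its corners are not perpendicular. Real detectors work on noisy pixel data rather than exact coordinates, so this only illustrates the idea of class-specific features.
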

Methods

Comparison of speed and accuracy of various detectors on the Microsoft COCO testdev dataset https://cocodataset.org (all values are taken from the https://arxiv.org articles by the authors of these algorithms)

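Methods for object detection generally fall into either neural network-based or non-neural approaches. For non-neural approaches, it is necessary to first define features using one of the methods below and then use a technique such as a support vector machine (SVM) to do the classification. Neural techniques, on the other hand, can perform end-to-end object detection without specifically defining features, and are typically based on convolutional neural networks (CNN).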
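
As a rough illustration of the two-stage non-neural pipeline, the following Python sketch first computes a hand-crafted feature and then trains a linear SVM on it. Everything here is an assumption made for illustration: the feature is a simplified orientation histogram in the spirit of HOG (without cells or block normalization), the SVM is fitted by plain subgradient descent on the hinge loss, and the helper names and synthetic edge patches are hypothetical.

```python
import math


def make_edge_patch(size, position, vertical, low=0.0, high=1.0):
    """Synthetic grayscale patch containing one step edge (toy data)."""
    return [
        [high if (x if vertical else y) >= position else low for x in range(size)]
        for y in range(size)
    ]


def orientation_histogram(patch, bins=4):
    """Hand-crafted feature: normalised histogram of gradient orientations."""
    hist = [0.0] * bins
    rows, cols = len(patch), len(patch[0])
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]
            gy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            # Unsigned orientation in [0, pi), weighted by gradient magnitude.
            angle = math.atan2(gy, gx) % math.pi
            hist[min(int(angle / math.pi * bins), bins - 1)] += magnitude
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def train_linear_svm(features, labels, epochs=200, lr=0.1, reg=0.01):
    """Fit w and b by subgradient descent on the L2-regularised hinge loss.

    Labels must be +1 or -1.
    """
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # The regularisation term shrinks w on every step.
            w = [wi * (1 - lr * reg) for wi in w]
            if margin < 1:
                # Hinge loss is active: push the boundary away from x.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the decision boundary."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Stage 1: define features. Stage 2: classify them with the SVM.
train_patches = [(make_edge_patch(6, k, True), 1) for k in range(1, 6)]
train_patches += [(make_edge_patch(6, k, False), -1) for k in range(1, 6)]
features = [orientation_histogram(patch) for patch, _ in train_patches]
labels = [label for _, label in train_patches]
w, b = train_linear_svm(features, labels)
```

With this toy setup, `predict(w, b, orientation_histogram(patch))` separates patches with a vertical edge (label 1) from patches with a horizontal edge (label -1). A practical system would use a tested library implementation of both the feature extractor and the SVM.
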

Non-neural approaches:
- Viola-Jones object detection framework based on Haar features
- Scale-invariant feature transform (SIFT)
- Histogram of oriented gradients (HOG) features^[6]
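
The Viola-Jones framework evaluates Haar-like features quickly by means of an integral image (summed-area table). The following Python sketch shows a single two-rectangle feature. The function names, the nested-list image format, and the sign convention (right half minus left half) are assumptions chosen for illustration, not the framework's reference implementation.

```python
def integral_image(image):
    """Return a summed-area table with an extra leading row and column of zeros."""
    rows, cols = len(image), len(image[0])
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for y in range(rows):
        row_sum = 0
        for x in range(cols):
            row_sum += image[y][x]
            # Sum of all pixels above and to the left of (y, x), inclusive.
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def rect_sum(table, top, left, height, width):
    """Sum of the pixels in a rectangle, using only four table lookups."""
    bottom, right = top + height, left + width
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])


def two_rect_haar_feature(table, top, left, height, width):
    """Two-rectangle Haar-like feature: right half minus left half.

    A large absolute value suggests a vertical edge inside the window.
    """
    half = width // 2
    left_sum = rect_sum(table, top, left, height, half)
    right_sum = rect_sum(table, top, left + half, height, half)
    return right_sum - left_sum


image = [
    [1, 1, 9, 9],
    [1, 1, 9, 9],
    [1, 1, 9, 9],
    [1, 1, 9, 9],
]
table = integral_image(image)
edge_response = two_rect_haar_feature(table, 0, 0, 4, 4)  # 72 - 8 = 64
```

Because each rectangle sum costs four lookups regardless of its size, many such features can be evaluated per window cheaply, which is the property the framework relies on.
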
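Neural network approaches:
- Region proposals (R-CNN,^[7] Fast R-CNN,^[8] Faster R-CNN,^[9] Cascade R-CNN^[10])
- Single Shot MultiBox Detector (SSD)
- You Only Look Once (YOLO)
- Single-Shot Refinement Neural Network for Object Detection (RefineDet)
- RetinaNet
- Deformable convolutional networks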

See also

References

[1] Dasiopoulou, Stamatia, et al. "Knowledge-assisted semantic video object detection." IEEE Transactions on Circuits and Systems for Video Technology (2005): 1210–1224.
[2] Ling Guan; Yifeng He; Sun-Yuan Kung (1 March 2012). Multimedia Image and Video Processing. CRC Press. pp. 331–. ISBN 978-1-4398-3087-1.
[3] Alsanabani, Ala; Ahmed, Mohammed; AL Smadi, Ahmad (2020). "Vehicle Counting Using Detecting-Tracking Combinations: A Comparative Analysis". 2020 the 4th International Conference on Video and Image Processing. pp. 48–54. doi:10.1145/3447450.3447458. ISBN 9781450389075. S2CID 233194604.
[4] Wu, Jianxin, et al. "A scalable approach to activity recognition based on object use." 2007 IEEE 11th International Conference on Computer Vision. IEEE, 2007.
[5] Bochkovskiy, Alexey (2020). "Yolov4: Optimal Speed and Accuracy of Object Detection". arXiv:2004.10934 [cs.CV].
[6] Dalal, Navneet (2005). "Histograms of oriented gradients for human detection" (PDF). Computer Vision and Pattern Recognition. 1.
[7] Girshick, Ross (2014). "Rich feature hierarchies for accurate object detection and semantic segmentation" (PDF). Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE. pp. 580–587. arXiv:1311.2524. doi:10.1109/CVPR.2014.81. ISBN 978-1-4799-5118-5. S2CID 215827080.
[8] Girshick, Ross (2015). "Fast R-CNN" (PDF). Proceedings of the IEEE International Conference on Computer Vision. pp. 1440–1448. arXiv:1504.08083. Bibcode:2015arXiv150408083G.
[9] Ren, Shaoqing (2015). "Faster R-CNN". Advances in Neural Information Processing Systems. arXiv:1506.01497.
[10] Pang, Jiangmiao; Chen, Kai; Shi, Jianping; Feng, Huajun; Ouyang, Wanli; Lin, Dahua (2019-04-04). "Libra R-CNN: Towards Balanced Learning for Object Detection". arXiv:1904.02701v1 [cs.CV].
[11] Liu, Wei (October 2016). "SSD: Single shot multibox detector". Computer Vision – ECCV 2016. European Conference on Computer Vision. Lecture Notes in Computer Science. Vol. 9905. pp. 21–37. arXiv:1512.02325. doi:10.1007/978-3-319-46448-0_2. ISBN 978-3-319-46447-3. S2CID 2141740.
[12] Redmon, Joseph (2016). "You only look once: Unified, real-time object detection". Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. arXiv:1506.02640. Bibcode:2015arXiv150602640R.
[13] Redmon, Joseph (2017). "YOLO9000: better, faster, stronger". arXiv:1612.08242 [cs.CV].
[14] Redmon, Joseph (2018). "Yolov3: An incremental improvement". arXiv:1804.02767 [cs.CV].
[15] Wang, Chien-Yao (2021). "Scaled-YOLOv4: Scaling Cross Stage Partial Network". Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). arXiv:2011.08036. Bibcode:2020arXiv201108036W.
[16] Zhang, Shifeng (2018). "Single-Shot Refinement Neural Network for Object Detection". Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 4203–4212. arXiv:1711.06897. Bibcode:2017arXiv171106897Z.
[17] Lin, Tsung-Yi (2020). "Focal Loss for Dense Object Detection". IEEE Transactions on Pattern Analysis and Machine Intelligence. 42 (2): 318–327. arXiv:1708.02002. Bibcode:2017arXiv170802002L. doi:10.1109/TPAMI.2018.2858826. PMID 30040631. S2CID 47252984.
[18] Zhu, Xizhou (2018). "Deformable ConvNets v2: More Deformable, Better Results". arXiv:1811.11168 [cs.CV].
[19] Dai, Jifeng (2017). "Deformable Convolutional Networks". arXiv:1703.06211 [cs.CV].

External links

"Object Class Detection". Vision.eecs.ucf.edu. Archived from the original on 2013-07-14. Retrieved 2013-10-09.
"ETHZ – Computer Vision Lab: Publications". Vision.ee.ethz.ch. Archived from the original on 2013-06-03. Retrieved 2013-10-09.

