[Artificial Intelligence / TensorFlow] Multi-Object Detection with the TensorFlow Object Detection API, Part 3: Webcam Integration

cinema4dr12 2017. 11. 1. 23:36

Written by Geol Choi | Nov. 01, 2017

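This post shows how to feed frames captured from a webcam into the TensorFlow Object Detection API to detect objects.

If you have not read the previous posts, I recommend reading them first. Basic familiarity with Python-OpenCV is also required.

Multi-Object Detection with the TensorFlow Object Detection API, Part 1: Setting Up the Development Environment
Multi-Object Detection with the TensorFlow Object Detection API, Part 2: Code Walkthrough and Applications
Setting Up a Python-OpenCV Development Environment
TensorFlow Object Detection API GitHub Page

* Note: This post covers object detection on webcam input and explains only the code related to that. For an explanation of the rest of the code, please refer to the previous posts.

1. Importing the Python-OpenCV Library

To receive webcam frames, import the OpenCV library as follows: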

import cv2

2. Detecting Objects in Webcam Frames

First, take a look at the code below:

with detection_graph.as_default():
    with tf.Session(graph=detection_graph) as sess:
        cam = cv2.VideoCapture(0)
        
        while True:
            ret_val, image = cam.read()
            
            if ret_val:
                # OpenCV returns BGR frames; convert to RGB and add a batch dimension: [1, None, None, 3]
                image_np_expanded = np.expand_dims(cv2.cvtColor(image, cv2.COLOR_BGR2RGB), axis=0)
                image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
                
                # Each box represents a part of the image where a particular object was detected.
                boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
                
                # Each score represents the level of confidence for the corresponding object.
                # Score is shown on the result image, together with the class label.
                scores = detection_graph.get_tensor_by_name('detection_scores:0')
                classes = detection_graph.get_tensor_by_name('detection_classes:0')
                num_detections = detection_graph.get_tensor_by_name('num_detections:0')
                
                # Actual detection.
                (boxes, scores, classes, num_detections) = sess.run(
                        [boxes, scores, classes, num_detections],
                        feed_dict={image_tensor: image_np_expanded})
                
                # Visualization of the results of a detection.
                vis_util.visualize_boxes_and_labels_on_image_array(
                        image,
                        np.squeeze(boxes),
                        np.squeeze(classes).astype(np.int32),
                        np.squeeze(scores),
                        category_index,
                        use_normalized_coordinates=True,
                        line_thickness=8)
                
                cv2.imshow('my webcam', image)
                
                if cv2.waitKey(1) == 27: 
                    break  # esc to quit
        
        cv2.destroyAllWindows()

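I will explain only the key parts of the code.

Line 3 creates the webcam device object (cv2.VideoCapture(0)), and Line 6 reads the current frame from the webcam (cam.read()). If a frame is received successfully, ret_val is True; otherwise it is False.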
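The ret_val check can be isolated into a small helper. The following is a minimal sketch, not part of the original code: `read_frames` is a hypothetical helper name, and it accepts any object whose `read()` returns `(ret_val, frame)`, which is the contract of `cv2.VideoCapture.read()`. Unlike the loop above, which keeps polling after a failed read, this sketch stops at the first failure. The stand-in `FakeCapture` class exists only so the example runs without a camera.

```python
def read_frames(capture, max_frames=None):
    """Yield frames from `capture` until read() reports failure.

    `capture` is any object whose read() returns (ret_val, frame),
    which is the contract of cv2.VideoCapture.read().
    """
    count = 0
    while max_frames is None or count < max_frames:
        ret_val, frame = capture.read()
        if not ret_val:
            # No frame was delivered (camera unplugged, stream ended, ...).
            break
        yield frame
        count += 1


class FakeCapture:
    """Hypothetical stand-in for cv2.VideoCapture, used only for this demo."""

    def __init__(self, frames):
        self._frames = list(frames)

    def read(self):
        if self._frames:
            return True, self._frames.pop(0)
        return False, None


frames = list(read_frames(FakeCapture(["frame0", "frame1"])))
print(frames)  # ['frame0', 'frame1']
```

With a real camera, `read_frames(cv2.VideoCapture(0))` would be iterated the same way.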

Lines 9-35 detect objects in the frame received from the webcam and draw a bounding box around each detected object on the frame.
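Because `use_normalized_coordinates=True` is passed to the visualization call, the boxes returned by the model are in normalized form. As a rough sketch of what that means, assuming each box is ordered `[ymin, xmin, ymax, xmax]` with values in [0, 1] (the convention used by the Object Detection API), the boxes can be converted to pixel coordinates and filtered by score as follows. The helper names `to_pixel_box` and `filter_detections`, the 0.5 threshold, and the sample values are illustrative choices, not taken from the original code.

```python
def to_pixel_box(box, image_width, image_height):
    """Convert a normalized [ymin, xmin, ymax, xmax] box to integer
    pixel coordinates (left, top, right, bottom)."""
    ymin, xmin, ymax, xmax = box
    return (int(round(xmin * image_width)), int(round(ymin * image_height)),
            int(round(xmax * image_width)), int(round(ymax * image_height)))


def filter_detections(boxes, scores, classes, min_score=0.5):
    """Keep only detections whose score is at least `min_score`."""
    return [(box, score, cls)
            for box, score, cls in zip(boxes, scores, classes)
            if score >= min_score]


# Made-up values for one 640x480 frame
boxes = [[0.25, 0.5, 0.75, 1.0], [0.0, 0.0, 0.1, 0.1]]
scores = [0.9, 0.2]
classes = [1, 18]

for box, score, cls in filter_detections(boxes, scores, classes):
    print(cls, score, to_pixel_box(box, 640, 480))
# 1 0.9 (320, 120, 640, 360)
```

In the loop above, the inputs would be `np.squeeze(boxes)`, `np.squeeze(scores)`, and `np.squeeze(classes)`, with the width and height taken from `image.shape`.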

Line 37 displays the frame with the bounding boxes in a window, Lines 39-40 stop the loop when the user presses the Esc key, and Line 42 closes all windows opened by OpenCV.
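The loop above never releases the camera, and an exception raised inside it would skip `cv2.destroyAllWindows()`. One possible way to make cleanup unconditional is a context manager. The sketch below is an assumption about how this could be structured, not the original author's code: `managed_capture`, `is_escape`, and `DummyCapture` are hypothetical names, and the dummy object stands in for `cv2.VideoCapture` so the example runs without a camera. Masking the key code with `0xFF` is a common precaution, since `cv2.waitKey` is reported to return values with higher bits set on some platforms.

```python
from contextlib import contextmanager

ESC_KEY = 27  # key code compared against in the loop above


def is_escape(key_code):
    """Return True if the low byte of a waitKey-style code is Esc."""
    return (key_code & 0xFF) == ESC_KEY


@contextmanager
def managed_capture(capture, on_exit=None):
    """Yield `capture`, then always call capture.release() and `on_exit`."""
    try:
        yield capture
    finally:
        capture.release()
        if on_exit is not None:
            on_exit()


class DummyCapture:
    """Hypothetical stand-in for cv2.VideoCapture."""

    def __init__(self):
        self.released = False

    def release(self):
        self.released = True


cam = DummyCapture()
with managed_capture(cam):
    pass  # the read/detect/show loop would go here
print(cam.released)  # True
print(is_escape(27), is_escape(-1))  # True False
```

With OpenCV, this would read `with managed_capture(cv2.VideoCapture(0), on_exit=cv2.destroyAllWindows) as cam:`.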

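Example Webcam Detection Screen

The screen below shows an example of detection running on webcam input. When you run it, you will notice stuttering because of the computational load (I have not measured it, but the frame rate seems to be 5 FPS or lower).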
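Since the frame rate above is only an estimate, it could be measured directly. Here is a minimal sketch, assuming a rolling average over the most recent frames is good enough. `FpsMeter` is a hypothetical helper, not part of OpenCV or TensorFlow. Its `tick()` method would be called once per loop iteration, for example right after `cv2.imshow`.

```python
import time
from collections import deque


class FpsMeter:
    """Rolling frames-per-second estimate over the last `window` frames."""

    def __init__(self, window=30):
        self._stamps = deque(maxlen=window)

    def tick(self, now=None):
        """Record one frame and return the current FPS estimate.

        Returns 0.0 until at least two frames have been recorded.
        `now` allows explicit timestamps; by default the clock is used.
        """
        self._stamps.append(time.perf_counter() if now is None else now)
        if len(self._stamps) < 2:
            return 0.0
        elapsed = self._stamps[-1] - self._stamps[0]
        return (len(self._stamps) - 1) / elapsed if elapsed > 0 else 0.0


meter = FpsMeter()
for t in (0.0, 0.25, 0.5, 0.75, 1.0):  # explicit timestamps: one frame every 0.25 s
    fps = meter.tick(now=t)
print(fps)  # 4.0
```

The explicit timestamps are there only to make the example deterministic; in the webcam loop, `meter.tick()` without arguments would use the real clock.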

Full Python Code

Finally, here is the complete Python-TensorFlow code that connects webcam input to the TensorFlow Object Detection API and detects objects in the frames. This concludes the post.

# -*- coding: utf-8 -*-
"""
TensorFlow Object Detection API + OpenCV Sample
Created on Mon Oct 30 12:43:54 2017
@author: gchoi
"""
import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import cv2
 
# This is needed since the notebook is stored in the object_detection folder.
sys.path.append("..")
 
from utils import label_map_util
from utils import visualization_utils as vis_util
 
tf.reset_default_graph()
tf.get_default_graph()
 
# What model to download.
MODEL_NAME = 'ssd_mobilenet_v1_coco_11_06_2017'
MODEL_FILE = MODEL_NAME + '.tar.gz'
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'
 
# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_CKPT = MODEL_NAME + '/frozen_inference_graph.pb'
 
# List of the strings that is used to add correct label for each box.
PATH_TO_LABELS = os.path.join('data', 'mscoco_label_map.pbtxt')
 
NUM_CLASSES = 90
 
opener = urllib.request.URLopener()
opener.retrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)
tar_file = tarfile.open(MODEL_FILE)
 
for file in tar_file.getmembers():
    file_name = os.path.basename(file.name)
    if 'frozen_inference_graph.pb' in file_name:
        tar_file.extract(file, os.getcwd())
 
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')
        
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)
 
def load_image_into_numpy_array(image):
  (im_width, im_height) = image.size
  return np.array(image.getdata()).reshape((im_height, im_width, 3)).astype(np.uint8)
  
# Size, in inches, of the output images.
IMAGE_SIZE = (12, 8)
 
with detection_graph.as_default():
    with tf.Session(graph=detection_graph) as sess:
        cam = cv2.VideoCapture(0)
        
        while True:
            ret_val, image = cam.read()
            
            if ret_val:
                # OpenCV returns BGR frames; convert to RGB and add a batch dimension: [1, None, None, 3]
                image_np_expanded = np.expand_dims(cv2.cvtColor(image, cv2.COLOR_BGR2RGB), axis=0)
                image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
                
                # Each box represents a part of the image where a particular object was detected.
                boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
                
                # Each score represents the level of confidence for the corresponding object.
                # Score is shown on the result image, together with the class label.
                scores = detection_graph.get_tensor_by_name('detection_scores:0')
                classes = detection_graph.get_tensor_by_name('detection_classes:0')
                num_detections = detection_graph.get_tensor_by_name('num_detections:0')
                
                # Actual detection.
                (boxes, scores, classes, num_detections) = sess.run(
                        [boxes, scores, classes, num_detections],
                        feed_dict={image_tensor: image_np_expanded})
                
                # Visualization of the results of a detection.
                vis_util.visualize_boxes_and_labels_on_image_array(
                        image,
                        np.squeeze(boxes),
                        np.squeeze(classes).astype(np.int32),
                        np.squeeze(scores),
                        category_index,
                        use_normalized_coordinates=True,
                        line_thickness=8)
                
                cv2.imshow('my webcam', image)
                
                if cv2.waitKey(1) == 27: 
                    break  # esc to quit
        
        cam.release()  # free the webcam device
        cv2.destroyAllWindows()