TY - GEN
T1 - UAV Target Selection
T2 - 2021 IEEE International Conference on Robotics and Automation, ICRA 2021
AU - Medeiros, Anna C.S.
AU - Ratsamee, Photchara
AU - Orlosky, Jason
AU - Uranishi, Yuki
AU - Higashida, Manabu
AU - Takemura, Haruo
N1 - Funding Information:
We would like to thank the Kobe Fire Academy for their cooperation and collaboration. We would also like to thank MBZIRC2020 for their financial support. This research was funded in part by the Office of Naval Research, grant N62909-18-1-2036.
Publisher Copyright:
© 2021 IEEE
PY - 2021
Y1 - 2021
N2 - This paper presents a 3D pointing interface for designating a UAV's target in large-scale environments. The system enables UAVs equipped with a monocular camera to determine which window of a building a human user has selected, in large-scale indoor or outdoor environments. The 3D pointing interface consists of three parts: YOLO, OpenPose, and ORB-SLAM. YOLO detects the target objects, e.g., windows; OpenPose extracts the user's pose; and ORB-SLAM builds a scale-dependent 3D map as a sparse set of 3D feature points. To recover the visual scale, the system performs a calibration step with the user standing at a known distance in front of the UAV. We detail how we chose the gesture, how we localize and detect objects, and how we transform between coordinate systems. Real-world experiments showed that the 3D pointing interface achieved an average F1-score of 0.73, and an F1-score of 0.58 at the maximum distance of 25 meters between the UAV and the building.
AB - This paper presents a 3D pointing interface for designating a UAV's target in large-scale environments. The system enables UAVs equipped with a monocular camera to determine which window of a building a human user has selected, in large-scale indoor or outdoor environments. The 3D pointing interface consists of three parts: YOLO, OpenPose, and ORB-SLAM. YOLO detects the target objects, e.g., windows; OpenPose extracts the user's pose; and ORB-SLAM builds a scale-dependent 3D map as a sparse set of 3D feature points. To recover the visual scale, the system performs a calibration step with the user standing at a known distance in front of the UAV. We detail how we chose the gesture, how we localize and detect objects, and how we transform between coordinate systems. Real-world experiments showed that the 3D pointing interface achieved an average F1-score of 0.73, and an F1-score of 0.58 at the maximum distance of 25 meters between the UAV and the building.
UR - http://www.scopus.com/inward/record.url?scp=85125483602&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85125483602&partnerID=8YFLogxK
U2 - 10.1109/ICRA48506.2021.9561688
DO - 10.1109/ICRA48506.2021.9561688
M3 - Conference contribution
AN - SCOPUS:85125483602
T3 - Proceedings - IEEE International Conference on Robotics and Automation
SP - 3963
EP - 3969
BT - 2021 IEEE International Conference on Robotics and Automation, ICRA 2021
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 30 May 2021 through 5 June 2021
ER -