D-Robotics YOLO-World Open-Vocabulary Object Detection

Date: 2024-11-23 Source: D-Robotics official website

YOLO-World

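Overview

YOLO-World is an advanced open-vocabulary object detection method. It detects new object categories efficiently in a zero-shot manner, and the set of detected categories follows the input text.

Application scenarios: zero-shot detection gives YOLO-World strong generalization, so it can be applied in areas such as intelligent driving, smart home, and geological inspection.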

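Supported Platforms

Prerequisites

RDK Platform

1. The RDK has been flashed with the Ubuntu 22.04 system image.

2. TogetheROS.Bot has been installed on the RDK.

3. A MIPI or USB camera is installed on the RDK.

4. The PC can reach the RDK over the network.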

Installing the Package

sudo apt update

sudo apt install tros-humble-hobot-yolo-world

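Usage

The YOLO-World package (hobot_yolo_world) subscribes to images published by the sensor package. The detection categories follow the input text: text features come from a local feature library, so each input text is looked up in that library and the matching features are fed to the model for inference. After inference the package publishes the algorithm msg, and the websocket package renders the images published by the sensor, together with the corresponding algorithm results, in a browser on the PC.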
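Because text features are looked up in a local library, a text that has no entry there cannot be resolved. The sketch below checks a requested text list against a vocabulary file before launching. The helper name missing_texts and the one-entry-per-line file format are assumptions made for illustration; they are not part of hobot_yolo_world.

```shell
# Hypothetical helper, not part of hobot_yolo_world.
# Prints every requested text that is absent from a vocabulary file.
# Assumption: the vocabulary file holds one entry per line.
missing_texts() {
  texts=$1
  vocab_file=$2
  printf '%s\n' "$texts" | tr ',' '\n' | while IFS= read -r text; do
    # -x: match the whole line, -F: fixed string, -q: no output
    grep -qxF -- "$text" "$vocab_file" || printf '%s\n' "$text"
  done
}

# Demo with a throwaway vocabulary file.
vocab=$(mktemp)
printf '%s\n' "red bottle" "trash bin" "person" > "$vocab"
missing_texts "red bottle,blue cup" "$vocab"   # prints: blue cup
rm -f "$vocab"
```

The comparison is exact, so a space after a comma ("red bottle, trash bin") would be reported as missing.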

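RDK Platform

Publishing images from a MIPI camera

# Set up the tros.b environment

source /opt/tros/humble/setup.bash

# Copy the configuration files needed by the example from the tros.b installation path.

cp -r /opt/tros/${TROS_DISTRO}/lib/hobot_yolo_world/config/ .

# Select the MIPI camera

export CAM_TYPE=mipi

# Start the launch file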

ros2 launch hobot_yolo_world yolo_world.launch.py yolo_world_texts:="red bottle,trash bin"
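The launch command passes all categories as one comma-separated string in yolo_world_texts. When the categories come from a script, a small helper can build that string. The following is a minimal sketch; join_texts is a hypothetical name, and the command is printed rather than executed so it can be reviewed first.

```shell
# Hypothetical helper: join all arguments with commas.
join_texts() {
  # "$*" joins the arguments with the first character of IFS;
  # the subshell keeps the IFS change local.
  ( IFS=,; printf '%s\n' "$*" )
}

texts=$(join_texts "red bottle" "trash bin")

# Print the launch command instead of running it.
printf 'ros2 launch hobot_yolo_world yolo_world.launch.py yolo_world_texts:="%s"\n' "$texts"
```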

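Publishing images from a USB camera

# Set up the tros.b environment

source /opt/tros/humble/setup.bash

The remaining steps mirror the MIPI example above: copy the configuration files, set CAM_TYPE for the USB camera (usb is assumed here by analogy with mipi and fb), and start the same launch file.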

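Feeding back a local image

# Set up the tros.b environment

source /opt/tros/humble/setup.bash

# Copy the configuration files needed by the example from the tros.b installation path.

cp -r /opt/tros/${TROS_DISTRO}/lib/hobot_yolo_world/config/ .

# Select local image feedback

export CAM_TYPE=fb

# Start the launch file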

ros2 launch hobot_yolo_world yolo_world.launch.py yolo_world_texts:="red bottle,trash bin"
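Each variant copies the config directory into the working directory before launching, and the node then loads files from it by relative path. A pre-launch check along the following lines can catch a missing copy early. The function check_config is hypothetical. The file names are taken from elsewhere in this article (the model path printed in the log and the embeddings file produced in the advanced step), and that both live in config/ is an assumption.

```shell
# Hypothetical pre-launch check, not part of hobot_yolo_world.
# Reports which expected files are missing from a config directory.
check_config() {
  config_dir=$1
  status=0
  for name in yolo_world.bin offline_vocabulary_embeddings.json; do
    if [ ! -s "$config_dir/$name" ]; then
      echo "missing or empty: $config_dir/$name" >&2
      status=1
    fi
  done
  return "$status"
}

# Run from the directory that should hold the copied config/ folder.
if check_config ./config; then
  echo "config looks complete"
else
  echo "copy the config directory again before launching"
fi
```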

Result Analysis

The terminal prints output like the following:

[hobot_yolo_world-3] [WARN] [0000003710.693524477] [hobot_yolo_world]: This is hobot yolo world!

[hobot_yolo_world-3] [WARN] [0000003710.792557185] [hobot_yolo_world]: Parameter:

[hobot_yolo_world-3] feed_type(0:local, 1:sub): 1

[hobot_yolo_world-3] image: config/yolo_world_test.jpg

[hobot_yolo_world-3] dump_render_img: 0

[hobot_yolo_world-3] is_shared_mem_sub: 1

[hobot_yolo_world-3] score_threshold: 0.05

[hobot_yolo_world-3] iou_threshold: 0.45

[hobot_yolo_world-3] nms_top_k: 50

[hobot_yolo_world-3] texts: red bottle,trash bin

[hobot_yolo_world-3] ai_msg_pub_topic_name: /hobot_yolo_world

[hobot_yolo_world-3] ros_img_sub_topic_name: /image

[hobot_yolo_world-3] ros_string_sub_topic_name: /target_words

[hobot_yolo_world-3] [WARN] [0000003710.848418019] [hobot_yolo_world]: Parameter:

[hobot_yolo_world-3] model_file_name: config/yolo_world.bin

[hobot_yolo_world-3] model_name:

[hobot_yolo_world-3] [WARN] [0000003710.848540935] [hobot_yolo_world]: model_file_name_: config/yolo_world.bin, task_num: 4

[hobot_yolo_world-3] [BPU_PLAT]BPU Platform Version(1.3.6)!

[hobot_yolo_world-3] [HBRT] set log level as 0. version = 3.15.49.0

[hobot_yolo_world-3] [DNN] Runtime version = 1.23.8_(3.15.49 HBRT)

[hobot_yolo_world-3] [A][DNN][packed_model.cpp:247][Model](1970-01-01,01:01:51.482.877) [HorizonRT] The model builder version = 1.23.5

[hobot_yolo_world-3] [WARN] [0000003711.739402019] [hobot_yolo_world]: Get model name: yolo_world_pad_pretrain_norm_new from load model.

[hobot_yolo_world-3] [WARN] [0000003711.739551686] [hobot_yolo_world]: Create ai msg publisher with topic_name: /hobot_yolo_world

[hobot_yolo_world-3] [WARN] [0000003711.794810269] [hobot_yolo_world]: Create string subscription with topic_name: /target_words

[hobot_yolo_world-3] [WARN] [0000003711.808682144] [hobot_yolo_world]: Create img hbmem_subscription with topic_name: /hbmem_img

[hobot_yolo_world-3] [WARN] [0000003712.541236020] [yolo_world]: Loaned messages are only safe with const ref subscription callbacks. If you are using any other kind of subscriptions, set the ROS_DISABLE_LOANED_MESSAGES environment variable to 1 (the default).

[hobot_yolo_world-3] [W][DNN]bpu_model_info.cpp:491][Version](1970-01-01,01:01:51.727.259) Model: yolo_world_pad_pretrain_norm_new. Inconsistency between the hbrt library version 3.15.49.0 and the model build version 3.15.47.0 detected, in order to ensure correct model results, it is recommended to use compilation tools and the BPU SDK from the same OpenExplorer package.

[hobot_yolo_world-3] [WARN] [0000003714.698775687] [hobot_yolo_world]: Sub img fps: 1.00, Smart fps: 1.51, pre process time ms: 30, infer time ms: 121, post process time ms: 5

[hobot_yolo_world-3] [WARN] [0000003716.714586355] [hobot_yolo_world]: Sub img fps: 1.00, Smart fps: 0.99, pre process time ms: 40, infer time ms: 127, post process time ms: 6

[hobot_yolo_world-3] [WARN] [0000003718.707619939] [hobot_yolo_world]: Sub img fps: 1.00, Smart fps: 1.00, pre process time ms: 39, infer time ms: 121, post process time ms: 6
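The periodic status lines report the frame rates and the per-stage latency (pre-processing, inference, post-processing). In the sample above, inference dominates at 121 to 127 ms per frame. To summarize a longer run, the inference times can be averaged from saved output. This is a minimal sketch; avg_infer_ms is a hypothetical helper that only assumes the "infer time ms: N" wording shown in the log.

```shell
# Average the "infer time ms" values found in launch output read from stdin.
avg_infer_ms() {
  sed -n 's/.*infer time ms: \([0-9][0-9]*\).*/\1/p' |
    awk '{ sum += $1; n++ }
         END { if (n) printf "%.1f\n", sum / n; else print "no samples" }'
}

# Demo on the three status lines shown above.
avg_infer_ms <<'EOF'
[hobot_yolo_world-3] [WARN] [0000003714.698775687] [hobot_yolo_world]: Sub img fps: 1.00, Smart fps: 1.51, pre process time ms: 30, infer time ms: 121, post process time ms: 5
[hobot_yolo_world-3] [WARN] [0000003716.714586355] [hobot_yolo_world]: Sub img fps: 1.00, Smart fps: 0.99, pre process time ms: 40, infer time ms: 127, post process time ms: 6
[hobot_yolo_world-3] [WARN] [0000003718.707619939] [hobot_yolo_world]: Sub img fps: 1.00, Smart fps: 1.00, pre process time ms: 39, infer time ms: 121, post process time ms: 6
EOF
# prints: 123.0
```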

Open http://IP:8000 in a browser on the PC to view the images and the rendered algorithm results (IP is the IP address of the RDK).

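Advanced Usage

To change the local text features, generate new ones locally with the provided tool, as follows.

# Copy the tool files needed by the example from the tros.b installation path.

cp -r /opt/tros/${TROS_DISTRO}/lib/hobot_yolo_world/tool/ .

# Download and extract the model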

wget http://archive.D-Robotics.cc/models/yoloworld_encode_text/huggingclip_text_encode.tar.gz

sudo tar -xf huggingclip_text_encode.tar.gz -C tool

cd tool/

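# Install the dependencies

pip install -r requirements.txt

# Edit the vocabulary in class.list

# Generate the local vocabulary embeddings

python main.py

# Move the new vocabulary embeddings into the config directory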

mv offline_vocabulary_embeddings.json ../config/
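The final mv overwrites the embeddings file that shipped in config/. If the previous vocabulary may be needed again, the move can be wrapped so that a backup is kept. The sketch below is one way to do that; install_embeddings and the .bak suffix are illustrative choices, not part of the tool.

```shell
# Hypothetical wrapper around the final mv step, not part of the tool.
# Keeps a .bak copy of the previous embeddings file before replacing it.
install_embeddings() {
  src=$1
  dst_dir=$2
  dst="$dst_dir/$(basename "$src")"
  if [ ! -s "$src" ]; then
    echo "not found or empty: $src" >&2
    return 1
  fi
  if [ -f "$dst" ]; then
    cp "$dst" "$dst.bak" || return 1
  fi
  mv "$src" "$dst"
}

# Usage, run inside tool/ after `python main.py`:
#   install_embeddings offline_vocabulary_embeddings.json ../config
```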


This article was reposted by 雪飘梦飞 from the D-Robotics official website. The original title is "YOLO World 开放词汇目标检测" (YOLO World Open-Vocabulary Object Detection).
