Enable OpenVINO™ Optimization for GroundingDINO

Authors: Wenyi Zou, Xiake Sun

Introduction

GroundingDINO introduces a language-guided query selection module to enhance object detection using input text. This module selects relevant features from image and text inputs and uses them as decoder queries. In this blog, we provide the OpenVINO™ optimization for GroundingDINO on Intel® platforms.

The public GroundingDINO project is referenced from: GroundingDINO

The GroundingDINO refer the model structure in below picture:

Figure 1. The framework of Grounding DINO. We present the overall framework, a feature enhancer layer, and a decoder layer in block 1,block 2, and block 3,respectively.

OpenVINO™ backend on GroundingDINO

In this project, you do not require to download OpenVINO™ and build the library with GroundingDINO project manually. It’s already fully integrated with OpenVINO™ runtime library for downloading, program compiling and linking.

At present, this repository already optimized and validated by OpenVINO™ 2023.1.0.dev20230811 version. Check the operating system which can support OpenVINO™ runtime library directly:

  • Ubuntu 22.04 long-term support     (LTS), 64-bit (Kernel 5.15+)
  • Ubuntu 20.04 long-term support     (LTS), 64-bit (Kernel 5.15+)
  • Ubuntu 18.04 long-term support     (LTS) with limitations, 64-bit (Kernel 5.4+)
  • Windows* 10 
  • Windows* 11 
  • macOS* 10.15 and above,     64-bit 
  • Red Hat Enterprise Linux* 8,     64-bit

Step 1: Install system dependency and setup environment

Create and enable python virtual environment

conda create -n ov_py310 python=3.10 -y
conda activate ov_py310

Clone the GroundingDINO repository from GitHub

git clone https://github.com/wenyi5608/GroundingDINO.git -b wenyi5608-openvino

Change the current directory to the GroundingDINO folder

cd GroundingDINO/

Install python dependency

pip install -r requirements.txt
pip install openvino==2023.1.0.dev20230811 openvino-dev==2023.1.0.dev20230811 onnx onnxruntime

Install the required dependencies in the current directory

pip install -e .

Download pre-trained model weights

mkdir weights
cd weights/
wget -q https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth
cd ..

Step 2: Export to OpenVINO™ models

python demo/export_openvino.py -c groundingdino/config/GroundingDINO_SwinT_OGC.py -p weights/groundingdino_swint_ogc.pth -o weights/

Step 3: Simple inference test with PyTorch and OpenVINO™

Inference with PyTorch

python demo/inference_on_a_image.py \
-c groundingdino/config/GroundingDINO_SwinT_OGC.py \
-p weights/groundingdino_swint_ogc.pth \
-i .asset/demo7.jpg \
-t "Horse. Clouds. Grasses. Sky. Hill." \
-o logs/1111 \
 --cpu-only
 

Inference with OpenVINO™

python demo/ov_inference_on_a_image.py \
-c groundingdino/config/GroundingDINO_SwinT_OGC.py \
-p weights/groundingdino.xml \
-i .asset/demo7.jpg  \
-t " Horse. Clouds. Grasses. Sky. Hill."  \
-o logs/2222 -d CPU
Figure2. Detection Prompt: “Horse. Clouds. Grasses. Sky. Hill.”, Visualization of OpenVINO™(left) and PyTorch(right) model output.