Object detection with Owl-ViT
This project implements the Owl-ViT model for zero-shot object detection in videos or in a series of images.
Detection using the prompts "person" and "ball".
Detection using the prompts "person" and "balloon".
To install the project, run the following commands:
git clone https://github.com/killian31/ObjectsDetection.git
cd ObjectsDetection
pip install -r requirements.txt --upgrade
Usage: 
    python3 owl_vit.py [-h] --imgs_dir IMGS_DIR --save_to SAVE_TO [--fps FPS] [--process_video] [--video_filename VIDEO_FILENAME] [--image_start IMAGE_START] [--image_end IMAGE_END]
                       [--texts TEXTS [TEXTS ...]] [--thresholds THRESHOLDS [THRESHOLDS ...]] [--box_thickness BOX_THICKNESS] [--save_model]
Options:
  -h, --help            show this help message and exit
  --imgs_dir IMGS_DIR   The directory containing the images.
  --save_to SAVE_TO     The directory in which to save the processed images.
  --fps FPS             Number of frames per second for the output video (default: None, determined automatically if --process_video).
  --process_video       Whether to get images from a video or not (default: False).
  --video_filename VIDEO_FILENAME
                        Name of video file to process (default: None).
  --image_start IMAGE_START
                        Frame to start from (default: 0).
  --image_end IMAGE_END
                        Frame to end with (default: 0, meaning the last frame).
  --texts TEXTS [TEXTS ...]
                        A list of texts to detect in the images (default: 14 random texts).
  --thresholds THRESHOLDS [THRESHOLDS ...]
                        A list of thresholds between 0 and 1 (default: 14 low thresholds for the texts).
  --box_thickness BOX_THICKNESS
                        The thickness of the bounding boxes to draw (default: 2).
  --save_model          Whether to save the pretrained model locally or not (default: False).
Run python3 owl_vit.py -h to display the help message above.
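For readers curious how an interface like this could be declared, here is a minimal argparse sketch that mirrors the options listed above. It is not the project's actual code: the argument types (for example, an integer --fps) are assumptions, and the documented default lists of 14 texts and 14 thresholds are not reproduced, so both default to None here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="owl_vit.py")
    parser.add_argument("--imgs_dir", required=True,
                        help="The directory containing the images.")
    parser.add_argument("--save_to", required=True,
                        help="The directory in which to save the processed images.")
    # Assumed to be an integer; the help text only says "number of frames per second".
    parser.add_argument("--fps", type=int, default=None,
                        help="Frames per second for the output video.")
    parser.add_argument("--process_video", action="store_true",
                        help="Whether to get images from a video or not.")
    parser.add_argument("--video_filename", default=None,
                        help="Name of video file to process.")
    parser.add_argument("--image_start", type=int, default=0,
                        help="Frame to start from.")
    parser.add_argument("--image_end", type=int, default=0,
                        help="Frame to end with (0 means the last frame).")
    # The real defaults (14 texts, 14 thresholds) are unknown here, hence None.
    parser.add_argument("--texts", nargs="+", default=None,
                        help="A list of texts to detect in the images.")
    parser.add_argument("--thresholds", nargs="+", type=float, default=None,
                        help="A list of thresholds between 0 and 1.")
    parser.add_argument("--box_thickness", type=int, default=2,
                        help="The thickness of the bounding boxes to draw.")
    parser.add_argument("--save_model", action="store_true",
                        help="Whether to save the pretrained model locally.")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    "--imgs_dir frames --save_to detected_example --process_video "
    "--video_filename data/video.mp4 --texts person ball "
    "--thresholds 0.08 0.12 --save_model".split()
)
print(args.texts, args.thresholds)
```

Using nargs="+" is what lets --texts and --thresholds accept several space-separated values, as in the usage line.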
Example:
python3 owl_vit.py --imgs_dir frames --save_to detected_example --process_video --video_filename data/video.mp4 --texts person ball --thresholds 0.08 0.12 --save_model
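The command above passes two texts and two thresholds, which suggests that each threshold applies to the text in the same position (0.08 for "person", 0.12 for "ball"). The sketch below illustrates that pairing under this assumption. The function name filter_detections, the detection tuple layout, and the scores are hypothetical and made up for illustration; whether the script keeps scores equal to the threshold is also an assumption.

```python
from typing import List, Sequence, Tuple

# Hypothetical detection record: (label, score, (x_min, y_min, x_max, y_max)).
Detection = Tuple[str, float, Tuple[int, int, int, int]]


def filter_detections(
    detections: Sequence[Detection],
    texts: Sequence[str],
    thresholds: Sequence[float],
) -> List[Detection]:
    """Keep detections whose score reaches the threshold paired with their label."""
    if len(texts) != len(thresholds):
        raise ValueError("Expected exactly one threshold per text.")
    # Pair texts and thresholds by position, as on the command line.
    threshold_for = dict(zip(texts, thresholds))
    return [
        det for det in detections
        if det[0] in threshold_for and det[1] >= threshold_for[det[0]]
    ]


# Made-up detections, not real model output.
detections: List[Detection] = [
    ("person", 0.31, (10, 20, 110, 220)),
    ("person", 0.05, (300, 40, 360, 200)),   # below 0.08, dropped
    ("ball", 0.10, (150, 180, 170, 200)),    # below 0.12, dropped
    ("ball", 0.40, (400, 300, 430, 330)),
]

kept = filter_detections(detections, ["person", "ball"], [0.08, 0.12])
for label, score, box in kept:
    print(label, score, box)
```

Per-label thresholds are useful because zero-shot scores can differ a lot between prompts, so a single global cutoff may be too strict for one text and too loose for another.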