Commit 714c786

robotrapta, Auto-format Bot, and positavi authored
Wait for confident result (Blocking submit) (#18)
* User guide tweak to show how to send in a JPEG image
* Client-side polling
* Updating userguide

Co-authored-by: Auto-format Bot <runner@fv-az353-195.tkyx2seitsuu3nmpmouz2httfb.bx.internal.cloudapp.net>
Co-authored-by: positavi <[email protected]>
1 parent a1f06ce commit 714c786

10 files changed

+163
-20
lines changed

README.md

Lines changed: 26 additions & 0 deletions
@@ -34,6 +34,32 @@ $ make generate
3434
## Testing
3535
Most tests need an API endpoint to run.
3636

37+
### Getting the tests to use your current code.
38+
39+
You want the equivalent of `pip install -e .`, but it is not clear how to do that with poetry. The workaround is:
40+
41+
Find the directory where `groundlight` is installed:
42+
43+
```
44+
$ python
45+
Python 3.7.4 (default, Aug 13 2019, 20:35:49)
46+
[GCC 7.3.0] :: Anaconda, Inc. on linux
47+
Type "help", "copyright", "credits" or "license" for more information.
48+
>>> import groundlight
49+
>>> groundlight
50+
<module 'groundlight' from '/home/leo/anaconda3/lib/python3.7/site-packages/groundlight/__init__.py'>
51+
```
52+
53+
Then delete the installed copy and replace it with a symlink from that directory to your source.
54+
55+
```
56+
cd /home/leo/anaconda3/lib/python3.7/site-packages/
57+
rm -rf groundlight
58+
ln -s ~/ptdev/groundlight-python-sdk/src/groundlight groundlight
59+
```
60+
61+
TODO: something better.
62+
3763
### Local API endpoint
3864

3965
1. Set up a local [janzu API

UserGuide.md

Lines changed: 35 additions & 7 deletions
@@ -2,23 +2,49 @@
22

33
Groundlight makes it simple to understand images. You can easily create computer vision detectors just by describing what you want to know using natural language.
44

5+
## Computer vision made simple
6+
7+
How to build a working computer vision system in just 5 lines of Python code:
8+
9+
```Python
10+
from groundlight import Groundlight
11+
gl = Groundlight()
12+
d = gl.create_detector("door", query="Is the door open?") # define with natural language
13+
image_query = gl.submit_image_query(detector=d, image=jpeg_img) # send in an image
14+
print(f"The answer is {image_query.result}") # get the result
15+
```
16+
517
**How does it work?** Your images are first analyzed by machine learning (ML) models which are automatically trained on your data. If those models have high enough confidence, that's your answer. But if the models are unsure, then the images are progressively escalated to more resource-intensive analysis methods up to real-time human review. So what you get is a computer vision system that starts working right away without even needing to first gather and label a dataset. At first it will operate with high latency, because people need to review the image queries. But over time, the ML systems will learn and improve so queries come back faster with higher confidence.
618

719
*Note: The SDK is currently in "beta" phase. Interfaces are subject to change in future versions.*
820

921

10-
## Simple Example
22+
## Managing confidence levels and latency
1123

12-
How to build a computer vision system in 5 lines of python code:
24+
Groundlight gives you a simple way to control the trade-off of latency against accuracy. The longer you can wait for an answer to your image query, the better accuracy you can get. In particular, if the ML models are unsure of the best response, they will escalate the image query to more intensive analysis with more complex models and real-time human monitors as needed. Your code can easily wait for this delayed response. Either way, these new results are automatically trained into your models so your next queries will get better results faster.
25+
26+
The desired confidence level is set as the escalation threshold on your detector. This is the minimum confidence score the ML system must reach; any image query that scores below it is escalated.
27+
28+
For example, say you want to set your desired confidence level to 0.95, but that you're willing to wait up to 60 seconds to get a confident response.
1329

1430
```Python
15-
from groundlight import Groundlight
16-
gl = Groundlight()
17-
d = gl.create_detector("door", query="Is the door open?") # define with natural language
18-
image_query = gl.submit_image_query(detector=d, image="path/filename.jpeg") # send an image
19-
print(f"The answer is {image_query.result}") # get the result
31+
d = gl.create_detector("trash", query="Is the trash can full?", confidence=0.95)
32+
image_query = gl.submit_image_query(detector=d, image=jpeg_img, wait=60)
33+
# This will wait until either 60 seconds have passed or the confidence reaches 0.95
34+
print(f"The answer is {image_query.result}")
2035
```
2136

37+
Or if you want to run as fast as possible, set `wait=0`. This way you will only get the ML results, without waiting for escalation. Image queries which are below the desired confidence level will still be escalated for further analysis, and the results are incorporated as training data to improve your ML model, but your code will not wait for that to happen.
38+
39+
```Python
40+
image_query = gl.submit_image_query(detector=d, image=jpeg_img, wait=0)
41+
```
42+
43+
You can see the confidence score returned for the image query:
44+
45+
```Python
46+
print(f"The confidence is {image_query.result.confidence}")
47+
```
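
The `wait` value is best read as an approximate bound. As a rough sketch, assuming the client-side polling loop from this commit (first sleep of 0.1 seconds, each later sleep 1.4x longer) and ignoring network latency, the helper below lists the sleeps that would occur if the confidence never reached the threshold. The function name `backoff_schedule` is hypothetical and not part of the SDK.

```python
from typing import List


def backoff_schedule(wait: float, initial_delay: float = 0.1, factor: float = 1.4) -> List[float]:
    """Return the sleep durations a poll loop would perform before giving up.

    Mirrors a loop of the form: while elapsed < wait: sleep(delay); delay *= factor.
    Request latency is ignored, so real elapsed time would be somewhat longer.
    """
    delays: List[float] = []
    elapsed = 0.0
    delay = initial_delay
    while elapsed < wait:
        delays.append(delay)
        elapsed += delay
        delay *= factor
    return delays


schedule = backoff_schedule(wait=60)
print(f"sleeps: {len(schedule)}, total slept: {sum(schedule):.1f} s, last sleep: {schedule[-1]:.1f} s")
```

Because the deadline is only checked between sleeps, the total can exceed `wait` by up to the length of the final sleep. If a hard upper bound matters, capping each sleep at the remaining time is one option.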
2248

2349
## Getting Started
2450

@@ -45,6 +71,7 @@ $ python3 glapp.py
4571
```
4672

4773

74+
4875
## Prerequisites
4976

5077
### Using Groundlight SDK on Ubuntu 18.04
@@ -125,6 +152,7 @@ gl = Groundlight()
125152
try:
126153
detectors = gl.list_detectors()
127154
except ApiException as e:
155+
# Many fields available to describe the error
128156
print(e)
129157
print(e.args)
130158
print(e.body)

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
[tool.poetry]
22
name = "groundlight"
3-
version = "0.5.4"
3+
version = "0.6.0"
44
license = "MIT"
55
readme = "UserGuide.md"
66
homepage = "https://groundlight.ai"

samples/README.md

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
1+
Code samples
2+

samples/blocking_submit.py

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
1+
"""Example of how to wait for a confident result
2+
"""
3+
import logging
4+
5+
logging.basicConfig(level=logging.DEBUG)
6+
7+
from groundlight import Groundlight
8+
9+
gl = Groundlight()
10+
11+
d = gl.get_or_create_detector(name="dog", query="is there a dog in the picture?")
12+
13+
print("Submitting image query")
14+
iq = gl.submit_image_query(d, image="../test/assets/dog.jpeg", wait=30)
15+
print(iq)

spec/public-api.yaml

Lines changed: 3 additions & 2 deletions
@@ -1,8 +1,8 @@
11
openapi: 3.0.3
22
info:
33
title: Groundlight API
4-
version: 0.1.0
5-
description: Ask visual queries.
4+
version: 0.6.0
5+
description: Easy Computer Vision powered by Natural Language
66
contact:
77
name: Questions?
88
@@ -273,6 +273,7 @@ components:
273273
like to use.
274274
maxLength: 100
275275
required:
276+
# TODO: make name optional - that's how the web version is going.
276277
- name
277278
- query
278279
x-internal: true

src/groundlight/client.py

Lines changed: 38 additions & 4 deletions
@@ -1,5 +1,7 @@
1-
import os
21
from io import BufferedReader, BytesIO
2+
import logging
3+
import os
4+
import time
35
from typing import Optional, Union
46

57
from model import Detector, ImageQuery, PaginatedDetectorList, PaginatedImageQueryList
@@ -15,6 +17,8 @@
1517

1618
GROUNDLIGHT_ENDPOINT = os.environ.get("GROUNDLIGHT_ENDPOINT", "https://api.groundlight.ai/device-api")
1719

20+
logger = logging.getLogger("groundlight")
21+
1822

1923
class ApiTokenError(Exception):
2024
pass
@@ -57,7 +61,10 @@ def __init__(self, endpoint: str = GROUNDLIGHT_ENDPOINT, api_token: str = None):
5761
self.detectors_api = DetectorsApi(ApiClient(configuration))
5862
self.image_queries_api = ImageQueriesApi(ApiClient(configuration))
5963

60-
def get_detector(self, id: str) -> Detector:
64+
def get_detector(self, id: Union[str, Detector]) -> Detector:
65+
if isinstance(id, Detector):
66+
# Short-circuit
67+
return id
6168
obj = self.detectors_api.get_detector(id=id)
6269
return Detector.parse_obj(obj.to_dict())
6370

@@ -107,19 +114,22 @@ def submit_image_query(
107114
self,
108115
detector: Union[Detector, str],
109116
image: Union[str, bytes, BytesIO, BufferedReader],
117+
wait: float = 0,
110118
) -> ImageQuery:
111119
"""Evaluates an image with Groundlight.
112120
:param detector: the Detector object, or string id of a detector like `det_12345`
113121
:param image: The image, in several possible formats:
114122
- a filename (string) of a jpeg file
115123
- a byte array or BytesIO with jpeg bytes
116124
- a numpy array in the 0-255 range (gets converted to jpeg)
125+
:param wait: How long to wait (in seconds) for a confident answer
117126
"""
118127
if isinstance(detector, Detector):
119128
detector_id = detector.id
120129
else:
121130
detector_id = detector
122131
image_bytesio: Union[BytesIO, BufferedReader]
132+
# TODO: support PIL Images
123133
if isinstance(image, str):
124134
# Assume it is a filename
125135
image_bytesio = buffer_from_jpeg_file(image)
@@ -134,5 +144,29 @@ def submit_image_query(
134144
"Unsupported type for image. We only support JPEG images specified through a filename, bytes, BytesIO, or BufferedReader object."
135145
)
136146

137-
obj = self.image_queries_api.submit_image_query(detector_id=detector_id, body=image_bytesio)
138-
return ImageQuery.parse_obj(obj.to_dict())
147+
raw_img_query = self.image_queries_api.submit_image_query(detector_id=detector_id, body=image_bytesio)
148+
img_query = ImageQuery.parse_obj(raw_img_query.to_dict())
149+
if wait:
150+
threshold = self.get_detector(detector).confidence_threshold
151+
img_query = self._poll_for_confident_result(img_query, wait, threshold)
152+
return img_query
153+
154+
def _poll_for_confident_result(self, img_query: ImageQuery, wait: float, threshold: float) -> ImageQuery:
155+
"""Polls on an image query waiting for the result to reach the specified confidence."""
156+
start_time = time.time()
157+
delay = 0.1
158+
while time.time() - start_time < wait:
159+
current_confidence = img_query.result.confidence
160+
if current_confidence is None:
161+
logger.debug("Image query with None confidence implies human label (for now)")
162+
break
163+
if current_confidence >= threshold:
164+
logger.debug(f"Image query confidence {current_confidence:.3f} at or above {threshold:.3f}")
165+
break
166+
logger.debug(
167+
f"Polling for updated image_query because confidence {current_confidence:.3f} < {threshold:.3f}"
168+
)
169+
time.sleep(delay)
170+
delay *= 1.4 # slow exponential backoff
171+
img_query = self.get_image_query(img_query.id)
172+
return img_query
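
The polling loop above can be sketched in isolation, which makes it easier to test without an API endpoint. This is a minimal sketch under stated assumptions, not the SDK's implementation: `Result`, `poll_until_confident`, and the injectable `fetch`, `sleep`, and `clock` parameters are hypothetical names introduced for illustration. Unlike the loop in the diff, it caps each sleep at the time remaining, so it does not run past the deadline.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Result:
    # None is treated as a human label, following the comment in the diff
    confidence: Optional[float]


def poll_until_confident(
    fetch: Callable[[], Result],
    threshold: float,
    wait: float,
    initial_delay: float = 0.1,
    factor: float = 1.4,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Result:
    """Call fetch() until its confidence reaches threshold or wait seconds pass."""
    deadline = clock() + wait
    delay = initial_delay
    result = fetch()
    while True:
        if result.confidence is None or result.confidence >= threshold:
            return result  # confident (or human-labelled) answer
        remaining = deadline - clock()
        if remaining <= 0:
            return result  # out of time: return the best answer so far
        sleep(min(delay, remaining))  # never sleep past the deadline
        delay *= factor  # slow exponential backoff
        result = fetch()


# Usage with a fake sequence of answers and a no-op sleep
answers = iter([Result(0.5), Result(0.7), Result(0.96)])
final = poll_until_confident(lambda: next(answers), threshold=0.95, wait=10, sleep=lambda s: None)
print(final)
```

Passing `sleep` and `clock` as parameters is a design choice for testability; `time.monotonic` is used for the deadline because it is not affected by system clock adjustments.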

src/groundlight/images.py

Lines changed: 2 additions & 0 deletions
@@ -9,6 +9,8 @@ def buffer_from_jpeg_file(image_filename: str) -> io.BufferedReader:
99
For now, we only support JPEG files, and raise a ValueError otherwise.
1010
"""
1111
if imghdr.what(image_filename) == "jpeg":
12+
# Note this will get fooled by truncated binaries since it only reads the header.
13+
# That's okay - the server will catch it.
1214
return open(image_filename, "rb")
1315
else:
1416
raise ValueError("We only support JPEG files, for now.")
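
The comment about truncated files can be illustrated with a small sketch. It assumes a simplified check based on the JPEG start-of-image (`FF D8`) and end-of-image (`FF D9`) markers, which is not necessarily what `imghdr` does internally; the helper names are hypothetical. Note also that `imghdr` was deprecated in Python 3.11 and removed in 3.13, so a hand-written check like this may be needed on newer interpreters.

```python
SOI = b"\xff\xd8"  # JPEG start-of-image marker
EOI = b"\xff\xd9"  # JPEG end-of-image marker


def has_jpeg_header(data: bytes) -> bool:
    """Header-only check: cheap, but a truncated file still passes."""
    return data.startswith(SOI)


def looks_complete(data: bytes) -> bool:
    """Heuristic: also require the end-of-image marker at the very end.

    Some valid files carry trailing bytes after EOI, so this can reject
    good images; treat it as a hint, not a validator.
    """
    return data.startswith(SOI) and data.endswith(EOI)


# Synthetic stand-in for image bytes, not a decodable JPEG
fake_jpeg = SOI + b"image-data-goes-here" + EOI
truncated = fake_jpeg[:-5]  # cut off the tail, as in the truncation test

print(has_jpeg_header(truncated), looks_complete(truncated))
```

A header-only check accepts the truncated bytes, which is why the diff relies on the server to reject them.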

test/assets/blankfile.jpeg

test/integration/test_groundlight.py

Lines changed: 41 additions & 6 deletions
@@ -1,19 +1,23 @@
11
import os
22
from datetime import datetime
33

4+
import openapi_client
45
import pytest
6+
57
from groundlight import Groundlight
68
from model import Detector, ImageQuery, PaginatedDetectorList, PaginatedImageQueryList
79

810

911
@pytest.fixture
1012
def gl() -> Groundlight:
13+
"""Creates a Groundlight client object for testing."""
1114
endpoint = os.environ.get("GROUNDLIGHT_TEST_API_ENDPOINT", "http://localhost:8000/device-api")
1215
return Groundlight(endpoint=endpoint)
1316

1417

1518
@pytest.fixture
1619
def detector(gl: Groundlight) -> Detector:
20+
"""Creates a new Test detector."""
1721
name = f"Test {datetime.utcnow()}" # Need a unique name
1822
query = "Test query?"
1923
return gl.create_detector(name=name, query=query)
@@ -24,7 +28,6 @@ def image_query(gl: Groundlight, detector: Detector) -> ImageQuery:
2428
return gl.submit_image_query(detector=detector.id, image="test/assets/dog.jpeg")
2529

2630

27-
# @pytest.mark.skip(reason="We don't want to create a million detectors")
2831
def test_create_detector(gl: Groundlight):
2932
name = f"Test {datetime.utcnow()}" # Need a unique name
3033
query = "Test query?"
@@ -33,7 +36,6 @@ def test_create_detector(gl: Groundlight):
3336
assert isinstance(_detector, Detector)
3437

3538

36-
# @pytest.mark.skip(reason="We don't want to create a million detectors")
3739
def test_create_detector_with_config_name(gl: Groundlight):
3840
name = f"Test b4mu11-mlp {datetime.utcnow()}" # Need a unique name
3941
query = "Test query with b4mu11-mlp?"
@@ -49,27 +51,60 @@ def test_list_detectors(gl: Groundlight):
4951
assert isinstance(detectors, PaginatedDetectorList)
5052

5153

52-
# @pytest.mark.skip(reason="We don't want to create a million detectors")
5354
def test_get_detector(gl: Groundlight, detector: Detector):
5455
_detector = gl.get_detector(id=detector.id)
5556
assert str(_detector)
5657
assert isinstance(_detector, Detector)
5758

5859

59-
# @pytest.mark.skip(reason="We don't want to create a million detectors and image_queries")
60-
def test_submit_image_query(gl: Groundlight, detector: Detector):
60+
def test_submit_image_query_blocking(gl: Groundlight, detector: Detector):
61+
# Ask for a short wait so the result is unlikely to update, but the polling code path still runs
62+
_image_query = gl.submit_image_query(detector=detector.id, image="test/assets/dog.jpeg", wait=5)
63+
assert str(_image_query)
64+
assert isinstance(_image_query, ImageQuery)
65+
66+
67+
def test_submit_image_query_filename(gl: Groundlight, detector: Detector):
6168
_image_query = gl.submit_image_query(detector=detector.id, image="test/assets/dog.jpeg")
6269
assert str(_image_query)
6370
assert isinstance(_image_query, ImageQuery)
6471

6572

73+
def test_submit_image_query_jpeg_bytes(gl: Groundlight, detector: Detector):
74+
jpeg = open("test/assets/dog.jpeg", "rb").read()
75+
_image_query = gl.submit_image_query(detector=detector.id, image=jpeg)
76+
assert str(_image_query)
77+
assert isinstance(_image_query, ImageQuery)
78+
79+
80+
def test_submit_image_query_jpeg_truncated(gl: Groundlight, detector: Detector):
81+
jpeg = open("test/assets/dog.jpeg", "rb").read()
82+
jpeg_truncated = jpeg[:-500] # Cut off the last 500 bytes
83+
# This is an extra difficult test because the header is valid.
84+
# So a casual check of the image will appear valid.
85+
with pytest.raises(openapi_client.exceptions.ApiException) as exc_info:
86+
_image_query = gl.submit_image_query(detector=detector.id, image=jpeg_truncated)
87+
e = exc_info.value
88+
assert e.status == 400
89+
90+
91+
def test_submit_image_query_bad_filename(gl: Groundlight, detector: Detector):
92+
with pytest.raises(FileNotFoundError):
93+
_image_query = gl.submit_image_query(detector=detector.id, image="missing-file.jpeg")
94+
95+
96+
def test_submit_image_query_bad_jpeg_file(gl: Groundlight, detector: Detector):
97+
with pytest.raises(ValueError) as exc_info:
98+
_image_query = gl.submit_image_query(detector=detector.id, image="test/assets/blankfile.jpeg")
99+
assert "jpeg" in str(exc_info.value).lower()
100+
101+
66102
def test_list_image_queries(gl: Groundlight):
67103
image_queries = gl.list_image_queries()
68104
assert str(image_queries)
69105
assert isinstance(image_queries, PaginatedImageQueryList)
70106

71107

72-
# @pytest.mark.skip(reason="We don't want to create a million detectors and image_queries")
73108
def test_get_image_query(gl: Groundlight, image_query: ImageQuery):
74109
_image_query = gl.get_image_query(id=image_query.id)
75110
assert str(_image_query)
