Skip to content

retry errors hide responses of row mutations #485

@nika-qubit

Description

@nika-qubit

When bigtable row mutations run into retryable errors and exhausts all retry budgets, the responses of batched/flushed row mutations are all masked/hidden by the client who later suppresses all retryable errors.

Additionally, the returned response of the client is no longer Status types but the default None value.

To force a retryable error to be raised in normal conditions, manually set a very tiny mutation_timeout (0.1 ms), run below code, you get None values in the response.

import datetime
from google.cloud import bigtable

project_id = 'ningk-test-project'
instance_id = 'ningk-test'
table_id = 'test'
client = bigtable.Client(project=project_id, admin=True)
instance = client.instance(instance_id)
table = instance.table(table_id, mutation_timeout=0.0001)
timestamp = datetime.datetime.utcnow()
column_family_id = 'stats'

rows = [
    table.direct_row('stats#a1#20220106001'),
    table.direct_row('stats#a1#20220106002')
]

rows[0].set_cell(column_family_id,
               'name',
               'dango',
               timestamp)
rows[0].set_cell(column_family_id,
               'type',
               'alpha',
               timestamp)
rows[1].set_cell(column_family_id,
               'name',
               'pump',
               timestamp)
rows[1].set_cell(column_family_id,
               'type',
               'beta',
               timestamp)

from google.cloud.bigtable.batcher import MutationsBatcher

FLUSH_COUNT = 1000
MAX_ROW_BYTES = 5242880  # 5MB

class _MutationsBatcher(MutationsBatcher):
    def __init__(
        self, table, flush_count=FLUSH_COUNT, max_row_bytes=MAX_ROW_BYTES):
      super().__init__(table, flush_count, max_row_bytes)
      self.rows = []

    def set_flush_callback(self, callback_fn):
      self.callback_fn = callback_fn

    def flush(self):
      if len(self.rows) != 0:
        response = self.table.mutate_rows(self.rows)
        self.callback_fn(response)
        self.total_mutation_count = 0
        self.total_size = 0
        self.rows = []

batcher = _MutationsBatcher(table)

def callback(response):
    for i, status in enumerate(response):
        print(type(status))

batcher.set_flush_callback(callback)

for row in rows:
    batcher.mutate(row)

batcher.flush()

You will get

<class 'NoneType'>
<class 'NoneType'>

More context can be found here: https://issues.apache.org/jira/browse/BEAM-13606.

The expected solution:

  1. The returned response should always contain Status types instead of None. A DEADLINE_EXCEEDED might be a good fit here as retry budget is exhausted;
  2. All retryable errors shouldn't be suppressed by the client because then the invoker of the client wouldn't know what rows need to be retried.

Metadata

Metadata

Assignees

Labels

api: bigtableIssues related to the googleapis/python-bigtable API.priority: p2Moderately-important priority. Fix may not be included in next release.type: bugError or flaw in code with unintended results or allowing sub-optimal usage patterns.

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions