From 4e95826bec5450044aa8860486de3a40a6644889 Mon Sep 17 00:00:00 2001
From: Sergei Grebnov <sergei.grebnov@gmail.com>
Date: Tue, 21 Oct 2025 23:34:56 +0000
Subject: [PATCH 1/2] Document `partition_mode` DuckDB acceleration param
 (#1194)

* Document `partition_mode` DuckDB acceleration param

* Clarify partition_mode connection pooling details

Updated description of `partition_mode` to clarify connection pooling behavior.

* Update partition_mode description for clarity

Enhanced explanation of 'partition_mode' to clarify performance benefits.
---
 website/docs/components/data-accelerators/duckdb.md | 2 ++
 1 file changed, 2 insertions(+)
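
Example (not part of the diff): a minimal sketch of how the two new parameters might be combined in a spicepod. The dataset source, name, and `partition_by` expression are hypothetical placeholders, and the placement of `partition_by` directly under `acceleration` is an assumption. Only the two keys under `params` come from this patch.

```yaml
datasets:
  - from: s3://example-bucket/events/   # hypothetical source
    name: events                        # hypothetical dataset name
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      partition_by:                     # assumed placement; partition_mode requires partition_by
        - region                        # hypothetical partition expression
      params:
        partition_mode: tables          # partitions stored as tables in a single DuckDB database
        duckdb_partitioned_write_flush_threshold: 245760  # twice the documented default of 122880
```

Per the parameter description, a larger threshold can improve write performance but requires more memory.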

diff --git a/website/docs/components/data-accelerators/duckdb.md b/website/docs/components/data-accelerators/duckdb.md
index 9d44e07fb..a3749c72c 100644
--- a/website/docs/components/data-accelerators/duckdb.md
+++ b/website/docs/components/data-accelerators/duckdb.md
@@ -38,6 +38,8 @@ DuckDB acceleration supports the following optional parameters under `accelerati
 - `duckdb_data_dir` (string, default:`.spice/data/`): Path to the directory the DuckDB database file(s) will be placed in. This is useful when using the `partition_by` acceleration parameter. If both `duckdb_data_dir` and `duckdb_file` are specified, `duckdb_file` will be used and `duckdb_data_dir` will be ignored.
 - `duckdb_memory_limit` (string, default: none): Limits DuckDB's memory usage for instance. Acceptable units are KB, MB, GB, TB (decimal: 1000^i) or KiB, MiB, GiB, TiB (binary: 1024^i). See [DuckDB memory limit documentation](https://duckdb.org/docs/stable/configuration/overview).
 - `duckdb_preserve_insertion_order` (boolean, default: `true`): Controls whether DuckDB preserves the insertion order of rows in tables. When set to `true`, rows are returned in the order they were inserted. See [DuckDB preserve insertion order documentation](https://duckdb.org/docs/stable/guides/performance/how_to_tune_workloads#the-preserve_insertion_order-option) and [order preservation documentation](https://duckdb.org/docs/stable/sql/dialect/order_preservation).
+- `partition_mode` (string, default: `files`): Controls how partitioned data is stored. Can only be used with `partition_by`. Set to `tables` to store partitions as separate tables within a single DuckDB database, which improves resource usage by sharing a single connection pool across all partitions. The default `files` mode creates a separate database file per partition, each with its own connection pool, and generally provides faster query performance.
+- `duckdb_partitioned_write_flush_threshold` (integer, default: `122880`): The number of rows buffered per partition before flushing data to acceleration storage. Only applicable when using `partition_mode: tables`. Using a larger value can improve write performance but requires more memory.
 
 Refer to the [datasets configuration reference](/docs/reference/spicepod/datasets.md#acceleration) for additional supported fields.
 

From 06ca8360cfef22b35f2cd5b04940316a5e0a6c23 Mon Sep 17 00:00:00 2001
From: Kevin <4733573+kczimm@users.noreply.github.com.>
Date: Fri, 24 Oct 2025 12:08:28 -0500
Subject: [PATCH 2/2] add s3_vectors_index_poll_interval

---
 website/docs/components/vectors/s3_vectors.md | 1 +
 1 file changed, 1 insertion(+)
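
Example (not part of the diff): a minimal sketch of the new parameter alongside existing ones from the table. The enclosing `params` key and its position in the spicepod are assumptions. The bucket and index values are the table's own examples.

```yaml
params:
  s3_vectors_bucket: a-bucket
  s3_vectors_index: index-of-important-embeddings
  s3_vectors_index_poll_interval: 5m   # poll for index updates every 5 minutes instead of on every scan
```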

diff --git a/website/docs/components/vectors/s3_vectors.md b/website/docs/components/vectors/s3_vectors.md
index afa82d4bf..82400d417 100644
--- a/website/docs/components/vectors/s3_vectors.md
+++ b/website/docs/components/vectors/s3_vectors.md
@@ -43,6 +43,7 @@ embeddings:
 | `s3_vectors_bucket`                | The S3 vectors bucket to use. If `s3_vectors_index` is not specified, an index will be created based on the underlying embedding column. Incompatible with `s3_vectors_arn` | `a-bucket`                                                                           |
 | `s3_vectors_index`                 | The name of the s3 vectors index to use or create. Incompatible with `s3_vectors_arn`.                                                                                      | `index-of-important-embeddings`                                                      |
 | `s3_vectors_distance_metric`                 | The distance metric to be used for similarity search. One of: `euclidean`, `cosine`. Default `cosine`.  | `euclidean`                                                      |
+| `s3_vectors_index_poll_interval`             | The interval at which to poll for index updates. Setting an interval avoids excessive API calls. Minimum: 5 seconds. By default, the index is polled on every scan. | `5m`                                                            |
 
 :::warning[Limitations]