Problem
In local development, every run reuses the same default storages, which are wiped clean at the start of the run. This behavior is intentional and should be preserved.
On the Apify platform, however, every Actor run automatically receives a new set of unnamed storages. Since these storages are unnamed, they are subject to the default 30-day retention policy.
The Apify Python client already exposes the ability to create new unnamed storages:
```python
await DatasetCollectionClientAsync.get_or_create()
```

Each call creates a new unnamed dataset with a unique ID.
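For comparison, a minimal sketch with the async Apify client (the token value is a placeholder):

```python
from apify_client import ApifyClientAsync


async def main() -> None:
    client = ApifyClientAsync(token='<APIFY_TOKEN>')  # placeholder token

    # Each call without a name creates a brand-new unnamed dataset.
    first = await client.datasets().get_or_create()
    second = await client.datasets().get_or_create()

    assert first['id'] != second['id']  # two distinct unnamed datasets
```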
Crawlee (and the underlying SDK), by contrast, does not currently support this. For example, repeated calls to:
```python
await Dataset.open()
```

always return the same default unnamed storage.
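A minimal sketch of the current behavior (assuming today's `crawlee.storages.Dataset` API):

```python
import asyncio

from crawlee.storages import Dataset


async def main() -> None:
    first = await Dataset.open()
    second = await Dataset.open()

    # Both calls resolve to the same default unnamed dataset.
    assert first.id == second.id


asyncio.run(main())
```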
Motivation
In more complex scenarios (e.g., WCC), an Actor may require multiple storage instances within a single run - for example, multiple request queues. To support this, we must allow creating additional unnamed storages beyond the single default instance: non-default unnamed storages (NDU). This provides flexibility for advanced workflows and better alignment with the platform's capabilities.
Goal state
- Bring Crawlee storages (all storage clients, including ApifyStorageClient) to feature parity with the Apify platform by supporting non-default unnamed storages.
- Preserve the current behavior of local storage clients: wipe storages at the start of each run and reuse the same default storage across runs.
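For reference, the local wipe is driven by the `purge_on_start` setting; a minimal sketch, assuming the current `crawlee.configuration.Configuration` field:

```python
from crawlee.configuration import Configuration

# purge_on_start=True (the default) wipes the default local storages
# at the start of each run; this behavior should stay intact.
config = Configuration(purge_on_start=True)
```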
Possible solution 1) - new argument
Introduce a new argument to the storage open constructor:
```python
async def open(
    cls,
    name: str | None = None,
    id: str | None = None,
    scope: Literal['run', 'global'] = 'global',
) -> Dataset | KeyValueStore | RequestQueue:
    ...
```

- `scope='run'` indicates a non-default unnamed storage.
- `scope='global'` refers to globally named storages.
- The `name` parameter cannot be entirely removed for run-scope storages, as it is needed for the implementation:
  - For the filesystem storage: to use as a directory name.
  - For Apify platform storage: to store the mapping of name -> ID in the default key-value store.
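A hypothetical usage sketch of the proposed signature (the `scope` argument does not exist in Crawlee today, and the queue names are illustrative):

```python
from crawlee.storages import RequestQueue


async def main() -> None:
    # A run-scoped, non-default unnamed queue; 'discovery' is only an internal alias.
    discovery_queue = await RequestQueue.open(name='discovery', scope='run')

    # A globally named queue, reused across runs (current behavior of name=...).
    shared_queue = await RequestQueue.open(name='shared-links', scope='global')
```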
Behavior matrix

- Open storage by ID and name
  - Raise an exception (choose one of these).
  - Scope argument is ignored.
- Open storage by ID
  - Opens an existing storage by ID.
  - Scope?
- Open storage by name
  - Scope run:
    - Opens or creates a run-scope (non-default unnamed) storage.
    - `name` is used internally to keep a reference to the storage, but is not the actual storage's "name".
  - Scope global:
    - Opens or creates a global named storage.
- Open storage without args
  - Opens the default unnamed storage.
  - Scope argument is ignored.
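A minimal, illustrative-only sketch of how the resolution logic behind `open()` could implement this matrix; the class and the branching details (e.g., raising on ID+name) are assumptions, not a committed design:

```python
from __future__ import annotations

from typing import Literal


class StorageSketch:
    """Illustrative only - not the real Crawlee storage class."""

    def __init__(self, label: str) -> None:
        self.label = label

    @classmethod
    async def open(
        cls,
        name: str | None = None,
        id: str | None = None,
        scope: Literal['run', 'global'] = 'global',
    ) -> StorageSketch:
        if id is not None and name is not None:
            # Matrix: raise an exception (scope is ignored).
            raise ValueError('Provide either `id` or `name`, not both.')

        if id is not None:
            # Matrix: open an existing storage by ID (scope handling still open).
            return cls(f'existing:{id}')

        if name is not None:
            if scope == 'run':
                # Non-default unnamed storage; `name` is only an internal alias.
                return cls(f'run-scoped:{name}')
            # Globally named storage.
            return cls(f'global:{name}')

        # No arguments: the default unnamed storage; scope is ignored.
        return cls('default')
```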
Possible solution 2) - new constructor
Introduce an alternative constructor for non-default unnamed storages (e.g., `open_by_alias` or similar):

```python
async def open_2(
    cls,
    alias: str,
) -> Dataset | KeyValueStore | RequestQueue:
    ...
```

- Each call to `open_new()` creates a new unnamed storage with a unique ID (mirroring the Apify API client).
- This also keeps the behavior of `open()` unchanged.
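A hypothetical usage sketch (the constructor name is not final; `open_by_alias` is used here purely for illustration):

```python
from crawlee.storages import RequestQueue


async def main() -> None:
    # Two distinct run-scoped queues within the same Actor run (hypothetical API).
    discovery = await RequestQueue.open_by_alias('discovery')
    processing = await RequestQueue.open_by_alias('processing')

    # The existing open() keeps returning the default unnamed queue.
    default_queue = await RequestQueue.open()
```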
Another solution?
Discussion
- Both options have trade-offs:
  - Adding parameters to `open()` results in confusing combinations or a complex signature.
  - Alternative constructors may overwhelm users with choices from the start, and would need to be implemented for every storage type as well as for all related convenience helpers in crawlers & contexts.