feat: improve actor tool output #260

MQ37 · 2025-09-03T13:25:16Z

closes https://github.com/apify/apify-mcp-server-internal/issues/147

Backstory

This PR rework how the MCP server handles the Actor tool output. Instead of dumping the whole dataset with filtered only the important fields which sometimes may cause LLM context or tool call limit overflow here we try to return feasible preview of the Actor output items and then provide a get-actor-output tool for output dataset items retrieval.

Details

The dataset items preview first tries to return the whole dataset with all items, if that does not fit into the 20k tokens limit (set to 20k to keep it conservative as some client put hard limit of 25k so there is some padding) then we include only important fields (fields, that are defined in some dataset view) and if that does not fit either then we try to truncate the items with important fields only one by one until we find number of important fields items that fit (in extreme case it can also retur no preview if the Actor has really bad and nested output format).

The get-actor-output tool is loaded whenever any tool that even indirectly interacts with Actors is present (call-actor, add-actor, Actor tool directly like apify-slash-rag-web-browser).

Changes

Reworked and centralized the Actor tool output format. Now it returns metadata (runId, datasetId, total items count), then the output items schema, if possible some data preview and then instructions how to retrieve additional data or specific fields.
Introduced new simple get-actor-output tools as I found the get-dataset-items complicated and wanted to keep this as original dataset centric advanced tool
Centralized dataset items schema generation logic and removed the additionalProperties input from the get-dataset-schema as it seemed useless @jirispilka correct me if I am wrong
Deleted the legacy remove-tool and helper-tool code
Refactored all tool to use the central Ajv instance
Centralized string comma-separated list parsing logic

…tool-output

src/utils/schema-generation.ts

MichalKalita · 2025-09-04T11:34:56Z

tests/integration/suite.ts

            expectToolNamesToContain(names, DEFAULT_TOOL_NAMES);
            expectToolNamesToContain(names, DEFAULT_ACTOR_NAMES);
+            expect(names).toContain('get-actor-output');
+            await client.close();


all await client.close() are not needed, because of afterEach on line 90.
AfterEach is safe, it's called if any test failed.

I just put it there, just in case :D

It's not removed 😄

I did not remove it 😄

I would keep that, it should not cause any issues, right?

Manually closing is required only when more clients will be in a single test.

Every client will be closed by

afterEach(async () => { await client?.close(); client = undefined; });

Even if the test failed, afterEach is always run, and the client will be closed and disconnected. Without this afterEach, each test should use try..finally to close client after.

It was a bug before. When one test failed, the client was not disconnected, and afterAll failed on timeout because the server was waiting for the client.

Co-authored-by: Michal Kalita <[email protected]>

jirispilka

It works great — I played with it, and so far it’s flawless 🙂

My biggest concern is that get-actor-output shows up without users explicitly knowing it.

For example, when you configure tools, you end up with one extra tool as a kind of “bonus.” I think get-actor-output should go into the actors category. It makes sense there, typically when you call an Actor, you also need to retrieve its output.

What do you think?

src/utils/actor-response.ts

jirispilka · 2025-09-05T12:31:15Z

src/tools/index.ts

        getDataset,
        getDatasetItems,
        getDatasetSchema,
+        getActorOutput,


I understand that logically, getActorOutput should be placed here BUT it might be really misleading for user. You will enable tools=actors and you will get an extra bonus of getActorOutput without any obvious reason. What about to add it into actors?

Good catch! This should not be here, I would personally not add this tool to any category and kept it as "magical" tool that appears when you include any tool that can interact with Actors in any way (see README change that explains this). What do you think?

I think the fewer surprises, the better. When a user specifies something, we shouldn’t magically include something else. That feels like a big surprise and just asks for trouble — what if the user doesn’t want it at all? And how would they switch it off? There’s no way…

Instead, I’d add it to the actors category (alongside call-actor), and maybe also to other categories if needed.

But, t should also be included whenever an Actor is added 🤔, ahh

I would personally keep this as a "magical" tool since user may not know when to add this tool and then might be frustrated when he only gets partial or no Actor results (we encountered user issues and reports like this already). I understand that now it seems more "magical" and is kind of a surprise but it makes the whole Actor calling and output retrieval work and this is crucial for the workflows. So I would not put this tool into any category and keep it hidden and auto-enabled if needed (based on the tool loader logic). I think there is no reason to Actually switch off this tool if you are working with Actors. Maybe there is, when you want to only call some Actor like send email and you do not care about the results, but that is miniory of the workflows and I would not complicate tool configuration for majorty only to satisfy this one use case where user might want to turn this tool off.

src/utils/tools-loader.ts

src/const.ts

tests/integration/internals.test.ts

commit 20e6753 Author: Apify Release Bot <[email protected]> Date: Tue Sep 9 11:41:31 2025 +0000 chore(release): Update changelog, package.json and manifest.json versions [skip ci] commit 7ef726d Author: Jakub Kopecký <[email protected]> Date: Tue Sep 9 13:40:04 2025 +0200 feat: improve actor tool output (#260) * feat: improve actor tool output * update readme * fix output tool, write test for that * add test based on Zuzka suggestion * lint * fix output response order so LLM does not lose the instructions * refactor: unify string list parsing logic * fix the tests - order of the Actor run response messages * Update src/utils/schema-generation.ts Co-authored-by: Michal Kalita <[email protected]> * address review comments * add get-actor-output tools note about when its loaded --------- Co-authored-by: Michal Kalita <[email protected]> commit 279293f Author: Michal Kalita <[email protected]> Date: Mon Sep 8 12:05:25 2025 +0200 fix: error when content type is json (#265) * fix: error when content type is json * fix: do not make json schema formatted for human readable commit 4659e03 Author: Apify Release Bot <[email protected]> Date: Thu Sep 4 12:30:50 2025 +0000 chore(release): Update changelog, package.json and manifest.json versions [skip ci]

* feat: improve actor tool output * update readme * fix output tool, write test for that * add test based on Zuzka suggestion * lint * fix output response order so LLM does not lose the instructions * refactor: unify string list parsing logic * fix the tests - order of the Actor run response messages * Update src/utils/schema-generation.ts Co-authored-by: Michal Kalita <[email protected]> * address review comments * feat: agentci payments v2 * add skyfire usage resource, fix skyfire pay id handling and passing to Actor * add skyfire instructions also to the call-actor info step result content * Squashed commit of the following: commit 20e6753 Author: Apify Release Bot <[email protected]> Date: Tue Sep 9 11:41:31 2025 +0000 chore(release): Update changelog, package.json and manifest.json versions [skip ci] commit 7ef726d Author: Jakub Kopecký <[email protected]> Date: Tue Sep 9 13:40:04 2025 +0200 feat: improve actor tool output (#260) * feat: improve actor tool output * update readme * fix output tool, write test for that * add test based on Zuzka suggestion * lint * fix output response order so LLM does not lose the instructions * refactor: unify string list parsing logic * fix the tests - order of the Actor run response messages * Update src/utils/schema-generation.ts Co-authored-by: Michal Kalita <[email protected]> * address review comments * add get-actor-output tools note about when its loaded --------- Co-authored-by: Michal Kalita <[email protected]> commit 279293f Author: Michal Kalita <[email protected]> Date: Mon Sep 8 12:05:25 2025 +0200 fix: error when content type is json (#265) * fix: error when content type is json * fix: do not make json schema formatted for human readable commit 4659e03 Author: Apify Release Bot <[email protected]> Date: Thu Sep 4 12:30:50 2025 +0000 chore(release): Update changelog, package.json and manifest.json versions [skip ci] * fix the port already in use issue with tests * lint * remove the try catch that was rethrowing generic error in callActorGetDataset * add skyfire seller id and pay id to the get actor output * lint * rename env var * include seller id conditionally in the instructions * handle skyfire pay id in get actor output * change usd amount --------- Co-authored-by: Michal Kalita <[email protected]>

MQ37 added 2 commits September 3, 2025 15:25

feat: improve actor tool output

ad7b772

Merge remote-tracking branch 'origin/master' into feat/improve-actor-…

6baa662

…tool-output

github-actions bot assigned MQ37 Sep 3, 2025

github-actions bot added t-ai Issues owned by the AI team. tested Temporary label used only programatically for some analytics. labels Sep 3, 2025

update readme

07035b0

MQ37 requested a review from MichalKalita September 4, 2025 09:08

MQ37 added 3 commits September 4, 2025 12:53

fix output tool, write test for that

4a402d8

add test based on Zuzka suggestion

f207cc2

lint

4ff957e

MichalKalita reviewed Sep 4, 2025

View reviewed changes

MQ37 and others added 4 commits September 4, 2025 14:40

fix output response order so LLM does not lose the instructions

a2bf3b5

refactor: unify string list parsing logic

26c34cc

fix the tests - order of the Actor run response messages

3186c88

Update src/utils/schema-generation.ts

259831a

Co-authored-by: Michal Kalita <[email protected]>

MQ37 marked this pull request as ready for review September 5, 2025 08:32

MQ37 requested a review from jirispilka September 5, 2025 08:32

MichalKalita approved these changes Sep 5, 2025

View reviewed changes

jirispilka requested changes Sep 5, 2025

View reviewed changes

address review comments

863dd47

MQ37 requested a review from jirispilka September 8, 2025 11:04

add get-actor-output tools note about when its loaded

ea8753a

MQ37 merged commit 7ef726d into master Sep 9, 2025
4 checks passed

MQ37 deleted the feat/improve-actor-tool-output branch September 9, 2025 11:40

feat: improve actor tool output #260

feat: improve actor tool output #260

Uh oh!

Conversation

MQ37 commented Sep 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Backstory

Details

Changes

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jirispilka left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

MQ37 commented Sep 3, 2025 •

edited

Loading