Replies: 13 comments 5 replies
-
Note that the flutter driver tool does now exist (only on the Dart dev channel and Flutter main channel) - however, we ended up disabling screenshots because support only exists on the Skia renderer, which means it doesn't work by default on the majority of platforms. We also likely need to iterate on the flutter driver support based on user feedback. I am going to leave this issue open as the canonical place to discuss the current support and any improvements that should be made to it.
-
I think the dart mcp server needs a few things to make it a viable option for using an AI agent to iterate on a Flutter app:
It's these capabilities that allow the agent to drive, test, and fix a Flutter app rapidly without human intervention.
-
Yes, those are all likely useful things - the only one we don't have right now is screenshot support. The flutter driver stuff (which allows it to drive the UI) also likely needs polish, but I am not 100% clear on the workflows people like to use for this. If you can come up with some clear user journeys for us to test, that would be very helpful.
-
See flutter/flutter#170357 for the issue that would unblock screenshots.
-
@jakemac53 the flutter driver stuff is new and super exciting! I'll experiment with what's there. Also excited for screenshots when you can get them.
-
For flutter driver, the biggest issue I was running into is getting it to reliably figure out how to select the right thing on the screen. Likely we need some prompting work here, but I don't know enough about flutter driver itself to know the best course of action to suggest in the prompt. Usually, I find the models like to guess at tooltip or text content, and then eventually end up getting the widget tree when that fails. Even after getting the widget tree, they sometimes have trouble clicking specific buttons in menu bars. The other issue I was having is that Gemini 2.5 Pro is just too slow given the number of function calls required to do things; I have a much better experience with faster but less intelligent models. They might fumble around a bit more, but they ultimately complete the task much more quickly.
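For concreteness, here is roughly the finder hierarchy a prompt could steer the agent toward - a minimal sketch using the public flutter_driver API; the keys, labels, and tooltips are made up for illustration:

```dart
import 'package:flutter_driver/flutter_driver.dart';

Future<void> main() async {
  final driver = await FlutterDriver.connect();

  // Most reliable: a ValueKey the app author added specifically for testing.
  await driver.tap(find.byValueKey('menu-file-open')); // hypothetical key

  // Next best: visible text or tooltip content, which is what models tend
  // to guess at first - only works when the label is unique on screen.
  await driver.tap(find.text('Open'));
  await driver.tap(find.byTooltip('Open a file')); // hypothetical tooltip

  // Last resort: wait on a widget type when no stable label exists.
  await driver.waitFor(find.byType('AlertDialog'));

  await driver.close();
}
```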
-
The description for the flutter_driver tool doesn't contain the details necessary for the LLM to know what it can do or how it can do it. I recommend a much more detailed description in the flutter_driver tool. Also, I don't know how the system prompt is created, but I recommend duplicating all of the tool schema information in there, too. I've had very good luck with these two techniques.
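To make that concrete, here is a hypothetical example of the level of detail that tends to work well, written as a Dart map in the shape the MCP tools/list response uses (name, description, inputSchema). The action names and parameters are invented for illustration and are not the server's real schema:

```dart
// Hypothetical, more detailed tool metadata for the flutter_driver tool.
const flutterDriverTool = {
  'name': 'flutter_driver',
  'description': 'Drives a running Flutter app over the VM service. '
      'Supported actions: tap, enter_text, scroll, get_widget_tree. '
      'Finders can match by text, tooltip, value key, or widget type. '
      'Prefer get_widget_tree first to discover value keys, then tap by '
      'key; fall back to text or tooltip only when the label is unique.',
  'inputSchema': {
    'type': 'object',
    'properties': {
      'action': {
        'type': 'string',
        'enum': ['tap', 'enter_text', 'scroll', 'get_widget_tree'],
      },
      'finder': {'type': 'object'},
    },
    'required': ['action'],
  },
};
```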
-
Yes, I agree that figuring out the right prompting (via descriptions) is going to improve this a lot. Figuring out the right prompting is the hard part, though, when I have never used flutter driver myself and don't have any context on the best way to use it 🤣. I will try to find somebody with domain knowledge to figure that out.
-
Fwiw, I played with this a whole bunch last week, trimmed down the set of available tools, and provided some more instruction in the tool itself, and in general things are now working quite reliably. I even managed to live demo it without issues. I have also added a prompt to the MCP server, which is what I demo'd: you give it a user journey, and it will try to accomplish that journey using flutter_driver. Once it has done so, it will write a flutter driver test based on the actions it performed, and then run that test to make sure it works. It works like a charm on simple-ish demos, and I am excited to get some real world feedback on it once it hits beta! I am also going to open up discussions on this repo (well, somebody with access is going to), and then I will convert this issue to a discussion.
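The test the agent ends up writing is an ordinary flutter_driver test. A minimal sketch of what such a generated test might look like for a hypothetical "favorite a fruit" journey - the widget keys, labels, and tooltip are assumptions, not the demo's actual code:

```dart
import 'package:flutter_driver/flutter_driver.dart';
import 'package:test/test.dart';

void main() {
  group('favorite a fruit user journey', () {
    late FlutterDriver driver;

    setUpAll(() async {
      driver = await FlutterDriver.connect();
    });

    tearDownAll(() async {
      await driver.close();
    });

    test('selecting a fruit and tapping favorite updates the count', () async {
      await driver.tap(find.text('Apple'));
      await driver.tap(find.byTooltip('Favorite')); // hypothetical tooltip
      // 'favorites-count' is an assumed ValueKey on the counter widget.
      expect(await driver.getText(find.byValueKey('favorites-count')), '1');
    });
  });
}
```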
-
That sounds awesome! Can you share your demo, either the recording or the script? I'm excited to try this out! |
-
select_and_favorite_fruits_480p.mov
I created a new one using Copilot, which supports the actual prompt and auto connection to DTD etc. Sorry I had to compress it to 480p for GitHub to allow the upload.
-
That is very hard to see. Any chance you could either share something from Google Drive or paste the conversation here?
-
@jakemac53 I see you've made some progress on adding screenshot support, even if it only works in the simulator for now. One option I haven't seen discussed in this thread is the potential of a code-level integration to achieve screenshotting in Impeller apps - this is what arenukvern/mcp_flutter does, and it works great for me. Specifically, that MCP server bypasses the
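For reference, the usual code-level approach (and, as far as I can tell, roughly what mcp_flutter does - I haven't verified its internals) is to render a RepaintBoundary to an image from inside the app itself, which works regardless of whether Skia or Impeller is the renderer. A minimal sketch, assuming you control the app's widget tree:

```dart
import 'dart:typed_data';
import 'dart:ui' as ui;

import 'package:flutter/material.dart';
import 'package:flutter/rendering.dart';

// Assumed to wrap the subtree you want to capture, e.g.:
// RepaintBoundary(key: screenshotKey, child: app)
final GlobalKey screenshotKey = GlobalKey();

Future<Uint8List> captureScreenshot({double pixelRatio = 1.0}) async {
  final boundary = screenshotKey.currentContext!.findRenderObject()
      as RenderRepaintBoundary;
  final ui.Image image = await boundary.toImage(pixelRatio: pixelRatio);
  final ByteData? bytes =
      await image.toByteData(format: ui.ImageByteFormat.png);
  return bytes!.buffer.asUint8List();
}
```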
-
To build a UI interactively from a design, an AI agent needs an evaluation function. In this case, the standard MCP server for driving UIs on the web won't work, since Puppeteer requires the HTML DOM, which Flutter Web doesn't provide. We really need the dart_mcp_server to allow the AI agent to drive a Flutter UI so that it can press buttons, enter text, scroll, etc. With that plus screenshots, the AI agent would have everything it needs to let Flutter devs build Flutter UIs the way AI agents build and iterate on web-based UIs today.
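At the flutter_driver level, all of those primitives already exist, so the act-then-evaluate loop could look roughly like this - a sketch using the public FlutterDriver API, with the finders and output path made up for illustration:

```dart
import 'dart:io';

import 'package:flutter_driver/flutter_driver.dart';

Future<void> main() async {
  final driver = await FlutterDriver.connect();

  // Act: the primitives an agent needs to drive the UI.
  await driver.tap(find.byValueKey('email-field')); // hypothetical key
  await driver.enterText('dev@example.com'); // types into the focused field
  await driver.scroll(
      find.byType('ListView'), 0, -300, const Duration(milliseconds: 300));
  await driver.tap(find.text('Submit'));

  // Evaluate: capture the frame so the agent can compare it to the design.
  final pngBytes = await driver.screenshot();
  await File('iteration_01.png').writeAsBytes(pngBytes);

  await driver.close();
}
```

Note that `screenshot()` is subject to the Skia-only caveat discussed earlier in the thread.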