Skip to content

Conversation

hchen2020
Copy link
Contributor

@hchen2020 hchen2020 commented Sep 30, 2025

PR Type

Enhancement


Description

  • Add IWebElementLocator interface for coordinate-based element detection

  • Enhance element locating with position-based fallback mechanism

  • Add screenshot utility function for web automation

  • Refactor element action handling to support position coordinates


Diagram Walkthrough

flowchart LR
  A["IWebElementLocator Interface"] --> B["Element Detection"]
  B --> C["Position Coordinates"]
  C --> D["Element Actions"]
  E["Screenshot Function"] --> F["Web Utilities"]
  D --> G["Enhanced Web Automation"]
  F --> G
Loading

File Walkthrough

Relevant files
Enhancement
7 files
ElementLocatingArgs.cs
Add element locator description and position properties   
+5/-0     
IWebElementLocator.cs
Create new interface for element coordinate detection       
+8/-0     
PlaywrightWebDriver.ActionOnElement.cs
Add position support and refactor action flow                       
+4/-1     
PlaywrightWebDriver.DoAction.cs
Implement position-based element actions and refactor execution
+211/-15
PlaywrightWebDriver.LocateElement.cs
Integrate IWebElementLocator for position-based element detection
+17/-0   
WebUtilityHook.cs
Add screenshot function and remove template references     
+5/-2     
UtilWebTakeScreenshotFn.cs
Create new screenshot utility function implementation       
+39/-0   
Configuration changes
7 files
Using.cs
Add IWebElementLocator namespace import                                   
+2/-2     
util-web-action_on_element.fn.liquid
Remove template file for action on element                             
+0/-1     
util-web-go_to_page.fn.liquid
Remove template file for go to page                                           
+0/-1     
BotSharp.Plugin.WebDriver.csproj
Update project file with screenshot function and remove templates
+3/-8     
util-web-action_on_element.json
Add element locator description parameter and update requirements
+5/-1     
util-web-locate_element.json
Add element locator description parameter and update requirements
+5/-1     
util-web-take_screenshot.json
Create new screenshot function configuration                         
+10/-0   
Formatting
1 files
UtilWebActionOnElementFn.cs
Minor formatting adjustment in action execution                   
+1/-0     

Copy link

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Possible Null Handling

Position-based locating sets a success result without populating selector or validating the returned position beyond basic >0 checks; downstream actions may depend on selector or non-null position. Ensure consistent result fields or guard in callers.

// Use IWebElementLocator to detect element position by element description
var locators = _services.GetServices<IWebElementLocator>();
foreach (var el in locators)
{
    location.Position = await el.DetectElementCoordinates(this, message.ContextId, location.ElementLocatorDescription);

    if (location.Position != null && location.Position.X > 0 && location.Position.Y > 0)
    {
        result.Message = $"Position based locating is found at {location.Position}";
        return new BrowserActionResult
        {
            IsSuccess = true
        };
    }
Logic Duplication

Redundant check for action.Position not null and non-zero is performed twice in the coordinates branch; simplify to a single guard to avoid drift and improve readability.

else if (action.Position != null && action.Position.X != 0 && action.Position.Y != 0)
{
    if (action.Position != null && action.Position.X != 0 && action.Position.Y != 0)
    {
        var elementHandle = await page.EvaluateHandleAsync(
            @"(coords) => document.elementFromPoint(coords.x, coords.y)",
            new { x = (int)action.Position.X, y = (int)action.Position.Y }
        );

        await ExecuteAction(message, page, elementHandle.AsElement(), action);
    }
}
Potential Misuse

In ExecuteAction overload using element handle, 'body' is used for most actions (click, dropdown, typing) ignoring the actual element handle; this may act at wrong coordinates or element type. Validate that body is intended, otherwise act on the elementHandle/locator.

    var body = page.Locator("body");

    if (action.Action == BroswerActionEnum.Click)
    {
        await body.ClickAsync(new LocatorClickOptions
        {
            Position = new Position
            {
                X = action.Position.X,
                Y = action.Position.Y
            }
        });
    }
    else if (action.Action == BroswerActionEnum.DropDown)
    {
        var tagName = await body.EvaluateAsync<string>("el => el.tagName.toLowerCase()");
        if (tagName == "select")
        {
            await HandleSelectDropDownAsync(page, body, action);
        }
        else
        {
            await body.ClickAsync();
            if (!string.IsNullOrWhiteSpace(action.PressKey))
            {
                await page.Keyboard.PressAsync(action.PressKey);
                await page.Keyboard.PressAsync("Enter");
            }
            else
            {
                var optionLocator = page.Locator($"//div[text()='{action.Content}']");
                var optionCount = await optionLocator.CountAsync();
                if (optionCount == 0)
                {
                    Serilog.Log.Error($"Dropdown option not found: {action.Content}");
                    return;
                }
                await optionLocator.First.ClickAsync();
            }
        }
    }
    else if (action.Action == BroswerActionEnum.InputText)
    {
        await elementHandle.FillAsync(action.Content);

        if (action.PressKey != null)
        {
            if (action.DelayBeforePressingKey > 0)
            {
                await Task.Delay(action.DelayBeforePressingKey);
            }
            await body.PressAsync(action.PressKey);
        }
    }
    else if (action.Action == BroswerActionEnum.FileUpload)
    {
        var _states = _services.GetRequiredService<IConversationStateService>();
        var files = new List<string>();
        if (action.FileUrl != null && action.FileUrl.Length > 0)
        {
            files.AddRange(action.FileUrl);
        }
        var hooks = _services.GetServices<IWebDriverHook>();
        foreach (var hook in hooks)
        {
            files.AddRange(await hook.GetUploadFiles(message));
        }
        if (files.Count == 0)
        {
            Serilog.Log.Warning($"No files found to upload: {action.Content}");
            return;
        }
        var fileChooser = await page.RunAndWaitForFileChooserAsync(async () =>
        {
            await body.ClickAsync();
        });
        var guid = Guid.NewGuid().ToString();
        var directory = Path.Combine(Path.GetTempPath(), guid);
        DeleteDirectory(directory);
        Directory.CreateDirectory(directory);
        var localPaths = new List<string>();
        var http = _services.GetRequiredService<IHttpClientFactory>();
        using var httpClient = http.CreateClient();
        foreach (var fileUrl in files)
        {
            try
            {
                using var fileData = await httpClient.GetAsync(fileUrl);
                var fileName = new Uri(fileUrl).AbsolutePath;
                var localPath = Path.Combine(directory, Path.GetFileName(fileName));
                await using var fs = new FileStream(localPath, FileMode.Create, FileAccess.Write, FileShare.None);
                await fileData.Content.CopyToAsync(fs);
                localPaths.Add(localPath);
            }
            catch (Exception ex)
            {
                Serilog.Log.Error($"FileUpload failed for {fileUrl}. Message: {ex.Message}");
            }
        }
        await fileChooser.SetFilesAsync(localPaths);
        await Task.Delay(1000 * action.WaitTime);
    }
    else if (action.Action == BroswerActionEnum.Typing)
    {
        await body.PressSequentiallyAsync(action.Content);
        if (action.PressKey != null)
        {
            if (action.DelayBeforePressingKey > 0)
            {
                await Task.Delay(action.DelayBeforePressingKey);
            }
            await body.PressAsync(action.PressKey);
        }
    }
    else if (action.Action == BroswerActionEnum.Hover)
    {
        await body.HoverAsync();
    }
    else if (action.Action == BroswerActionEnum.DragAndDrop)
    {
        // Locate the element to drag
        var box = await body.BoundingBoxAsync();

        if (box != null)
        {
            // Calculate start position
            float startX = box.X + box.Width / 2; // Start at the center of the element
            float startY = box.Y + box.Height / 2;

            // Drag offsets
            float offsetX = action.Position.X;
            // Move horizontally
            if (action.Position.Y == 0)
            {
                // Perform drag-and-move
                // Move mouse to the start position
                var mouse = page.Mouse;
                await mouse.MoveAsync(startX, startY);
                await mouse.DownAsync();

                // Move mouse smoothly in increments
                var tracks = GetVelocityTrack(offsetX);
                foreach (var track in tracks)
                {
                    startX += track;
                    await page.Mouse.MoveAsync(startX, 0, new MouseMoveOptions
                    {
                        Steps = 3
                    });
                }

                // Release mouse button
                await Task.Delay(1000);
                await mouse.UpAsync();
            }
            else
            {
                throw new NotImplementedException();
            }
        }
    }
}

Copy link

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
High-level
Refactor action logic to avoid duplication

Refactor the DoAction.cs file to eliminate code duplication. Merge the two
overloaded ExecuteAction methods for ILocator and IElementHandle into a single,
unified method to improve maintainability.

Examples:

src/Plugins/BotSharp.Plugin.WebDriver/Drivers/PlaywrightDriver/PlaywrightWebDriver.DoAction.cs [70-238]
    private async Task ExecuteAction(MessageInfo message, IPage page, ILocator locator, ElementActionArgs action)
    {
        if (action.Action == BroswerActionEnum.Click)
        {
            if (action.Position == null)
            {
                await locator.ClickAsync();
            }
            else
            {

 ... (clipped 159 lines)
src/Plugins/BotSharp.Plugin.WebDriver/Drivers/PlaywrightDriver/PlaywrightWebDriver.DoAction.cs [240-403]
    private async Task ExecuteAction(MessageInfo message, IPage page, IElementHandle elementHandle, ElementActionArgs action)
    {
        var body = page.Locator("body");

        if (action.Action == BroswerActionEnum.Click)
        {
            await body.ClickAsync(new LocatorClickOptions
            {
                Position = new Position
                {

 ... (clipped 154 lines)

Solution Walkthrough:

Before:

public async Task DoAction(MessageInfo message, ElementActionArgs action, BrowserActionResult result)
{
    if (result.Selector != null)
    {
        ILocator locator = page.Locator(result.Selector);
        // ... validation logic ...
        await ExecuteAction(message, page, locator, action);
    }
    else if (action.Position != null)
    {
        var elementHandle = await page.EvaluateHandleAsync(...);
        await ExecuteAction(message, page, elementHandle.AsElement(), action);
    }
}

private async Task ExecuteAction(..., ILocator locator, ...)
{
    // ~100 lines of action logic (Click, Input, Upload, etc.)
}

private async Task ExecuteAction(..., IElementHandle elementHandle, ...)
{
    // ~100 lines of duplicated action logic
}

After:

public async Task DoAction(MessageInfo message, ElementActionArgs action, BrowserActionResult result)
{
    ILocator targetLocator;
    if (result.Selector != null)
    {
        targetLocator = page.Locator(result.Selector);
        // ... validation logic ...
    }
    else if (action.Position != null)
    {
        // For position-based actions, many Playwright actions can be
        // performed on a broader locator (like `body`) with position options.
        targetLocator = page.Locator("body");
    }
    else 
    {
        // handle error
        return;
    }
    await ExecuteAction(message, page, targetLocator, action);
}

private async Task ExecuteAction(..., ILocator locator, ...)
{
    // Unified action logic for both selector and position-based actions
    // e.g., for click:
    if (action.Position != null) {
        await locator.ClickAsync(new LocatorClickOptions { Position = ... });
    } else {
        await locator.ClickAsync();
    }
    // ... other actions
}
Suggestion importance[1-10]: 9

__

Why: The suggestion correctly identifies significant code duplication in DoAction.cs between the two ExecuteAction overloads, which is a major maintainability issue, and proposes a valid architectural improvement.

High
Learned
best practice
Add null-safety guards

Guard against null or empty elementDescription and null location before calling
detector, and use safe defaults to avoid NullReferenceExceptions.

src/Plugins/BotSharp.Plugin.WebDriver/Drivers/PlaywrightDriver/PlaywrightWebDriver.LocateElement.cs [27-39]

-foreach (var el in locators)
+if (location == null) return new BrowserActionResult { IsSuccess = false };
+var elementDesc = location.ElementLocatorDescription ?? string.Empty;
+if (!string.IsNullOrWhiteSpace(elementDesc))
 {
-    location.Position = await el.DetectElementCoordinates(this, message.ContextId, location.ElementLocatorDescription);
+    foreach (var el in locators)
+    {
+        var pos = await el.DetectElementCoordinates(this, message.ContextId, elementDesc) ?? new ElementPosition();
+        location.Position = pos;
 
-    if (location.Position != null && location.Position.X > 0 && location.Position.Y > 0)
-    {
-        result.Message = $"Position based locating is found at {location.Position}";
-        return new BrowserActionResult
+        if (pos.X > 0 && pos.Y > 0)
         {
-            IsSuccess = true
-        };
+            result.Message = $"Position based locating is found at {pos}";
+            return new BrowserActionResult { IsSuccess = true };
+        }
     }
 }
  • Apply / Chat
Suggestion importance[1-10]: 6

__

Why:
Relevant best practice - Ensure null-safety on optional inputs, returning safe defaults and guarding against null dereferences.

Low
Guard evaluated element handle

Remove duplicated condition, check page not null and validate the evaluated
handle/element before use to avoid runtime errors.

src/Plugins/BotSharp.Plugin.WebDriver/Drivers/PlaywrightDriver/PlaywrightWebDriver.DoAction.cs [46-57]

 else if (action.Position != null && action.Position.X != 0 && action.Position.Y != 0)
 {
-    if (action.Position != null && action.Position.X != 0 && action.Position.Y != 0)
+    if (page == null) { Serilog.Log.Error("Page is null."); return; }
+    var handle = await page.EvaluateHandleAsync(
+        @"(coords) => document.elementFromPoint(coords.x, coords.y)",
+        new { x = (int)action.Position.X, y = (int)action.Position.Y }
+    );
+    var elem = handle?.AsElement();
+    if (elem == null)
     {
-        var elementHandle = await page.EvaluateHandleAsync(
-            @"(coords) => document.elementFromPoint(coords.x, coords.y)",
-            new { x = (int)action.Position.X, y = (int)action.Position.Y }
-        );
-
-        await ExecuteAction(message, page, elementHandle.AsElement(), action);
+        Serilog.Log.Error("No element found at given coordinates.");
+        return;
     }
+    await ExecuteAction(message, page, elem, action);
 }
  • Apply / Chat
Suggestion importance[1-10]: 5

__

Why:
Relevant best practice - Validate external/system state and asynchronous usage; avoid blocking and ensure awaited operations are guarded and null-safe.

Low
  • More

@Oceania2018 Oceania2018 merged commit 479087c into SciSharp:master Oct 1, 2025
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants