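Last time I mentioned that Playwright for .NET could also be used to build robots that automate web operations, but that deploying it to client machines might be a problem. Playwright was designed for end-to-end testing and mostly runs in development and test environments or on build hosts, so downloading and installing its libraries never had to be especially user friendly. To ship it inside a client-side application, I need to look at how many files it requires, how large they are, and where they must live on disk, and then judge whether the idea is feasible.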

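First, I want to reuse the Chrome or Edge already installed on the client machine instead of downloading and installing another browser. That part is not hard. Taking Chrome as the example, look up the Chrome install path in the Registry and pass it as ExecutablePath when calling Chromium.LaunchAsync().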

using System.Text.RegularExpressions;
using Microsoft.Playwright;

using var playwright = await Playwright.CreateAsync();
await using var browser = await playwright.Chromium.LaunchAsync(new BrowserTypeLaunchOptions {
    Headless = false,
    ExecutablePath = GetChromePath()
});
var page = await browser.NewPageAsync();
await page.GotoAsync("https://playwright.dev/dotnet");
await page.ScreenshotAsync(new()
{
    Path = "screenshot.png"
});

// get Chrome installed path from registry (Windows only)
string GetChromePath()
{
    var path = Microsoft.Win32.Registry.GetValue(
        @"HKEY_CLASSES_ROOT\ChromeHTML\shell\open\command", null, null) as string;
    if (string.IsNullOrEmpty(path))
        throw new ApplicationException("Chrome not installed");
    var m = Regex.Match(path, "\"(?<p>.+?)\"");
    if (!m.Success)
        throw new ApplicationException($"Invalid Chrome path - {path}");
    return m.Groups["p"].Value;
}
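
Since Edge was mentioned as an alternative, the lookup can be generalized into a fallback chain. What follows is only a minimal sketch and not part of the original program: the helper names ExtractExePath and FindBrowserPath are made up for illustration, and MSEdgeHTM is assumed to be the ProgID that Edge registers on Windows. The registry read is passed in as a delegate so the parsing logic can be exercised without a real registry.

```csharp
using System;
using System.Text.RegularExpressions;

// Pull the quoted executable path out of a shell "open" command such as
//   "C:\...\chrome.exe" --single-argument %1
// Returns null when the value is empty or contains no quoted path.
string? ExtractExePath(string? command)
{
    if (string.IsNullOrEmpty(command)) return null;
    var m = Regex.Match(command, "\"(?<p>.+?)\"");
    return m.Success ? m.Groups["p"].Value : null;
}

// Try each ProgID in order and return the first executable path found.
// readCommand is a hypothetical hook; on Windows it would wrap Registry.GetValue.
string FindBrowserPath(Func<string, string?> readCommand, params string[] progIds)
{
    foreach (var id in progIds)
    {
        var exe = ExtractExePath(readCommand($@"HKEY_CLASSES_ROOT\{id}\shell\open\command"));
        if (exe != null) return exe;
    }
    throw new InvalidOperationException("No supported browser found");
}

// Usage with a fake registry in which only Edge is "installed"
var fake = new Func<string, string?>(key => key.Contains("MSEdgeHTM")
    ? "\"C:\\Program Files (x86)\\Microsoft\\Edge\\Application\\msedge.exe\" --single-argument %1"
    : null);
Console.WriteLine(FindBrowserPath(fake, "ChromeHTML", "MSEdgeHTM"));
```

On Windows, readCommand could be key => Microsoft.Win32.Registry.GetValue(key, null, null) as string, with the result passed as ExecutablePath.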

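I did a test publish with dotnet publish -c Release -r win-x64 --no-self-contained. Besides the DLLs, the output contains a .playwright directory holding node.exe along with TypeScript and JavaScript code and configuration files. This shows that Playwright itself still runs on Node.js, and Playwright for .NET is only an intermediary layer that communicates with it.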

One caveat: Microsoft.Playwright.dll must exist as a standalone file. If you use -p:PublishSingleFile=true to bundle it into the .exe, the program throws this error at run time:

Unhandled exception. System.NotSupportedException: CodeBase is not supported on assemblies loaded from a single-file bundle.
   at System.Reflection.RuntimeAssembly.get_CodeBase()
   at Microsoft.Playwright.Helpers.Driver.GetExecutablePath() in /_/src/Playwright/Helpers/Driver.cs:line 43
   at Microsoft.Playwright.Transport.StdIOTransport.GetProcess() in /_/src/Playwright/Transport/StdIOTransport.cs:line 114
   at Microsoft.Playwright.Transport.StdIOTransport..ctor() in /_/src/Playwright/Transport/StdIOTransport.cs:line 44
   at Microsoft.Playwright.Playwright.CreateAsync() in /_/src/Playwright/Playwright.cs:line 44
   at Program.<Main>$(String[] args) in X:\Github\MiscLabs\playwright-dep-test\Program.cs:line 4
   at Program.<Main>(String[] args)
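
Because the driver is located relative to files on disk, a startup check can turn that stack trace into a clearer message. This is a minimal sketch that assumes the layout observed in this article (Microsoft.Playwright.dll as a separate file with a .playwright folder beside it). CheckPlaywrightLayout is a hypothetical helper, not a Playwright API.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Return a list of problems found in baseDir; an empty list means the
// files this article found necessary are present.
List<string> CheckPlaywrightLayout(string baseDir)
{
    var problems = new List<string>();
    if (!File.Exists(Path.Combine(baseDir, "Microsoft.Playwright.dll")))
        problems.Add("Microsoft.Playwright.dll is not a separate file (single-file publish?)");
    if (!Directory.Exists(Path.Combine(baseDir, ".playwright")))
        problems.Add(".playwright driver folder is missing");
    return problems;
}

// Usage: run this before Playwright.CreateAsync()
foreach (var problem in CheckPlaywrightLayout(AppContext.BaseDirectory))
    Console.Error.WriteLine($"Deployment problem: {problem}");
```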

The whole program comes to 310 files and nearly 70 MB, which is still an acceptable size.
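
To repeat this measurement on another publish folder, something like the following sketch could be used. MeasureFolder is an illustrative name, and taking the folder path from the first command-line argument is an assumption made for the example.

```csharp
using System;
using System.IO;
using System.Linq;

// Count files and total bytes under a folder, including all subfolders.
(int Count, long Bytes) MeasureFolder(string dir)
{
    var files = new DirectoryInfo(dir).GetFiles("*", SearchOption.AllDirectories);
    return (files.Length, files.Sum(f => f.Length));
}

// Usage: pass the publish folder as the first argument (defaults to the current folder)
var target = args.Length > 0 ? args[0] : ".";
var (count, bytes) = MeasureFolder(target);
Console.WriteLine($"{count} files, {bytes / 1024.0 / 1024.0:F1} MB");
```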

I copied the whole package to a client machine, and it ran successfully!

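So this at least confirms the plan is feasible: for future releases, copy the whole package to the client machine, install the .NET SDK (for a framework-dependent publish like this one, the .NET runtime alone is sufficient), and Playwright for .NET runs there.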

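One thing is still left to optimize: the 68 MB .playwright folder bundled with each program is always identical, so copying it for every program is wasteful. Could several programs share one copy? Tracing into the source code, GetExecutablePath() assumes the .playwright folder sits either in the executing program's directory or in the same directory as Microsoft.Playwright.dll, and there is currently no option for a custom path. Someone submitted a PR to add a custom path parameter in the style of PuppeteerSharp, as in Playwright.CreateAsync(driversPath: "path/.playwright"), but it was not accepted. To solve this for now, the options are to fork a modified version or to resort to some hacking trick. I will think about it when I have time.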

Playwright for .NET is designed for E2E testing, but it is also handy for writing robots that automate web operations. This article explores the issues involved in deploying a Playwright-based application to end-user clients.

