Automation Skill Builder

User manual (customers) · local web app + API + optional MCP · Database MCP build & MCP tools (troubleshooting)

1. Overview

This document is the customer-facing manual: setup, settings, everyday use, and troubleshooting. Packaging, billing servers, and engineering internals ship with the downloadable source package; they are not published on this website.

The product is a FastAPI-based automation and skill-building workstation. It combines a browser UI, desktop helpers (screenshots, OCR, mouse/keyboard), optional Playwright browser automation (MCP relay), FastSkills-oriented skill packaging, and an optional MCP server for AI assistants.

Who this manual is for

Suggested reading paths

Quick start (about 5 minutes)

Follow these steps after you install the app for your OS. For prerequisites, see §2.1.

Step 1: Start the application

  1. Launch Automation Skill Builder from your Windows .exe, macOS .app, or Linux launcher.
  2. Allow any system permission prompts (especially on macOS — see §8.1).
  3. Wait until the tray icon or splash indicates the local service is running.

Step 2: Open the control panel in your browser

  1. Open a current Chrome or Edge (Chromium) window.
  2. Go to http://127.0.0.1:8800 (or the port you configured).
  3. Confirm the sidebar loads (script editor, utilities, Settings).

[Figure: Automation Skill Builder web UI with sidebar and main workspace]

Step 3: Set your language and verify capabilities

  1. Open Settings → General and choose your display language.
  2. Click Save in the header if prompted.
  3. Open MCP & server utilities from the sidebar → Advanced to review FastSkills, Playwright, and optional MCP status.

Step 4: Run your first action

  1. From the sidebar, open a utility such as Screenshot or follow the first capture tutorial below.
  2. Confirm the result appears in the UI or in your configured output folder.

Next steps: Common scenarios · Settings & security · FAQ

2. Installation & environment

2.1 Requirements

Platform requirements (Windows, macOS, Linux)

The product documentation does not name a single minimum OS build for every deployment (e.g. “Windows 10 1903” or “macOS 12”). What follows reflects practical use of the ML/desktop stack together with Python and vendor requirements for your machine.

Practical minimums (all platforms)

Windows

2.1.1 CPU: AVX support (Windows .exe and PaddlePaddle)

The standard Windows build (a PyInstaller folder bundle with PaddleOCR / PaddlePaddle) requires a 64-bit x86-64 CPU with AVX; many prebuilt wheels effectively expect AVX2-class CPUs. Chips that implement only SSE4.x and not AVX, such as some low-power parts (e.g. Intel Pentium Gold 4425Y), are outside the supported hardware for that package. On such machines startup may fail with errors such as a missing libpaddle, NameError: libpaddle, or DLL initialization failed.

If you must install on a machine without AVX, contact the software provider or your license/support channel to request a separate, individually supplied package when available. Such builds are provided only on request; the provider offers no guarantee of performance, responsiveness, stability, or full feature parity with the standard distribution.

macOS

Linux

2.1.5 Component availability matrix (base vs optional)

The matrix below lists a practical bottom line per OS column (what must be true on the host for that row to work). It is not a vendor certification—exact build numbers follow Python, Node/Playwright, Docker, and uvx documentation for your machine. Docker adds host rules from the Docker vendor (RAM, Windows WSL 2 / Hyper-V, minimum macOS / Linux kernel). If Database MCP uses Docker at runtime, those apply on top of the Skill Builder baseline. The Linux column assumes typical 64-bit glibc distributions (e.g. Debian/Ubuntu family); other distros may need extra work for ML wheels. Linux dependency lockfiles for source installs are documented in the application source package.

| Component / stack | Windows | macOS | Debian / Ubuntu (glibc Linux) |
| --- | --- | --- | --- |
| Skill Builder (core) | 64-bit Windows 10+; practical minimum 8 GB RAM for OCR + models. Install from requirements-windows.txt (pywin32, PyAutoGUI, etc.). Desktop automation expects a normal user session. | Python 3.10+ when using a full stack from source (pyobjc, etc.). Enable the Privacy & Security permissions listed under macOS in §2.1 (Accessibility, Input Monitoring, Screen Recording, Automation as prompted). | 64-bit glibc Linux (e.g. Debian/Ubuntu family). Optional tray/GUI needs DISPLAY (X11/Wayland). HTTP-only servers: --no-gui. |
| Playwright MCP (optional) | Node (LTS recommended); install browsers per the Playwright MCP documentation in the application source package, or use external MCP via PLAYWRIGHT_MCP_EXTERNAL_URL. | Same prerequisites as Windows; browsers must be supported by Playwright for your macOS/CPU. See the application source package for setup details. | Same prerequisites as Windows; on minimal servers often run npx playwright install-deps once (usually sudo). Details: application source package. |
| Database MCP (optional) | Native: vendor/db_mcp_server/db-mcp-server.exe (bundled or built with Go/Docker once). Or Docker runtime: Docker Desktop must meet Docker’s current Windows + WSL 2 requirements; image pulled (§5.1). | Native: binary under vendor/db_mcp_server/ if present. Or Docker: Docker Desktop’s macOS version ceiling applies; old Macs may fail both paths (see warning below). | Native: Linux binary in vendor/ or build. Or Docker Engine + docker on PATH + pulled image (common on servers; no Desktop required). |
| Python Interpreter MCP (optional) | Default stdio: uvx on PATH (first run needs PyPI/network unless cached). Override command in python_interpreter_mcp_config.json. External: any reachable MCP base URL. | Same pattern (uvx / custom command / external). | Same pattern; install uv/uvx via pip or distro if missing. |
| FastSkills (required for full skill workflow) | Child process: default uvx fastskills (needs uvx + network first time) or python -m fastskills with AUTOMATION_FASTSKILLS_USE_PYTHON_MODULE=1, or AUTOMATION_FASTSKILLS_CMD_JSON. Folders my-fastskills/skills + my-fastskills/output. | Same env-based overrides; Apple Silicon / Intel per Python wheels. | Same; ensure uvx or alternative launcher is available. |

Legend: rows are major stacks; columns are host OS families. “Optional” integrations can be disabled in their config files if you do not need them.
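The FastSkills launcher options in the matrix can be pictured as a small resolution function. This is only a sketch: the precedence order (custom command first, then the Python module flag, then the uvx default) and the assumption that AUTOMATION_FASTSKILLS_CMD_JSON holds a JSON array of argument strings are illustrative guesses, not confirmed behaviour of the app.

```python
import json
import os


def fastskills_command(env=None):
    """Return the argv list that would start FastSkills (illustrative only)."""
    env = os.environ if env is None else env

    # ASSUMPTION: the variable holds a JSON array such as ["uvx", "fastskills"].
    custom = env.get("AUTOMATION_FASTSKILLS_CMD_JSON")
    if custom:
        return [str(part) for part in json.loads(custom)]

    # ASSUMPTION: the module flag is checked after the custom command.
    if env.get("AUTOMATION_FASTSKILLS_USE_PYTHON_MODULE") == "1":
        return ["python", "-m", "fastskills"]

    # Default named in the matrix: uvx fastskills.
    return ["uvx", "fastskills"]
```

Passing a plain dict instead of the real environment makes it easy to try combinations without changing your shell.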

Avoid a “double dead-end” on old hardware (example: pre-2015 Mac notebooks): You might install the core Python app, then discover that Database MCP has no usable native db-mcp-server binary in your bundle or build, so you plan to rely on the Docker fallback—only to find that Docker Desktop no longer supports your macOS version (or the machine lacks RAM / virtualisation support). Docker is not a universal escape hatch. Before counting on Database MCP, check Docker’s current host requirements for your exact OS build; on very old Macs, plan to run Database MCP on a newer host, use a pre-built native binary from a supported release channel, or leave Database MCP disabled if you only need the rest of the app.

The in-app Settings → Advanced → Capabilities report shows live status for FastSkills, Playwright MCP, Database MCP, and Python Interpreter MCP. Offline / air-gapped notes: §8.5 (FastSkills). Python Interpreter MCP (stdio, uvx, external mode): documented in the application source package.

2.2 Environment variables (optional)

Packaged desktop builds may use different launch flags. Operators and developers: additional environment variables (billing, recording, MCP) are documented in the application repository.

3. Running the application

Start the product with the installed executable or launcher from your platform’s package (Windows .exe, macOS .app, or your Linux distribution’s start method). The local web UI and API listen on the configured port (often 8800 for HTTP).

Open http://127.0.0.1:8800 (or your chosen host/port) in a browser. API docs are typically at /docs and OpenAPI at /openapi.json.
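To confirm from a script that the service is up, one option is to fetch /openapi.json and list the routes it advertises. The sketch below uses only the Python standard library; the helper names are made up for illustration, and 8800 is simply the default port mentioned above.

```python
import json
import urllib.request


def service_url(path="/", host="127.0.0.1", port=8800):
    """Build a URL for the local HTTP service."""
    return f"http://{host}:{port}/{path.lstrip('/')}"


def list_api_paths(openapi_doc):
    """Return the sorted route paths from a parsed OpenAPI document."""
    return sorted(openapi_doc.get("paths", {}))


def fetch_openapi(host="127.0.0.1", port=8800, timeout=5.0):
    """Download and parse /openapi.json.

    Raises urllib.error.URLError if nothing is listening on that port.
    """
    url = service_url("/openapi.json", host, port)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

With the app running, list_api_paths(fetch_openapi()) should return the documented routes; a URLError usually means the service has not started or is listening on a different port.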

By default, the full app also serves HTTPS on port 8843 (see AUTOMATION_HTTPS_PORT). To disable HTTPS, use your build’s documented launch flags or set AUTOMATION_USE_HTTPS=0 if supported. Running from a Python source checkout (python main.py / uvicorn) is documented in the application source package.

Port conflicts: If 8800 is taken (e.g. by another stack), set AUTOMATION_SERVICE_PORT or pass --port to the launcher.
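Before changing the port, it can help to check whether something is already listening on it. A minimal sketch using the standard socket module (the function name is illustrative):

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=0.5):
    """Return True if a TCP listener accepts connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return probe.connect_ex((host, port)) == 0
```

If port_in_use(8800) is True before you start the app, another program holds the port, and AUTOMATION_SERVICE_PORT or --port is the way around it.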

3.1 HTTPS & trusting the certificate (Chrome)

The first time HTTPS is used, the app creates a self-signed certificate under a tls/ folder next to your app data root (beside the .exe on Windows; inside .app/Contents/MacOS/ on macOS; or next to the runtime root for your install). For a source checkout, paths follow the project layout — see the application source package. Files: tls/localhost.crt (public) and tls/localhost.key (private). The console prints the full paths when the cert is generated or replaced.

The certificate includes SAN entries for localhost, open-skills.local, 127.0.0.1, and ::1. If you use open-skills.local, add 127.0.0.1 open-skills.local to your hosts file. After the app regenerates the cert (e.g. to add a new name), trust the new localhost.crt again.
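If open-skills.local does not resolve, a quick check of the hosts file contents can rule out a missing or commented-out entry. The helper below is a hypothetical sketch that works on text you have already read from the file (/etc/hosts on macOS and Linux, C:\Windows\System32\drivers\etc\hosts on Windows).

```python
def hosts_maps(hosts_text, name, ip="127.0.0.1"):
    """Return True if hosts_text maps `name` to `ip` on an active line."""
    for raw_line in hosts_text.splitlines():
        # Drop trailing comments, then split into address and host names.
        fields = raw_line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == ip and name in fields[1:]:
            return True
    return False
```

For example, hosts_maps(open("/etc/hosts").read(), "open-skills.local") tells you whether the entry is already present.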

Chrome on macOS — remove the “Not secure” / certificate warning

  1. Open Keychain Access (or double-click localhost.crt).
  2. Import localhost.crt if needed, then select the certificate (e.g. “Automation Studio Local”).
  3. Expand the Trust section and set "When using this certificate" to "Always Trust" (at least for Secure Sockets Layer (SSL)).
  4. Quit Chrome completely (Chrome → Quit Google Chrome or Cmd+Q), then reopen — closing tabs alone is not enough.
  5. Visit https://127.0.0.1:8843/docs or https://open-skills.local:8843/docs; Chrome should treat the site as trusted.

Windows (brief): Import localhost.crt into Trusted Root Certification Authorities for the current user (e.g. via Certificate Manager / certmgr.msc), then fully restart Chrome.

Install only the .crt file into the trust store. Never share the .key file. Self-signed local certs are for development; do not use them for production internet-facing HTTPS.

4. Main UI & navigation

[Figure: Skill Builder sidebar with script editor and AI-assisted panels]

5. Settings

Open Settings from the sidebar. Tabs include:

Many security-sensitive options require Save in the header and sometimes an application restart (on-screen hints apply).

5.1 Database MCP — Docker image (first-time pull)

When no native db-mcp-server binary is present (PyInstaller bundle or vendor/db_mcp_server/), the app can run Database MCP through Docker if the image already exists locally. The app uses docker image inspect to detect that; it does not run docker pull for you.

Runtime: native binary vs Docker vs neither

1) Native db-mcp-server available (bundled next to the frozen app under _internal, or vendor/db_mcp_server/db-mcp-server(.exe) in dev)

2) No native binary — Docker fallback

3) Neither native binary nor a pulled Docker image

Command line (fastest)

Run in a terminal:

docker pull freepeak/db-mcp-server:v1.8.0

With Docker Desktop, this uses the same engine as the GUI. After a successful pull, the in-app check for the Docker fallback image should pass.

For another tag or a private image, set DB_MCP_SERVER_RUNTIME_IMAGE to match, then pull that reference, for example:

export DB_MCP_SERVER_RUNTIME_IMAGE=freepeak/db-mcp-server:v1.8.0
docker pull "$DB_MCP_SERVER_RUNTIME_IMAGE"
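The same presence check the app relies on (docker image inspect) can be reproduced from a script to confirm that the pull worked. This is a sketch, not the app's own code: the function names are illustrative, and treating freepeak/db-mcp-server:v1.8.0 as the fallback when DB_MCP_SERVER_RUNTIME_IMAGE is unset is an assumption based on the commands above.

```python
import os
import subprocess

# ASSUMPTION: this tag is used when DB_MCP_SERVER_RUNTIME_IMAGE is not set.
DEFAULT_IMAGE = "freepeak/db-mcp-server:v1.8.0"


def runtime_image(env=None):
    """Return the image reference to look for."""
    env = os.environ if env is None else env
    return env.get("DB_MCP_SERVER_RUNTIME_IMAGE") or DEFAULT_IMAGE


def image_present(image, run=subprocess.run):
    """Return True if `docker image inspect IMAGE` exits with status 0."""
    try:
        result = run(
            ["docker", "image", "inspect", image],
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        # The docker CLI is not on PATH.
        return False
    return result.returncode == 0
```

image_present(runtime_image()) returning False after a pull usually means the tag you pulled differs from the one the variable names.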

Docker Desktop (GUI)

  1. Open Docker Desktop.
  2. Open Images.
  3. Click Pull (or use Search to find the image).
  4. Enter freepeak/db-mcp-server:v1.8.0 and confirm the pull.
  5. When finished, the image appears in the Images list.

SQLite on the host while using Docker

When Database MCP starts db-mcp-server in Docker, this application bind-mounts the parent directory of your active connections JSON read-write at /asb-mcp-data inside the container. That directory is the folder containing db_mcp_server_config.json, or the folder containing the file named by AUTOMATION_DB_MCP_CONFIG_PATH. In connections, set SQLite to an absolute path under that mount, for example:

"database_path": "/asb-mcp-data/db_mcp_local.sqlite"

The database file then lives on the host next to the config file. To use another host folder instead, set DB_MCP_SERVER_DOCKER_DATA_DIR to that folder’s absolute path before starting the automation server. The whole mounted directory is visible inside the container — only mount directories you are comfortable exposing to the db-mcp image.
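To see where a container-side SQLite path ends up on the host, the mapping can be sketched as below. The /asb-mcp-data mount point comes from this section; the function name and the example host folder are hypothetical.

```python
from pathlib import Path, PurePosixPath

MOUNT_POINT = PurePosixPath("/asb-mcp-data")


def host_path_for(container_path, host_data_dir):
    """Translate a path under /asb-mcp-data to its location on the host.

    Raises ValueError if container_path is not under the mount point.
    """
    relative = PurePosixPath(container_path).relative_to(MOUNT_POINT)
    return Path(host_data_dir).joinpath(*relative.parts)
```

For instance, with the config folder at /home/me/asb, the path /asb-mcp-data/db_mcp_local.sqlite corresponds to /home/me/asb/db_mcp_local.sqlite on the host.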

If you rely on SQLite with Database MCP inside Docker, the default upstream image tag may not match your needs. Operators should read Database MCP build & troubleshooting.

5.2 Database MCP — asking an AI assistant

With the app’s MCP endpoint connected in your client (for example streamable HTTP at http://127.0.0.1:8800/mcp-http), you do not need to guess JSON field names for tools. Say clearly which tool to invoke and which SQL to run; the model maps that to the tool schema (sql, query, etc.) for you.

Tool names follow the connection id in your connections JSON — for example local_sqlite yields asb_db_query_local_sqlite, asb_db_schema_local_sqlite, and so on. Replace local_sqlite in the examples below if your id differs.
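The naming rule can be written down as a one-line helper. Only the two tool kinds named above (query and schema) are covered here; any other kinds are not assumed.

```python
def db_tool_names(connection_id):
    """Return the query and schema tool names for a connection id."""
    return {kind: f"asb_db_{kind}_{connection_id}" for kind in ("query", "schema")}
```

db_tool_names("local_sqlite") gives asb_db_query_local_sqlite and asb_db_schema_local_sqlite, matching the example above.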

Chinese (copy-paste)

English (copy-paste)

More checks (optional)

6. Common usage scenarios

Tutorial: Your first screen capture

Time: about 10 minutes · Difficulty: Beginner · Prerequisites: app running, browser open at http://127.0.0.1:8800

What you will do: capture a region of your desktop and confirm the image is saved or shown in the UI.

  1. In the sidebar, open the Screenshot (or equivalent capture) utility.
  2. Select a screen region, or accept the default full-screen capture if offered.
  3. Wait for processing — first OCR-related runs may download models (see §8.2).
  4. Verify the preview or file path shown in the dialog.
  5. Optional: narrow the capture region next time to speed up OCR workflows (see §6.1).

Pro tip: On macOS, grant Screen Recording if the capture is blank — then quit and restart the app.

6.1 Desktop OCR → find text → click

Use POST /api/desktop/desktop_ocr_action (MCP tool id desktop_ocr_click_text when exposed). You can pass a region, a wait time in seconds before capture, target_text, a mouse action, and optional offsets. On macOS, mouse execution may use pynput / Quartz before PyAutoGUI (see server implementation).
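A request to this endpoint might be prepared as follows. This is a hedged sketch: it assumes the endpoint accepts a JSON body, and target_text is the only field name taken from this manual. Field names for the region, wait time, mouse action, and offsets are not given here, so they are passed through as-is; check /docs for the actual schema before relying on them.

```python
import json
import urllib.request


def ocr_click_request(target_text, base_url="http://127.0.0.1:8800", **extra_fields):
    """Build (but do not send) a POST request for desktop_ocr_action.

    extra_fields are copied into the JSON body unchanged; their names
    must match the schema shown in the app's /docs page.
    """
    payload = {"target_text": target_text, **extra_fields}
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/desktop/desktop_ocr_action",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # ASSUMPTION: JSON body
        method="POST",
    )
```

Sending it is then a matter of urllib.request.urlopen(ocr_click_request("Submit")) while the app is running.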

6.2 Programmatic mouse & keyboard

Use POST /api/host/mouse/move and POST /api/host/mouse/click for mouse automation. The in-app Mouse Click flow records a replay script that calls /api/host/mouse/click with button and clicks (e.g. a double-click is button left with clicks: 2). Omit x/y to click wherever the cursor is at the moment the request is handled, that is, after any client-side countdown has finished.
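As an illustration, a click request body with the button, clicks, and optional x/y fields could be assembled like this. The helper names are hypothetical, and sending the body as JSON is an assumption; the OpenAPI page at /docs shows the exact schema.

```python
import json
import urllib.request


def mouse_click_payload(button="left", clicks=1, x=None, y=None):
    """Build the body for /api/host/mouse/click.

    Leave x and y unset to click at the current cursor position.
    """
    if (x is None) != (y is None):
        raise ValueError("pass both x and y, or neither")
    payload = {"button": button, "clicks": clicks}
    if x is not None:
        payload.update(x=x, y=y)
    return payload


def mouse_click_request(payload, base_url="http://127.0.0.1:8800"):
    """Wrap the payload in a POST request (not sent here)."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/host/mouse/click",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # ASSUMPTION: JSON body
        method="POST",
    )
```

A double-click at the cursor is mouse_click_payload(clicks=2); a right-click at a fixed point is mouse_click_payload("right", 1, 400, 300).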

Use POST /api/host/keyboard/type, …/press, and …/hotkey for typing and key combinations. Other desktop flows (window focus, region screenshot, launch/close app, etc.) are under /api/host/* in OpenAPI (canonical host automation prefix).

Older URLs such as /api/remote/mouse/click and /api/action/* aliases still work but are hidden from OpenAPI; new integrations should use /api/host/*.

6.3 Browser automation

Playwright is exposed via relay routes under /playwright/* and optional MCP tools when enabled. Configure browser mode in Settings → General → Playwright (including CDP attachment to an existing browser).

6.4 API recording & replay

Start/stop recording from the UI; recorded HTTP sequences can be exported as Python replay scripts. MCP transport calls (/mcp/*) are generally omitted from replay — prefer direct REST calls in recordings.
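If you post-process your own list of recorded URLs, the same rule (skip MCP transport calls, keep REST calls) can be approximated as below. The list-of-URLs format and the example paths are purely illustrative; the exporter's real data format is not described in this manual.

```python
from urllib.parse import urlsplit


def is_mcp_transport(url):
    """Return True for URLs whose path is /mcp or starts with /mcp/."""
    path = urlsplit(url).path
    return path == "/mcp" or path.startswith("/mcp/")


def replayable(recorded_urls):
    """Keep only the calls that are not MCP transport traffic."""
    return [url for url in recorded_urls if not is_mcp_transport(url)]
```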

6.5 Skills & FastSkills

Build or import skills; when FastSkills MCP is configured, tools such as listing skills and executing tools may be available through that integration.

7. HTTP API & MCP (brief)

7.1 MCP tool visibility (enable/disable)

Some HTTP routes are hidden from the MCP tool list by default (for example license administration and diagnostics), while remaining available over HTTP or Swagger if enabled. Use Settings → MCP tools to adjust visibility; changes are saved to app data and usually require a server restart (follow on-screen hints).

Technical details (built-in exclusion IDs, JSON schema, non-configurable routes) are documented in the application source repository.

7.2 Monthly quota vs Usage statistics

If you have a subscription license, many API and automation calls count toward a monthly limit. The Usage view uses the same counting rules for graphs, with small exceptions so opening Usage does not skew its own numbers.

Exact path lists and source functions (should_count_request, etc.) are for operators — see the application source repository.

9. Frequently asked questions

Getting started

Do I need MCP or Claude skills to use the product?

No. Core workflows run in the browser UI and over HTTP. MCP and packaged skills are optional when you want an AI assistant to call your local tools.

Do I need Python or Node.js installed?

Not to launch the packaged desktop app. Some optional features (browser automation, FastSkills, skill execution) may require Python or Node.js — see §2.1 and the in-app capability report.

Which browser should I use?

Use a current Chromium-based browser (Chrome or Edge 80+). Legacy browsers may show a blank UI — see §8.6.

Privacy & deployment

Is my data sent to the cloud?

Automation runs on your machine by default. Data leaves your environment only through external AI providers you configure, downloads (models, packages), or integrations you enable.

Is this a hosted SaaS?

No. You run the service locally (or on infrastructure you control). You manage networking, TLS, and who can reach the port.

Features & permissions

Why does mouse or keyboard automation do nothing on macOS?

Enable Accessibility, Input Monitoring, and often Screen Recording for Skill Builder in System Settings, then restart the app. Details: §8.1.

Why is OCR slow the first time?

Vision models download on first use and CPU inference can take minutes on large regions. Subsequent runs are usually faster — see §8.2.

Are VS Code or browser extensions required?

No. The extensions only embed the UI; the app must still run locally. See the product site for optional VSIX and browser sidebar downloads.

Still need help?

8. Troubleshooting

8.1 Mouse / keyboard does nothing (macOS)

8.2 OCR slow or “stuck” after log line

8.3 MCP client cannot connect

8.4 CORS or mixed content

8.5 FastSkills — network errors and offline installation

By default the app starts FastSkills with uvx fastskills, which may download packages from the network (PyPI). If you see TLS errors, timeouts, or “connection reset” during startup or first use, the machine may be offline, behind a strict proxy, or blocked from PyPI.

Monthly quota and the Usage tab both skip diagnostics-style paths (capabilities, machine ID, license endpoints, etc.). See the application source repository for the full list.

8.6 Web UI blank, missing lists, or console syntax errors (browser too old)

The in-app web UI is built with modern JavaScript (for example optional chaining and related language features from around ES2020). Open the app with a current Chromium-based browser — in practice Google Chrome or Microsoft Edge (Chromium) version 80 or newer.

Legacy Windows (for example Windows Server 2012 R2) may only have an old bundled Chrome or Internet Explorer. Typical symptoms: parts of the page never render, sidebar or capability lists stay empty, or the browser devtools console shows parse errors on the first script. This is not fixed by server flags such as --no-paddle; you need a supported browser engine on the client.
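When supporting several machines, a rough way to spot an outdated client is to read the Chrome major version from its User-Agent string. The sketch below is illustrative only: it recognises Chrome/Chromium tokens and treats everything else (Internet Explorer, for example) as unsupported.

```python
import re


def chromium_major(user_agent):
    """Return the Chrome/Chromium major version, or None if absent."""
    match = re.search(r"(?:Chrome|Chromium)/(\d+)\.", user_agent)
    return int(match.group(1)) if match else None


def ui_supported(user_agent, minimum=80):
    """Return True if the User-Agent reports Chromium `minimum` or newer."""
    major = chromium_major(user_agent)
    return major is not None and major >= minimum
```

In the browser, the string to test is the value of navigator.userAgent shown in the devtools console.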

Portable Chromium snapshot (example, verified on Windows Server 2012 R2):

We have verified that unpacking the official Chromium continuous-integration zip for Windows x64, revision 1010524, and running chrome.exe from the archive is sufficient to use this UI on Windows Server 2012 R2. Direct download (Google Cloud Storage API; long URL):

https://www.googleapis.com/download/storage/v1/b/chromium-browser-snapshots/o/Win_x64%2F1010524%2Fchrome-win.zip?generation=1654260848325500&alt=media