
OpenClaw Skills Guide: Workflow, Structure, and Best Practices


If you've been using OpenClaw with ad-hoc prompts, you've probably hit prompt drift—the agent takes a slightly different approach each run, the output schema shifts, columns get renamed, rows get duplicated. You end up spending more time cleaning results than you saved on research.

OpenClaw Skills solve this. A skill is a declarative, reusable instruction set that locks in your workflow, your output format, and your quality gates—so every run produces consistent, trustworthy results. This guide walks you through everything: anatomy, load order, security, a real-world example, and copy-paste starter templates.

What Is an OpenClaw Skill?

At its core, a skill is a folder containing a SKILL.md file. That file has two parts: a YAML frontmatter block that tells OpenClaw how to identify and trigger the skill, and a Markdown body that gives the agent its exact instructions.

Think of it as the difference between telling a junior analyst "go research property launches and put them in a sheet" every week versus handing them a proper SOP document. The SOP covers what to look for, which sources to trust, how to format the output, and what to do when something goes wrong. Same idea here. A well-built skill gives you four things:

  • Repeatability: The same task produces the same structure every time, regardless of how you phrase the request.
  • Reduced improvisation: The agent doesn't guess at output format—it follows your defined schema.
  • Quality enforcement: Validation rules, source tiers, and confidence labels are baked in, not bolted on.
  • Shareability: A well-written skill can be promoted from a single project to your entire team.

Minimum SKILL.md Anatomy

A valid SKILL.md needs at minimum two frontmatter fields and a structured body. Here's the breakdown:

Frontmatter

---
name: your-skill-name
description: >
  Triggered when the user asks to [describe the task in
  natural language matching how users will invoke it].
user-invocable: true
---
  • name: Unique identifier used for loading and conflict resolution. Use kebab-case.
  • description: This is your trigger language. OpenClaw matches user intent to this field—if your wording doesn't reflect how users actually ask, the skill won't fire. Write it like a user request, not a technical label.
  • user-invocable: true: Optional. Marks the skill as directly callable by users. Omit for internal/chained skills.
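The frontmatter contract above is easy to enforce mechanically before a skill ever loads. Here's a minimal sketch in standard-library Python of a hypothetical parse_frontmatter helper (not part of OpenClaw itself) that rejects a SKILL.md missing its required fields. It assumes simple key: value pairs and folded > values rather than full YAML:

```python
import re

REQUIRED_FIELDS = {"name", "description"}

def parse_frontmatter(skill_md: str) -> dict:
    """Extract the frontmatter block from a SKILL.md string.

    A minimal sketch: handles `key: value` pairs and folded `key: >`
    multi-line values, not the full YAML spec.
    """
    match = re.match(r"^---\n(.*?)\n---", skill_md, re.DOTALL)
    if not match:
        raise ValueError("No frontmatter block found")
    fields, current_key = {}, None
    for line in match.group(1).splitlines():
        if line.startswith((" ", "\t")) and current_key:
            # Continuation line of a folded (>) value
            fields[current_key] = (fields[current_key] + " " + line.strip()).strip()
        elif ":" in line:
            key, _, value = line.partition(":")
            current_key = key.strip()
            fields[current_key] = value.strip().lstrip(">").strip()
    missing = REQUIRED_FIELDS - fields.keys()
    if missing:
        raise ValueError(f"Missing required fields: {sorted(missing)}")
    return fields
```

A check like this is worth running in CI on your workspace skills folder, so a malformed SKILL.md fails a build instead of silently never triggering.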

Body Structure

The body is where you write the actual instructions. Structure it like an SOP, not a paragraph. Every skill body should cover these sections:

1. When to Use

Define the exact trigger conditions. What user request activates this skill? Are there situations where a different skill should be used instead?

2. Required Inputs

List every variable the skill needs to run: target location, date range, sheet ID, API keys, etc. If an input is missing, the skill should ask—not guess.

3. Step-by-Step Workflow

Numbered steps. Not prose. The agent follows these exactly. Include sub-steps where needed.

4. Output Format / Schema

Define every field: column name, type, allowed values, and whether it's required or optional. No ambiguity. If it's going to a Google Sheet, list the exact column headers in order.

5. Error Handling & Stop Conditions

What should the agent do when a fetch fails? When a required field is missing? When the output doesn't pass validation? Define retries, fallbacks, and when to halt and report to the user.

Tip

Write your skill body like a checklist, not a blog paragraph. The agent reads instructions literally. Vague prose like "gather relevant data" produces vague results. Specific steps like "search for projects with completion year between 2028–2030, filter by state: Selangor or KL" produce reliable ones.

Skill Locations & Load Order

OpenClaw loads skills from three locations, in this priority order:

| Priority    | Location            | Best For                                                                     |
|-------------|---------------------|------------------------------------------------------------------------------|
| 1 (highest) | <workspace>/skills  | Project-specific skills, custom output schemas, client-specific data sources |
| 2           | ~/.openclaw/skills  | Personal reusable skills shared across projects on your machine              |
| 3 (lowest)  | Bundled skills      | Default OpenClaw capabilities (overridable)                                  |

Name conflict resolution: If two skills share the same name, the higher-priority location wins. This means you can override a bundled or shared skill by creating a workspace-level skill with the same name—useful when you need project-specific behavior without touching the shared version.

Practical rule of thumb: Start with a workspace skill. Once it's stable and you find yourself copying it to new projects repeatedly, promote it to ~/.openclaw/skills.
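The "higher priority wins" rule maps naturally onto iterating the locations from lowest to highest priority and letting later entries overwrite earlier ones. A sketch of how such a resolver might work (the bundled-skills path is an assumption for illustration, not OpenClaw's actual layout):

```python
from pathlib import Path

# Lowest to highest priority: later directories override earlier ones.
# The bundled path below is illustrative; adjust to your install layout.
SKILL_DIRS = [
    Path("/opt/openclaw/bundled-skills"),   # 3: bundled defaults
    Path.home() / ".openclaw" / "skills",   # 2: personal shared skills
    Path("skills"),                         # 1: workspace skills
]

def resolve_skills(skill_dirs=SKILL_DIRS) -> dict:
    """Map skill name -> SKILL.md path, with higher-priority dirs winning conflicts."""
    resolved = {}
    for directory in skill_dirs:  # iterate low -> high priority
        if not directory.is_dir():
            continue
        for skill_dir in directory.iterdir():
            manifest = skill_dir / "SKILL.md"
            if manifest.is_file():
                # Same name seen again at higher priority: overwrite the mapping
                resolved[skill_dir.name] = manifest
    return resolved
```

The override behaviour falls out of plain dict assignment: a workspace skill named the same as a shared one simply replaces its entry in the map.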

Recommended Skill Creation Workflow

Building a skill properly is an 8-step process. Skipping steps—especially output schema and error handling—is the root cause of most skill failures.

1. Define the objective. Write one sentence: "This skill [does exactly what] when the user [says what]." If you can't write that sentence, you're not ready to write the skill.

2. Write the trigger language. Match phrasing to how your users actually invoke tasks. Test: ask OpenClaw the task in natural language. Does it select the right skill? If not, refine the description.

3. Define strict output. List every field: name, type, constraints. For sheets: column order matters. For JSON: include a sample. Don't leave any field "flexible"; flexibility is where inconsistency hides.

4. Add quality gates. Validation rules (e.g., "completion year must be 2028–2031"), source tiers (Tier 1 = official developer site, Tier 2 = reputable property portal, Tier 3 = social media), and confidence labels.

5. Add failure handling. Specify retry count, backoff strategy for transient network failures, and what to output when a required field can't be filled. Define when to stop and escalate versus when to skip and continue.

6. Create and test on real data. Run the skill once with real inputs. Check every output field. Don't test on synthetic data; edge cases only surface with real sources.

7. Iterate on weaknesses. Every failure is a missing instruction. After each bad run, add the missing constraint. Skills improve over iterations, not rewrites.

8. Promote (optional). When the skill is stable across multiple runs and you're copying it to new projects, move it to ~/.openclaw/skills for shared access.
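Quality gates (step 4) are easiest to keep honest when you can also express them as code and run them against a sample of real output. A minimal sketch, using the field names and ranges from the property-research example above (adapt to your own schema):

```python
# Quality gates from step 4 expressed as code. Field names and the
# 2028-2031 range mirror the property-research example in this guide.
ALLOWED_TIERS = {"Tier 1", "Tier 2", "Tier 3"}
ALLOWED_CONFIDENCE = {"High", "Medium", "Low"}

def validate_record(record: dict) -> list:
    """Return a list of validation errors; an empty list means the row passes."""
    errors = []
    year = record.get("completion_year")
    if not isinstance(year, int) or not 2028 <= year <= 2031:
        errors.append(f"completion_year out of range: {year!r}")
    if record.get("source_tier") not in ALLOWED_TIERS:
        errors.append(f"unknown source_tier: {record.get('source_tier')!r}")
    if record.get("confidence") not in ALLOWED_CONFIDENCE:
        errors.append(f"unknown confidence: {record.get('confidence')!r}")
    return errors
```

Returning a list of errors rather than a boolean means a failed row produces a reviewable log entry, which feeds directly into step 7's iterate-on-weaknesses loop.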

Security Checklist

Skills execute with the permissions of your OpenClaw instance. That means a skill with access to external APIs, file systems, or credentials can cause real damage if compromised or carelessly written. Treat third-party skills the same way you'd treat third-party code: as untrusted until reviewed.

Before enabling any skill—especially third-party ones:

  • Read the full SKILL.md and any referenced scripts. Understand every step the skill will execute before you run it.
  • Audit credential exposure. Check every apiKey and env reference. Does the skill really need those credentials? Could it exfiltrate them?
  • Disable skills you don't use. An unused skill that has access to sensitive APIs is an unnecessary attack surface.
  • Prefer least-privilege tools. If a skill can accomplish its goal with read-only access, don't give it write access.
  • Require explicit approval for external actions. Any skill that writes to external systems (sheets, APIs, email) should have a confirmation step—never auto-execute silently.
  • Version-control your workspace skills. If a skill gets modified by a compromised process, you want git history to catch it.

Real-World Example: saas-price-monitor

Here's what a production skill looks like in practice. This skill automates monitoring of competitor SaaS pricing pages, extracts plan details, and appends deduplicated change records to a Google Sheet for review.

The SKILL.md (condensed)

---
name: saas-price-monitor
description: >
  Monitor and track pricing changes across a defined list of
  SaaS competitor pricing pages. Extract plan names, prices,
  and feature limits. Append new or changed records to Google Sheet.
user-invocable: true
---

## When to Use
When user asks to check, refresh, or update the competitor
pricing tracker sheet.

## Required Inputs
- Google Sheet ID (from user config or prompt)
- Competitor URL list (from skill config file)

## Workflow
1. For each URL in competitor list:
   a. Fetch pricing page content
   b. Extract: vendor name, plan name, monthly price,
      annual price, key limits (seats, storage, API calls)
   c. Assign source tier (Tier 1 = official pricing page,
      Tier 2 = review site, Tier 3 = cached/third-party)
   d. Assign confidence label (High / Medium / Low)
   e. Generate dedupe key: slug(vendor + plan_name + scraped_date)
2. Check sheet for existing dedupe key — skip if duplicate
3. Validate JSON row before appending
4. Append new/changed rows to sheet via Sheets API

## Output Schema
| Field          | Type    | Required | Notes                        |
|----------------|---------|----------|------------------------------|
| vendor         | string  | yes      |                              |
| plan_name      | string  | yes      |                              |
| price_monthly  | number  | no       | USD, null if not listed      |
| price_annual   | number  | no       | USD, null if not listed      |
| seat_limit     | string  | no       | "Unlimited" if not capped    |
| source_url     | string  | yes      | Primary source               |
| source_tier    | string  | yes      | Tier 1 / Tier 2 / Tier 3    |
| confidence     | string  | yes      | High / Medium / Low          |
| dedupe_key     | string  | yes      | slug(vendor+plan+date)       |
| scraped_date   | date    | yes      | ISO 8601                     |

## Error Handling
- Transient fetch failure: retry up to 3 times with 2s backoff
- JSON validation failure: log row, skip append, continue
- Missing required field (vendor / plan_name):
  discard the row entirely, do not append partial data
- Sheet API error: halt and report full error to user

Reliability Lessons Learned

  • Avoid inline JSON in shell commands. Constructing JSON strings inside shell commands with escaping is fragile. Build the JSON object in a separate step, validate it, then pass it to the append command.
  • Validate before appending. A malformed row that gets into your sheet means manual cleanup. A failed validation that skips the row means a clean sheet and a log entry to review. The second outcome is far better.
  • Dedupe keys prevent accumulation drift. Without a dedupe key, every run risks appending the same record again. The key should be derived from stable identifiers—not row numbers or timestamps.
  • Retry with backoff, not immediate retry. Transient failures (API rate limits, network blips) resolve with a short wait. Immediate retries just hammer the same failing endpoint.
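The first three lessons combine into one small pattern: build the row as a native object, serialise it once, and check the dedupe key before appending. A sketch in standard-library Python (the helper names are illustrative, not from the actual skill):

```python
import json
import re
from datetime import date

def slugify(*parts: str) -> str:
    """Stable slug from identifying fields: lowercase, non-alphanumerics become '-'."""
    joined = "-".join(parts)
    return re.sub(r"[^a-z0-9]+", "-", joined.lower()).strip("-")

def build_row(vendor: str, plan: str, scraped: date, **fields) -> str:
    """Build the row as a dict first, then serialise once.

    Serialising with json.dumps avoids the shell-escaping fragility
    described above: no hand-quoted inline JSON inside a command string.
    """
    row = {
        "vendor": vendor,
        "plan_name": plan,
        "scraped_date": scraped.isoformat(),
        "dedupe_key": slugify(vendor, plan, scraped.isoformat()),
        **fields,
    }
    return json.dumps(row)  # pass this string whole to the append step

def should_append(row_json: str, existing_keys: set) -> bool:
    """Dedupe check: parse back, compare key against keys already in the sheet."""
    return json.loads(row_json)["dedupe_key"] not in existing_keys
```

Note the dedupe key derives only from stable identifiers (vendor, plan, scraped date), never from row numbers or wall-clock timestamps.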

Real-World Example: insurance-policy-lookup

This second example covers a harder, more realistic scenario: an insurance portal that has no public API — access is entirely through a browser, requires username and password, and sends an SMS OTP after credentials are accepted. This is a common pattern for legacy enterprise portals.

No API. Browser-only. SMS OTP. Can a skill handle this?

Yes — but with important constraints you need to understand before designing the skill.

What the skill CAN do:

  • Drive a real Chromium browser via Playwright — navigate to the portal, fill in username and password, click login
  • Detect when the OTP screen appears and pause to ask the human for the code
  • Resume once the code is provided, complete login, then scrape the data
  • Save the authenticated session cookies to disk so subsequent runs skip the login entirely (until the session expires)

What the skill CANNOT do:

  • Receive the SMS itself — the agent has no access to your phone. The human must read the OTP and type it in when prompted
  • Bypass MFA — nor should it. If the portal requires OTP, the skill must honour that gate
  • Run fully unattended on the first login — there is always a human-in-the-loop step for OTP

The practical pattern: The skill logs in once with OTP, saves the session cookie, and reuses it for subsequent runs. The human only needs to intervene when the session expires — typically once per day or per shift.

The SKILL.md below uses Playwright browser automation with a human-in-the-loop OTP pause and session cookie reuse. This is the correct design for portals with no API access and SMS-based MFA.
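The session-reuse half of that design is plain file handling and can be sketched independently of the browser. A minimal example, assuming cookies are stored in Playwright's storage format where expires is a Unix timestamp and -1 means a session cookie (the file path matches the skill below):

```python
import json
import time
from pathlib import Path

SESSION_FILE = Path(".skill-sessions/insurance-portal.json")

def load_valid_cookies(session_file: Path = SESSION_FILE, now: float = None):
    """Return saved cookies if the file exists, parses, and none have expired.

    If anything is wrong (missing file, corrupt JSON, expired cookie),
    return None so the caller falls back to the full OTP login flow.
    """
    now = time.time() if now is None else now
    try:
        cookies = json.loads(session_file.read_text())
    except (OSError, ValueError):
        return None  # missing or corrupt file: force full login
    for cookie in cookies:
        expires = cookie.get("expires", -1)
        if expires != -1 and expires < now:
            return None  # at least one cookie expired: re-authenticate
    return cookies
```

Even with valid-looking cookies, the skill still probes a known authenticated URL (workflow step 0) before trusting the session, since the portal can invalidate it server-side at any time.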

The SKILL.md (condensed)

---
name: insurance-policy-lookup
description: >
  Look up a customer's active insurance policies, renewal
  dates, coverage limits, and open claims. Uses browser
  automation to access the portal. Supports SMS OTP login
  with session reuse to minimise re-authentication.
user-invocable: true
---

## When to Use
When an agent asks to check, review, or summarise a
customer's policy details before a call, renewal, or
claims discussion.

## Auth Strategy
This skill uses Playwright to drive a real Chromium browser.
There is no API — all data is accessed through the portal UI.

Login flow:
  1. Navigate to portal login URL
  2. Fill username + password from skill credentials config
  3. Submit form → portal sends SMS OTP to registered number
  4. PAUSE: prompt human operator to enter the OTP code
  5. Fill OTP field → submit → session established
  6. Save session cookies to: .skill-sessions/insurance-portal.json

Session reuse:
  On subsequent runs, load saved cookies first.
  Probe a known authenticated URL to test session validity.
  If session is valid → skip login entirely.
  If session is expired → repeat full login flow from step 1.

Credentials storage:
  Username and password must be set in skill config, never
  hardcoded in SKILL.md. Use environment variables:
    INSURANCE_PORTAL_USER
    INSURANCE_PORTAL_PASS
  Do NOT log, print, or expose these values at any point.

## Required Inputs
- customer_id: string (preferred) OR
- customer_name: string (used for search if ID not known)

## Workflow
0. Load saved session cookies if file exists.
   Test session: navigate to /dashboard, check for login redirect.
   - Session valid → skip to step 2
   - Session invalid or no file → proceed to step 1

1. Login flow (human-in-the-loop):
   a. Open browser → navigate to portal login page
   b. Fill username field from INSURANCE_PORTAL_USER env
   c. Fill password field from INSURANCE_PORTAL_PASS env
   d. Click login button
   e. Wait for OTP input screen (max 15s timeout)
   f. PAUSE: display message to human:
      "OTP sent to your registered number. Please enter it now:"
   g. Accept OTP input from human → fill OTP field → submit
   h. Wait for dashboard to load (max 20s)
   i. If login fails: halt with "Login failed — check credentials
      or OTP. Do not retry automatically."
   j. Save session cookies to .skill-sessions/insurance-portal.json

2. Customer lookup:
   a. If customer_name only: navigate to search page,
      enter name, submit
   b. If multiple results appear: PAUSE — present list to human,
      wait for confirmation of correct customer
   c. Resolve to a single customer record

3. Policy data extraction:
   a. Navigate to customer's policy summary page
   b. For each policy row: extract policy number, product type,
      status, start date, renewal date, premium, coverage limit
   c. Flag renewal_due_soon if renewal date within 30 days

4. Claims data extraction:
   a. Navigate to customer's claims tab
   b. Count open/pending claims, extract claim reference numbers

5. Compile structured output (schema below)
6. Present to agent — do NOT store, forward, or log customer
   data without explicit confirmation

## Output Schema
| Field             | Type    | Required | Notes                          |
|-------------------|---------|----------|--------------------------------|
| customer_id       | string  | yes      |                                |
| customer_name     | string  | yes      |                                |
| policy_number     | string  | yes      | one row per policy             |
| product_type      | string  | yes      | e.g. Life, Medical, Motor      |
| status            | string  | yes      | Active / Lapsed / Pending      |
| start_date        | date    | yes      | ISO 8601                       |
| renewal_date      | date    | yes      | ISO 8601                       |
| renewal_due_soon  | boolean | yes      | true if within 30 days         |
| premium_monthly   | number  | no       | null if not shown in UI        |
| coverage_limit    | number  | no       | null if not shown in UI        |
| open_claims_count | integer | yes      | 0 if none                      |
| notes             | string  | no       | any flagged issues             |

## Error Handling
- OTP screen timeout (>15s): halt — portal may be slow or
  login already failed; tell human to retry manually
- OTP wrong / rejected by portal: halt with "OTP rejected" —
  do NOT auto-retry; ask human to request a new OTP
- Login success but dashboard not loaded in 20s: take
  screenshot for debugging, halt and report
- Session cookie file missing or corrupt: delete file,
  restart full login flow from step 1
- Customer not found: return "No customer found" — do not guess
- Multiple customer matches: pause, present list, wait for human
- Navigation error mid-scrape: take screenshot, halt and report
  — never return partial data silently
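The renewal_due_soon flag from workflow step 3c is simple date arithmetic, but it's worth pinning down in code so every run computes it identically. A sketch, with the design assumption (mine, not stated in the skill) that past-due renewals also flag as true so they are never silently dropped:

```python
from datetime import date, timedelta

def renewal_due_soon(renewal_date: date, today: date, window_days: int = 30) -> bool:
    """True if the renewal date falls within the next `window_days`.

    Past-due renewals also return True, on the assumption that an
    overdue policy needs the agent's attention at least as urgently.
    """
    return renewal_date <= today + timedelta(days=window_days)
```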

Key Design Decisions for This Skill

  • Human-in-the-loop is not a failure — it's a feature. SMS OTP cannot be automated away safely. The skill explicitly pauses, prompts the human clearly, and resumes once the code is provided. Designing this as a deliberate step (not an error state) makes the skill reliable rather than fragile.
  • Session cookie reuse eliminates most OTP interruptions. The first run of the day requires OTP. After that, the saved session handles all subsequent lookups — the human only needs to intervene again when the session expires. This brings the workflow close to "one OTP per day" in practice.
  • Credentials live in environment variables, not the SKILL.md. The skill file may be version-controlled or shared. Username and password must be referenced by env var name only — never written inline in the instructions.
  • Screenshot on unexpected failure. Browser automation fails silently if the page layout changes or an unexpected modal appears. Capturing a screenshot on error gives you the context needed to diagnose what went wrong, rather than a generic "navigation failed" message.
  • Never auto-retry on OTP rejection. An incorrect OTP followed by automatic retries can trigger an account lockout on the portal. Always halt and ask the human to request a fresh code.
  • Null over inference. If the portal UI doesn't display a premium or coverage limit for a policy, the field is null. The skill never estimates from visible context — regulated data must be accurate or absent, not guessed.

Common Mistakes to Avoid

Most skill failures trace back to one of five problems. Recognising them early saves hours of debugging.

❌ Vague description

If your description doesn't match how users phrase their requests, the skill never triggers. Test by invoking it naturally and checking which skill fires. Refine until the match is reliable.

❌ No output schema

Without a defined schema, output varies run to run. Column names drift. Fields get added or dropped. Data becomes difficult to aggregate or trust. Always define the schema before you write the workflow steps.

❌ No dedupe strategy

If a skill runs multiple times against the same data source and appends results each time, you accumulate duplicates. Define a dedupe key and check it before every write operation.

❌ No retry or fallback

Network errors, rate limits, and timeouts are not edge cases—they're certainties over time. A skill without retry and fallback logic will break silently and produce partial results with no indication of what was missed.
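The retry-with-backoff logic referenced throughout this guide fits in a few lines. A sketch with an injectable sleep function so the behaviour is testable (TransientError is a stand-in for whatever exception your fetch layer raises on rate limits and timeouts):

```python
import time

class TransientError(Exception):
    """Stand-in for rate limits, timeouts, and network blips."""

def fetch_with_retry(fetch, url: str, retries: int = 3, base_delay: float = 2.0,
                     sleep=time.sleep):
    """Call `fetch(url)`, retrying transient failures with exponential backoff.

    Waits 2s, 4s, 8s between attempts instead of hammering the endpoint.
    Re-raises after the final attempt so the failure is never silent.
    """
    for attempt in range(retries + 1):
        try:
            return fetch(url)
        except TransientError:
            if attempt == retries:
                raise  # out of attempts: surface the error, don't hide it
            sleep(base_delay * (2 ** attempt))
```

Re-raising on exhaustion is the key detail: a skill that swallows the final failure produces exactly the silent partial results this section warns about.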

❌ One skill doing too much

A skill that researches data, transforms it, formats a report, emails stakeholders, and updates three different systems is impossible to test, debug, or maintain. Split responsibilities across focused skills and chain them if needed.

Workspace vs Shared Skills: Decision Guide

Not every skill should live in the same location. Use this guide to decide:

| Situation                                     | Use Workspace Skill | Use Shared Skill |
|-----------------------------------------------|---------------------|------------------|
| Tied to one project or client                 | ✓ Yes               |                  |
| Custom output schema per project              | ✓ Yes               |                  |
| Requires specific credentials or data sources | ✓ Yes               |                  |
| Generic workflow reused across many projects  |                     | ✓ Yes            |
| Stable instructions, rarely changes           |                     | ✓ Yes            |
| Minimal environment-specific dependency       |                     | ✓ Yes            |
| Still being iterated / not yet stable         | ✓ Yes (start here)  |                  |

Practical Starter Templates

Use these as starting points. Copy, rename, and fill in the specifics for your task.

Template 1: Minimal Starter

---
name: your-skill-name
description: >
  Triggered when user asks to [natural language description
  of the task, matching how users will phrase it].
user-invocable: true
---

## When to Use
[Describe exact trigger conditions. List any cases where a
different skill should be preferred instead.]

## Required Inputs
- [Input 1]: [description, where to get it]
- [Input 2]: [description, default value if applicable]

## Workflow
1. [First step]
2. [Second step]
   a. [Sub-step if needed]
   b. [Sub-step if needed]
3. [Continue...]

## Output Format
[Table or JSON schema. Define every field.]

| Field   | Type   | Required | Notes      |
|---------|--------|----------|------------|
| field1  | string | yes      |            |
| field2  | int    | no       | default: 0 |

## Error Handling
- [Specific failure scenario]: [action to take]
- [Transient error]: retry [N] times with [Xs] backoff
- Stop condition: [when to halt and report vs. skip and continue]

Template 2: Data Pipeline Skill

---
name: data-pipeline-skill-name
description: >
  Fetch, validate, and append [data type] records to
  [destination] for [scope/filter].
user-invocable: true
---

## When to Use
When user asks to update, refresh, or populate [destination]
with [data type] data from [source].

## Required Inputs
- destination_id: [sheet ID / table name / file path]
- filter_param: [e.g., date range, region, category]

## Workflow
### Phase 1: Fetch
1. Query [source] with filter_param
2. For each result, extract required fields
3. On fetch failure: retry up to 3× with 2s backoff

### Phase 2: Validate
4. Confirm all required fields are present and correctly typed
5. Reject rows that fail validation — log to [error_log], do not append

### Phase 3: Deduplicate
6. Generate dedupe_key: [formula, e.g., slug(field1 + field2)]
7. Check destination for existing dedupe_key
8. Skip row if duplicate found

### Phase 4: Append
9. Construct output row per schema below
10. Validate JSON structure before API call
11. Append to destination
12. Confirm row count matches expected

## Output Schema
| Field       | Type   | Required | Notes                  |
|-------------|--------|----------|------------------------|
| dedupe_key  | string | yes      | slug(field1 + field2)  |
| field1      | string | yes      |                        |
| field2      | string | yes      |                        |
| source_url  | string | yes      |                        |
| confidence  | string | yes      | High / Medium / Low    |
| date_added  | date   | yes      | ISO 8601               |

## Error Handling
- Missing required field: discard row, log, continue
- Duplicate dedupe_key: skip silently, continue
- JSON invalid: log and skip, do not append partial row
- API write failure: halt, report full error to user

Template 3: Safe External Action Checklist

Use this checklist for any skill that writes to external systems (APIs, email, databases, webhooks).

## Safe External Action Checklist

Before any write/send/post operation, the skill MUST:

☐ Validate all output data against schema
☐ Check for existing record (dedupe) before create
☐ Present a summary to the user and request confirmation:
    "About to append N rows to [destination]. Proceed? (yes/no)"
☐ Only proceed on explicit "yes" — halt on anything else
☐ Log timestamp, action taken, and row count on success
☐ On failure:
    - Do not retry write operations without user confirmation
    - Report full error context: endpoint, payload, response
    - Never silently discard write failures

## Credential Rules
- Never log credential values
- Never include credentials in output data
- Use env references, not hardcoded values
- Confirm minimum required scope for the operation
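The confirmation gate at the heart of this checklist can be expressed as a small wrapper around the write operation. A sketch with an injectable ask function so the gate is testable; the prompt wording follows the checklist above, and the case-insensitive "yes" match is my assumption:

```python
def confirmed_append(rows: list, destination: str, append_fn, ask=input) -> int:
    """Gate an external write behind explicit user confirmation.

    Only an explicit "yes" proceeds; any other answer halts with zero
    rows written, matching the checklist above.
    """
    answer = ask(f"About to append {len(rows)} rows to {destination}. Proceed? (yes/no) ")
    if answer.strip().lower() != "yes":
        print("Halted: no rows written.")
        return 0
    for row in rows:
        append_fn(row)
    # Log the action and row count on success, per the checklist
    print(f"Appended {len(rows)} rows to {destination}.")
    return len(rows)
```

Because ask is a parameter, the same gate works interactively (input) and under test (a lambda), which makes the halt path as easy to verify as the happy path.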

Start With One High-Friction Task

You don't need to migrate your entire workflow to skills on day one. The best approach is to pick the one task you run most often, where results are most inconsistent, and where cleanup takes the most time.

Convert that single task to a skill with strict output and guardrails. Run it. Measure the difference:

  • Fewer manual corrections — because the schema enforces structure from the start
  • Fewer retries — because error handling and backoff are built in, not improvised
  • Faster delivery consistency — because the workflow is defined once and followed every time

Once you see the difference on one skill, the approach scales naturally. Build the next one. Promote the stable ones to shared. Over time, your OpenClaw instance becomes a toolkit of reliable, auditable automations—not a collection of prompts you have to re-explain every session.

Need Help Building Your First Skill?

If you're working on a research, data pipeline, or automation task and want to structure it as a reliable OpenClaw skill, get in touch. We can help you define the workflow, output schema, and quality gates—so your automation works the first time and every time after that.


About TechSona: We build reliable automations, modern web applications, and AI-assisted workflows for businesses that care about quality and consistency. From OpenClaw skill design to full-stack development, we help teams get repeatable results from their tools.