OpenClaw Skills Guide: Workflow, Structure, and Best Practices
If you've been using OpenClaw with ad-hoc prompts, you've probably hit prompt drift—the agent takes a slightly different approach each run, the output schema shifts, columns get renamed, rows get duplicated. You end up spending more time cleaning results than you saved on research.
OpenClaw Skills solve this. A skill is a declarative, reusable instruction set that locks in your workflow, your output format, and your quality gates—so every run produces consistent, trustworthy results. This guide walks you through everything: anatomy, load order, security, a real-world example, and copy-paste starter templates.
What Is an OpenClaw Skill?
At its core, a skill is a folder containing a SKILL.md file. That file has two parts: a YAML frontmatter block that tells OpenClaw how to identify and trigger the skill, and a Markdown body that gives the agent its exact instructions.
Think of it as the difference between telling a junior analyst "go research property launches and put them in a sheet" every week versus handing them a proper SOP document. The SOP covers what to look for, which sources to trust, how to format the output, and what to do when something goes wrong. Same idea here. Writing that SOP down as a skill buys you four things:
- Repeatability: The same task produces the same structure every time, regardless of how you phrase the request.
- Reduced improvisation: The agent doesn't guess at output format—it follows your defined schema.
- Quality enforcement: Validation rules, source tiers, and confidence labels are baked in, not bolted on.
- Shareability: A well-written skill can be promoted from a single project to your entire team.
Minimum SKILL.md Anatomy
A valid SKILL.md needs at minimum two frontmatter fields and a structured body. Here's the breakdown:
Frontmatter
---
name: your-skill-name
description: >
  Triggered when the user asks to [describe the task in natural
  language matching how users will invoke it].
user-invocable: true
---
- name: Unique identifier used for loading and conflict resolution. Use kebab-case.
- description: This is your trigger language. OpenClaw matches user intent to this field—if your wording doesn't reflect how users actually ask, the skill won't fire. Write it like a user request, not a technical label.
- user-invocable: true: Optional. Marks the skill as directly callable by users. Omit for internal/chained skills.
Body Structure
The body is where you write the actual instructions. Structure it like an SOP, not a paragraph. Every skill body should cover these sections:
1. When to Use
Define the exact trigger conditions. What user request activates this skill? Are there situations where a different skill should be used instead?
2. Required Inputs
List every variable the skill needs to run: target location, date range, sheet ID, API keys, etc. If an input is missing, the skill should ask—not guess.
3. Step-by-Step Workflow
Numbered steps. Not prose. The agent follows these exactly. Include sub-steps where needed.
4. Output Format / Schema
Define every field: column name, type, allowed values, and whether it's required or optional. No ambiguity. If it's going to a Google Sheet, list the exact column headers in order.
5. Error Handling & Stop Conditions
What should the agent do when a fetch fails? When a required field is missing? When the output doesn't pass validation? Define retries, fallbacks, and when to halt and report to the user.
Tip
Write your skill body like a checklist, not a blog paragraph. The agent reads instructions literally. Vague prose like "gather relevant data" produces vague results. Specific steps like "search for projects with completion year between 2028–2030, filter by state: Selangor or KL" produce reliable ones.
Skill Locations & Load Order
OpenClaw loads skills from three locations, in this priority order:
| Priority | Location | Best For |
|---|---|---|
| 1 (Highest) | <workspace>/skills | Project-specific skills, custom output schemas, client-specific data sources |
| 2 | ~/.openclaw/skills | Personal reusable skills shared across projects on your machine |
| 3 (Lowest) | Bundled skills | Default OpenClaw capabilities (overridable) |
Name conflict resolution: If two skills share the same name, the higher-priority location wins. This means you can override a bundled or shared skill by creating a workspace-level skill with the same name—useful when you need project-specific behavior without touching the shared version.
Practical rule of thumb: Start with a workspace skill. Once it's stable and you find yourself copying it to new projects repeatedly, promote it to ~/.openclaw/skills.
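The resolution rule is simple enough to sketch. The following Python snippet is illustrative only (it is not OpenClaw's actual loader, and the location values are made up); it shows how first-wins merging by priority produces the override behavior described above:

```python
def resolve_skills(locations):
    """Merge skill registries given highest-priority location first.

    locations: list of dicts mapping skill name -> skill path.
    A name already claimed by a higher-priority location is never
    overwritten by a lower-priority one.
    """
    resolved = {}
    for skill_dir in locations:
        for name, path in skill_dir.items():
            resolved.setdefault(name, path)  # first (highest-priority) wins
    return resolved


# Hypothetical example: a workspace skill shadows a personal one of the same name
workspace = {"price-monitor": "ws/skills/price-monitor"}
personal = {"price-monitor": "~/.openclaw/skills/price-monitor",
            "summarise": "~/.openclaw/skills/summarise"}
bundled = {"summarise": "bundled/summarise"}

skills = resolve_skills([workspace, personal, bundled])
```

Here `skills["price-monitor"]` resolves to the workspace copy, while `summarise` falls through to the personal location because the workspace does not define it.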
Recommended Skill Creation Workflow
Building a skill properly is an 8-step process. Skipping steps—especially output schema and error handling—is the root cause of most skill failures.
1. Define the objective
Write one sentence: "This skill [does exactly what] when the user [says what]." If you can't write that sentence, you're not ready to write the skill.
2. Write the trigger language
Match phrasing to how your users actually invoke tasks. Test: ask OpenClaw the task in natural language—does it select the right skill? If not, refine the description.
3. Define strict output
List every field: name, type, constraints. For sheets: column order matters. For JSON: include a sample. Don't leave any field "flexible"—flexibility is where inconsistency hides.
4. Add quality gates
Validation rules (e.g., "completion year must be 2028–2031"), source tiers (Tier 1 = official developer site, Tier 2 = reputable property portal, Tier 3 = social media), and confidence labels.
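To make a quality gate concrete, here is a hedged Python sketch of what the validation rules above might look like as a helper for the skill. The field names, year range, and tier labels come from the examples in this guide; adapt them to your own schema:

```python
VALID_TIERS = {"Tier 1", "Tier 2", "Tier 3"}
VALID_CONFIDENCE = {"High", "Medium", "Low"}


def quality_gate(record):
    """Return a list of validation errors; an empty list means the record passes."""
    errors = []
    year = record.get("completion_year")
    if not isinstance(year, int) or not 2028 <= year <= 2031:
        errors.append("completion_year must be an integer in 2028-2031")
    if record.get("source_tier") not in VALID_TIERS:
        errors.append("source_tier must be Tier 1 / Tier 2 / Tier 3")
    if record.get("confidence") not in VALID_CONFIDENCE:
        errors.append("confidence must be High / Medium / Low")
    return errors
```

A record either passes cleanly or comes back with a named reason for every failure, which is exactly what you want to log when a row gets rejected.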
5. Add failure handling
Specify retry count, backoff strategy for transient network failures, and what to output when a required field can't be filled. Define when to stop and escalate versus when to skip and continue.
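As a sketch of that retry policy in Python (the retry count and backoff values mirror the examples later in this guide; tune them to your source):

```python
import time


def fetch_with_retry(fetch, retries=3, backoff_s=2.0, sleep=time.sleep):
    """Call fetch(); on a transient error, wait and try again, up to `retries` attempts."""
    for attempt in range(1, retries + 1):
        try:
            return fetch()
        except ConnectionError:
            if attempt == retries:
                raise  # out of retries: escalate instead of failing silently
            sleep(backoff_s * attempt)  # wait grows with each attempt
```

The `sleep` parameter is injectable so the policy itself is testable without real waits; the final re-raise is the "stop and escalate" half of the rule above.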
6. Create and test on real data
Run the skill once with real inputs. Check every output field. Don't test on synthetic data—edge cases only surface with real sources.
7. Iterate on weaknesses
Every failure is a missing instruction. After each bad run, add the missing constraint. Skills improve over iterations, not rewrites.
8. Promote (optional)
When the skill is stable across multiple runs and you're copying it to new projects, move it to ~/.openclaw/skills for shared access.
Security Checklist
Skills execute with the permissions of your OpenClaw instance. That means a skill with access to external APIs, file systems, or credentials can cause real damage if compromised or carelessly written. Treat third-party skills the same way you'd treat third-party code: as untrusted until reviewed.
Before enabling any skill—especially third-party ones:
- ☐ Read the full SKILL.md and any referenced scripts. Understand every step the skill will execute before you run it.
- ☐ Audit credential exposure. Check every apiKey and env reference. Does the skill really need those credentials? Could it exfiltrate them?
- ☐ Disable skills you don't use. An unused skill that has access to sensitive APIs is an unnecessary attack surface.
- ☐ Prefer least-privilege tools. If a skill can accomplish its goal with read-only access, don't give it write access.
- ☐ Require explicit approval for external actions. Any skill that writes to external systems (sheets, APIs, email) should have a confirmation step—never auto-execute silently.
- ☐ Version-control your workspace skills. If a skill gets modified by a compromised process, you want git history to catch it.
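The explicit-approval item above can be as small as a single gate function. A minimal sketch, assuming the skill can prompt the operator (the function name is illustrative, not an OpenClaw API):

```python
def confirm_external_write(summary, ask=input):
    """Gate any write/send/post: only an explicit 'yes' lets the action proceed."""
    reply = ask(f"{summary} Proceed? (yes/no): ")
    return reply.strip().lower() == "yes"
```

Anything other than an explicit yes (including an empty reply, or a bare "y") halts the action, which matches the never-auto-execute rule.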
Real-World Example: saas-price-monitor
Here's what a production skill looks like in practice. This skill automates monitoring of competitor SaaS pricing pages, extracts plan details, and appends deduplicated change records to a Google Sheet for review.
The SKILL.md (condensed)
---
name: saas-price-monitor
description: >
Monitor and track pricing changes across a defined list of
SaaS competitor pricing pages. Extract plan names, prices,
and feature limits. Append new or changed records to Google Sheet.
user-invocable: true
---
## When to Use
When user asks to check, refresh, or update the competitor
pricing tracker sheet.
## Required Inputs
- Google Sheet ID (from user config or prompt)
- Competitor URL list (from skill config file)
## Workflow
1. For each URL in competitor list:
a. Fetch pricing page content
b. Extract: vendor name, plan name, monthly price,
annual price, key limits (seats, storage, API calls)
c. Assign source tier (Tier 1 = official pricing page,
Tier 2 = review site, Tier 3 = cached/third-party)
d. Assign confidence label (High / Medium / Low)
e. Generate dedupe key: slug(vendor + plan_name + scraped_date)
2. Check sheet for existing dedupe key — skip if duplicate
3. Validate JSON row before appending
4. Append new/changed rows to sheet via Sheets API
## Output Schema
| Field | Type | Required | Notes |
|----------------|---------|----------|------------------------------|
| vendor | string | yes | |
| plan_name | string | yes | |
| price_monthly | number | no | USD, null if not listed |
| price_annual | number | no | USD, null if not listed |
| seat_limit | string | no | "Unlimited" if not capped |
| source_url | string | yes | Primary source |
| source_tier | string | yes | Tier 1 / Tier 2 / Tier 3 |
| confidence | string | yes | High / Medium / Low |
| dedupe_key | string | yes | slug(vendor+plan+date) |
| scraped_date | date | yes | ISO 8601 |
## Error Handling
- Transient fetch failure: retry up to 3 times with 2s backoff
- JSON validation failure: log row, skip append, continue
- Missing required field (vendor / plan_name):
discard the row entirely, do not append partial data
- Sheet API error: halt and report full error to user
Reliability Lessons Learned
- Avoid inline JSON in shell commands. Constructing JSON strings inside shell commands with escaping is fragile. Build the JSON object in a separate step, validate it, then pass it to the append command.
- Validate before appending. A malformed row that gets into your sheet means manual cleanup. A failed validation that skips the row means a clean sheet and a log entry to review. The second outcome is far better.
- Dedupe keys prevent accumulation drift. Without a dedupe key, every run risks appending the same record again. The key should be derived from stable identifiers—not row numbers or timestamps.
- Retry with backoff, not immediate retry. Transient failures (API rate limits, network blips) resolve with a short wait. Immediate retries just hammer the same failing endpoint.
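Two of those lessons, dedupe keys and validate-before-append, can be sketched in a few lines of Python. This is an illustrative helper rather than part of the skill file itself; the REQUIRED set mirrors the schema above:

```python
import re

REQUIRED = {"vendor", "plan_name", "source_url", "source_tier",
            "confidence", "dedupe_key", "scraped_date"}


def slug(*parts):
    """Stable, URL-safe key derived from the parts themselves, not row position."""
    text = "-".join(str(p).lower() for p in parts)
    return re.sub(r"[^a-z0-9]+", "-", text).strip("-")


def dedupe_key(vendor, plan_name, scraped_date):
    return slug(vendor, plan_name, scraped_date)


def rows_to_append(candidates, existing_keys):
    """Validate and deduplicate before any write; never append a partial row."""
    out = []
    for row in candidates:
        if not REQUIRED <= row.keys():
            continue  # missing required field: log and skip in a real run
        if row["dedupe_key"] in existing_keys:
            continue  # already in the sheet: skip, do not re-append
        out.append(row)
    return out
```

Because the key is built from vendor, plan, and date rather than row numbers or timestamps, re-running the skill against the same day's data produces the same keys and therefore zero new rows.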
Real-World Example: insurance-policy-lookup
This second example covers a harder, more realistic scenario: an insurance portal that has no public API — access is entirely through a browser, requires username and password, and sends an SMS OTP after credentials are accepted. This is a common pattern for legacy enterprise portals.
No API. Browser-only. SMS OTP. Can a skill handle this?
Yes — but with important constraints you need to understand before designing the skill.
What the skill CAN do:
- Drive a real Chromium browser via Playwright — navigate to the portal, fill in username and password, click login
- Detect when the OTP screen appears and pause to ask the human for the code
- Resume once the code is provided, complete login, then scrape the data
- Save the authenticated session cookies to disk so subsequent runs skip the login entirely (until the session expires)
What the skill CANNOT do:
- Receive the SMS itself — the agent has no access to your phone. The human must read the OTP and type it in when prompted
- Bypass MFA — nor should it. If the portal requires OTP, the skill must honour that gate
- Run fully unattended on the first login — there is always a human-in-the-loop step for OTP
The practical pattern: The skill logs in once with OTP, saves the session cookie, and reuses it for subsequent runs. The human only needs to intervene when the session expires — typically once per day or per shift.
The SKILL.md below uses Playwright browser automation with a human-in-the-loop OTP pause and session cookie reuse. This is the correct design for portals with no API access and SMS-based MFA.
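The session-reuse half of that design is plain file handling and needs no browser at all. Here is an illustrative Python sketch (the file path matches the skill below; the max-age value is an assumption, since real expiry is whatever the portal enforces, and the actual login step would run through Playwright):

```python
import json
import time
from pathlib import Path

SESSION_FILE = Path(".skill-sessions/insurance-portal.json")


def save_session(cookies, path=SESSION_FILE):
    """Persist cookies with a timestamp after a successful OTP login."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"saved_at": time.time(), "cookies": cookies}))


def load_valid_session(path=SESSION_FILE, max_age_s=8 * 3600, now=time.time):
    """Return saved cookies if fresh enough, else None (meaning: do a full login)."""
    if not path.exists():
        return None
    try:
        blob = json.loads(path.read_text())
    except json.JSONDecodeError:
        path.unlink()  # corrupt file: delete it and force a fresh login
        return None
    if now() - blob["saved_at"] > max_age_s:
        return None
    return blob["cookies"]
```

On each run the skill calls `load_valid_session()` first; only a `None` result triggers the human-in-the-loop OTP flow, which ends with `save_session()`.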
The SKILL.md (condensed)
---
name: insurance-policy-lookup
description: >
Look up a customer's active insurance policies, renewal
dates, coverage limits, and open claims. Uses browser
automation to access the portal. Supports SMS OTP login
with session reuse to minimise re-authentication.
user-invocable: true
---
## When to Use
When an agent asks to check, review, or summarise a
customer's policy details before a call, renewal, or
claims discussion.
## Auth Strategy
This skill uses Playwright to drive a real Chromium browser.
There is no API — all data is accessed through the portal UI.
Login flow:
1. Navigate to portal login URL
2. Fill username + password from skill credentials config
3. Submit form → portal sends SMS OTP to registered number
4. PAUSE: prompt human operator to enter the OTP code
5. Fill OTP field → submit → session established
6. Save session cookies to: .skill-sessions/insurance-portal.json
Session reuse:
On subsequent runs, load saved cookies first.
Probe a known authenticated URL to test session validity.
If session is valid → skip login entirely.
If session is expired → repeat full login flow from step 1.
Credentials storage:
Username and password must be set in skill config, never
hardcoded in SKILL.md. Use environment variables:
INSURANCE_PORTAL_USER
INSURANCE_PORTAL_PASS
Do NOT log, print, or expose these values at any point.
## Required Inputs
- customer_id: string (preferred) OR
- customer_name: string (used for search if ID not known)
## Workflow
0. Load saved session cookies if file exists.
Test session: navigate to /dashboard, check for login redirect.
- Session valid → skip to step 2
- Session invalid or no file → proceed to step 1
1. Login flow (human-in-the-loop):
a. Open browser → navigate to portal login page
b. Fill username field from INSURANCE_PORTAL_USER env
c. Fill password field from INSURANCE_PORTAL_PASS env
d. Click login button
e. Wait for OTP input screen (max 15s timeout)
f. PAUSE: display message to human:
"OTP sent to your registered number. Please enter it now:"
g. Accept OTP input from human → fill OTP field → submit
h. Wait for dashboard to load (max 20s)
i. If login fails: halt with "Login failed — check credentials
or OTP. Do not retry automatically."
j. Save session cookies to .skill-sessions/insurance-portal.json
2. Customer lookup:
a. If customer_name only: navigate to search page,
enter name, submit
b. If multiple results appear: PAUSE — present list to human,
wait for confirmation of correct customer
c. Resolve to a single customer record
3. Policy data extraction:
a. Navigate to customer's policy summary page
b. For each policy row: extract policy number, product type,
status, start date, renewal date, premium, coverage limit
c. Flag renewal_due_soon if renewal date within 30 days
4. Claims data extraction:
a. Navigate to customer's claims tab
b. Count open/pending claims, extract claim reference numbers
5. Compile structured output (schema below)
6. Present to agent — do NOT store, forward, or log customer
data without explicit confirmation
## Output Schema
| Field | Type | Required | Notes |
|-------------------|---------|----------|--------------------------------|
| customer_id | string | yes | |
| customer_name | string | yes | |
| policy_number | string | yes | one row per policy |
| product_type | string | yes | e.g. Life, Medical, Motor |
| status | string | yes | Active / Lapsed / Pending |
| start_date | date | yes | ISO 8601 |
| renewal_date | date | yes | ISO 8601 |
| renewal_due_soon | boolean | yes | true if within 30 days |
| premium_monthly | number | no | null if not shown in UI |
| coverage_limit | number | no | null if not shown in UI |
| open_claims_count | integer | yes | 0 if none |
| notes | string | no | any flagged issues |
## Error Handling
- OTP screen timeout (>15s): halt — portal may be slow or
login already failed; tell human to retry manually
- OTP wrong / rejected by portal: halt with "OTP rejected" —
do NOT auto-retry; ask human to request a new OTP
- Login success but dashboard not loaded in 20s: take
screenshot for debugging, halt and report
- Session cookie file missing or corrupt: delete file,
restart full login flow from step 1
- Customer not found: return "No customer found" — do not guess
- Multiple customer matches: pause, present list, wait for human
- Navigation error mid-scrape: take screenshot, halt and report
— never return partial data silently
Key Design Decisions for This Skill
- Human-in-the-loop is not a failure — it's a feature. SMS OTP cannot be automated away safely. The skill explicitly pauses, prompts the human clearly, and resumes once the code is provided. Designing this as a deliberate step (not an error state) makes the skill reliable rather than fragile.
- Session cookie reuse eliminates most OTP interruptions. The first run of the day requires OTP. After that, the saved session handles all subsequent lookups — the human only needs to intervene again when the session expires. This brings the workflow close to "one OTP per day" in practice.
- Credentials live in environment variables, not the SKILL.md. The skill file may be version-controlled or shared. Username and password must be referenced by env var name only — never written inline in the instructions.
- Screenshot on unexpected failure. Browser automation fails silently if the page layout changes or an unexpected modal appears. Capturing a screenshot on error gives you the context needed to diagnose what went wrong, rather than a generic "navigation failed" message.
- Never auto-retry on OTP rejection. An incorrect OTP followed by automatic retries can trigger an account lockout on the portal. Always halt and ask the human to request a fresh code.
- Null over inference. If the portal UI doesn't display a premium or coverage limit for a policy, the field is null. The skill never estimates from visible context — regulated data must be accurate or absent, not guessed.
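The never-auto-retry rule for OTP can be enforced in code as well as in instructions. A hedged sketch (the six-digit format is an assumption, so check what your portal actually issues; the class and function names are the sketch's own):

```python
import re


class OTPRejected(RuntimeError):
    """Raised so the skill halts instead of auto-retrying (which risks a lockout)."""


def collect_otp(read=input):
    """Ask the human for the SMS code once; validate the shape only, never retry."""
    code = read("OTP sent to your registered number. Please enter it now: ").strip()
    if not re.fullmatch(r"\d{6}", code):
        raise OTPRejected("Code must be 6 digits. Request a new OTP and rerun.")
    return code
```

A malformed or rejected code raises rather than loops: the human requests a fresh OTP and reruns the skill, keeping the portal's lockout counter out of danger.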
Common Mistakes to Avoid
Most skill failures trace back to one of five problems. Recognising them early saves hours of debugging.
❌ Vague description
If your description doesn't match how users phrase their requests, the skill never triggers. Test by invoking it naturally and checking which skill fires. Refine until the match is reliable.
❌ No output schema
Without a defined schema, output varies run to run. Column names drift. Fields get added or dropped. Data becomes difficult to aggregate or trust. Always define the schema before you write the workflow steps.
❌ No dedupe strategy
If a skill runs multiple times against the same data source and appends results each time, you accumulate duplicates. Define a dedupe key and check it before every write operation.
❌ No retry or fallback
Network errors, rate limits, and timeouts are not edge cases—they're certainties over time. A skill without retry and fallback logic will break silently and produce partial results with no indication of what was missed.
❌ One skill doing too much
A skill that researches data, transforms it, formats a report, emails stakeholders, and updates three different systems is impossible to test, debug, or maintain. Split responsibilities across focused skills and chain them if needed.
Workspace vs Shared Skills: Decision Guide
Not every skill should live in the same location. Use this guide to decide:
| Situation | Use Workspace Skill | Use Shared Skill |
|---|---|---|
| Tied to one project or client | ✓ Yes | — |
| Custom output schema per project | ✓ Yes | — |
| Requires specific credentials or data sources | ✓ Yes | — |
| Generic workflow reused across many projects | — | ✓ Yes |
| Stable instructions, rarely changes | — | ✓ Yes |
| Minimal environment-specific dependency | — | ✓ Yes |
| Still being iterated / not yet stable | ✓ Yes (start here) | — |
Practical Starter Templates
Use these as starting points. Copy, rename, and fill in the specifics for your task.
Template 1: Minimal Starter
---
name: your-skill-name
description: >
  Triggered when user asks to [natural language description of the
  task, matching how users will phrase it].
user-invocable: true
---
## When to Use
[Describe exact trigger conditions. List any cases where a
different skill should be preferred instead.]
## Required Inputs
- [Input 1]: [description, where to get it]
- [Input 2]: [description, default value if applicable]
## Workflow
1. [First step]
2. [Second step]
   a. [Sub-step if needed]
   b. [Sub-step if needed]
3. [Continue...]
## Output Format
[Table or JSON schema. Define every field.]
| Field  | Type   | Required | Notes      |
|--------|--------|----------|------------|
| field1 | string | yes      |            |
| field2 | int    | no       | default: 0 |
## Error Handling
- [Specific failure scenario]: [action to take]
- [Transient error]: retry [N] times with [Xs] backoff
- Stop condition: [when to halt and report vs. skip and continue]
Template 2: Data Pipeline Skill
---
name: data-pipeline-skill-name
description: >
  Fetch, validate, and append [data type] records to [destination]
  for [scope/filter].
user-invocable: true
---
## When to Use
When user asks to update, refresh, or populate [destination] with
[data type] data from [source].
## Required Inputs
- destination_id: [sheet ID / table name / file path]
- filter_param: [e.g., date range, region, category]
## Workflow
### Phase 1: Fetch
1. Query [source] with filter_param
2. For each result, extract required fields
3. On fetch failure: retry up to 3× with 2s backoff
### Phase 2: Validate
4. Confirm all required fields are present and correctly typed
5. Reject rows that fail validation — log to [error_log], do not append
### Phase 3: Deduplicate
6. Generate dedupe_key: [formula, e.g., slug(field1 + field2)]
7. Check destination for existing dedupe_key
8. Skip row if duplicate found
### Phase 4: Append
9. Construct output row per schema below
10. Validate JSON structure before API call
11. Append to destination
12. Confirm row count matches expected
## Output Schema
| Field      | Type   | Required | Notes                 |
|------------|--------|----------|-----------------------|
| dedupe_key | string | yes      | slug(field1 + field2) |
| field1     | string | yes      |                       |
| field2     | string | yes      |                       |
| source_url | string | yes      |                       |
| confidence | string | yes      | High / Medium / Low   |
| date_added | date   | yes      | ISO 8601              |
## Error Handling
- Missing required field: discard row, log, continue
- Duplicate dedupe_key: skip silently, continue
- JSON invalid: log and skip, do not append partial row
- API write failure: halt, report full error to user
Template 3: Safe External Action Checklist
Use this checklist for any skill that writes to external systems (APIs, email, databases, webhooks).
## Safe External Action Checklist
Before any write/send/post operation, the skill MUST:
☐ Validate all output data against schema
☐ Check for existing record (dedupe) before create
☐ Present a summary to the user and request confirmation:
"About to append N rows to [destination]. Proceed? (yes/no)"
☐ Only proceed on explicit "yes" — halt on anything else
☐ Log timestamp, action taken, and row count on success
☐ On failure:
- Do not retry write operations without user confirmation
- Report full error context: endpoint, payload, response
- Never silently discard write failures
## Credential Rules
- Never log credential values
- Never include credentials in output data
- Use env references, not hardcoded values
- Confirm minimum required scope for the operation
Start With One High-Friction Task
You don't need to migrate your entire workflow to skills on day one. The best approach is to pick the one task you run most often, where results are most inconsistent, and where cleanup takes the most time.
Convert that single task to a skill with strict output and guardrails. Run it. Measure the difference:
- Fewer manual corrections — because the schema enforces structure from the start
- Fewer retries — because error handling and backoff are built in, not improvised
- Faster delivery consistency — because the workflow is defined once and followed every time
Once you see the difference on one skill, the approach scales naturally. Build the next one. Promote the stable ones to shared. Over time, your OpenClaw instance becomes a toolkit of reliable, auditable automations—not a collection of prompts you have to re-explain every session.
Need Help Building Your First Skill?
If you're working on a research, data pipeline, or automation task and want to structure it as a reliable OpenClaw skill, get in touch. We can help you define the workflow, output schema, and quality gates—so your automation works the first time and every time after that.
Let's Talk →
About TechSona: We build reliable automations, modern web applications, and AI-assisted workflows for businesses that care about quality and consistency. From OpenClaw skill design to full-stack development, we help teams get repeatable results from their tools.