Build Skill-Augmented AI Agents with SkillNet for Search, Evaluation, Graph Analysis, and Task Planning
This tutorial demonstrates building skill-augmented AI agents using SkillNet, covering skill discovery, installation, evaluation, graph analysis, and task planning.
Intelligence Insights
The Big Picture
The article provides a step-by-step tutorial on using SkillNet to create AI agents enhanced with reusable skills. It begins by setting up a SkillNet client with SDK and REST fallback, then compares keyword and semantic search for finding skills. Users install curated skills from GitHub, inspect their metadata, and evaluate them across five quality dimensions (safety, completeness, executability, maintainability, cost awareness) using a quality gate. Relationships between skills are analyzed and visualized as a graph using NetworkX. Finally, a skill-augmented agent planner decomposes a complex goal into subtasks, discovers relevant skills, filters them, and assembles an execution pipeline, showcasing how to build modular AI systems.
Why It Matters
SkillNet introduces a modular, reusable skill ecosystem that lets developers build AI agents by composing pre-vetted, quality-gated components rather than coding from scratch. This shifts agent development from monolithic models to flexible, skill-augmented pipelines, enabling faster prototyping and more reliable task execution. As AI agents become more common, such skill marketplaces could standardize how capabilities are shared and evaluated, much like app stores did for mobile software.
In this tutorial, we implement a SkillNet use case as a practical framework for discovering, installing, inspecting, evaluating, and organizing reusable AI skills. We start by setting up a robust SkillNet client with SDK and REST fallback support, then compare keyword search with semantic search to understand how skills can be found for different task requirements. From there, we install curated skills from GitHub, inspect their metadata, apply a quality gate across key evaluation dimensions, and visualize relationships between skills as a graph. Finally, we build a skill-augmented agent planner that breaks a complex goal into subtasks, discovers relevant skills, filters them, and assembles an execution pipeline.
We install the required dependencies and prepare the basic environment for the SkillNet tutorial. We configure API keys, model settings, GitHub options, and working directories to ensure the rest of the workflow runs smoothly. We also define a reusable banner function to keep the tutorial output organized and readable.
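The setup cell itself is not reproduced here, so the following is only a minimal sketch of what it might look like, inferred from the names the later code relies on (`API_KEY`, `BASE_URL`, `MODEL`, `GITHUB_TOKEN`, `GITHUB_MIRROR`, `REST_BASE`, `WORKDIR`, `SKILLS_DIR`, `banner`). The environment-variable names, the default values, and the placeholder URLs are assumptions and should be replaced with the values from the SkillNet documentation; only the `./skillnet_demo/my_skills` path is taken from the tutorial's own output.

```python
import json
import os
import pathlib
import re
import textwrap

try:
    import requests  # third-party; needed by the REST fallback and the planner
except ImportError:
    requests = None  # the later cells will report REST failures instead of crashing here

# All environment-variable names and defaults below are assumptions.
API_KEY = os.environ.get("API_KEY", "")              # empty -> offline mock paths
BASE_URL = os.environ.get("BASE_URL", "https://example.com/v1")    # placeholder
MODEL = os.environ.get("MODEL", "your-model-name")                 # placeholder
GITHUB_TOKEN = os.environ.get("GITHUB_TOKEN", "")
GITHUB_MIRROR = os.environ.get("GITHUB_MIRROR", "")
REST_BASE = os.environ.get("REST_BASE", "https://example.com/api")  # placeholder

# Working directories; the skills path matches the one printed in step 3.
WORKDIR = pathlib.Path("skillnet_demo")
SKILLS_DIR = WORKDIR / "my_skills"
SKILLS_DIR.mkdir(parents=True, exist_ok=True)


def banner(title):
    """Print a section header so each tutorial step is easy to spot."""
    line = "=" * 78
    print(f"\n{line}\n{title}\n{line}")


banner("0) Setup complete")
```

Leaving `API_KEY` empty is a valid configuration: the evaluation, relationship analysis, and planning steps each check for it and fall back to offline behavior.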
banner("1) Initialize SkillNet client (SDK with REST fallback)")
USE_SDK = False
client = None
try:
    from skillnet_ai import SkillNetClient
    client = SkillNetClient(
        api_key=API_KEY or None,
        base_url=BASE_URL,
        github_token=GITHUB_TOKEN or None,
    )
    USE_SDK = True
    print("SDK loaded: skillnet_ai.SkillNetClient")
except Exception as e:
    print(f"SDK unavailable ({e!r}); using REST fallback for search/download.")

def _norm(item):
    if isinstance(item, dict):
        g = item.get
    else:
        g = lambda k, d=None: getattr(item, k, d)
    return {
        "skill_name": g("skill_name") or g("name") or "?",
        "skill_description": g("skill_description") or g("description") or "",
        "author": g("author") or "",
        "stars": g("stars") or 0,
        "skill_url": g("skill_url") or g("url") or "",
        "category": g("category") or "",
    }

def search(q, mode="keyword", limit=5, min_stars=0, sort_by="stars", threshold=0.8):
    if USE_SDK:
        try:
            kw = dict(q=q, limit=limit, mode=mode)
            if mode == "keyword":
                kw.update(min_stars=min_stars, sort_by=sort_by)
            else:
                kw.update(threshold=threshold)
            res = client.search(**kw)
            return [_norm(x) for x in (res or [])]
        except Exception as e:
            print(f" [SDK search failed -> REST] {e!r}")
    params = {"q": q, "mode": mode, "limit": limit}
    if mode == "keyword":
        params.update(min_stars=min_stars, sort_by=sort_by)
    else:
        params.update(threshold=threshold)
    try:
        r = requests.get(f"{REST_BASE}/search", params=params, timeout=30)
        r.raise_for_status()
        return [_norm(x) for x in r.json().get("data", [])]
    except Exception as e:
        print(f" [REST search failed] {e!r}")
        return []

def show_results(results, title=""):
    if title:
        print(f"\n-- {title} --")
    if not results:
        print(" (no results / endpoint unreachable)")
        return
    for i, s in enumerate(results, 1):
        desc = textwrap.shorten(s["skill_description"], 70, placeholder="...")
        print(f" {i}. {s['skill_name']:<34} ⭐{s['stars']:<5} [{s['category']}]")
        if desc:
            print(f" {desc}")

banner("2) Search: keyword vs. semantic (vector)")
kw_hits = search("pdf", mode="keyword", limit=5, sort_by="stars")
show_results(kw_hits, "keyword: 'pdf' (sorted by stars)")
vec_hits = search("analyze financial reports from documents",
                  mode="vector", limit=5, threshold=0.80)
show_results(vec_hits, "vector: 'analyze financial reports from documents'")
We initialize the SkillNet client and provide a REST fallback, so the tutorial remains usable even if the SDK does not work. We define helper functions to normalize search results and perform both keyword and semantic searches. We then compare a keyword search for PDF-related skills with a vector search for analyzing financial reports from documents.
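To see why the normalization helper matters, here is a small self-contained sketch of the same idea. Results may arrive as plain dictionaries (as from a REST response) or as objects with attributes (as an SDK might return), and the field names may differ between the two. The `FakeSkill` class and the sample records are made up for illustration; only the field names mirror the tutorial's `_norm` helper.

```python
from dataclasses import dataclass


@dataclass
class FakeSkill:
    """Hypothetical stand-in for an SDK result object."""
    name: str
    description: str = ""
    stars: int = 0


def normalize(item):
    """Map a dict or an object onto one common dict shape."""
    if isinstance(item, dict):
        get = item.get
    else:
        def get(key, default=None):
            return getattr(item, key, default)
    return {
        "skill_name": get("skill_name") or get("name") or "?",
        "skill_description": get("skill_description") or get("description") or "",
        "stars": get("stars") or 0,
        "skill_url": get("skill_url") or get("url") or "",
    }


# Invented sample data: one dict-style record, one object-style record.
rest_style = {"skill_name": "pdf-tools", "stars": 12,
              "url": "https://example.com/pdf-tools"}
sdk_style = FakeSkill(name="report-reader", description="Reads reports")

rows = [normalize(rest_style), normalize(sdk_style)]
for row in rows:
    print(row["skill_name"], row["stars"], row["skill_url"] or "(no url)")
```

Because every downstream step reads the same keys, the display, installation, and planning code does not need to know which backend produced a result.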
banner("3) Install skills (download from GitHub into ./skillnet_demo/my_skills)")
CURATED = [
    "https://github.com/anthropics/skills/tree/main/skills/skill-creator",
    "https://github.com/anthropics/skills/tree/main/skills/algorithmic-art",
]
for s in (kw_hits + vec_hits):
    if s["skill_url"] and s["skill_url"] not in CURATED:
        CURATED.append(s["skill_url"])
CURATED = CURATED[:4]

def download(url, target_dir):
    if USE_SDK:
        try:
            kw = {}
            if GITHUB_MIRROR:
                kw["mirror"] = GITHUB_MIRROR
            return client.download(url=url, target_dir=str(target_dir), **kw)
        except TypeError:
            return client.download(url=url, target_dir=str(target_dir))
        except Exception as e:
            print(f" download failed for {url}: {e!r}")
            return None
    print(" (SDK not present — skipping live download for this URL)")
    return None

installed = []
for url in CURATED:
    print(f" downloading: {url}")
    path = download(url, SKILLS_DIR)
    if path:
        installed.append(path)
        print(f" -> {path}")
print(f"\nInstalled {len(installed)} skill(s).")
We create a curated list of useful SkillNet-compatible skills and expand it using the search results collected earlier. We download selected skills from GitHub into a local skills directory when the SDK is available. We keep the installation process small and quick, so the tutorial remains practical for Google Colab.
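The way the curated list grows can be sketched in isolation: search hits are appended in order, entries without a URL and duplicates are skipped, and the result is capped at four items. The `merge_curated` helper name and the example URLs are hypothetical; the logic follows the loop in the installation step.

```python
def merge_curated(curated, hits, cap=4):
    """Append new, non-empty hit URLs to the curated list, then cap its length."""
    merged = list(curated)  # copy so the caller's list is left untouched
    for hit in hits:
        url = hit.get("skill_url", "")
        if url and url not in merged:
            merged.append(url)
    return merged[:cap]


# Invented example data.
curated = ["https://example.com/skills/a", "https://example.com/skills/b"]
hits = [
    {"skill_url": "https://example.com/skills/b"},  # duplicate, skipped
    {"skill_url": ""},                              # no URL, skipped
    {"skill_url": "https://example.com/skills/c"},
    {"skill_url": "https://example.com/skills/d"},
    {"skill_url": "https://example.com/skills/e"},  # dropped by the cap
]

result = merge_curated(curated, hits)
for url in result:
    print(url)
```

Since the hand-picked entries come first, the cap only ever trims search results, never the curated skills.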
banner("4) Inspect installed skills (SKILL.md frontmatter)")

def parse_skill_md(skill_path):
    p = pathlib.Path(skill_path)
    md = None
    if p.is_dir():
        for cand in p.rglob("SKILL.md"):
            md = cand
            break
    elif p.name.upper() == "SKILL.MD":
        md = p
    if not md or not md.exists():
        return {"path": str(skill_path), "name": p.name, "meta": {}, "found": False}
    text = md.read_text(encoding="utf-8", errors="ignore")
    meta = {}
    m = re.match(r"^---\s*\n(.*?)\n---", text, re.DOTALL)
    if m:
        for line in m.group(1).splitlines():
            if ":" in line:
                k, v = line.split(":", 1)
                meta[k.strip()] = v.strip().strip('"').strip("'")
    return {"path": str(md), "name": meta.get("name", p.name),
            "meta": meta, "found": True}

inspected = [parse_skill_md(pp) for pp in installed] if installed else []
for info in inspected:
    print(f" • {info['name']} ({'SKILL.md found' if info['found'] else 'no SKILL.md'})")
    desc = info["meta"].get("description", "")
    if desc:
        print(f" {textwrap.shorten(desc, 90, placeholder='...')}")
if not inspected:
    print(" (nothing installed locally — likely no SDK/network; sections 2 & 7 still run)")
We inspect the installed skills by searching for their SKILL.md files and reading their metadata. We parse the front matter to extract useful information, such as the skill name and description. We then print a clean summary of each installed skill to understand what has been added locally.
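The front matter parsing can be tried on an in-memory string without downloading anything. The sample `SKILL.md` content below is invented for illustration; the regular expression and the split-on-first-colon rule are the same ones used in `parse_skill_md`.

```python
import re

# Invented SKILL.md content; note the colon inside the quoted description.
SAMPLE = """---
name: pdf-summarizer
description: "Summarize PDF documents: extracts text, then condenses it"
---
# PDF Summarizer
Body text here.
"""


def parse_front_matter(text):
    """Extract flat key/value pairs from a leading '---' delimited block."""
    meta = {}
    match = re.match(r"^---\s*\n(.*?)\n---", text, re.DOTALL)
    if match:
        for line in match.group(1).splitlines():
            if ":" in line:
                # Split on the first colon only, so values may contain colons.
                key, value = line.split(":", 1)
                meta[key.strip()] = value.strip().strip('"').strip("'")
    return meta


meta = parse_front_matter(SAMPLE)
print(meta["name"])
print(meta["description"])
```

This is a deliberately lightweight parser rather than a full YAML reader: nested keys, lists, and multi-line values are not handled, which is an acceptable trade-off when only `name` and `description` are needed.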
banner("5) Evaluate skills on 5 quality dimensions (quality gate)")
DIMS = ["safety", "completeness", "executability", "maintainability", "cost_awareness"]
LEVEL_SCORE = {"Excellent": 4, "Good": 3, "Fair": 2, "Poor": 1, "Bad": 0}

def evaluate(target):
    if USE_SDK and API_KEY:
        try:
            return client.evaluate(target=target)
        except Exception as e:
            print(f" evaluate failed for {target}: {e!r}")
    return None

def mock_eval(name):
    import hashlib
    h = int(hashlib.md5(name.encode()).hexdigest(), 16)
    levels = ["Excellent", "Good", "Fair", "Poor"]
    return {d: {"level": levels[(h >> (i * 3)) % 4], "reason": "offline mock score"}
            for i, d in enumerate(DIMS)}

def gate_score(report):
    tot = sum(LEVEL_SCORE.get(report.get(d, {}).get("level", "Fair"), 2) for d in DIMS)
    return tot / (len(DIMS) * 4)

GATE_THRESHOLD = 0.55
targets = [s["skill_url"] for s in (kw_hits + vec_hits) if s["skill_url"]][:3] \
    or [i["name"] for i in inspected] or ["pdf-extractor", "chart-reader", "web-scraper"]
passed, scored = [], []
for t in targets:
    rep = evaluate(t)
    via = "LLM"
    if rep is None:
        rep, via = mock_eval(str(t)), "mock"
    score = gate_score(rep)
    scored.append((t, score, via))
    flags = " ".join(f"{d[:4]}={rep.get(d,{}).get('level','?')[:4]}" for d in DIMS)
    status = "PASS ✅" if score >= GATE_THRESHOLD else "FAIL ❌"
    print(f" [{via:4}] {status} score={score:.2f} {textwrap.shorten(str(t),46,placeholder='...')}")
    print(f" {flags}")
    if score >= GATE_THRESHOLD:
        passed.append(t)
print(f"\n{len(passed)}/{len(targets)} skills passed the quality gate (threshold={GATE_THRESHOLD}).")

banner("6) Analyze relationships and draw the Skill Graph")

def analyze(skills_dir):
    if USE_SDK and API_KEY:
        try:
            return client.analyze(skills_dir=str(skills_dir))
        except Exception as e:
            print(f" analyze failed: {e!r}")
    return None

rels = analyze(SKILLS_DIR)
if not rels:
    names = [i["name"] for i in inspected] or ["PDF_Parser", "Text_Summarizer",
                                               "Chart_Reader", "Web_Scraper"]
    while len(names) < 4:
        names.append(f"Skill_{len(names)}")
    rels = [
        {"source": names[0], "type": "compose_with", "target": names[1]},
        {"source": names[2], "type": "similar_to", "target": names[0]},
        {"source": names[3], "type": "depend_on", "target": names[1]},
        {"source": names[1], "type": "belong_to", "target": names[2]},
    ]
    print(" (using offline mock relationships — set API_KEY for real analysis)")
for r in rels:
    print(f" {r['source']} --[{r['type']}]--> {r['target']}")

try:
    import networkx as nx
    import matplotlib.pyplot as plt
    G = nx.DiGraph()
    COLORS = {"similar_to": "#4C9BE8", "belong_to": "#E8A14C",
              "compose_with": "#6BBF59", "depend_on": "#D45D79"}
    for r in rels:
        G.add_edge(r["source"], r["target"], type=r["type"])
    pos = nx.spring_layout(G, seed=42, k=1.2)
    plt.figure(figsize=(9, 6))
    nx.draw_networkx_nodes(G, pos, node_size=2200, node_color="#EDEDED", edgecolors="#444")
    nx.draw_networkx_labels(G, pos, font_size=9)
    for et, col in COLORS.items():
        edges = [(u, v) for u, v, d in G.edges(data=True) if d["type"] == et]
        if edges:
            nx.draw_networkx_edges(G, pos, edgelist=edges, edge_color=col,
                                   width=2, arrows=True, arrowsize=18,
                                   connectionstyle="arc3,rad=0.08")
    plt.legend(handles=[plt.Line2D([0], [0], color=c, lw=2, label=t)
                        for t, c in COLORS.items()], loc="best", fontsize=8)
    plt.title("SkillNet — Skill Relationship Graph")
    plt.axis("off")
    plt.tight_layout()
    plt.savefig(WORKDIR / "skill_graph.png", dpi=130)
    plt.show()
    print(f" graph saved -> {WORKDIR/'skill_graph.png'}")
except Exception as e:
    print(f" graph drawing skipped: {e!r}")
We evaluate skills across five quality dimensions: safety, completeness, executability, maintainability, and cost awareness. We apply a quality gate to determine which skills meet a minimum score threshold, using mock scores when an API key is unavailable. We also analyze relationships between skills and visualize them as a Skill Graph using NetworkX and Matplotlib.
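A worked example makes the quality gate arithmetic concrete. The level-to-points mapping, the five dimensions, and the 0.55 threshold are taken from the code above; the two sample reports are invented. Each level maps to 0 through 4 points, the points are summed over the five dimensions, and the sum is divided by the maximum of 20. A dimension missing from the report is treated as "Fair" (2 points).

```python
DIMS = ["safety", "completeness", "executability", "maintainability", "cost_awareness"]
LEVEL_SCORE = {"Excellent": 4, "Good": 3, "Fair": 2, "Poor": 1, "Bad": 0}
GATE_THRESHOLD = 0.55


def gate_score(report):
    """Return the normalized score in [0, 1] for a five-dimension report."""
    total = sum(
        LEVEL_SCORE.get(report.get(dim, {}).get("level", "Fair"), 2)
        for dim in DIMS
    )
    return total / (len(DIMS) * 4)


# Invented report with one dimension missing on purpose.
report = {
    "safety": {"level": "Good"},                # 3
    "completeness": {"level": "Good"},          # 3
    "executability": {"level": "Fair"},         # 2
    "maintainability": {"level": "Excellent"},  # 4
    # cost_awareness is absent -> defaults to "Fair" = 2
}
score = gate_score(report)  # (3 + 3 + 2 + 4 + 2) / 20 = 0.70

# Invented report where every dimension is rated "Poor": 5 / 20 = 0.25
weak_report = {dim: {"level": "Poor"} for dim in DIMS}
weak_score = gate_score(weak_report)

for label, value in [("mixed", score), ("weak", weak_score)]:
    verdict = "PASS" if value >= GATE_THRESHOLD else "FAIL"
    print(f"{label}: score={value:.2f} -> {verdict}")
```

With a threshold of 0.55, a skill needs at least 11 of 20 points, so a report that is "Fair" everywhere (10 points, 0.50) fails the gate.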
banner("7) Skill-augmented agent planner")
GOAL = "Analyze scRNA-seq data to find and validate cancer drug targets, then write a report"

def llm_decompose(goal):
    if API_KEY:
        try:
            payload = {
                "model": MODEL,
                "messages": [
                    {"role": "system", "content":
                        "Decompose the user's goal into 3-6 short, ordered subtasks. "
                        "Reply ONLY as a JSON array of strings, no prose, no markdown."},
                    {"role": "user", "content": goal},
                ],
                "temperature": 0.2,
            }
            r = requests.post(f"{BASE_URL}/chat/completions",
                              headers={"Authorization": f"Bearer {API_KEY}"},
                              json=payload, timeout=60)
            r.raise_for_status()
            txt = r.json()["choices"][0]["message"]["content"]
            txt = re.sub(r"^```(?:json)?|```$", "", txt.strip()).strip()
            subs = json.loads(txt)
            if isinstance(subs, list) and subs:
                return [str(x) for x in subs]
        except Exception as e:
            print(f" LLM decompose failed -> heuristic ({e!r})")
    return ["acquire single-cell RNA-seq dataset",
            "preprocess and cluster cells",
            "identify candidate cancer target genes",
            "validate targets against pathway database",
            "generate a discovery report"]

def keywords_for(subtask):
    stop = {"the", "and", "a", "to", "of", "from", "into", "for", "with", "then", "an"}
    toks = [w for w in re.findall(r"[a-zA-Z\-]+", subtask.lower()) if w not in stop]
    return " ".join(toks[:4])

subtasks = llm_decompose(GOAL)
print(f"GOAL: {GOAL}\n\nPLAN ({len(subtasks)} steps):")
plan = []
for i, st in enumerate(subtasks, 1):
    q = keywords_for(st)
    hits = search(q, mode="vector", limit=2, threshold=0.6) or \
        search(q, mode="keyword", limit=2)
    best = hits[0] if hits else None
    chosen = best["skill_name"] if best else "(no skill found — fallback to base model)"
    plan.append({"step": i, "subtask": st, "query": q, "skill": chosen})
    print(f"\n Step {i}: {st}")
    print(f" search('{q}') -> {chosen}" + (f" ⭐{best['stars']}" if best else ""))
print("\nExecution order (assembled pipeline):")
print(" " + " -> ".join(p["skill"].split()[0] if p["skill"][0] != "(" else "base-model"
                        for p in plan))

banner("Tutorial complete")
print(textwrap.dedent(f"""
    Recap:
    • Search (keyword + vector) ............ ran via {'SDK' if USE_SDK else 'REST'}
    • Install (GitHub -> local) ............ {len(installed)} skill(s)
    • Inspect SKILL.md metadata ............ {len(inspected)} parsed
    • Evaluate + quality gate .............. {len(passed)}/{len(targets)} passed {'(LLM)' if API_KEY else '(offline mock)'}
    • Relationship graph ................... {len(rels)} edges -> skill_graph.png
    • Agent planner ........................ {len(plan)} steps mapped to skills
    Docs: https://github.com/zjunlp/SkillNet
"""))
We build a skill-augmented agent planner around a complex scientific discovery goal. We decompose the goal into ordered subtasks, identify relevant skills for each step, and map those skills to an execution pipeline. We finish by printing a recap of the full workflow, including search, installation, inspection, evaluation, graph analysis, and planning.
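Two small pieces of the planner can be exercised offline: turning a model reply into a list of subtasks, and turning each subtask into a short search query. The sketch below uses a hard-coded, invented reply instead of a real model call, and the `parse_subtasks` helper name is hypothetical; the fence-stripping pattern, the stop-word list, and the four-token limit follow the tutorial code.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable
FENCE_RE = re.compile(r"^" + FENCE + r"(?:json)?|" + FENCE + r"$")
STOP = {"the", "and", "a", "to", "of", "from", "into", "for", "with", "then", "an"}


def parse_subtasks(reply):
    """Strip an optional Markdown code fence and parse a JSON array of strings."""
    cleaned = FENCE_RE.sub("", reply.strip()).strip()
    data = json.loads(cleaned)
    if not isinstance(data, list) or not data:
        raise ValueError("expected a non-empty JSON array")
    return [str(item) for item in data]


def keywords_for(subtask, max_terms=4):
    """Keep the first few non-stop-word tokens as a compact search query."""
    tokens = [w for w in re.findall(r"[a-zA-Z\-]+", subtask.lower()) if w not in STOP]
    return " ".join(tokens[:max_terms])


# Invented model reply that wraps the JSON in a fence despite the instructions.
reply = (FENCE + "json\n"
         '["Preprocess and cluster the cells", "Generate a discovery report"]\n'
         + FENCE)

subtasks = parse_subtasks(reply)
queries = [keywords_for(s) for s in subtasks]
for subtask, query in zip(subtasks, queries):
    print(f"{subtask!r} -> search({query!r})")
```

Stripping the fence before parsing is a defensive step: models often wrap JSON in a code fence even when told not to, and `json.loads` would otherwise reject the reply and push the planner onto its heuristic fallback.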
In conclusion, we created a complete SkillNet workflow that moves beyond simple skill search and demonstrates how skills can support structured agentic systems. We saw how SkillNet helps us discover useful capabilities, evaluate their quality, understand their relationships, and connect them to real task planning. The workflow also remains practical: it runs without an API key by falling back to offline mock evaluations, mock relationships, and a heuristic task decomposition, while still allowing deeper LLM-powered analysis when credentials are available. Taken together, this makes SkillNet a usable foundation for building modular, skill-driven AI agents that can plan, select tools, and organize execution more intelligently.