TL;DR

  • Build a fully functional AI-powered terminal assistant in under 200 lines of Python
  • Integrate with Claude or any LLM via API for natural language command execution
  • Implement function calling for real system operations — file management, web searches, and code execution
  • Add a plugin architecture so your agent can grow with custom tools
  • Production-minded patterns for error handling, retries, and configuration management

Introduction

Every developer has wished for a smarter terminal — one that understands natural language, remembers context, and can chain together complex operations without you having to memorize arcane flags. In 2026, building that assistant yourself is not only possible, it is surprisingly straightforward.

AI-powered terminal assistants like Claude Code and Codex CLI have proven that the concept works: describe what you want in plain English, and the assistant writes and executes the code. But what if you want a custom assistant tailored to your specific workflow? One that knows your project structure, your preferred tools, and your personal shortcuts?

In this tutorial, you will build TermAI — a custom AI terminal assistant written entirely in Python. By the end, you will have a working CLI tool that accepts natural language commands, uses function calling to interact with your system, and can be extended with custom plugins. The complete project is under 200 lines, uses no heavy frameworks, and runs on any system with Python 3.10+.

Prerequisites

  • Python 3.10 or later
  • An API key from Anthropic (Claude) or OpenAI (GPT-4o)
  • Basic familiarity with Python and the command line
  • pip installed (for the anthropic or openai package)

Step 1: Project Setup and Configuration

Start by creating the project directory and a virtual environment:

mkdir termai && cd termai
python3 -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

Install the required dependencies:

pip install anthropic click pyyaml rich

These four packages give us: LLM access via the Anthropic SDK, a CLI framework (click), YAML config parsing, and beautiful terminal output (rich).

Now create a configuration file that the assistant will read on startup:

# config.yaml
model: "claude-sonnet-4-20250514"
system_prompt: |
  You are TermAI, a helpful terminal assistant.
  You can run shell commands, read and write files, search the web,
  and execute Python code. Always explain what you are about to do
  before doing it. Use the tools available to you.
temperature: 0.3
max_tokens: 4096

Store your API key in an environment variable for security:

export ANTHROPIC_API_KEY="sk-ant-..."

Step 2: The Core Loop — Connecting to the LLM

The heart of any AI assistant is the message loop: accept user input, send it to the LLM, process the response (including any tool calls), and repeat until the task is complete. Let us build that loop.

Create termai.py with the following structure:

#!/usr/bin/env python3
import os
import yaml
import json
import subprocess
from pathlib import Path
from typing import Any, Callable
import click
from rich.console import Console
from rich.markdown import Markdown
from rich.panel import Panel
import anthropic

console = Console()
client = None
config = {}
tools_registry: dict[str, Callable] = {}

def load_config(path: str = "config.yaml") -> dict:
    with open(path) as f:
        return yaml.safe_load(f)

def init_client():
    global client, config
    config = load_config()
    api_key = os.environ.get("ANTHROPIC_API_KEY")
    if not api_key:
        console.print("[red]Error: ANTHROPIC_API_KEY not set[/red]")
        raise SystemExit(1)
    client = anthropic.Anthropic(api_key=api_key)

This scaffolding gives us configuration loading, API client initialization, and the rich console for pretty output. The tools_registry dictionary will hold our function-calling tools, which we build next.
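One gap in this scaffolding is that a key missing from config.yaml only surfaces later as a KeyError. A minimal sketch of merging the loaded values over defaults follows; DEFAULTS and apply_defaults are hypothetical names introduced here for illustration, not part of the code above, and the validation rules are assumptions about what counts as a sensible value.

```python
from typing import Any, Optional

# Hypothetical defaults mirroring the config.yaml shown earlier.
DEFAULTS: dict[str, Any] = {
    "model": "claude-sonnet-4-20250514",
    "system_prompt": "You are TermAI, a helpful terminal assistant.",
    "temperature": 0.3,
    "max_tokens": 4096,
}

def apply_defaults(loaded: Optional[dict[str, Any]]) -> dict[str, Any]:
    """Return a config in which values from the file override DEFAULTS."""
    # yaml.safe_load returns None for an empty file, hence the `or {}`.
    merged = {**DEFAULTS, **(loaded or {})}
    if not isinstance(merged["max_tokens"], int) or merged["max_tokens"] <= 0:
        raise ValueError("max_tokens must be a positive integer")
    if not 0.0 <= float(merged["temperature"]) <= 1.0:
        raise ValueError("temperature must be between 0.0 and 1.0")
    return merged
```

One way to use it would be to wrap the result of load_config, for example config = apply_defaults(load_config()), so the rest of the program can index the config without guarding every key.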

Step 3: Implementing Function Calling — The Tool System

Function calling is what transforms a simple chatbot into a capable assistant. The LLM can request tool invocations, and your code executes them and returns the results. Here is how you define the tool schema and implement the handlers:

def register_tool(name: str, description: str, parameters: dict):
    """Decorator to register a function as an LLM-callable tool."""
    def decorator(func: Callable):
        tools_registry[name] = func
        # Store schema for API call
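        func.schema = {
            "name": name,
            "description": description,
            "input_schema": {
                "type": "object",
                "properties": parameters,
                # Parameters that declare a default are optional for the model
                "required": [
                    key for key, spec in parameters.items()
                    if "default" not in spec
                ],
            }
        }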
        return func
    return decorator

@register_tool(
    name="run_shell",
    description="Execute a shell command and return its output",
    parameters={
        "command": {
            "type": "string",
            "description": "The shell command to execute"
        },
        "timeout": {
            "type": "integer",
            "description": "Timeout in seconds (default: 30)",
            "default": 30
        }
    }
)
def run_shell(command: str, timeout: int = 30) -> str:
    try:
        result = subprocess.run(
            command, shell=True, capture_output=True,
            text=True, timeout=timeout
        )
        output = result.stdout
        if result.stderr:
            output += f"\nSTDERR:\n{result.stderr}"
        if result.returncode != 0:
            output += f"\nExit code: {result.returncode}"
        return output[:10000]  # Truncate long output
    except subprocess.TimeoutExpired:
        return f"Command timed out after {timeout}s"
    except Exception as e:
        return f"Error: {str(e)}"

@register_tool(
    name="read_file",
    description="Read the contents of a file",
    parameters={
        "path": {
            "type": "string",
            "description": "Absolute or relative path to the file"
        }
    }
)
def read_file_tool(path: str) -> str:
    try:
        content = Path(path).read_text()
        return content[:10000]
    except Exception as e:
        return f"Error reading file: {str(e)}"

@register_tool(
    name="write_file",
    description="Write content to a file (overwrites existing)",
    parameters={
        "path": {
            "type": "string",
            "description": "Path to the file"
        },
        "content": {
            "type": "string",
            "description": "Content to write"
        }
    }
)
def write_file_tool(path: str, content: str) -> str:
    try:
        Path(path).write_text(content)
        return f"Successfully wrote {len(content)} characters to {path}"
    except Exception as e:
        return f"Error writing file: {str(e)}"

@register_tool(
    name="list_directory",
    description="List files in a directory",
    parameters={
        "path": {
            "type": "string",
            "description": "Directory path (default: current)",
            "default": "."
        }
    }
)
def list_directory(path: str = ".") -> str:
    try:
        files = list(Path(path).iterdir())
        result = []
        for f in sorted(files):
            size = f.stat().st_size if f.is_file() else 0
            kind = "📄" if f.is_file() else "📁"
            result.append(f"{kind} {f.name} ({size:,} bytes)" if f.is_file() else f"{kind} {f.name}/")
        return "\n".join(result) if result else "(empty directory)"
    except Exception as e:
        return f"Error: {str(e)}"

Each tool is registered with a descriptive name, a natural-language description, and a JSON schema for its parameters. The LLM reads these schemas and decides when to call which tool. The @register_tool decorator pattern keeps the code clean and makes adding new tools trivial — just write a function and decorate it.
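Writing the parameter schema by hand duplicates information that the function signature already carries. As a rough sketch, the schema could be derived with the standard inspect module; schema_from_signature and the _JSON_TYPES mapping are hypothetical helpers introduced here, and the approach assumes simple annotations (str, int, float, bool) with no per-parameter descriptions.

```python
import inspect
from typing import Any, Callable

# Assumed mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}

def schema_from_signature(func: Callable, name: str, description: str) -> dict[str, Any]:
    """Build an Anthropic-style tool schema from a function's signature."""
    properties: dict[str, Any] = {}
    required: list[str] = []
    for param in inspect.signature(func).parameters.values():
        prop: dict[str, Any] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(param.name)      # no default: the model must supply it
        else:
            prop["default"] = param.default  # optional: stays out of "required"
        properties[param.name] = prop
    return {
        "name": name,
        "description": description,
        "input_schema": {"type": "object", "properties": properties, "required": required},
    }

def run_shell(command: str, timeout: int = 30) -> str:
    """Stand-in with the same signature as the tutorial's run_shell."""
    return ""

schema = schema_from_signature(run_shell, "run_shell", "Execute a shell command")
```

The trade-off is that hand-written schemas can describe each parameter in natural language, which tends to help the model choose arguments, while the derived version only knows names, types, and defaults.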

Step 4: The Message Loop — Connecting User Input to Tools

Now we wire everything together. The message loop sends the conversation history (plus tool schemas) to the LLM, processes any tool calls the model makes, and prints the text response:

def process_tool_call(tool_name: str, tool_input: dict) -> str:
    handler = tools_registry.get(tool_name)
    if not handler:
        return f"Error: Unknown tool '{tool_name}'"
    try:
        result = handler(**tool_input)
        return str(result)
    except Exception as e:
        return f"Tool error: {str(e)}"

def chat_loop():
    messages = []  # the system prompt is passed separately via the system= parameter
    tool_schemas = [
        func.schema for func in tools_registry.values()
    ]

    console.print(Panel.fit("[bold green]TermAI[/bold green] — Your AI Terminal Assistant", border_style="green"))
    console.print("Type your request in natural language. Type [bold]/exit[/bold] to quit.\n")

    while True:
        user_input = console.input("[bold cyan]You:[/bold cyan] ")
        if user_input.strip().lower() in ("/exit", "/quit"):
            break

        messages.append({"role": "user", "content": user_input})

        while True:
            response = client.messages.create(
                model=config.get("model", "claude-sonnet-4-20250514"),
                max_tokens=config.get("max_tokens", 4096),
                temperature=config.get("temperature", 0.3),
                system=config["system_prompt"],
                messages=messages,
                tools=tool_schemas if tool_schemas else None,
            )

            # Record the assistant turn in full, including any tool_use blocks,
            # so the tool_result blocks sent next can be matched to them.
            messages.append({"role": "assistant", "content": response.content})

            tool_results = []
            for block in response.content:
                if block.type == "text":
                    console.print(Markdown(block.text))
                elif block.type == "tool_use":
                    console.print(f"[yellow]⚡ Running {block.name}...[/yellow]")
                    result = process_tool_call(block.name, block.input)
                    preview = result[:200] + ("..." if len(result) > 200 else "")
                    # markup=False: tool output may contain "[" characters
                    console.print(preview, style="dim", markup=False)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": result,
                    })

            # If no tool calls, the LLM is done responding
            if not tool_results:
                break

            # Return every result from this turn in a single user message
            messages.append({"role": "user", "content": tool_results})

This loop follows Anthropic’s tool-use pattern: the model can call multiple tools in sequence (e.g., list a directory, read a file, then write a new one), with each result fed back into the conversation. The loop only exits when the model produces a response with no tool calls, meaning it has finished the task.
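The message ordering is the part that is easiest to get wrong, so it can help to exercise it without any API call. The sketch below replaces the model with a scripted stand-in; run_tool_loop, call_model, and the max_turns cap are hypothetical names introduced for illustration, and content blocks are plain dictionaries rather than SDK objects.

```python
from typing import Any, Callable

Block = dict[str, Any]

def run_tool_loop(
    call_model: Callable[[list[dict[str, Any]]], list[Block]],
    handlers: dict[str, Callable[..., str]],
    messages: list[dict[str, Any]],
    max_turns: int = 10,
) -> list[dict[str, Any]]:
    """Call the model until it stops requesting tools or max_turns is reached."""
    for _ in range(max_turns):
        content = call_model(messages)
        # The full assistant turn, tool_use blocks included, goes into the history.
        messages.append({"role": "assistant", "content": content})
        results = []
        for block in content:
            if block["type"] == "tool_use":
                handler = handlers.get(block["name"])
                output = handler(**block["input"]) if handler else f"Unknown tool {block['name']}"
                results.append(
                    {"type": "tool_result", "tool_use_id": block["id"], "content": output}
                )
        if not results:
            break  # text-only reply: the task is finished
        # All results for this turn travel back in a single user message.
        messages.append({"role": "user", "content": results})
    return messages

# Scripted stand-in for the model: first it asks for a tool, then it answers in text.
script = iter([
    [{"type": "tool_use", "id": "t1", "name": "echo", "input": {"text": "hi"}}],
    [{"type": "text", "text": "The tool said hi."}],
])
history = run_tool_loop(
    lambda msgs: next(script),
    {"echo": lambda text: text},
    [{"role": "user", "content": "Say hi"}],
)
```

After the run, the history alternates user, assistant (tool request), user (tool result), assistant (final text). The max_turns cap is a design choice worth considering for the real loop as well, since it bounds cost if the model keeps requesting tools.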

Step 5: CLI Entry Point with Click

Finally, wire up the CLI entry point so users can invoke it from their terminal:

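@click.command()
@click.option("--config", "-c", "config_path", default="config.yaml", help="Path to config file")
@click.option("--one-shot", "-o", help="Run a single command and exit")
def main(config_path: str, one_shot: str | None):
    # The parameter is named config_path because a name cannot be both
    # a function parameter and a global within the same function.
    global config
    init_client()                      # creates the client (and reads the default config.yaml)
    config = load_config(config_path)  # then honor --config
    register_all_tools()

    if one_shot:
        # One-shot mode: run a single request, executing tool calls until the model is done
        messages = [{"role": "user", "content": one_shot}]
        tool_schemas = [func.schema for func in tools_registry.values()]
        while True:
            response = client.messages.create(
                model=config["model"],
                max_tokens=config["max_tokens"],
                system=config["system_prompt"],
                messages=messages,
                tools=tool_schemas,
            )
            messages.append({"role": "assistant", "content": response.content})
            tool_results = []
            for block in response.content:
                if block.type == "text":
                    console.print(Markdown(block.text))
                elif block.type == "tool_use":
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": process_tool_call(block.name, block.input),
                    })
            if not tool_results:
                break
            messages.append({"role": "user", "content": tool_results})
    else:
        chat_loop()

def register_all_tools():
    # Tools are auto-registered via @register_tool decorator
    pass

if __name__ == "__main__":
    main()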

The --one-shot flag allows non-interactive use — perfect for scripting. You can run:

python termai.py --one-shot "Find all Python files over 1MB in this directory"

And get the answer directly, without entering the interactive loop.

Step 6: Adding a Web Search Plugin

One of TermAI’s strengths is its plugin architecture. Let us add a web search tool to demonstrate extensibility:

@register_tool(
    name="web_search",
    description="Search the web using DuckDuckGo",
    parameters={
        "query": {
            "type": "string",
            "description": "The search query"
        },
        "max_results": {
            "type": "integer",
            "description": "Maximum results to return (default: 5)",
            "default": 5
        }
    }
)
def web_search(query: str, max_results: int = 5) -> str:
    try:
        import requests
        from bs4 import BeautifulSoup
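        # Let requests URL-encode the query; replacing spaces by hand breaks on "&", "#", etc.
        response = requests.get(
            "https://html.duckduckgo.com/html/",
            params={"q": query},
            timeout=10,
        )
        response.raise_for_status()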
        soup = BeautifulSoup(response.text, "html.parser")
        results = []
        for result in soup.select(".result")[:max_results]:
            title = result.select_one(".result__title")
            snippet = result.select_one(".result__snippet")
            if title:
                results.append(f"- {title.get_text(strip=True)}")
                if snippet:
                    results.append(f"  {snippet.get_text(strip=True)}")
        return "\n".join(results) if results else "No results found"
    except ImportError:
        return "Install requests and beautifulsoup4 for web search"
    except Exception as e:
        return f"Search error: {str(e)}"

This requires pip install requests beautifulsoup4, but notice the pattern: the tool gracefully reports if dependencies are missing. The @register_tool decorator makes adding it as simple as writing the function — no changes needed to the main loop.

Step 7: Testing Your Assistant

Here are some real commands to test once TermAI is running:

# File operations
"Find all .log files in the project and show their sizes"
"Create a backup of config.yaml with a timestamp"
"Read the first 50 lines of server.log and summarize any errors"

# Code tasks
"Count the total lines of Python code in this directory"
"Find all TODO comments in the source code"
"Refactor the function 'process_data' to use async/await"

# Analysis
"Show me the Git log for the last 7 days with author stats"
"What is my disk usage? Show me the top 10 largest directories"

Try each one and observe how TermAI chains multiple tool calls together. For example, “Find all .log files” might trigger list_directory, then run_shell(find ...), then read_file on each log — all handled autonomously.

Comparison: Custom Agent vs. Off-the-Shelf Solutions

| Feature | Custom TermAI Assistant | Claude Code / Codex CLI |
|---|---|---|
| Lines of code | ~180 lines | N/A |
| Custom tooling | Add any tool with @register_tool | Built-in tools, plus MCP servers |
| Model flexibility | Any Anthropic/OpenAI model | Primarily the vendor's own models |
| Plugin ecosystem | Write a function, register it | Vendor-defined extension points |
| Learning curve | Build from scratch, understand everything | Ready to use immediately |
| Security model | You control what tools can do | Built-in permission prompts and sandboxing |
| Streaming output | Rendered with Rich (responses are not streamed in this version) | Native streaming |
| Cost | Free (open source) + API usage | Subscription or API usage |

A custom assistant gives you complete control: you decide which tools exist, what they can access, and how they behave. Off-the-shelf solutions are more polished on day one but harder to customize for niche workflows.

Production Hardening Tips

Rate Limiting and Retry Logic

Wrap API calls with tenacity for automatic retries:

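# Install first: pip install tenacity
import anthropic
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential

# Retry only transient failures; an invalid request would just fail again.
@retry(
    retry=retry_if_exception_type((anthropic.RateLimitError, anthropic.APIConnectionError)),
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=2, max=10),
)
def api_call_with_retry(**kwargs):
    return client.messages.create(**kwargs)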
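To see what the retry policy amounts to, here is a dependency-free sketch of exponential backoff. It illustrates the general idea, not tenacity's internals; call_with_backoff and its parameters are hypothetical, and the default retry_on tuple uses built-in exceptions as placeholders for whatever transient errors your client raises.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

def call_with_backoff(
    func: Callable[[], T],
    retry_on: tuple[type[BaseException], ...] = (ConnectionError, TimeoutError),
    attempts: int = 3,
    base_delay: float = 2.0,
    max_delay: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on the listed exceptions with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt (2s, 4s, 8s, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, 0.5))  # small jitter spreads out retries
    raise RuntimeError("attempts must be at least 1")
```

The sleep parameter is injectable so the function can be tested without waiting. Errors outside retry_on propagate immediately, which is usually what you want for bad requests or authentication failures.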

Conversation Persistence

Save and restore conversation history so TermAI remembers context between sessions:

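import pickle
HISTORY_FILE = Path.home() / ".termai_history"

def save_history(messages):
    # Caveat: slicing can separate a tool_use block from its tool_result,
    # which the API rejects; trim at a plain user message where possible.
    with open(HISTORY_FILE, "wb") as f:
        pickle.dump(messages[-20:], f)  # Keep last 20 messages

def load_history():
    # pickle can run arbitrary code on load, so only read a file TermAI wrote itself.
    if HISTORY_FILE.exists():
        with open(HISTORY_FILE, "rb") as f:
            return pickle.load(f)
    return []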
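Keeping a fixed number of trailing messages can cut the history in the middle of a tool exchange, leaving a tool_result whose matching tool_use was dropped. A minimal sketch of a safer trim follows; trim_history and is_plain_user_turn are hypothetical helpers, and they assume that user-typed messages have string content while tool results are stored as lists.

```python
from typing import Any

Message = dict[str, Any]

def is_plain_user_turn(message: Message) -> bool:
    """True for a message typed by the user, not a list of tool results."""
    return message["role"] == "user" and isinstance(message["content"], str)

def trim_history(messages: list[Message], max_messages: int = 20) -> list[Message]:
    """Keep at most max_messages, starting at a plain user turn."""
    tail = messages[-max_messages:]
    for index, message in enumerate(tail):
        if is_plain_user_turn(message):
            return tail[index:]
    return []  # no safe starting point inside the window
```

The result may be shorter than max_messages, which is the price of never starting the restored conversation on an orphaned tool result or an assistant turn.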

Security: The Danger Zone

Warning: run_shell("rm -rf /") will work if you ask for it. Add a confirmation prompt for destructive commands:

DANGEROUS_KEYWORDS = ["rm -rf", "dd if=", "> /dev/", "mkfs.", ":(){ :|:& };:"]

def is_dangerous(command: str) -> bool:
    return any(kw in command.lower() for kw in DANGEROUS_KEYWORDS)

# In run_shell, add:
if is_dangerous(command):
    confirm = input(f"⚠️ Dangerous command detected: {command}\nProceed? (y/N): ")
    if confirm.lower() != "y":
        return "Command cancelled by user"
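Keyword matching of this kind is easy to sidestep (extra spaces, different flags, another destructive program), so a stricter complement is to permit only known programs. The following is a rough sketch under stated assumptions: ALLOWED_COMMANDS, SHELL_OPERATORS, and is_allowed are hypothetical names, and the list of programs is only an example.

```python
import shlex

# Hypothetical allowlist; tune it to the commands your workflow needs.
ALLOWED_COMMANDS = {"ls", "cat", "grep", "find", "wc", "git", "du", "head", "tail"}
# Characters that chain, pipe, redirect, or substitute commands in a shell.
SHELL_OPERATORS = (";", "&", "|", ">", "<", "`", "$(", "\n")

def is_allowed(command: str) -> bool:
    """Allow only single, simple commands whose program is on the allowlist."""
    if any(op in command for op in SHELL_OPERATORS):
        return False  # chaining, pipes, and redirects need manual review
    try:
        tokens = shlex.split(command)
    except ValueError:
        return False  # unbalanced quotes and similar parse errors
    return bool(tokens) and tokens[0] in ALLOWED_COMMANDS
```

This check is deliberately conservative and still incomplete: an allowed program can be destructive through its own options (find with -delete, for instance), so it works best alongside the confirmation prompt and a restricted environment rather than instead of them.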

Putting It All Together

The complete termai.py comes in at around 180 lines of Python. Here is the directory structure you should have:

termai/
├── termai.py          # Main assistant (180 lines)
├── config.yaml        # Configuration
├── requirements.txt   # Dependencies
└── plugins/
    └── web_search.py  # Optional: web search plugin (40 lines)

To run the assistant: python termai.py
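The plugins/ directory in the layout above needs something to import its files at startup. One possible approach, sketched here with hypothetical names, is a loader that imports each file and calls an assumed setup(register_tool) hook. Passing the decorator in as an argument means a plugin does not have to import termai itself. The demo uses a stand-in decorator and a throwaway plugin so it can run on its own.

```python
import importlib.util
import tempfile
from pathlib import Path
from typing import Callable

def load_plugins(plugin_dir: Path, register_tool: Callable) -> list[str]:
    """Import every .py file in plugin_dir and call its setup(register_tool) hook."""
    loaded = []
    for path in sorted(plugin_dir.glob("*.py")):
        spec = importlib.util.spec_from_file_location(f"termai_plugin_{path.stem}", path)
        if spec is None or spec.loader is None:
            continue
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        setup = getattr(module, "setup", None)  # "setup" is an assumed hook name
        if callable(setup):
            setup(register_tool)
            loaded.append(path.stem)
    return loaded

# Demo: a stand-in for the tutorial's register_tool and a throwaway plugin file.
registry: dict[str, Callable] = {}

def fake_register_tool(name: str, description: str, parameters: dict):
    def decorator(func: Callable) -> Callable:
        registry[name] = func
        return func
    return decorator

PLUGIN_SOURCE = '''
def setup(register_tool):
    @register_tool(name="shout", description="Upper-case text",
                   parameters={"text": {"type": "string"}})
    def shout(text: str) -> str:
        return text.upper()
'''

with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "shout.py").write_text(PLUGIN_SOURCE)
    loaded = load_plugins(Path(tmp), fake_register_tool)
```

In TermAI the real register_tool would be passed instead of the stand-in, for example from register_all_tools. Plugin files run with full privileges, so load them only from a directory you control.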

FAQ

Can I use OpenAI instead of Anthropic?

Yes. Replace the anthropic SDK with openai. The function calling format differs slightly — OpenAI uses tools with function objects — but the logic is identical. You can also support both via a configuration flag.
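As an illustration of the schema difference, the sketch below converts a tool schema in the shape used by register_tool into the function-object shape that OpenAI's Chat Completions API expects in its tools list. The helper name to_openai_tool is hypothetical, and only the schema is converted here.

```python
from typing import Any

def to_openai_tool(anthropic_schema: dict[str, Any]) -> dict[str, Any]:
    """Convert an Anthropic-style tool schema into OpenAI's tools format."""
    return {
        "type": "function",
        "function": {
            "name": anthropic_schema["name"],
            "description": anthropic_schema["description"],
            # OpenAI calls the JSON Schema "parameters"; Anthropic calls it "input_schema".
            "parameters": anthropic_schema["input_schema"],
        },
    }

read_file_schema = {
    "name": "read_file",
    "description": "Read the contents of a file",
    "input_schema": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    },
}
openai_tool = to_openai_tool(read_file_schema)
```

Response handling changes too: with Chat Completions, tool requests arrive as tool_calls on the assistant message with arguments encoded as a JSON string, and results go back as messages with the "tool" role, so the loop body needs a matching adaptation.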

How do I add custom tools for my project?

Write a Python function with type hints, add the @register_tool decorator with a name, description, and parameter schema, and it is automatically available to the LLM. No changes to the core loop required.

Is this secure enough for production?

For personal use, yes. For team deployments, add the dangerous-command confirmation guard shown in the Production Hardening Tips section, implement an allowlist of permitted commands, and run TermAI in a container with restricted filesystem access. Never expose it as a network service without authentication.

What is the cost of running this?

Each query costs roughly $0.01–$0.10 depending on the model and number of tool calls. At 50 queries per day (about 1,500 per month), that works out to roughly $15–$150 per month, toward the low end if most queries need only a few tool calls. Most of the expense comes from tool call results being sent back to the model (input tokens).
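A monthly estimate is easy to sanity-check by multiplying it out. The sketch below assumes a 30-day month and takes the per-query range quoted above at face value; monthly_cost_range is a hypothetical helper, and real costs depend on current token prices and how much tool output each query sends back.

```python
def monthly_cost_range(
    queries_per_day: int,
    low_per_query: float,
    high_per_query: float,
    days: int = 30,
) -> tuple[float, float]:
    """Return the (low, high) monthly cost in dollars for a per-query cost range."""
    queries = queries_per_day * days
    return (round(queries * low_per_query, 2), round(queries * high_per_query, 2))

low, high = monthly_cost_range(50, 0.01, 0.10)
```

If every query cost the full $0.10, the month would come to $150, so a bill in the tens of dollars implies that most queries sit near the cheap end of the range.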

Does it work with local models like Llama?

Local models that support function calling (Llama 3.1+, Qwen 2.5, DeepSeek V3) work with trivial modifications. Swap the Anthropic client for any OpenAI-compatible endpoint (like Ollama or vLLM) and adjust the tool schema format if needed.

Key Takeaways

  1. Building a custom AI terminal assistant takes ~180 lines of Python — the core concepts are function calling, a message loop, and a tool registry
  2. Function calling is the key enabler — it allows the LLM to interact with your system in a controlled way
  3. The plugin architecture via decorators makes it trivial to extend — add tools with a single @register_tool decorator
  4. Production patterns (rate limiting, persistence, security guards) are straightforward to add on top of the basic loop
  5. Custom assistants give you full control — you decide what the agent can and cannot do, which tools to expose, and which model to use

If you have built your own AI agent before, check out our previous tutorial on building an AI agent with function calling for a deeper dive into the agent architecture itself. For a more advanced project, see our guide on building an AI-powered PR reviewer with GitHub Webhooks.
