Build a Custom AI Terminal Assistant with Python: A Complete Step-by-Step Developer Tutorial

Stop context-switching between your terminal and a browser. In this hands-on tutorial, you'll build a custom AI terminal assistant in Python that can run shell commands, read files, and answer coding questions, all without leaving your CLI.

I spend way too much time in my terminal. And I hate context-switching.

Every time I need to look up a Python syntax quirk, check a file’s contents, or remember the exact flags for `grep`, I’m yanked out of my flow. Open a browser. Type a query. Sift through Stack Overflow. Tab back. It’s exhausting.

So I built something better.

A custom AI terminal assistant that lives right in my CLI. It runs shell commands, reads files, answers coding questions, and even writes code snippets—all without leaving the terminal.

Here’s the kicker: you can build one too. In about 30 minutes. With Python.

Let’s do it.

What We’re Building

We’re creating a CLI tool called `ai-term`. You’ll type something like:

bash
ai-term run "find all Python files modified in the last 24 hours" --exec

It shows the suggested command and, because of `--exec`, offers to run it once you confirm.

Or:

bash
ai-term ask "explain this error: TypeError: 'NoneType' object is not subscriptable"

And it’ll give you a clear explanation and a fix.

No browser. No context switch. Just you and your terminal.

Prerequisites

You’ll need:

  • Python 3.10+
  • An OpenAI API key (or any LLM provider—I’ll show you how to swap)
  • `pip` installed

That’s it. No heavy frameworks. No Docker. Just clean Python.

Step 1: Project Setup

Create a new directory and set up a virtual environment:

bash
mkdir ai-term
cd ai-term
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install the dependencies:

bash
pip install openai typer rich

  • openai: To call the LLM API
  • typer: To build the command-line interface
  • rich: To render Markdown and colored output in the terminal

Create a file called `ai_term.py`:

bash
touch ai_term.py

Step 2: The Core Logic

Let’s start with the main function. We’ll keep it simple but extensible.

python
import os
import subprocess
import typer
from rich.console import Console
from rich.markdown import Markdown
from openai import OpenAI

app = typer.Typer()
console = Console()

# Initialize the client
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

SYSTEM_PROMPT = """You are an expert terminal assistant. You help developers with:
1. Generating shell commands based on natural language descriptions
2. Explaining error messages and suggesting fixes
3. Answering coding questions concisely
4. Reading and summarizing file contents

When generating shell commands, always explain what the command does.
When explaining errors, provide the root cause and a fix.
Keep responses concise and actionable."""

def ask_llm(prompt: str) -> str:
    """Send a prompt to the LLM and return the response."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # Fast and cheap
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": prompt}
        ],
        temperature=0.3,
        max_tokens=500
    )
    # content is typed as optional, so fall back to an empty string
    return response.choices[0].message.content or ""

@app.command()
def ask(query: str):
    """Ask the AI terminal assistant a question."""
    console.print("[bold cyan]🤖 AI Terminal Assistant[/bold cyan]")
    console.print(f"[dim]Query: {query}[/dim]\n")
    
    response = ask_llm(query)
    md = Markdown(response)
    console.print(md)

if __name__ == "__main__":
    app()

Set your API key:

bash
export OPENAI_API_KEY="sk-your-key-here"
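
If the variable is missing, the first request fails with an authentication error that is easy to misread. A minimal sketch of an early check, assuming a hypothetical helper named `require_api_key` (it is not part of the `openai` package):

```python
import os
from typing import Mapping


def require_api_key(env: Mapping[str, str] = os.environ,
                    name: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment, or exit with a hint.

    `require_api_key` is a hypothetical helper written for this tutorial.
    """
    key = env.get(name, "").strip()
    if not key:
        # SystemExit with a string prints the message and exits with status 1
        raise SystemExit(f'{name} is not set. Run: export {name}="sk-your-key-here"')
    return key
```

In `ai_term.py` this could replace the `os.getenv` call, for example `client = OpenAI(api_key=require_api_key())`.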

Run it. While `ask` is the only command, Typer runs it directly, so no subcommand name is needed yet:

bash
python ai_term.py "how do I list all files larger than 100MB in the current directory?"

You’ll get a response like:

**Command:** `find . -type f -size +100M`

**Explanation:** This uses `find` to search recursively from the current directory (`.`), looking only for files (`-type f`) that are larger than 100 megabytes (`-size +100M`).

Honestly, that’s already useful. But we can do better.

Step 3: Add Command Execution

The real power comes when the assistant can *run* commands for you. Let’s add that.

python
from rich.markup import escape  # add next to the other imports at the top of ai_term.py

@app.command()
def run(query: str, execute: bool = typer.Option(False, "--exec", "-e", help="Execute the suggested command")):
    """Ask for a command and optionally execute it."""
    console.print("[bold cyan]🤖 AI Terminal Assistant[/bold cyan]")
    console.print(f"[dim]Query: {escape(query)}[/dim]\n")
    
    # Ask specifically for a command
    command_prompt = f"Generate a shell command for: {query}. Return ONLY the command, no explanation."
    command = ask_llm(command_prompt).strip()
    
    # escape() keeps brackets in the command (e.g. [a-z]) from being parsed as Rich markup
    console.print("[bold yellow]Suggested command:[/bold yellow]")
    console.print(f"  [green]{escape(command)}[/green]\n")
    
    if execute:
        confirm = typer.confirm("Execute this command?")
        if confirm:
            console.print("[bold cyan]Executing...[/bold cyan]")
            # shell=True hands the string to your shell exactly as displayed above
            result = subprocess.run(command, shell=True, capture_output=True, text=True)
            if result.stdout:
                console.print(result.stdout, markup=False)
            if result.stderr:
                console.print(f"[red]{escape(result.stderr)}[/red]")
        else:
            console.print("[yellow]Command not executed.[/yellow]")

Now that the app has more than one command, every call needs the subcommand name (`ask` or `run`):

bash
python ai_term.py run "find all Python files modified in the last 24 hours" --exec

It’ll show you the command, ask for confirmation, then run it. No more Googling for flags.
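
Models sometimes wrap the command in Markdown backticks even when told to return only the command, and handing that wrapper to the shell would fail. A small sketch of a cleanup step, assuming a hypothetical helper `extract_command` that is applied to the reply before it is displayed or executed:

```python
import re

FENCE = "`" * 3  # three backticks, built this way to keep the listing readable
# An optional language tag after the opening fence, then the body, then the closing fence
FENCE_RE = re.compile(rf"^{FENCE}[\w+-]*[ \t]*\n(.*?)\n?{FENCE}\s*$", re.DOTALL)


def extract_command(reply: str) -> str:
    """Strip a surrounding Markdown code fence or inline backticks from a reply.

    `extract_command` is a hypothetical helper, not part of any library.
    """
    text = reply.strip()
    match = FENCE_RE.match(text)
    if match:
        text = match.group(1).strip()
    # Unwrap `like this`, but leave shell backtick substitutions untouched
    if len(text) >= 2 and text[0] == "`" and text[-1] == "`" and "`" not in text[1:-1]:
        text = text[1:-1].strip()
    return text
```

Inside `run`, that would look like `command = extract_command(ask_llm(command_prompt))`.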

Step 4: Add File Reading

Sometimes you just need to know what’s in a file without opening it. Let’s add that.

python
@app.command()
def read(filepath: str, lines: int = typer.Option(50, "--lines", "-n", help="Number of lines to read")):
    """Read a file and ask the AI to summarize or explain it."""
    try:
        with open(filepath, 'r') as f:
            # Keep only the first `lines` lines, then cap at 2000 chars to save tokens
            content = "".join(f.readlines()[:lines])[:2000]
        
        prompt = f"Summarize this file and explain what it does:\n\n```\n{content}\n```"
        response = ask_llm(prompt)
        
        console.print(f"[bold cyan]📄 {filepath}[/bold cyan]")
        md = Markdown(response)
        console.print(md)
    except FileNotFoundError:
        console.print(f"[red]File not found: {filepath}[/red]")

Try it:

bash
python ai_term.py read ai_term.py

It’ll summarize your own code. Meta, right?
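
Capping the input by both line count and character count keeps the prompt small and predictable. A sketch of a helper that could back the `--lines` option, assuming a hypothetical function `read_head` and UTF-8 text files:

```python
from itertools import islice


def read_head(path: str, max_lines: int = 50, max_chars: int = 2000) -> str:
    """Return at most `max_lines` lines and `max_chars` characters of a text file.

    `read_head` is a hypothetical helper; the limits are illustrative defaults.
    """
    # errors="replace" swaps undecodable bytes for U+FFFD instead of raising
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        # islice stops after max_lines without loading the rest of the file
        head = "".join(islice(f, max_lines))
    return head[:max_chars]
```

Because `islice` stops reading early, this stays cheap even when `read` is pointed at a very large log file.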

Step 5: Make It a Proper CLI Tool

Let’s make `ai-term` callable from anywhere.

Create a `setup.py`:

python
from setuptools import setup

setup(
    name="ai-term",
    version="0.1.0",
    py_modules=["ai_term"],
    install_requires=[
        "openai",
        "typer",
        "rich",
    ],
    entry_points={
        "console_scripts": [
            "ai-term=ai_term:app",
        ],
    },
)

Install it:

bash
pip install -e .

Now `ai-term` is on your PATH whenever this virtual environment is active:

bash
ai-term ask "what's the difference between git merge and git rebase?"
ai-term run "kill all processes using port 3000" --exec
ai-term read config.yaml

Step 6: Add Error Handling and Streaming

Let’s make it feel more responsive. Streaming prints the answer as it’s generated instead of waiting for the full response:

python
def ask_llm_stream(prompt: str):
    """Stream the LLM response token by token."""
    stream = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": prompt}
        ],
        temperature=0.3,
        max_tokens=500,
        stream=True
    )
    
    for chunk in stream:
        # Skip chunks that carry no choices or an empty delta
        if chunk.choices and chunk.choices[0].delta.content:
            yield chunk.choices[0].delta.content

@app.command()
def ask_stream(query: str):
    """Ask with streaming response."""
    console.print("[bold cyan]🤖 AI Terminal Assistant[/bold cyan]")
    console.print(f"[dim]Query: {query}[/dim]\n")
    
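    for token in ask_llm_stream(query):
        # markup=False prints each token verbatim, so brackets are not read as Rich markup
        console.print(token, end="", markup=False, highlight=False)
    console.print()

Typer converts the underscore in the function name to a hyphen, so call it as `python ai_term.py ask-stream "your question"`.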
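
For the error-handling half of this step, one option is to wrap the LLM call so that a network or API failure becomes a short message instead of a traceback. A minimal sketch, assuming a hypothetical wrapper named `ask_safely` that takes the function to call (for example `ask_llm`) as an argument:

```python
from typing import Callable


def ask_safely(ask_fn: Callable[[str], str], prompt: str) -> str:
    """Call `ask_fn(prompt)` and turn a failure into a readable one-line message.

    `ask_safely` is a hypothetical helper. It catches Exception broadly to stay
    short; in the real tool you would likely catch the client's specific error
    classes (such as openai.APIConnectionError or openai.RateLimitError) instead.
    """
    try:
        return ask_fn(prompt)
    except Exception as exc:
        return f"Request failed ({type(exc).__name__}): {exc}"
```

In the `ask` command, `response = ask_safely(ask_llm, query)` would then replace the direct call.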

Real-World Usage: How I Use This Daily

I’ve been using this for three weeks now. Here’s what it’s replaced:

  • Google searches for command flags: `ai-term run "compress all jpg files in this directory to 80% quality"`
  • Error debugging: `ai-term ask "why am I getting ModuleNotFoundError: No module named 'requests'"`
  • Quick code reviews: `ai-term read src/main.py`
  • Git help: `ai-term ask "how do I undo the last commit but keep the changes"`

It saves me about 15-20 context switches per day. That’s roughly 30 minutes of lost focus I get back.

Making It Your Own

Here are a few ways to extend this:

  1. Add memory: Use a local SQLite database to remember past conversations
  2. Support multiple LLMs: Swap between OpenAI, Claude, or local models via Ollama
  3. Add file editing: Let the AI modify files directly (with confirmation)
  4. Integrate with your project: Read your `package.json`, `requirements.txt`, or `Dockerfile` automatically
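
The memory idea can be sketched with the standard library's `sqlite3` module. The table layout, file name, and function names below are assumptions made for illustration, not part of the tool built above:

```python
import sqlite3


def open_history(path: str = "ai_term_history.db") -> sqlite3.Connection:
    """Open (or create) the history database. The file name is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS history (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               query TEXT NOT NULL,
               response TEXT NOT NULL,
               created_at TEXT DEFAULT CURRENT_TIMESTAMP
           )"""
    )
    conn.commit()
    return conn


def remember(conn: sqlite3.Connection, query: str, response: str) -> None:
    """Store one question/answer pair."""
    conn.execute("INSERT INTO history (query, response) VALUES (?, ?)", (query, response))
    conn.commit()


def recent_messages(conn: sqlite3.Connection, limit: int = 5) -> list[dict]:
    """Return the last `limit` exchanges, oldest first, as chat messages."""
    rows = conn.execute(
        "SELECT query, response FROM history ORDER BY id DESC LIMIT ?", (limit,)
    ).fetchall()
    messages = []
    for query, response in reversed(rows):
        messages.append({"role": "user", "content": query})
        messages.append({"role": "assistant", "content": response})
    return messages
```

The list returned by `recent_messages` has the same shape as the `messages` argument in `ask_llm`, so it could be placed between the system prompt and the new user message.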

The Vietnam Connection

At ECOAAI, our developers in Ho Chi Minh City and Can Tho use tools like this daily. When you’re working across time zones, every efficiency gain matters. Our teams build custom AI tooling like this to stay in flow—no matter where the client is.

A junior developer at $1,000/month with an AI terminal assistant can outperform a $4,000/month developer without one. That’s not hype. That’s math.

Frequently Asked Questions

Can I use a local LLM instead of OpenAI?

Yes. Keep the `OpenAI` client and point it at Ollama’s OpenAI-compatible endpoint: set the base URL to `http://localhost:11434/v1` and use a model you have pulled locally, such as `codellama` or `mistral`. Answers will usually be weaker than a hosted model’s, but it’s free and private.
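
One way to keep the switch in a single place is a small settings table. This is a sketch: the `PROVIDERS` table and the `client_settings` function are hypothetical names, and the placeholder key reflects that the client insists on a non-empty `api_key` even though a local Ollama server does not check it.

```python
import os
from typing import Mapping

# Hypothetical provider table; the model names are the ones mentioned in this article
PROVIDERS = {
    "openai": {"base_url": None, "api_key_env": "OPENAI_API_KEY", "model": "gpt-4o-mini"},
    "ollama": {"base_url": "http://localhost:11434/v1", "api_key_env": None, "model": "mistral"},
}


def client_settings(provider: str, env: Mapping[str, str] = os.environ) -> dict:
    """Return keyword arguments for OpenAI(...) plus the model name to request."""
    cfg = PROVIDERS[provider]
    if cfg["api_key_env"]:
        api_key = env.get(cfg["api_key_env"], "")
    else:
        api_key = "ollama"  # placeholder value; assumed to be ignored by the local server
    kwargs = {"api_key": api_key}
    if cfg["base_url"]:
        kwargs["base_url"] = cfg["base_url"]
    return {"client_kwargs": kwargs, "model": cfg["model"]}
```

With this in place, `settings = client_settings("ollama")` followed by `client = OpenAI(**settings["client_kwargs"])` and `model=settings["model"]` would route the same `ask_llm` code to the local model.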

How do I handle API rate limits?

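The `openai` client already retries some failed requests on its own. For explicit control, add exponential backoff with `tenacity` (`pip install tenacity`) and retry only on rate-limit errors:

python
from openai import RateLimitError
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential

# Retry only rate-limit errors; a bad key or a bad request should fail fast
@retry(retry=retry_if_exception_type(RateLimitError), stop=stop_after_attempt(3),
       wait=wait_exponential(multiplier=1, min=2, max=10))
def ask_llm_with_retry(prompt: str) -> str:
    return ask_llm(prompt)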

Is it safe to execute commands automatically?

No. Always review commands before running them. The `--exec` flag is convenient but dangerous. The confirmation prompt we added is the minimum safeguard, and the flag should never be used in production CI/CD pipelines.
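
A cheap extra safeguard is to flag obviously destructive commands before the confirmation prompt. The sketch below uses a hypothetical, deliberately incomplete pattern list; it is a speed bump for the human reviewer, not a sandbox, and a command that passes it can still be harmful.

```python
import re

# Hypothetical, incomplete patterns chosen for illustration only
RISKY_PATTERNS = {
    "recursive delete": re.compile(r"\brm\s+(-\w+\s+)*-\w*[rR]"),
    "runs as root": re.compile(r"\bsudo\b"),
    "formats or overwrites a disk": re.compile(r"\bmkfs\b|\bdd\s+if="),
    "pipes output into a shell": re.compile(r"\|\s*(sudo\s+)?(sh|bash)\b"),
}


def risk_warnings(command: str) -> list[str]:
    """Return a label for every risky pattern found in `command`."""
    return [label for label, pattern in RISKY_PATTERNS.items() if pattern.search(command)]
```

In `run`, any labels returned by `risk_warnings(command)` could be printed in red above the "Execute this command?" prompt.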

How much does it cost to run?

With `gpt-4o-mini`, each query costs about $0.001. If you make 50 queries a day, that’s $0.05/day or roughly $1.50/month. Cheaper than a coffee.
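
The arithmetic is easy to redo for your own usage. A tiny sketch, where the per-query cost is the rough estimate quoted above rather than a published price:

```python
def monthly_cost(cost_per_query: float, queries_per_day: int, days: int = 30) -> float:
    """Estimated monthly spend in dollars, rounded to cents."""
    return round(cost_per_query * queries_per_day * days, 2)


# The figures from the paragraph above: $0.001 per query, 50 queries a day
print(monthly_cost(0.001, 50))  # prints 1.5
```

Your real per-query cost depends on prompt and response length, so check your provider's usage dashboard before relying on the estimate.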
