Inspiration
Turtle was born from a common frustration we both experienced as developers: constantly forgetting complex terminal commands and struggling with command-line syntax. We found ourselves repeatedly googling "how to find files recursively" or "move all files with specific extension" for tasks we'd done dozens of times before. The terminal is powerful, but its learning curve creates unnecessary friction in daily workflows. We saw an opportunity to leverage AI to bridge this gap. Rather than memorizing obscure flags and syntax, why not just describe what you want in natural language? With the recent advances in local LLMs and tool-calling capabilities, we realized we could build a terminal assistant that actually understands intent and executes commands safely. The goal was simple: make the command line accessible and intuitive without sacrificing the power that makes it essential for developers.
What it does
Turtle is a local terminal assistant CLI that brings AI-powered help directly into your workflow. It operates in two intelligent modes:

Explain Mode: When you ask a conceptual question like "how do I find files recursively?", Turtle provides a concise, terminal-friendly explanation with practical example commands. No tools are executed; it's guidance only.

Action Mode: When you give Turtle a task like "create a folder named 3010 notes and move all files whose names contain 3010 into it," Turtle translates your natural language request into the appropriate bash command, shows you exactly what it plans to execute, and only runs it after you confirm with y/n. This safety-first approach means you stay in control while getting the convenience of AI assistance.

Turtle handles common shell workflows, including file and folder operations (create, move, rename, delete), searching within directories, listing contents, and inspecting files, all through conversational commands. It runs entirely locally using Ollama with the Qwen 2.5 3B model, so your data and commands never leave your machine.
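The confirm-before-execute flow at the heart of Action Mode can be sketched in a few lines. This is a minimal illustration, not Turtle's actual implementation; the function name and 30-second timeout are assumptions:

```python
import subprocess


def confirm_and_run(command: str) -> str:
    """Show the proposed command and execute it only after explicit approval."""
    print(f"Turtle wants to run: {command}")
    answer = input("Proceed? [y/n] ").strip().lower()
    if answer != "y":
        return "(cancelled)"
    # Run through bash so pipes, globs, and quoting behave as in a shell.
    result = subprocess.run(
        ["bash", "-c", command],
        capture_output=True,
        text=True,
        timeout=30,
    )
    return result.stdout if result.returncode == 0 else result.stderr
```

The key property is that nothing reaches `subprocess.run` until the user has seen the exact command string and typed `y`.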
How we built it
We built Turtle over a hackathon weekend as a team of two, working collaboratively on all aspects of the project. This was our first time working with Ollama, agent frameworks, and LLM tool calling, so the learning curve was steep, but the process was rewarding. Tech Stack:
- Python for the CLI and agent logic
- Ollama for the local LLM runtime, with the Qwen 2.5 3B model
- Rich for terminal UI with styled panels and colored output
- subprocess for safe command execution with output capture
Architecture:
The core of Turtle is a tool-calling agent that maintains conversation context and intelligently decides between explain and action modes based on user input. We defined a strict run_terminal tool schema that the LLM uses to propose commands in a structured format. The agent loop continuously processes user input, queries the model with appropriate system prompts, and handles tool calls by executing bash commands only after user confirmation. We implemented keyword-based routing (detecting phrases like "how do I" or "explain") to switch between explain and action prompts, ensuring the LLM behaves appropriately for each use case. The confirmation system adds a critical safety layer in which every proposed command is displayed before execution, and users must explicitly approve with y/n.
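The structured-output side of this loop hinges on the tool schema. Below is a hedged reconstruction of what a `run_terminal` schema and a single model query might look like with the `ollama` Python library; the exact field names, system prompt, and response handling are assumptions, not Turtle's actual code:

```python
# Hypothetical reconstruction of the run_terminal tool schema.
RUN_TERMINAL_TOOL = {
    "type": "function",
    "function": {
        "name": "run_terminal",
        "description": "Propose a single bash command to accomplish the user's task.",
        "parameters": {
            "type": "object",
            "properties": {
                "command": {
                    "type": "string",
                    "description": "The exact bash command to execute.",
                },
            },
            "required": ["command"],
        },
    },
}


def propose_command(user_request: str):
    """Query the local model and return the bash command it proposes, if any."""
    import ollama  # requires a running Ollama server with qwen2.5:3b pulled

    response = ollama.chat(
        model="qwen2.5:3b",
        messages=[
            {
                "role": "system",
                "content": "Translate the request into one bash command "
                           "using the run_terminal tool.",
            },
            {"role": "user", "content": user_request},
        ],
        tools=[RUN_TERMINAL_TOOL],
    )
    # Scan tool calls for a run_terminal proposal.
    for call in response.message.tool_calls or []:
        if call.function.name == "run_terminal":
            return call.function.arguments.get("command")
    return None
```

A strict schema like this, with a single required `command` parameter and a clear description, is what pushes a small model toward consistently emitting tool calls instead of free-form prose.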
Challenges we ran into
Getting the LLM to use tool calling consistently was our biggest technical hurdle. Since this was our first time working with tool-calling agents, we had to learn through trial and error how to structure prompts that reliably trigger the run_terminal tool. We iterated extensively on the system prompts, adding explicit instructions and JSON examples to guide the model toward consistent structured outputs. We also discovered that a strict tool schema with clear parameter descriptions significantly improved the model's reliability.

Balancing explain and action modes proved trickier than expected. We needed Turtle to know when to explain a concept versus when to execute a command, and simple keyword detection wasn't always sufficient. We refined our explain markers ("how can I", "how do I", "explain") and crafted distinct system prompts for each mode. The explain mode had to be concise enough for terminal reading while still being helpful, and the action mode needed to be assertive about using tool calls without over-explaining.

We also faced challenges with prompt engineering for tool use and taught ourselves how to structure prompts that would consistently produce the right behavior from a small local model. This involved learning to balance instruction clarity against the capability constraints of a 3B-parameter model.
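The keyword-based routing described above can be sketched as a small classifier. This is an illustrative sketch; the marker list matches the phrases mentioned in this writeup plus a couple of assumed additions, and the function name is hypothetical:

```python
# Phrases that signal a conceptual question rather than a task to perform.
# "how can I", "how do I", and "explain" come from Turtle's described markers;
# the rest are assumed extensions for illustration.
EXPLAIN_MARKERS = ("how can i", "how do i", "explain", "what is", "what does")


def pick_mode(user_input: str) -> str:
    """Route to explain mode for conceptual questions, action mode otherwise."""
    text = user_input.lower().strip()
    if any(marker in text for marker in EXPLAIN_MARKERS):
        return "explain"
    return "action"
```

The chosen mode then selects which system prompt is sent to the model, so the same agent loop serves both behaviors.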
Accomplishments that we're proud of
We're incredibly proud that we built a working AI agent in a short hackathon timeframe despite being new to Ollama, tool calling, and agent frameworks. Going from zero experience to a functional product in 36 hours required rapid learning and iteration, and seeing Turtle actually work felt amazing. Making terminal interactions more accessible was our core mission, and we achieved it. Watching Turtle successfully translate natural language like "create a folder named 3010 notes and move all files whose names contain 3010 into it" into the correct bash command and execute it perfectly validated our vision. We created a genuinely useful tool for daily workflows and have already started using Turtle ourselves for file organization and common tasks. Most importantly, we successfully implemented safe command execution with a confirmation system that keeps users in control. Building an AI tool that executes system commands could be dangerous if done carelessly, so we're proud that Turtle shows you exactly what it will do and waits for explicit approval. This makes it practical for real-world use without the anxiety of unintended actions.
What we learned
We learned:
- How to work with local LLMs using Ollama, including model selection, prompt engineering, and tool-calling APIs
- Agent framework design patterns: maintaining conversation context, routing between modes, and structuring system prompts for reliable behavior
- Prompt engineering for tool use: explicit examples and strict schemas dramatically improve LLM consistency with structured outputs
- The importance of safety layers in AI tools: confirmation prompts aren't just nice to have; they're essential when AI executes system commands
- Practical tradeoffs in model selection: we chose Qwen 2.5 3B for speed and local inference, learning to work within the constraints of smaller models
We also learned valuable lessons about rapid prototyping under time constraints, the iterative process of refining prompts through testing, and how to build developer tools that balance power with usability.
What's next for Turtle - Command Line Interface Agent
We have ambitious plans to evolve Turtle into an even more powerful developer tool:
- Multi-directory navigation: Enable Turtle to navigate through different directories seamlessly, maintaining context about your location in the file system and executing commands across different paths without manual cd commands.
- Model flexibility and optimization: Integrate support for multiple LLM backends to let users choose models optimized for speed, accuracy, or specific tasks. This could include switching between Qwen variants, Llama models, or other Ollama-supported options based on command complexity.
- Complex command chaining: Improve accuracy when combining multiple operations together, enabling Turtle to handle sophisticated multi-step workflows like "find all Python files, count the lines in each, and save the results to a CSV" with better reliability and error handling.
Built With
- ollama
- python
- rich