Inspiration

As developers and creators, we noticed a massive friction point in the era of Generative AI: the cognitive cost of context switching. Whether you are drafting a client email, refactoring a React component, or writing marketing copy, using AI means breaking your flow state. You have to highlight text, Ctrl+C, switch to a browser, type a prompt, wait, Ctrl+C the response, switch back, and Ctrl+V.

We modeled this workflow friction. If a power user consults AI 50 times a day, the time lost to mechanical switching operations can be expressed as:

$$T_{\text{loss}} = \sum_{i=1}^{N} \left( t_{\text{app switch}} + t_{\text{prompt formulation}} + t_{\text{data transfer}} \right)$$
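
To make this concrete: assuming (illustratively, not from measurement) roughly 5 s to switch apps, 20 s to formulate a prompt, and 5 s to transfer the data, a power user at $$N = 50$$ loses

$$T_{\text{loss}} = 50 \times \left( 5\,\text{s} + 20\,\text{s} + 5\,\text{s} \right) = 1500\,\text{s} = 25 \text{ minutes per day}$$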

We realized that the Logitech MX Creative Console is the perfect hardware to reduce $$T_{\text{loss}}$$ to near zero. We want to build a bridge that brings Agentic AI directly to the user's cursor, turning a multi-step digital chore into a single, satisfying physical action.

What it will do

TextTuner will turn the MX Creative Console into a universal, OS-level physical controller for Agentic AI, completely eliminating the need to type prompts or switch windows.

  1. The Selection: Highlight text or code in any application (VS Code, Chrome, Word, Slack).
  2. The Directives (LCD Keys): Press a physical LCD key on the MX Console to instantly apply an AI directive. Keys will be mapped to specific Gemini agents (e.g., Refactor Code, Summarize, Make Professional, Translate); a mapping sketch follows this list.
  3. The Inline Magic: The plugin will silently copy the selection to the system clipboard, process it through Gemini, and auto-paste the result right back into the active window, replacing the original text seamlessly.
  4. The Tactile Iteration (The Dial): AI rarely gets it perfect on the first try. If you aren't satisfied, simply turn the MX Dial. Turning the dial left or right will instantly cycle through alternative AI generations (e.g., adjusting the length or tone dynamically), live-replacing the text on screen until it is perfect.
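
As an illustration of how those directives could be wired up, here is a sketch of the key-to-agent mapping (the key IDs and prompt wording are placeholders, not final values):

```typescript
// Illustrative directive-to-prompt mapping; IDs and prompts are placeholders.
interface Directive {
  label: string;        // Shown on the LCD key
  systemPrompt: string; // Prepended to the user's selected text
}

const DIRECTIVES: Record<string, Directive> = {
  refactor:  { label: "Refactor Code",     systemPrompt: "Refactor this code for readability. Return only code." },
  summarize: { label: "Summarize",         systemPrompt: "Summarize this text in three sentences." },
  formal:    { label: "Make Professional", systemPrompt: "Rewrite this text in a professional tone." },
  translate: { label: "Translate",         systemPrompt: "Translate this text to English." },
};
```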

How we plan to build it

We are architecting TextTuner as a background Node.js process built on the Logitech Actions SDK.

  • Hardware Bridge: We will use the Logitech SDK to map hardware events (key down, dial turn) to local JavaScript functions, while pushing dynamic visual feedback (loading spinners, success checks) back to the LCD keys.
  • OS Interop: To make it universal, we will bypass app-specific APIs, instead using Node packages like clipboardy and robotjs/nut-js to manipulate the OS-level clipboard and simulate global keystrokes (Ctrl+C / Ctrl+V); a sketch follows this list.
  • AI Engine: We are integrating the Gemini REST API to handle the heavy lifting. When a directive is triggered, we will pass the clipboard string into a pre-engineered system prompt. For the dial functionality, we will prompt Gemini to return a structured JSON array of 5 variations, allowing the dial to simply scrub through the local array index $$i \in [0, 4]$$; this call is also sketched below.
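
A minimal sketch of the copy-transform-paste loop described above, assuming clipboardy and nut-js; wiring applyDirective to the Logitech Actions SDK's key-press callback is omitted here, and callGeminiVariations is sketched next:

```typescript
import clipboard from "clipboardy";
import { keyboard, Key } from "@nut-tree/nut-js";

// Simulate an OS-level Ctrl+<key> chord (macOS would need a Cmd variant).
async function sendChord(key: Key): Promise<void> {
  await keyboard.pressKey(Key.LeftControl, key);
  await keyboard.releaseKey(Key.LeftControl, key);
}

// Fired by an LCD-key handler: copy the selection, transform it, paste it back.
async function applyDirective(systemPrompt: string): Promise<string[]> {
  await sendChord(Key.C);                   // Copy the user's selection
  const selection = await clipboard.read(); // Read it from the OS clipboard
  const variations = await callGeminiVariations(systemPrompt, selection);
  await clipboard.write(variations[0]);     // Stage the first variation
  await sendChord(Key.V);                   // Paste over the selection
  return variations;                        // Remaining variations back the dial
}
```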
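
The Gemini call itself, sketched against the public generateContent REST endpoint (the model name, prompt wording, and five-variation JSON contract are our design assumptions; error handling is elided):

```typescript
// Ask Gemini for a JSON array of 5 rewrites so the dial can scrub indices 0-4 locally.
async function callGeminiVariations(systemPrompt: string, text: string): Promise<string[]> {
  const url =
    "https://generativelanguage.googleapis.com/v1beta/models/" +
    `gemini-1.5-flash:generateContent?key=${process.env.GEMINI_API_KEY}`;
  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      contents: [{
        parts: [{ text: `${systemPrompt}\nReturn a JSON array of exactly 5 string variations.\n\n${text}` }],
      }],
      generationConfig: { responseMimeType: "application/json" }, // JSON mode
    }),
  });
  const data = (await res.json()) as any;
  // The model's reply is itself a JSON-encoded array of strings.
  return JSON.parse(data.candidates[0].content.parts[0].text);
}
```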

Anticipated Technical Challenges

Integrating hardware with OS-level commands and asynchronous cloud APIs will introduce severe race conditions that we are already planning to mitigate.

Our biggest anticipated hurdle is the Clipboard Overwrite Issue. The simulated Ctrl+V keystroke could easily fire before the OS has fully registered the new Gemini payload into the system clipboard, resulting in the plugin pasting the user's original text back to them. To solve this, we will engineer a robust asynchronous polling mechanism to verify the clipboard state hash before executing the final paste simulation.
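
A sketch of that guard, again assuming clipboardy (for short payloads, comparing the string directly is equivalent to comparing hashes):

```typescript
import clipboard from "clipboardy";

// Resolve once the OS clipboard actually holds the payload we wrote, or give up after ~1 s.
async function waitForClipboard(expected: string, timeoutMs = 1000): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if ((await clipboard.read()) === expected) return; // Write has registered
    await new Promise((r) => setTimeout(r, 20));       // Back off briefly, then re-check
  }
  throw new Error("Clipboard write was not registered in time");
}

// Usage inside applyDirective, between the write and the paste:
//   await clipboard.write(variations[0]);
//   await waitForClipboard(variations[0]);
//   await sendChord(Key.V);
```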

Additionally, handling the analog data from the MX Dial will require debouncing. The hardware sends continuous rotation events, which could cause our plugin to skip through array indices too quickly. We plan to implement a custom logarithmic threshold to make the dial rotation feel "heavy" and precise.
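
A sketch of that accumulator, assuming the SDK reports signed rotation deltas; the units and threshold constant are placeholders to tune on real hardware, and the planned logarithmic scaling would replace the constant:

```typescript
// Accumulate small rotation deltas and only step the variation index once enough
// rotation has built up, so the dial feels "heavy" rather than twitchy.
const STEP_THRESHOLD = 15; // Rotation units per index step (placeholder value)
let accumulated = 0;
let index = 0;

function onDialRotate(delta: number, variations: string[]): void {
  accumulated += delta;
  if (Math.abs(accumulated) < STEP_THRESHOLD) return; // Not enough rotation yet
  const step = Math.sign(accumulated);                // -1 = left, +1 = right
  accumulated = 0;
  index = Math.min(Math.max(index + step, 0), variations.length - 1); // Clamp to [0, 4]
  // ...live-replace the on-screen text with variations[index]
}
```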

The Core Innovation

We are most excited about bringing the "Dial-to-Iterate" feature to life. Prompt engineering is inherently imprecise, but by mapping an array of AI responses to a physical, analog dial, we will make interacting with an LLM feel like tuning a radio. It transforms an unpredictable software interaction into a highly deterministic, satisfying physical experience.

We are also focused on the sheer universality of the tool. Because it relies on the OS-level clipboard rather than locked-down app APIs, TextTuner will work in any application that supports copy and paste, from a heavily secured enterprise IDE to a basic Notepad window.

UX Philosophy

Building this will test our ability to design haptic and visual hardware feedback. When dealing with variable latency from an AI API, the user needs to know the system hasn't frozen. Pushing dynamic state changes (like a custom "thinking" UI) to the MX Console's LCD screens via the Actions SDK will be a core focus of our UX design.

What's next for TextTuner

This hackathon submission is just the foundation. Our roadmap leading up to the finals in Switzerland and beyond includes:

  1. The Agent Marketplace: Allowing users to write their own custom Gemini system prompts and map them to their own custom LCD icons.
  2. Context-Aware Mapping: Utilizing the Logi SDK's app-detection feature so the LCD keys dynamically change their AI agents based on the active app (e.g., showing "Debug Code" in VS Code, but switching to "Tone Adjust" when Microsoft Outlook is focused).
  3. Multimodal Dialing: Expanding to image generation, where the dial physically scrubs through AI-generated image variations inside tools like Figma or Canva.

Built With

  • clipboardy
  • gemini-api
  • logitech-actions-sdk
  • mx-creative-console
  • mx-master-4
  • node.js
  • nut-js
  • typescript