Project Title: Local LLM Inference Web Interface

Overview: This project is a simple Streamlit-based web application that connects to a locally hosted LLM using Ollama. The interface enables a user to enter questions, configure the local backend settings, and receive model-generated responses in a conversational format.

Features:

  1. Clean Streamlit frontend with a chat-style interface
  2. Configurable Ollama backend URL, endpoint path, and model name
  3. Adjustable max_tokens and temperature settings
  4. Conversation history display
  5. Reset conversation button
  6. Support for streaming Ollama JSON response chunks

Implementation Details:
  1. Written in Python using streamlit and requests
  2. Backend communication is performed via HTTP POST requests to the local Ollama endpoint
  3. The app parses streaming JSON lines returned by Ollama and aggregates them into a complete assistant response
  4. Chat history is stored in st.session_state to preserve messages during page interaction

Setup and Usage Instructions:
  1. Open your project directory:

     ```powershell
     cd C:\Users\Admin\my_llm_app
     ```

  2. Create a Python virtual environment:

     ```powershell
     py -3.14 -m venv .venv
     ```

  3. Activate the environment:

     ```powershell
     Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass
     .\.venv\Scripts\Activate.ps1
     ```

  4. Install dependencies:

     ```powershell
     python -m pip install --upgrade pip
     python -m pip install -r requirements.txt
     ```

  5. Run the application:

     ```powershell
     python -m streamlit run app.py
     ```

  6. Open the browser at the URL provided by Streamlit, usually http://localhost:8501
  7. In the app sidebar, verify the settings:
     - Ollama Backend URL: http://localhost:11434
     - Endpoint path: /api/generate
     - Model name: llama3.2 (or your installed model name)

Notes:
  a. The app assumes Ollama is running locally and accessible on port 11434
  b. If the app does not respond, ensure Ollama is active and the endpoint is correct
  c. This project is designed for local inference, not cloud deployment
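
As an illustration of how the sidebar settings could be turned into an HTTP POST request, here is a minimal sketch. The function name `build_generate_request` and its default values are hypothetical and not taken from app.py. Mapping max_tokens to Ollama's `num_predict` option is an assumption about how the app forwards that setting.

```python
def build_generate_request(base_url, endpoint_path, model, prompt,
                           max_tokens=256, temperature=0.7):
    """Return (url, payload) for a streaming call to Ollama's generate endpoint."""
    # Join the backend URL and endpoint path without doubling or dropping the slash.
    url = base_url.rstrip("/") + "/" + endpoint_path.lstrip("/")
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": True,  # ask Ollama for newline-delimited JSON chunks
        "options": {
            # Assumption: the UI's max_tokens is sent as Ollama's num_predict.
            "num_predict": int(max_tokens),
            "temperature": float(temperature),
        },
    }
    return url, payload


url, payload = build_generate_request(
    "http://localhost:11434", "/api/generate", "llama3.2", "Why is the sky blue?"
)
print(url)  # http://localhost:11434/api/generate
```

The returned pair could then be passed to `requests.post(url, json=payload, stream=True)`, provided Ollama is running on the configured port.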

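The streaming aggregation mentioned under Implementation Details might look roughly like the sketch below. It assumes each line returned by Ollama is a JSON object with a `response` text fragment and a `done` flag, which is the shape of the /api/generate stream. The helper name `aggregate_stream` and the sample chunks are hypothetical, and the app's actual parsing code may differ.

```python
import json


def aggregate_stream(lines):
    """Concatenate the "response" text from newline-delimited JSON chunks."""
    parts = []
    for raw in lines:
        if not raw:
            continue  # skip blank keep-alive lines
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8")
        try:
            chunk = json.loads(raw)
        except json.JSONDecodeError:
            continue  # ignore a malformed line instead of aborting the reply
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break  # the final chunk marks the end of the answer
    return "".join(parts)


# Hypothetical sample chunks standing in for a live Ollama response.
sample = [
    b'{"model": "llama3.2", "response": "Hello", "done": false}',
    b'',
    b'{"model": "llama3.2", "response": ", world", "done": false}',
    b'{"model": "llama3.2", "response": "", "done": true}',
]
print(aggregate_stream(sample))  # Hello, world
```

With requests, the `lines` argument could come from `response.iter_lines()` on a POST made with `stream=True`. Keeping the parser separate from the network call makes it testable without a running Ollama server.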