Direct Browser → Localhost
No server routes, no API layer. Your React app calls the local LLM directly via fetch + streaming.
Multi-Backend
Ollama, LM Studio, llama.cpp, and any OpenAI-compatible endpoint. Auto-detected by port.
Full Chat State
Message history, system prompts, abort, clear — all managed inside the hook. No boilerplate.
2.8 KB Gzipped
Zero runtime dependencies. Only a peer dependency on React ≥17. Tree-shakeable ESM + CJS.
Token-by-Token Streaming
Real-time text rendering with onToken callbacks. Supports both SSE and NDJSON protocols.
TypeScript-First
Written in strict TypeScript. Every option, return value, and callback is fully typed.
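The "auto-detected by port" behavior above can be pictured as a small lookup from the endpoint's port to a backend flavor. This is a hypothetical sketch, not the library's actual table; the ports are simply the well-known defaults for each backend, and `detectBackend` is an illustrative name:

```typescript
// Hypothetical sketch of port-based backend detection. The ports are
// the conventional defaults: Ollama (11434), LM Studio (1234),
// llama.cpp server (8080); anything else falls back to the generic
// OpenAI-compatible protocol.
type Backend = "ollama" | "lmstudio" | "llamacpp" | "openai-compatible";

function detectBackend(baseUrl: string): Backend {
  const port = new URL(baseUrl).port;
  switch (port) {
    case "11434": return "ollama";     // Ollama default port
    case "1234":  return "lmstudio";   // LM Studio default port
    case "8080":  return "llamacpp";   // llama.cpp server default port
    default:      return "openai-compatible";
  }
}
```

A scheme like this keeps the common case zero-config while still letting an explicit backend option override the guess.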
Quick Start
import { useOllama } from "use-local-llm";

function Chat() {
  const { messages, send, isStreaming } = useOllama("gemma3:1b");
  return (
    <div>
      {messages.map((m, i) => (
        <p key={i}><b>{m.role}:</b> {m.content}</p>
      ))}
      <button onClick={() => send("Hello!")} disabled={isStreaming}>
        {isStreaming ? "Generating..." : "Send"}
      </button>
    </div>
  );
}
Install with npm install use-local-llm · Start Ollama · Done.
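The NDJSON half of the streaming support can be sketched as a chunk parser: Ollama's native API streams one JSON object per line, each carrying a token in `message.content`. This is an illustrative sketch, not the library's internals; `parseNdjsonChunk` and the `rest` buffer convention are assumptions for the example:

```typescript
// Shape of one NDJSON line from Ollama's streaming chat API.
type NdjsonLine = { message?: { content?: string }; done?: boolean };

// Split a raw network chunk into complete lines and extract tokens.
// The trailing partial line is returned as `rest` so the caller can
// prepend it to the next chunk (fetch reads don't align with lines).
function parseNdjsonChunk(chunk: string): { tokens: string[]; rest: string } {
  const lines = chunk.split("\n");
  const rest = lines.pop() ?? ""; // last element is incomplete or empty
  const tokens: string[] = [];
  for (const line of lines) {
    if (!line.trim()) continue;
    const obj = JSON.parse(line) as NdjsonLine;
    if (obj.message?.content) tokens.push(obj.message.content);
  }
  return { tokens, rest };
}
```

Each extracted token would then be handed to the `onToken` callback; the SSE branch differs only in stripping the `data: ` prefix before parsing.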
Why not Vercel AI SDK?
The Vercel AI SDK's chat hooks are designed around server-side route handlers, so streaming from localhost:11434 in the browser means writing and hosting a proxy first. use-local-llm was built specifically for direct browser → localhost streaming with zero server code.
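One practical caveat for the browser → localhost path: the local backend still has to allow your page's origin via CORS. With Ollama that is a single environment variable; the origin below assumes a Vite dev server on port 5173, so substitute your own:

```shell
# Allow the dev server's origin to call the Ollama API directly from the browser
OLLAMA_ORIGINS="http://localhost:5173" ollama serve
```

No proxy or API route is involved; the app server never touches the traffic.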