Direct Browser → Localhost
No server routes, no API layer. Your React app calls the local LLM directly via fetch + streaming.
Multi-Backend
Ollama, LM Studio, llama.cpp, and any OpenAI-compatible endpoint. Auto-detected by port.
Full Chat State
Message history, system prompts, abort, clear — all managed inside the hook. No boilerplate.
2.8 KB Gzipped
Zero runtime dependencies. Only a peer dependency on React ≥17. Tree-shakeable ESM + CJS.
Token-by-Token Streaming
Real-time text rendering with onToken callbacks. Supports both SSE and NDJSON protocols.
TypeScript-First
Written in strict TypeScript. Every option, return value, and callback is fully typed.
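The "auto-detected by port" behavior above can be pictured as a small lookup from the endpoint's port to a backend flavor. This is a hypothetical sketch, not the library's actual table; the ports are simply the well-known defaults for each backend, and `detectBackend` is an illustrative name:

```typescript
// Hypothetical sketch of port-based backend detection. The ports are
// the conventional defaults: Ollama (11434), LM Studio (1234),
// llama.cpp server (8080); anything else falls back to the generic
// OpenAI-compatible protocol.
type Backend = "ollama" | "lmstudio" | "llamacpp" | "openai-compatible";

function detectBackend(baseUrl: string): Backend {
  const port = new URL(baseUrl).port;
  switch (port) {
    case "11434": return "ollama";     // Ollama default port
    case "1234":  return "lmstudio";   // LM Studio default port
    case "8080":  return "llamacpp";   // llama.cpp server default port
    default:      return "openai-compatible";
  }
}
```

A scheme like this keeps the common case zero-config while still letting an explicit backend option override the guess.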
Quick Start
import { useOllama } from "use-local-llm";

function Chat() {
  const { messages, send, isStreaming } = useOllama("gemma3:1b");
  return (
    <div>
      {messages.map((m, i) => (
        <p key={i}><b>{m.role}:</b> {m.content}</p>
      ))}
      <button onClick={() => send("Hello!")} disabled={isStreaming}>
        {isStreaming ? "Generating..." : "Send"}
      </button>
    </div>
  );
}
Install with npm install use-local-llm · Start Ollama · Done.
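The NDJSON half of the streaming support can be sketched as a chunk parser: Ollama's native API streams one JSON object per line, each carrying a token in `message.content`. This is an illustrative sketch, not the library's internals; `parseNdjsonChunk` and the `rest` buffer convention are assumptions for the example:

```typescript
// Shape of one NDJSON line from Ollama's streaming chat API.
type NdjsonLine = { message?: { content?: string }; done?: boolean };

// Split a raw network chunk into complete lines and extract tokens.
// The trailing partial line is returned as `rest` so the caller can
// prepend it to the next chunk (fetch reads don't align with lines).
function parseNdjsonChunk(chunk: string): { tokens: string[]; rest: string } {
  const lines = chunk.split("\n");
  const rest = lines.pop() ?? ""; // last element is incomplete or empty
  const tokens: string[] = [];
  for (const line of lines) {
    if (!line.trim()) continue;
    const obj = JSON.parse(line) as NdjsonLine;
    if (obj.message?.content) tokens.push(obj.message.content);
  }
  return { tokens, rest };
}
```

Each extracted token would then be handed to the `onToken` callback; the SSE branch differs only in stripping the `data: ` prefix before parsing.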
Why not Vercel AI SDK?
The Vercel AI SDK's chat hooks are designed around server-side route handlers, so streaming from localhost:11434 in the browser means writing and hosting a proxy first. use-local-llm was built specifically for direct browser → localhost streaming with zero server code.
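One practical caveat for the browser → localhost path: the local backend still has to allow your page's origin via CORS. With Ollama that is a single environment variable; the origin below assumes a Vite dev server on port 5173, so substitute your own:

```shell
# Allow the dev server's origin to call the Ollama API directly from the browser
OLLAMA_ORIGINS="http://localhost:5173" ollama serve
```

No proxy or API route is involved; the app server never touches the traffic.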