Getting Started

Get up and running with use-local-llm in under 2 minutes.

Prerequisites

You need a local LLM server running. The easiest option is Ollama:

# Install Ollama (macOS)
brew install ollama

# Pull a model
ollama pull gemma3:1b

# Start the server (if not running)
ollama serve

Verify it's running:

curl http://localhost:11434/api/tags
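If you prefer to check from code, the response is JSON with a `models` array listing your installed models. A small helper to pull the model names out of that payload (a sketch; `listModelNames` is a hypothetical helper, field names follow Ollama's `/api/tags` response):

```typescript
// Shape of Ollama's GET /api/tags response (only the fields we use).
interface TagsResponse {
  models: { name: string }[];
}

// Extract installed model names from a /api/tags response body.
function listModelNames(body: string): string[] {
  const parsed = JSON.parse(body) as TagsResponse;
  return parsed.models.map((m) => m.name);
}

// Example payload like the one curl returns above.
const sample = '{"models":[{"name":"gemma3:1b"},{"name":"llama3:8b"}]}';
console.log(listModelNames(sample)); // → ["gemma3:1b", "llama3:8b"]
```

If `gemma3:1b` shows up in the list, you're ready to go.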

Install the package

npm install use-local-llm

Your first component

Chat.tsx
import { useOllama } from "use-local-llm";

export default function Chat() {
  const { messages, send, isStreaming, abort } = useOllama("gemma3:1b");

  return (
    <div>
      <h2>Chat with Gemma</h2>

      {messages.map((m, i) => (
        <div key={i} style={{ margin: "0.5rem 0" }}>
          <strong>{m.role}:</strong> {m.content}
        </div>
      ))}

      <button
        onClick={() => send("Explain React hooks in one sentence")}
        disabled={isStreaming}
      >
        {isStreaming ? "Generating..." : "Ask"}
      </button>

      {isStreaming && (
        <button onClick={abort} style={{ marginLeft: "0.5rem" }}>
          Stop
        </button>
      )}
    </div>
  );
}

That's it. The hook handles:

  • Sending the user message
  • Streaming tokens from Ollama
  • Updating message history
  • Abort/cancel support
  • Error handling
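Under the hood, Ollama streams chat responses as newline-delimited JSON, one object per token chunk. A minimal sketch of folding such a stream into the assistant's message text (an illustration of the streaming format, not the hook's actual internals):

```typescript
// One NDJSON chunk from Ollama's /api/chat stream (relevant fields only).
interface ChatChunk {
  message?: { content: string };
  done: boolean;
}

// Accumulate a raw NDJSON buffer into the assistant's full reply.
function accumulateChunks(ndjson: string): string {
  let text = "";
  for (const line of ndjson.split("\n")) {
    if (!line.trim()) continue;
    const chunk = JSON.parse(line) as ChatChunk;
    if (chunk.message) text += chunk.message.content;
    if (chunk.done) break; // final chunk carries no content
  }
  return text;
}

const stream =
  '{"message":{"content":"Hooks let "},"done":false}\n' +
  '{"message":{"content":"you reuse state logic."},"done":false}\n' +
  '{"done":true}\n';
console.log(accumulateChunks(stream)); // → "Hooks let you reuse state logic."
```

The hook does this incrementally as chunks arrive, which is why `messages` updates token by token while `isStreaming` is true.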

Using a different backend

import { useLocalLLM } from "use-local-llm";

// LM Studio
const chat = useLocalLLM({
  endpoint: "http://localhost:1234",
  model: "my-model",
});

// llama.cpp
const chat = useLocalLLM({
  endpoint: "http://localhost:8080",
  model: "my-model",
});

The backend is auto-detected from the port:

Port     Backend
11434    Ollama
1234     LM Studio
8080     llama.cpp
Other    OpenAI-compatible
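In other words, detection is just a port lookup. A sketch of what that mapping might look like (hypothetical helper, not the package's exported API):

```typescript
type Backend = "ollama" | "lmstudio" | "llamacpp" | "openai-compatible";

// Map an endpoint URL's port to a backend, per the table above.
function detectBackend(endpoint: string): Backend {
  const port = new URL(endpoint).port;
  switch (port) {
    case "11434": return "ollama";
    case "1234":  return "lmstudio";
    case "8080":  return "llamacpp";
    default:      return "openai-compatible";
  }
}

console.log(detectBackend("http://localhost:11434")); // → "ollama"
console.log(detectBackend("http://localhost:5000"));  // → "openai-compatible"
```

Anything on an unrecognized port is treated as a generic OpenAI-compatible server, so most self-hosted backends work without extra configuration.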

Next steps