Build your own local ChatGPT with Ollama + a homemade UI

Overview

The idea is simple: three small pieces, all running locally.

Architecture

Ollama → the LLM engine, running locally
My Node server → a small proxy that keeps the browser on a single origin (so no CORS issues) and gives me a place to add logic
My UI → a basic HTML page that looks like a mini ChatGPT

1) Install Ollama

Download Ollama from the official site (Mac, Linux, Windows).

https://ollama.com/download

Once it is installed, check that it is running:

curl http://localhost:11434/api/tags

If it answers with a JSON list of models (possibly empty), that's a good sign ✅
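If you would rather do that check from Node than from cURL, here is a minimal sketch. The helper name listModelNames is mine (not part of Ollama), and it assumes the /api/tags response has the shape { models: [{ name: ... }] }.

```javascript
// Hypothetical helper: pull the model names out of the JSON returned by GET /api/tags.
// Assumes a response shaped like { models: [{ name: "llama3:latest", ... }] }.
function listModelNames(tagsResponse) {
  const models = Array.isArray(tagsResponse?.models) ? tagsResponse.models : [];
  return models
    .map((m) => m?.name)
    .filter((name) => typeof name === 'string');
}

// Usage sketch (Node 18+ has a global fetch, run inside an async function):
//   const r = await fetch('http://localhost:11434/api/tags');
//   console.log(listModelNames(await r.json()));
```

An empty array simply means Ollama is up but no model has been pulled yet.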

2) Download a model

Out of the box, no model is installed. You can pick one from the models available on the Ollama site:

https://ollama.com/search

I went with Llama 3:

ollama pull llama3
ollama run llama3

Or, to try things out quickly without eating all your RAM:

ollama run smollm2:135m

3) Start the API and configure CORS

The API is usually already running on http://127.0.0.1:11434. If needed, you can (re)start it:

ollama serve

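👉 If a page served from elsewhere (say localhost:3000) calls the Ollama API directly from the browser, set this variable before starting the server:

export OLLAMA_ORIGINS=http://localhost:3000

Otherwise… CORS error incoming 😅 (With the Node proxy from step 4, the browser only ever talks to localhost:3000, so this setting is not needed.)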

4) Build a homemade UI

I wanted to keep it simple: a small Node.js server + a minimal HTML page.

a) Create the server (proxy to Ollama)

mkdir chat-local && cd chat-local
npm init -y
# server.js uses ES module imports, so mark the package as a module
npm pkg set type=module
npm i express node-fetch

Then create server.js (you will start it with node server.js once the UI is in place):

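import express from 'express';
import fetch from 'node-fetch';
import path from 'path';
import { fileURLToPath } from 'url';

// ES modules have no __dirname, so rebuild it from import.meta.url
const __dirname = path.dirname(fileURLToPath(import.meta.url));
const app = express();
app.use(express.json());
// Serve the UI from ./public
app.use(express.static(path.join(__dirname, 'public')));

// Forward chat requests to the local Ollama API
app.post('/chat', async (req, res) => {
  const { model = 'llama3', messages = [] } = req.body || {};
  try {
    const r = await fetch('http://127.0.0.1:11434/api/chat', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ model, messages, stream: false }),
    });
    const data = await r.json();
    // Pass Ollama's status code through so the UI can see errors
    res.status(r.status).json(data);
  } catch (err) {
    // Ollama is unreachable or answered with something that is not JSON
    res.status(502).json({ error: 'Could not reach Ollama' });
  }
});

app.listen(3000, () => console.log('UI available at http://localhost:3000'));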
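Since the proxy is also the place "to add logic", one thing I would probably add first is a sanity check on the request body before forwarding it. This is only a sketch: validateChatBody and the list of accepted roles are my own assumptions, not something Ollama or Express provides.

```javascript
// Hypothetical validator for the body the UI posts to /chat.
// The accepted roles are an assumption: the ones this UI sends, plus "system".
const ALLOWED_ROLES = new Set(['system', 'user', 'assistant']);

function validateChatBody(body) {
  const { model = 'llama3', messages } = body || {};
  if (typeof model !== 'string' || model.trim() === '') {
    return { ok: false, error: 'model must be a non-empty string' };
  }
  if (!Array.isArray(messages) || messages.length === 0) {
    return { ok: false, error: 'messages must be a non-empty array' };
  }
  for (const m of messages) {
    if (!m || !ALLOWED_ROLES.has(m.role) || typeof m.content !== 'string') {
      return { ok: false, error: 'each message needs a valid role and a string content' };
    }
  }
  return { ok: true, model, messages };
}

// Usage sketch inside the /chat route:
//   const v = validateChatBody(req.body);
//   if (!v.ok) return res.status(400).json({ error: v.error });
```

Returning a plain { ok, error } object keeps the helper independent of Express, so it is easy to test on its own.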

b) The UI (HTML)

Create public/index.html:

<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8" />
  <title>My Local Chat</title>
  <style>
    body {
      margin: 0;
      font-family: system-ui, sans-serif;
      background: #f6f7f9;
      color: #222;
      display: flex;
      flex-direction: column;
      height: 100vh;
    }
    header {
      padding: 12px 20px;
      border-bottom: 1px solid #ddd;
      background: #fff;
      font-weight: 600;
    }
    .chat {
      flex: 1;
      overflow-y: auto;
      padding: 20px;
      display: flex;
      flex-direction: column;
      gap: 12px;
    }
    .msg {
      max-width: 75%;
      padding: 10px 14px;
      border-radius: 10px;
      line-height: 1.4;
      white-space: pre-wrap;
    }
    .me {
      align-self: flex-end;
      background: #007aff;
      color: #fff;
    }
    .bot {
      align-self: flex-start;
      background: #e5e7eb;
      color: #111;
    }
    .composer {
      display: flex;
      border-top: 1px solid #ddd;
      padding: 10px;
      background: #fff;
    }
    textarea {
      flex: 1;
      border: 1px solid #ccc;
      border-radius: 8px;
      padding: 10px;
      font: inherit;
      resize: none;
    }
    button {
      margin-left: 8px;
      padding: 0 16px;
      border: none;
      border-radius: 8px;
      background: #007aff;
      color: #fff;
      font-weight: 600;
      cursor: pointer;
    }
  </style>
</head>
<body>
  <header>💬 My Local ChatGPT</header>
  <div id="chat" class="chat"></div>
  <div class="composer">
    <textarea id="input" rows="2" placeholder="Type a message…"></textarea>
    <button id="send">Send</button>
  </div>
  <script>
    const chat = document.getElementById('chat');
    const input = document.getElementById('input');
    const btn = document.getElementById('send');
    const messages = [];

    function add(role, content) {
      const div = document.createElement('div');
      div.className = 'msg ' + (role === 'user' ? 'me' : 'bot');
      div.textContent = content;
      chat.appendChild(div);
      chat.scrollTop = chat.scrollHeight;
    }

    async function send() {
      const text = input.value.trim();
      if (!text) return;
      input.value = '';
      add('user', text);
      messages.push({ role: 'user', content: text });

      try {
        const r = await fetch('/chat', {
          method: 'POST',
          headers: { 'Content-Type': 'application/json' },
          body: JSON.stringify({ model: 'llama3', messages })
        });
        const data = await r.json();
        const reply = data?.message?.content || "⚠️ No response";
        add('assistant', reply);
        messages.push({ role: 'assistant', content: reply });
      } catch (e) {
        add('assistant', '⚠️ Error: could not reach the chat server.');
      }
    }

    btn.onclick = send;
    input.addEventListener('keydown', (e) => {
      if (e.key === 'Enter' && !e.shiftKey) {
        e.preventDefault();
        send();
      }
    });
  </script>
</body>
</html>

5) Test with cURL

curl http://localhost:11434/api/chat -d '{
  "model": "llama3",
  "messages": [{ "role": "user", "content": "Explain closures in JS" }],
  "stream": false
}'
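The same request can be built and read from JavaScript. A minimal sketch, assuming the non-streaming response carries the text in message.content and that failures come back as an { error: "..." } object; buildChatPayload and extractReply are names I made up for the example.

```javascript
// Hypothetical helper: build the same body as the cURL call above.
function buildChatPayload(model, userText) {
  return {
    model,
    messages: [{ role: 'user', content: userText }],
    stream: false,
  };
}

// Hypothetical helper: read the assistant text from a non-streaming /api/chat response.
// Assumes errors are reported as { error: "..." } (an assumption on my side).
function extractReply(data) {
  if (data && typeof data.error === 'string') {
    throw new Error(`Ollama error: ${data.error}`);
  }
  const content = data?.message?.content;
  if (typeof content !== 'string') {
    throw new Error('Unexpected response shape');
  }
  return content;
}

// Usage sketch (Node 18+, inside an async function):
//   const r = await fetch('http://localhost:11434/api/chat', {
//     method: 'POST',
//     headers: { 'Content-Type': 'application/json' },
//     body: JSON.stringify(buildChatPayload('llama3', 'Explain closures in JS')),
//   });
//   console.log(extractReply(await r.json()));
```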

6) Screenshots

The UI in action:

7) Docker bonus

If you want to isolate the environment:

docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

Then test:

curl http://localhost:11434/api/tags

⚠️ 8) Common pitfalls

Port already in use → change it with OLLAMA_HOST=127.0.0.1:11435 ollama serve (and update the URL in server.js to match)
CORS → remember OLLAMA_ORIGINS=http://localhost:3000 if the browser calls Ollama directly
Out of RAM → start with a small model (e.g. smollm2:135m)
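If you do move Ollama to another port, the proxy has to follow. One way to avoid editing server.js each time is to derive the URL from the same OLLAMA_HOST variable. A small sketch: ollamaBaseUrl is a hypothetical helper of mine, and it only handles the simple host:port or full-URL forms.

```javascript
// Hypothetical helper: turn an OLLAMA_HOST value into a base URL for the proxy.
// Only handles "host:port" or "http(s)://host:port"; falls back to the default address.
function ollamaBaseUrl(env = {}) {
  const raw = (env.OLLAMA_HOST || '').trim();
  if (raw === '') return 'http://127.0.0.1:11434';
  const withScheme = /^https?:\/\//.test(raw) ? raw : `http://${raw}`;
  return withScheme.replace(/\/+$/, ''); // drop trailing slashes
}

// Usage sketch in server.js:
//   const base = ollamaBaseUrl(process.env);
//   const r = await fetch(`${base}/api/chat`, { /* same options as before */ });
```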

Conclusion

Honestly, it's satisfying:

I have my own ChatGPT-style chatbot, but offline.
It runs on my laptop, no cloud needed.
I can tinker with the UI, try other models, add my own features.
And above all, I can imagine plenty of next steps: adding conversation memory, pulling in external content (PDFs, docs), switching to an agent mode with several tools, or doing advanced search over my local data.

👉 Next step: handle token-by-token streaming, write a custom Modelfile and, why not, plug in a small homemade RAG. And beyond the technical side, the goal is clear: let people and companies run their own responsible AI, one that genuinely protects their personal data.
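As a teaser for the streaming part: with "stream": true, Ollama sends back one JSON object per line instead of a single response. Here is a rough sketch of how I would read that, assuming each line looks like { "message": { "content": "..." }, "done": false }; createNdjsonParser is a hypothetical helper, not an Ollama API.

```javascript
// Hypothetical helper: feed it text chunks as they arrive, get tokens via a callback.
// Assumes newline-delimited JSON where each object may carry message.content.
function createNdjsonParser(onToken) {
  let buffer = '';
  return function push(chunkText) {
    buffer += chunkText;
    const lines = buffer.split('\n');
    buffer = lines.pop(); // the last piece may be an incomplete line, keep it for later
    for (const line of lines) {
      if (line.trim() === '') continue;
      const obj = JSON.parse(line);
      if (obj.message?.content) onToken(obj.message.content);
    }
  };
}

// Usage sketch: append each token to the current chat bubble.
//   const push = createNdjsonParser((token) => { bubble.textContent += token; });
//   then call push(decodedText) for every chunk read from the response body.
```

Buffering the trailing partial line matters because network chunks do not necessarily end on a line boundary.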