Just booted your PC? Start chatting in 10 seconds
The AI engine (Ollama) installs itself as a Windows service, so it's already running the second your PC turns on. You never "start the engine". You just open a way to talk to it. Fastest way, straight into PowerShell:
ollama run killa
Type away · /bye to leave. First message after a boot takes a few seconds (it's loading your brain into the 4090). Want the pretty chat box or your phone? See sections 2 & 3.
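If you want proof the engine really is up before you chat, here's a rough sketch in Python. It assumes Ollama is listening on its default local address (http://localhost:11434) and that Python is installed on this PC, which nothing above requires. The function name list_models is made up for this example.

```python
import json
import urllib.error
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: Ollama's default local address


def list_models(base_url=OLLAMA_URL, timeout=3.0):
    """Return installed model names, or None if the engine is not reachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            data = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a reply that is not JSON.
        return None
    return [m.get("name", "") for m in data.get("models", [])]


if __name__ == "__main__":
    models = list_models()
    if models is None:
        print("Ollama is not answering on port 11434")
    else:
        print("Ollama is up. Models:", ", ".join(models) or "(none yet)")
```

If killa is missing from the list, the model hasn't been created yet (section 1).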
Too tired to read? Tonight's update in 20 seconds
What changed: when your mates open your shared link, after the password they now type their name, and every question + answer gets saved on your 4090 (private, never leaves). That's you banking real training examples while they chat.
What your mates do: nothing new, just type a name once. They're never asked the password or name again after that.
The ONE thing YOU do to switch it on: the server you've got running is the old one. Grab the upgraded one and restart it. That's it. Full steps in Section 4, or the quick grab:
Safe to run: it upgrades the code but keeps your password and your saved chat log. Then start it + share the link from Section 4.
Does any of this cost money? No: £0. The dashboard's on free Cloudflare Pages, the share-link is a free tunnel, and the AI runs on your own 4090. No Cloudflare Workers, no bills, no limits you'll hit. Only "cost" is the electric running your GPU.
1 First time on this PC (once)
Setup steps. If something's already done, skip it; running twice does no harm.
Install Ollama (once)
The engine that runs the AI. (You've likely done this. Skip if ollama --version works.)
winget install Ollama.Ollama
Then close and reopen PowerShell.
Get the model (once)
Downloads the Llama brain-weights (~4.9GB).
ollama pull llama3.1:8b
Give it YOUR brain: the killa model (once)
Bakes in your identity + tone so it knows it's local and knows you. Makes a model called killa.
Changed your mind? Delete Killa 4090 Dashboard.url from shell:startup (paste that into the Run box / File Explorer bar).
2 Chat on this PC anytime
Quick terminal chat
Fastest. Talk to your brain-model right in PowerShell.
ollama run killa
Type away. /bye to exit.
Or the nice chat box (GUI)
Start the server (section 3 below), then open localhost:8080. That page can actually talk to your AI. Pick killa in the model box.
⚠️ Don't chat on this web page: it's just your menu and it can't reach the AI (that's the 404 you saw). Chatting happens at localhost:8080 or in the terminal above.
cd "$env:USERPROFILE\Downloads\llama-chat-server"; powershell -ExecutionPolicy Bypass -File .\start-here.ps1
It prints an address like http://192.168.1.50:8080. Open THAT on any device on the same wifi. Leave the window open while you use it.
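Curious where an address like that comes from? The sketch below shows one common way to guess a PC's wifi address in Python. It is not taken from start-here.ps1, which may well do it differently. The function name lan_ip is made up for this example, and port 8080 is the one used above.

```python
import socket


def lan_ip():
    """Best-effort guess at this PC's address on the local network."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Connecting a UDP socket sends no packets; it only makes the OS
        # pick the outgoing network interface, whose address we then read.
        sock.connect(("192.0.2.1", 80))  # reserved documentation address
        return sock.getsockname()[0]
    except OSError:
        return "127.0.0.1"  # no usable network: fall back to loopback
    finally:
        sock.close()


if __name__ == "__main__":
    print(f"http://{lan_ip()}:8080")
```

If it prints 127.0.0.1, the PC isn't on a network your phone can reach.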
4 Send a private link (mum / your mates) off your wifi
Two windows: one runs the server (with a password), one opens the public link. This way you can SEE it working (the old one-click hid its errors).
Now with name + logging. After they type the password they get asked their name, so you know who said what. And every message + reply is saved locally to chat-log.jsonl in the server folder (on this 4090, private, never leaves). That's your training-data firehose: the more your mates chat, the more real examples you bank. Peek at it any time with the command in Handy commands.
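To see who's been chatting most, something like this Python sketch could tally the log. Big caveat: the layout of chat-log.jsonl isn't described here, so the "name" field is a guess. Open the file, check what the keys are really called, and pass the right one as name_field. The function name count_by_name is made up for this example.

```python
import json
from collections import Counter
from pathlib import Path


def count_by_name(lines, name_field="name"):
    """Count logged entries per person; skips blank or malformed lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. a half-written last line
        if isinstance(entry, dict):
            # "name" is an assumed key, not confirmed by the server code.
            counts[str(entry.get(name_field, "(no name)"))] += 1
    return counts


if __name__ == "__main__":
    log = Path("chat-log.jsonl")  # run this from the server folder
    if log.exists():
        with log.open(encoding="utf-8") as f:
            for name, n in count_by_name(f).most_common():
                print(f"{name}: {n}")
    else:
        print("No chat-log.jsonl in this folder yet")
```

It only reads the file, so it's safe to run while the server is up.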
First: grab the upgraded server (do once after tonight's update)
The names + logging live in the new server code. Run this to upgrade your copy (keeps your password + saved log).
Open a fresh PowerShell window and run this. It prints an https://…trycloudflare.com link.
cloudflared tunnel --url http://localhost:8080
Send mum that link + the password. Tell her to pick killa in the model box.
Keep BOTH windows open, and Ollama running. The link is fresh every time you run it, so always send the newest one (an old link gives a 404). Link + password = a key; only give them to people you trust. Want a permanent link + per-person login? That needs a domain. Just ask.
5 Training Lab: make it truly yours (advanced)
This is real training: baking your voice into the model's weights (a LoRA), not just notes it re-reads. Think muscle memory instead of a briefing. Do the steps in order.
Straight talk before you burn a night on this: fine-tuning locks in tone + identity. It does not make the model know more facts or stop it making things up (that's model size, a different job). And it's only as good as the data. The kit ships 50 seed examples: enough to nudge the voice + prove the pipeline, not transform it. We grow it toward ~300 (shadow-log + more) before it really bites. We keep the old killa until the new one earns the name.
Step 1: Get the training kit (once)
The dataset, the training script, and the guide. Downloads to your Downloads folder.
Open README.md in there and give it a read. It's the honest version of all this.
Step 2: Set up the trainer (once, fiddly)
Fine-tuning wants Linux tooling. Cleanest route on Windows = WSL2: a real Ubuntu running inside Windows with your 4090 passed through. Think a Linux box in a window, sharing your GPU.
wsl --install
Reboot after this. Then open Ubuntu from the Start menu and run the next one inside it:
nvidia-smi && pip install unsloth
nvidia-smi should list your 4090 (proves the GPU's visible). pip install unsloth pulls in the whole trainer; the base model (~5GB) downloads the first time you run training. This is the bit that can fight you. If it errors, copy me the exact red line and I'll unstick it. Not pretending it's one-click.
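Since Python is there for the trainer anyway, the same GPU check can be scripted. This is a sketch, not part of the kit: the function names gpu_names and has_4090 are made up for this example, and the only thing it calls is nvidia-smi with its standard query flags.

```python
import shutil
import subprocess


def gpu_names():
    """Return GPU names reported by nvidia-smi, or [] if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def has_4090(names):
    """True if any reported GPU name mentions a 4090."""
    return any("4090" in name for name in names)


if __name__ == "__main__":
    names = gpu_names()
    print("GPUs seen:", names or "none")
    print("4090 visible:", has_4090(names))
```

An empty list inside Ubuntu usually means the GPU isn't being passed through to WSL2 yet, so sort that before training.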
Step 3: Run the fine-tune (~30 min)
Inside Ubuntu, go to the kit folder and train. Your 4090 does the work.
# $USER is your Ubuntu username. If your Windows username is different, type that in the path instead.
cd "/mnt/c/Users/$USER/Downloads/killa-train" && python3 train.py
It trains, then spits out killa-tuned/, a ready-to-load model. Watch the loss tick down. Leave it cooking.
Step 4: Load it & judge it head-to-head
Bake the result into Ollama as killa2, then ask both the same thing and see who's better.
cd /mnt/c/Users/$USER/Downloads/killa-train; ollama create killa2 -f killa-tuned/Modelfile; ollama run killa2 "who are you and where do you run?"
Then compare: ollama run killa "who are you and where do you run?". If killa2 wins, tell me and we promote it to killa. If not, we grow the data and go again. Proof, not faith.
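To put both answers side by side instead of running two commands, here's a rough Python sketch that sends the same question to each model through Ollama's local HTTP API. It assumes Ollama is on its default address (http://localhost:11434) and that both killa and killa2 exist. The function names build_request, ask and compare are made up for this example.

```python
import json
import urllib.error
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: Ollama's default local address


def build_request(model, prompt, base_url=OLLAMA_URL):
    """Build a non-streaming /api/generate request for one model."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model, prompt, timeout=120.0):
    """Send the prompt to one model and return its full reply text."""
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
        return json.load(resp)["response"]


def compare(prompt, models=("killa", "killa2"), ask_fn=ask):
    """Ask every model the same question; returns {model: answer}."""
    return {model: ask_fn(model, prompt) for model in models}


if __name__ == "__main__":
    try:
        answers = compare("who are you and where do you run?")
    except (urllib.error.URLError, OSError) as err:
        print("Could not reach Ollama:", err)
    else:
        for model, answer in answers.items():
            print(f"--- {model} ---\n{answer}\n")
```

Run it wherever Ollama is reachable on localhost (normally Windows itself, not inside Ubuntu). One question proves little, so try a handful before calling a winner.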