Open access for next 5 hours (8GiB model, running on RTX 3090) or until server c...

ggerganov · 2026-04-01T06:53:53 1775026433

Better keep the KV cache in full precision

freakynit · 2026-04-01T07:06:01 1775027161

Wow, the GOAT himself. Thank you so much for creating llama.cpp! I will redeploy with a full-precision KV cache once requests stop coming in.

ramon156 · 2026-04-01T09:38:44 1775036324

I genuinely love talking to these models

https://ofo1j9j6qh20a8-80.proxy.runpod.net/#/chat/5554e479-0...

I'm contemplating whether I should drive or walk to the car wash (I just thought of that one HN post) and this is what it said after a few back-and-forths:

- Drive to the car (5 minutes), then park and wash.

- If you have a car wash nearby, you can walk there (2 minutes) and do the washing before driving to your car.

- If you're in a car wash location, drive to it and wash there.

Technically the last point was fine, but I like the creativity.

logicallee · 2026-04-01T05:58:07 1775023087

That was really impressive. It produced https://pastebin.com/PmJmTLJN pretty much instantly. (Very weak models can't do this.)

freakynit · 2026-04-01T10:48:20 1775040500

Update: RunPod has evicted this, as it was a spot instance.

Imustaskforhelp · 2026-04-01T08:39:59 1775032799

Kind sir, may I say thank you for doing so! I really appreciate it :D

TRCat · 2026-04-01T06:48:20 1775026100

Thank you! I am impressed by the speed of it.