I'm contemplating whether I should drive or walk to the car wash (I just thought of that one HN post) and this is what it said after a few back-and-forths:
- Drive to the car (5 minutes), then park and wash.
- If you have a car wash nearby, you can walk there (2 minutes) and do the washing before driving to your car.
- If you're in a car wash location, drive to it and wash there.
Technically the last point was fine, but I like the creativity.
https://ofo1j9j6qh20a8-80.proxy.runpod.net
The server can serve 5 parallel requests, with each request capped at around `13K` tokens. A few benchmarks I ran:
1. Input: 700 tokens, ttfs: ~0 seconds, output: 1822 tokens at ~190 t/s
2. Input: 6400+ tokens, ttfs: ~2 seconds, output: 2012 tokens at ~135 t/s
VRAM usage was consistently at ~4 GiB.
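For a rough sense of what those benchmark numbers mean end to end, here is a back-of-the-envelope sketch in Python. It assumes the t/s figure is the generation rate once output starts and that ttfs is the wait before the first output arrives; the dictionary keys and function name are made up for illustration, and `6400` stands in for the "6400+" input.

```python
# Back-of-the-envelope arithmetic on the approximate figures quoted above.
# Assumption: tokens_per_s is the decode rate after output begins, and
# ttfs_s is the delay before the first output arrives.
runs = [
    {"input_tokens": 700, "ttfs_s": 0.0, "output_tokens": 1822, "tokens_per_s": 190},
    {"input_tokens": 6400, "ttfs_s": 2.0, "output_tokens": 2012, "tokens_per_s": 135},
]

MAX_PARALLEL_REQUESTS = 5      # from the comment
TOKEN_CAP_PER_REQUEST = 13_000  # "around 13K", so only a rough figure


def estimated_total_seconds(run):
    """Estimated wall-clock time for one request: startup delay plus generation."""
    return run["ttfs_s"] + run["output_tokens"] / run["tokens_per_s"]


for run in runs:
    total = estimated_total_seconds(run)
    print(f"{run['input_tokens']} input tokens: about {total:.1f} s per request")

# Upper bound on tokens in flight if every slot hits its cap at once.
max_tokens_in_flight = MAX_PARALLEL_REQUESTS * TOKEN_CAP_PER_REQUEST
print(f"Worst case, roughly {max_tokens_in_flight} tokens in flight")
```

So the short prompt finishes in under 10 seconds and the long one in about 17, if those assumptions hold.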