I recently got a new (monster-spec) MacBook Pro and then, in a fit of questionable decision making, a (base-spec) Mac mini as well. I’ve had a great time playing with this hardware.
I got a ridiculously good deal on the Mac mini, so I picked it up without a full plan for what I would be using it for. It’s now my “project desk computer,” the one you glance at for iFixit guides while you work on whatever you’re working on. It also uses no electricity to speak of, so it’s an always-on environment for serving whatever I feel like serving. It’s worked out well; my MBP follows me around and is online and offline as needed, while the m4cube is just always there, running, serving whatever, all the time.
I like playing with AI stuff, so I have AI stuff running on it. Intuitively, it’s a very limited machine for AI. AI wants tons of VRAM and tons of processing power, and this base Mac mini has none of that, really. Its shared 16GB of RAM has to both run the OS and act as VRAM, so loading a huge model isn’t going to happen - and its M4 processor, while absolutely ridiculous for many reasons, is also used in an iPad. It’s impressive in its own ways, but it’s not a monster.
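As a rough sanity check on what actually fits (my back-of-envelope numbers, not anything measured on the mini): weights-only memory is roughly parameter count times bits per weight, and at 4-bit quantization the small models land comfortably inside 16GB of shared RAM while the big ones never will.

```python
# Back-of-envelope: weights-only memory for a model at a given
# quantization. Ignores KV cache, context, and runtime overhead,
# so real usage runs somewhat higher.
def weights_gb(params_billions: float, bits_per_weight: float) -> float:
    # 1e9 params * (bits/8) bytes each, converted back to GB (1e9 bytes)
    return params_billions * bits_per_weight / 8

for params, bits in [(3, 4), (7, 4), (70, 4)]:
    print(f"{params}B @ {bits}-bit: ~{weights_gb(params, bits):.1f} GB")
# 3B (~1.5 GB) and 7B (~3.5 GB) fit alongside macOS in 16 GB;
# a 70B (~35 GB) is not going to load.
```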
duckiesays.com is now served from the mini. It traditionally used a very small, very limited model, but since it’s not running on a Celeron NAS anymore, it can use a ~3x bigger model and still be much more responsive than it used to be… because the M4 processor is ridiculous in its own ways. I can also run various ~7b models alongside the duckie model with no real stress.
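I won’t get into the exact stack behind the site here, but as a sketch of what “serving a small local model” looks like in general, here’s a query against Ollama’s HTTP API. Ollama itself is an assumption for illustration (not necessarily what duckiesays.com runs), and the model name and prompt are placeholders:

```python
# Hypothetical sketch: ask a small local model for output via
# Ollama's HTTP API. Assumes `ollama serve` is running on the
# default port and the model has already been pulled.
import json
import urllib.request

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps({
        "model": "llama3.2:3b",  # placeholder: a small model that fits in shared RAM
        "prompt": "Say something a rubber duck would say.",
        "stream": False,         # return one JSON object instead of a stream
    }).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["response"])
```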
Why does this matter? Well… not long ago - earlier this year, even - I was running my always-on local AI stuff on an Nvidia 3090 Ti stuffed in a Linux box. That card draws about 100W at idle and is rated for up to 475W under load. Is it faster than the Mac mini? Oh, definitely, by far. You can do much more with a big Nvidia card. But… if all I really need is a couple of small models…
According to asitop, my Mac mini’s peak power usage for the CPU, GPU, and ANE (Apple Neural Engine) combined has been 5.17W. That’s ~92x less than the max power rating of just the 3090 Ti card, forget the rest of the computer there to support it.
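That ratio is just the two figures above divided (asitop, for what it’s worth, is a pip-installable monitor that reads from macOS’s powermetrics):

```python
# Power comparison using the figures above: the 3090 Ti's load
# rating vs. the mini's measured CPU+GPU+ANE peak from asitop.
gpu_watts = 475.0    # 3090 Ti under load (card only)
mini_watts = 5.17    # Mac mini peak, per asitop
print(f"~{gpu_watts / mini_watts:.0f}x")  # -> ~92x
```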
Big models are fun, but small models are the future. We’ll be able to put these everywhere doing all sorts of little things. I can foresee a ton of little robots specialized in certain things, maybe a translator, maybe a utility / maintenance thing… we could call them droids.