DEV Community

Christopher Maher

Husband, dad, and software engineering leader. Passionate about automation, AI, emerging tech, and ham radio (N7CPM).

Making a fleet of self-hosted LLM agents trustworthy

1
Comments
6 min read
TurboQuant on a MacBook Pro, part 2: perplexity, KL divergence, and asymmetric K/V on M5 Max

Comments
8 min read
TurboQuant on a MacBook Pro: two findings the upstream discussion missed

Comments
7 min read
62.2% on Aider Polyglot from a MacBook Pro. Then the other model we tried scored 4%. Here's what actually happened, with a working cost loop attached.

Comments
16 min read
We ran Qwen3.6-27B on $800 of consumer GPUs, day one: llama.cpp vs vLLM

2
Comments
15 min read
LLMKube Now Deploys Any Inference Engine, Not Just llama.cpp

Comments
3 min read
I tested speculative decoding on my home GPU cluster. Here's why it didn't help.

Comments
5 min read
Google Released Gemma 4 Yesterday. I Had It Fixing Real Bugs by Lunch.

1
Comments
5 min read
I Tested TurboQuant KV Cache Compression on Consumer GPUs. Here's What Actually Happened.

2
Comments
6 min read
The $0 Problem: Why Every Tool Says Your On-Prem Inference is Free

Comments
4 min read
llama.cpp on Kubernetes: The Guide I Wish Existed

3
Comments
9 min read