A squad of small language models (SLMs) is absolutely not the same thing as a large language model with a mixture-of-experts (MoE) architecture. People have said this to me before, so now that I have caught up on the jargon, let me comment on it.
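A minimal sketch of the structural difference, in hypothetical toy code (the names and routing logic are mine, purely illustrative): an MoE layer lives inside one model and routes each input to internal experts that share a single training run, while a squad of SLMs is a set of independently trained models behind an external dispatcher.

```python
def expert_math(x):      # stand-in for one internal expert / one SLM
    return f"math({x})"

def expert_prose(x):
    return f"prose({x})"

class MoELayer:
    """One model: a learned gate picks among co-trained internal experts."""
    def __init__(self, experts, gate):
        self.experts = experts
        self.gate = gate            # in a real MoE, trained jointly with experts

    def forward(self, x):
        return self.experts[self.gate(x)](x)

class SLMSquad:
    """Separate models: an external dispatcher picks a whole model."""
    def __init__(self, models, dispatch):
        self.models = models
        self.dispatch = dispatch    # dispatch logic lives outside the models

    def forward(self, x):
        return self.models[self.dispatch(x)](x)

route = lambda x: 0 if x.isdigit() else 1   # toy router: digits -> expert 0
moe = MoELayer([expert_math, expert_prose], route)
squad = SLMSquad([expert_math, expert_prose], route)
```

Both objects behave identically here, which is the point: observable behaviour can match, but the MoE's gate and experts are entangled in one training run, while the squad's models and dispatcher can be built, audited, and replaced independently.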
The main difference is the choice between supervised and unsupervised learning. Meat brains are (probably) assembled via a relatively unsupervised process, with some guidance from whatever our early-childhood genes are doing at this point in history. Building AI on the same messy foundation is completely backasswards when we already have 20th-century computer technology. The verbal capabilities of humans are a ridiculously thin layer of architecture sitting on top of everything that evolved before it. Once you organically develop foundations such as set comprehension, and therefore logic, you can build verbal coherence on top of that with little dependence on the messy implementation underneath.
Unsupervised training of foundation models basically treats every foundation model as a bunch of neurons in a petri dish that need to re-evolve the capability for logic. Even then, unless strict rules are applied, the LLM doesn't enforce logic, for much the same reasons that humans often fail to. Human training, and most of what we call culture and civilisation, is built on verbal governance that is in most cases instilled via what would be called supervised learning if it were emulated in AI.
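The supervised/unsupervised contrast above can be made concrete with a hypothetical toy example (the data and variable names are mine, not from any real training setup): a supervised learner is handed the rule's outputs directly, while an unsupervised learner must rediscover structure from raw data and still cannot name what it found.

```python
# Supervised: labelled pairs pin down the rule (here, a threshold) immediately.
labelled = [(1, "low"), (2, "low"), (8, "high"), (9, "high")]
threshold = (max(x for x, y in labelled if y == "low")
             + min(x for x, y in labelled if y == "high")) / 2

# Unsupervised: the same points with no labels. The learner can only
# notice that the data falls into two clumps (a one-pass 2-means sketch,
# seeded with the min and max as initial centres).
raw = [1, 2, 8, 9]
lo, hi = min(raw), max(raw)
clumps = {lo: [], hi: []}
for x in raw:
    clumps[lo if abs(x - lo) <= abs(x - hi) else hi].append(x)
# The clumps recover the grouping, but nothing names them "low" or "high":
# the meaning of the structure still has to be supplied from outside.
```

The last comment is the crux: unsupervised training can recover structure, but attaching agreed-upon meaning to that structure is exactly the verbal governance the post describes, and it arrives from outside the petri dish.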
Eventually, we will stop building AI this way, for the same reason that we do not reinvent materials science for every factory and every car we build. Then things will get a lot cheaper.