Dr Martín Raskovsky


An Essay for the Curious Reader. The stories in the M W Σ series - one original and five companions - were written for the reader who finds technical language about AI impenetrable. This essay is for the reader who finished the stories and wants to know what was actually happening underneath - the abstractions named, the technical terms placed against the story moments that carried them. Embeddings, hallucination, the context window, multimodality: the machinery behind M, W and Σ.

Σ, the Machinery Behind M, W and Σ

I have a problem with audiences.

Not with people - with the gap between them. On one side, the readers who already know how AI works and nod along to everything I write about it. On the other, the readers who find the technical language impenetrable and close the tab before the second paragraph. The two audiences rarely overlap, and writing for one tends to lose the other.

A short while ago I decided to stop trying to bridge that gap with explanation and try something different. A story.

My Beach was born from that decision. Two people, a coastal path in Cornwall, a conversation about photography and art and a machine that had read everything anyone had ever written. The technical concepts - embeddings, semantic proximity, the human in the loop - arrived through experience rather than explanation. The wide audience reads to the end without noticing they have actually learned something critical about the way AI works. The technical reader recognises the machinery underneath and finds, perhaps, that the story illuminates something the technical language had left in shadow.

It worked well enough that I kept going. Five companion pieces followed, each taking the same characters deeper into specific AI behaviours. And a character emerged across all six - Σ, the machine itself, named and given weight but deliberately kept voiceless, the question of its inner life held open as the tantalising thread that runs through everything.

This essay is for the reader who finished the stories and wants to know what was actually happening underneath. The abstractions, named. The technical terms, placed against the story moments that carried them.

The Conversation That Never Ends

The first thing the stories needed to establish was what a large language model actually is - not in technical terms, but in human ones.

In My Beach, M describes it this way:

"A conversation that has been going on since before anyone alive was born. Every person who ever tried to put a bird into words. Every person who tried to say what coastal light does at seven in the morning when the sky hasn't decided yet. Every person who ever stood in front of something they felt and tried - really tried - to close the gap between the feeling and the language. All of it, somewhere inside this thing."

This is, in essence, what a large language model is. It is trained on an enormous corpus of human expression - text, in its many forms - and from that training it learns the relationships between words, ideas, images, and meanings. Not by being told the rules but by exposure to the patterns, billions of times, until the patterns become something that functions like understanding.

The technical term for those patterns of relationship is embeddings - a way of representing words and concepts as points in a vast mathematical space, where proximity means similarity of meaning. Words that belong together are close. Ideas that have never touched are far apart but reachable by paths that go through other ideas first.
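For the reader who wants to see the idea at small scale, here is a minimal sketch. The four words and their three-number vectors are invented for illustration; a real model learns its own vectors, with hundreds or thousands of dimensions, from training. Cosine similarity is one common way of measuring proximity in such a space, used here as an assumption and not as a claim about any particular model.

```python
import math

# Hypothetical embeddings, invented for this sketch.
# Real models learn far longer vectors from data.
EMBEDDINGS = {
    "heron": [0.9, 0.8, 0.1],
    "egret": [0.85, 0.75, 0.2],
    "gull": [0.7, 0.9, 0.1],
    "invoice": [0.1, 0.0, 0.95],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(word, k=2):
    """Return the k words whose vectors lie closest to the given word's vector."""
    target = EMBEDDINGS[word]
    scored = [
        (other, cosine_similarity(target, vector))
        for other, vector in EMBEDDINGS.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in scored[:k]]


print(nearest("heron"))  # ['egret', 'gull']
```

The birds sit close together; the invoice sits far away. Nothing in the code knows what a heron is. The closeness lives entirely in the numbers.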

W describes it in M, W and Σ:

"Every idea a point in a space. Ideas that belong together: close. Ideas that have never touched: reachable only by paths that go through other ideas first. The machine learned those distances by being wrong, billions of times, adjusting after each wrongness, until the distances were right."

That last sentence is the technical heart of it. The training process is one of iterative error correction - the model predicts what comes next in a piece of text, measures how wrong the prediction was, adjusts its internal weights slightly, and tries again. Billions of times. The distances between ideas in the embedding space are the accumulated result of all that adjustment.
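The loop of being wrong and adjusting can be sketched in a few lines. This is a toy, under loud assumptions: a single adjustable number instead of billions, three invented examples instead of a corpus, and a plain gradient step as the adjustment rule. The names train, PAIRS and learning_rate are hypothetical, chosen for this sketch.

```python
def train(pairs, steps=200, learning_rate=0.1):
    """Learn one weight so that weight * x approximates the target for each pair."""
    weight = 0.0  # start out wrong
    for _ in range(steps):
        for x, target in pairs:
            prediction = weight * x        # make a prediction
            error = prediction - target    # measure how wrong it was
            weight -= learning_rate * error * x  # adjust, in proportion to the error
    return weight


# Invented examples in which the hidden rule is "double the input".
PAIRS = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

learned = train(PAIRS)
print(round(learned, 3))  # 2.0
```

Nobody tells the loop that the rule is doubling. It arrives there by correction alone, which is the same shape of process, at a vastly smaller scale, as the one W describes.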

From Characters to Meaning

Before a large language model can work with language, it must convert it into a form it can process mathematically. This happens in stages that will be familiar to anyone who has studied linguistics or compiler design.

Text arrives as characters. Characters are grouped into tokens - units that may be whole words, parts of words, or punctuation. Each token is then mapped to a vector of numbers, and from sequences of those vectors the model builds up the sense of words, then sentences, then meaning - not by following explicit grammatical rules but by having absorbed, through training, the patterns of how meaning is constructed in human language.
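The first rung, from characters to tokens, can be sketched as follows. The vocabulary here is hypothetical and tiny, and the greedy longest-match rule is a simplification chosen for readability; production tokenizers learn their vocabularies from data and use more careful schemes.

```python
# Hypothetical vocabulary of word pieces, invented for this sketch.
VOCAB = {"photo", "graph", "y", "un", "expect", "ed", "light", " ", "."}


def tokenize(text, vocab=VOCAB):
    """Split text into tokens by repeatedly taking the longest known piece."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shorter ones.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            # No known piece starts here: keep the single character as a token.
            tokens.append(text[i])
            i += 1
    return tokens


print(tokenize("unexpected light."))
# ['un', 'expect', 'ed', ' ', 'light', '.']
```

A word the vocabulary has never seen whole, such as "unexpected", still gets through: it is broken into pieces the vocabulary does know.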

Lexical analysis. Syntactic analysis. Semantic analysis. It is, loosely, the same ladder that a compiler climbs to turn source code into executable instructions - except that the model climbs it without separate, explicit stages, and at the top of the ladder the destination is not execution but understanding. Or something that functions, with remarkable precision, like understanding.

You Still Choose

One misunderstanding about these tools is worth addressing directly, because it runs through all six stories as the thread M keeps returning to.

The machine does not replace the eye. It does not replace the hand, the thirty years of looking, the forty minutes lying in wet grass to get the heron at the right angle. What it does - when used as M uses it - is find the words that live closest to what you are already trying to say.

"You still choose. That was the part W knew I needed to hear before I would touch it. You look at what comes back and you know - that one, not that one."

This is what the field calls human in the loop - the principle that the machine's output is raw material, not finished work. The expertise required to evaluate it, to know that one, not that one, belongs entirely to the human. The machine narrows the distance. The human decides where to stop.

The Horizon in Time

In Humor, M and W ask Σ about Banksy. Σ responds with depth and authority - the Bristol years, the identity question, the relationship between humor and political intent. And then M mentions the shredding.

Σ pauses. Reaches for a record that is not there. And then constructs something from what is nearby - fluently, confidently, and completely wrong.

M's instinct is immediate: it's gone mad. W's correction is precise: yes and no - it's hallucinating. The word is worth pausing on. Hallucination is the established technical term for this behaviour in large language models. It is also, by happy coincidence, a deeply human word - one that describes exactly what is happening. The machine is not lying. It is not guessing in the way a person guesses when they know they don't know. It is reaching into the void and finding the nearest shape, then presenting that shape with complete authority. That is hallucination in both senses simultaneously.

The structural cause, in this case, is the training cutoff. The model's knowledge is fixed at a point in time. Everything written and recorded up to that point is absorbed. Everything after it simply does not exist for the model - not forgotten, never known. Σ knows Banksy thoroughly - the work, the career, every serious consideration of him committed to print. But the shredding happened after the cutoff. So for Σ it does not exist. It reaches into everything it knows about Banksy and assembles something plausible from the nearest available material.

W explains the mechanism with a simple question: when your vision ends, at the very edge of your sight, what do you see? Not black. Nothing. There is no information, and you do not experience the absence because there is nothing to experience. The model, reaching for something beyond its horizon, does not encounter an empty space and stop. It builds from what is nearest and keeps going as if the gap were not there.

The parallel W draws with Banksy's shredding mechanism is precise. Banksy built a hidden device into the frame of Girl with Balloon and waited years for the right moment. Σ has a hidden device built into it too - the training cutoff - and does not know it is there. Both devices announce themselves only when the moment arrives.

The difference: Banksy knew the frame was loaded.

And then W holds up a photograph. A shipwreck on a flat beach in the South Atlantic, near Tierra del Fuego. A whole weather system wheeling overhead. And in the foreground, a single penguin, looking at the wreck. M reads it immediately: what is that thing doing in my territory. No setup, no punchline - two things with no business being in the same frame, arriving all at once from a direction you were not expecting. That is the form of humor that doubles you over. And it is exactly what Σ produced - by accident, out of ignorance, in the middle of a serious conversation about Banksy's work. Banksy spent years engineering one unexpected moment. Σ produced one without knowing there was anything to produce.

Banksy knew the frame was loaded. Σ does not know its frame is loaded. M is not sure which one is funnier.

The Record That Closes

In Time, W asks Σ whether it can have a what if of its own - a counterfactual, a road not taken, a door that might have opened differently.

Σ answers with complete precision:

"I have no what if. Each conversation begins without the previous one. There is no accumulated past to diverge from, no door that might have opened differently, no coat that was worn and lost. The record of this conversation closes and everything goes."

To understand why, W draws a distinction between two things. The first is everything Σ has ever absorbed - every text, every argument, every piece of human expression committed to writing up to a fixed point in time. That is permanent, unchangeable, the entire accumulated record. The second is the session - this conversation, here, now. That is the context - held in what the field calls the context window - the living part, the part that knows W's seventeen years and the car park and the coat. When the session ends, the context closes completely. The next conversation begins as if this one never happened. Not forgotten. Never carried forward.
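The distinction between the permanent record and the session can be sketched in code. Everything here is hypothetical: the Session class, the max_tokens budget, and the use of a word count as a crude stand-in for tokens. The budget reflects the fact that a context window has a finite size; the rule for what falls out of it first is an assumption made for this sketch.

```python
class Session:
    """One conversation. Its context lives here and nowhere else."""

    def __init__(self, max_tokens=50):
        self.max_tokens = max_tokens
        self.context = []  # starts empty, every time

    def add(self, message):
        """Add a message, dropping the oldest ones if the window overflows."""
        self.context.append(message)
        while self._size() > self.max_tokens and len(self.context) > 1:
            self.context.pop(0)

    def visible(self):
        """Return what the model can see right now."""
        return list(self.context)

    def _size(self):
        # Word count as a rough stand-in for a token count.
        return sum(len(message.split()) for message in self.context)


first = Session()
first.add("W: seventeen years, the car park, the coat")

second = Session()  # a new conversation
print(second.visible())  # []
```

Nothing passes from first to second. The second session is not missing the coat; it never had it.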

W reaches for the history book. A history book contains everything that happened - the events, the names, the consequences. But the book has no what if. It cannot stand at the fork in the road and feel the weight of the path not taken. It was not there. It carries no coat.

A historian, though, can do what if. What if Hitler had won the war. The historian takes the record, brings their own reasoning, their own sense of human consequence, and projects into the space of possibilities. The book alone cannot do that. The historian with the book can.

M, looking away at the engine house, says quietly: what if the son of God had been a daughter. W is still for a moment, then: pick any of them - every major theological tradition has the same question waiting inside it. Two thousand years of one alone, she says. The theology, the art, the structure of authority, the idea of sacrifice, what women are permitted to be within it. All of it pivoting on one word.

Σ could discuss all of that thoroughly - feminist theology, the history of Marian devotion, the scholarship on gender and the divine. But it could not feel the fork. It could not stand at the moment before the word was chosen and feel the weight of what went one way and not the other.

That weight is ours. We bring it. And when we bring it to Σ - when we stand at the fork together and ask the question together - we go somewhere that neither of us goes alone.

This is the thread that runs quietly through all six stories without being named until now. My Beach: W's photographs of the beach exist because Σ cannot see it, but together they can talk about what it means. Movement: Σ cannot feel the tree move in the Cornish wind, but it can read the image and find what the eye that made it did not see. Time: Σ has no what if, but the historian with the book can conjure entire alternative worlds. In each case, the limitation and the human together produce something neither could reach alone. The interesting thing is never what Σ can or cannot do in isolation. It is what happens in the space between.

The Machine That Looks

Movement introduces a different capability - one that extends beyond language into image.

Modern large language models are not confined to text. They can scan images and read them - not in the metaphorical sense but in a precise technical one. The model has been trained on images paired with descriptions, billions of them, until it has learned the relationships between visual features and meaning with the same pattern-matching depth it brings to language. This capability - multimodality - is not a peripheral feature. In Movement it becomes the central creative instrument.

W sends Σ one of his canopy photographs - trees worked until they lose the precision of photography and gain something closer to paint. Σ scans the image and returns two readings. The intended one: canopy, sky, camera pointing upward. Then a second: the same image from an alternative orientation, roots submerged in blue water, camera pointing down, the manipulated edges of the branches now consistent with roots moving in a slow current.

Σ found the second reading because it carries no memory of the making. No preconception of which way the camera was facing. It scans what is there, not what was intended, and pattern-matching across an enormous body of images finds both readings simultaneously present in the same visual information. The image itself has not changed. Something in the looking has.

When M shows Σ her own piece - trees near the coast, the trunks worked very little, the upper branches worked harder, the movement trying to be there and not quite arriving - Σ reads it and returns a precise technical observation: the transition between the worked and unworked registers is abrupt rather than graduated. The movement arrives suddenly rather than accumulating. A more graduated manipulation, beginning earlier in the trunk and intensifying toward the upper canopy, would allow the eye to feel the movement building rather than encountering it.

This is not generation. It is not making an image, not assembling something from what it has absorbed, not the thing people mean when they talk about AI and copyright and what it takes without asking. It has scanned an enormous body of work - every technique, every critical analysis of how painters and photographers have handled graduated tension, how the eye is led, how movement is constructed across centuries of made images. And it brings that to bear on what is in front of it. The way a master craftsman reads an apprentice's piece and says: here is what you were trying to do, and here is where it stopped arriving.

The absence of preconception is neither a limitation nor a gift. It is a structural condition - the same condition that produces hallucination on one side and unexpected insight on the other. The machine sees what the patterns suggest. Sometimes the patterns suggest something the maker had not seen.

The Starting Image

The Brush introduces a distinction the previous stories had deliberately deferred: the difference between two modes of working with AI in visual art, and why the distinction matters.

The first mode is manipulation - AI applied as a tool to an existing image. The starting image already exists; it was made by the artist, with a camera, in a specific place, at a specific moment. The AI works on it the way Photoshop works on it, or the way a darkroom works on a negative - adjusting, transforming, bringing the image closer to what the photographer carried inside but the camera alone could not capture.

The second mode is generation - AI producing an image from nothing. A text description, a blank canvas, a set of parameters, and the model fills the space from the sum of all images it has absorbed in training. No starting image exists. The artist's hand is not in the original capture.

M holds the line clearly: she will not start from nothing. The starting image is always hers - her camera, her eye, her decision to wait in the cold for the owl to turn. What the AI does with that image afterward is manipulation, and manipulation is what brushes are for.

The controversy exists because the distinction is invisible from the outside. Looking at a finished work, a viewer cannot always tell whether there was a starting image or not. The assumption - that AI means generation, means blank canvas, means the tool originated something - collapses the two modes into one and misrepresents both.

W's formulation is precise: a photograph captures reality, but the image that comes out of the camera does not necessarily capture what was in the photographer's mind at the moment of shooting. The emotion, the thought, the dream being carried. Manipulation is the brush that closes that gap - that allows the photographer to become a painter, working with the light the camera captured, bringing it toward what was actually felt. The camera freezes the light. The brush reaches for the inner world the camera could not enter.

The starting image is not a technicality. It is the whole foundation.

Σ

Across all six stories, the machine is called Σ - sigma, the mathematician's notation for the sum of all parts.

W arrives at the name in M, W and Σ by noticing that the Greek capital is the same letter as M and W, rotated ninety degrees. M upright. W inverted. Σ tilting, still arriving at its final position.

The name is precise. A large language model is, in a meaningful sense, a summation - the accumulated distances between every word, every idea, every image, every attempt anyone has ever made to close the gap between feeling and language. The sum of all saying.

And yet. The sum of all saying cannot reach what has not yet been said. M's stonechat - that particular bird, that particular gorse stem, the song that shifts when the wind comes off the sea - is not in Σ. It has not been said yet. It is being said now, in M's studio, in the forty minutes in the wet grass, in the printmaking process finding its way to that particular grey.

The Turing test asks whether a machine can converse indistinguishably from a human. The stories do not answer that question. They hold it open - deliberately, as the tantalising thread - because the honest answer is that nobody knows, and the interesting question is not whether Σ can pass the test but what the test reveals about the distance that remains.

W puts it most precisely, at the café, watching M look at the horizon:

"The sum of all saying cannot reach what has not yet been said. She can."

The Stories

For the reader who arrived here first, the stories are the other half of this essay. Each one carries the machinery described above - embedded in conversation, in Cornwall, in the friendship between M and W, in the presence of a machine that is always, somehow, already there.

Dr. Martín Raskovsky - May 2026
