Gemini and Context Splicing
Imagine you’re out for a walk. You’re looking at the landscape and taking photos with your camera or phone. At the same time your GPS watch or phone app is recording your location every second. What remains of the walk is a GPS track with a resolution of one set of coordinates per second, plus photos from the moments when you stopped to take them. Most of the walk is “lost”, because recording video of the entire walk would consume too much storage.
To use another analogy: for many years podcasters would record an hour and a half of show but edit it down, keeping just the best thirty minutes. The blurb describing the episode might then be three to five sentences, some chapter markers, and possibly a list of hyperlinks.
Trying to Remember a Conversation
AI, too, has a limited attention span. In theory Gemini can accept up to a million tokens, but in practice the usable limit is quite a bit lower. A token is a small chunk of text, roughly a word or part of a word. The chattier you are, the more tokens AI models like Gemini need to keep track of.
If you’re verbose, and Gemini is, that’s a lot of conversation to remember. We forget, and so does the AI model. It starts to misremember things.
Context Switching
LLMs have a huge set of data from which to work, so I expect them to see a screengrab from Immich or Photoprism and remember which is which. If I tell it “I’m using the laptop for this, and the Pi for that”, I expect it to remember and to switch between the two automatically without me repeating myself. The issue is that it doesn’t. You need to keep saying “I’m using the Pi for this” and “I’m using the laptop for that”. It doesn’t retain the context, especially during longer tasks.
The Gemini Trap
If you’re sorting through tens of thousands of files and feeding in the output from exiftool or similar tools, you can easily end up with documents of seven million tokens, while the context window is one million. Gemini doesn’t say “Wait, I can’t handle such a large data set”. It hallucinates instead.
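It’s worth estimating the size of a dump before pasting it into a chat. A common rule of thumb, and only a rough one since it isn’t Gemini’s real tokenizer, is about four characters of English text per token. A minimal sketch, with a generated file standing in for real exiftool output:

```shell
# Generate a stand-in for a large metadata dump (1000 lines).
printf 'IMG_%04d.jpg  2021:06:01 12:00:00\n' $(seq 1 1000) > metadata.txt

# Rough token estimate: character count divided by four.
chars=$(wc -c < metadata.txt)
echo "approx $((chars / 4)) tokens"
```

If the estimate comes out anywhere near the context window, summarise first rather than pasting.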
Helping Gemini
When I noticed this I started a new chat. I also asked Gemini how I could provide it with enough data to understand the task without overwhelming it. Its answer, paraphrased, was “give me the head and the tail, and I’ll draw conclusions from that”. From that moment on I tried to give Gemini enough to understand the output, without burying it in excess information.
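The head-and-tail approach maps directly onto the standard Unix tools. A minimal sketch, using a generated file as a stand-in for real tool output (the file names are mine, for illustration):

```shell
# Stand-in for a huge tool output: 100,000 numbered lines.
seq 1 100000 > biglist.txt

# Build a sample: first 20 lines, a marker, last 20 lines.
head -n 20 biglist.txt > sample.txt
echo '[... 99,960 lines omitted ...]' >> sample.txt
tail -n 20 biglist.txt >> sample.txt

wc -l sample.txt   # 41 lines instead of 100,000
```

Forty-one lines are usually enough for the model to infer the format and tell you what to run next.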
Grep and Other Commands
If you have a large data set, instead of feeding it to Gemini as is, you can ask it to help you summarise the data with grep and other commands. Instead of pasting a raw list of jdupes duplicates, tell it “we have 1,200 lines from Source A and 1,400 from Source B” and wait for Gemini to provide its “thoughts”.
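Producing those counts is a one-liner per source. A sketch with an invented duplicate list (the `/mnt/source-a` and `/mnt/source-b` prefixes are made up; substitute your own paths):

```shell
# Fake duplicate list in the jdupes style: groups separated by blank lines.
cat > dupes.txt <<'EOF'
/mnt/source-a/IMG_0001.jpg
/mnt/source-b/IMG_0001.jpg

/mnt/source-a/IMG_0002.jpg
/mnt/source-a/IMG_0002_copy.jpg
EOF

# Count lines per source instead of pasting the whole file.
grep -c '^/mnt/source-a/' dupes.txt
grep -c '^/mnt/source-b/' dupes.txt
```

Two numbers are something Gemini can reason about; seven million tokens of paths are not.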
On the Others
Google AI Studio, Le Chat, MyAI and Euria will all say “I’m full” and encourage you either to come back later or to open a new window. Gemini is like the restaurant patron in the Monty Python sketch who has one more bite and explodes.
And Finally
One of the reasons I like Gemini over other models is that it’s patient. It will accept a lot of information. The drawback is that it will start hallucinating before saying “I’m full, I need a break”. If you’re not cautious it can provide you with the wrong command, and that can have serious consequences if you don’t dry-run and check every command before taking the safety wheels off.
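One cheap way to dry-run a suggested bulk operation is to prefix the destructive part with `echo` and read what would have happened. A sketch with invented file names:

```shell
# Set up some throwaway files.
mkdir -p demo && touch demo/a.jpg demo/b.jpg

# Dry run: print each command instead of executing it.
for f in demo/*.jpg; do
    echo rm "$f"     # remove the echo only once the output looks right
done

ls demo              # the files are untouched
```

Only after the printed commands look correct do you drop the `echo` and run it for real.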
In summary, watch for the signs of context splicing, and when you spot them ask for a summary, move to a new chat, and continue from there. I don’t know how other AI solutions behave in this respect.