Ignore the specific example of counting characters, I was just quickly coming up with a situation where the instruction is at the end of the input. Here is a better example:
Input the full text of a novel, then ask for a minor detail (eg color of a car that is briefly mentioned in the middle of the book). Again a human can do this by flipping back to the relevant section but LLMs have no mechanism for this when using a sliding window attention scheme.
If the full input can fit in the context window then any LLM today would be able to extract the color of the car.
Input the full text of a novel, then ask for a minor detail (eg color of a car that is briefly mentioned in the middle of the book). Again a human can do this by flipping back to the relevant section but LLMs have no mechanism for this when using a sliding window attention scheme.
If the full input can fit in the context window then any LLM today would be able to extract the color of the car.