Simon Willison’s Weblog

Subscribe

Phi-4 Bug Fixes by Unsloth (via) This explains why I was seeing weird <|im_end|> suffexes during my experiments with Phi-4 the other day: it turns out the Phi-4 tokenizer definition as released by Microsoft had a bug in it, and there was a small bug in the chat template as well.

Daniel and Michael Han figured this out and have now published GGUF files with their fixes on Hugging Face.

Monthly briefing

Sponsor me for $10/month and get a curated email digest of the month's most important LLM developments.

Pay me to send you less!

Sponsor & subscribe