# ai-reading-club
p
Byte Latent Transformer: Patches Scale Better Than Tokens: https://arxiv.org/abs/2412.09871
a
I'm excited about this one, maybe the first tokenizer-free Transformer architecture?