  • Surprised nobody mentioned this: most of these models use tokenization. They split text into chunks of characters like “ea”, “the”, and “anti”, so they don’t pick which key to press next, they pick which bunch of keys to press. Those chunks are called tokens. I believe there are tokens a model essentially can’t output, or at least tokens that are extremely unlikely. I could imagine that “etc.” and “…” are tokens with relatively high probabilities, but perhaps “etc…” doesn’t break into a nice set of tokens (or the tokens it does break into all have extremely low probability under the model).
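
To make the idea concrete, here is a rough sketch of how a string gets chopped into tokens. Everything in it is an assumption for illustration: the vocabulary is made up, and greedy longest-match is a simplification (real models typically use learned schemes such as byte-pair encoding, and their actual vocabularies may treat “etc…” differently).

```python
# Toy tokenizer sketch. VOCAB is hypothetical, not taken from any real model.
ELLIPSIS = "\u2026"  # the single-character ellipsis
VOCAB = {"etc.", "etc", ELLIPSIS, "the", "anti", "ea", ".", " "}


def tokenize(text, vocab=VOCAB):
    """Greedy longest-match split; unknown characters fall back to single-char tokens."""
    tokens = []
    max_len = max(len(t) for t in vocab)
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until something matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in vocab or size == 1:
                tokens.append(piece)
                i += size
                break
    return tokens


print(tokenize("etc."))            # one token in this toy vocabulary
print(tokenize("etc" + ELLIPSIS))  # falls apart into two tokens
```

With this made-up vocabulary, “etc.” comes out as a single token while “etc…” has to be assembled from “etc” followed by “…”, so the model would need to pick that specific pair in sequence. Whether that pair is actually unlikely in a given model is a separate question this sketch can’t answer.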