• 0 Posts
  • 10 Comments
Joined 1 year ago
cake
Cake day: June 13th, 2023

help-circle








  • Just as a fun example of a really basic language model, here’s my phones predictive model answering your question. I put the starting tokens in brackets for illustration only, everything following is generated by choosing one of the three suggestions it gives me. I mostly chose the first but occasionally the second or third option because it has a tendency to get stuck in loops.

    [We know LLMs are not intelligent because] they are not too expensive for them to be able to make it work for you and the other things that are you going to do.

    Yeah it’s nonsense, but the main significant difference between this and an LLM is the size of the network and the quantity of data used to train it.


  • I’m possibly just vomiting something you already know here, but an important distinction is that the problem isn’t that ChatGPT is full of “incorrect data”, it’s that it is has no concept of correct or incorrect, and it doesn’t store any data in the sense we think of it.

    It is a (large) language model (LLM) which does one thing, albeit incredibly well: output a token (a word or part of a word) based on the statistical probability of that token following the previous tokens, based on a statistical model generated from all the data used to train it.

    It doesn’t know what a book is, nor does it have any memory of any titles of any books. It only has connections between token, scored by their statistical probability to follow each other.

    It’s like a really advanced version of predictive texting, or the predictive algorithm that Google uses when you start typing a search.

    If you ask it a question, it only starts to string together tokens which form an answer because the network has been trained on vast quantities of text which have a question-answer format. It doesn’t know it’s answering you, or even what a question is; it just outputs the most statistically probable token, appends it to your input, and then runs that loop.

    Sometimes it outputs something accurate - perhaps because it encountered a particular book title enough times in the training data, that it is statistically probable that it will output it again; or perhaps because the title itself is statistically probable (e.g. the title “Voyage to the Stars Beyond” will be much more statistically likely than “Significantly Nine Crescent Unduly”, even if neither title actually existed in the training data.

    Lots of the newer AI services put different LLMs together, along with other tools to control output and format input in a way which makes the response more predictable, or even which run a network request to look up additional data (more tokens) but the most significant part of the underlying tech is still fundamentally unable to conceptualise the notion of accuracy, let alone ensure they uphold it.

    Maybe there will be another breakthrough in another area of AI research of which LLMs will form an important part, but the hype train has been running hard to categorise LLMs as AI, which is disingenuous. Theyre incredibly impressive non-intelligent automatic text generators.