The Dangers of Knowing Everything

By now I’m sure you’ve heard that the Internet is obsessed with ChatGPT. I’ve been watching from the sidelines as people find more and more uses for our current favorite large language model (LLM) toy. Why a toy and not a full-blown solution to all our ills? Because ChatGPT has one glaring flaw that I can see right now, one that betrays its immaturity: ChatGPT knows everything. Or at least it thinks it does.

Unknown Unknowns

If I asked you the answer to a basic trivia question you could probably recall it quickly. Like “who was the first president of the United States?” These are answers we have memorized over the years to things we are expected to know. History, math, and even written communication have questions and answers like this. Even in an age of access to search engines we’re still expected to know basic things and have near-instant recall.

What if I asked you a trivia question you didn’t know the answer to? Like “what is the name of the metal cap at the end of a pencil?” You’d likely go look it up on a search engine or in some form of encyclopedia. You don’t know the answer so you’re going to find it out. That’s still a form of recall. Once you learn that it’s called a ferrule you’ll file it away in the same place as George Washington, 2+2, and the aglet as “things I just know”.

Now, what if I asked you a question that required you to think a little more than just recalling info? Such as “Who would have been the first president if George Washington had refused the office?” Now we’re getting into murkier territory. Instead of being able to instantly recall information you’re going to have to analyze what you know about the situation. Most people who aren’t history buffs might recall who Washington’s vice president was and answer with that. History buffs, applying more specialized knowledge of the period, might bring in additional facts and infer a different answer, such as Jefferson or even Samuel Adams. They’re adding more information to the puzzle to come up with a better answer.

Now, for completeness’ sake, what if I asked you “Who would have become the Grand Vizier of the Galactic Republic if Washington hadn’t been assassinated by the separatists?” You’d probably look at me like I was crazy and say you couldn’t answer a question like that because I made up most of that information or I’m trying to confuse you. You may not know exactly what I’m talking about, but you know, based on your knowledge of elementary school history, that there is no Galactic Republic and George Washington was definitely not assassinated. Hold on to this because we’ll come back to it later.

Spinning AI Yarns

How does all this apply to an LLM? The first thing to realize is that LLMs are not replacements for search engines. I’ve heard of many people asking ChatGPT basic trivia and recall-type questions. That’s not what LLMs are best at. We have a multitude of ways to learn trivia, and none of them needs the power of a cloud-scale computing cluster interpreting inputs. Even asking that trivia question of a smart assistant from Apple or Amazon is a better way to learn.

So what does an LLM excel at doing? Nvidia will tell you that it is “a deep learning algorithm that can recognize, summarize, translate, predict and generate text and other content based on knowledge gained from massive datasets”. In essence it can take a huge amount of input, recognize certain aspects of it, and produce content based on the requirements. That’s why ChatGPT can “write” things in the style of something else. It knows what that style is supposed to look and sound like and can produce an output based on that. It analyzes its training data and comes up with results using predictive analysis to create grammatically correct output. Think of it like Advanced Predictive Autocorrect.
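To make “Advanced Predictive Autocorrect” concrete, here’s a minimal sketch in Python: a toy bigram model that always emits the word most likely to follow the previous one. The training text is made up for illustration, and real LLMs use neural networks trained on massive datasets rather than a lookup table, but the core loop of predict-the-next-token-and-repeat is the same idea.

```python
# A toy "predictive autocorrect": learn which word most often follows each
# word in the training text, then generate by repeating that prediction.
from collections import Counter, defaultdict

training_text = (
    "the router drops the packet the router forwards the packet "
    "the switch forwards the frame the router drops the frame"
)

# Count how often each word follows each other word.
follows = defaultdict(Counter)
words = training_text.split()
for current, nxt in zip(words, words[1:]):
    follows[current][nxt] += 1

def generate(seed: str, length: int = 8) -> str:
    """Greedily emit the most likely next word, starting from seed."""
    out = [seed]
    for _ in range(length):
        candidates = follows.get(out[-1])
        if not candidates:
            break  # nothing ever followed this word in training
        out.append(candidates.most_common(1)[0][0])
    return " ".join(out)

print(generate("the"))  # "the router drops the router drops the router drops"
```

Notice that the model happily keeps producing fluent-sounding output, and at no point does it check whether that output is true.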

If you think I’m oversimplifying what LLMs like ChatGPT can bring to the table then I challenge you to ask one a question that doesn’t have an answer. If you really want to see it work some magic, ask it something oddly specific about something that doesn’t exist, especially if that something involves steps or can be broken down into parts. I’d bet you get a confident answer at least as often as you get an error message.

To me, the problem with ChatGPT is that the model is designed to produce an answer unless it has specifically been programmed not to do so. There are a variety of answers that the developers have overridden in the algorithm, usually on racially or politically sensitive topics. Otherwise ChatGPT is happy to spit out lots of stuff that looks and sounds correct. Case in point? This gem of a post from Joy Larkin of ZeroTier:

https://mastodon.social/@joy/109859024438664366

Short version: ChatGPT gave a user instructions for a product that didn’t exist, and the customer was very frustrated when they couldn’t find the software to download on the ZeroTier site. The LLM just made up a convincing answer to the question, inventing software that doesn’t exist, purely to satisfy the prompt.

Does that sound like a creative writing exercise to you? “Imagine what a bird would look like with elephant feet.” Or “picture a world where people only communicated with dance.” You’ve probably gone through these exercises before in school. You stretch your imagination to take specific inputs and produce outputs based on your knowledge. It’s like the applied-history question above: you take inputs and produce a logical outcome based on facts and reality.

ChatGPT is immature enough to not realize that some things shouldn’t be answered. If you use a search engine to find the steps to configure a feature on a product, the search algorithm will return a page that has the steps listed. Are they correct? Maybe. That depends on how popular the result is. But the results will include a real product. If you search for nonexistent functionality or a software package that doesn’t exist, your search won’t have many results.

ChatGPT doesn’t have a search algorithm to rely on. It’s based on language. It’s designed to approximate writing when given a prompt. That means, aside from things it’s been programmed not to answer, it’s going to give you an answer. Is it correct? You won’t know. You’d have to take the output and feed it to a search engine to determine whether what it describes even exists.

The danger here is that LLMs aren’t smart enough to realize they are creating fabricated answers. If someone asked me how to do something that I didn’t know, I would preface my answer with “I’m not quite sure but this is how I think you would do it…” I’ve established up front that I’m not familiar with the specific scenario and that I’m drawing on inferred knowledge to complete the task. Or I could just answer “I don’t know” and be done with it. ChatGPT doesn’t understand “I don’t know” and will respond with answers that look right according to the model but may not be correct.
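As a thought experiment, here’s a minimal sketch of what that missing hedge could look like in code. The product list is a hypothetical stand-in for whatever grounding a real system would use, such as retrieved documentation or a model confidence score, and the last query deliberately asks about software that doesn’t exist:

```python
# A toy guardrail: qualify or refuse when a question falls outside what the
# system actually "knows". KNOWN_PRODUCTS is a hypothetical stand-in for a
# real grounding source such as retrieved docs or a confidence signal.
KNOWN_PRODUCTS = {"zerotier one", "zerotier central"}

def answer(question: str) -> str:
    matches = [p for p in KNOWN_PRODUCTS if p in question.lower()]
    if not matches:
        # The response ChatGPT doesn't give today.
        return "I don't know. I can't find that in anything I've seen."
    return f"Here are the steps for {matches[0]}: ..."  # generated answer goes here

print(answer("How do I configure ZeroTier One?"))
print(answer("Where do I download the ZeroTier Dashboard Installer?"))  # made up
```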


Tom’s Take

What’s funny is that ChatGPT has managed to create an approximation of another human behavior. Anyone who has ever worked in sales knows one of the maxims: “never tell the customer no”. In a way, ChatGPT is like a salesperson. No matter what you ask it the answer is always yes, even if it has to make something up to answer the question. Sci-fi fans know that in fiction we’ve built guardrails for robots to save our society from being harmed by their actions. AI, no matter how advanced, needs protections from approximating bad behaviors. It’s time for ChatGPT and future LLMs to learn that they don’t know everything.

ChatGPT and Creating For Yourself

I’m sure you’ve been inundated by posts about ChatGPT over the past couple of weeks. If you managed to avoid it, the short version is that there is a new model from OpenAI that can write articles, create poetry, and basically answer your homework questions. Lots of people are testing it out for things as mundane as writing Amazon reviews or creating configurations for routers.

It’s not a universal hit though. Stack Overflow banned ChatGPT code answers because they’re almost always wrong. My own limited tests show that it can create a lot of words from a prompt that sound correct but feel hollow. Many others have accused the algorithm of scraping content from across the Internet and sampling it into answers that sound accurate but aren’t the best answer to the question.

Are we ready for AI to do our writing for us? Is the era of the novelist or technical writer finished? Should we just hang up our keyboards and call it a day?

Byte-Sized Content

When I was deciding what I wanted to do with my life after college, I took the GMAT to see if I could get into grad school for an MBA. I scored well on the exam but not quite at the magical level that earns a scholarship. However, one area where I surprised myself was the essay writing section. I bought a prep book that had advice for the major sections but spent a lot of time on the writing portion because it was relatively new at the time and many people were having issues with how to write the essay. The real secret is that the essay was graded by a computer, so you just had to follow a formula to succeed:

  1. Write an opening paragraph covering what you’re going to say with three points of discussion.
  2. Write a paragraph about point 1 and provide details to support it.
  3. Repeat the previous step for points 2 & 3.
  4. Write a summary paragraph restating what you said in the opening.

That’s it. That’s the formula to win the GMAT writing portion. The computer isn’t looking for insightful poetry or groundbreaking sci-fi world building. It’s been trained to look for structure. Main idea statements, supporting evidence, and conclusions all tick boxes that provide points to pass the section.
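In fact, the formula is so mechanical that it fits in a few lines of code. Here’s a toy sketch, with the topic and supporting points entirely made up, of a generator that ticks every box a structure-scoring grader looks for:

```python
# The GMAT essay formula, reduced to code: an opening with three points,
# one supporting paragraph per point, and a restated conclusion.
def gmat_essay(topic: str, points: list[str]) -> str:
    paragraphs = [
        f"In this essay I will argue that {topic}. "
        f"Three points support this: {', '.join(points)}."
    ]
    for i, point in enumerate(points, start=1):
        paragraphs.append(
            f"Point {i}: {point}. Supporting details and evidence go here."
        )
    paragraphs.append(
        f"In conclusion, {topic}, as demonstrated by {', '.join(points)}."
    )
    return "\n\n".join(paragraphs)

print(gmat_essay(
    "standardized essays reward structure over insight",
    ["a fixed opening", "one paragraph per point", "a restated conclusion"],
))
```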

If all that sounds terribly boring and formulaic, you’re absolutely right. Passing a test of competence isn’t the same as pushing the boundaries of the craft. A poet like e e cummings would have failed because his work abandons conventional structure and capitalization. Yet no one would deny that he is a master of his craft. Likewise, always following the standards only matters when you want to create things that already exist.

Free Thinking

Tech writing is structured but often involves new ideas that aren’t commonplace. How can you train an algorithm to write about Zero Trust Network Architecture or VR surgery if no examples of them exist yet? Can you successfully tell ChatGPT to write about space exploration through augmented reality if no one has built it yet? Even if you asked, would you be able to tell from the reply what was correct?

Part of the issue comes from content consumption. We read things and assume they are correct. Words were written so they must have been researched and confirmed before being committed to the screen. Therefore we tend to read content in a passive form. We’re not reacting to what we’re seeing but instead internalizing it for future use. That’s fine if we’re reading for fun or not thinking critically about a subject. But for technical skills it is imperative that we’re constantly challenging what’s written to ensure that it’s accurate and useful.

If we only consumed content passively we’d never explore new ideas or create new ways to achieve outcomes. Likewise, if the only content we have is created by an algorithm based on existing training and thought patterns, we will never evolve past the point we are at today. We can’t hope that a machine will have the insight to look beyond the limitations imposed upon it by the bounds of the program. I talked about this over six years ago, when I said that machine learning would always give you great answers but true AI would be able to find them where they don’t exist.

That’s my real issue with ChatGPT. It’s great at producing content that is well within the standard deviation of what is expected. It can find answers. It can’t create them. If you ask it how to enter lunar orbit it can tell you. But if you ask it how to create a spacecraft to get to a moon in a different star system it’s going to be stumped. Because that hasn’t been created yet. It can only tell you what it’s seen. We won’t evolve as a species unless we remember that our machines are only as good as the programming we impart to them.


Tom’s Take

ChatGPT and programs like Stable Diffusion are fun. They show how far our technology has come. But they also illustrate the importance that we as creative beings can still have. Programs can only create within their bounds. Real intelligence can break out of the mold and go places that machines can’t dream of. We’ve spent billions of dollars and millions of hours trying to train software to think like a human and we’ve barely scratched the surface. What we need to realize is that while we can write software that approximates how a human thinks, we can never replace the ability to create something from nothing.