# Are LLMs Creative? (Outline)
### Who would get great joy reading this article?
Someone who wants to understand if LLMs can be thought of as truly creative in the way that humans are.
### Thesis
The state-of-the-art models we have at present are not creative, and, more importantly, they represent no progress toward creativity.
### What must you show in order for this to be a strong argument?
1. That LLMs, as architected today, have not produced creative explanations.
2. That these approaches are like trying to reach the moon with a really long ladder: they are fundamentally doomed.
# Outline
### Introduction
1. Recap introduction of series (just a few paragraphs)
2. Highlight LLMs in particular
3. Ask the question: Are LLMs creative? Provide examples where people say they are:
1. Creativity is extending patterns from one domain into another. Supposedly that is automatable
2. [Tweet / X](https://twitter.com/sama/status/1682493142845763585)
4. Then provide the mic drop: *they are not*.
5. Now you need to take a step back and ensure you and the reader are standing in the same place. This requires digging into two questions: what are LLMs, and what is creativity?
### What are LLMs?
1. Explain how transformers work. This is needed to ensure we are all [standing in the same place](The%20Pyramid%20Principle.md) (see the minimal sketch after this list). Be sure to highlight:
1. Predict next token architecture
2. Information sharing / message passing
3. Residual Stream
4. They are induction-based and data-hungry, and they have nearly exhausted the available data. Possibly add that there are ideas about models generating their own data and then learning from it. ⚠️ ([Tweet / X](https://twitter.com/finbarrtimbers/status/1634250306426122241), [A step towards self-improving LLMs - by Finbarr Timbers](https://finbarrtimbers.substack.com/p/a-step-towards-self-improving-llms), [Tweet / X](https://twitter.com/sanjitjuneja/status/1672587834513817601))
5. Challenges ([[2307.10169] Challenges and Applications of Large Language Models](https://arxiv.org/abs/2307.10169))
6. Wolfram's conversation with Lex Fridman: ChatGPT is learning something like the "logic of human language" (recover that conversation)
7. Adversarial attacks
1. Gary Marcus grape juice example. Pg 49. Or come up with your own creative one!
2. What are some good examples arguing that ChatGPT is a combinatorial parrot that cannot effectively extrapolate or have goals?
2. Once you have explained this, pose the question: does this type of architecture prohibit creativity?
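To make the three highlighted points concrete, here is a minimal sketch of a single-block, decoder-only transformer in PyTorch. It is illustrative only (toy dimensions, untrained weights, no claim to match any production model): the causally-masked attention is the "message passing" between positions, the two `x = x + ...` additions are writes into the residual stream, and the final unembedding turns the stream into next-token logits.

```python
import torch
import torch.nn as nn


class MiniDecoderBlock(nn.Module):
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x):
        # Causal mask: position t may only attend to positions <= t, so
        # "messages" between tokens flow strictly left to right.
        T = x.size(1)
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask)
        x = x + attn_out               # attention writes into the residual stream
        x = x + self.mlp(self.ln2(x))  # the MLP writes into the residual stream
        return x


class MiniLM(nn.Module):
    def __init__(self, vocab_size=100, d_model=64, max_len=32):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        self.block = MiniDecoderBlock(d_model)
        self.unembed = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        T = tokens.size(1)
        x = self.tok_emb(tokens) + self.pos_emb(torch.arange(T))
        x = self.block(x)       # real models stack dozens of these blocks
        return self.unembed(x)  # logits over the vocabulary at every position


tokens = torch.randint(0, 100, (1, 8))  # a toy "prompt" of 8 random token ids
logits = MiniLM()(tokens)
next_token = logits[0, -1].argmax()     # the entire objective: predict the next token
print(next_token)
```

Note for the draft: everything downstream (chat, tools, "reasoning") is built on top of this one training objective, which is the crux of the architecture argument.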
### What is creativity?
1. **Creativity via Interpolation vs via Extrapolation**
2. **Examples of Interpolative Creativity**
* Any sort of combinatorial creativity?
* Strategic vs input complexity (NGI pg 28)
3. **Examples of Extrapolative Creativity**
* Consider Einstein, Deutsch, Galileo, and changes in world views (Thomas Kuhn's paradigm shifts, The Structure of Scientific Revolutions)
* Really home in on General Relativity and Quantum Mechanics. Could an LLM as architected today discover either? Need to learn more about how they were created
* This would absolutely yield a testable hypothesis!
* Darwin (relate it to analogy and the world model required; he drew on ideas from geology, see Dennett, I believe)
* Constructor theory, dark matter
* BOI -> page 11
4. **Some open questions (that will be addressed in further detail later in the series)**
* **What about Analogy?**
* [Melanie Mitchell: Concepts, Analogies, Common Sense & Future of AI | Lex Fridman Podcast #61 - YouTube](https://www.youtube.com/watch?v=ImKkaeUx1MU)
* See Hofstadter
* It seems that a large part of creativity comes from analogies. What do analogies require (do they require a world model of sorts)?
* Could an LLM learn analogies? Can LLMs reason by analogy and apply it across domains (e.g., in physics, using the analogy of water flow in pipes to understand current in circuits)?
* **Does it require a world model?**
* Is learning a world model required? Can a transformer (with predict-next-token architecture) learn a world model? See [Actually, Othello-GPT Has A Linear Emergent World Representation — LessWrong](https://www.lesswrong.com/s/nhGNHyJHbrofpPbRG/p/nmxzr2zsjNtjaHh7x) (a probe sketch follows this list)
* Could an LLM create (or does it already have) good internal concepts to utilize? Does it even need them?
* World model could allow for mental simulation and creative thinking (NGI pg 30)
* **Does it require intelligence?**
* "On the Measure of Intelligence", François Chollet
* Intelligence is the efficiency with which you turn experience into generalizable programs
* Intelligence is a process. The output of that process is skill, but intelligence is the process that generated that skill.
* Intelligence emerges from the interaction between a brain, a body and an environment
* [Intelligence and Generalization](Intelligence%20and%20Generalization.md)
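For the Othello-GPT question above, the standard test is a *linear probe*: train a linear classifier to read the world state (here, the board) out of residual-stream activations. Below is a hedged sketch; `activations` and `board_states` are hypothetical stand-ins for arrays you would extract from a trained model, so the probe here will only hit chance accuracy.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical stand-ins: in the real experiment these would be residual-stream
# activations and the true state of one board square at each position.
rng = np.random.default_rng(0)
n_samples, d_model = 5000, 512
activations = rng.standard_normal((n_samples, d_model))
board_states = rng.integers(0, 3, size=n_samples)  # one square: empty / mine / theirs

X_train, X_test, y_train, y_test = train_test_split(
    activations, board_states, test_size=0.2, random_state=0
)
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# With real activations, accuracy well above chance (~1/3 here) would suggest the
# board state is linearly readable from the stream; this random data sits at chance.
print("probe accuracy:", probe.score(X_test, y_test))
```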
### Creativity in Knowledge Creation
1. **What is knowledge?**
1. See Deutsch
2. **What is knowledge creation?**
* [2. Knowledge creation - Brett Hall](LLMs%20are%20Not%20Creative.md#2.%20Knowledge%20creation%20-%20Brett%20Hall)
* [Knowledge vs Intelligence](LLMs%20are%20Not%20Creative.md#3.%20ChatGPT%20isn't%20that%20great)
* [Critical and Creative Thinking - BRETT HALL](https://www.bretthall.org/critical-and-creative-thinking.html)
* [David Deutsch: Knowledge Creation and The Human Race - YouTube](https://youtu.be/YyxepLfH1ZU?t=769)
* [David Deutsch: Knowledge Creation and The Human Race](https://nav.al/david-deutsch)
* Induction
* Chapter 1 BOI
* *Philosophy and the Real World* (great examples)
* It requires problems!!! But problems require agency. This is directly aligned with Popper (see *Philosophy and the Real World*), listen here: [](https://overcast.fm/+j_cH1KflM/53:14). One must be able to notice a conflict in one's ideas, have a body.
* LLMs are induction-based (they learn from examples). But this requires a *problem*. Observation needs a problem, or a chosen object (*Philosophy and the Real World*, pg 30)
* "The knowledge in human brains and the knowledge in biological adaptations are both created by evolution in the broad sense: the variation of existing information, alternating with selection. In the case of human knowledge, the variation is by conjecture, and the selection is by criticism and experiment. In the biosphere, the variation consists of mutations (random changes) in genes, and natural selection favors the variants that most improve the ability of their organisms to reproduce, thus causing those variant genes to spread through the population." - Deutsch, David. The Beginning of Infinity (p. 78).
3. **How is creativity used?**
4. **Thought experiments!**
1. Imagine an LLM given *all knowledge present* in 1910. Could it have created the theory of general relativity?
2. Paradigm shifts (note that these are truly crucial to the advancement of civilization)
3. Need a less epic example ⚠️
5. **The knowledge is not coming from ChatGPT. The creativity is not intrinsic.**
1. The calculator again [](https://overcast.fm/+j_cH1KflM/59:03): David Deutsch's calculator example from Reason Is Fun. Why doesn't Wolfram Alpha get more credit compared to ChatGPT? This is a GREAT example. The capacity to calculate English vs the capacity to calculate mathematics. Why does calculating English get so much credit? Is it just easier to grok? Do a deep dive on the difference between logic in mathematics and the "logic" of English
2. Think of a convnet: the *programmer* had the creative idea to utilize *convolution*, not the system itself.
* Need to include the useful follow-up from Deutsch. Knowledge was encoded into human genes. However, *humans did not create that knowledge*. The process of evolution did.
* Automation vs perspiration quote
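To make the Deutsch quote above concrete, here is a toy version of "evolution in the broad sense": variation of existing information alternating with selection. Everything here (the target string, the mutation scheme, the population size) is an arbitrary illustrative assumption; the point is only that the loop is variation plus selection, with no understanding anywhere inside it.

```python
import random

TARGET = "problems are soluble"  # arbitrary target, standing in for the environment
ALPHABET = "abcdefghijklmnopqrstuvwxyz "


def fitness(candidate: str) -> int:
    # Selection pressure: how much of the candidate survives "criticism".
    return sum(a == b for a, b in zip(candidate, TARGET))


def mutate(candidate: str) -> str:
    # Variation: a random change to existing information.
    i = random.randrange(len(candidate))
    return candidate[:i] + random.choice(ALPHABET) + candidate[i + 1:]


random.seed(0)
best = "".join(random.choice(ALPHABET) for _ in TARGET)
for generation in range(1000):
    if best == TARGET:
        break
    # Vary the survivor, keep it in the pool (elitism), select the fittest.
    population = [best] + [mutate(best) for _ in range(99)]
    best = max(population, key=fitness)
print(f"generation {generation}: {best!r}")
```

Note how this connects to the convnet point above: all the "knowledge" of the target lives in the fitness function the programmer wrote. The creativity came from outside the system.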
### Synthesis: What does this mean?
1. [LLMs are Not Creative](LLMs%20are%20Not%20Creative.md)
2. **Predict the next token architecture doesn't seem suited to handle creativity + criticism**
1. Prediction is not the purpose of science, knowledge is (BOI pg 14). Yet, aren't these simply predict-the-next-token engines?
2. [Critical and Creative Thinking 3 - BRETT HALL](https://www.bretthall.org/critical-and-creative-thinking-3.html)
3. Consider that knowledge creation (outside of occurring via evolution) can occur in large steps. A chain of ideas going from A to Z need not be viable at every step in between. "Ideas can die in place of the organism". The question is, how does this map to the predict-the-next-token approach? What about when you introduce chain-of-thought and Python glue-code approaches?
3. **Fundamentally Limited to Interpolation**
* Interpolation vs Extrapolation (convex hull)
* Can a NN learn the function y = x^2 outside of the support? What about as you add single points? (See the experiment sketch after this list.)
* It *can* extrapolate if you restrict the hypothesis space! But this is an example of encoding knowledge into a system: [Tweet / Twitter](https://twitter.com/phillip_isola/status/1680566634514116609)
* Is creativity simply the act of combining already known things? Or is it true extrapolation?
* See section 3 of [On Creativity of LLMs](https://arxiv.org/pdf/2304.00008.pdf).
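Here is a quick, runnable version of the y = x^2 experiment above, sketched with scikit-learn under toy settings: an unconstrained MLP fits the training support ([-1, 1], i.e. the convex hull of the training inputs) but degrades outside it, while restricting the hypothesis space to degree-2 polynomials (via `np.polyfit`) extrapolates exactly, because the knowledge "quadratic" was encoded in by hand. Exact numbers will vary; the qualitative gap is the point.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
x_train = rng.uniform(-1, 1, 500).reshape(-1, 1)  # support (convex hull): [-1, 1]
y_train = x_train.ravel() ** 2

# Unconstrained hypothesis space: a generic MLP.
mlp = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=5000, random_state=0)
mlp.fit(x_train, y_train)

# Restricted hypothesis space: degree-2 polynomials (knowledge encoded by hand).
poly = np.polyfit(x_train.ravel(), y_train, deg=2)

for x in [0.5, 2.0, 5.0]:  # first inside the support, then well outside it
    print(f"x={x}: true={x**2:6.2f}  mlp={mlp.predict([[x]])[0]:8.2f}  "
          f"poly={np.polyval(poly, x):6.2f}")
```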
### Counter Arguments / Steel Man the other side
* But wait, can't they create explanations?
* [What AI can do with a toolbox... Getting started with Code Interpreter](https://www.oneusefulthing.org/p/what-ai-can-do-with-a-toolbox-getting)
* Summarizing vs understanding information (address this further later in series)
* [https://twitter.com/intuitmachine/status/1677315919410757634?s=46](https://twitter.com/intuitmachine/status/1677315919410757634?s=46)
* According to Deutsch, understanding comes through explanatory theories (FoR pg 6)
* Phase transition / Emergence argument
1. But have we already hit the data bottleneck?
2. Is Othello-GPT building a world model? If so, could that mean that as we get more data, world models could be learned that take us past predict-the-next-token architectures?
### What does this mean / What should you do?
1. **We should not expect transformational AI** (see [Why transformative artificial intelligence is really, really hard to achieve](https://thegradient.pub/why-transformative-artificial-intelligence-is-really-really-hard-to-achieve/))
* Should we expect exponential/transformational AI? That would (likely) require that we get past the bottlenecks. To get past the bottlenecks, AI must drastically improve the speed of progress in all areas. But that means innovation and creativity must be automated. My argument is that we are not there yet and won't be any time soon. [What if we could automate invention? - by Matt Clancy](https://mattsclancy.substack.com/p/what-if-we-could-automate-invention)
2. **If AI isn't creative, it cannot (on its own) be dangerous**
* Steel man: Connor Leahy argument that AI is dangerous: [mlst connor leahy - YouTube](https://www.youtube.com/results?search_query=mlst+connor+leahy). Note: We should disentangle these two things - being creative vs being dangerous.
* Counter argument: But also consider that if AI is *not creative*, by Brett Hall's argument [here](https://www.bretthall.org/superintelligence-4.html), then AI should not be dangerous.
3. **Babble and Prune**
* LLMs can help us babble more effectively (see the sketch at the end of this list). But note that this babble will (frequently) represent the *average* of human opinion on a given topic. Does the average of human opinion (or some fine-tuned thing based on preexisting knowledge) actually help us? Is that the type of thing that spawns creativity? Or keeps us stuck? Is it just helpful to automate the babble side of the creativity-then-criticism loop?
* As these tools get better this will be a more appropriate technique
4. **Embrace your uniqueness, practice creativity**
* Use these tools to get faster so you have more time to be creative. Practice creative thought, it is a skill!
* Peter Thiel's ideas from *Zero to One*
5. **Don't copy templates from people. Learning predefined patterns is not quite as useful**
* It still is useful if you can apply them effectively. However, the true superpower is being creative.
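For the babble-and-prune point in item 3, here is a minimal sketch of the loop. `generate_candidates` is a stand-in for any LLM call and `criticize` is a stand-in for your own judgment, a test suite, or an experiment; both names are illustrative assumptions, not a real API. The LLM supplies cheap variation (babble); the human-authored criticism supplies the selection (prune).

```python
import random


def generate_candidates(prompt: str, n: int) -> list[str]:
    """Stand-in for an LLM call that babbles n candidate ideas."""
    return [f"{prompt} -- candidate {i}" for i in range(n)]


def criticize(candidate: str) -> float:
    """Stand-in critic: in practice, your own judgment, tests, or experiments."""
    return random.random()


def babble_and_prune(prompt: str, n: int = 20, keep: int = 3) -> list[str]:
    candidates = generate_candidates(prompt, n)               # babble: cheap variation
    ranked = sorted(candidates, key=criticize, reverse=True)  # prune: selection by criticism
    return ranked[:keep]


random.seed(0)
for idea in babble_and_prune("essay angles on LLM creativity"):
    print(idea)
```

Note that this is the same variation-plus-selection shape as the evolution sketch earlier: the selective knowledge still comes from the critic, not the babbler.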
### Categorize Below...
## Todos
* Can’t be programmed to disobey or else you are obeying the command to disobey
* If you try to build a repertoire of fine-tuned skills, that still leaves you with N skills, which is still infinitely far away from infinity! (Deutsch)
* do we have enough data on creating new knowledge? Could that be a limiting factor?
* Creative error correction (Deutsch states most ideas are never actually tried/tested; the scientist simply rules them out in their mind. Could an LLM do this?)
* Transformers have learned algorithms that allow them to do things they weren't "programmed" to do. What does this say about induction? Didn't they learn this via induction? I thought we couldn't learn via induction...?
* [https://twitter.com/phillip_isola/status/1680371125199548417?s=46](https://twitter.com/phillip_isola/status/1680371125199548417?s=46) transformers extrapolate due to constraints they place on hypothesis space
* An explanation is hard to vary. This means there are some constraints present
* Need deutschs most up to date definition of explanation - [The Tim Ferriss Show Transcripts: David Deutsch and Naval Ravikant — The Fabric of Reality, The Importance of Disobedience, The Inevitability of Artificial General Intelligence, Finding Good Problems, Redefining Wealth, Foundations of True Knowledge, Harnessing Optimism, Quantum Computing, and More (#662) – The Blog of Author Tim Ferriss](https://tim.blog/2023/03/24/david-deutsch-naval-ravikant-transcript/amp/)
* fabric of reality, oracle theory
* Humans create explanations not encoded in their genes. Can LLMs do that for data not in their training set?
* I am not entirely sure, but part of LLMs' "power" could be due to us not understanding how large 65 billion parameters are and how those in turn create a massive LSH
* Ways of thinking have evolved and improved over time. Think about Aristotle to Galileo to Newton to Einstein. Could an LLM trained on all data from one way of thinking at a point in time ever evolve to the next?
* Idea: what discoveries made since 2021 can ChatGPT not even remotely grok?
* Consider non-Euclidean geometry. To get there we had to question assumptions. But questioning assumptions itself was a *paradigm shift*. Now, an LLM could be programmed to question assumptions, but could it learn new *paradigm shifts* on its own? No
* Can AI even identify problems? Problems are crucial to Popper's philosophy! But can AI even determine what a problem is?
* Probe GPT-4 on deep intuitions:
* [Linear Algebra Fundamentals](https://chat.openai.com/share/f3f1f9db-b96f-4346-98a0-283b8e8898ca)
* Compare w/ the key intuition of [The Spirit of Linear Algebra](The%20Spirit%20of%20Linear%20Algebra.md) and Jeremy Kun's book (chapter on linear algebra)