# Hard to Vary vs Overfit
* An interesting thought: if you are in a situation where you need to come up with a hard-to-vary solution, large language models do a bad job. What's interesting is that a lot of what we do on a day-to-day basis is not hard to vary, and I'd grant that. Look at most of what we produce and you could have swapped in any word. I think that's one reason LLMs feel so useful, and why there's a weird conflation here. For programming they work largely because we have so many examples, and the text in those examples is so unbelievably constrained; it's close to the perfect use case for a large language model. And the text we write, myself included, is so frequently easy to vary: easy to manipulate, easy to move around, you can swap in different words. LLMs will happily spitball a ton of that at you.
* When you try to get them to write something that is hard to vary, though, I've found them barely more useful than a random number generator spitting text onto the screen. You could say that's user error, but then where does the intelligence lie? It lies in me.
* So I think it comes down to two things they're good at. First, if they have a pre-cached pattern, they're good at retrieving it; that's essentially a text-based lookup. Second, with temperature and sampling they recombine existing material, and a lot of material combines very easily (there's a tiny sketch of the temperature point below). Neither of those produces good explanations. If an LLM gives you a good explanation, it's a pre-cached one; it is incredibly rare for it to produce a good explanation it doesn't already have a pattern for, because good explanations are hard to vary, so its attempts will have all sorts of things wrong with them. The counterargument would be: don't we already have good explanations for it to draw on? Isn't a program a good explanation of the thing you want to model? I'd say yes, but that again is a pre-cached pattern. An interesting thing to riff on.
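A minimal sketch of the temperature point, assuming a hypothetical four-word vocabulary and made-up logits (none of this comes from a real model):

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    """Turn raw logits into a sampling distribution at a given temperature."""
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()          # subtract max for numerical stability
    probs = np.exp(scaled)
    return probs / probs.sum()

# Hypothetical next-token logits for "The capital of France is ___"
vocab = ["Paris", "Lyon", "London", "banana"]
logits = [5.0, 2.0, 1.0, -3.0]      # invented numbers, not from a real model

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(f"T={t}: " + ", ".join(f"{w}={p:.2f}" for w, p in zip(vocab, probs)))

# Low temperature: nearly all the mass sits on the single pre-cached
# continuation, which is the retrieval / text-lookup behavior.
# High temperature: the mass spreads out, so many tokens become roughly
# interchangeable -- the output is easy to vary almost by construction.
```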
* Hard to vary vs. overfit? Think about cover parameters: if only one parameter setting did well, is that a hard-to-vary explanation or an overfit one? Description vs. explanation. (Toy fit below.)
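A toy way to poke at that question, with invented data (a linear law plus noise) and numbers chosen only for illustration: fit the same points with a two-parameter model and a ten-parameter model and compare held-out error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented data: an underlying linear law plus noise.
x_train = np.linspace(0, 1, 12)
y_train = 2.0 * x_train + 0.5 + rng.normal(0, 0.2, size=x_train.shape)
x_test = np.linspace(0, 1, 50)
y_test = 2.0 * x_test + 0.5

for degree in (1, 9):
    coeffs = np.polyfit(x_train, y_train, degree)   # more knobs as degree grows
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree}: train MSE {train_mse:.4f}, test MSE {test_mse:.4f}")

# The degree-9 fit chases the noise: it does better on the training points but
# worse on held-out ones, and very different coefficient vectors would have fit
# the training data nearly as well (lots of slack). The degree-1 model has
# almost no slack: move its two numbers and the fit breaks. That slack vs.
# no-slack distinction is one way to separate "overfit" from "hard to vary".
```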
---
Date: 20250514
Links to:
Tags:
References:
* []()