Comment on
Can gzip be a language model?
This is precisely why LLMs and AI are 99% a scam.
AI is the minimum value of a well-structured dataset, obtained by a destructive, hallucinatory compressor that is MAXIMALLY inefficient, with the inefficiency increasing nonlinearly as a model gets bigger. While it does demonstrate power, the observation that a well-structured, QC'd, properly curated dataset reveals a latent intelligence in good data clearly points to functional programming, relational programming, and the professions of the librarian and archivist as the directions where the genesis point of intelligence can be pursued. It does not point to these bullshitting AIs, which demonstrate a degraded truth in a stupendously ham-fisted, wasteful way that sends amateurs looking hopelessly in the wrong direction.
In other words, the algorithm and model are worthless, costly junk; it is the large, well-structured, high-quality dataset and the humans who maintain and contextualize it that are precious.
Scientists could have told computer people this a long time ago, if the computer people had listened.