Due to holidays and preparations for new projects, there has been no update for two weeks. It also helps that there were no particularly significant events requiring commentary, and most of my thoughts on models have been covered recently. We could look ahead, but saying certain things now would feel too speculative; better to wait until they actually appear, since the wait won't be long.
Next week the major tech companies have a string of events: OpenAI's live demo on May 13, Microsoft's series of developer events starting the same day, and Google I/O on May 15.
The information that is public, or at least highly credible, is as follows:
OpenAI will ship major updates to ChatGPT and GPT-4. Rumors about a search engine and GPT-5 have been publicly denied by Sam Altman, but the rumors about voice interaction and agents look quite substantial.
Microsoft may release MAI-1 (perhaps on May 16?), its first in-house large model since absorbing Inflection AI, rumored to be around 500B parameters. This largely confirms our earlier judgment, or speculation, that the relationship between Microsoft and OpenAI is growing increasingly delicate.
What Google will present is still unclear, but the move to fully integrate Gemini into Workspace and Google One is already significant; rapid productization alone would be good enough.
Undoubtedly, after a brief information vacuum, the tech giants will kick off a new cycle of major releases and updates. Discussing this in the context of last week's events makes it even more interesting:
1. Apple introduced the M4 chip, ending the M3's life cycle after only six months, and released the most expensive (and most powerful) iPad Pro. The product is almost too perfect: I like it a lot, but I simply can't find a use case for it, since at the same price one could get a high-memory MacBook.
2. DeepSeek-V2 was released and open-sourced, a strong boost for domestic Chinese models.
3. During ICLR, DeepMind presented a series of papers, several of which I found very interesting. Their market impact is certainly not as shocking as AlphaFold 3's, but that was inevitable, wasn't it? Shouldn't we be looking at how models can be "jailbroken" into leaking private information, or at how LLMs can serve as foundational tools that reshape existing research workflows?
4. People are focusing on a potential Transformer alternative: KAN. In my opinion, neither Mamba nor KAN is revolutionary in the way the original Transformer was. If a comparison must be made, x-LSTM is the better one to watch: a predecessor that lost out to the Transformer, LSTM is now being reworked to fit into the broader Transformer-style architecture, which is quite valuable. Much of the so-called "mathematics" in current large-model work aims at cutting training/inference costs and enhancing memory, which is meaningful for engineering but far from "revolutionary." A minimal sketch below makes the KAN contrast concrete.
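To be concrete about what KAN actually changes, here is a minimal, hypothetical sketch contrasting a classic MLP layer with a KAN-style layer. The actual KAN paper (Liu et al., 2024) uses learnable B-spline bases plus a residual base function; the Gaussian-bump basis, dimensions, and random parameters below are simplifications of mine, for illustration only.

```python
# Sketch: MLP layer vs. KAN-style layer (untrained, illustrative parameters).
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, n_basis = 4, 3, 8

def mlp_layer(x, W, b):
    # Classic layer: linear map first, then a FIXED nonlinearity.
    return np.tanh(W @ x + b)

def kan_layer(x, centers, widths, coefs):
    # KAN-style layer: a LEARNABLE univariate function phi_ij on each edge
    # (here a weighted sum of Gaussian bumps), then a plain sum over inputs:
    #     y_i = sum_j phi_ij(x_j)
    bumps = np.exp(-(((x[None, :] - centers[:, None]) / widths[:, None]) ** 2))  # (n_basis, d_in)
    return np.einsum('ijk,kj->i', coefs, bumps)  # coefs: (d_out, d_in, n_basis)

x = rng.normal(size=d_in)
W, b = rng.normal(size=(d_out, d_in)), rng.normal(size=d_out)
centers = np.linspace(-2.0, 2.0, n_basis)
widths = np.full(n_basis, 0.5)
coefs = rng.normal(size=(d_out, d_in, n_basis))

print("MLP layer output:      ", mlp_layer(x, W, b))
print("KAN-style layer output:", kan_layer(x, centers, widths, coefs))
```

The change is in where the learnable nonlinearity lives (on the edges, rather than after the linear map), not a new computational paradigm, which is why I hesitate to call it revolutionary.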
Continuing from the fourth point: essentially, large language models are knowledge compressors. With this compressor we can do Q&A, search, and function calling; write programs to build our own tools; act as agents...
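The word "compressor" can be taken literally. By Shannon's source-coding argument, a model that assigns probability p to the next token can encode that token in about -log2(p) bits (via arithmetic coding, for instance), so minimizing cross-entropy during training is exactly minimizing compressed size. The toy character-level bigram model below is entirely made up to show the accounting; it stands in for the LLM:

```python
# Toy illustration of "language model = compressor".
import math
from collections import defaultdict

text = "abracadabra abracadabra"

# "Train" a character-level bigram model by counting transitions.
counts = defaultdict(lambda: defaultdict(int))
for prev, cur in zip(text, text[1:]):
    counts[prev][cur] += 1

def prob(prev, cur):
    # Model's probability of `cur` given the previous character.
    return counts[prev][cur] / sum(counts[prev].values())

# Ideal code length of the text under the model: sum of -log2 p per symbol.
model_bits = sum(-math.log2(prob(p, c)) for p, c in zip(text, text[1:]))
raw_bits = 8 * (len(text) - 1)  # naive 8-bits-per-character baseline

print(f"under the model: {model_bits:.1f} bits; raw encoding: {raw_bits} bits")
# The better the model predicts the data, the fewer bits it needs --
# the training objective (cross-entropy) IS the compression objective.
```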
Actually, from the moment GPT-4 was released, it was proven that humanity had reached this level. The release of more models since then only proves two things: 1. GPT was not an accident; 2. the method of knowledge compression is now mastered by more people.
Thus, only two basic questions remain on this path: 1. What do we use it for? 2. Where do the models go from here?
Regarding the first question, which has been the focus for months: what kind of products will actually land? Consumer (C-end) or business (B-end)? Software or hardware? How do we achieve positive commercial feedback?
This is why we see Apple entering the fray, releasing the M4 early, with even rumors of it building data centers on the M2 Ultra as the base chip. Everyone knows I rate Apple's All-in-One capability highly, and I still believe it is the most important capability when it comes to AI implementation.
This is why we see domestic models, represented by Kimi, becoming more user-friendly. More friends around me are starting to use Kimi as an indispensable tool in their daily work. This trend will likely accelerate in the coming months.
This is why we see more people believing in Cloud and SaaS.
This is why we see more people believing that AI PCs and AI phones will drive more demand.
...
However, the second question is what the tech giants are truly focused on. It may even determine the future survival of OpenAI.
We imagine AI in our own image. We concede that even a person who spent an entire lifetime learning would acquire orders of magnitude less knowledge than a large model holds; such is the advantage of digital computation.
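A back-of-envelope calculation shows the gap; every number here is a rough assumption of mine, not a sourced figure:

```python
# Rough scale comparison: lifetime human reading vs. a frontier training corpus.
words_per_minute = 250      # assumption: fast, sustained reading speed
hours_per_day = 8           # assumption: reading as a full-time job
years = 60                  # assumption: an entire working life

human_words = words_per_minute * 60 * hours_per_day * 365 * years
corpus_tokens = 1e13        # assumption: ~10^13 tokens, the scale commonly
                            # reported for frontier training runs

print(f"lifetime reading : ~{human_words:.1e} words")   # ~2.6e9
print(f"training corpus  : ~{corpus_tokens:.0e} tokens")
print(f"gap              : ~{corpus_tokens / human_words:,.0f}x")  # thousands of x
```

Three to four orders of magnitude, even under generous assumptions for the human.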
Yet, while we take pride in our biological computing advantages, we also hope AI can possess these abilities: deep thinking, learning through interaction, memory...
On top of the ability to compress the world's knowledge, the next generation of models needs to deliver more.
Echoing the title, there are about thirty hours left until OpenAI's live demo. We already know there won't be a so-called GPT-5, but we can still look forward to any information regarding unreleased model capabilities and the direction of the next step.
We don't necessarily need a successor to the Transformer yet; the Transformer is merely the foundation for the next step. Whichever giant takes that next step first, it may signal that soon after, another giant will truly fall.
Or, there is another possibility: the likelihood of increased regulation is steadily rising...