


Andrew Ng's ChatGPT class went viral: AI gave up writing words backwards, but understood the whole world
Jun 03, 2023 pm 09:27 PMI didn’t expect that ChatGPT would still make stupid mistakes to this day?
Master Andrew Ng pointed it out at the latest class:
ChatGPT will not reverse words!
For example, let it reverse the word lollipop, and the output is pilollol, which is completely confusing.
Oh, this is indeed a bit surprising.
So much so that after netizens who listened to the class posted on Reddit, they immediately attracted a large number of onlookers, and the post quickly reached 6k views.
And this is not an accidental bug. Netizens found that ChatGPT is indeed unable to complete this task, and the results of our personal testing are also the same.
- 1 token≈4 English characters≈three-quarters of a word;
- 100 tokens≈75 words;
- 1-2 sentences ≈30 tokens;
- A paragraph ≈ 100 tokens, 1500 words ≈ 2048 tokens;
The higher the token-to-char (token to word) ratio, the higher the processing cost. Therefore, processing Chinese tokenize is more expensive than English.
It can be understood that token is a way for large models to understand the real world of humans. It's very simple and greatly reduces memory and time complexity.
But there is a problem with tokenizing words, which makes it difficult for the model to learn meaningful input representations. The most intuitive representation is that it cannot understand the meaning of the words.
At that time, Transformers had done corresponding optimization. For example, a complex and uncommon word was divided into a meaningful token and an independent token.
Just like "annoyingly" is divided into two parts: "annoying" and "ly", the former retains its own meaning, while the latter is more common.
This has also resulted in the amazing effects of ChatGPT and other large model products today, which can understand human language very well.
As for the inability to handle such a small task as word reversal, there is naturally a solution.
The simplest and most direct way is to separate the words yourself~
Or you can let ChatGPT do it step by step , first tokenize each letter.
Or maybe let it write a program that reverses letters, and then the result of the program is correct. (dog head)
However, GPT-4 can also be used, and there is no such problem in actual testing.
△Actual measurement GPT-4
In short, token is the cornerstone of AI’s understanding of natural language.
As a bridge for AI to understand human natural language, the importance of tokens has become increasingly obvious.
It has become a key determinant of the performance of AI models and the billing standard for large models.
There is even token literature
As mentioned above, token can facilitate the model to capture more fine-grained semantic information, such as word meaning, word order, grammatical structure, etc. In sequence modeling tasks (such as language modeling, machine translation, text generation, etc.), position and order are very important for model building.
Only when the model accurately understands the position and context of each token in the sequence can it predict the content better and correctly and give reasonable output.
Therefore, the quality and quantity of tokens have a direct impact on the model effect.
Starting this year, when more and more large models are released, the number of tokens will be emphasized. For example, the details of the exposure of Google PaLM 2 mentioned that it used 3.6 trillion tokens for training.
And many big names in the industry have also said that tokens are really crucial!
Andrej Karpathy, an AI scientist who switched from Tesla to OpenAI this year, said in his speech:
More tokens can enable models Think better.
And he emphasized that the performance of the model is not determined solely by the parameter size.
For example, the parameter size of LLaMA is much smaller than that of GPT-3 (65B vs 175B), but because it uses more tokens for training (1.4T vs 300B), LLaMA is more powerful.
With its direct impact on model performance, token is still the billing standard for AI models.
Take OpenAI’s pricing standard as an example. They charge in units of 1K tokens. Different models and different types of tokens have different prices.
In short, once you step into the field of AI large models, you will find that token is an unavoidable knowledge point.
Well, even token literature has been derived...
But it is worth mentioning that, what role does token play in the Chinese world? What it should be translated into has not been fully decided yet.
The literal translation of "token" is always a bit weird.
GPT-4 thinks it is better to call it "word element" or "tag", what do you think?
Reference link:
[1]https://www.reddit.com/r/ChatGPT/comments/13xxehx/chatgpt_is_unable_to_reverse_words/
[2]https://help.openai.com/en/articles/4936856-what-are-tokens-and-how-to-count-them
[3]https://openai.com /pricing
The above is the detailed content of Andrew Ng's ChatGPT class went viral: AI gave up writing words backwards, but understood the whole world. For more information, please follow other related articles on the PHP Chinese website!

Hot AI Tools

Undress AI Tool
Undress images for free

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Ethereum is a decentralized application platform based on smart contracts, and its native token ETH can be obtained in a variety of ways. 1. Register an account through centralized platforms such as Binance and Ouyiok, complete KYC certification and purchase ETH with stablecoins; 2. Connect to digital storage through decentralized platforms, and directly exchange ETH with stablecoins or other tokens; 3. Participate in network pledge, and you can choose independent pledge (requires 32 ETH), liquid pledge services or one-click pledge on the centralized platform to obtain rewards; 4. Earn ETH by providing services to Web3 projects, completing tasks or obtaining airdrops. It is recommended that beginners start from mainstream centralized platforms, gradually transition to decentralized methods, and always attach importance to asset security and independent research, to

The most suitable tools for querying stablecoin markets in 2025 are: 1. Binance, with authoritative data and rich trading pairs, and integrated TradingView charts suitable for technical analysis; 2. Ouyi, with clear interface and strong functional integration, and supports one-stop operation of Web3 accounts and DeFi; 3. CoinMarketCap, with many currencies, and the stablecoin sector can view market value rankings and deans; 4. CoinGecko, with comprehensive data dimensions, provides trust scores and community activity indicators, and has a neutral position; 5. Huobi (HTX), with stable market conditions and friendly operations, suitable for mainstream asset inquiries; 6. Gate.io, with the fastest collection of new coins and niche currencies, and is the first choice for projects to explore potential; 7. Tra

The real use of battle royale in the dual currency system has not yet happened. Conclusion In August 2023, the MakerDAO ecological lending protocol Spark gave an annualized return of $DAI8%. Then Sun Chi entered in batches, investing a total of 230,000 $stETH, accounting for more than 15% of Spark's deposits, forcing MakerDAO to make an emergency proposal to lower the interest rate to 5%. MakerDAO's original intention was to "subsidize" the usage rate of $DAI, almost becoming Justin Sun's Solo Yield. July 2025, Ethe

What is Treehouse(TREE)? How does Treehouse (TREE) work? Treehouse Products tETHDOR - Decentralized Quotation Rate GoNuts Points System Treehouse Highlights TREE Tokens and Token Economics Overview of the Third Quarter of 2025 Roadmap Development Team, Investors and Partners Treehouse Founding Team Investment Fund Partner Summary As DeFi continues to expand, the demand for fixed income products is growing, and its role is similar to the role of bonds in traditional financial markets. However, building on blockchain

Table of Contents Crypto Market Panoramic Nugget Popular Token VINEVine (114.79%, Circular Market Value of US$144 million) ZORAZora (16.46%, Circular Market Value of US$290 million) NAVXNAVIProtocol (10.36%, Circular Market Value of US$35.7624 million) Alpha interprets the NFT sales on Ethereum chain in the past seven days, and CryptoPunks ranked first in the decentralized prover network Succinct launched the Succinct Foundation, which may be the token TGE

A verbal battle about the value of "creator tokens" swept across the crypto social circle. Base and Solana's two major public chain helmsmans had a rare head-on confrontation, and a fierce debate around ZORA and Pump.fun instantly ignited the discussion craze on CryptoTwitter. Where did this gunpowder-filled confrontation come from? Let's find out. Controversy broke out: The fuse of Sterling Crispin's attack on Zora was DelComplex researcher Sterling Crispin publicly bombarded Zora on social platforms. Zora is a social protocol on the Base chain, focusing on tokenizing user homepage and content

Directory What is Zircuit How to operate Zircuit Main features of Zircuit Hybrid architecture AI security EVM compatibility security Native bridge Zircuit points Zircuit staking What is Zircuit Token (ZRC) Zircuit (ZRC) Coin Price Prediction How to buy ZRC Coin? Conclusion In recent years, the niche market of the Layer2 blockchain platform that provides services to the Ethereum (ETH) Layer1 network has flourished, mainly due to network congestion, high handling fees and poor scalability. Many of these platforms use up-volume technology, multiple transaction batches processed off-chain

The failure to register a Binance account is mainly caused by regional IP blockade, network abnormalities, KYC authentication failure, account duplication, device compatibility issues and system maintenance. 1. Use unrestricted regional nodes to ensure network stability; 2. Submit clear and complete certificate information and match nationality; 3. Register with unbound email address; 4. Clean the browser cache or replace the device; 5. Avoid maintenance periods and pay attention to the official announcement; 6. After registration, you can immediately enable 2FA, address whitelist and anti-phishing code, which can complete registration within 10 minutes and improve security by more than 90%, and finally build a compliance and security closed loop.
