The quest to type Chinese on a QWERTY keyboard created autocomplete

This is an excerpt from The Chinese Computer: A Global History of the Information Age by Thomas S. Mullaney, published on May 28 by The MIT Press. It has been lightly edited.

ymiw2

klt4

pwyy1

wdy6

o1

dfb2

wdv2

fypw3

uet5

dm2

dlu1 …

A young Chinese man sat down at his QWERTY keyboard and rattled off an enigmatic string of letters and numbers.

Was it code? Child’s play? Confusion? It was Chinese.

The beginning of Chinese, at least. These forty-four keystrokes marked the first steps in a process known as “input” or shuru: the act of getting Chinese characters to appear on a computer monitor or other digital device using a QWERTY keyboard or trackpad.

Stills taken from a 2013 Chinese input competition screencast.
COURTESY OF MIT PRESS

Across all computational and digital media, Chinese text entry relies on software programs called “Input Method Editors,” better known as “IMEs” or simply “input methods” (shurufa). IMEs are a form of “middleware,” so named because they operate in between the hardware of the user’s device and the software of its program or application. Whether a person is composing a Chinese document in Microsoft Word, searching the web, sending text messages, or otherwise, an IME is always at work, intercepting all of the user’s keystrokes and trying to figure out which Chinese characters the user wants to produce. Input, simply put, is the way ymiw2klt4pwyy … becomes a string of Chinese characters.

IMEs are restless creatures. From the moment a key is depressed, or a stroke swiped, they set off on a dynamic, iterative process, snatching up user-inputted data and searching computer memory for potential Chinese character matches. The most popular IMEs these days are based on Chinese phonetics—that is, they use the letters of the Latin alphabet to describe the sound of Chinese characters, with mainland Chinese operators using the country’s official Romanization system, Hanyu pinyin. 
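To make that loop of intercepting, searching, presenting, and confirming concrete, here is a minimal Python sketch of a phonetic input method. It is an illustration only: the `CANDIDATES` table, the `lookup` and `confirm` functions, and the order of the candidates are all hypothetical. Real IMEs draw on far larger dictionaries and rank candidates by usage.

```python
# Toy candidate table: pinyin strings mapped to characters or words.
# Entries and their ordering are illustrative, not taken from a real IME.
CANDIDATES = {
    "chao": ["超", "抄", "炒", "朝"],
    "chaoxi": ["抄袭", "潮汐"],
    "xi": ["西", "系", "习", "袭"],
}


def lookup(buffer: str) -> list[str]:
    """Return every candidate whose pinyin key starts with the typed buffer."""
    matches: list[str] = []
    # Shorter keys first, so single characters precede longer words.
    for key in sorted(CANDIDATES, key=len):
        if key.startswith(buffer):
            matches.extend(CANDIDATES[key])
    return matches


def confirm(buffer: str, choice: int) -> str:
    """Simulate the user picking a numbered candidate (1-based) from the menu."""
    return lookup(buffer)[choice - 1]


if __name__ == "__main__":
    typed = ""
    for key in "chaoxi":
        typed += key  # the IME intercepts each keystroke...
        print(typed, lookup(typed))  # ...and refreshes the candidate menu
    print("confirmed:", confirm(typed, 1))
```

The point of the sketch is that the search reruns after every keystroke, so the menu narrows as the user types, and the characters appear only once the user confirms a numbered candidate.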

A series of screenshots of the Chinese Input Method Editor pop-up menu showing the process of typing (抄袭 / “plagiarism”).
COURTESY OF MIT PRESS

This young man’s name was Huang Zhenyu (also known by his nom de guerre, Yu Shi). He was one of around sixty contestants that day, each wearing a bright red shoulder sash—like a tickertape parade of old, or a beauty pageant. “Love Chinese Characters” (Ai Hanzi) was emblazoned in vivid, golden yellow on a poster at the front of the hall. The contestants’ task was to transcribe a speech by outgoing Chinese president Hu Jintao, as quickly and as accurately as they could. “Hold High the Great Banner of Socialism with Chinese Characteristics,” it began, or in the original:  高举中国特色社会主义伟大旗帜为夺取全面建设小康社会新胜利而奋斗. Huang’s QWERTY keyboard did not permit him to enter these characters directly, however, and so he entered the quasi-gibberish string of letters and numbers instead: ymiw2klt4pwyy1wdy6…

With these four-dozen keystrokes, Huang was well on his way not only to winning the 2013 National Chinese Characters Typing Competition, but also to clocking one of the fastest typing speeds ever recorded, anywhere in the world.

ymiw2klt4pwyy1wdy6 … is not the same as 高举中国特色社会主义 … The keys that Huang actually depressed on his QWERTY keyboard (his “primary transcript,” as we could call it) were completely different from the symbols that ultimately appeared on his computer screen, namely the “secondary transcript” of Hu Jintao’s speech. This is true for every one of the world’s billion-plus Sinophone computer users. In Chinese computing, what you type is never what you get.

For readers accustomed to English-language word processing and computing, this should come as a surprise. For example, were you to compare the paragraph you’re reading right now against a key log showing exactly which buttons I depressed to produce it, the exercise would be unenlightening (to put it mildly). “F-o-r-_-r-e-a-d-e-r-s-_-a-c-c-u-s-t-o-m-e-d-_-t-o-_-E-n-g-l-i-s-h …” it would read (forgiving any typos or edits). In English-language typewriting and computer input, a typist’s primary and secondary transcripts are, in principle, identical. The symbols on the keys and the symbols on the screen are the same.

Not so for Chinese computing. When inputting Chinese, the symbols a person sees on their QWERTY keyboard are always different from the symbols that ultimately appear on the monitor or on paper. Every single computer and new media user in the Sinophone world—no matter if they are blazing-fast or molasses-slow—uses their device in exactly the same way as Huang Zhenyu, constantly engaged in this iterative process of criteria-candidacy-confirmation, using one IME or another. Not some Chinese-speaking users, mind you, but all. This is the first and most basic feature of Chinese computing: Chinese human-computer interaction (HCI) requires users to operate entirely in code all the time.

If Huang Zhenyu’s mastery of a complex alphanumeric code weren’t impressive enough, consider the staggering speed of his performance. He transcribed the first 31 Chinese characters of Hu Jintao’s speech in roughly 5 seconds, for an extrapolated speed of 372 Chinese characters per minute. By the close of the grueling 20-minute contest, one extending over thousands of characters, he crossed the finish line with an almost unbelievable speed of 221.9 characters per minute.

That’s 3.7 Chinese characters every second.
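The arithmetic behind these figures can be checked in a few lines of Python. The function names below are my own. The inputs are the numbers quoted above: 31 characters in roughly 5 seconds, and 221.9 characters per minute overall.

```python
def chars_per_minute(chars: int, seconds: float) -> float:
    """Extrapolate a per-minute rate from a short timed burst."""
    return chars * 60 / seconds


def chars_per_second(per_minute: float) -> float:
    """Convert a per-minute rate into a per-second rate."""
    return per_minute / 60


opening_burst = chars_per_minute(31, 5)    # opening 31 characters in about 5 seconds
overall_rate = chars_per_second(221.9)     # sustained rate over the 20-minute contest

print(f"Opening burst: {opening_burst:.0f} characters per minute")
print(f"Overall: {overall_rate:.1f} characters per second")
```

The opening figure is an extrapolation from a five-second burst. The 221.9 figure is the sustained rate over the full contest, which is why the two differ so widely.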

In the context of English, Huang’s opening 5 seconds would have been the equivalent of around 375 English words-per-minute, with his overall competition speed easily surpassing 200 WPM—a blistering pace unmatched by anyone in the Anglophone world (using QWERTY, at least). In 1985, Barbara Blackburn achieved a Guinness Book of World Records–verified performance of 170 English words-per-minute (on a typewriter, no less). Speed demon Sean Wrona later bested Blackburn’s score with a performance of 174 WPM (on a computer keyboard, it should be noted). As impressive as these milestones are, the fact remains: had Huang’s performance taken place in the Anglophone world, it would be his name enshrined in the Guinness Book of World Records as the new benchmark to beat.

Huang’s speed carried special historical significance as well.

For a person living between the years 1850 and 1950—the period examined in the book The Chinese Typewriter—the idea of producing Chinese by mechanical means at a rate of over two hundred characters per minute would have been virtually unimaginable. Throughout the history of Chinese telegraphy, dating back to the 1870s, operators maxed out at perhaps a few dozen characters per minute. In the heyday of mechanical Chinese typewriting, from the 1920s to the 1970s, the fastest speeds on record were just shy of eighty characters per minute (with the majority of typists operating at far slower rates). When it came to modern information technologies, that is to say, Chinese was consistently one of the slowest writing systems in the world.

What changed? How did a script so long disparaged as cumbersome and helplessly complex suddenly rival—exceed, even—computational typing speeds clocked in other parts of the world? Even if we accept that Chinese computer users are somehow able to engage in “real time” coding, shouldn’t Chinese IMEs result in a lower overall “ceiling” for Chinese text processing as compared to English? Chinese computer users have to jump through so many more hoops, after all, over the course of a cumbersome, multistep process: the IME has to intercept a user’s keystrokes, search in memory for a match, present potential candidates, and wait for the user’s confirmation. Meanwhile, English-language computer users need only depress whichever key they wish to see printed on screen. What could be simpler than the “immediacy” of “Q equals Q,” “W equals W,” and so on?

Tom Mullaney
COURTESY OF TOM MULLANEY

To unravel this seeming paradox, we will examine the first Chinese computer ever designed: the Sinotype, also known as the Ideographic Composing Machine. Debuted in 1959 by MIT professor Samuel Hawks Caldwell and the Graphic Arts Research Foundation, this machine featured a QWERTY keyboard, which the operator used to input not the phonetic values of Chinese characters but the brushstrokes out of which Chinese characters are composed. The objective of Sinotype was not to “build up” Chinese characters on the page, though, the way a user builds up English words through the successive addition of letters. Instead, each stroke “spelling” served as an electronic address that Sinotype’s logical circuit used to retrieve a Chinese character from memory. In other words, the first Chinese computer in history was premised on the same kind of “additional steps” as seen in Huang Zhenyu’s prizewinning 2013 performance.

During Caldwell’s research, he discovered unexpected benefits of all these additional steps—benefits entirely unheard of in the context of Anglophone human-machine interaction at that time. The Sinotype, he found, needed far fewer keystrokes to find a Chinese character in memory than to compose one through conventional means of inscription. By way of analogy, to “spell” a nine-letter word like “crocodile” (c-r-o-c-o-d-i-l-e) took far more time than to retrieve that same word from memory (“c-r-o-c-o-d” would be enough for a computer to make an unambiguous match, after all, given the absence of other words with similar or identical spellings). Caldwell called his discovery “minimum spelling,” making it a core part of the first Chinese computer ever built. 
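The “crocodile” analogy can be sketched in Python as a search for the shortest prefix that no other entry in memory shares. This is an illustration under stated assumptions, not a reconstruction of Sinotype, which worked on stroke sequences and hardware logic rather than English letters. The `minimum_spelling` function and the four-word `VOCAB` list are hypothetical, and the list was chosen so that “c-r-o-c-o-d” comes out as the unambiguous match.

```python
def minimum_spelling(word: str, vocabulary: list[str]) -> str:
    """Return the shortest prefix of `word` that no other vocabulary entry shares."""
    others = [entry for entry in vocabulary if entry != word]
    for length in range(1, len(word) + 1):
        prefix = word[:length]
        if not any(entry.startswith(prefix) for entry in others):
            return prefix
    # The whole word is a prefix of some longer entry, so it cannot be shortened.
    return word


# Toy vocabulary (an assumption): a real dictionary holds many more near matches.
VOCAB = ["crocodile", "crocoite", "crocus", "crochet"]

if __name__ == "__main__":
    for entry in VOCAB:
        spelling = minimum_spelling(entry, VOCAB)
        saved = len(entry) - len(spelling)
        print(f"{entry}: type '{spelling}' and save {saved} keystrokes")
```

How many keystrokes are saved depends entirely on what else is stored in memory. A larger vocabulary with more look-alike entries would push the minimum spelling closer to the full word.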

Today, we know this technique by a different name: “autocompletion,” a strategy of human-computer interaction in which additional layers of mediation result in faster textual input than the “unmediated” act of typing. Decades before its rediscovery in the Anglophone world, then, autocompletion was first invented in the arena of Chinese computing.