|March 24th, 2008||#1|
Join Date: Nov 2003
Blog Entries: 34
The Legal Situation in China
[Interesting discussion of the techniques China uses to censor the Internet.]
In America, the Internet was originally designed to be free of choke points, so that each packet of information could be routed quickly around any temporary obstruction. In China, the Internet came with choke points built in. Even now, virtually all Internet contact between China and the rest of the world is routed through a very small number of fiber-optic cables that enter the country at one of three points: the Beijing-Qingdao-Tianjin area in the north, where cables come in from Japan; Shanghai on the central coast, where they also come from Japan; and Guangzhou in the south, where they come from Hong Kong. (A few places in China have Internet service via satellite, but that is both expensive and slow. Other lines run across Central Asia to Russia but carry little traffic.) In late 2006, Internet users in China were reminded just how important these choke points are when a seabed earthquake near Taiwan cut some major cables serving the country. It took months before international transmissions to and from most of China regained even their pre-quake speed, such as it was.
Thus Chinese authorities can easily do something that would be harder in most developed countries: physically monitor all traffic into or out of the country. They do so by installing at each of these few “international gateways” a device called a “tapper” or “network sniffer,” which can mirror every packet of data going in or out. This involves mirroring in both a figurative and a literal sense. “Mirroring” is the term for normal copying or backup operations, and in this case real though extremely small mirrors are employed. Information travels along fiber-optic cables as little pulses of light, and as these travel through the Chinese gateway routers, numerous tiny mirrors bounce reflections of them to a separate set of “Golden Shield” computers.Here the term’s creepiness is appropriate. As the other routers and servers (short for file servers, which are essentially very large-capacity computers) that make up the Internet do their best to get the packet where it’s supposed to go, China’s own surveillance computers are looking over the same information to see whether it should be stopped.
The mirroring routers were first designed and supplied to the Chinese authorities by the U.S. tech firm Cisco, which is why Cisco took such heat from human-rights organizations. Cisco has always denied that it tailored its equipment to the authorities’ surveillance needs, and said it merely sold them what it would sell anyone else. The issue is now moot, since similar routers are made by companies around the world, notably including China’s own electronics giant, Huawei. The ongoing refinements are mainly in surveillance software, which the Chinese are developing themselves. Many of the surveillance engineers are thought to come from the military’s own technology institutions. Their work is good and getting better, I was told by Chinese and foreign engineers who do “oppo research” on the evolving GFW so as to design better ways to get around it.
Andrew Lih, a former journalism professor and software engineer now based in Beijing (and author of the forthcoming book The Wikipedia Story), laid out for me the ways in which the GFW can keep a Chinese Internet user from finding desired material on a foreign site. In the few seconds after a user enters a request at the browser, and before something new shows up on the screen, at least four things can go wrong—or be made to go wrong.
The first and bluntest is the “DNS block.” The DNS, or Domain Name System, is in effect the telephone directory of Internet sites. Each time you enter a Web address, or URL—www.yahoo.com, let’s say—the DNS looks up the IP address where the site can be found. IP addresses are numbers separated by dots—for example, TheAtlantic.com’s is 18.104.22.168. If the DNS is instructed to give back no address, or a bad address, the user can’t reach the site in question—as a phone user could not make a call if given a bad number. Typing in the URL for the BBC’s main news site often gets the no-address treatment: if you try news.bbc.co.uk, you may get a “Site not found” message on the screen. For two months in 2002, Google’s Chinese site, Google.cn, got a different kind of bad-address treatment, which shunted users to its main competitor, the dominant Chinese search engine, Baidu. Chinese academics complained that this was hampering their work. The government, which does not have to stand for reelection but still tries not to antagonize important groups needlessly, let Google.cn back online. During politically sensitive times, like last fall’s 17th Communist Party Congress, many foreign sites have been temporarily shut down this way.
Next is the perilous “connect” phase. If the DNS has looked up and provided the right IP address, your computer sends a signal requesting a connection with that remote site. While your signal is going out, and as the other system is sending a reply, the surveillance computers within China are looking over your request, which has been mirrored to them. They quickly check a list of forbidden IP sites. If you’re trying to reach one on that blacklist, the Chinese international-gateway servers will interrupt the transmission by sending an Internet “Reset” command both to your computer and to the one you’re trying to reach. Reset is a perfectly routine Internet function, which is used to repair connections that have become unsynchronized. But in this case it’s equivalent to forcing the phones on each end of a conversation to hang up. Instead of the site you want, you usually see an onscreen message beginning “The connection has been reset”; sometimes instead you get “Site not found.” Annoyingly, blogs hosted by the popular system Blogspot are on this IP blacklist. For a typical Google-type search, many of the links shown on the results page are from Wikipedia or one of these main blog sites. You will see these links when you search from inside China, but if you click on them, you won’t get what you want.
The third barrier comes with what Lih calls “URL keyword block.” The numerical Internet address you are trying to reach might not be on the blacklist. But if the words in its URL include forbidden terms, the connection will also be reset. (The Uniform Resource Locator is a site’s address in plain English—say, www.microsoft.com—rather than its all-numeric IP address.) The site FalunGong .com appears to have no active content, but even if it did, Internet users in China would not be able to see it. The forbidden list contains words in English, Chinese, and other languages, and is frequently revised—“like, with the name of the latest town with a coal mine disaster,” as Lih put it. Here the GFW’s programming technique is not a reset command but a “black-hole loop,” in which a request for a page is trapped in a sequence of delaying commands. These are the programming equivalent of the old saw about how to keep an idiot busy: you take a piece of paper and write “Please turn over” on each side. When the Firefox browser detects that it is in this kind of loop, it gives an error message saying: “The server is redirecting the request for this address in a way that will never complete.”
The final step involves the newest and most sophisticated part of the GFW: scanning the actual contents of each page—which stories The New York Times is featuring, what a China-related blog carries in its latest update—to judge its page-by-page acceptability. This again is done with mirrors. When you reach a favorite blog or news site and ask to see particular items, the requested pages come to you—and to the surveillance system at the same time. The GFW scanner checks the content of each item against its list of forbidden terms. If it finds something it doesn’t like, it breaks the connection to the offending site and won’t let you download anything further from it. The GFW then imposes a temporary blackout on further “IP1 to IP2” attempts—that is, efforts to establish communications between the user and the offending site. Usually the first time-out is for two minutes. If the user tries to reach the site during that time, a five-minute time-out might begin. On a third try, the time-out might be 30 minutes or an hour—and so on through an escalating sequence of punishments.
Last edited by Alex Linder; March 24th, 2008 at 06:29 PM.