Geek Sublime: The Beauty of Code, the Code of Beauty
Vikram Chandra
I was saved by a fellow graduate student who had noticed my burgeoning geekiness. By now I was walking around campus with eight-hundred-page computer manuals tucked under my arm and holding forth about the video-game virtues of Leisure Suit Larry in the Land of the Lounge Lizards in the grad-student lounge. My friend asked me to help set up her new computer, and I arrived at her house with a painstakingly curated collection of bootlegged programs and freeware utilities and an extra-large bottle of Diet Coke. She just wanted to be able to write short stories and print them out, but one of the preeminent signs of computer mania is a fanatical exactitude, a desire to have the system work just so. I tricked out her machine, emptied my bottle of soda, and then gave her my standard lecture about on-site and off-site backups and the importance of regular hard-disk checks and defragging. She looked a bit overwhelmed, but a couple of weeks later she called to ask if I would help a friend of hers, the owner of a local bookstore, with his new computers at the shop. “They’ll pay you,” she said.
Pay me? For letting me play with their new machines, no doubt still boxed and unsullied and ripe for my superior setup skills? This seemed incredible, but I gathered myself and said, “Sure, sounds good.” This was the beginning of a busy and profitable career as an independent computer consultant, which in short order led to paid programming gigs. As many consultants and programmers do, I learned on the job – if I didn’t know how to do something, Usenet and the technical sections of bookstores pointed me in the general direction of a solution. I was fairly scrupulous about not billing clients for the hours spent educating myself, more from a desire not to overprice my services than from moral rectitude. I did provide value – word of mouth gave me a growing list of clients, and I was able to raise my hourly rate steadily.
I set up computers for elegant ladies in River Oaks and gave them word-processing lessons; I went out to factories and offices in the hinterlands of Houston to observe assembly lines and then modeled workflows and production processes. The programming I did was journeyman work; I mostly wrote CRUD applications, menu-driven screens that let the users Create, Retrieve, Update, and Delete records that tracked whatever product or service they provided: precision-engineered drill parts for high-heat applications, workers for the oil industry, reservations at restaurants. Simple stuff, but useful, and I always felt like I was learning, and making good money, sometimes even great money. I could afford biannual trips to India. Programming in America paid for my research and writing. I managed to get through graduate school without taking any loans, finished my novel, found an agent.
After the novel was published, I accepted a university teaching job in creative writing, and finally gave up the professional freelance computer work. It had served me well. Now it was time to write.
I found, soon enough, that although I may have stopped chasing the fat consulting paychecks, the impulse to program had not left me. The work of making software gave me a little jolt of joy each time a piece of code worked; when something wasn’t working, when the problem resisted and made me rotate the contours of the conundrum in my mind, the world fell away, my body vanished, time receded. And three or five hours later, when the pieces of the problem came together just so and clicked into a solution, I surfed a swelling wave of endorphins. On the programming section of Reddit, a popular social news site, a beginner posted a picture of his first working program with the caption, “For most of you, this is surely child [sic] play, but holy shit, this must be what it feels like to do heroin for the first time.”² Even after you are long past your first “Hello, world!” there is an infinity of things to learn, you are still a child, and – if you aren’t burned out by software delivery deadlines and management-mandated all-nighters – coding is still play. You can slam this pleasure spike into your veins again and again, and you want more, and more, and more. It’s mostly a benign addiction, except for the increased risks of weight gain, carpal tunnel syndrome, bad posture, and reckless spending on programming tools you don’t really need but absolutely must have.
So I indulged myself and puttered around and made little utilities, and grading systems, and suchlike. I was writing fiction steadily, but I found that the stark determinisms of code were a welcome relief from the ambiguities of literary narrative. By the end of a morning of writing, I was eager for the pleasures of programming. Maybe because I no longer had to deliver finished applications and had time to reflect, I realized that I had no idea what my code actually did. That is, I worked within a certain language and formal system of rules, I knew how the syntax of this language could be arranged to effect changes in a series of metaphors – the “file system,” the “desktop,” “Windows” – but the best understanding I had of what existed under these conceptualizations was a steampunk-ish series of gearwheels taken from illustrations of Charles Babbage’s Difference Engine. So now I made an attempt to get closer to the metal, to follow the effects of my code down into the machine.
3 THE LANGUAGE OF LOGIC
The seven lines of the “Hello, world!” code at the beginning of this book – written in Microsoft’s C# language – do nothing until they are swallowed and munched by a specialized program called a compiler, which translates them into thirty-odd lines of “Common Intermediate Language” (CIL) that look like this:
This intermediate language, as its name indicates, is a mediating dialect between human and machine. You could write a “Hello, world!” program in another Microsoft language like Visual Basic and get almost exactly the same listing, which is how the program is stored on disk, ready to run. When you do run it, the CIL is converted yet again, this time into machine code:
Now we’re really close to computing something, but not quite yet. Machine code is actually a really low-level programming language which encodes one or more instructions as numbers. The numbers are displayed above in a hexadecimal format, which is easier for humans to read than the binary numbers (“01010101 10001011 …”) sent to the computer’s central processing unit (CPU). This CPU is able to accept these numbers, each of which represents an instruction native to that particular type of CPU; the CPU reacts to each number by tripping its logic gates, which is to say that a lot of physical changes cascade in a purely mechanical fashion through the chips and platters in that box on your desk, and “Hello, world!” appears on your screen.
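A small C# sketch shows why hexadecimal is friendlier to human eyes than raw binary; the three byte values here are samples chosen for illustration, not the actual instruction stream of any particular program:

    using System;

    class HexVersusBinary
    {
        static void Main()
        {
            // Sample bytes: each two-digit hexadecimal number
            // stands for exactly eight binary digits.
            byte[] machineCode = { 0x55, 0x8B, 0xEC };
            foreach (byte b in machineCode)
            {
                string hex = b.ToString("X2");
                string binary = Convert.ToString(b, 2).PadLeft(8, '0');
                Console.WriteLine($"hex {hex} = binary {binary}");
            }
        }
    }

That neat one-to-eight correspondence is the whole reason programmers reach for hexadecimal when they must stare at machine code.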
But, but – what are “logic gates”? Before I began my investigation of the mechanics of computing, this phrase evoked some fuzzy images of ones and zeros and intricate circuits, but I had no idea how all of this worked together to produce “Hello, world!” on my screen. This is true of the vast majority of people in the world. Each year, I ask a classroom of undergraduate students at Berkeley if they can describe how a logic gate works, and usually, out of a hundred-odd juniors and seniors, I get one or two who are able to answer in the affirmative; typically these are computer science or engineering majors. There are IT professionals who don’t know how computers really work; I certainly was one of them, and here is “Rob P.” on the “programmers” section of stackexchange.com, a popular question-and-answer site:
This is almost embarrassing [to] ask … I have a degree in Computer Science (and a second one in progress). I’ve worked as a full-time .NET Developer for nearly five years. I generally seem competent at what I do.
But I Don’t Know How Computers Work! [Emphasis in the original.]
I know there are components … the power supply, the motherboard, ram, CPU, etc … and I get the “general idea” of what they do. But I really don’t understand how you go from a line of code like Console.Readline() in .NET (or Java or C++) and have it actually do stuff.¹
How logic gates “do stuff” is dazzlingly simple. But before we get to their elegant workings, a little primer on terminology: you will remember that the plus sign in mathematical notation (as in “2 + 3”) can be referred to as the “addition operator.” The minus sign is similarly the “subtraction operator,” the forward slash is the “division operator,” and so on. Mostly, we non-mathematicians treat the operators as convenient, almost-invisible markers that tell us which particular kindergarten-vintage practice we should apply to the all-important digits on either side of them. But there is another way to think about operators: as functions that consume the digits and output a result. Perhaps you could visualize the addition operator as a little machine like this, which accepts inputs on the left and produces the output on the right:
So if you give this “Add” machine the inputs “3” and “2,” it will produce the result “5.”
A “Subtract” operator might be imagined like this:
So, giving this “Subtract” operator a first input of “4.2” and a second input of “2.2” will cause it to output “2.0.”
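This picture of operators as little machines translates directly into code. Here is a minimal C# sketch; the names Add and Subtract are mine, and the decimal type is used so that the subtraction comes out as an exact “2.0”:

    using System;

    class OperatorsAsMachines
    {
        // Each "machine" accepts two inputs and produces one output.
        static decimal Add(decimal first, decimal second) => first + second;
        static decimal Subtract(decimal first, decimal second) => first - second;

        static void Main()
        {
            Console.WriteLine(Add(3m, 2m));          // prints 5
            Console.WriteLine(Subtract(4.2m, 2.2m)); // prints 2.0
        }
    }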
The mathematical addition and subtraction operators above act on numbers, and only on numbers. In his 1847 monograph, The Mathematical Analysis of Logic, George Boole proposed a similar algebra for logic. In 1854, he corrected and extended these ideas about the application of symbolic algebra to logic with a seminal book, An Investigation of the Laws of Thought, on Which Are Founded the Mathematical Theories of Logic and Probabilities. In “Boolean algebra,” the only legal inputs and outputs are the logical values “true” and “false”—nothing else, just “true” and “false.” The operators which act on these logical inputs are logical functions such as AND (conjunction), OR (disjunction), and NOT (negation). So the logical AND operator might look like this:
The AND or conjunction operator, according to Boole, outputs “true” only when both inputs are “true.” That is, it works like this:
Input 1   Input 2   Output
false     false     false
false     true      false
true      false     false
true      true      true
If you gave the Boolean operator AND a first input of “false” and a second input of “true,” it would output “false.”
The output of “(Teddy can fly) AND (Teddy is a dog)” would therefore be “false.” But the output of “(Teddy is a dog) AND (Teddy has a keen sense of smell)” would be “true.”
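C# has Boole’s conjunction built in as the && operator, so the Teddy examples can be checked in a few lines; this is just a sketch, with invented variable names:

    using System;

    class BooleanAnd
    {
        // Conjunction: outputs "true" only when both inputs are "true."
        static bool And(bool input1, bool input2) => input1 && input2;

        static void Main()
        {
            bool teddyCanFly = false;
            bool teddyIsADog = true;
            bool teddyHasKeenSmell = true;

            Console.WriteLine(And(teddyCanFly, teddyIsADog));       // False
            Console.WriteLine(And(teddyIsADog, teddyHasKeenSmell)); // True
        }
    }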
Other operators work similarly. The “truth table” for the Boolean OR operator would look like this:
Input 1   Input 2   Output
false     false     false
false     true      true
true      false     true
true      true      true
So, the output of “(Teddy can fly) OR (Teddy is a dog)” would be “true.” That is, a first input of “false” and a second input of “true” would produce the output “true.”
If one were to adopt the convention that “false” was represented by the digit “0” and “true” by “1,” the functioning of the OR operator could be represented as follows:
Input 1   Input 2   Output
0         0         0
0         1         1
1         0         1
1         1         1
And so, in these digits, our OR operator example reads: a first input of “0” and a second input of “1” produce the output “1.”
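A short C# sketch can generate that whole table under the 0-and-1 convention; the Or function is my own illustration, written over integers to mirror the digits:

    using System;

    class OrWithDigits
    {
        // Disjunction under the convention: 0 means "false," 1 means "true."
        static int Or(int input1, int input2) => (input1 == 1 || input2 == 1) ? 1 : 0;

        static void Main()
        {
            Console.WriteLine("Input 1  Input 2  Output");
            foreach (int a in new[] { 0, 1 })
                foreach (int b in new[] { 0, 1 })
                    Console.WriteLine($"{a}        {b}        {Or(a, b)}");
        }
    }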
The XOR operator – sometimes referred to as the “exclusive-OR” operator – is a variation of the OR operator. It outputs “true” if either, but not both, of the inputs is “true.”
Input 1   Input 2   Output
0         0         0
0         1         1
1         0         1
1         1         0
You can think of XOR as an “either-or” operator – it returns “true” if one of the inputs is true, but returns “false” if both of the inputs are “true” or if both of the inputs are “false.” For example, a future robot-run restaurant might use an XOR operation to test whether your sandwich order was valid: “(With soup) XOR (With salad)”—you could have soup or salad, but not both or nothing. The last line of the truth table for the XOR operator captures this rule: “1 XOR 1” yields “0.”
Boolean algebra allows the translation of logical statements into mathematical expressions. By substituting ones and zeros into our soup-XOR-salad statement above, we can get back another number we can use in further operations.
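C# happens to provide exclusive-or directly, as the ^ operator, so the robot waiter’s test is a one-liner; the order scenario is the one above, and the variable names are invented:

    using System;

    class SandwichOrder
    {
        static void Main()
        {
            bool withSoup = true;
            bool withSalad = true;

            // XOR: the order is valid only if exactly one side was chosen.
            bool orderIsValid = withSoup ^ withSalad;
            Console.WriteLine(orderIsValid); // False: you can't have both

            // The same test as ones and zeros, usable in further operations:
            int soup = 1, salad = 1;
            int validity = soup ^ salad; // ^ is also bitwise XOR on integers
            Console.WriteLine(validity); // 0
        }
    }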
Now here’s the magical part: you can build simple physical objects – logic gates – that mechanically reproduce the workings of Boolean operators. Here is an AND logic gate built out of LEGO bricks, cogs, and wheels by Martin Howard, a physicist from the UK:
This is a push-pull logic gate. The two levers on the left are for input: a pushed-in lever represents a value of “true” or “1,” while a lever in the “out” position represents a “false” or “0.” The mechanism of the logic gate – its gears and rods – has been set up so that when you push in the input levers on the left, the output lever on the right automatically takes a position (in or out) that follows the workings of the Boolean logical operator AND. In figure 3.14, both input levers have been pushed in (set to “true” and “true”), and so the output lever has slid into a position representing “true” or “1.” Or, in Boolean terms, “true AND true = true.” Any possible positioning of the input levers will produce the correct positioning of the output lever. The logic gate always follows the truth table for the AND operator.
And here is a push-pull XOR logic gate that mimics the workings of the Boolean XOR operator:
If you’re still having trouble visualizing how these LEGO logic gates work, you can watch videos at http://www.randomwraith.com/logic.html. A physical logic gate is any device that – through artful construction – can correctly replicate the workings of one of the Boolean logical operators.
You may still be wondering how all of this leads us toward computation, toward calculation. Well, as it happens, you can also represent numbers in a binary fashion, with ones and zeros, or with absence and presence, off states and on states. In binary notation, the decimal number “3” is “11.” How does this work? You’ll recall from elementary school that in the decimal system, the position of a digit – from right to left – changes what that digit means. If you write the decimal number “393,” you are putting the digits into columns like this:
Hundreds (10²)   Tens (10¹)   Ones (10⁰)
3                9            3
So what you’re representing when you write “393” is something like “three hundreds, plus nine tens, plus three ones” or “(3 × 10²) + (9 × 10¹) + (3 × 10⁰).” A more precise way to think about the columns in the decimal system is to say each column, from right to left, represents an increase by a factor of ten. The small superscript number – the exponent – tells you how many times to use the number in a multiplication by itself. So, the “Hundreds” column represents quantities of “10²” or “10 × 10.” Any number to the power of 1 gives you the number itself, so “10¹” is “10”; and any number to the power of zero is just “1,” so “10⁰” is “1.”
In base-2 or binary notation, our column headings would look like this:

256 (2⁸)   128 (2⁷)   64 (2⁶)   32 (2⁵)   16 (2⁴)   8 (2³)   4 (2²)   2 (2¹)   1 (2⁰)
And you would write “393” in binary like this:

1   1   0   0   0   1   0   0   1
When you write the binary number “110001001,” you are putting a “1” in every column that you want to include in your reckoning of the quantity you are trying to represent. In a base-2 system, you have only two symbols you can use, “0” and “1” (as opposed to the ten symbols you use in a base-10 system, “0” through “9”). So, with “110001001,” you are representing something like “256, plus 128, plus 8, plus 1” or “(1 × 2⁸) + (1 × 2⁷) + (1 × 2³) + (1 × 2⁰)”—which equals decimal “393.”
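You can let C# do the bookkeeping here: Convert.ToString with a base of 2 produces a number’s binary digits, and Convert.ToInt32 reads them back:

    using System;

    class BinaryNotation
    {
        static void Main()
        {
            // Decimal 393 rendered in base 2:
            Console.WriteLine(Convert.ToString(393, 2)); // 110001001

            // And read back, column by column:
            Console.WriteLine(Convert.ToInt32("110001001", 2)); // 393

            // The long way, summing the powers of two named in the text:
            Console.WriteLine((1 << 8) + (1 << 7) + (1 << 3) + (1 << 0)); // 393
        }
    }

(The expression 1 << 8 shifts a lone 1-digit eight binary columns to the left, which is just another way of writing 2⁸.)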
Decimal “9” is the same as binary “1001,” and decimal “5” is binary “101”—all very baffling to the decimal-using brain, but completely consistent and workable. So if you wanted to add “9” to “5” in binary, it would look like this:

  1001   (decimal 9)
+ 0101   (decimal 5)
------
  1110   (decimal 14)
And binary “1110” is of course equivalent to “8 + 4 + 2 + 0” or decimal “14.” From the above, you can deduce the rules of binary addition:
0 + 0 = 0
0 + 1 = 1
1 + 0 = 1
1 + 1 = 0, and carry 1
The last rule may seem a bit mystifying until you recall how addition works in decimal arithmetic. In decimal, you use the digits “0” through “9” to represent numbers; when the sum of the digits in a column exceeds “9” you write the least significant figure (“4” in the number “14”) in that column and carry the more significant figure (“1” in the number “14”) to the next column on the left. In binary notation, you can only use the digits “1” and “0”—so when you add “1” to “1,” you write “0” in that column and carry “1.”
Now this may begin to remind you of Boolean logic – you’re taking inputs of zeros and ones and sending out zeros and ones. In fact, except for the “carry 1” part of the last rule, this looks very much like the truth table for the XOR logical operator:
Input 1   Input 2   Output
0         0         0
0         1         1
1         0         1
1         1         0
It turns out that if you put together certain logical operators in clever ways, you can completely replicate addition in binary, including the “carry 1” part. Here is a schematic for a “half adder” – built by combining an XOR operator and an AND operator – which takes in two single binary digits and outputs a sum and an optional digit to carry.
So the half adder would function as follows:

Input A   Input B   Sum   Carry
0         0         0     0
0         1         1     0
1         0         1     0
1         1         0     1
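The schematic comes down to two lines of C#: XOR for the sum digit, AND for the carry. A minimal sketch, with the gates written as C#’s ^ and & operators:

    using System;

    class HalfAdder
    {
        // A half adder: sum = XOR of the inputs, carry = AND of the inputs.
        static (int Sum, int Carry) HalfAdd(int a, int b) => (a ^ b, a & b);

        static void Main()
        {
            Console.WriteLine("A  B  Sum  Carry");
            foreach (int a in new[] { 0, 1 })
                foreach (int b in new[] { 0, 1 })
                {
                    var (sum, carry) = HalfAdd(a, b);
                    Console.WriteLine($"{a}  {b}  {sum}    {carry}");
                }
        }
    }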
And since we can build logic gates – physical objects that replicate logical operations – we should be able to build a physical half adder. And, indeed, here is a LEGO half adder built by Martin Howard.
At last we have computation, which is – according to the Oxford English Dictionary (OED) – the “action or process of computing, reckoning, or counting; arithmetical or mathematical calculation.” A “computer” was originally, as the OED also tells us, a “person who makes calculations or computations; a calculator, a reckoner; spec. a person employed to make calculations in an observatory, in surveying, etc.” Charles Babbage set out to create a machine that would replace the vast throngs of human computers who worked out logarithmic and trigonometric tables; what we’ve sketched out above are the beginnings of a mechanism which can do exactly that and more. You can use the output of the half adder as input for other mechanisms, and also continue to add logic gates to it to perform more complex operations. You can hook up two half adders together and add an OR logic gate to make a “full adder,” which will accept two binary digits and also a carry digit as input, and will output a sum and a carry digit. You can then put together cascades of full adders to add binary numbers eight columns wide, or sixteen, or thirty-two. This adding machine “knows” nothing; it is just a clever arrangement of physical objects that can go from one state to another, and by doing so cause changes in other physical objects. The revolutionary difference between it and the first device that Charles Babbage built, the Difference Engine, is that it represents data and logic in zeros and ones, in discrete digits – it is “digital,” as opposed to the earlier “analog” devices, all the way back through slide rules and astrolabes and the Antikythera mechanism. Babbage’s planned second device, the Analytical Engine, would have been a digital, programmable computer, but the technology and engineering of his time were not able to implement what he had imagined.
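Here is that construction sketched in C#, under the same wiring the paragraph describes: two half adders plus an OR gate form a full adder, and a chain of eight full adders adds numbers eight binary columns wide. The function names are mine:

    using System;

    class RippleCarryAdder
    {
        // Full adder: two half adders, with an OR gate joining their carries.
        static (int Sum, int Carry) FullAdd(int a, int b, int carryIn)
        {
            var (s1, c1) = (a ^ b, a & b);               // first half adder
            var (s2, c2) = (s1 ^ carryIn, s1 & carryIn); // second half adder
            return (s2, c1 | c2);                        // OR gate
        }

        // Add two numbers, eight binary columns wide, right to left.
        static int AddEightBits(int x, int y)
        {
            int result = 0, carry = 0;
            for (int column = 0; column < 8; column++)
            {
                int a = (x >> column) & 1; // x's digit in this column
                int b = (y >> column) & 1; // y's digit in this column
                var (sum, carryOut) = FullAdd(a, b, carry);
                result |= sum << column;
                carry = carryOut;
            }
            return result;
        }

        static void Main()
        {
            Console.WriteLine(AddEightBits(9, 5));    // 14
            Console.WriteLine(AddEightBits(100, 55)); // 155
        }
    }

Nothing in this chain “knows” arithmetic; each full adder only shuttles ones and zeros to its neighbor, exactly as the levers and gears do.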
Once you have objects that can materialize both Boolean algebra and binary numbers, you can connect these components in ways that allow the computation of mathematical functions. Line up sufficiently large numbers of simple on/off mechanisms, and you have a machine that can add, subtract, multiply, and through these mathematical operations format your epic novel in less time than you will take to finish reading this sentence. Computers can only compute, calculate; the poems you write, the pictures of your family, the music you listen to – all these are converted into binary numbers, sequences of ones and zeros, and are thus stored and changed and re-created. Your computer allows you to read, see, and hear by representing binary numbers as letters, images, and sounds. Computers may seem mysteriously active, weirdly alive, but they are mechanical devices like harvesting combines or sewing machines.
You can build logic gates out of any material that can accept inputs and switch between distinct states of output (current or no current, 1 or 0); there is nothing special about the chips inside your laptop that makes them essential to computing. Electrical circuits laid out in silicon just happen to be small, cheap, relatively reliable, and easy to produce in mass quantities. The Digi-Comp II, which was sold as a toy in the sixties, used an inclined wooden plane, plastic cams, and marbles to perform binary mathematical operations. The vast worlds inside online games provide virtual objects that can be made to interact predictably, and some people have used these objects to make computing machines inside the games – Jong89, the creator of the “Dwarven Computer” in the game Dwarf Fortress, used “672 [virtual water] pumps, 2000 [faux wooden] logs, 8500 mechanisms and thousands of other assort[ed] bits and knobs like doors and rock blocks” to put together his device, which is a fully functional computer that can perform any calculation that a “real” computer can.² Logic gates have been built out of pneumatic, hydraulic, and optical devices, out of DNA, and out of flat sticks connected by rivets. Recently, some researchers from Kobe University in Japan announced, “We demonstrate that swarms of soldier crabs can implement logical gates when placed in a geometrically constrained environment.”³
Many years after I stopped working professionally as a programmer, I finally understood this, truly grokked this fact – that you can build a logic gate out of water and pipes and valves, no electricity needed, and from the interaction of these physical objects produce computation. The shock of the revelation turned me into a geek party bore. I arranged toothpicks on dinner tables to lay out logic-gate schematics, and harassed my friends with disquisitions about the life and work of George Boole. And as I tried to explain the mechanisms of digital computation, I realized that it is a process that is fundamentally foreign to our common-sense, everyday understanding. In his masterly book on the subject, Code: The Hidden Language of Computer Hardware and Software, Charles Petzold uses telegraphic relay circuits – built out of batteries and wires – to walk the lay reader through the functioning of computing machines. And he points out:
Samuel Morse had demonstrated his telegraph in 1844—ten years before the publication of Boole’s The Laws of Thought …
But nobody in the nineteenth century made the connection between the ANDs and ORs of Boolean algebra and the wiring of simple switches in series and in parallel. No mathematician, no electrician, no telegraph operator, nobody. Not even that icon of the computer revolution Charles Babbage (1792–1871), who had corresponded with Boole and knew his work, and who struggled for much of his life designing first a Difference Engine and then an Analytical Engine that a century later would be regarded as the precursors to modern computers …
Nobody in that century ever realized that Boolean expressions could be directly realized in electrical circuits. This equivalence wasn’t discovered until the 1930s, most notably by Claude Elwood Shannon … whose famous 1938 M.I.T. master’s thesis was entitled “A Symbolic Analysis of Relay and Switching Circuits.”⁴
After Shannon, early pioneers of modern computing had no choice but to comprehend that you could build Boolean logic and binary numbers into electrical circuits and work directly with this equivalence to produce computation. That is, early computers required that you wire the logic of your program into the machine. If you needed to solve a different problem, you had to build a whole new computer. General programmable computers, capable of receiving instructions to process varying kinds of logic, were first conceived of by Charles Babbage in 1837, and Lady Ada Byron wrote the first-ever computer program – which computed Bernoulli numbers – for this imaginary machine, but the technology of the era was incapable of building a working model.⁵ The first electronic programmable computers appeared in the nineteen forties. They required instructions in binary – to talk to a computer, you had to actually understand Boolean logic and binary numbers and the innards of the machine you were driving into action. Since then, decades of effort have constructed layer upon layer of translation between human and machine. The paradox is, quite simply, that modern high-level programming languages hide the internal structures of computers from programmers. This is how Rob P. can acquire an advanced degree in computer science and still be capable of that plaintive, boldfaced cry, “But I Don’t Know How Computers Work!”⁶
Computers have not really changed radically in terms of their underlying architecture over the last half-century; what we think of as advancement or progress is really a slowly growing ease of human use, an amenability to human cognition and manipulation that is completely dependent on vast increases in processing power and storage capabilities. As you can tell from our journey down the stack of languages mentioned earlier, the purpose of each layer is to shield the user from the perplexing complexities of the layer just below, and to allow instructions to be phrased in a syntax that is just a bit closer to everyday, spoken language. All this translation from one dialect to a lower one exacts a fearsome cost in processing cycles, which users are never aware of because the chips which do all the work gain astonishing amounts of computing ability every year; in the famous formulation by Intel co-founder Gordon E. Moore, the number of transistors that can be fitted on to an integrated circuit should double approximately every two years. Moore’s Law has held true since 1965. What this means in practical terms is that computers get exponentially more powerful and smaller every decade.
According to computer scientist Jack Ganssle, your iPad 2 has “about the compute capability of the Cray 2, 1985’s leading supercomputer. The Cray cost $35 million more than the iPad. Apple’s product runs 10 hours on a charge; the Cray needed 150 KW and liquid Fluorinert cooling.”⁷ He goes on to describe ENIAC – the Electronic Numerical Integrator and Computer – which was the world’s first general-purpose, fully electronic computer capable of being programmed for diverse tasks. It was put into operation in 1945.⁸ “If we built [an iPhone] using the ENIAC’s active element technology,” Ganssle writes:
the phone would be about the size of 170 Vertical Assembly Buildings (the largest single-story building in the world) … Weight? 2,500 Nimitz-class aircraft carriers. And what a power hog! Figure over a terawatt, requiring all of the output of 500 of Olkiluoto power plants (the largest nuclear plant in the world). An ENIAC-technology iPhone would run a cool $50 trillion, roughly the GDP of the entire world.⁹
So that smartphone you carry in your pocket is actually a fully programmable supercomputer; you could break the Enigma code with it, or design nuclear bombs. You can use it to tap out shopping lists because millions of logic gates are churning away to draw that pretty keyboard and all those shadowed checkboxes. And I can write working programs because modern high-level languages like C# protect me from the overwhelming intricacy of the machine as it actually is. When I write code in C#, I work within a regime that has been designed to be “best for human understanding,” far removed from the alien digital idiom of the machine. Until the early fifties, programmers worked in machine code or one of its close variants. As we’ve just seen, instructions passed to the computer’s CPU have to be encoded as binary numbers (“01010101 10001011 …”), which are extremely hard for humans to read and write, or even distinguish from one another. Representing these numbers in a hexadecimal format (“55 8B …”) makes the code more legible, but only slightly so. So assembly language was created; in assembly, each low-level machine-code instruction is represented by a mnemonic. So our earlier hexadecimal representation of “Hello, world!” becomes:
One line of code in assembly language usually translates into one machine-code instruction. Writing code in assembly is more efficient than writing machine code, but is still difficult and error-prone.
In 1954, John Backus and a legendary team of IBM programmers began work on a pioneering high-level programming language, FORTRAN (from FORmula TRANslation), intended for use in scientific and numerical applications. FORTRAN offered not only a more English-like vocabulary and syntax, but also economy – a single line of FORTRAN would be translated into many machine-code instructions. “Hello, world!” in FORTRAN is:
All modern high-level languages provide the same ease of use. I work inside an orderly, simplified hallucination, a maya that is illusion and not-illusion – the code I write sets off other subterranean incantations which are completely illegible to me, but I can cause objects to move in the real world, and send messages to the other side of the planet.