Hamming: Learning to Learn

Hamming, Richard W. The Art of Doing Science and Engineering: Learning to Learn. Stripe Press, 2020.

1: Orientation
2: Foundations of the digital (discrete) revolution
3: History of computers -- hardware
4: History of computers -- software
5: History of computer applications
6: Limits of computer applications -- AI-I
7: Limits of computer applications -- AI-II

1: Orientation

Here I make a digression to illustrate what is often called "back-of-the-envelope calculations." I have frequently observed great scientists and engineers do this much more often than the "run-of-the-mill" people, hence it requires illustration.

I read somewhere there are 76 different methods of predicting the future -- but the very number suggests there is no reliable method which is widely accepted. The most trivial method is to predict tomorrow will be exactly the same as today -- which at times is a good bet. The next level of sophistication is to use the current rates of change and to suppose that they will stay the same -- linear prediction in the variable used. Which variable you use can, of course, strongly affect the prediction made! Both methods are not much good for long-term predictions, however.

History is often used as a long-term guide; some people believe history repeats itself and others believe exactly the opposite! It is obvious:

The past was once the future and the future will become the past.

In any case, I will often use history as a background for the extrapolations I make. I believe the best predictions are based on understanding the fundamental forces involved, and this is what I depend on mainly. Often it is not physical limitations which control but rather it is human-made laws, habits, and organizational rules, regulations, personal egos, and inertia which dominate the evolution to the future.

There is a saying, "Short-term predictions are always optimistic and long-term predictions are always pessimistic." The reason, so it is claimed, the second part is true is that for most people the geometric growth due to the compounding of knowledge is hard to grasp. For example, for money, a mere 6% annual growth doubles the money in about 12 years! In 48 years the growth is a factor of 16. An example of the truth of this claim that most long-term predictions are low is the growth of the computer field in speed, in density of components, in drop in price, etc., as well as the spread of computers into the many corners of life. But the field of artificial intelligence (AI) provides a very good counterexample. Almost all the leaders in the field made long-term predictions which have almost never come true, and are not likely to do so within your lifetime, though many will in the fullness of time.
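A quick check of the back-of-the-envelope figures above (my own sketch, not from the book): it assumes growth is compounded once a year at a fixed rate. The function names are made up for illustration.

```python
import math

def doubling_time(rate: float) -> float:
    """Years needed to double at a fixed annual rate, compounded yearly."""
    return math.log(2) / math.log(1 + rate)

def growth_factor(rate: float, years: float) -> float:
    """Total multiplication after the given number of years."""
    return (1 + rate) ** years

years_to_double = doubling_time(0.06)   # a little under 12 years
factor_48 = growth_factor(0.06, 48)     # four doublings, a bit over 16

print(f"doubling time at 6%: {years_to_double:.1f} years")
print(f"growth over 48 years: {factor_48:.1f}x")
```

Four doublings in 48 years is where the factor of 16 comes from; the exact compounded figure lands slightly higher because the doubling time is slightly under 12 years.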

It is probable the future will be more limited by the slow evolution of the human animal and the corresponding human laws, social institutions, and organizations than it will be by the rapid evolution of technology.

In spite of the difficulty of predicting the future and that

unforeseen technological inventions can completely upset the most careful predictions,

you must try to foresee the future you will face. To illustrate the importance of this point of trying to foresee the future I often use a standard story.

It is well known the drunken sailor who staggers to the left or right with n independent random steps will, on the average, end up about √n steps from the origin. But if there is a pretty girl in one direction, then his steps will tend to go in that direction and he will go a distance proportional to n. In a lifetime of many, many independent choices, small and large, a career with a vision will get you a distance proportional to n, while no vision will get you only the distance √n. In a sense, the main difference between those who go far and those who do not is some people have a vision and the others do not and therefore can only react to the current events as they happen.
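A small simulation of the drunken sailor (my own sketch, not from the book). Assumptions: a one-dimensional walk with unit steps, and "vision" modeled as a 60% chance of stepping in the preferred direction; the 60% figure, the function name, and the trial counts are arbitrary choices for illustration. Note the √n figure is the root-mean-square distance; the mean absolute distance for an unbiased walk is closer to 0.8√n.

```python
import random

def mean_distance(n_steps: int, p_right: float, trials: int, seed: int = 0) -> float:
    """Average absolute distance from the origin after n_steps unit steps."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0
    for _ in range(trials):
        position = 0
        for _ in range(n_steps):
            position += 1 if rng.random() < p_right else -1
        total += abs(position)
    return total / trials

n = 400
unbiased = mean_distance(n, 0.5, trials=2000)  # no vision: on the order of sqrt(n) = 20
biased = mean_distance(n, 0.6, trials=2000)    # slight bias: close to 0.2 * n = 80

print(f"no vision:   {unbiased:.1f} steps from the origin")
print(f"with vision: {biased:.1f} steps from the origin")
```

Even a modest bias wins by a wide margin, and the gap widens as n grows, since one distance scales with √n and the other with n.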

You will probably object that if you try to get a vision now it is likely to be wrong -- and my reply is that from observation I have seen the accuracy of the vision matters less than you might suppose, getting anywhere is better than drifting, there are potentially many paths to greatness for you, and just which path you go on, so long as it takes you to greatness, is none of my business. You must, as in the case of forging your personal style, find your vision of your future career, and then follow it as best you can.

No vision, not much of a future.

To what extent history does or does not repeat itself is a moot question. But it is one of the few guides you have... The other main tool I have used is an active imagination in trying to see what will happen... In forming your plan for your future you need to distinguish three different questions:

What is possible?

What is likely to happen?

What is desirable to have happen?

In a sense the first is science -- what is possible. The second is engineering -- what are the human factors which choose the one future that does happen from the ensemble of all possible futures. The third is ethics, morals, or whatever other word you wish to apply to value judgments. It is important to examine all three questions, and insofar as the second differs from the third, you will probably have an idea of how to alter things to make the more desirable future occur, rather than let the inevitable happen and suffer the consequences. Again, you can see why having a vision is what tends to separate the leaders from the followers.

2: Foundations of the digital (discrete) revolution

We should note here transmission through space (typically signaling) is the same as transmission through time (storage).

From Material Goods to Information Services

Society is steadily moving from a material goods society to an information service society. At the time of the American Revolution, say 1780 or so, over 90% of the people were essentially farmers -- now farmers are a very small percentage of workers. Similarly, before WWII most workers were in factories -- now less than half are there. In 1993, there were more people in government (excluding the military) than there were in manufacturing! What will the situation be in 2020? As a guess I would say less than 25% of the people in the civilian workforce will be handling things; the rest will be handling information in some form or other. In making a movie or a TV program you are making not so much a thing, though of course it does have a material form, as you are organizing information. Information is, of course, stored in a material form, say a book (the essence of a book is information), but information is not a material good to be consumed like food, a house, clothes, an automobile, or an airplane ride for transportation.

contra Smil

Equivalent Design

When we first passed from hand accounting to machine accounting we found it necessary, for economical reasons if no other, to somewhat alter the accounting system. Similarly, when we passed from strict hand fabrication to machine fabrication we passed from mainly screws and bolts to rivets and welding.

It has rarely proved practical to produce exactly the same product by machines as we produced by hand.

Indeed, one of the major items in the conversion from hand to machine production is the imaginative redesign of an equivalent product. Thus in thinking of mechanizing a large organization, it won't work if you try to keep things in detail exactly the same, rather there must be a larger give and take if there is to be a significant success. You must get the essentials of the job in mind and then design the mechanization to do that job rather than trying to mechanize the current version -- if you want a significant success in the long run.

Micromanagement

The effects on society are large. The most obvious illustration is that computers have given top management the power to micromanage their organization, and top management has shown little or no ability to resist using this power.

Among other evils of micromanagement is lower management does not get the chance to make responsible decisions and learn from their mistakes, but rather, because the older people finally retire, then lower management finds itself as top management -- without having had many real experiences in management!

Furthermore, central planning has been repeatedly shown to give poor results (consider the Russian experiment, for example, or our own bureaucracy). The persons on the spot usually have better knowledge than can those at the top and hence can often (not always) make better decisions if things are not micromanaged. The people at the bottom do not have the larger, global view, but at the top they do not have the local view of all the details, many of which can often be very important, so either extreme gets poor results.

Next, an idea which arises in the field, based on the direct experience of the people doing the job, cannot get going in a centrally controlled system since the managers did not think of it themselves. The not invented here (NIH) syndrome is one of the major curses of our society, and computers, with their ability to encourage micromanagement, are a significant factor.

There is slowly, but apparently definitely, coming a counter trend to micromanagement. Loose connections between small, somewhat independent organizations are gradually arising... I believe you can expect to see much more of this loose association between small organizations as a defense against micromanagement from the top, which occurs so often in big organizations. There has always been some independence of subdivisions in organizations, but the power to micromanage from the top has apparently destroyed the conventional lines and autonomy of decision making -- and I doubt the ability of most top managements to resist for long the power to micromanage. I also doubt many large companies will be able to give up micromanagement; most will probably be replaced in the long run by smaller organizations without the cost (overhead) and errors of top management. Thus computers are affecting the very structure of how society does its business, and for the moment apparently for the worse in this area.

Human & Machine Roles

I believe computers will be almost everywhere, since I once saw a sign which read, "The battlefield is no place for the human being." Similarly for situations requiring constant decision making. The many advantages of machines over humans were listed near the end of the last chapter, and it is hard to get around these advantages, though they are certainly not everything. Clearly the role of humans will be quite different from what it has traditionally been, but many of you will insist on old theories you were taught long ago as if they would be automatically true in the long future. It will be the same in business; much of what is now taught is based on the past, and has ignored the computer revolution and our responses to some of the evils the revolution has brought; the gains are generally clear to management, the evils are less so.

3: History of computers -- hardware

The first commercial production of electronic computers was under Mauchly and Eckert again, and since the company they formed was merged with another, their machines were finally called UNIVACs. Especially noted was the one for the Census Bureau. IBM came in a bit late with 18 (20 if you count secret cryptographic users) IBM 701s. I well recall a group of us, after a session on the IBM 701 at a meeting where they talked about the proposed 18 machines, all believed this would saturate the market for many years! Our error was simply we thought only of the kinds of things we were currently doing, and did not think in the directions of entirely new applications of machines. The best experts at the time were flatly wrong! And not by a small amount either! Nor for the last time!

Machine as Thing

We see the machine does not know where it has been, nor where it is going to go; it has at best only a myopic view of simply repeating the same cycle endlessly. Below this level the individual gates and two-way storage devices do not know any meaning -- they simply react to what they are supposed to do. They too have no global knowledge of what is going on, nor any meaning to attach to any bit, whether storage or gating.

I am reviewing this so you will be clear the machine processes bits of information according to other bits, and as far as the machine is concerned there is no meaning to anything which happens -- it is we who attach meaning to the bits. The machine is a "machine" in the classical sense; it does what it does and nothing else (unless it malfunctions). There are, of course, real-time interrupts, and other ways new bits get into the machine, but to the machine they are only bits.

4: History of computers -- software

Mechanical Programs

The punch card machines were controlled by plug board wiring to tell the machine where to find the information, what to do with it, and where to put the answers on the cards (or on the printed sheet of a tabulator), but some of the control might also come from the cards themselves, typically X and Y punches (other digits could, at times, control what happened). A plug board was specially wired for each job to be done, and in an accounting office the wired boards were usually saved and used again each week, or month, as they were needed in the cycle of accounting.

Layers of Abstraction

Finally, a more complete, and more useful, Symbolic Assembly Program (SAP) was devised -- after more years than you are apt to believe, during which time most programmers continued their heroic absolute binary programming. At the time SAP first appeared I would guess about 1% of the older programmers were interested in it -- using SAP was "sissy stuff," and a real programmer would not stoop to wasting machine capacity to do the assembly. Yes! Programmers wanted no part of it, though when pressed they had to admit their old methods used more machine time in locating and fixing up errors than the SAP program ever used. One of the main complaints was when using a symbolic system you didn't know where anything was in storage -- though in the early days we supplied a mapping of symbolic to actual storage, and believe it or not they later lovingly pored over such sheets rather than realize they did not need to know that information if they stuck to operating within the system -- no! When correcting errors they preferred to do it in absolute binary addresses.

FORTRAN
FORmula TRANslation

Again, monitors, often called "the system" these days, like all the earlier steps I have mentioned, should be obvious to anyone who is involved in using the machines from day to day; but most users seem too busy to think or observe how bad things are and how much the computer could do to make things significantly easier and cheaper. To see the obvious it often takes an outsider, or else someone like me who is thoughtful and wonders what he is doing and why it is all necessary. Even when told, the old timers will persist in the ways they learned, probably out of pride for their past and an unwillingness to admit there are better ways than those they were using for so long.

One way of describing what happened in the history of software is that we were slowly going from absolute to virtual machines. First, we got rid of the actual code instructions, then the actual addresses, then in FORTRAN the necessity of learning a lot of the insides of these complicated machines and how they worked. We were buffering the user from the machine itself. Fairly early at Bell Telephone Laboratories we built some devices to make the tape units virtual, machine independent.

Logic vs Humanity (Psychology)

Algol, around 1958-1960, was backed by many worldwide computer organizations, including the ACM. It was an attempt by the theoreticians to greatly improve FORTRAN. But being logicians, they produced a logical, not a humane, psychological language, and of course, as you know, it failed in the long run. It was, among other things, stated in a Boolean logical form which is not comprehensible to mere mortals (and often not even to the logicians themselves!). Many other logically designed languages have come and gone, while FORTRAN (somewhat modified to be sure) remains a widely used language, indicating clearly the power of psychologically designed languages over logically designed languages.

This was the beginning of a great hope for special languages, POLs they were called, meaning problem-oriented languages. There is some merit in this idea, but the great enthusiasm faded because too many problems involved more than one special field, and the languages were usually incompatible. Furthermore, in the long run, they were too costly in the learning phase for humans to master all of the various ones they might need.

In about 1962 the LISP language began. Various rumors floated around as to how it actually came about. The probable truth is something like this: John McCarthy suggested the elements of the language for theoretical purposes, the suggestion was taken up and significantly elaborated upon by others, and when some student observed he could write a compiler for it in LISP, using the simple trick of self-compiling, all were astounded, including, apparently, McCarthy himself. But he urged the student to try, and magically, almost overnight, they moved from theory to a real operating LISP compiler!

Understanding

I had reviewed for a journal the EDSAC book on programming, and there in Appendix D was a peculiar program written to get a large program into a small storage. It was an interpreter. But if it was in Appendix D, did they see the importance? I doubt it! Furthermore, in the second edition it was still in Appendix D, apparently unrecognized by them for what it was.

This raises, as I wished to, the ugly point of when something is understood. Yes, they wrote one, and used it, but did they understand the generality of interpreters and compilers? I believe not. Similarly, when around that time a number of us realized computers were actually symbol manipulators and not just number crunchers, we went around giving talks, and I saw people nod their heads sagely when I said it, but I also realized most of them did not understand. Of course you can say Turing's original paper (1937) clearly showed computers were symbol manipulating machines, but on carefully rereading the von Neumann reports you would not guess the authors did, though there is one combinatorial program and a sorting routine.

History tends to be charitable in this matter. It gives credit for understanding what something means when we first do it. But there is a wise saying, "Almost everyone who opens up a new field does not really understand it the way the followers do."

Specialization

What is wanted in the long run, of course, is that the man with the problem does the actual writing of the code with no human interface, as we all too often have these days, between the person who knows the problem and the person who knows the programming language. This date is unfortunately too far off to do much good immediately, but I would think by the year 2020 it would be fairly universal practice for the expert in the field of application to do the actual program preparation rather than have experts in computers (and ignorant of the field of application) do the program preparation.

Language Fundamentals

The fundamentals of language are not understood to this day. ... But until we genuinely understand such things -- assuming, as seems reasonable, the current natural languages through long evolution are reasonably suited to the job they do for humans -- we will not know how to design artificial languages for human-machine communication. Hence I expect a lot of trouble until we do understand human communication via natural languages. Of course, the problem of human-machine communication is significantly different from human-human communication, but in which ways and how much seems to be not known nor even sought for.

Until we better understand languages of communication involving humans as they are (or can be easily trained), it is unlikely many of our software problems will vanish.

Some time ago there was the prominent "fifth generation" of computers the Japanese planned to use, along with AI, to get a better interface between the machine and the human problem solvers. Great claims were made for both the machines and the languages. The result, so far, is the machines came out as advertised, and they are back to the drawing boards on the use of AI to aid in programming. It came out as I predicted at that time (for Los Alamos), since I did not see the Japanese were trying to understand the basics of language in the above engineering sense. There are many things we can do to reduce "the software problem," as it is called, but it will take some basic understanding of language as it is used to communicate understanding between humans, and between humans and machines, before we will have a really decent solution to this costly problem. It simply will not go away.

Software Engineering

You read constantly about "engineering the production of software," both for the efficiency of production and for the reliability of the product. But you do not expect novelists to "engineer the production of novels." The question arises: "Is programming closer to novel writing than it is to classical engineering?" I suggest yes! Given the problem of getting a man into outer space, both the Russians and the Americans did it pretty much the same way, all things considered, and allowing for some espionage. They were both limited by the same firm laws of physics. But give two novelists the problem of writing on "the greatness and misery of man," and you will probably get two very different novels (without saying just how to measure this). Give the same complex problem to two modern programmers and you will, I claim, get two rather different programs. Hence my belief that current programming practice is closer to novel writing than it is to engineering. The novelists are bound only by their imaginations, which is somewhat as the programmers are when they are writing software. Both activities have a large creative component, and while you would like to make programming resemble engineering, it will take a lot of time to get there -- and maybe you really, in the long run, do not want to do it! Maybe it just sounds good.

5: History of computer applications

Trial

This is typical of many situations. It is first necessary to prove beyond any doubt the new thing, device, method, or whatever it is, can cope with heroic tasks before it can get into the system to do the more routine, and, in the long run, more useful tasks. Any innovation is always against such a barrier, so do not get discouraged when you find your new idea is stoutly, and perhaps foolishly, resisted. By realizing the magnitude of the actual task you can then decide if it is worth your efforts to continue, or if you should go do something else you can accomplish and not fritter away your efforts needlessly against the forces of inertia and stupidity.

Growth

What will come along to sustain this straight line logarithmic growth curve and prevent the inevitable flattening out of the S-curve of applications? The next big area is, I believe, pattern recognition. I doubt our ability to cope with the most general problem of pattern recognition, because for one thing it implies too much, but in areas like speech recognition, radar pattern recognition, picture analysis and redrawing, workload scheduling in factories and offices, analysis of data for statisticians, creation of virtual images, and such, we can consume a very large amount of computer power. Virtual reality computing will become a large consumer of computing power, and its obvious economic value assures us this will happen, both in the practical needs and in amusement areas. Beyond these is, I believe, artificial intelligence, which will finally get to the point where the delivery of what they have to offer will justify the price in computing effort, and will hence be another source of problem solving.

pattern recognition (ie machine learning)
virtual reality
artificial intelligence

Collaboration

The reason it did not work as planned is simple. If the current status of the design is on the tape (currently discs), and if you use the data during a study of, say, wing area, shape, and profile, then when you make a change in your parameters and you find an improvement, it might have been due to a change someone else inserted into the common design and not to the change you made -- which might have actually made things worse! Hence what happened in practice was each group, when making an optimization study, made a copy of the current tape, and used it without any updates from any other area. Only when they finally decided on their new design did they insert the changes -- and of course they had to verify their new design meshed with the new designs of the others. You simply cannot use a constantly changing database for an optimization study.
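A toy illustration of the point (my own sketch, not from the book). Everything here is hypothetical: the design parameters, the numbers, and the cost function are invented only to show how a concurrent edit can mask the effect of your own change, and why working from a frozen copy avoids that.

```python
import copy

# Hypothetical shared design database; names and values are illustrative only.
shared_design = {"wing_area": 100.0, "engine_drag": 20.0}

def cost(design: dict) -> float:
    """Toy cost, lower is better. Not a real aerodynamic model."""
    return abs(design["wing_area"] - 120.0) + design["engine_drag"]

# The wing group freezes a private copy before starting its study.
snapshot = copy.deepcopy(shared_design)
baseline = cost(snapshot)

# Meanwhile another group inserts a change into the shared database.
shared_design["engine_drag"] = 5.0

# The wing group's trial change, evaluated against both versions.
trial_on_snapshot = dict(snapshot, wing_area=90.0)
trial_on_live = dict(shared_design, wing_area=90.0)

delta_snapshot = cost(trial_on_snapshot) - baseline  # wing change alone
delta_live = cost(trial_on_live) - baseline          # mixed with the other group's edit

print(f"against the frozen copy: {delta_snapshot:+.1f}")
print(f"against the live data:   {delta_live:+.1f}")
```

Against the live data the wing change looks like an improvement; against the frozen copy it is plainly worse. The merge step afterwards (checking the new design meshes with everyone else's) is not modeled here.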

6: Limits of computer applications -- AI-I

computers manipulate symbols, not information; we are simply unable to say, let alone write a program for, what we mean by the word "information"

Can Machines Think?

I believe very likely in the future we will have vehicles exploring the surface of Mars. The distance between Earth and Mars at times may be so large the signaling time round-trip could be 20 or more minutes. In the exploration process the vehicle must, therefore, have a fair degree of local control. When, having passed between two rocks, turned a bit, and then found the ground under the front wheels was falling away, you will want prompt, "sensible" action on the part of the vehicle. Simple, obvious things like backing up will be inadequate to save it from destruction, and there is not time to get advice from Earth; hence some degree of "intelligence" should be programmed into the machine.
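A rough check of the round-trip figure (my own sketch, not from the book). The Earth-Mars distances are approximate values I am assuming for closest approach and greatest separation; the speed of light is the defined constant. Processing and relay delays are ignored.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458

# Approximate Earth-Mars distances in km (assumed round figures).
CLOSEST_KM = 54.6e6
FARTHEST_KM = 401e6

def round_trip_minutes(distance_km: float) -> float:
    """Time for a signal to reach Mars and a reply to return, in minutes."""
    return 2 * distance_km / SPEED_OF_LIGHT_KM_S / 60

rt_min = round_trip_minutes(CLOSEST_KM)
rt_max = round_trip_minutes(FARTHEST_KM)

print(f"round trip at closest approach:    {rt_min:.1f} minutes")
print(f"round trip at greatest separation: {rt_max:.1f} minutes")
```

Under these assumptions the delay ranges from several minutes to roughly three quarters of an hour, which is consistent with "20 or more minutes" at times and far too slow to steer a vehicle away from a drop.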

I earlier mentioned the need to get at least some understanding of what we mean by "a machine" and by "thinking." We were discussing these things at Bell Telephone Laboratories in the late 1940s and someone said a machine could not have organic parts, upon which I said the definition excluded any wooden parts! The first definition was retracted, but to be nasty I suggested in time we might learn how to remove a large part of a frog's nervous system and keep it alive. If we found how to use it for a storage mechanism, would it be a machine or not? If we used it as content-addressable storage, how would you feel about it being a "machine"?
In the same discussion, on the thinking side, a Jesuit-trained engineer gave the definition, "Thinking is what humans can do and machines cannot do." Well, that solves the problem once and for all, apparently. But do you like the definition? Is it really fair? As we pointed out to him then, if we start with some obvious difference at present, then with improved machines and better programming we may be able to reduce the difference, and it is not clear in the long run there would be any difference left.
Clearly we need to define "thinking." Most people want the definition of thinking to be such that they can think, but stones, trees, and such things cannot think. But people vary to the extent that they will or will not include the higher levels of animals. People often make the mistake of saying, "Thinking is what Newton and Einstein did." But by that definition most of us cannot think -- and usually we do not like that conclusion! Turing, in coping with the question, in a sense evaded it, and made the claim that if at the end of one teletype line there was a human and at the end of another teletype line there was a suitably programmed machine, and if the average human could not tell the difference, then that was proof of "thinking" on the part of the machine (program).

Are We Machines?

Physics regards you as a collection of molecules in a radiant energy field, and there is, in strict physics, nothing else. Democritus (b. around 460 BC) said in Ancient Greek times, "All is atoms and void." This is the stance of the hard AI people; there is no essential difference between machines and humans, hence by suitably programming machines, the machines can do anything humans can do. Their failure to produce thinking in significant detail is, they believe, merely the failure of programmers to understand what they are doing, and not an essential limitation.

Is it not fair to say, "The program learned from experience"? Your immediate objection is that there was a program telling the machine how to learn. But when you take a course in Euclidean geometry, is not the teacher putting a similar learning program into you? Poorly, to be sure, but is that not, in a real sense, what a course in geometry is all about? You enter the course and cannot do problems; the teacher puts into you a program and at the end of the course you can solve such problems. Think it over carefully. If you deny the machine learns from experience because you claim the program was told (by the human programmer) how to improve its performance, then is not the situation much the same with you, except you are born with a somewhat larger initial program compared to the machine when it leaves the manufacturer's hands? Are you sure you are not merely "programmed" in life by what chance events happen to you?

"I know kung fu"

7: Limits of computer applications -- AI-II

Quality of Quantity?

Let us start again and return to the elements of machines and humans. Both are built out of atoms and molecules. Both have organized basic parts; the machine has, among other things, two-state devices both for storage and for gates, while humans are built of cells. Both have larger structures, arithmetic units, storage, control, and I/O for machines, and humans have bones, muscles, organs, blood vessels, a nervous system, etc.
But let us note some things carefully. From large organizations new effects can arise. For example, we believe there is no friction between molecules, but most large structures show this effect -- it is an effect which arises from the organization of smaller parts which do not show the effect.
We should also note that often when we engineer some device to do the same as nature does, we do it differently. For example, we have airplanes which generally use fixed wings (or rotors), while birds mainly flap their wings. But we also do a different thing -- we fly much higher and certainly much faster than birds can. Nature never invented the wheel, though we use wheels in many, many ways. Our nervous system is comparatively slow and signals with a velocity of around a few hundred meters per second, while computers signal at around 186,000 miles per second.
A third thing to note, before continuing with what AI has accomplished, is that the human brain has many, many components in the form of interconnected nerves. We want to have the definition of "thinking" be something the human brain can do. With past failures to program a machine to think, the excuse is often given that the machine was not big enough, fast enough, etc. Some people conclude from this that if we build a big enough machine, then automatically it will be able to think! Remember, it seems to be more a problem of writing the program than it is building a machine, unless you believe, as with friction, that enough small parts will produce a new effect -- thinking from non-thinking parts. Perhaps that is all thinking really is! Perhaps it is not a separate thing, it is just an artifact of largeness. One cannot flatly deny this, as we have to admit we do not know what thinking really is.