Choose Descriptive Names, But Try For Contexts That Make Descriptive Names Unimportant.

The names you choose for your constants, variables, types, classes, subprograms, etc. have a subtle effect on the thinking of any person whose understanding of your program is not complete. Because bugs can be caused by thoughts we are not quite conscious of, your choices of names are important to such a person. If your program is successful, then sooner or later somebody else will work on it. In fact, that person may be an older, but now clueless, version of yourself.

Choose descriptive names.

I do this by trying to choose a name that gives a useful hint.

The first rule for achieving a name that gives a useful hint is to choose a name that does not mislead.

An example of misleading names would be classes (or modules) named INPUT and OUTPUT but written so that normal output is done from within the INPUT class in a way that bypasses the OUTPUT class.

(Few people would choose names like this. Nevertheless, similar examples are easily found in production code. I think they are caused by changes made after the choice of name. Don't let that happen. Changing a name to fit new circumstances doesn't take much time -- especially when compared to the consequences of bad names.)

The second rule for achieving a name that gives a useful hint is to take the time to decide what hint would be useful.

(When it is difficult to find a useful hint, how much effort should you make? When should you instead take the easy way out and choose a name such as XYZ or ABC which gives no hint but, at the same time, doesn't mislead either?

I can tell you that, whenever I have a clear idea about what I am doing, choosing a name is seldom difficult. Therefore, when finding a name is hard, it is a signal that I'm not clear enough about what I'm doing or that I am trying something too complicated.

Put another way: when I can't think of a good name, it is a signal that the code I am about to write may have to be redone when I know what I am doing. Choosing a name is often a part of a process that clarifies my thinking and avoids rewrites.

This explains why the first person to suffer from a bad name will probably be yourself. Look at your current project. Is it debugged? Are the names good? I have made bugs go away by rewriting in a way that permits good names. You can too. Not only will your code be more correct, but everybody who works on it afterward will appreciate what you have done.)

The third rule for achieving a name that gives a useful hint is not to try for a name that explains everything. You won't succeed and the effort will make you dislike descriptive names.

(Here's an example from a program I am currently working on. This program translates a markup language of my creation to LaTeX or HTML. The program has a subprogram that reads input from markup token to markup token. As a markup token is being searched for, text that is passed over is placed in a string named Prev. The token itself is placed in a string named Toke. Now, Prev may be empty if two tokens appear with no intervening normal text. However, Toke may only be empty when the input file has been consumed.

The names Prev and Toke give hints about purposes but they do not explain the whole story. For that you need the information in the preceding paragraph.)
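To make the Prev and Toke description concrete, here is a minimal sketch in C. It is not the translation program's actual code. The token syntax (anything between '<' and '>'), the fixed buffer size, the helper copy_span, the function name next_token, and reading from an in-memory string instead of a file are all assumptions made for illustration.

```c
#include <string.h>

#define BUF_SIZE 256

/* Hypothetical token syntax: a token is '<', some characters, then '>'. */
char Prev[BUF_SIZE];  /* text passed over while searching for the token */
char Toke[BUF_SIZE];  /* the token itself; empty only when input is consumed */

/* Copy the characters in [start, stop) into dest, truncating to fit. */
void copy_span(char *dest, const char *start, const char *stop) {
    size_t n = (size_t)(stop - start);
    if (n >= BUF_SIZE) n = BUF_SIZE - 1;
    memcpy(dest, start, n);
    dest[n] = '\0';
}

/* Read from *cursor up to and including the next token.
   Fills Prev and Toke and advances *cursor past what was read. */
void next_token(const char **cursor) {
    const char *p = *cursor;
    const char *start = strchr(p, '<');
    const char *stop = start ? strchr(start, '>') : NULL;

    if (stop == NULL) {               /* no complete token remains */
        const char *end = p + strlen(p);
        copy_span(Prev, p, end);      /* trailing text, possibly empty */
        Toke[0] = '\0';               /* signals that input is consumed */
        *cursor = end;
        return;
    }
    copy_span(Prev, p, start);        /* may be empty: two adjacent tokens */
    copy_span(Toke, start, stop + 1);
    *cursor = stop + 1;
}
```

Calling next_token repeatedly on the string "Hello <b><i>world" yields Prev "Hello " with Toke "<b>", then an empty Prev with Toke "<i>", then Prev "world" with an empty Toke. The names hint at the roles, and the comments carry the rest of the story.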

Try For Contexts That Make Descriptive Names Unimportant

This part of the tip applies particularly to variable names. To set the stage, look at this C code:

 

int Get_sum( int A [], int Last_index ) {
  int I, S = 0;   /* the sum starts at zero */
  for( I=0; I<=Last_index; I+=1 ) { S += A[I]; }
  return S;
}

I doubt that many of us would say it is important to have descriptive names for the variables S, A, and I. Some would say that descriptive names should be given to all variables all the time. Their reason would probably be that it is easier and more reliable to give all variables descriptive names than to decide which ones don't need them.

That argument has a lot of merit but it summarily throws away thousands of years of experience writing mathematics. When we ask ourselves why mathematicians name variables one way and software engineers another, we find reasons for software engineers to behave like mathematicians -- some of the time. Possibly more important, we find reasons why software engineers should try to create situations in which it makes sense to name variables like mathematicians.
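For comparison, here is a sketch of the same summation written both ways. The second function and its variable names (Get_sum_descriptive, Addends, Running_total, Addend_index) are hypothetical, chosen only to show what "descriptive names for all variables" might look like in so small a context.

```c
/* Short names: the whole context fits in a glance,
   so S, A and I are easy to follow. */
int Get_sum(const int A[], int Last_index) {
    int S = 0;
    for (int I = 0; I <= Last_index; I += 1) { S += A[I]; }
    return S;
}

/* The same computation with every variable given a descriptive name. */
int Get_sum_descriptive(const int Addends[], int Last_index) {
    int Running_total = 0;
    for (int Addend_index = 0; Addend_index <= Last_index; Addend_index += 1) {
        Running_total += Addends[Addend_index];
    }
    return Running_total;
}
```

Both functions compute the same result. Whether the longer names add information here, or only length, is the question this part of the tip asks.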

(What I am about to assert goes against the conventional wisdom of the software engineering community. That wisdom is that descriptive variable names are always good but laziness can be tolerated in some few situations. There is no better example of this conventional wisdom at work than in the book Code Complete. Two studies about the length of variable names are cited in the chapter titled "The Power of Data Names". One study suggests that data names should be an average of twelve (plus or minus a few) characters long. This study is given a whole sub-sub-section. The other study suggests that rarely used global names should be long and local or loop variables should be short. This study is given a sentence. But, even that short shrift is too much for a book that follows conventional wisdom on this subject -- the sentence about the second study is followed with this editorial comment: "Short names are subject to many problems, however, and some careful programmers avoid them altogether as a matter of defensive-programming policy."

That's the way conventional wisdom works. Evidence that supports it is seen to be definitive and evidence that goes against it is seen to be flawed.)

In spite of the software engineering community's experience, there is no question that short variable names are valuable in mathematics. As just one recent example, consider an article in the American Mathematical Monthly by the developer of the LaTeX document preparation system. The article argues that mathematicians should look to software engineering techniques for a better way to organize their proofs. You might expect that one of those software engineering techniques would be the use of descriptive variable names. It isn't. The example proof follows mathematical practice and uses short variable names.

What's going on? Obviously, there is some significant difference between writing mathematics and writing computer software. It's not hard to see relevant factors:

To the extent that software engineers can divide their work up into tidy contexts, the way mathematicians do, their need for descriptive variable names decreases.

To the extent that software engineers must deal with complicated relationships and operations in one context, their need for shorter variable names increases.

A simple example of how context can affect the need for a more descriptive variable name is found in the following two Pascal procedures.

 

procedure updateAddress1( var C : CUSTOMERxRECORD );
begin
   C.Address := ...
end; { updateAddress1 }

procedure updateAddress2( var CustomerRecord : CUSTOMERxRECORDxTYPE );
begin
   CustomerRecord.Address := ...
end; { updateAddress2 }

In updateAddress1, the descriptive type name makes a descriptive object name redundant -- the parameter name can be one letter long without sacrificing readability.
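The same idea carries over to C. The following is a minimal sketch, assuming a hypothetical CUSTOMER_RECORD type with fixed-size Name and Address fields and a hypothetical Update_address function. None of these appear in the Pascal example, whose body is elided.

```c
#include <string.h>

#define FIELD_SIZE 64

/* Hypothetical record type; the field layout is an assumption. */
typedef struct {
    char Name[FIELD_SIZE];
    char Address[FIELD_SIZE];
} CUSTOMER_RECORD;

/* The parameter's type already says what C is, and the function is
   short enough that the declaration never scrolls out of view. */
void Update_address(CUSTOMER_RECORD *C, const char *New_address) {
    strncpy(C->Address, New_address, FIELD_SIZE - 1);
    C->Address[FIELD_SIZE - 1] = '\0';   /* strncpy may not terminate */
}
```

The one-letter name works only because the context is small. In a longer function that handled several records at once, C would stop being enough.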

In modern software engineering there are more constructs to help a programmer create tidy contexts: packages, modules, and classes come immediately to mind.

The example involving Prev and Toke above illustrates the value of a package. In the translation program, there is a package of subprograms which control these variables. Outside the package, the variables behave as described. Inside, the situation is more complex because their values must sometimes be pushed on a stack so that an include file can be read.

Here's a short formulation of the point I am trying to make:

Zimmer's Hypothesis

Computer code that can be understood with short variable names is better than computer code that requires long variable names.

(Do not take this hypothesis to mean that you should try to organize your code so that all variable names can be short. Take it to mean that you should seek out contexts that let some of your variable names be short.

Note that designing contexts that permit short variable names is still valuable if you work for somebody who insists on descriptive variable names. The real value comes from the well-organized contexts -- not the variable names.

My hypothesis does not mean that code with shorter variable names is better than code with longer variable names. Believing that will only bring you into conflict with others who believe that code with longer variable names is better. Both ideas are stupid.)

The purposes of global variables that are seldom used are easy to forget. Give those variables long descriptive names. All my hypothesis says is that a program which needs fewer of those variables will be easier to work with than one which needs lots of them. Nothing very controversial in that.

Remarks:

Here's a tip that discusses naming conventions.

Here's a tip that discusses a style of programming which permits programmers to set up tidy contexts the way mathematicians do.

Copyright and Permissions

Copyright, 1995 by J Adrian Zimmer

This tip is distributed to individuals free of charge from the Software Build and Fix web site. All other distribution (including but not limited to internal distribution within an organization and mirroring of any kind) is forbidden without written consent of the copyright holder.

Context: Some Tips for Programmers
Author: J Adrian Zimmer
Dated: October 1, 1995 ; Revised: Oct 07 1998