In the default mode, the checker allows all integer to integer conversions,
explicit integer to pointer and pointer to integer conversions and
the explicit pointer to pointer conversions defined by the ISO C standard
(all conversions between pointers to function types and other pointers
are undefined according to the ISO C standard).
Checks to detect these conversions are controlled by the conversion analysis pragma described below.
Due to the serious nature of implicit pointer to integer conversions, implicit
pointer to pointer conversions and undefined explicit pointer to pointer
conversions, such conversions are flagged as errors by default. These
conversion checks are not controlled by the global conversion analysis
pragma, but must be controlled by the relevant individual pragmas
given in sections 3.2.2 and 3.2.3.
As usual, status must be replaced by on, warning or off in each of these pragmas.
The interaction of the integer conversion checks with the integer
promotion and arithmetic rules is an extremely complex issue which
is further discussed in Chapter 4.
All pointer to pointer conversions may be flagged as errors using the pragma given in section 3.2.3.
Conversion between a pointer to a function type and a pointer to a
non-function type is undefined by the ISO C standard and should generally
be avoided. The checker can, however, be configured to treat function
pointers as object pointers for conversion purposes, using the function pointer as pointer pragma described below.
The generic pointer, void *, is a special case. All conversions of
pointers to object or incomplete types to or from a generic pointer
are allowed. Some older dialects of C used char * as a generic pointer.
This dialect feature may be allowed, allowed with a warning, or disallowed
using the compatible type pragma described below.
Consider the code in the 64-bit portability example of section 3.2.4 below.
The error there occurs because of the failure to spot that the offset
being added to string is unsigned. All mixed integer type arithmetic
involves some argument conversion. In that example, scale is converted
to an unsigned int and that is multiplied by offset to give an unsigned
int result. If the implicit int-int conversion checks (3.2.1) are enabled,
this conversion is detected and the problem may be avoided.
Even if prototypes are not supported the checker has a facility, described
below, for detecting incorrectly typed functions.
There is one limitation on the declaration of weak prototypes - declarations
of the form:
For example, in the bizarre function in 3.3, the
weak prototype:
Note that a prototype inferred from function calls alone cannot ensure
that the uses of the function within a source file are correct, merely
that they are consistent. The presence of an explicit function declaration
or definition is required for a definitive "right" prototype.
Null pointers cause particular problems with weak prototypes inferred
from function calls. For example, in:
There is also an equivalent command line option of the form -X:weak_proto=state (see 3.3.1).
This section ends with two examples which demonstrate some of the
less obvious consequences of weak prototype analysis.
Example 1: An obscure type mismatch
As stated above, the promotion and conversion rules for weak prototypes
are precisely those for traditionally declared and defined functions.
Consider the program:
Many programs, seeking to have prototype checks while preserving compilability
with non-prototype compilers, adopt a compromise approach of traditional
definitions plus prototype declarations for those compilers which
understand them, as in the example above. While this ensures correct
argument passing in the prototype case, as the example shows it may
obscure errors in the non-prototype case.
Example 2: Weak prototype checks in defined programs
In most cases a program which fails to compile with the weak prototype
analysis enabled is undefined. ISO standard C does however contain
an anomalous rule on equivalence of representation. For example, in:
Another case in which a program is defined, but not correct, is where
an unnecessary extra argument is passed to a function. For example,
in:
In order for this check to take place, the function declaration needs
to tell the checker that the function is like printf. This is done
by introducing a special type, PSTRING say, to stand for
a printf string, using:
The TenDRA descriptions of the standard APIs use this mechanism to
describe those functions, namely printf, fprintf and sprintf, and
scanf, fscanf and sscanf which are of these forms. This means that
the checks are switched on for these functions by default. However,
these descriptions are under the control of a macro, __NO_PRINTF_CHECKS,
which, if defined before stdio.h is included, effectively switches
the checks off. This macro is defined in the start-up files for certain
checking modes, so that the checks are disabled in these modes (see
chapter 2). The checks can be enabled in these cases by #undef'ing
the macro before including stdio.h. There are equivalent command-line
options to tchk of the form
There are also equivalent command line options to tchk of the form
This check also detects functions which do not contain a return statement,
but fall out of the bottom of the function as in:
Therefore the best chance of detecting bugs in a program and ensuring
its portability comes from having each function declared before it
is used. This means detecting implicit declarations and replacing
them by explicit declarations. By default implicit function declarations
are allowed; however, they may be detected using the pragma given in section 3.4.1 below.
(There are also equivalent command-line options to tcc; see 3.4.1.)
This test assumes an added significance in API checking. If a programmer
wishes to check that a certain program uses nothing outside the POSIX
API, then implicitly declared functions are a potential danger area.
A function from outside POSIX could be used without being detected
because it has been implicitly declared. Therefore, the detection
of implicitly declared functions is vital to rigorous API checking.
When comparing function prototypes for compatibility, the function
parameter types must be compared. If the parameter types would otherwise
be incompatible, they are treated as compatible if they have previously
been introduced with a type-type parameter compatibility pragma (see 3.4.2).
Two function prototypes with different numbers of arguments are compatible
if:
If, when comparing two function prototypes for compatibility, one
has an ellipsis and the other does not, but otherwise the two types
would be compatible, then if an `extra' ellipsis is allowed, the types
are treated as compatible. The pragma controlling ellipsis compatibility
is given in section 3.4.2.
3.2 Type conversions
The only types which may be interconverted legally are integral types,
floating point types and pointer types. Even if these rules are observed,
the results of some conversions can be surprising and may vary on
different machines. The checker can detect three categories of conversion:
integer to integer conversions, pointer to integer and integer to
pointer conversions, and pointer to pointer conversions.
Checks on all these conversions are controlled by the global pragma:
#pragma TenDRA conversion analysis status
Unless explicitly stated to the contrary, throughout the rest of the
document where status appears in a pragma statement it represents one of
on (enable the check and produce errors), warning (enable the check but
produce only warnings), or off (disable the check). Here status may be on
to give an error if a conversion is detected, warning to produce a warning
if a conversion is detected, or off to switch the checks off. The checks
may also be controlled using the command line option -X:test=state, where
test is one of convert_all, convert_int, convert_int_explicit,
convert_int_implicit, convert_int_ptr and convert_ptr, and state is
check, warn or dont.
3.2.1 Integer to integer conversions
All integer to integer conversions are allowed in C; however, some
can result in a loss of accuracy and so may usefully be detected.
For example, conversions from int to long never result in a loss of
accuracy, but conversions from long to int may. The detection of these
shortening conversions is controlled by:
#pragma TenDRA conversion analysis ( int-int ) status
Checks on explicit conversions and implicit conversions may be controlled
independently using:
#pragma TenDRA conversion analysis ( int-int explicit ) status
and
#pragma TenDRA conversion analysis ( int-int implicit ) status
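For example, the following sketch (the variable names are purely illustrative) shows the kind of conversion these checks detect:
void g ( void )
{
long l = 100000L;
int i = l;              /* implicit long -> int conversion; may lose accuracy */
short s = ( short ) l;  /* explicit long -> short conversion */
}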
Objects of enumerated type are specified by the ISO C standard to
be compatible with an implementation-defined integer type. However,
assigning a value of an integral type, other than an appropriate
enumeration constant, to an object of enumeration type is not really
in keeping with the spirit of enumerations. The check to detect the
implicit integer to enum type conversions which arise from such assignments
is controlled using:
#pragma TenDRA conversion analysis ( int-enum implicit ) status
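For example (a sketch only; the enumeration is invented for illustration):
enum colour { red, green, blue };
void g ( void )
{
enum colour c = green;    /* enumeration constant: not flagged */
c = 1;                    /* implicit int -> enum conversion: flagged by this check */
c = ( enum colour ) 2;    /* explicit cast: not flagged */
}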
Note that only implicit conversions are flagged; if the conversion
is made explicit, by using a cast, no errors are raised. As usual, status
must be replaced by on, warning or off in all the pragmas listed above.
3.2.2 Pointer to integer and integer to pointer conversions
Integer to pointer and pointer to integer conversions are generally
unportable and should always be specified by means of an explicit
cast. The exception is that the integer zero and null pointers are
deemed to be inter-convertible. As in the integer to integer conversion
case, explicit and implicit pointer to integer and integer to pointer
conversions may be controlled separately using:
#pragma TenDRA conversion analysis ( int-pointer explicit ) status
and
#pragma TenDRA conversion analysis ( int-pointer implicit ) status
or both checks may be controlled together by:
#pragma TenDRA conversion analysis ( int-pointer ) status
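The following sketch (illustrative only) shows the conversions these pragmas control:
void g ( long a, char *p )
{
char *q = ( char * ) a;   /* explicit integer -> pointer conversion */
long b = ( long ) p;      /* explicit pointer -> integer conversion */
char *r = a;              /* implicit integer -> pointer conversion: detected by the implicit check */
char *s = 0;              /* the integer zero is always accepted as a null pointer */
}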
where status may be on, warning or off as before, and pointer-int may be
substituted for int-pointer.
3.2.3 Pointer to pointer conversions
According to the ISO C standard, section 6.3.4, the only legal pointer
to pointer conversions are explicit conversions between pointers to object
or incomplete types, and between pointers to function types.
Except for conversions to and from the generic pointer which are discussed
below, all other conversions, including implicit pointer to pointer
conversions, are extremely unportable.
All pointer to pointer conversions may be flagged as errors using:
#pragma TenDRA conversion analysis ( pointer-pointer ) status
Explicit and implicit pointer to pointer conversions may be controlled
separately using:
#pragma TenDRA conversion analysis ( pointer-pointer explicit ) status
and
#pragma TenDRA conversion analysis ( pointer-pointer implicit ) status
where, as before, status may be on, warning or off.
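For example (a sketch only):
void g ( int *ip )
{
char *cp = ( char * ) ip;   /* explicit pointer -> pointer conversion */
long *lp = ip;              /* implicit pointer -> pointer conversion: detected by the implicit check */
}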
Conversions between function pointers and object pointers may nevertheless be permitted using:
#pragma TenDRA function pointer as pointer permit
Unless explicitly stated to the contrary, throughout the rest of the
document where permit appears in a pragma statement it represents one of
allow (allow the construct and do not produce errors), warning (allow the
construct but produce warnings when it is detected), or disallow (produce
errors if the construct is detected). Here there are three options for
permit: allow (do not produce errors or warnings for function pointer <->
pointer conversions); warning (produce a warning when function pointer <->
pointer conversions are detected); or disallow (produce an error for
function pointer <-> pointer conversions).
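A sketch of the kind of conversion this pragma controls:
extern int compare ( int, int );
void g ( void )
{
void *p = ( void * ) compare;   /* function pointer -> object pointer conversion */
int ( *fp ) ( int, int ) = ( int ( * ) ( int, int ) ) p;   /* ... and back again */
}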
The use of char * as a generic pointer, an older dialect feature, is controlled using:
#pragma TenDRA compatible type : char * == void * permit
where permit is allow, warning or disallow as before.
3.2.4 Example: 64-bit portability issues
64-bit machines form the "next frontier" of program portability.
Most of the problems involved in 64-bit portability are type conversion
problems. The assumptions that were safe on a 32-bit machine are not
necessarily true on a 64-bit machine - int may not be the same size
as long, pointers may not be the same size as int, and so on. This
example illustrates the way in which the checker's conversion analysis
tests can detect potential 64-bit portability problems.
#include <stdio.h>
void print ( string, offset, scale )
char *string;
unsigned int offset;
int scale;
{
string += ( scale * offset );
( void ) puts ( string );
return;
}
int main ()
{
char *s = "hello there";
print ( s + 4, 2U, -2 );
return ( 0 );
}
This appears to be fairly simple - the offset of 2U scaled by -2 cancels
out the offset in s + 4, so the program just prints "hello there".
Indeed, this is what happens on most machines. When ported to a particular
64-bit machine, however, it core dumps. The fairly subtle reason is
that the composite offset, scale * offset, is actually calculated
as an unsigned int by the ISO C arithmetic conversion rules. So the
answer is not -4. The unsigned arithmetic is well defined, and the value
of scale * offset is UINT_MAX - 3. The fact that adding this offset
to string is equivalent to adding -4 is only true on machines on which
pointers have the same size as unsigned int. If a pointer contains
64 bits and an unsigned int contains 32 bits, the result is 2^32 bytes out.
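One possible portable rewrite of print, assuming the composite offset always fits in a long, forces the calculation into a signed type before the pointer arithmetic (a sketch only, keeping the traditional definition style of the original example):
void print ( string, offset, scale )
char *string;
unsigned int offset;
int scale;
{
string += ( ( long ) scale * ( long ) offset );   /* offset computed in a signed type */
( void ) puts ( string );
return;
}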
3.3 Function type checking
The importance of function type checking in C lies in the conversions
which can result from type mismatches between the arguments in a function
call and the parameter types assumed by its definition or between
the specified type of the function return and the values returned
within the function definition. Until the introduction of function
prototypes into ISO standard C, there was little scope for detecting
the correct typing of functions. Traditional C allows for absolutely
no type checking of function arguments, so that totally bizarre functions,
such as:
int f ( n ) int n ; {
return ( f ( "hello", "there" ) ) ;
}
are allowed, although their effect is undefined. However, the move
to fully prototyped programs has been relatively slow. This is partially
due to an understandable reluctance to change existing, working programs,
but the desire to maintain compatibility with existing C compilers,
some of which still do not support prototypes, is also a powerful
factor. Prototypes are allowed in the checker's default mode but tchk
can be configured to allow, allow with a warning or disallow prototypes,
using:
#pragma TenDRA prototype permit
where permit is allow, disallow or warning.
3.3.1 Type checking non-prototyped functions
The checker offers a method for applying prototype-like checks to
traditionally defined functions, by introducing the concept of "
weak" prototypes. A weak prototype contains function parameter
type information, but has none of the automatic argument conversions
associated with a normal prototype. Instead weak prototypes imply
the usual argument promotion passing rules for non-prototyped functions.
The type information required for a weak prototype can be obtained
in three ways: from an explicit weak prototype declaration, from a
traditional function definition, or from the calls of the function
within a source file.
Functions for which explicitly declared weak prototypes are provided
are always type-checked by the checker. Weak prototypes deduced from
function declarations or calls are used for type checking if the weak
prototype analysis mode is enabled (see the pragma given later in this
section). An explicit weak prototype declaration takes the form:
int f WEAK ( char, char * ) ;
where WEAK represents any keyword which has been introduced
using:
#pragma TenDRA keyword WEAK for weak
An alternative definition of the keyword must be provided for other
compilers. For example, the following definition would make system
compilers interpret weak prototypes as normal (strong) prototypes:
#ifdef __TenDRA__
#pragma TenDRA keyword WEAK for weak
#else
#define WEAK
#endif
The difference between conventional prototypes and weak prototypes
can be illustrated by considering the normal prototype for f:
int f (char,char *);
When the prototype is present, the first argument to f would be passed
as a char. Using the weak prototype, however, results in the first
argument being passed as the integral promotion of char, that is to
say, as an int.
There is one limitation on the declaration of weak prototypes: declarations of the form:
int f WEAK() ;
are not allowed. If a function has no arguments, this should be stated
explicitly as:
int f WEAK( void ) ;
whereas if the argument list is not specified, weak prototypes should
be avoided and a traditional declaration used instead:
extern int f ();
The checker may be configured to allow, allow with a warning or disallow
weak prototype declarations using:
#pragma TenDRA prototype ( weak ) permit
where permit is replaced by allow, warning or disallow as appropriate.
Weak prototypes are not permitted in the default mode.
A function with a traditional definition such as:
int f(c,s) char c; char *s;{...}
is said to have the weak prototype:
int f WEAK (char,char *);
The checker automatically constructs a weak prototype for each traditional
function definition it encounters and if the weak prototype analysis
mode is enabled (see below) all subsequent calls of the function are
checked against this weak prototype.
For example, from the definition of the bizarre function f in 3.3 above, the weak prototype:
int f WEAK ( int );
is constructed for f. The subsequent call to f:
f ( "hello", "there" );
is then rejected by comparison with this weak prototype - not only
is f called with the wrong number of arguments, but the first argument
has a type incompatible with (the integral promotion of) int.
Weak prototypes can also be inferred from the calls of a function within a source file. For example, given:
extern void f ();
void g ()
{
f ( 3 );
f ( "hello" );
}
we can infer from the first call of f that f takes one integral argument.
We cannot deduce the type of this argument, only that it is an integral
type whose promotion is int (since this is how the argument is passed).
We can therefore infer a partial weak prototype for f:
void f WEAK ( t );
for some integral type t which promotes to int. Similarly, from the
second call of f we can infer the weak prototype:
void f WEAK ( char * );
(the argument passing rules are much simpler in this case). Clearly
the two inferred prototypes are incompatible, so an error is raised.
Null pointers are a particular problem for weak prototypes inferred from function calls. For example, in:
#include <stdio.h>
extern void f ();
void g () {
f ( "hello" );
f( NULL );
}
the argument in the first call of f is char* whereas in the second
it is int (because NULL is defined to be 0). Whereas NULL can be converted
to char*, it is not necessarily passed to procedures in the same way
(for example, it may be that pointers have 64 bits and ints have 32
bits). It is almost always necessary to cast NULL to the appropriate
pointer type in weak procedure calls.
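For example, the second call above is better written with an explicit cast:
f ( ( char * ) NULL );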
Weak prototype analysis is enabled using:
#pragma TenDRA weak prototype analysis status
where status is one of on, warning and off as usual. Weak prototype
analysis is not performed in the default mode. There is also an
equivalent command line option, -X:weak_proto=state, where state can be
check, warn or dont.
As an example of an obscure type mismatch, consider the program:
void f ( n ) long n ; {
printf ( "%ld\n", n );
}
void g (){
f ( 3 );
}
The literal constant 3 is an int and hence is passed as such to f.
f is however expecting a long, which can lead to problems on some
machines. Introducing a strong prototype declaration of f for those
compilers which understand them:
#ifdef __STDC__
void f ( long );
#endif
will produce correct code - the arguments to a function declared with
a prototype are converted to the appropriate types, so that the literal
is actually passed as 3L. This solves the problem for compilers which
understand prototypes, but does not actually detect the underlying
error. Weak prototypes, because they use the traditional argument
passing rules, do detect the error. The constructed weak prototype:
void f WEAK ( long );
conveys the type information that f is expecting a long, but accepts
the function arguments as passed rather than converting them. Hence,
the error of passing an int argument to a function expecting a long
is detected.
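The call itself can then be corrected by passing a long constant:
f ( 3L );   /* now passed as a long, matching the weak prototype */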
As a second example, in:
extern void f ();
void g () {
f ( 3 );
f ( 4U );
}
the TenDRA checker detects an error - in one instance f is being passed
an int, whereas in the other it is being passed an unsigned int. However,
the ISO C standard states that, for values which fit into both types,
the representation of a number as an int is equal to that as an unsigned
int, and that values with the same representation are interchangeable
in procedure arguments. Thus the program is defined. The justification
for raising an error or warning for this program is that the prototype
analysis is based on types, not some weaker notion of "equivalence
of representation". The program may be defined, but it is not
type correct.
Another case is where an unnecessary extra argument is passed to a function, as in:
void f ( a ) int a; {
printf ( "%d\n", a );
}
void g () {
f ( 3, 4 );
}
the call of f is defined, but is almost certainly a mistake.
3.3.2 Checking printf strings
Normally functions which take a variable number of arguments offer
only limited scope for type checking. For example, given the prototype:
int execl ( const char *, const char *, ... );
the first two arguments may be checked, but we have no hold on any
subsequent arguments (in fact in this example they should all be const
char *, but C does not allow this information to be expressed). Two
classes of functions of this form, namely the printf and scanf families,
are so common that they warrant special treatment. If one of these
functions is called with a constant format string, then it is possible
to use this string to deduce the types of the extra arguments that
it is expecting. For example, in:
printf ( "%ld", 4 );
the format string indicates that printf is expecting a single additional
argument of type long. We can therefore deduce a quasi-prototype
which this particular call to printf should conform to, namely:
int printf ( const char *,long );
In fact this is a mixture of a strong prototype and a weak prototype.
The first argument comes from the actual prototype of printf, and
hence is strong. All subsequent arguments correspond to the ellipsis
part of the printf prototype, and are passed by the normal promotion
rules. Hence the long component of the inferred prototype is weak
(see 3.3.1). This means that the error in the call to printf - the
integer literal is passed as an int when a long is expected - is detected.
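The corrected call simply passes the argument as a long, matching the format string:
printf ( "%ld", 4L );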
For these checks to take place, a function declaration must tell the checker that the function is printf-like. This is done by introducing a special type, PSTRING say, to stand for a printf string, using:
#pragma TenDRA type PSTRING for ... printf
For most purposes this is equivalent to:
typedef const char *PSTRING;
except that when a function declaration:
int f ( PSTRING, ... );
is encountered the checker knows to deduce the types of the arguments
corresponding to the ... from the PSTRING argument (the precise rules
it applies are those set out in the XPG4 definition of fprintf). If
this mechanism is used to apply printf style checks to user defined
functions, an alternative definition of PSTRING for conventional compilers
must be provided. For example:
#ifdef __TenDRA__
#pragma TenDRA type PSTRING for ... printf
#else
typedef const char *PSTRING;
#endif
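A user-defined printf-like function might then be declared and called as follows (error_printf is an invented name, used purely for illustration):
extern int error_printf ( PSTRING, ... );
void g ( int line )
{
error_printf ( "line %d: %s\n", line, "unexpected token" );   /* arguments checked against the format string */
}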
There are similar rules with scanf in place of printf. There are also
equivalent command line options to tchk of the form -X:printf=state,
where state can be check or dont, which respectively undefine and define
the __NO_PRINTF_CHECKS macro.
3.3.3 Function return checking
Function returns normally present no difficulties. The return value
is converted, as if by assignment, to the function return type, so
that the problem is essentially one of type conversion (see 3.2).
There is however one anomalous case. A plain return statement, without
a return value, is allowed in functions returning a non-void type,
the value returned being undefined. For example, in:
int f ( int c )
{
if ( c ) return ( 1 );
return;
}
the value returned when c is zero is undefined. The test for detecting
such void returns is controlled by:
#pragma TenDRA incompatible void return permit
where permit may be allow, warning or disallow as usual. There is an
equivalent command line option of the form -X:void_ret=state, where state
can be check, warn or dont. Incompatible void returns are allowed in the
default mode and, of course, plain return statements in functions
returning void are always legal.
This check also detects functions which do not contain a return statement, but fall out of the bottom of the function, as in:
int f ( int c )
{
if ( c ) return ( 1 );
}
Occasionally it may be the case that such a function is legal, because
the end of the function is not reached. Unreachable code is discussed
in section 5.2.
3.4 Overriding type checking
There are several commonly used features of C, some of which are even
allowed by the ISO C standard, which can circumvent or hinder the
type-checking of a program. The checker may be configured either to
enforce the absence of these features or to support them with or without
a warning, as described below.
3.4.1 Implicit Function Declarations
The ISO C standard states that any undeclared function is implicitly
assumed to return int. For example, in ISO C:
int f ( int c ) {
return ( g( c )+1 );
}
the undeclared function g is inferred to have a declaration:
extern int g ();
This can potentially lead to program errors. The definition of f would
be valid if g actually returned double, but incorrect code would be
produced. Again, an explicit declaration might give us more information
about the function argument types, allowing more checks to be applied.
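For instance, if g actually returned double and took a single int argument (an assumption made purely for illustration), an explicit declaration removes the incorrect implicit assumption and allows the argument to be checked:
extern double g ( int );
int f ( int c )
{
return ( g ( c ) + 1 );   /* the double result is now correctly converted to int */
}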
#pragma TenDRA implicit function declaration status
may be used to determine how tchk handles implicit function declarations.
Status is replaced by on to allow implicit declarations, warning to allow
implicit declarations but to produce a warning when they occur, or off to
prevent implicit declarations and raise an error where they would normally
be used. (There are also equivalent command-line options to tcc of the
form -X:implicit_func=state, where state can be check, warn or dont.)
3.4.2 Function Parameters
Many systems pass function arguments of differing types in the same
way and programs are sometimes written to take advantage of this feature.
The checker has a number of options to resolve type mismatches which
may arise in this way and would otherwise be flagged as errors:
#pragma TenDRA argument type-name as type-name
where type-name is the name of any type. This pragma is transitive
and the second type in the pragma is taken to be the final type of
the parameter.
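For example, on a machine which passes int and long arguments identically, one might write (a sketch; whether this is appropriate depends on the target):
#pragma TenDRA argument int as long
after which parameters of type int and long would be treated as compatible, with long taken as the final parameter type.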
Type-ellipsis compatibility is introduced using the pragma:
#pragma TenDRA argument type-name as ...
where again type-name is the name of any type.
Compatibility between a prototype with an `extra' ellipsis and an otherwise compatible prototype without one is controlled by:
#pragma TenDRA extra ... permit
where permit may be allow, disallow or warning as usual.
3.4.3 Incompatible promoted function arguments
Mixing the use of prototypes with old-fashioned function definitions
can result in incorrect code. For example, in the program below the
function argument promotion rules are applied to the definition of
f, making it incompatible with the earlier prototype (a is converted
to the integer promotion of char, i.e. int).
int f(char);
int f(a)char a;{
...
}
An incompatible type error is raised in the default checking mode.
The check for incompatible types which arise from mixtures of prototyped
and non-prototyped function declarations and definitions is controlled
using:
#pragma TenDRA incompatible promoted function argument permit
Permit may be replaced by allow, warning or disallow as normal. The
parameter type in the resulting function type is the promoted parameter type.
3.4.4 Incompatible type qualifiers
The declarations
const int a;
int a;
are not compatible according to the ISO C standard because the qualifier,
const, is present in one declaration but not in the other. Similar
rules hold for volatile qualified types. By default, tchk produces
an error when declarations of the same object contain different type
qualifiers. The check is controlled using:
#pragma TenDRA incompatible type qualifier permit
where the options for permit are allow, disallow or warning.
Crown Copyright © 1998.