This section describes how the C++ producer can be configured to apply extra static checks or to support various dialects of C++. In all cases the default behaviour is precisely that specified in the ISO C++ standard with no extra checks.
Certain very basic configuration information is specified using a
portability table, however the primary method
of configuration is by means of #pragma
directives.
These directives may be placed within the program itself; however, it is generally more convenient to group them into a
start-up file in order to create a
user-defined compilation profile. The
#pragma
directives recognised by the C++ producer have
one of the equivalent forms:
#pragma TenDRA ....
#pragma TenDRA++ ....
Some of these are common to the C and C++ producers (although often with differing default behaviour). The C producer will ignore any
TenDRA++
directives, so these may be used in compilation
profiles which are to be used by both producers. In the descriptions
below, the presence of a ++
is used to indicate a directive
which is C++ specific; the other directives are common to both producers.
Within the description of the #pragma
syntax, on
stands for on
, off
or warning
,
allow stands for allow
, disallow
or
warning
, string-literal is any string literal,
integer-literal is any integer literal, identifier is
any simple, unqualified identifier name, and type-id is any
type identifier. Other syntactic items are described in the text.
A
complete grammar for the #pragma
directives accepted by the C++ producer is given as an annex.
Certain very basic configuration information is read from a file called
a portability table, which may be specified to the producer using
a
-n
option. This information
includes the minimum sizes of the basic integral types, the
sign of plain char
, and whether signed
types can be assumed to be symmetric (for example, [-127,127]) or
maximum (for example, [-128,127]).
The default portability table values, which are built into the producer, can be expressed in the form:
char_bits		8
short_bits		16
int_bits		16
long_bits		32
signed_range		symmetric
char_type		either
ptr_int			none
ptr_fn			no
non_prototype_checks	yes
multibyte		1
This illustrates the syntax for the portability table; note that all ten entries are required, even though the last four are ignored.
The simplest level of configuration is to reset the severity level of a particular error message using:
#pragma TenDRA++ error string-literal on
#pragma TenDRA++ error string-literal allow
The given string-literal should name an error from the error catalogue. A severity of
on
or disallow
indicates that the associated diagnostic
message should be an error, which causes the compilation to fail.
A severity of
warning
indicates that the associated diagnostic message
should be a warning, which is printed but allows the compilation to
continue. A severity of off
or allow
indicates that the associated error should be ignored. Reducing the
severity of any error from its default value, other than via one of
the dialect directives described in this section, results in undefined
behaviour.
The next level of configuration is to reset the severity level of a particular compiler option using:
#pragma TenDRA++ option string-literal on
#pragma TenDRA++ option string-literal allow
The given string-literal should name an option from the option catalogue. The simplest form of compiler option just sets the severity level of one or more error messages. Some of these options may require additional processing to be applied.
It is possible to link a particular error message to a particular compiler option using:
#pragma TenDRA++ error string-literal as option string-literal
Note that the directive:
#pragma TenDRA++ use error string-literal
can be used to raise a given error at any point in a translation unit in a similar fashion to the
#error
directive. The values
of any parameters for this error are unspecified.
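For example, a start-up file might adjust individual severities along the following lines; the error and option names used here are placeholders, and real names must be taken from the error and option catalogues:
#pragma TenDRA++ error "example_error" warning
#pragma TenDRA++ option "example_check" on
#pragma TenDRA++ error "another_error" as option "example_check"
#pragma TenDRA++ use error "example_error"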
The directives just described give the primitive operations on error messages and compiler options. Many of the remaining directives in this section are merely higher level ways of expressing these primitives.
Most compiler options are scoped. A checking scope may be defined by enclosing a list of declarations within:
#pragma TenDRA begin
....
#pragma TenDRA end
If the final
end
directive is omitted then the scope
ends at the end of the translation unit. Checking scopes may be nested
in the obvious way. A checking scope inherits its initial set of
checks from its enclosing scope (this includes the implicit main checking
scope consisting of the entire input file). Any checks switched on
or off within a scope apply only to the remainder of that scope and
any scope it contains. A particular check can only be set once in
a given scope. The set of applied checks reverts to its previous state
at the end of the scope.
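As an illustrative sketch (again using a placeholder option name), checking scopes nest as follows:
#pragma TenDRA begin
#pragma TenDRA++ option "example_check" on
int f ( int ) ;			// checked with the option enabled
#pragma TenDRA begin
#pragma TenDRA++ option "example_check" off
int g ( int ) ;			// inner scope overrides the option
#pragma TenDRA end
int h ( int ) ;			// reverts to the state of the outer scope
#pragma TenDRA end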
A checking scope can be named using the directives:
#pragma TenDRA begin name environment identifier
....
#pragma TenDRA end
Checking scope names occupy a namespace distinct from any other namespace within the translation unit. A named scope defines a set of modifications to the current checking scope. These modifications may be reapplied within a different scope using:
#pragma TenDRA use environment identifier
The default behaviour is not to allow checks set in the named checking scope to be reset in the current scope. This can however be modified using:
#pragma TenDRA use environment identifier reset allow
Another use of a named checking scope is to associate a checking scope with a named include file directory. This is done using:
#pragma TenDRA directory identifier use environment identifier
where the directory name is one introduced via a
-N
command-line option.
The effect of this directive, if a #include
directive
is found to resolve to a file from the given directory, is as if the
file was enclosed in directives of the form:
#pragma TenDRA begin
#pragma TenDRA use environment identifier reset allow
....
#pragma TenDRA end
The checks applied to the expansion of a macro definition are those from the scope in which the macro was defined, not that in which it was expanded. The macro arguments are checked in the scope in which they are specified, that is to say, the scope in which the macro is expanded. This enables macro definitions to remain localised with respect to checking scopes.
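For instance, in the following sketch (using a placeholder option name) the macro body is checked under the scope in force at its definition, while its argument is checked in the scope in which the macro is expanded:
#pragma TenDRA begin
#pragma TenDRA++ option "example_check" off
#define DOUBLE( x )	( ( x ) + ( x ) )	/* body checked with the option off */
#pragma TenDRA end

#pragma TenDRA begin
#pragma TenDRA++ option "example_check" on
int n = DOUBLE ( 1 ) ;				/* the argument 1 is checked with the option on */
#pragma TenDRA end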
This table gives the default implementation limits imposed by the
C++ producer for the various implementation quantities listed in Annex
B of the ISO C++ standard, together with the minimum limits allowed
in ISO C and C++. A default limit of none means that the quantity
is limited only by the size of the host machine (either ULONG_MAX
or until it runs out of memory). A limit of target means that
while no limit is imposed by the C++ front-end, particular target
machines may impose such limits.
Quantity identifier | Min C limit | Min C++ limit | Default limit |
---|---|---|---|
statement_depth | 15 | 256 | none |
hash_if_depth | 8 | 256 | none |
declarator_max | 12 | 256 | none |
paren_depth | 32 | 256 | none |
name_limit | 31 | 1024 | none |
extern_name_limit | 6 | 1024 | target |
external_ids | 511 | 65536 | target |
block_ids | 127 | 1024 | none |
macro_ids | 1024 | 65536 | none |
func_pars | 31 | 256 | none |
func_args | 31 | 256 | none |
macro_pars | 31 | 256 | none |
macro_args | 31 | 256 | none |
line_length | 509 | 65536 | none |
string_length | 509 | 65536 | none |
sizeof_object | 32767 | 262144 | target |
include_depth | 8 | 256 | 256 |
switch_cases | 257 | 16384 | none |
data_members | 127 | 16384 | none |
enum_consts | 127 | 4096 | none |
nested_class | 15 | 256 | none |
atexit_funcs | 32 | 32 | target |
base_classes | N/A | 16384 | none |
direct_bases | N/A | 1024 | none |
class_members | N/A | 4096 | none |
virtual_funcs | N/A | 16384 | none |
virtual_bases | N/A | 1024 | none |
static_members | N/A | 1024 | none |
friends | N/A | 4096 | none |
access_declarations | N/A | 4096 | none |
ctor_initializers | N/A | 6144 | none |
scope_qualifiers | N/A | 256 | none |
external_specs | N/A | 1024 | none |
template_pars | N/A | 1024 | none |
instance_depth | N/A | 17 | 17 |
exception_handlers | N/A | 256 | none |
exception_specs | N/A | 256 | none |
It is possible to impose lower limits on most of the quantities listed above by means of the directive:
#pragma TenDRA++ option value string-literal integer-literal
where string-literal gives one of the quantity identifiers listed above and integer-literal gives the limit to be imposed. An error is reported if the quantity exceeds this limit (note however that checks have not yet been implemented for all of the quantities listed). Note that the
name_limit
and
include_depth
implementation limits
can be set using dedicated directives.
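For example, a profile aiming for portability to other compilers might impose some of the ISO C minimum limits (the particular values shown are purely illustrative):
#pragma TenDRA++ option value "nested_class" 15
#pragma TenDRA++ option value "switch_cases" 257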
The maximum number of errors allowed before the producer bails out can be set using the directive:
#pragma TenDRA++ set error limit integer-literal
The default value is 32.
During lexical analysis, a source file which is not empty should end in a newline character. It is possible to relax this constraint using the directive:
#pragma TenDRA no nline after file end allow
In several places in this section it is described how to introduce keywords for TenDRA language extensions. By default, no such extra keywords are defined. There are also low-level directives for defining and undefining keywords. The directive:
#pragma TenDRA++ keyword identifier for keyword identifier
can be used to introduce a keyword (the first identifier) standing for the standard C++ keyword given by the second identifier. The directive:
#pragma TenDRA++ keyword identifier for operator operator
can similarly be used to introduce a keyword giving an alternative representation for the given operator or punctuator, as, for example, in:
#pragma TenDRA++ keyword and for operator &&
Finally the directive:
#pragma TenDRA++ undef keyword identifier
can be used to undefine a keyword.
C-style comments do not nest. The directive:
#pragma TenDRA nested comment analysis on
enables a check for the characters
/*
within C-style
comments.
During lexical analysis, each character in the source file has an associated look-up value which is used to determine whether the character can be used in an identifier name, is a white space character etc. These values are stored in a simple look-up table. It is possible to set the look-up value using:
#pragma TenDRA++ character character-literal as character-literal allow
which sets the look-up for the first character to be the default look-up for the second character. The form:
#pragma TenDRA++ character character-literal disallow
sets the look-up of the character to be that of an invalid character. The forms:
#pragma TenDRA++ character string-literal as character-literal allow
#pragma TenDRA++ character string-literal disallow
can be used to modify the look-up values for the set of characters given by the string literal. For example:
#pragma TenDRA character '$' as 'a' allow
#pragma TenDRA character '\r' as ' ' allow
allows
$
to be used in identifier names (like a
)
and carriage return to be a white space character. The former is
a common dialect feature and can also be controlled by the directive:
#pragma TenDRA dollar as ident allow
The maximum number of characters allowed in an identifier name can be set using the directives:
#pragma TenDRA set name limit integer-literal
#pragma TenDRA++ set name limit integer-literal warning
This length is given by the
name_limit
implementation
quantity
mentioned above. Identifiers which exceed this
length raise an error or a warning, but are not truncated.
The rules for finding the type of an integer literal can be described using directives of the form:
#pragma TenDRA integer literal literal-spec
where:

  literal-spec :
	  literal-base literal-suffixopt literal-type-list

  literal-base :
	  octal
	  decimal
	  hexadecimal

  literal-suffix :
	  unsigned
	  long
	  unsigned long
	  long long
	  unsigned long long

  literal-type-list :
	  * literal-type-spec
	  integer-literal literal-type-spec | literal-type-list
	  ? literal-type-spec | literal-type-list

  literal-type-spec :
	  : type-id
	  * allowopt : identifier
	  * * allowopt :

Each directive gives a literal base and suffix, describing the form of an integer literal, and a list of possible types for literals of this form. This list gives a mapping from the value of the literal to the type to be used to represent the literal. There are three cases for the literal type; it may be a given integral type, it may be calculated using a given literal type token, or it may cause an error to be raised. There are also three cases for describing a literal range; it may be given by values less than or equal to a given integer literal, it may be given by values which are guaranteed to fit into a given integral type, or it may match any value. For example:
#pragma token PROC ( VARIETY c ) VARIETY l_i # ~lit_int
#pragma TenDRA integer literal decimal 32767 : int | ** : l_i
describes how to find the type of a decimal literal with no suffix. Values less than or equal to 32767 have type
int
; larger
values have target dependent type calculated using the token
~lit_int
. Introducing a warning
into the
directive will cause a warning to be printed if the token is used
to calculate the value.
Note that this scheme extends that implemented by the C producer,
because of the need for more accurate information in the C++ producer.
For example, the specification above does not fully express the ISO
rule that the type of a decimal integer is the first of the types
int
, long
and unsigned long
which it fits into (it only expresses the first step). However with
the C++ extensions it is possible to write:
#pragma token PROC ( VARIETY c ) VARIETY l_i # ~lit_int
#pragma TenDRA integer literal decimal ? : int | ? : long |\
	? : unsigned long | ** : l_i
By default, a simple character literal has type int
in
C and type char
in C++. The type of such literals can
be controlled using the directive:
#pragma TenDRA++ set character literal : type-id
The type of a wide character literal is given by the implementation defined type
wchar_t
. By default, the definition of
this type is taken from the target machine's <stddef.h>
C header (note that in ISO C++, wchar_t
is actually a
keyword, but its underlying representation must be the same as in
C). This definition can be overridden in the producer by means of
the directive:
#pragma TenDRA set wchar_t : type-id
for an integral type type-id. Similarly, the definitions of the other implementation dependent integral types which arise naturally within the language - the type of the difference of two pointers,
ptrdiff_t
, and the type of the sizeof
operator, size_t
- given in the <stddef.h>
header can be overridden using the directives:
#pragma TenDRA set ptrdiff_t : type-id
#pragma TenDRA set size_t : type-id
These directives are useful when targeting a specific machine on which the definitions of these types are known; while they may not affect the code generated they can cut down on spurious conversion warnings. Note that although these types are built into the producer they are not visible to the user unless an appropriate header is included (with the exception of the keyword
wchar_t
in ISO C++), however
the directives:
#pragma TenDRA++ type identifier for type-name
can be used to make these types visible. They are equivalent to a
typedef
declaration of identifier as the given
built-in type, ptrdiff_t
, size_t
or
wchar_t
.
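For example, when targeting a specific machine whose <stddef.h> definitions are known, a compilation profile might contain something like the following; the type choices shown are assumptions about one particular target, not defaults, and __size_t is simply a name chosen for the example:
#pragma TenDRA set size_t : unsigned long
#pragma TenDRA set ptrdiff_t : long
#pragma TenDRA set wchar_t : unsigned int
#pragma TenDRA++ type __size_t for size_t	// makes the built-in size_t visible as __size_t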
Whether plain char
is signed or unsigned is implementation
dependent. By default the implementation is determined by the definition
of the ~char
token, however
this can be overridden in the producer either by means of the
portability table or by the directive:
#pragma TenDRA character character-sign
where character-sign can be
signed
,
unsigned
or either
(the default). Again
this directive is useful primarily when targeting a specific machine
on which the signedness of char
is known.
By default, character string literals have type char [n]
in C and older dialects of C++, but type const char [n]
in ISO C++. Similarly wide string literals have type wchar_t
[n]
or const wchar_t [n]
. Whether string literals are
const
or not can be controlled using the two directives:
#pragma TenDRA++ set string literal : const
#pragma TenDRA++ set string literal : no const
In the case where literals are
const
, the array-to-pointer
conversion is allowed to cast away the const
to allow
for a degree of backwards compatibility. The status of this deprecated
conversion can be controlled using the directive:
#pragma TenDRA writeable string literal allow
(yes, I know that that should be
writable
). Note that
this directive has a slightly different meaning in the C producer.
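A minimal sketch of how these directives interact, assuming ISO const string literals with the deprecated conversion reduced to a warning:
#pragma TenDRA++ set string literal : const
#pragma TenDRA writeable string literal warning

char *p = "example" ;		// deprecated array-to-pointer conversion, warned
const char *q = "example" ;	// always acceptable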
Adjacent string literal tokens of similar types (either both character string literals or both wide string literals) are concatenated at an early stage in the parser; however, it is unspecified what happens if a character string literal token is adjacent to a wide string literal token. By default this gives an error, but the directive:
#pragma TenDRA unify incompatible string literal allow
can be used to enable the strings to be concatenated to give a wide string literal.
If a '
or "
character does not have
a matching closing quote on the same line then it is undefined whether
an implementation should report an unterminated string or treat the
quote as a single unknown character. By default, the C++ producer
treats this as an unterminated string, but this behaviour can be controlled
using the directive:
#pragma TenDRA unmatched quote allow
By default, if the character following the \
in an escape
sequence is not one of those listed in the ISO C or C++ standards
then an error is given. This behaviour, which is left unspecified
by the standards, can be controlled by the directive:
#pragma TenDRA unknown escape allow
The result is that the
\
in unknown escape sequences
is ignored, so that \z
is interpreted as z
,
for example. Individual escape sequences can be enabled or disabled
using the directives:
#pragma TenDRA++ escape character-literal as character-literal allow
#pragma TenDRA++ escape character-literal disallow
so that, for example:
#pragma TenDRA++ escape 'e' as '\033' allow
#pragma TenDRA++ escape 'a' disallow
sets
\e
to be the ASCII escape character and disables
the alert character \a
.
By default, if the value of a character, given for example by a
\x
escape sequence, does not fit into its type then an
error is given. This implementation dependent behaviour can however
be controlled by the directive:
#pragma TenDRA character escape overflow allow
the value being converted to its type in the normal way.
Non-standard preprocessing directives can be controlled using the directives:
#pragma TenDRA directive ppdir allow
#pragma TenDRA directive ppdir (ignore) allow
where ppdir can be
assert
, file
,
ident
, import
(C++ only),
include_next
(C++ only), unassert
,
warning
(C++ only) or weak
. The second form
causes the directive to be processed but ignored (note that there is no
(ignore) disallow
form). The treatment of other unknown
preprocessing directives can be controlled using:
#pragma TenDRA unknown directive allow
Cases where the token following the
#
in a preprocessing
directive is not an identifier can be controlled using:
#pragma TenDRA no directive/nline after ident allow
When permitted, unknown preprocessing directives are ignored.
By default, unknown #pragma
directives are ignored without
comment, however this behaviour can be modified using the directive:
#pragma TenDRA unknown pragma allow
Note that any unknown
#pragma TenDRA
directives always
give an error.
Older preprocessors allowed text after #else
and
#endif
directives. The following directive can be used
to enable such behaviour:
#pragma TenDRA text after directive allow
Such text after a directive is ignored.
Some older preprocessors have problems with white space in preprocessing
directives - whether at the start of the line, before the initial
#
, or between the #
and the directive identifier.
Such white space can be detected using the directives:
#pragma TenDRA indented # directive allow
#pragma TenDRA indented directive after # allow
respectively.
One of the effects of trying to compile code in a target independent
manner is that it is not always possible to completely evaluate the
condition in a #if
directive. Thus the conditional inclusion
needs to be preserved until the installer phase. This can only be
done if the target dependent #if
is more structured than
is normally required for preprocessing directives. There are two cases;
in the first, where the #if
appears in a statement, it
is treated as if it were a if
statement with braces including
its branches; that is:
#if cond
true_statements
#else
false_statements
#endif
maps to:
if ( cond ) {
	true_statements
} else {
	false_statements
}
The second case, where the #if appears in a list of declarations, normally gives an error. This can however be overridden by the directive:
#pragma TenDRA++ conditional declaration allow
which causes both branches of the
#if
to be analysed.
There is a maximum depth of nested #include
directives allowed by the C++ producer. This depth is given by the
include_depth
implementation quantity
mentioned above. Its value is fairly small
in order to detect recursive inclusions. The maximum depth can be
set using:
#pragma TenDRA includes depth integer-literal
A further check, for full pathnames in #include
directives
(which may not be portable), can be enabled using the directive:
#pragma TenDRA++ complete file includes allow
By default, multiple consistent definitions of a macro are allowed. This behaviour can be controlled using the directive:
#pragma TenDRA extra macro definition allow
The ISO C/C++ rules for determining whether two macro definitions are consistent are fairly restrictive. A more relaxed rule allowing for consistent renaming of macro parameters can be enabled using:
#pragma TenDRA weak macro equality allow
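For example, under the relaxed rule the following redefinition is accepted, since it is the original definition with its parameter consistently renamed (the strict ISO rule would reject it):
#pragma TenDRA extra macro definition allow
#pragma TenDRA weak macro equality allow

#define SQUARE( x )	( ( x ) * ( x ) )
#define SQUARE( y )	( ( y ) * ( y ) )	/* consistent up to renaming */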
In the definition of macros with parameters, a #
in the
replacement list must be followed by a parameter name, indicating
the stringising operation. This behaviour can be controlled by the
directive:
#pragma TenDRA no ident after # allow
which allows a
#
which is not followed by a parameter
name to be treated as a normal preprocessing token.
In a list of macro arguments, the effect of a sequence of preprocessing tokens which otherwise resembles a preprocessing directive is undefined. The C++ producer treats such directives as normal sequences of preprocessing tokens, but can be made to report such behaviour using:
#pragma TenDRA directive as macro argument allow
ISO C requires that a translation unit should contain at least one declaration. C++ and older dialects of C allow translation units which contain no declarations. This behaviour can be controlled using the directive:
#pragma TenDRA no external declaration allow
std namespace
Several classes declared in the std
namespace arise naturally
as part of the C++ language specification. These are as follows:
std::type_info		// type of typeid construct
std::bad_cast		// thrown by dynamic_cast construct
std::bad_typeid		// thrown by typeid construct
std::bad_alloc		// thrown by new construct
std::bad_exception	// used in exception specifications
The definitions of these classes are found, when needed, by looking up the appropriate class name in the
std
namespace.
Depending on the context, an error may be reported if the class is
not found. It is possible to modify the namespace which is searched
for these classes using the directive:
#pragma TenDRA++ set std namespace : scope-name
where scope-name can be an identifier giving a namespace name or
::
, indicating the global namespace.
If an object is declared with both external and internal linkage in the same translation unit then, by default, an error is given. This behaviour can be changed using the directive:
#pragma TenDRA incompatible linkage allow
When incompatible linkages are allowed, whether the resultant identifier has external or internal linkage can be set using one of the directives:
#pragma TenDRA linkage resolution : off
#pragma TenDRA linkage resolution : (external) on
#pragma TenDRA linkage resolution : (internal) on
It is possible to declare objects with external linkage in a block. C leaves it undefined whether declarations of the same object in different blocks, such as:
void f () {
	extern int a ;
	....
}

void g () {
	extern double a ;
	....
}
are checked for compatibility. However in C++ the one definition rule implies that such declarations are indeed checked for compatibility. The status of this check can be set using the directive:
#pragma TenDRA unify external linkage on
Note that it is not possible in ISO C or C++ to declare objects or functions with internal linkage in a block. While
static
object definitions in a block have a specific meaning, there is no
real reason why static
functions should not be declared
in a block. This behaviour can be enabled using the directive:
#pragma TenDRA block function static allow
Inline functions have external linkage by default in ISO C++, but internal linkage in older dialects. The default linkage can be set using the directive:
#pragma TenDRA++ inline linkage linkage-spec
where linkage-spec can be
external
or
internal
. Similarly const
objects have
internal linkage by default in C++, but external linkage in C. The
default linkage can be set using the directive:
#pragma TenDRA++ const linkage linkage-spec
Older dialects of C treated all identifiers with external linkage
as if they had been declared volatile
(i.e. by being
conservative in optimising such values). This behaviour can be enabled
using the directive:
#pragma TenDRA external volatile_t
It is possible to set the default language linkage using the directive:
#pragma TenDRA++ external linkage string-literal
This is equivalent to enclosing the rest of the current checking scope in:
extern string-literal {
	....
}
It is unspecified what happens if such a directive is used within an explicit linkage specification and does not nest correctly. This directive is particularly useful when used in a named environment associated with an include directory. For example, it can be used to express the fact that all the objects declared in headers included from that directory have C linkage.
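As a sketch (the environment and directory names are invented for the example, and the directory would be introduced with a -N command-line option), a profile might state that everything included from a directory of C headers has C linkage:
#pragma TenDRA begin name environment c_headers
#pragma TenDRA++ external linkage "C"
#pragma TenDRA end

#pragma TenDRA directory c_include use environment c_headers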
A change in ISO C++ relative to older dialects is that the language linkage of a function now forms part of the function type. For example:
extern "C" int f ( int ) ; int ( *pf ) ( int ) = f ; // errorThe directive:
#pragma TenDRA++ external function linkage oncan be used to control whether function types with differing language linkages, but which are otherwise compatible, are considered compatible or not.
By default, objects and functions with internal linkage are mapped to tags without external names in the output TDF capsule. Thus such names are not available to the installer and it needs to make up internal names to represent such objects in its output. This is not desirable in such operations as profiling, where a meaningful internal name is needed to make sense of the output. The directive:
#pragma TenDRA preserve identifier-list
can be used to preserve the names of the given list of identifiers with internal linkage. This is done using the
static_name_def
TDF construct. The form:
#pragma TenDRA preserve *
will preserve the names of all identifiers with internal linkage in this way.
ISO C++ requires every declaration or member declaration to introduce one or more names into the program. The directive:
#pragma TenDRA unknown struct/union allow
can be used to relax one particular instance of this rule, by allowing anonymous class definitions (recall that anonymous unions are objects, not types, in C++ and so are not covered by this rule). The C++ grammar also allows a solitary semicolon as a declaration or member declaration; however such a declaration does not introduce a name and so contravenes the rule above. The rule can be relaxed in this case using the directive:
#pragma TenDRA extra ; allow
Note that the C++ grammar explicitly allows for an extra semicolon following an inline member function definition, but that semicolons following other function definitions are actually empty declarations of the form above. A solitary semicolon in a statement is interpreted as an empty expression statement rather than an empty declaration statement.
int
The C "implicit int
" rule, whereby a type of
int
is inferred in a list of type or declaration specifiers which does
not contain a type name, has been removed in ISO C++, although it
was supported in older dialects of C++. This check is controlled
by the directive:
#pragma TenDRA++ implicit int type allow
Partial relaxations of this rule are allowed. The directive:
#pragma TenDRA++ implicit int type for const/volatile allow
will allow for implicit
int
when the list of type specifiers
contains a cv-qualifier. Similarly the directive:
#pragma TenDRA implicit int type for function return allow
will allow for implicit
int
in the return type of a function
definition (this excludes constructors, destructors and conversion
functions, where special rules apply). A function definition is the
only kind of declaration in ISO C where a declaration specifier is
not required. Older dialects of C allowed declaration specifiers to
be omitted in other cases. Support for this behaviour can be enabled
using:
#pragma TenDRA implicit int type for external declaration allow
The four cases can be demonstrated in the following example:
extern a ;		// implicit int
const b = 1 ;		// implicit const int
f ()			// implicit function return
{
	return 2 ;
}
c = 3 ;			// error: not allowed in C++
The long long
integral types are not part of ISO C or
C++ by default, however support for them can be enabled using the
directive:
#pragma TenDRA longlong type allow
This support includes allowing
long long
in type specifiers
and allowing LL
and ll
as integer literal
suffixes.
There is a further directive given by the two cases:
#pragma TenDRA set longlong type : long long
#pragma TenDRA set longlong type : long
which can be used to control the implementation of the
long
long
types. Either they can be mapped to the
default representation, which is guaranteed
to contain at least 64 bits, or they can be mapped to the corresponding
long
types.
Because these long long
types are not an intrinsic part
of C++ the implementation does not integrate them into the language
as fully as is possible. This is to prevent the presence or otherwise
of
long long
types affecting the semantics of code which
does not use them. For example, it would be possible to extend the
rules for the types of integer literals, integer promotion types and
arithmetic types to say that if the given value does not fit into
the standard integral types then the extended types are tried. This
has not been done, although these rules could be implemented by changing
the definitions of the standard tokens
used to determine these types. By default, only the rules for arithmetic
types involving a long long
operand and for LL
integer literals mention long long
types.
The C++ rules on bitfield types differ slightly from the C rules. Firstly any integral or enumeration type is allowed in a bitfield, and secondly the bitfield width may exceed the underlying type size (the extra bits being treated as padding). These properties can be controlled using the directives:
#pragma TenDRA extra bitfield int type allow
#pragma TenDRA bitfield overflow allow
respectively.
In elaborated type specifiers, the class key (class
,
struct
, union
or enum
) should
agree with any previous declaration of the type (except that class
and struct
are interchangeable). This requirement can
be relaxed using the directive:
#pragma TenDRA ignore struct/union/enum tag on
In ISO C and C++ it is not possible to give a forward declaration of an enumeration type. This constraint can be relaxed using the directive:
#pragma TenDRA forward enum declaration allow
Until the end of its definition, an enumeration type is treated as an incomplete type (as with class types). In enumeration definitions, and a couple of other contexts where comma-separated lists are required, the directive:
#pragma TenDRA extra , allow
can be used to allow a trailing comma at the end of the list.
The directive:
#pragma TenDRA complete struct/union analysis on
can be used to enable a check that every class or union has been completed within each translation unit in which it is declared.
C, but not C++, allows calls to undeclared functions, the function being declared implicitly. It is possible to enable support for implicit function declarations using the directive:
#pragma TenDRA implicit function declaration on
Such implicitly declared functions have C linkage and type
int ( ... )
.
The C producer supports a concept, weak prototypes, whereby type checking can be applied to the arguments of a non-prototype function. This checking can be enabled using the directive:
#pragma TenDRA weak prototype analysis on
The concept of weak prototypes is not applicable to C++, where all functions are prototyped. The C++ producer does allow the syntax for explicit weak prototype declarations, but treats them as if they were normal prototypes. These declarations are denoted by means of a keyword,
WEAK
say, introduced by the directive:
#pragma TenDRA keyword identifier for weak
preceding the
(
of the function declarator. The directives:
#pragma TenDRA prototype allow
#pragma TenDRA prototype (weak) allow
which can be used in the C producer to warn of prototype or weak prototype declarations, are similarly ignored by the C++ producer.
The C producer also allows the directives:
#pragma TenDRA argument type-id as type-id
#pragma TenDRA argument type-id as ...
#pragma TenDRA extra ... allow
#pragma TenDRA incompatible promoted function argument allow
which control the compatibility of function types. These directives are ignored by the C++ producer (some of them would make sense in the context of C++ but would over-complicate function overloading).
printf and scanf argument checking
The C producer includes a number of checks that the arguments in a
call to a function in the printf
or scanf
families match the given format string. The check is implemented
by using the directives:
#pragma TenDRA type identifier for ... printf
#pragma TenDRA type identifier for ... scanf
to introduce a type representing a
printf
or scanf
format string. For most purposes this type is treated as const
char *
, but when it appears in a function declaration it alerts
the producer that any extra arguments passed to that function should
match the format string passed as the corresponding argument. The
TenDRA API headers conditionally declare printf
,
scanf
and similar functions in something like the form:
#ifdef __NO_PRINTF_CHECKS
typedef const char *__printf_string ;
#else
#pragma TenDRA type __printf_string for ... printf
#endif

int printf ( __printf_string, ... ) ;
int fprintf ( FILE *, __printf_string, ... ) ;
int sprintf ( char *, __printf_string, ... ) ;
These declarations can be skipped, effectively disabling this check, by defining the
__NO_PRINTF_CHECKS
macro.
These printf
and scanf
format string checks
have not yet been implemented in the C++ producer due to the presence
of an alternative, type checked, I/O package - namely
<iostream>
. The format string types are simply
treated as const char *
.
C does not allow multiple definitions of a typedef
name,
whereas C++ allows multiple consistent definitions. This behaviour
can be controlled using the directive:
#pragma TenDRA extra type definition allow
The directive:
#pragma TenDRA incompatible type qualifier allow
allows objects to be redeclared with different cv-qualifiers (normally such redeclarations would be incompatible). The composite type is qualified using the join of the cv-qualifiers in the various redeclarations.
The directive:
#pragma TenDRA compatible type : type-id == type-id : allow
asserts that the given two types are compatible. Currently the only implemented version is
char * == void *
which enables
char *
to be used as a generic pointer as it was in older
dialects of C.
Some dialects of C allow incomplete arrays as member types. These are generally used as a place-holder at the end of a structure to allow for the allocation of an arbitrarily sized array. Support for this feature can be enabled using the directive:
#pragma TenDRA incomplete type as object type allow
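A typical use is a trailing array place-holder of unspecified size at the end of a structure:
#pragma TenDRA incomplete type as object type allow

struct blob {
	int len ;
	char body [] ;		/* incomplete array member */
} ;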
There are a number of directives which allow various classes of type conversion to be checked. The directives:
#pragma TenDRA conversion analysis (int-int explicit) on
#pragma TenDRA conversion analysis (int-int implicit) on
will check for unsafe explicit or implicit conversions between arithmetic types. Similarly conversions between pointers and arithmetic types can be checked using:
#pragma TenDRA conversion analysis (int-pointer explicit) on
#pragma TenDRA conversion analysis (int-pointer implicit) on
or equivalently:
#pragma TenDRA conversion analysis (pointer-int explicit) on
#pragma TenDRA conversion analysis (pointer-int implicit) on
Conversions between pointer types can be checked using:
#pragma TenDRA conversion analysis (pointer-pointer explicit) on
#pragma TenDRA conversion analysis (pointer-pointer implicit) on
There are some further variants which can be used to enable useful sets of conversion checks. For example:
#pragma TenDRA conversion analysis (int-int) on
enables both implicit and explicit arithmetic conversion checks. The directives:
#pragma TenDRA conversion analysis (int-pointer) on
#pragma TenDRA conversion analysis (pointer-int) on
#pragma TenDRA conversion analysis (pointer-pointer) on
are equivalent to their corresponding explicit forms (because the implicit forms are illegal by default). The directive:
#pragma TenDRA conversion analysis on
is equivalent to the four directives just given. It enables checks on implicit and explicit arithmetic conversions, explicit arithmetic to pointer conversions and explicit pointer conversions.
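For example, enabling the implicit arithmetic check as a warning reports shortening assignments such as the one below; the explicit cast is only reported if the corresponding explicit check is also enabled:
#pragma TenDRA conversion analysis (int-int implicit) warning

void f ( long l )
{
	int i = l ;		// implicit shortening conversion, warned
	int j = ( int ) l ;	// explicit conversion, not reported by this check
}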
The default settings for these checks are determined by the implicit and explicit conversions allowed in C++. Note that there are differences between the conversions allowed in C and C++. For example, an arithmetic type can be converted implicitly to an enumeration type in C, but not in C++. The directive:
#pragma TenDRA conversion analysis (int-enum implicit) on
can be used to control the status of this conversion. The level of severity for an error message arising from such a conversion is the maximum of the severity set by this directive and that set by the
int-int implicit
directive above.
The implicit pointer conversions described above do not include conversions
to and from the generic pointer void *
, which have their
own controlling directives. A pointer of type void *
can be converted implicitly to another pointer type in C but not in
C++; this is controlled by the directive:
#pragma TenDRA++ conversion analysis (void*-pointer implicit) on
The reverse conversion, from a pointer type to
void *
is allowed in both C and C++, and has a controlling directive:
#pragma TenDRA++ conversion analysis (pointer-void* implicit) on
In ISO C and C++, a function pointer can only be cast to other function
pointers, not to object pointers or void *
. Many dialects
however allow function pointers to be cast to and from other pointers.
This behaviour can be controlled using the directive:
#pragma TenDRA function pointer as pointer allow
which causes function pointers to be treated in the same way as all other pointers.
The integer conversion checks described above only apply to unsafe conversions. A simple-minded check for shortening conversions is not adequate, as is shown by the following example:
char a = 1, b = 2 ;
char c = a + b ;
the sum
a + b
is evaluated as an int
which
is then shortened to a char
. Any check which does not
distinguish this sort of "safe" shortening conversion from
unsafe shortening conversions such as:
int a = 1, b = 2 ;
char c = a + b ;
is not likely to be very useful. The producer therefore associates two types with each integral expression; the first is the normal, representation type and the second is the underlying, semantic type. Thus in the first example, the representation type of
a + b
is int
, but semantically it is still a char
.
The conversion analysis is based on the semantic types.
The C producer supports a directive:
#pragma TenDRA keyword identifier for type representation
whereby a keyword can be introduced which can be used to explicitly declare a type with given representation and semantic components. Unfortunately this makes the C++ grammar ambiguous, so it has not yet been implemented in the C++ producer.
It is possible to allow individual conversions by means of conversion tokens. A procedure token which takes one rvalue expression program parameter and returns an rvalue expression, such as:
#pragma token PROC ( EXP : t : ) EXP : s : conv #
can be regarded as mapping expressions of type
t
to expressions
of type s
. The directive:
#pragma TenDRA conversion identifier-list allow
can be used to nominate such a token as a conversion token. That is to say, if the conversion, whether explicit or implicit, from
t
to s
cannot be done by other means, it is done by applying
the token conv
, so:
t a ;
s b = a ;		// maps to conv ( a )
Note that, unlike conversion functions, conversion tokens can be applied to any types.
ISO C++ introduces the constructs static_cast
,
const_cast
and reinterpret_cast
, which can
be used in various contexts where an old style explicit cast would
previously have been used. By default, an explicit cast can perform
any combination of the conversions performed by these three constructs.
To aid migration to the new style casts the directives:
#pragma TenDRA++ explicit cast as cast-state allow
#pragma TenDRA++ explicit cast allow
where cast-state is defined as follows:

  cast-state :
	  static_cast
	  const_cast
	  reinterpret_cast
	  static_cast | cast-state
	  const_cast | cast-state
	  reinterpret_cast | cast-state

can be used to restrict the conversions which can be performed using explicit casts. The first form sets the interpretation of explicit cast to be combinations of the given constructs; the second resets the interpretation to the default. For example:
#pragma TenDRA++ explicit cast as static_cast | const_cast allow
means that conversions requiring
reinterpret_cast
(the
most unportable conversions) will not be allowed to be performed using
explicit casts, but will have to be given as a reinterpret_cast
construct. Changing allow
to warning
will
also cause a warning to be issued for every explicit cast expression.
The directive:
#pragma TenDRA ident ... allow
may be used to enable or disable the use of
...
as a
primary expression in a function defined with ellipsis. The type
of such an expression is implementation defined. This expression
is used in the definition of the va_start
macro in the <stdarg.h>
header. This header
automatically enables this switch.
Older dialects of C++ did not report ambiguous overloaded function resolutions, but instead resolved the call to the first of the most viable candidates to be declared. This behaviour can be controlled using the directive:
#pragma TenDRA++ ambiguous overload resolution allow
There are occasions when the resolution of an overloaded function call is not clear. The directive:
#pragma TenDRA++ overload resolution allow
can be used to report the resolution of any such call (whether explicit or implicit) where there is more than one viable candidate.
An interesting consequence of compiling C++ in a target independent manner is that certain overload resolutions can only be determined at install-time. For example, in:
int f ( int ) ;
int f ( unsigned int ) ;
int f ( long ) ;
int f ( unsigned long ) ;

int a = f ( sizeof ( int ) ) ;	// which f?
the type of the
sizeof
operator, size_t
,
is target dependent, but its promotion must be one of the types
int
, unsigned int
, long
or
unsigned long
. Thus the call to f
always
has a unique resolution, but what it is is target dependent. The
equivalent directives:
#pragma TenDRA++ conditional overload resolution allow
#pragma TenDRA++ conditional overload resolution (complete) allow
can be used to warn about such target dependent overload resolutions. By default, such resolutions are only allowed if there is a unique resolution for each possible implementation of the argument types (note that, for simplicity, the possibility of
long long
implementation types is ignored). The directive:
#pragma TenDRA++ conditional overload resolution (incomplete) allow
can be used to allow target dependent overload resolutions which only have resolutions for some of the possible implementation types (if one of the
f
declarations above was removed, for example).
If the implementation does not match one of these types then an install-time
error is given.
There are restrictions on the set of candidate functions involved
in a target dependent overload resolution. Most importantly, it should
be possible to bring their return types to a common type, as if by
a series of ?:
operations. This common type is the type
of the target dependent call. By this means, target dependent types
are prevented from propagating further out into the program. Note
that since sets of overloaded functions usually have the same semantics,
this does not usually present a problem.
The directive:
#pragma TenDRA operator precedence analysis on
can be used to enable a check for expressions where the operator precedence is not necessarily what might be expected. The intended precedence can be clarified by means of explicit parentheses. The precedence levels checked are as follows:
&&
versus ||
.
<<
and >>
versus binary
+
and -
.
&
versus binary +
, -
,
==
, !=
, >
, >=
,
<
and <=
.
^
versus binary &
, +
,
-
, ==
, !=
, >
,
>=
, <
and <=
.
|
versus binary ^
, &
,
+
, -
, ==
, !=
,
>
, >=
, <
and <=
.
a < b < c
which do not have their normal mathematical meaning. For example,
in:
d = a << b + c ; // precedence is a << ( b + c )the precedence is counter-intuitive, although strangely enough, it isn't in:
cout << b + c ; // precedence is cout << ( b + c )
Other dubious arithmetic operations can be checked for using the directive:
#pragma TenDRA integer operator analysis on
This includes checks for operations, such as division by a negative value, which are implementation dependent, and those such as testing whether an unsigned value is less than zero, which serve no purpose. Similarly the directive:
#pragma TenDRA++ pointer operator analysis on
checks for dubious pointer operations. This includes very simple bounds checking for arrays and checking that only the simple literal
0
is used in null pointer constants:
char *p = 1 - 1 ; // valid, but weird
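A brief sketch of the kinds of construct these two checks are intended to pick up:
#pragma TenDRA integer operator analysis on
#pragma TenDRA++ pointer operator analysis on

void f ( unsigned u )
{
	int a [10] ;
	int *p = a + 12 ;	// beyond the end of a, reported
	if ( u < 0 ) return ;	// always false, serves no purpose, reported
}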
The directive:
#pragma TenDRA integer overflow analysis on
is used to control the treatment of overflows in the evaluation of integer constant expressions. This includes the detection of division by zero.
C, but not C++, only allows constant expressions in static initialisers. The directive:
#pragma TenDRA variable initialization allow
can be used to enable support for C++-style dynamic initialisers. Conversely, it can be used in C++ to detect such dynamic initialisers.
In older dialects of C it was not possible to initialise an automatic variable of structure or union type. This can be checked for using the directive:
#pragma TenDRA initialization of struct/union (auto) allow
The directive:
#pragma TenDRA++ complete initialization analysis on
can be used to check aggregate initialisers. The initialiser should be fully bracketed (i.e. with no elision of braces), and should have an entry for each member of the structure or array.
C++ defines the results of several operations to be lvalues, whereas they are rvalues in C. The directive:
#pragma TenDRA conditional lvalue allow
is used to apply the C++ rules for lvalues in conditional (
?:
)
expressions.
Older dialects of C++ allowed this
to be treated as an
lvalue. It is possible to enable support for this dialect feature
using the directive:
#pragma TenDRA++ this lvalue allow
however it is recommended that programs using this feature should be modified.
The directive:
#pragma TenDRA discard analysis on
can be used to enable a check for values which are calculated but not used. There are three checks controlled by this directive, each of which can be controlled independently. The directive:
#pragma TenDRA discard analysis (function return) on
checks for functions which return a value which is not used. The check needs to be enabled for both the declaration and the call of the function in order for a discarded function return to be reported. Discarded returns for overloaded operator functions are never reported. The directive:
#pragma TenDRA discard analysis (value) on
checks for other expressions which are not used. Finally, the directive:
#pragma TenDRA discard analysis (static) on
checks for variables with internal linkage which are defined but not used.
An unused function return or other expression can be asserted to be
deliberately discarded by explicitly casting it to void
or, equivalently, preceding it by a keyword introduced using the directive:
#pragma TenDRA keyword identifier for discard value
A static variable can be asserted to be deliberately unused by including it in a list of identifiers in a directive of the form:
#pragma TenDRA suspend static identifier-list
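A sketch of how a discarded return value can be reported and then explicitly acknowledged (the keyword name IGNORE is chosen for this example):
#pragma TenDRA discard analysis on
#pragma TenDRA keyword IGNORE for discard value

extern int f ( void ) ;

void g ( void )
{
	f ( ) ;			// discarded function return, reported
	( void ) f ( ) ;	// explicitly discarded
	IGNORE f ( ) ;		// likewise, using the introduced keyword
}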
The directive:
#pragma TenDRA const conditional allow
can be used to enable a check for constant expressions used in conditional contexts. A literal constant is allowed in the condition of a
while
, for
or do
statement to allow for
such common constructs as:
while ( true ) {
	// while statement body
}
and target dependent constant expressions are allowed in the condition of an
if
statement, but otherwise constant conditions
are reported according to the status of this check.
The common error of writing =
rather than ==
in conditions can be detected using the directive:
#pragma TenDRA assignment as bool allow
which can be used to disallow such assignment expressions in contexts where a boolean is expected. The error message can be suppressed by enclosing the assignment within parentheses.
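For example:
#pragma TenDRA assignment as bool disallow

extern void g ( void ) ;

void f ( int a, int b )
{
	if ( a = b ) g ( ) ;		// reported: assignment used as a condition
	if ( ( a = b ) ) g ( ) ;	// parentheses suppress the message
	if ( a == b ) g ( ) ;		// presumably what was intended
}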
Another common error associated with iteration statements, particularly with certain heretical brace styles, is the accidental insertion of an extra semicolon as in:
for ( init ; cond ; step ) ; {
	// for statement body
}
The directive:
#pragma TenDRA extra ; after conditional allow
can be used to enable a check for such suspicious empty iteration statement bodies (it actually checks for
;{
).
A switch
statement is said to be exhaustive if its control
statement is guaranteed to take one of the values of its
case
labels, or if it has a default
label.
The TenDRA C and C++ producers allow a switch
statement
to be asserted to be exhaustive using the syntax:
switch ( cond ) EXHAUSTIVE {
	// switch statement body
}
where
EXHAUSTIVE
is either the directive:
#pragma TenDRA exhaustive
or a keyword introduced using:
#pragma TenDRA keyword identifier for exhaustive
Knowing whether a
switch
statement is exhaustive or not
means that checks relying on flow analysis (including variable usage
checks) can be applied more precisely.
In certain circumstances it is possible to deduce whether a
switch
statement is exhaustive or not. For example,
the directive:
#pragma TenDRA enum switch analysis on
enables a check on
switch
statements on values of enumeration
type. Such statements should be exhaustive, either explicitly by
using the EXHAUSTIVE
keyword or declaring a
default
label, or implicitly by having a case
label for each enumerator. Conversely, the value of each case
label should equal the value of an enumerator. For the purposes of
this check, boolean values are treated as if they were declared using
an enumeration type of the form:
enum bool { false = 0, true = 1 } ;
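A minimal sketch (the keyword name EXHAUSTIVE is introduced purely for the example):
#pragma TenDRA enum switch analysis on
#pragma TenDRA keyword EXHAUSTIVE for exhaustive

enum colour { red, green, blue } ;

void f ( enum colour c )
{
	switch ( c ) EXHAUSTIVE {	// asserted exhaustive
		case red :
		case green :
			break ;
	}				// reported: no case label for 'blue'
}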
A common source of errors in switch
statements is the
fall-through from one case
or default
statement to the next. A check for this can be enabled using:
#pragma TenDRA fall into case allow
case
or default
labels where fall-through
from the previous statement is intentional can be marked by preceding
them by a keyword, FALL_THRU
say, introduced using the
directive:
#pragma TenDRA keyword identifier for fall into case
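For example, with the check enabled as a warning and a keyword FALL_THRU introduced as above (the function names are purely illustrative):
#pragma TenDRA fall into case warning
#pragma TenDRA keyword FALL_THRU for fall into case

extern void prepare ( void ) ;
extern void finish ( void ) ;

void f ( int n )
{
	switch ( n ) {
		case 0 :
			prepare ( ) ;
		FALL_THRU case 1 :	// intentional fall-through, not reported
			finish ( ) ;
			break ;
	}
}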
In ISO C++ the scope of a variable declared in a for-init-statement
is the body of the for
statement; in older dialects it
extended to the end of the enclosing block. So:
for ( int i = 0 ; i < 10 ; i++ ) { // for statement body } return i ; // OK in older dialects, error in ISO C++This behaviour is controlled by the directive:
#pragma TenDRA++ for initialization block ona state of
on
corresponding to the ISO rules and
off
to the older rules. Perhaps most useful is the
warning
state which implements the old rules but gives
a warning if a variable declared in a for-init-statement is used outside
the corresponding for
statement body. A program which
does not give such warnings should compile correctly under either
set of rules.
In C, but not in C++, it is possible to have a return
statement without an expression in a function which does not return
void
. It is possible to enable this behaviour using
the directive:
#pragma TenDRA incompatible void return allow
Note that this check includes the implicit
return
caused
by falling off the end of a function. The effect of such a
return
statement is undefined. The C++ rule that falling
off the end of main
is equivalent to returning a value
of 0 overrides this check.
The directive:
#pragma TenDRA unreachable code allow
enables a flow analysis check to detect unreachable code. It is possible to assert that a statement is reached or not reached by preceding it by a keyword introduced by one of the directives:
#pragma TenDRA keyword identifier for set reachable
#pragma TenDRA keyword identifier for set unreachable
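A sketch of the sort of use this is intended for; the severity chosen and the keyword name NOT_REACHED are illustrative, and fatal stands for some function which never returns:
#pragma TenDRA unreachable code warning
#pragma TenDRA keyword NOT_REACHED for set unreachable

extern void fatal ( const char * ) ;

int f ( int n )
{
	if ( n < 0 ) {
		fatal ( "negative argument" ) ;
		NOT_REACHED return 0 ;	// asserts that this statement is never reached
	}
	return n ;
}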
The fact that certain functions, such as exit
, do not
return a value can be exploited in the flow analysis routines. The
equivalent directives:
#pragma TenDRA bottom identifier
#pragma TenDRA++ type identifier for bottom
can be used to introduce a
typedef
declaration for the
type, bottom, returned by such functions. The TenDRA API headers
declare
exit
and similar functions in this way, for example:
#pragma TenDRA bottom __bottom

__bottom exit ( int ) ;
__bottom abort ( void ) ;
The bottom type is compatible with
void
in function declarations
to allow such functions to be redeclared in their conventional form.
The directive:
#pragma TenDRA variable analysis on
enables checks on the uses of automatic variables and function parameters. These checks detect problems such as those illustrated by the variables a, b, c and d respectively in:
void f ()
{
	int a ;			// a never used
	int b ;
	int c = b ;		// b not initialised
	c = 0 ;			// c assigned to twice
	int d = 0 ;
	d = ++d ;		// d assigned to twice
}
The second, and more particularly the third, of these checks requires some fairly sophisticated flow analysis, so any hints which can be picked up from exhaustive
switch
statements etc. are likely to increase the accuracy of the errors
detected.
In a non-static member function the various non-static data members are analysed as if they were automatic variables. It is checked that each member is initialised in a constructor. A common source of initialisation problems in a constructor is that the base classes and members are initialised in the canonical order of virtual bases, non-virtual direct bases and members in the order of their declaration, rather than in the order in which their initialisers appear in the constructor definition. Therefore a check that the initialisers appear in the canonical order is also applied.
It is possible to change the state of a variable during the variable analysis using the directives:
#pragma TenDRA set expression
#pragma TenDRA discard expression
The first asserts that the variable given by the expression has been assigned to; the second asserts that the variable is not used. An alternative way of expressing this is by means of keywords:
SET ( expression )
DISCARD ( expression )
introduced using the directives:
#pragma TenDRA keyword identifier for set
#pragma TenDRA keyword identifier for discard variable
respectively. These expressions can appear in expression statements and as the first argument of a comma expression.
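A sketch of how these assertions might be used; the keyword names SET and DISCARD follow the text above, and the code is purely illustrative:
#pragma TenDRA variable analysis on
#pragma TenDRA keyword SET for set
#pragma TenDRA keyword DISCARD for discard variable

extern void g ( int ) ;

void f ( int n, int unused )
{
	int r ;
	if ( n > 0 ) r = 1 ;
	if ( n <= 0 ) r = 0 ;
	SET ( r ) ;		// asserts that r has been assigned to on every path
	DISCARD ( unused ) ;	// asserts that this parameter is deliberately unused
	g ( r ) ;
}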
The variable flow analysis checks have not yet been completely implemented. They may not detect errors in certain circumstances and for extremely convoluted code may occasionally give incorrect errors.
The directive:
#pragma TenDRA variable hiding analysis on
can be used to enable a check for hiding of other variables and, in member functions, data members, by local variable declarations.
The ISO C++ rules do not require exception specifications to be checked statically. This is to facilitate the integration of large systems where a single change in an exception specification could have ramifications throughout the system. However it is often useful to apply such checks, which can be enabled using the directive:
#pragma TenDRA++ throw analysis on
This detects any potentially uncaught exceptions and other exception problems. In the error messages arising from this check, an uncaught exception of type
...
means that an uncaught exception
of an unknown type (arising, for example, from a function without
an exception specification) may be thrown. For example:
void f ( int ) throw ( int ) ;
void g ( int ) throw ( long ) ;
void h ( int ) ;

void e () throw ( int )
{
	f ( 1 ) ;	// OK
	g ( 2 ) ;	// uncaught 'long' exception
	h ( 3 ) ;	// uncaught '...' exception
}
The C++ producer makes the distinction between exported templates,
which may be used in one module and defined in another, and non-exported
templates, which must be defined in every module in which they are
used. As in the ISO C++ standard, the export
keyword
is used to distinguish between the two cases. In the past, different
compilers have had different template compilation models; either all
templates were exported or no templates were exported. The latter
is easily emulated - if the export
keyword is not used
then no templates will be exported. To emulate the former behaviour
the directive:
#pragma TenDRA++ implicit export template on
can be used to treat all templates as if they had been declared using the
export
keyword.
The automatic instantiation of exported templates has not yet been implemented correctly. It is intended that such instantiations will be generated during intermodule analysis (where they conceptually belong). At present it is necessary to work round this using explicit instantiations.
Several checks of varying utility have been implemented in the C++ producer but do not as yet have individual directives controlling their use. These can be enabled en masse using the directive:
#pragma TenDRA++ catch all allow
It is intended that this directive will be phased out as these checks are assigned controlling directives. It is possible to achieve finer control over these checks by enabling their individual error messages as described above.