It is expected that the general user will have little direct contact with the token syntax, instead using the abstract standard headers provided or using the tspec tool [Ref. 5] to generate their own token interface header files automatically. However, it may occasionally be necessary to use the raw power of the token syntax directly.
As an example of the power of the token syntax consider the program below:
#pragma token TYPE FILE#
#pragma token EXP rvalue:FILE *:stderr#
int fprintf(FILE *, const char *, ...);
void f(void) {
	fprintf(stderr, "hello world\n");
}

The first line of the program introduces a token, FILE, for a type. By using its identification, FILE, this token can be used wherever a type could have been used throughout the rest of the program. The compiler can then compile this program to TDF (the abstract TenDRA machine) even though it contains an undefined type. This is fundamental to the construction of portable software, where the developer cannot assume the definitions of various types as they may be different on different machines.
The second line of the example, which introduces a token for an expression, is somewhat more complicated. In order to make use of an expression, it is necessary to know its type and whether or not it is an lvalue (i.e. whether or not it can be assigned to). As can be seen from the example however, it is not necessary to know the exact type of the expression because a token can be used to represent its type.
The TenDRA compiler makes no assumptions about the possible definitions of tokens and will raise an error if a program requires information about an undefined token. In this way many errors resulting from inadvertent use of a definition present on the developer's system can be detected. For example, developers often assume that the type FILE will be implemented by a structure type when in fact the ISO C standard permits the implementation of FILE by any type. In the program above, any attempt to access members of stderr would cause the compiler to raise an error.
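For example, in a sketch like the following (the member name _flags is hypothetical, since the ISO C standard does not specify any):

#pragma token TYPE FILE#
#pragma token EXP rvalue:FILE *:stderr#
void g(void) {
	stderr->_flags = 0;	/* error: the members of FILE are unknown */
}

the member access is rejected because FILE is an undefined token type.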
F.2 Program construction using TDF

Traditional program construction using the C language has two phases: compilation and linking. In the compilation phase the source text written in the C language is mapped to an object code format. This object code is generally not complete in itself and must be linked with other program segments such as definitions from the system libraries.

When tokens are involved there is an extra stage in the construction process where undefined tokens in one program segment are linked with their definitions in another program segment. To summarise, program construction using TDF and the TenDRA tools has four basic operations: compilation of C source to TDF; TDF linking, in which undefined tokens are linked with their definitions; translation of TDF to object code; and linking of the object code with the system libraries.
F.3 The token syntax
The token syntax is an extension to the ISO C standard language to
allow the use of tokens to represent program constructs. Tokens can
be used either in place of, or as well as, the definitions required
by a program. In the latter case, the tokens exist merely to enforce
correct definitions and usage of the objects they reference. However
it should be noted that the presence of a token introduction can alter
the semantics of a program (examples are given in F.5
Expression tokens). The semantics have been altered to force programs
to respect token interfaces where they would otherwise fail to do
so.
The token syntax takes the following basic form:
#pragma token token-introduction token-identification

It is introduced as a pragma to allow other compilers to ignore it, though if tokens are being used to replace the definitions needed by a program, ignoring these pragmas will generally cause the compilation to fail.
The token-introduction defines the kind of token being introduced along with any additional information associated with that kind of token. Currently there are five kinds of token that can be introduced, corresponding approximately to expressions, statements, type-names, member designators and function-like macros.
The token-identification provides the means of referring to the token, both internally within the program and externally for TDF linking purposes.
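For illustration, one introduction of each of the first four kinds might look as follows (all the identifiers here are hypothetical; the function-like-macro kind, procedure tokens, is covered in F.9 Procedure tokens):

#pragma token EXP rvalue : int : max_val#	/* expression */
#pragma token STATEMENT init_all#		/* statement */
#pragma token TYPE size_type#			/* type-name */
#pragma token STRUCT rec_t#
#pragma token MEMBER int : rec_t : count#	/* member designator */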
F.4 Token identification
The syntax for the token-identification is as follows:
token-identification:
	name-space_opt identifier # external-identifier_opt

name-space:
	TAG
There is a default name space associated with each kind of token and internal identifiers for tokens generally reside in these default name spaces. The five name spaces concerned are the macro, ordinary, tag, member and label name spaces (compare the discussion of name spaces in the ISO C standard, section 6.1.2.3). The exception is compound type-token identifiers (see F.7.4 Compound type tokens), which by default reside in the ordinary name space but can be forced to reside in the tag name space by setting the optional name-space to be TAG.
The first identifier of the token-identification provides the internal identification of the token. This is the name used to identify the token within the program. It must be followed by a #.
All further preprocessing tokens until the end of the line are treated as part of the external-identifier with non-empty white space sequences being replaced by a single space. The external-identifier specifies the external identification of the token which is used for TDF linking. External token identifications reside in their own name space which is distinct from the external name space for functions and objects. This means that it is possible to have both a function and a token with the same external identification. If the external-identifier is omitted it is assumed that the internal and external identifications are the same.
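For example, the introduction below is a sketch (the external identification ansi.stdio.FILE is illustrative):

#pragma token TYPE FILE # ansi.stdio.FILE#

It makes the token available within the program as FILE, while for TDF linking purposes it is identified as ansi.stdio.FILE.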
F.5 Expression tokens
There are various properties associated with expression tokens which
are used to determine the operations that may be performed upon them.
The syntax for introducing expression tokens is:
exp-token:
	EXP exp-storage : type-name :
	NAT

exp-storage:
	rvalue
	lvalue
	const
Expression tokens can be introduced using either the EXP or NAT token introductions. Expression tokens introduced using NAT are constant value designations of type int, i.e. they reference constant integer expressions. All other expression tokens are assumed to be non-constant and are introduced using EXP.
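A NAT token may be used wherever a constant integer expression is required, such as an array bound (the token name n is hypothetical; the same idiom appears in F.7.1 below):

#pragma token NAT n#
typedef int arr[n];	/* n references a constant integer expression */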
All internal expression token identifiers must reside in the macro name space and this is consequently the default name space for such identifiers. Hence the optional name-space, TAG, should not be present in an EXP token introduction. Although the use of an expression token after it has been introduced is very similar to that of an ordinary identifier, it resides in the macro name space and so has some additional, macro-like properties; for example, it can be removed with #undef (see F.9.3 Function procedure tokens).

The exp-storage is either lvalue or rvalue. If it is lvalue, then the token is an object designation without type qualification. If it is rvalue, then the token is either a value or a function designation, depending on whether or not its type is a function type.
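A short sketch of the distinction (both token names are hypothetical):

#pragma token EXP lvalue : int : count#
#pragma token EXP rvalue : int : limit#
void h(void) {
	count = limit;	/* count may be assigned to; limit may not */
}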
In order to make use of tokenised expressions, a new symbol, exp-token-name, has been introduced at translation phase seven of the syntax analysis as defined in the ISO C standard. When an expression token identifier is encountered by the preprocessor, an exp-token-name symbol is passed through to the syntax analyser. An exp-token-name provides information about an expression token in the same way that a typedef-name provides information about a type introduced using a typedef. This symbol can only occur as part of a primary-expression (ISO C standard section 6.3.1) and the expression resulting from the use of exp-token-name will have the type, designation and constancy specified in the token introduction.
As an example, consider the pragma:
#pragma token EXP rvalue : int : x#
This introduces a token for an expression which is a value designation
of type int with internal and external name x.
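Such a token can then appear wherever an int rvalue is expected, as in this sketch (the function f is hypothetical):

#pragma token EXP rvalue : int : x#
int f(int y) {
	return y + x;	/* x behaves as an int value designation */
}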
Expression tokens can either be defined using #define statements or by using externals. They can also be resolved as a result of applying the type-resolution or assignment-resolution operators (see F.7.5 Type token compatibility, definitions etc.). Expression token definitions are subject to the following constraints:

- if the exp-token-name refers to a constant expression (i.e. it was introduced using the NAT token introduction), then the defining expression must also be a constant expression as expressed in the ISO C standard, section 6.4;

- if the exp-token-name refers to an lvalue expression, then the defining expression must also designate an object and the type of the expression token must be resolvable to the type of the defining expression. All the type qualifiers of the defining expression must appear in the object designation of the token introduction;

- if the exp-token-name refers to an expression that has function designation, then the type of the expression token must be resolvable to the type of the defining expression.

The program below provides two examples of the violation of the second constraint.
#pragma token EXP lvalue : int : i#
extern short k;
#define i 6
#define i k
The expression token i is an object designation of type int. The first
violation occurs because the expression, 6, does not designate an
object. The second violation is because the type of the token expression,
i, is int which cannot be resolved to the type short.
If the exp-token-name refers to an expression that designates a value, then the defining expression is converted, as if by assignment, to the type of the expression token using the assignment-resolution operator (see F.7.5 Type token compatibility, definitions etc.). With all other designations the defining expression is left unchanged. In both cases the resulting expression is used as the definition of the expression token. This can subtly alter the semantics of a program. Consider the program:
#pragma token EXP rvalue:long:li#
#define li 6
int f() {
return sizeof(li);
}
The definition of the token li causes the expression, 6, to be converted to long (this is essential to separate the use of li from its definition). The function, f, then returns sizeof(long). If the token introduction were absent, however, f would return sizeof(int).
Although they look similar, expression token definitions using #defines are not quite the same as macro definitions. A macro can be defined by any preprocessing tokens, which are then computed in phase 3 of translation as defined in the ISO C standard, whereas tokens are defined by assignment-expressions which are computed in phase 7. One of the consequences of this is illustrated by the program below:
#pragma token EXP rvalue:int:X#
#define X M+3
#define M sizeof(int)
int f(int x) { return (x+X); }
If the token introduction of X is absent, the program above will compile because, at the time the definition of X is interpreted (when evaluating x+X), both M and X are in scope. When the token introduction is present the compilation will fail because the definition of X, being part of translation phase 7, is interpreted when it is encountered, and at this stage M is not defined. This can be rectified by reversing the order of the definitions of X and M, or by bracketing the definition of X, i.e.
#define X (M+3)
Conversely consider:
#pragma token EXP rvalue:int:X#
#define M sizeof(int)
#define X M+3
#undef M
int M(int x) { return (x+X); }
The definition of X is computed at line 3, when M is in scope, not at line 5 where it is used. Token definitions can be used in this way to relieve some of the pressures on name spaces by undefining macros that are only used in token definitions. This facility should be used with care as it may not be a straightforward matter to convert the program back to a conventional C program.
Expression tokens can also be defined by declaring the exp-token-name that references the token to be an object with external linkage, e.g.
#pragma token EXP lvalue:int:x#
extern int x;
The semantics of this program are effectively the same as the semantics
of:
#pragma token EXP lvalue:int:x#
extern int _x;
#define x _x
F.6 Statement tokens
The syntax for introducing a statement token is simply:
stat-token:
	STATEMENT

The use of statement tokens is analogous to the use of expression tokens (see F.5 Expression tokens). A new symbol, stat-token-name, has been introduced into the syntax analysis at phase 7 of translation as defined in the ISO C standard. This token is passed through to the syntax analyser whenever the preprocessor encounters an identifier referring to a statement token. A stat-token-name can only occur as part of the statement syntax (ISO C standard, section 6.6).

Consider the program below:
#pragma token STATEMENT init_globs#
int g(int);
int f(int x) { init_globs return g(x);}
Internal statement token identifiers reside in the macro name space. The optional name-space, TAG, should not appear in statement token introductions.

As with expression tokens, statement tokens are defined using #define statements. An example of this is shown below:
#pragma token STATEMENT i_globs#
#define i_globs {int i=x;x=3;}
The constraints on the definition of statement tokens are broadly analogous to those on the definition of expression tokens.
The semantics of the defining statement are precisely the same as
the semantics of a compound statement forming the definition of a
function with no parameters and void result. The definition of statement
tokens carries the same implications for phases of translation as
the definition of expression tokens (see F.5 Expression
tokens).
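For example, in the sketch below (flag and ZERO are hypothetical), the token definition is computed in phase 7 at the point of the #define, so the subsequent #undef of ZERO is harmless:

#pragma token STATEMENT clear_flag#
extern int flag;
#define ZERO 0
#define clear_flag {flag = ZERO;}
#undef ZERO	/* the token definition has already been computed */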
F.7 Type tokens
Type tokens are used to introduce references to types. The ISO C standard, section 6.1.2.5, identifies a classification of types (integral types, floating types, pointer types, array types, structure and union types, function types and so on) and groups these into broader classifications such as scalar, arithmetic and aggregate types.
The classification of a type determines which operations are permitted
on objects of that type. For example, the ! operator can only be applied
to objects of scalar type. In order to reflect this, there are several
type token introductions which can be used to classify the type to
be referenced, so that the compiler can perform semantic checking
on the application of operators. The possible type token introductions
are:
type-token:
TYPE
VARIETY
ARITHMETIC
STRUCT
UNION
F.7.1 General type tokens
The most general type token introduction is TYPE. This introduces
a type of unknown classification which can be defined to be any C
type. Only a few generic operations can be applied to such type tokens,
since the semantics must be defined for all possible substituted types.
Assignment and function argument passing are effectively generic operations,
apart from the treatment of array types. For example, according to
the ISO C standard, even assignment is not permitted if the left operand
has array type and we might therefore expect assignment of general
token types to be illegal. Tokens introduced using the TYPE token
introduction can thus be regarded as representing non-array types
with extensions to represent array types provided by applying non-array
semantics as described below.
Once general type tokens have been introduced, they can be used to construct derived declarator types in the same way as conventional type declarators. For example:
#pragma token TYPE t_t#
#pragma token TYPE t_p#
#pragma token NAT n#
typedef t_t *ptr_type;		/* introduces pointer type */
typedef t_t fn_type(t_p);	/* introduces function type */
typedef t_t arr_type[n];	/* introduces array type */
The default name space for the internal identifiers for general type tokens is the ordinary name space and all such identifiers must reside in this name space. The local identifier behaves exactly as if it had been introduced with a typedef statement and is thus treated as a typedef-name by the syntax analyser.
The only standard conversion that can be performed on an object of
general token type is the lvalue conversion (ISO C standard section
6.2). Lvalue conversion of an object with general token type is defined
to return the item stored in the object. The semantics of lvalue conversion
are thus fundamentally altered by the presence of a token introduction.
If type t_t is defined to be an array type, the lvalue conversion of an object of type t_t will deliver a pointer to the first array element. If, however, t_t is defined to be a general token type, which is later defined to be an array type, lvalue conversion on an object of type t_t will deliver the components of the array.

This definition of lvalue conversion for general token types is used to allow objects of general tokenised types to be assigned to and passed as arguments to functions. The extensions to the semantics of function argument passing and assignment are as follows: if the type token is defined to be an array then the components of the array are assigned and passed as arguments to the function call; in all other cases the assignment and function call are the same as if the defining type had been used directly.

F.7.2 Integral type tokens
The token introduction VARIETY is used to introduce a token representing
an integral type. A token introduced in this way can only be defined
as an integral type and can be used wherever an integral type is valid.
Values which have integral tokenised types can be converted to any scalar type (see F.7 Type tokens). Similarly values with any scalar type can be converted to a value with a tokenised integral type. The semantics of these conversions are exactly the same as if the type defining the token were used directly. Consider:
#pragma token VARIETY i_t#
short f(void) {
i_t x_i = 5;
return x_i;
}
short g(void) {
long x_i = 5;
return x_i;
}
Within the function f there are two conversions: the value, 5, of type int, is converted to i_t, the tokenised integral type, and a value of tokenised integral type i_t is converted to a value of type short. If the type i_t were defined to be long then the function f would be exactly equivalent to the function g.

The usual arithmetic conversions described in the ISO C standard (section 6.3.1.5) are defined on integral type tokens and are applied where required by the ISO C standard.

The integral promotions are defined according to the rules introduced in Chapter 4. These promotions are first applied to the integral type token and then the usual arithmetic conversions are applied to the resulting type.

As with general type tokens, integral type tokens can only reside in their default name space, the ordinary name space (the optional name-space, TAG, cannot be specified in the token introduction). They also behave as though they had been introduced using a typedef statement.

F.7.3 Arithmetic type tokens
The token introduction ARITHMETIC introduces an arithmetic type token. In theory, such tokens can be defined by any arithmetic type, but the current implementation of the compiler only permits them to be defined by integral types. These type tokens are thus exactly equivalent to the integral type tokens introduced using the token introduction VARIETY.

F.7.4 Compound type tokens
For the purposes of this document, a compound type is a type describing
objects which have components that are accessible via member selectors.
All structure and union types are thus compound types, but, unlike
structure and union types in C, compound types do not necessarily
have an ordering on their member selectors. In particular, this means
that some objects of compound type cannot be initialised with an initialiser-list
(see ISO C standard section 6.5.7).
Compound type tokens are introduced using either the STRUCT or UNION token introductions. A compound type token can be defined by any compound type, regardless of the introduction used. It is expected, however, that programmers will use STRUCT for compound types with non-overlapping member selectors and UNION for compound types with overlapping member selectors. The compound type token introduction does not specify the member selectors of the compound type - these are added later (see F.8 Selector tokens).

Values and objects with tokenised compound types can be used anywhere that a structure or union type can be used.

Internal identifiers of compound type tokens can reside in either the ordinary name space or the tag name space. The default is the ordinary name space; identifiers placed in the ordinary name space behave as if the type had been declared using a typedef statement. If the identifier, id say, is placed in the tag name space, it is as if the type had been declared as struct id or union id. Examples of the introduction and use of compound type tokens are shown below:
#pragma token STRUCT n_t#
#pragma token STRUCT TAG s_t#
#pragma token UNION TAG u_t#
void f() {
	n_t x1;
	struct n_t x2;	/* Illegal, n_t not in the tag name space */
	s_t x3;		/* Illegal, s_t not in the ordinary name space */
	struct s_t x4;
	union u_t x5;
}
F.7.5 Type token compatibility, definitions etc.
A type represented by an undefined type token is incompatible (ISO C standard section 6.1.3.6) with all other types except for itself. A type represented by a defined type token is compatible with everything that is compatible with its definition.

Type tokens can only be defined by using one of the operations known as type-resolution and assignment-resolution. Note that, as type token identifiers do not reside in the macro name space, they cannot be defined using #define statements.

Type-resolution
operates on two types and is essentially
identical to the operation of type compatibility (ISO C standard section
6.1.3.6) with one major exception. In the case where an undefined
type token is found to be incompatible with the type with which it
is being compared, the type token is defined to be the type with which
it is being compared, thereby making them compatible.
The ISO C standard prohibits the repeated use of typedef statements to define a type. However, in order to allow type resolution, the compiler allows a type to be consistently redefined using multiple typedef statements, provided the new definition can be resolved to the existing one.
As an example, consider the program below:
#pragma token TYPE t_t#
typedef t_t *ptr_t_t;
typedef int **ptr_t_t;
The second definition of ptr_t_t causes a resolution of the types t_t * and int **. The rules of type compatibility state that two pointers are compatible if their dependent types are compatible, thus type resolution results in the definition of t_t as int *.

Type-resolution can also result in the definition of other tokens. The program below results in the expression token N being defined as (4*sizeof(int)):
#pragma token EXP rvalue:int:N#
typedef int arr[N];
typedef int arr[4*sizeof(int)];
The type-resolution operator is not symmetric; a resolution of two types, A and B say, is an attempt to resolve type A to type B. Thus only the undefined tokens of A can be defined as a result of applying the type-resolution operator. In the examples above, if the typedefs were reversed, no type-resolution would take place and the types would be incompatible.
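A minimal sketch of this asymmetry, reversing the typedefs of the earlier example:

#pragma token TYPE t_t#
typedef int **ptr_t_t;
typedef t_t *ptr_t_t;	/* error: t_t * is not resolved against int ** */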
Assignment-resolution is similar to type-resolution but it occurs when converting an object of one type to another type for the purposes of assignment. Suppose the conversion is not possible and the type to which the object is being converted is an undefined token type. If the token can be defined in such a way that the conversion is possible, then that token will be suitably defined. If there is more than one possible definition, the definition causing no conversion will be chosen.

F.8 Selector tokens
The use of selector tokens is the primary method of adding member
selectors to compound type tokens. (The only other method is to define
the compound type token to be a particular structure or union type.)
The introduction of new selector tokens can occur at any point in
a program and they can thus be used to add new member selectors to
existing compound types.
The syntax for introducing member selector tokens is as follows:
selector-token:
	MEMBER selector-type-name : type-name :

selector-type-name:
	type-name
	type-name % constant-expression
The selector-type-name specifies the type of the object selected by the selector token. If the selector-type-name is a plain type-name, the member selector token has that type. If the selector-type-name consists of a type-name and a constant-expression separated by a % sign, the member selector token refers to a bitfield of type type-name and width constant-expression. The second type-name gives the compound type to which the member selector belongs. For example:
#pragma token STRUCT TAG s_t#
#pragma token MEMBER char*: struct s_t:s_t_mem#
introduces a compound token type, s_t, which has a member
selector, s_t_mem, which selects an object of type char*.
Internal identifiers of member selector tokens can only reside in the member name space of the compound type to which they belong. Clearly this is also the default name space for such identifiers.
The member selectors introduced as selector tokens are not related
to any other member selectors until they are defined. There is thus
no ordering on the undefined tokenised member selectors of a compound
type. If a compound type has only undefined token selectors, it cannot
be initialised with an initialiser-list. There will be an ordering
on the defined members of a compound type and in this case, the compound
type can be initialised automatically.
When structure or union types are declared, according to the ISO C standard there is an implied ordering on the member selectors. In particular this means that all the member selectors of a structure or union must be introduced with the declaration of the type itself; new member selectors cannot be added later.

The decision to allow unordered member selectors has been taken deliberately in order to separate the decision of which members belong to a structure from that of where such member components lie within the structure. This makes it possible to represent extensions to APIs which require extra member selectors to be added to existing compound types.

As an example of the use of token member selectors, consider the structure lconv specified in the ISO C Standard library (section 7.4.3.1). The standard does not specify all the members of struct lconv or the order in which they appear. This type cannot be represented naturally by existing C types, but can be described by the token syntax.

There are two methods for defining selector tokens, one explicit and one implicit. As selector token identifiers do not reside in the macro name space they cannot be defined using #define statements.

Suppose A is an undefined compound token type and mem is an undefined selector token for A. If A is later defined to be the compound type B and B has a member selector with identifier mem, then A.mem is defined to be B.mem, providing the type of A.mem can be resolved to the type of B.mem. This is known as implicit selector token definition.

In the program shown below the redefinition of the compound type s_t causes the token for the selector mem_x to be implicitly defined to be the second member of struct s_tag. The consequential type resolution leads to the token type t_t being defined to be int.
#pragma token TYPE t_t#
#pragma token STRUCT s_t#
#pragma token MEMBER t_t : s_t : mem_x#
#pragma token MEMBER t_t : s_t : mem_y#
struct s_tag { int a, mem_x, b; };
typedef struct s_tag s_t;
Explicit selector token definition takes place using the pragma:

#pragma DEFINE MEMBER type-name : identifier member-designator

member-designator:
	identifier
	identifier . member-designator
The type-name specifies the compound type to which the selector
belongs.
The identifier provides the identification of the member selector within that compound type. The member-designator provides the definition of the selector token. It must identify a selector of a compound type.

If the member-designator is an identifier, then the identifier must be a member of the compound type specified by the type-name. If the member-designator is an identifier, id say, followed by a further member-designator, M say, then:

- id must identify a member of the compound type specified by the type-name, and the type of that member must itself be a compound type, C say;

- M must identify a member selector of the compound type C.
As with implicit selector token definitions, the type of the selector token must be resolved to the type of the selector identified by the member-designator.

In the example shown below, the selector token mem is defined to be the second member of struct s, which in turn is the second member of struct s_t.
#pragma token STRUCT s_t#
#pragma token MEMBER int : s_t : mem#
typedef struct {int x; struct {char y; int z;} s; } s_t;
#pragma DEFINE MEMBER s_t : mem s.z
F.9 Procedure tokens
Consider the macro SWAP defined below:
#define SWAP(T,A,B) { \
T x; \
x=B; \
B=A; \
A=x; \
}
SWAP can be thought of as a statement that is parameterised by a type and two expressions.

Procedure tokens are based on this concept of parameterisation. Procedure tokens reference program constructs that are parameterised by other program constructs.

There are three methods of introducing procedure tokens. These are described in the sections below.

F.9.1 General procedure tokens
The syntax for introducing a general procedure token is:
general procedure:
	PROC { bound-toks_opt | prog-pars_opt } token-introduction

simple procedure:
	PROC ( bound-toks_opt ) token-introduction

bound-toks:
	bound-token
	bound-token , bound-toks

bound-token:
	token-introduction name-space_opt identifier

prog-pars:
	program-parameter
	program-parameter , prog-pars

program parameter:
	EXP identifier
	STATEMENT identifier
	TYPE type-name-identifier
	MEMBER type-name-identifier : identifier
The final token-introduction specifies the kind of program construct being parameterised. In the current implementation of the compiler, only expressions and statements may be parameterised. The internal procedure token identifier is placed in the default name space of the program construct which it parameterises. For example, the internal identifier of a procedure token parameterising an expression would be placed in the macro name space.

The bound-toks are the bound token dependencies which describe the program constructs upon which the procedure token depends. These should not be confused with the parameters of the token. The procedure token introduced in:

#pragma token PROC {TYPE t,EXP rvalue:t**:e|EXP e} EXP rvalue:t:dderef#

is intended to represent a double dereference and depends upon the type of the expression to be dereferenced and upon the expression itself but takes only one argument, namely the expression, from which both dependencies can be deduced.

The bound token dependencies are introduced in exactly the same way as the tokens described in the previous sections with the identifier corresponding to the internal identification of the token. No external identification is allowed. The scope of these local identifiers terminates at the end of the procedure token introduction, and whilst in scope, they hide all other identifiers in the same name space. Such tokens are referred to as "bound" because they are local to the procedure token.

Once a bound token dependency has been introduced, it can be used throughout the rest of the procedure token introduction in the construction of other components.

The prog-pars are the program parameters. They describe the parameters with which the procedure token is called. The bound token dependencies are deduced from these program parameters.

Each program parameter is introduced with a keyword expressing the kind of program construct that it represents. The keywords are as follows:

- EXP identifier: the corresponding argument must be an expression, from which the bound expression token named by the identifier, together with any related dependencies, is deduced. For example, the application of the dderef token in:
char f(char **c_ptr_ptr){
return dderef(c_ptr_ptr);
}
causes the expression, e, to be defined to be c_ptr_ptr, thus resolving the type t** to be char **. The type t is hence defined to be char, also providing the type of the expression obtained by the application of the procedure token dderef;

- STATEMENT identifier: the corresponding argument must be a statement, which is treated analogously;

- TYPE type-name-identifier: the corresponding argument must be a type-name. The parameter type is resolved to the argument type in order to define any related dependencies;

- MEMBER type-name-identifier : identifier: the type-name specifies the compound type to which the member selector belongs and the identifier is the identification of the member selector. When the procedure token is applied, the corresponding argument must be a member-designator of the compound type.

Currently PROC tokens cannot be passed as program parameters.

F.9.2 Simple procedure tokens
In cases where there is a direct, one-to-one correspondence between
the bound token dependencies and the program parameters a simpler
form of procedure token introduction is available.
Consider the two procedure token introductions below, corresponding to the macro SWAP described earlier.
/* General procedure introduction */
#pragma token PROC{TYPE t,EXP lvalue:t:e1,EXP lvalue:t:e2 | \
TYPE t,EXP e1,EXP e2 } STATEMENT SWAP#
/* Simple procedure introduction */
#pragma token PROC(TYPE t,EXP lvalue:t:,EXP lvalue:t: ) STATEMENT SWAP#
The simple-token syntax is similar to the bound-token syntax, but it also introduces a program parameter for each bound token. The bound token introduced by the simple-token syntax is defined as though it had been introduced with the bound-token syntax. If the final identifier is omitted, then no name space can be specified, the bound token is not identified and in effect there is a local hidden identifier.

F.9.3 Function procedure tokens
One of the commonest uses of simple procedure tokens is to represent
function in-lining. In this case, the procedure token represents the
in-lining of the function, with the function parameters being the
program arguments of the procedure token call, and the program construct
resulting from the call of the procedure token being the corresponding
in-lining of the function. This is a direct parallel to the use of
macros to represent functions.
The syntax for introducing function procedure tokens is:
function-procedure:
FUNC type-name :
The type-name must be a prototyped function type. The pragma results
in the declaration of a function of that type with external linkage
and the introduction of a procedure token suitable for an in-lining
of the function. (If an ellipsis is present in the prototyped function
type, it is used in the function declaration but not in the procedure
token introduction.) Every parameter type and result type is mapped
onto the token introduction:
EXP rvalue:
The example below:
#pragma token FUNC int(int): putchar#
declares a function, putchar, which returns an int and takes an int
as its argument, and introduces a procedure token suitable for in-lining
putchar. Note that:
#undef putchar
will remove the procedure token but not the underlying function.

F.9.4 Defining procedure tokens
All procedure tokens are defined by the same mechanism. Since simple
and function procedure tokens can be transformed into general procedure
tokens, the definition will be explained in terms of general procedure
tokens.
The syntax for defining procedure tokens is given below and is based upon the standard parameterised macro definition. However, as in the definitions of expressions and statements, the #defines of procedure token identifiers are evaluated in phase 7 of translation as described in the ISO C standard.
#define identifier ( id-list_opt ) assignment-expression
#define identifier ( id-list_opt ) statement

id-list:
	identifier
	identifier , id-list
The id-list must correspond directly to the program parameters
of the procedure token introduction. There must be precisely one identifier
for each program parameter. These identifiers are used to identify
the program parameters of the procedure token being defined and have
a scope that terminates at the end of the procedure token definition.
They are placed in the default name spaces for the kinds of program
constructs which they identify.
None of the bound token dependencies can be defined during the evaluation of the definition of the procedure token since they are effectively provided by the arguments of the procedure token each time it is called. To illustrate this, consider the example below, based on the dderef token used earlier.
#pragma token PROC{TYPE t, EXP rvalue:t**:e|EXP e}EXP rvalue:t:dderef#
#define dderef(A) (**(A))
The identifiers t and e are not in scope during the definition, being
merely local identifiers for use in the procedure token introduction.
The only identifier in scope is A. A identifies an expression token
which is an rvalue whose type is a pointer to a pointer to a type
token. The expression token and the type token are provided by the
arguments at the time of calling the procedure token.
Again, the presence of a procedure token introduction can alter the semantics of a program. Consider the program below.
#pragma token PROC {TYPE t, EXP lvalue:t:,EXP lvalue:t:}STATEMENT SWAP#
#define SWAP(T,A,B)\
{T x; x=B; B=A; A=x;}
void f(int x, int y) {
SWAP(int, x, y)
}
The definition and call of the procedure token are extremely straightforward.
However, if the procedure token introduction is absent, the swap does
not take place because x refers to the variable in the inner scope.
Function procedure tokens are introduced with tentative implicit definitions, defining them to be direct calls of the functions they reference and effectively removing the in-lining capability. If a genuine definition is found later in the compilation, it overrides the tentative definition. An example of a tentative definition is shown below:
#pragma token FUNC int(int, long) : func#
#define func(A, B) (func) (A, B)
F.10 Tokens and APIs
In Chapter 1 we mentioned that one of the main problems in writing
portable software is the lack of separation between specification
and implementation of APIs. The TenDRA technology uses the token syntax
described in the previous sections to provide an abstract description
of an API specification. Collections of tokens representing APIs are
called "interfaces". Tchk can compile programs with these
interfaces in order to check applications against API specifications
independently of any particular implementation that may be present
on the developer's machine.
In order to produce executable code, definitions of the interface tokens must be provided on all target machines. This is done by compiling the interfaces with the system headers and libraries.

When developing applications, programmers must ensure that they do not accidentally define a token expressing an API. Implementers of APIs, however, do not want to inadvertently fail to define a token expressing that API. Token definition states have been introduced to enable programmers to instruct the compiler to check that tokens are defined when and only when they wish them to be. This is fundamental to the separation of programs into portable and unportable parts.

When tokens are first introduced, they are in the free state. This means that the token can be defined or left undefined and if the token is defined during compilation, its definition will be output as TDF. Once a token has been given a valid definition, its definition state moves to defined. Tokens may only be defined once. Any attempt to define a token in the defined state is flagged as an error.

There are three more token definition states which may be set by the programmer. These are as follows:

- indefinable: the token must not be defined and any attempt to define it is flagged as an error;

- committed: the token must be defined during the compilation;

- ignored: the token may be defined, but its definition will not be output as TDF.
These token definition states are set using the pragmas:
#pragma token-op token-id-list_opt

token-op:
	define
	no_def
	ignore
	interface

token-id-list:
	TAG_opt identifier dot-list_opt token-id-list_opt

dot-list:
	. member-designator
The token-id-list is the list of tokens to which the definition state applies. The tokens in the token-id-list are identified by an identifier, optionally preceded by TAG. If TAG is present, the identifier refers to the tag name space, otherwise the macro and ordinary name spaces are searched for the identifier. If there is no dot-list present, the identifier must refer to a token. If the dot-list is present, the identifier must refer to a compound type and the member-designator must identify a member selector token of that compound type.

The token-op specifies the definition state to be associated with the tokens in the token-id-list. There are three literal operators and one context dependent operator, as follows:
- define causes the token state to move to committed;

- no_def causes the token state to move to indefinable;

- ignore causes the token state to move to ignored;

- interface is the context dependent operator and is used when describing extensions to existing APIs.

As an example of an extension API, consider the POSIX stdio.h. This is an extension of the ANSI stdio.h and uses the same tokens to represent the common part of the interface. When compiling applications, nothing can be assumed about the implementation of the ANSI tokens accessed via the POSIX API so they should be in the indefinable state. When the POSIX tokens are being implemented, however, the ANSI implementations can be assumed. The ANSI tokens are then in the ignored state. (Since the definitions of these tokens will have been output already during the definition of the ANSI interface, they should not be output again.)

The interface operator has a variable interpretation to allow the correct definition state to be associated with these `base-API tokens'. The compiler associates a compilation state with each file it processes. These compilation states determine the interpretation of the interface operator within that file.

The default compilation state is the standard state. In this state the interface
operator is interpreted as the no_def
operator. This is the standard state for compiling applications in
the presence of APIs.

Files included using:
#include header
have the same compilation state as the file from which they were included.
The implementation compilation state is associated with files included using:
#pragma implement interface header
In this context the interface operator is interpreted as the define operator.

Including a file using:
#pragma extend interface header
causes the compilation state to be extension unless the file from which it was included was in the standard state, in which case the compilation state is the standard state. In the extension state the interface operator is interpreted as the ignore operator.
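As a sketch of how these pragmas might combine (the header names and the fileno token are illustrative, not drawn from any real interface description):

/* posix_stdio.h: extends the base interface ansi_stdio.h */
#pragma extend interface "ansi_stdio.h"
#pragma token FUNC int(FILE *) : fileno#
#pragma interface fileno

When such a header is compiled in the implementation state, the base tokens of ansi_stdio.h acquire the ignore interpretation (their definitions have already been output), while the interface operator commits the extension-specific token fileno to being defined.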
Crown Copyright © 1998.