All executable code in TDF will arise from an EXP PROC made by either make_proc or make_general_proc. They differ in their treatment of how space for the actual parameters of a call is managed; in particular, is it the caller or the callee which deallocates the parameter space?
With make_proc, this management is conceptually done by the caller at an apply_proc; i.e. the normal C situation. This suffers from the limitation that tail-calls of procedures are then only possible in restricted circumstances (e.g. the space for the parameters of the tail-call must be capable of being included in caller's parameters) and could only be implemented as an optimisation within a translator. A producer could not predict these circumstances in a machine independent manner, whether or not it knew that a tail-call was valid.
An alternative would be to make the management of parameter space the responsibility of the called procedure. Rather than do this, make_general_proc (and apply_general_proc) splits the parameters into two sets, one whose allocation is the responsibility of the caller and the other whose allocation is dealt with by the callee. This allows an explicit tail_call to be made to a procedure with new callee parameters; the caller parameters for the tail_call will be the same as (or some initial subset of) the caller parameters of the procedure containing the tail_call .
A further refinement of make_general_proc is to allow access to the caller parameter space in a postlude at the call of the procedure using an apply_general_proc. This allows simple implementations of Ada out_parameters, or more generally, multiple results of procedures.
Each straightforward formal parameter is introduced by an auxiliary
SORT TAGSHACC using make_tagshacc:
Within body, the formal will be accessed using tg_intro;
it is always considered to be a pointer to the space of SHAPE sha
allocated by apply_proc, hence the pointer SHAPE.
For example, if we had a simple procedure with one integer parameter,
var_intro would be empty and params_intro might be:
5.1. make_proc and apply_proc
The make_proc constructor has signature:
result_shape: SHAPE
params_intro: LIST(TAGSHACC)
var_intro: OPTION(TAGACC)
body: EXP BOTTOM
-> EXP PROC
The params_intro and var_intro parameters introduce
the formal parameters of the procedure which may be used in body.
The procedure result will have SHAPE result_shape and will
be usually given by some return construction within body. The
basic model is that space will be provided to copy actual parameters
(into space supplied by some apply_proc) by value into these formals
and the body will treat this space effectively as local variables.
sha: SHAPE
opt_access: OPTION(LIST(ACCESS))
tg_intro: TAG POINTER(alignment(sha))
-> TAGSHACC
params_intro = make_tagshacc( integer(v), empty, make_tag(13))
Then, TAG 13 from the enclosing UNIT's name-space is identified with
the formal parameter with SHAPE POINTER(INTEGER(v)). Any use of obtain_tag(make_tag(13))
in body will deliver a pointer to the integer parameter. I
shall return to the meaning of opt_access and the ramifications
of the scope and extent of TAGs involved in conjunction with local
declarations in section 5.3.1 on page 30.
Procedures, whether defined by make_proc or make_general_proc, will usually terminate and deliver its result with a return:
arg1: EXP x -> EXP BOTTOMHere x must be identical to the result_shape of the call of the procedure There may be several returns in body; and the SHAPE x in each will be the same. Some languages allow different types to be returned depending on the particular call. The producer must resolve this issue. For example, C allows one to deliver void if the resulting value is not used. In TDF a dummy value must be provided at the return; for example make_value(result_shape)
Note that the body has SHAPE bottom since all possible terminations to a procedure have SHAPE BOTTOM..
Procedures defined by make_proc are called using apply_proc:
result_shape: SHAPE arg1: EXP PROC arg2: LIST(EXP) varparam: OPTION(EXP) --> EXP result_shapeHere arg1 is the procedure to be called and arg2 gives the actual parameters. There must be at least as many actual parameters as given (with the same SHAPE) in the params_intro of the corresponding make_proc for arg1 *. The values of arg2 will be copied into space managed by caller.
The SHAPE of the result of the call is given by result_shape which must be identical to the result_shape of the make_proc.
This is a slightly different method of giving a variable number of
parameters to a procedure, rather than simply giving more actuals
than formals. The principle difference is in the alignment of the
components of varparam; these will be laid out according to
the default padding defined by the component shapes. In most ABIs,
this padding is usually different to the way parameters are laid out;
for example, character parameters are generally padded out to a full
word. Thus a sequence of parameters of given shape has a different
layout in store to the same sequence of shapes in a structure. If
one wished to pass an arbitrary structure to a procedure, one would
use the varparam option rather passing the fields individually
as extra actual parameters.
The result_shape and body have the same general properties
as in make_proc. In addition prcprops gives other information
both about body and the way that that the procedure is called.
PROCPROPS are a set drawn from check_stack, inline, no_long_jump_dest,
untidy, var_callees and var_callers. The set is composed using add_procprops.
The PROCPROPS no_long_jump_dest is a property of body only;
it indicates that none of the labels within body will be the
target of a long_jump construct. The other properties should also
be given consistently at all calls of the procedure; theu are discussed
in section 5.2.2 on page 29.
A procedure, p, constructed by make_general_proc is called
using apply_general_proc:
The actual callee_params may be constructed in three different
ways. The usual method is to use make_callee_list, giving a list of
actual EXP parameters, corresponding to the caller_intro
list in the obvious way.The constructor, same_callees allows one to
use the callees of the current procedure as the callees of the call;
this, of course, assumes that the formals of the current procedure
are compatible with the formals required for the call The final method
allows one to construct a dynamically sized set of CALLEES; make_dynamic_callees
takes a pointer and a size (expressed as an OFFSET) to make the CALLEES;
this will be used in conjunction with a var_callees PROCPROPS (see
section 5.2.2 on page 29).
Some procedures can be expressed using either make_proc or make_general_proc.
For example:
make_proc(S, L, empty, B) = make_general_proc(S, var_callers, L, empty,
B)
One condition for such a tail call to be applicable is knowing that
g does not require any pointers to locals of f; this
is often implicit in the language involved. Equally important is that
the action on the return from f is indistiguishable from the
return from g. For example, if it were the callers responsibility
to pop the the space for the parameters on return from a call, then
the tail call of g would only work if g had the same
parameter space as f.
This is the justification for splitting the parameter set of a general
proc; it is (at least conceptually) the caller's responsibility for
popping the caller-parameters only - the callee-parameters are dealt
with by the procedure itself. Hence we can define tail_call which
uses the same caller-parameters, but a different set of callee-parameters:
tail_call(P, p, C) = return(apply_general_proc(S, P, p, L, C, make_top()))
However an implementation is expected to conserve stack by using the
same space for the call of p as the current procedure.
The presence of untidy means that body may be terminated by
an untidy_return. This returns the result of the procedure as in return,
but the lifetime of the local space of the procedure is extended (in
practice this is performed by not returning the stack to its original
value at the call). A procedure containing an untidy_return is a generalisation
of a local_alloc(see
section 5.3.4 on page 32). For example the procedure
could do some complicated local allocation (a triangular array, say)
and untidily return a pointer to it so that the space is still valid
in the calling procedure. The space will remain valid for the lifetime
of the calling procedure unless some local_free is called within it,
just as if the space had been generated by a local_alloc in the calling
procedure.
The presence of inline is just a hint to the translator that the procedure
body is a good candidate for inlining at the call.
The presence of check_stack means that the static stack requirements
of the procedure will be checked on entry to see that they do not
exceed the limits imposed by set_stack_limit; if they are exceeded
a TDF exception with ERROR_CODE stack_overflow (see section
6.3 on page 35) will be raised.
The other kind of local definition is variable:
The basic model used for the locals and parameters of a procedure
is a frame within a stack of nested procedure calls. One could implement
a procedure by allocating space according to SHAPEs of all of the
parameter and local TAGs so that the corresponding values are at fixed
offsets either from the start of the frame or some pointer within
it.
Indeed, if the ACCESS opt_access parameter in a TAG definition
is produced by visible, then a translator is almost bound to do just
that for that TAG. This is because it allows for the possibility of
the value to be accessed in some way other than by using obtain_tag
which is the standard way of recovering the value bound to the TAG.
The principal way that this could happen within TDF is by the combined
use of env_offset to give the offset and current_env to give a pointer
to the current frame (see section 5.3.3 on page 31).
The out_par ACCESS is only applicable to caller parameters of procedures;
it indicates that the value of the TAG concerned will accessed by
the postlude part of an apply_general_proc. Hence, the value of the
parameter must be accessible after the call; usually this will be
on the stack in the callers frame.
The long_jump_access flag is used to indicate that the tag must be
available after a long_jump. In practice, if either visible or long_jump_access
is set, most translators would allocate the space for the declaration
on the main-store stack rather than in an available register. If it
is not set, then a translator is free to use its own criteria for
whether space which can fit into a register is allocated on the stack
or in a register, provided there is no observable difference (other
than time or program size) between the two possibilities.
Some of these criteria are rather obvious; for example, if a pointer
to local variable is passed outside the procedure in an opaque manner,
then it is highly unlikely that one can allocate the variable in a
register. Some might be less obvious. If the only uses of a TAG t
was in obtain_tag(t)s which are operands of contents or the left-hand
operands of assigns , most ABIs would allow the tag to be placed in
a register. We do not necessarily have to generate a pointer value
if it can be subsumed by the operations available.
A POINTER tag with ACCESS no_other_read or no_other_write is asserting
that there are no "aliassed" accesses to the contents of
the pointer. For example, when applied to a parameter of a procedure,
it is saying that the original pointer of the tag is distinct from
any other tags used (reading/writing) in the lifetime of the tag.
These other tags could either be further parameters of the procedure
or globals. Clearly, this is useful for describing the limitations
imposed by Fortran parameters, for example.
Offsets from the current_env of a procedure to a tag declared in the
procedure are constructed by env_offset:
The offset arithmetic operations allow one to access the values of
tags non-locally using values derived from current_env and env_offset.
They are effectively defined by the following identities:
The importance of this is that env_offset(t) is a constant OFFSET
and can be used anywhere within the enclosing UNIT, in other procedures
or as part of constant TAGDEF; remember that the TDFINT underlying
t is unique within the UNIT. The result of a current_env could be
passed to another procedure (as a parameter, say) and this new procedure
could then access a local of the original by using its env_offset.
This would be the method one would use to access non-local, non-global
identifiers in a language which allowed one to define procedures within
procedures such as Pascal or Algol. Of course, given the stack-based
model, the value given by current_env becomes meaningless once the
procedure in which it is invoked is exited.
The alignment of pointer result is alloca_alignment which must include
all SHAPE alignments.
There are two constructors for releasing space generated by local_alloc.
To release all such space generated in the current procedure one does
local_free_all(); this reduces the size of the current frame to its
static size.
The other constructor is local_free whch is effectively a "pop"
to local_alloc's "push":
The use of a procedure with an untidy_return is just a generalisation
of the idea of local_alloc and the space made available by its use
can be freed in the same way as normal local allocations. Of course,
given that it could be the result of the procedure it can be structured
in an arbitrarily complicated way.
Part of the TenDRA Web.5.1.1. vartag, varparam
Use of the var_intro OPTION in make_proc and the corresponding
varparam in apply_proc allows one to have a parameter of any
SHAPE, possibly differing from call to call where the actual SHAPE
can be deduced in some way by the body of the make_proc . One
supplies an extra actual parameter, varparam, which usually
would be a structure grouping some set of values. The body of the
procedure can then access these values using the pointer given by
the TAG var_intro, using add_to_ptr with some computed offsets
to pick out the individual fields. 5.2. make_general_proc and apply_general_proc
A make_general_proc has signature:
result_shape: SHAPE
prcprops: OPTION(PROCPROPS)
caller_intro: LIST(TAGSHACC)
callee_intro: LIST(TAGSHACC)
body: EXP BOTTOM
-> EXP PROC
Here the formal parameters are split into two sets, caller_intro
and callee_intro, each given by a list of TAGSHACCs just as
in make_proc. The distinction between the two sets is that the make_general_proc
is responsible for de_allocating any space required for the callee
parameter set; this really only becomes obvious at uses of tail_call
within body.
result_shape: SHAPE
prcprops: OPTION(PROCPROPS)
p: EXP PROC
caller_params: LIST(OTAGEXP)
callee_params: CALLEES
postlude: EXP TOP
-> EXP result_shape
The actual caller parameters are given by caller_params as
a list of OTAGEXPs constructed using make_otagexp:
tgopt: OPTION(TAG x)
e: EXP x
-> OTAGEXP
Here, e is the value of the parameter and tgopt, if
present, is a TAG which will bound to the final value of the parameter
(after body is evaluated) in the postlude expression
of the apply_general_proc
*. Clearly, this allows
one to use a caller parameter as an extra result of the procedure;
for example, as in Ada out-parameters.5.2.1. tail_call
Often the result of a procedure, f, is simply given by the
call of another (or the same) procedure, g. In appropriate
circumstances, the same stack space can be used for the call of g
as the call of f. This can be particularly important where
heavily recursive routines are involved; some languages even use tail
recursion as the preferred method of looping.
prcprops: OPTION(PROCPROPS)
p: EXP PROC
callee_params: CALLEES
-> EXP BOTTOM
The procedure p will be called with the same caller parameters as
the current procedure and the new callee_params and return
to the call site of the current procedure. Semantically, if S is the
return SHAPE of the current procedure, and L is its caller-parameters:5.2.2. PROCPROPS
The presence of var_callees (or var_callers) means that the procedure
can be called with more actual callee (or caller) parameters than
are indicated in callee_intro (or caller_intro
). These extra parameters would be accessed within body using
offset calculations with respect to the named parameters. The offsets
should be calculated using parameter_alignment to give the packing
of the parameter packs.5.3. Defining and using locals
5.3.1. identify, variable
Local definitions within the body of a procedure are given
by two EXP constructors which permit one to give names to values over
a scope given by the definition. Note that this is somewhat different
to declarations in standard languages where the declaration is usually
embedded in a larger construct which defines the scope of the name;
here the scope is explicit in the definition. The reason for this
will become more obvious in the discussion of TDF transformations.
The simpler constructor is identify:
opt_access: OPTION(ACCESS)
name_intro: TAG x
definition: EXP x
body: EXP y
-> EXP y
The definition is evaluated and its result is identified with
the TAG given by name_intro within its scope body. Hence
the use of any obtain_tag(name_intro) within
body is equivalent to using this result. Anywhere else, obtain_tag(name_intro
) is meaningless, including in other procedures.
opt_access: OPTION(ACCESS)
name_intro: TAG x
init: EXP x
body: EXP y
-> EXP y
Here the init EXP is evaluated and its result serves as an
initialisation of space of SHAPE x local to the procedure.
The TAG name_intro is then identified with a pointer to that SPACE
within body. A use of obtain_tag(name_intro) within body
is equivalent to using this pointer and is meaningless outside body
or in other procedures. Many variable declarations in programs are
uninitialised; in this case, the init argument could be provided
by make_value which will produce some value with SHAPE given by its
parameter.5.3.2. ACCESS
The ACCESS SORT given in tag declarations is a way of describing a
list of properties to be associated with the tag. They are basically
divided into two classes, one which describes global properties of
the tag with respect to the model for locals and the other which gives
"hints" on how the value will be used. Any of these can
be combined using add_access.5.3.2.1 . Locals model
At the moment there are just three possibilities in the first class
of ACCESS constructors. They are standard_access (the default) , visible,
out_par and long_jump_access.
5.3.2.2 . Access "hints"
A variable tag with ACCESS constant is a write-once value; once it
is initialised the variable will always contain the initialisation.
In other words the tag is a pointer to a constant value; translators
can use this information to apply various optimisations.5.3.3. current_env, env_offset
The constructor current_env gives a pointer to the current procedure
frame of SHAPE POINTER(fa) where fa
is depends on how the procedure was defined and will be some set of
the special frame ALIGNMENTs. This set will always include locals_alignment
- the alignment of any locals defined within the procedure. If the
procedure has any caller- parameters, the set will also include callers_alignment(b)
where b indicates whether there can be a variable number of them;
similarly for callee-parameters.
fa: ALIGNMENT
y: ALIGNMENT
t: TAG x
-> EXP OFFSET(fa,y)
The frame ALIGNMENT fa will be the appropriate one for the
TAG t; i.e. if t is a local then the fa will
be locals_alignment; if t is a caller parameter, fa
will be callers_alignment(b); if t is a callee_parameter, fa
will be callees_alignment(b). The alignment y will be the alignment
of the initialisation of t.
If TAG t is derived from a variable definition
add_to_ptr(current_env(), env_offset(locals_alignment, A, t)) = obtain_tag(t)
if TAG t is derived from an identify definition:
contents(S, add_to_ptr(current_env(), env_offset(locals_alignment, A, t))) = obtain_tag(t)
if TAG t is derived from a caller parameter:
add_to_ptr(current_env(), env_offset(callers_alignment(b), A, t)) = obtain_tag(t)
if TAG t is derived from a callee parameter:
add_to_ptr(current_env(), env_offset(callees_alignment(b), A, t)) = obtain_tag(t)
These identities are valid throughout the extent of t, including in
inner procedure calls. In other words, one can dynamically create
a pointer to the value by composing current_env and env_offset. 5.3.4. local_alloc, local_free_all, last_local
The size of stack frame produced by variable and identify definitions
is a translate-time constant since the frame is composed of values
whose SHAPEs are known. TDF also allows one to produce dynamically
sized local objects which are conceptually part of the frame. These
are produced by local_alloc:
arg1: EXP OFFSET(x, y)
-> EXP POINTER(alloca_alignment)
The operand arg1 gives the size of the new object required
and the result is a pointer to the space for this object "on
top of the stack" as part of the frame. The quotation marks indicate
that a translator writer might prefer to maintain a dynamic stack
as well as static one. There are some disadvantages in putting everything
into one stack which may well out-weigh the trouble of maintaining
another stack which is relatively infrequently used. If a frame has
a known size, then all addressing of locals can be done using a stack-front
register; if it is dynamically sized, then another frame-pointer register
must be used - some ABIs make this easy but not all. The majority
of procedures contain no local_allocs, so their addressing of locals
can always be done relative to a stack-front; only the others have
to use another register for a frame pointer.
a: EXP OFFSET(x, y)
p: EXP POINTER(alloca_alignment)
-> EXP TOP
Here p must evaluate to a pointer generated either by local_alloc
or last_local . The effect is to free all of the space locally allocated
after p. The usual implementation (with a downward growing stack)
of this is that p becomes the "top of stack" pointer5.4. Heap storage
At the moment, there are no explicit constructors of creating dynamic
off-stack storage in TDF. Any off-stack storage requirements must
be met by the API in which the system is embedded, using the standard
procedural interface. For example, the ANSI C API allows the creation
of heap space using standard library procedures like malloc.
Crown
Copyright © 1998.