A communication system which allows messages to be sent from any to
any using a central message-forwarder and polling by the recipient (as
is apparently the case in nVidia GPUs) could not possibly be designed
to be slower, as it allows any message to interfere with any other and
any link between processors to be unuseable until any other link is
cleared, and prevents delivery of any message until arbitrary
communication-external conditions free the recipient to poll for a
message again.
- Example 4.1, p78
\(\ \ \ \ c_{lr}\ :\ (id,\ !\ +)\ :\ \#(nil,\ last\ 0)\ ([1\ 3\ 6\ 10],\ [5,11,18,26])\)
\(= c_{lr}\ :\ (id,\ !\ +)\ ([1,3,6,10],\ [(5,10)\ (11,10)\ (18,10)\ (26,10)])\)
(If I had to read this out in words I'd say that apparently:
- #() is a "communication function" meaning that it primarily
has to identify indexes of the things that need to be brought in
for the calculations, but then in reality it needs to know what
things to look for what it indexes into and what to do other than
extract indexed elements, namely construct new structures and
tuples and vectors etc. for downstream use ;
- "nil" seems to mean just copy the first item from the
following tuple (in this case a tuple of two 4-vectors) and
use that as the first item of the output of #(nil,...), which
is hardly a no-op (needed 28 words to describe it!) but I
guess you could see it that way.
- "# ! last 0" identifies for each element in each subvector
that it wants the last element in the other subvector. means
send the last element in the left vector to every element in
the right vector, AND the last element in the right vector to
every elementin the left vector. So "# (nil,last 0)" means
don't bring anything to the elements in the left subvector (by
nil being first) whereas do bring the last element of the left
subvector to each element of the right subvector. (see p58, Figure 3.2)
The meaning of last 0 is semantically composeable from the
meanings of last and of 0; "last" makes you think of counting
from the last index of the other vector, and to count down 0
more indexes. Then applying last 0 communication to a subvector
copies the last element of the other vector into structures for each
element in this vector.
In either case #(nil, last 0) takes 10 from the first (0'th)
subvector and appends a copy of it to each element in the 2nd
subvector. Very convenient to separate communication from
calculation, the communication puts the right stuff in your
workspace, and the calculation adds things up.
- \(h_{scan} + = (id, ! +) : \#(nil, last 0)\)
"... when applied to two vectors will leave the left vector untouched but will
add the last entry on the left to each entry on the right. For example,
\((h_{scan} +) ([1 3], [3 7])\)
= (id, +) : \#(nil, last 0) ([1 3], [3 7])\)
= (id, +) ([1 3], [(3, 3) (7, 3)])\)
= (id [1 3], ! + [(3, 3) (7, 3)]\)
= ([1 3], [6 10])\)
- Section 4.2.4 p 82
\(g_{broad} = (id, !other) : (nil, \#corr) \)
This pre-adjust "function can be used to compute broad
with any balanced division on vectors including \(d_{lr}\) or
\(d_{eo}\)".
- Algorithm 4.7
\(... (id,!loc) : \#(nil, (last 0)), ...\)
"The (last 0) communication used above implies broadcast.
It can however be replaced by correspondent communication ... by
introducing a new variable and making the broadcasted value
available at all entries (as in 4.8)
- Algorithm 4.8
\(... loc : \# ! corr ... \)
- Algorithm 4.12 2nd order parallel-DC sort, p 89ff:
\(sort = PDC(d_{lr},c_{lr},id, loc:\#!mirr, atom?, id)
loc = !nest : (!min : !max)
nest = PDC(d_{lr},c_{lr},(!min : !max) : \# ! corr, id, atom?, id) \)
- 5.2.2 p 94:
! + : #(last 0)
"When applied to two vectors will fetch the last entry of the
left subvector and add (+) it to the first entry of the right
subvector... thus will only effect the value of the first entry
of the right subvector.
- 5.3.1 p97:
\(\mu (Q,R)\ =\ loc\ :\ \#(rowbr\ k)\ :\ \#(colbr\ 0)(Q,\ R)\)
(colbr for column-wise broadcast). The explanation:
"When .. #(colbr 0) is applied to (Q,R), a matrix, say Q' is returned where
Q'(i,j) = ( Q(i,j), R(0,j) )
thus broadcasting the values of row zero of R into a tuple in each cell of the Q' matrix
so that later operations can use it.