CHAPTER 3

Compound Data Types and Operators

3.1 Sets and Set Denotations

Sets in SETL are finite collections of arbitrary SETL values. These values, called the members (or elements) of the set, are themselves data items of any SETL type. To write a set denotation, we simply list the members of the set, with commas between successive members, within the set brackets "{" and "}". Three examples are


{1, 2, 3, 4}
{"Tom", "Dick", "Harry"}
{TRUE, FALSE}

The first of these is the set of all integers between 1 and 4; the second is a set of three strings, namely, "Tom", "Dick", and "Harry"; the third is the set consisting of the two possible Boolean values TRUE and FALSE.

The null or empty set, i.e., the (unique) set having no members at all, is a legal SETL value. It is written as follows:

{ }

Any legal SETL value (with the sole exception of the undefined value OM) can be a member of a set. Examples illustrating this are


{1, true, "Tom"}
{1, true, "Tom", {3}}

The first of these two examples is a perfectly legitimate set whose three members are the integer 1, the Boolean value true, and the string "Tom". The second has four elements, the integer 1, Boolean value true, the string "Tom", and the set {3}, i.e., the singleton set whose sole member is the integer 3. This shows that sets need not be homogeneous, i.e., are not restricted to having members all of the same kind, and that sets can be members of other sets. Note also that the integer 3 is not a member of the set {1, true, "Tom", {3}}, but that the set {3}, which is quite a different thing, is. A more complex example illustrating this same fact is

			{1, {2}, {{3}}, { }, {5,6}}.  					   (1)

This is a set of five members, namely the integer 1; the set {2}, whose sole member is the integer 2; the set {{3}}, whose sole member is the set {3}; the null set { }; and the set {5, 6} consisting of the integers 5 and 6. Note that in this example the integer 3 is neither a member nor a member of a member of set (1); rather, it is a member of a member of a member of that set.

As ordinarily in mathematics, set values never contain duplicate members, and the members of a set have no implied order. Thus the set {1,1} and {1}, both of which are legal, designate exactly the same set, namely the set whose sole element is the integer 1. Similarly, {1, 2} and {2, 1} designate the same set, namely the set whose members are the integers 1 and 2. For a more complex example, note that

{1, 2, {3, 4}}

and

{{4, 3}, 2, 1}

designate the same set, namely the set whose three elements are the integers 1 and 2 and the set {3,4} (not to be confused with {1,2,3,4}, which is a set of four elements, namely the integers 1 through 4).

Since the elements of a set are not considered to have any particular order within the set, it is incorrect to speak of the first, second, or last element of a set. That is, it is incorrect to speak of the string "Tom" as the first element of the set

{"Tom", "Dick", "Harry"}

or to speak of the string "Harry" as its last element, since this same set can as well be written as

{"Harry", "Tom", "Dick"}

and

{"Dick", "Tom", "Harry"}

In working with sets, one must always remember that their elements have no particular order, and that they are all distinct. This can actually simplify matters because in problems involving a collection of data whose order is irrelevant, the programmer has fewer details to worry about.

3.1.1 Some useful sets of integers

Sets whose elements are successive integers, such as

{1, 2, 3, 4, 5, 6, 7}, {-3, -2, -1, 0, 1, 2, 3, }

arise often enough that a special notation is provided for them. To describe the set of all integer lying in the range M to N inclusive are integers, we write

{M..N}

The two dots (not three, and not commas!) stand for all the integers M + 1, M + 2, and so on up to N - 1. Sets of integers of the form

{1, 3, 5, 7, 9} or {10, 5, 0 , -5, -10, -15}

that is to say, sets that represent an arithmetic progression, are also useful enough to be given their own notation in SETL: We represent such sets by giving the first, second, and last element of the progression, as follows:

{1, 3 .. 9}, {10, 5, .. -15}

Note again the use of two dots to indicate the middle part of the sequence. These notions will be used frequently in what follows.

When sets are printed, their elements can appear in any arbitrary order. For example,

print({1..10});

might be expected to produce {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}. However, if you try it out, you will see the following appear:

{4, 5, 6, 7, 1, 2, 8, 3, 9, 10}

(or some other permutation of the integers from 1 to 10). This emphasizes again the fact that the elements of a set have no particular order; the set {1..10} contains the integers in the range 1 ..10, but the internal representation of this set in the machine is independent of the way in which the set is written in the text of the program. In fact, printing the same set at various times may display its elements in different orders each time.
 

3.2 Tuples

In contrast to sets, tuples (sometimes also called vectors) in SETL are finite ordered sequences of arbitrary elements. To write a tuple denotation, we list its successive components, in order, within the tuple brackets "[" and "]." Components in such a list are separated by commas. Three examples are


[1, 2, 3, 4]
["Tom", "Dick", "Harry"]
[true, false]

The components of a tuple, as distinct from the elements of a set, do have a definite order within the tuple. Thus a tuple is a quite different kind of object from a set, even though the components of the tuple may all be elements of the set, and vice versa. For example, note that

[1,2,3,4] and {1,2,3,4}

are regarded in SETL as entirely different objects, and, indeed, as objects of entirely distinct types; the first is a tuple; the second is a set. Note also that [1, 2, 3, 4] and [2,1, 3, 4] are different objects, since the components of a tuple are considered to have a specific order and two tuples are only equal if they have the same components in the same order; however, the sets { 1, 2, 3, 4} and {2,1, 3, 4} are the same, since a set, as distinct from a tuple, is defined by the collection of its elements, not by their order.

Tuples, like sets, need not be homogeneous; i.e., the components of a tuple need not all be of the same type. Tuples can have sets as their components and sets can have tuples as their members. Indeed, sets and tuples can be nested within each other to arbitrary depth as members and components, permitting construction of a great variety of data objects. Examples are

(1) [1, "Tom", {"Dick"}, ["Harry"]]

(2) {1, "Tom", ["Dick"], {"Harry"} }

(3) [1, {"Tom", ["Dick", "Harry"] } ]

The first of these constants represents a tuple of four components, which, in order, are the integer 1, the string "Tom", the singleton set {"Dick"}, and the one-component tuple ["Harry"]. The second represents a set of four elements, which (in no particular order) are the integer 1, string "Tom", the one component tuple ["Dick"], and the singleton set {"Harry"}. The third represents a tuple of just two components, namely the integer 1, followed by the two- element set {"Tom", ["Dick", "Harry"]}. We can therefore assert that the string "Harry" is the first (and only) component of the fourth component of the tuple (1); that "Harry" is also a member of a member of the four-element set (2); and finally that "Harry" is the second component of a member of the second component of the tuple (3).

Another example of a perfectly legal though highly nested SETL construction is

{{{{ }}}}

This designates a set (let's call it s), and the empty set is the only member of the only member of the only member of s.

The presence of repeated tuple components, as distinct from repetition of set elements, is significant and leads to distinct tuple values. For example, the three tuples

["Tom"], ["Tom", "Tom"], and ["Tom", "Tom", "Tom"]

are all distinct; the first has just one component and is of length 1; the second is of length 2; and the third is of length 3 and has three components. Its first, second, and third components are all defined, and each of them is the string "Tom". In contrast, the constants

{"Tom"}, {"Tom", "Tom"}, and {"Tom", "Tom", "Tom"}

designate the same set, which has just one element, namely the string "Tom". Since tuples, as distinct from sets, are considered to have a definite order, it does make sense to refer to the first, second, ..., last component of a tuple. For example, the first component of

["Tom", "Dick", "Tom", "Tom"]

is the string "Tom"; its last (also fourth) component is also "Tom"; its second component is "Dick".

There is a (unique) null or empty tuple, which is written as

[ ]

This plays much the same role for tuples that the important null set, i.e., { }, plays for sets.


 

3.2.1 Some useful tuples of integers

Tuples whose components constitute an arithmetic progression can be written in a special SETL notation similar to that used for sets of integers. The tuple construct

[M..N]

where M and N are integers, describes the tuple whose components are the integers M, M + 1, M + 2, and so forth, up to N. If N is less than M, this construct is equivalent to the empty tuple. Similarly, an arithmetic progression of the form

M, M + k, M + 2*k, .. N

where k is some integer (positive or negative), can be described by writing its first, second, and last component; specifically, the tuple whose components constitute such a sequence can be written as:

[M, M1..N]

where M1, the second term in the sequence, has the value (M + k). For example, the construct [3, 6..600] represents a tuple whose components are first 200 positive multiples of 3, in increasing order. This construct, and related set construct

{M, M1..N},

simple instances of a general numeric iterator construct, which will be discussed in detail in Section 4.3.4.1.


 

3.3 Maps

In SETL a map is simply a set all of whose elements are pairs, i.e., are tuples of length 2. Some properties of maps can be deduced from their structure, i.e., from the fact that all their components are pairs. But maps are important enough to have a number of operations that apply solely to them. We will see that maps are one of the most expressive programming features of SETL and that the proper use of maps is a hallmark of good SETL style. Maps allow us to associate with each other elements of various collections of objects: countries with their capitals, numbers with their cubes, people with their dates of birth, courses with their sets of students, and so forth.

Suppose, for example, that the children in a family, listed in increasing order of age, are

Sue, Tom, Mary, Alphonse.

Suppose that we want to associate each child x in this family with the number of younger sisters that x has. For this purpose, we could use the following map:

	{ ["Sue", 0], ["Tom", 1], ["Mary", 1], ["Alphonse", 2] }.     			(1)

Similarly, the map

	{ ["Sue", 0], ["Tom", 0], ["Mary", 1I ], ["Alphonse", 1] }   			  (2)

associates each child x with the number of younger brothers that x has.

The map

	{ ["Sue", {"Mary"} ], ["Tom", {"Sue", "Mary"} ],           
			["Mary", {"Sue"} ], ["Alphonse", {"Sue", "Mary"} ] } 		(3)

associates each child x with the set of sisters of x. Note therefore that maps can be used to associate values of any type with other values of any other type.

Another interesting map is

	{ ["Sue", "Mary"], ["Tom", "Sue"], ["Tom", "Mary"],           
      		["Mary", "Sue"], ["Alphonse", "Sue"], ["Alphonse", "Mary"] }. 		(4)

This contains a separate pair associating each child x with each of the sisters of x (rather than one pair associating x with the set of all the sisters of x as in (3); ((3) and (4) are different, but closely related and record much the same information). Since several different pairs in (4) (e.g., ["Tom", "Sue"] and ["Tom", "Mary"]) have the same first component, (4) is called a multivalued map. Maps for which this does not happen, i.e., in which no two distinct pairs share the same first component, are called single-valued maps.

Given a map M, we can form the set of all first components of pairs in M. This is called the domain of M and is written

domain M

We can also form the set of all second components of pairs in M, which is called the range of M and is written

range M

The following table shows the domain and range of the maps appearing in examples just presented.

Map numberDomain MRange M

(1){"Sue", "Tom", "Mary", "Alphonse"}{0,1, 2}
(2){"Sue", "Tom", "Mary", "Alphonse"}{0,1}
(3){"Sue", "Tom", "Mary", "Alphonse"}{{"Sue"},{"Mary"},{"Sue", "Mary"}}
(4){"Sue","Tom","Mary","Alphonse"}{"Sue","Mary"}

Maps, and the map-related operations of SETL, which will be presented in Section 3.8, are the most characteristic and important features of the language.

3.4 The Size of Composite Objects: The #Operator

One of the most important characteristics of a composite object is the number of elements which it has. SETL provides a single operator to determine the size of sets, tuples, maps, and strings: the " #" operator, called length, size, or cardinality.

When applied to a string this operator yields its length, i.e., the number of characters it contains; when applied to a tuple, it yields the length of the tuple, -i.e., the largest position in the tuple that is occupied by a component whose value is not OM; and when applied to a set it yields its cardinality, i.e., the number of its elements. For a map, it yields the number of pairs in the map. Thus:

#"Tom" is 3, since "Tom" has 3 characters

#"Tom is hot" is 10, since "Tom is hot" has l0 characters (including 2 blanks)

#["Tom","Dick","Harry"] is 3, since this tuple has 3 components

#"["Tom","Tom","Tom"] is 3, since this tuple also has 3 components

#{"Tom","Dick", "Harry"} is 3, since this set has 3 elements

#{"Tom","Tom", "Tom"} is 1, since this set has "Tom" as its only member

#{ } is 0, since the null set has no members

#[ ] is 0, since the null tuple has no components

#"" is 0, since the null string contains no characters

#{[4, 2], [4,-2] [0, 0] } is 3, because this set (or map) contains three elements (pairs)

Exercises

1. Which of the following objects are the same, and which are different?

(a)"The"and"The"
(b)"The man"and"Theman"
(c)["The","man"]and["man", "The"]
(d){"The","man"}and{"Man, The }
(e){"The man"}and{"man The"}
(f){"The","The","man"}and{"The", "man"}
(g)["The","The","man"]and["The", "man"]
(h)["The","man"] and{"The", "man"}
(i)["The", "man"] and{"The, man"}

2.  Write the size #x of the following strings, sets, and tuples. For each set and tuple,
also write the list of all its integer elements or components and the size of each of
its set, tuple, or string elements or components.

(a){ 1, 2, 2, "Tom"}
(b)[1, 2, 2, "Tom"]
(c){ 1, {2, 2}, "Tom"}
(d){1,1,{ },{ }}
(e)[{ },[[ ]]]
(f)"abracadabra"
(g)"abra cadabra"
(h)"abra, cadabra"
(i){1, "abra", "cadabra"}
(j){1, "abra"cadabra"}
(k){1, "abra, cadabra"}
(l){1, "abra", "cadabra"}
(m){1, "abra", "cadabra"}
(n){[ ], ,{ }, [ ], {[ ]}, { } }

3. Write the size of the first, second, and last component of each of the following tuples:

 
(a)	["Tom", "Dick", "Harry"]
(b)	["Tom", "Dick", "Harry", "Tom"]
(c)	[ "Tom", ["Tom"], ["Tom"], [ ]]
             

4. Indicate whether "Tom" is a member, component, member of component, component of member, component of component, etc., of each of the following sets or tuples:


(a) [1, "Tom"] (b) {["Tom", 3], ["Dick", 4], ["Harry", 5] } (c) {{"Tom", "Dick", "Harry"} } (d) [[ [ ["Tom"], "Tom"], "Dick", "Tom", "Harry"] ] (e) [["Tom, Dick"],"Tom, Harry"] 5. Write a map which indicates the age of each of your brothers and sisters by associating their age with their first name. Write the range and domain of this map. 6. Write a map which associates each component of the tuple ["Tom", "Dick", "Harry"] with the square of the component length. Write the range and domain of this map.

7. How many maps are there whose domain is {"Tom, Dick"} and whose range is {"Sue", "Mary"}? How many of these maps are single-valued?

8. A map M associates the age of each child in a family with the name of the child. The range of M is {7, 9,13}, and the domain is {"Sue", "Mary", "Tom", "Dick"}. What is interesting about this family?

9. Consider the following map M:

{ ["Smith", { ["Sue", 11], ["Jim", 13] } ], ["Jones", { ["Albert", 1], ["Anna", 3], ["Ron", 9] } ], ["Skallagrim", { ["Thorolf ", 7], ["Egil", 5], ["Asgerd", 4] } ] }

What information might this map represent? What is its domain? What is its range?

10. Let S be the set

{"Tom", {"Dick", ["Harry", "Arthur", {"Tom"} ] } }

"Dick" is a member a member of S. Match each name in the following list with the manner in which it appears in S:

(a) Tom (b) Harry (c) Arthur (i) component of member (ii) member of component of member (iii) member

11. Consider the map M of Exercise 9 as a set. What are all the members of this set? Which of the components of the members of M are sets, and what are the members of these components? What are all the components of the members of all the components of the members of M which are sets? What are all the lengths of all the components of the members of M which are not sets?

12. Write a map which associates each of the Pacific coast states with the name of its state capital.

13. For how many integers between 1 and 100 is I = 5 * (I / 5) true? For exactly which integers is this true? For how many integers between 1 and 100 is I = (5 * I) / 5 true?

3.5 Set Operations and Set Formers

SETL provides several important kinds of set operators, of which the easiest to understand are the built-in, elementary set operations and the set formers. We shall present these constructs in the present section; the even more important map operations are presented in Section 3.8.

3.5.1 Binary set operators

The binary set operations compute a result value from two inputs, one or both of which must be a set. These operations are as follows (in what follows, s and ss are always sets, and x can be an object of arbitrary type):

s + sscomputes the union of two sets, i.e. the set of all objects which belong either to s or to ss.
s - sscomputes the difference of two sets, i.e. the set of all objects which belong to s but not to ss.
s * sscomputes the intersection, or common part, of two sets, i.e. the set of all objects which belong to both s and ss.
s mod ss computes the symmetric difference of two sets, i.e., the set of objects that are either in s or ss, but not in both.Note that s mod ss is equivalent to s + ss - (s * ss)
k npow shere k must be a non-negative integer. This operation yields the collection of all subsets of s which contain exactly k elements. An error results if k is negative. So we don't have to remember which of the arguments must be a set, and which an integer, both possibilities are allowed: we can write indifferently s npow k or k npow s.
s with xproduces a set whose members are the members of s, with x inserted (if x is not already a member of s). s with x is equivalent to s + {x}.
s less xproduces a set whose members are the members of s, with x removed (if necessary, i.e., if x is a member of s). >s less x is equivalent to s-{x}.

The following set operators are predicates; i.e., they perform a test and yield a Boolean value:

x in stests x for membership in the set s. The value true is produced if x is a member of s, false otherwise.
x notin stests x for non membership in the set s. The value true is produced if x is not a member of s, false otherwise.
s = sstests s and ss for equality, yielding true if s and ss have exactly the same members, false otherwise.
s /= sstests s and ss for inequality, yielding false if s and ss have exactly the same members, true otherwise.
s incs sstests ss for inclusion within s, yielding true if every member of ss is also a member of s, false if ss has any member which is not also a member of s.
s subset sstests s for inclusion within ss, yielding true if every member of s is also a member of ss, false if s has any member which is not also a member of ss.

Examples of use of these binary set operators are the following:

print({1,2} + {"Tom", "Dick"});yields{1 2 "Tom" "Dick"}
print({ } + {1,2},{ } + { }); yields{1 2} { }
print({1,2,3} - {1,4}, {1, 2, 3} - { });yields {2 3} {1 2 3}
print({1,2,3} - {3,1,2});yields{ }
print({ } - {1, 2, 3});yields{ }
print({l,2,3} * {2,5,3});yields{2 3}
print({1,2} * {3,4});yields{ }
print({ } * {3,4});yields{ }
print({{1},2,3} - {1,2,3});yields{ {1} }
print({{1},{2,3}} - {1,2,3}); yields{{1}, {2 3}}
print(1 in {1,2,3},{1} in {1,2,3});yields true false
print({ } in { },{ } in {{ }});yields false true
print(1 notin {1}, { } notin { });yields false true
print({1,2,3} with 5);yields{1 2 3 5}
print({1,2,3} with 1);yields{1 2 3}
print({1,2,3} less 1, {1,2,3} less 4);yields{2 3} {1 2 3}
print({1,2,3} = {3,2,1});yieldstrue
print({ } = [ ], { } = {{ }});yieldsfalse false
print({1,2} /= {2,1}, {1,2,2} /= {1,2});yieldsfalse false
print(2 npow {1,2,3});yields{{1 2} {2 3} {1 3}}
print({1} incs { }, { } incs {1});yieldstrue false
print({1,2} incs {1,2});yieldstrue
print({2,2,2} subset {1,2});yieldstrue

5.2 Unary set operators

Unary set operators compute a result value from a single set input s. The unary set operators are as follows:

#syields the number of (distinct) elements of the set s.
pow syields the set of all subsets of s (which is also called the power set of s; hence the name pow).
arb sYields an arbitrarily selected element of s. arb s means "some element of s; I don't care which.".(Depending on the particular SETL implementation used, successive evaluations of arb s may or may not yield the same element of s. If s is the empty set, arb s yields OM.

Examples of these unary operators are as follows:

print(#{2}, #{2, 2, 2, 2});yields1 1
print(#{1,2,3,4,1,2,3,4,40});yields5
print(pow{1, 2});yields{{ } {1} {2} {1 2}}
print(arb{1,2,3},arb{1,2, 3});yields1 1 (or possibly 2 2 or 3 3)
print(arb{1,2,3},arb{3,1,2});can yield1 2 (even though {1,2,3} = {3,2,1} yields true)

Of course, the basic construct

{x1, x2,..., xk}

which forms a set by enumerating its elements explicitly is also a (multiargument) set operator. The x1, x2,..., xk appearing in this construct can be arbitrary expressions. As several of the preceding examples show, this construct can form a set of fewer than k elements. For example, if x has the value {1, 2} and y the value {1}, then {x, y, x + y} is the two-element set { { 1,2}, {1}}

As already noted, the set of all integers in the range from m to n (inclusive) an be written as

{m..n}

and the set of all integers n, n + k, n + 2k, etc., up to m can be written

{n, n + k..m}

In this last form, the "step" k can be negative, and n + k need not actually be a sum but can be any arbitrary expression. For example,

print({3,6 - 1..10}); yields {3 5 7 9}

If the m in n..m is less than n, then the null set results. Similar rules apply to {n, n + k..m}, for example,

print({3,5.. 1})yields{ }
print({3,2..-3});yields{3 2 1 0 -1 -2 -3}
print({3,2..4});yields{ }
print({3,4..2});yields{ }

Section 4.3.4 examines these constructs in greater detail.

Many interesting mathematical relationships connect the set operators presented in this section. For example, the values of (s * s1) subset s, and (s1 + s2) * s3 = s1 * s3 + s2 * s3 are always true. Other relationships of this sort appear in the exercises following Section 3.13.

3.5.3 Set former expressions

Sets are the basic data objects of SETL, and the language provides a number of ways of constructing them. We have seen already in Section 3.1 that sets can be constructed by listing their elements and enclosing the list between set brackets. More generally, sets can be constructed by enumerating their elements, be they constants, variables, or expressions. For example, the set expression

{x,y,x + y,[ ]}

describes a set whose elements are the value of the variable x, the value of variable y, the value of the expression (x + y), and the null tuple. Such sets constructed by enumeration can contain any number of expressions of any type.

In mathematics, the most powerful and general way of forming a set is simply to define it by stating a characteristic property of its elements. We will now describe the method for doing this in SETL, which uses a notation almost identical to mathematical language. The standard mathematical notation for this construction is

		{x | C}                   				(1)
read "the set of all x having the property C," or equivalently "the set of all x such that C." C must be an expression that yields a Boolean value. For example in mathematics one commonly writes

	{x | x < 0}							  (2)

which is read "the set of all x such that x < 0." (As this example shows, the Boolean expression C of (1) will generally depend on the variable x.) It should be noted that this standard method of describing a set in mathematics uses a shortcut which omits any explicit specification of the range of values for the variable x; mathematically, x is sometimes allowed to range over 'everything'. This omission is not permitted in SETL, and that is the only difference between the SETL method of constructing a set and mathematical notation. Thus the following significantly more limited construct, which makes generation of infinite sets impossible, is used in SETL:

	{x in s | C}.							(3)

Here s is a set which has been constructed earlier, either by enumeration or by a previous instance of (3). It follows that s itself must be finite, and thus all sets constructed in SETL are finite. The construct (3) is the basic SETL set former.

Several important generalizations of the construct (3) are used in mathematics and also allowed in SETL. Suppose, for example, that s is a set of numbers. Rather than simply forming the set (3), we may want to form a set of numbers obtained from (3) by applying some common transformation to all its elements, for example, by squaring them. To form this set, we are allowed to write

{x * x: x in s | C}

which can be read "the set of all values x squared, for all x ranging over the set s which are such that C." The general form of the more powerful kind of set former is

		                  {e: x in s | C}					(4)

In (4), e can be any expression, s any set valued expression, C any Boolean- valued expression. We can read (4) as "the set of all values e, formed for those x in s for which C has the value true." Usually both e and C will depend on the value of x, i.e., on the various values of the members of s. This reading of the notation (4) suggests a further generalization, which again is used in standard mathematics and is also legal in SETL. Specifically, there is no reason why in forming a set like (4) we should only allow one variable x to range over one set s. Instead, we can allow any number of variables to range over any number of sets. The notations

		{e: x in s1, y in s2 | C}					(5a)
		{e: x in s1, y in s2, z in s3 | C}    			        (5b)

etc., express this more general construction. Note that (5b) can be read as "x ranging over s1, y independently ranging over s2, z ranging (again independently) over s3, but only in combinations x, y, z for which C has the value true."

Subsequently we will see that even further generalizations of the set former constructs (3), (4), (5a), (5b), etc., are allowed. But, even as they stand, these constructs are extremely powerful, as the following interesting examples will illustrate. We begin by considering the problem of printing out prime numbers, for example, all prime numbers in a given range, let us say the range {1...100}. We remind the reader that positive numbers like 6 = 2 * 3, 9 = 3 * 3, 4 = 2 * 2 which are the product of two smaller numbers are called composite, and that numbers larger than 1 which are not composite are called prime; examples of primes are 3, 5, 7, 11 , 13, 17.... It is easy to express the set of all composite numbers up to 100 using a set former (of type (5b)), namely as

          {i * j: i in {2..10}, j in {2..50} | i * j < 101}.			(6)

Since the prime numbers we want are exactly the elements of {2..100} which do not belong to the set (6), we can print them out simply by writing

print({2..100} - {i * j: i in {2..10}, j in {2..50} | i * j < 101});

Sometimes the condition C appearing in (4), (5a), and (5b) is unnecessary. For example, given a set s of numbers we may simply want to form all the squares of numbers in s. In such cases one is simply allowed to drop the condition C, i.e., to write {e: x in s}, read "the set of all values e formed for x in s." Similarly, we can write

{e:x in s1, y in s2},
{e:x in s1, y in s2, z in s3}, etc.

For example, we can write the set of all pairs x, y, where x ranges over s1 and y ranges over s2, as

{[x, y]: x in s1, y in s2}.

(In mathematics, this set is called the Cartesian product of s1 and s2, after René Descartes, the inventor of coordinate geometry.) Using these abbreviated set formers we can print the sets of primes already considered more simply; for example, we can print the primes up to 100 by writing

print( {2..100} - {i * j: i in {2..10}, j in {2..50}})

Mathematicians who study prime numbers are often interested in primes having particular forms, for example, primes p which are one more than a multiple of 4, or three more than a multiple of 4. Since the set of all numbers (greater than 1) up to 100 which are one more (respectively, three more) than a multiple of 4 can be expressed as

{4 * n + 1: n in {0..24} | 4 * n + 1 < 101}

and

{4 * n + 3: n in {0..24} | 4 * n + 3 < 101}
we can get these special sets of primes by writing

a := {2..100} - {i * j: i in {2..10}, j in {2..50}};

b := {4 * n + 1: n in {0..24} | 4 * n + 1 < 101};

print(a * b);
and

a := {2..100} - {i * j: i in {2..10}, j in {2..50}};

b := {4 * n + 3: n in {0..24} | 4 * n + 3 < 101};

print(a * b);
respectively.

Figure 3.1 Set Former Diagram

We can also print the set of primes (up to 100) which are 1 more than multiple of 4 by writing

print({4*n + 1: n in{1..24} | 4*n + 1 < 101} - {i * j: i in {2 ..10}, j in {2..50} | i * j < 101});

and the corresponding set of primes which are 3 more than a multiple of 4 by writing

print({4*n + 3: n in{0..24} | 4*n + 3 < 101} - {i * j: i in {2..10}, j in {2..50} | i * j < 101});

The various forms of the set former described are summarized in the syntax diagram in Figure 3.1. Other kinds of set formers also exist and will be described later.

3.5.4 Existential and universal quantifiers

Very often, the key to a mathematical problem is to determine whether there exists any object x satisfying a given condition C. Similarly the key to a programming problem often lies in finding such an x if it exists, and if we know where to look for it, i.e., if we know the set of objects among which x might be found. Using set formers, it is easy to express the condition that there should exist an x in s satisfying C. We have only to write

              {x in s | C} /= { }						(7)

or in words, "the set of elements of s which satisfy C is not empty." Moreover, if the condition (7) is satisfied, we can easily obtain such an x by extracting it from the non empty set we have just constructed:

		arb{x in s | C}.						(8)

Since the test (7) is so important and common, a special abbreviation is provided for it, namely

		exists x in s | C.						(9)

this is a Boolean-valued expression, yielding exactly the same value as (7). Moreover, if it yields the value true, it will set x to the value of (8), i.e., to some value satisfying C. If (7) is false, then the variable x in (8) gets the value OM (just as arb { } is OM).

As in a set former, the s in (9) can be an arbitrary set-valued expression; C can be an arbitrary Boolean-valued expression. Generalizations of (9) corresponding to the generalized set formers (5a), (5b) are allowed. Specifically, one can write

		exists x in s1, y in s2 | C					(10a)
		exists x in s1, y in s2, z in s3 | C				(10b)

etc., where s1, s2, .. are arbitrary set-valued expressions and C a Boolean expression. The constructs (10a) and (10b) search the set of all x in s1, y in s2, ... for values satisfying the condition C. If such values are found, then (10a) (or (10b)) yields the value true and the variables x, y,... are set to these values. Otherwise (10a) (or (10b)) yields the value false and x, y, .. get the value OM.

The constructs (9), (10a), (10b), etc., are called existential quantifiers or existentially quantified expressions.

The existential quantifier allows us to express naturally the common query, Does there exist an object in a certain collection which satisfies a given criterion? A related query, which is also very common in programming contexts, is the following: Do all the objects in a collection satisfy some stated criterion? Such queries are expressed in SETL by means of constructs such as the following:

  forall x in s | C						(11a)
  forall x in s1, y in s2 I C					(11b)
  forall x in s1, y in s2, z in s3 | C				(11c)

which make use of the keyword forall. These constructs which are called universal quantifiers are closely related to existential quantifiers. The three cases just given are equivalent to:

  not (exists x in s | (not C))					(12a)
  not (exists x in s1, y in s2 | (not C))				(12b)
  not (exists x in s1, y in s2, z in s3 | (not C))			(12c)

respectively. For example, (11c) searches the set of all x in s1, y in s2, z in s3 for values such that the condition C takes on the value false. If none exists then (11c) returns the value true (and the variables x, y, z take the value OM). However, if values satisfying not C exist, then (11 c) returns the value false (and the variables x, y, z take on values (in s1, s2, and s3, respectively) fulfilling the condition not C). The keywords exists and forall are quantifiers. By using quantifiers we can write a simpler and more readable set former representing the set of all primes up to 100. Specifically, an integer n is prime if there exists no smaller integer m (other than 1) which divides n evenly, i.e., such that n mod m = 0. Hence

print({n in{2..100} | (not (exists m in{2..n - 1} | n mod m = 0)) and (n + 1) mod 4 = 0});

will print all the primes up to 100 which are one more than a multiple of 4, and

print({n in{2..100} | (not (exists m in{2..n - 1} | n mod m = 0)) and (n + 3) mod 4 = 0});

will print the set of all primes up to 100 which are 3 more than a multiple of 4.

As we have said, the existential quantifier (9) returns exactly the same value as the expression (7). However, the quantifier calculates this value more efficiently than (7) would, since to evaluate (9) the SETL system will search systematically through the elements of s but stop searching and return the value true as soon as an x satisfying C has been found, whereas to evaluate (7) it would always search through the whole of s building up the set {x in S | C} and only test it for emptiness after it had been evaluated fully.

3.5.4.1 A remark on bound variables in compound set formers and quantifiers

The variables x, y, z occurring in (9), (10a-b), (11a-c), and (12a-c) are called bound variables, since the evaluation of each quantified expression proceeds by giving to x, y, and z the values of successive elements of the sets s1, s2, etc., i.e., binds them to successive values of their respective domains. Quantifiers (or set formers) such as (10a-b), (11b-c), or (12b-c) involving more than one bound variable cause multiple iterations, e.g., in evaluating

exists x in s1, y in s2 | C

x is given successive values from the set s1, and then for each of these values of x, y is given all possible values from s2. For this reason, the expression s2 in (10a) is allowed to depend on the bound variable x, but s1 must be independent of y. Similarly, in

exists x in s1, y in s2, z in s3 | C

s3 can depend on both x and y, s2 can depend on x but not z, and s1 cannot depend on either y or z. Similar rules apply to universal quantifiers and to set formers.

3.5.5 Some illustrative one-statement programs

So far only a few of the commands available to the SETL programmer have been described, so that we cannot yet show any substantial programs. However, the mechanisms that have been described are powerful enough to allow various interesting single-statement programs to be written. In this section, we collect a few such programs.

3.5.5.1 More about prime numbers

As noted in the preceding section, an integer is called prime if it is not exactly divisible by any smaller (positive) integer other than 1.

To form the set of all prime numbers up to 100 we can use the one-line program idea given in the preceding section, which simply prints a set former:

print({n in {2..100} | not (exists m in {2..n - 1} | n mod m = 0)});

The output of this single-statement program is

{2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97}

(Note however that since sets are not ordered the elements of this set will actually be printed in some arbitrary order.)

Mathematicians who study prime numbers are sometimes interested in finding not all the primes in a given range, but only those which have various special properties. For example, a prime n is said to belong to a prime pair if both n and n + 2 are primes. (Note that, since all primes except 2 are odd, we cannot expect both n and n + 1 to be prime, because if n is a prime then n + 1 will be even, hence not a prime.) To find all prime pairs up to 100 we can simply write

	print({n in {2..100} |
		((not (exists m in {2..n - 1} | n mod m = 0))
			and (not (exists m in {2..n + 1} | (n + 2) mod m = 0)))});

The output of this program is

{3 5 11 17 29 41 59 71}

indicating that the only such twin-prime pairs are

[3, 5], [5, 7], [11, 13], [17, 19], [29, 31], [41, 43], [59, 61], [71, 73],

Sometimes one is interested in primes which satisfy particular quadratic equations, for example, primes n of the form n = k**2 + 1. Since if n is not larger than 100, any integer k solving this equation would have to be smaller than 10, we can find all the primes of this form just by writing

	print({n in {2..100} | ((not (exists m in {2..n - 1} | n mod m = 0))
		and (exists k in {1..10} | n = k*k + 1))});

Similarly, to find all the primes up to 100 which have the form 2k**2 + 3 we can write

	print({n in {2..100} | ((not (exists m in {2..n - 1} | n mod m = 0)) 
		and (exists k in {1..10} | n = 2*k*k + 3))});

the output of the first of these programs is

{2 5 17 37}

and the output of the second program is

{5 11 53}

(You may notice that these programs can be to run written much more efficiently; for now, we choose to ignore efficiency considerations.)

3.5.5.2 Integer right triangles

The famous theorem of Pythagoras states that the length h of the hypotenuse of a right triangle and the lengths a and b of its two sides are related by the equation a2 + b2 = h2. Whole-number solutions of this equation are useful to people who make up elementary mathematics exams and want to invent problems that have whole number answers. Examples of such integer right triangles are 3, 4, 5 and 5, 12, 13. The following single-statement program finds all integer right triangles a, b, h for which a is less than b and both are less than 30. We let b range over the set {1..30} and a range over the set {1..b -1}. To find if a*a + b*b is a perfect square, we simply search for an integer h whose square is equal to that sum. The possible range of h is from to a + b (approximately; can you give a more precise range for it?). In the program that follows we eliminate all triangles for which a and b have a common divisor, since these are simple multiples of smaller integer right triangles.

	print({[a, b, h]: b in {1..30}, a in {1..b - 1} | 	
		(exists h in {2..a + b} | (a*a + b*b = h*h)) and 
		(not (exists d in {2..b - 1} | ((b mod d) = 0 and (a mod d) = 0)))}); 

The output of this program is

{[3 4 5] [5 12 13] [8 15 17] [20 21 29] [7 24 25]}.

It is not hard to prove mathematically that there exist infinitely many different integer right triangles.

1.6 Tuple Operations and Tuple Formers

We have mentioned repeatedly that sets are unordered and can never have duplicate or undefined members; tuples are ordered and can have both duplicate and undefined components. For example,

[1, 0, 1, 0, OM, OM, 1, 0]

is a perfectly legal tuple; its first, third, and seventh components are all 1, while its fifth and sixth components are undefined. In spite of this very fundamental difference between sets and tuples, the binary and unary operators on tuples that SETL provides are similar to corresponding set operators. In addition, SETL allows tuple formers that construct tuples in the same manner that set formers build sets. The syntax of tuple formers is similar to that of set formers. In fact, all set-forming expressions can be transformed into tuple forming expressions by replacing the set brackets with tuple brackets.

3.6.1 Binary tuple operators

Binary tuple operators compute a result value from two inputs, one or both of which must be a tuple. The binary tuple operators are as follows (in what follows, t and tt are always tuples, and x can be an arbitrary value):

t + ttconcatenates tt to the end of t. Here n must be an integer.
n * tThis forms n copies of t and concatenates them end to end, to form a tuple n times as long as t. If n = 0, then the null tuple (i.e., [ ]) is obtained; if n < 0, an error results.
t * nif n is an integer, this is equivalent to n*t
t with xyields a new tuple identical to t except that x is appended to it as an additional final component
The following are predicates on tuples:
x in tyields true if x equals one of the components of t, false otherwise.
x notin tyields false if x equals one of the components of t; true otherwise.
t = ttyields true if all components of t are identical to the corresponding components of tt, false otherwise.
t /= ttyields true if some component of t differs from the corresponding component of tt, false otherwise.

A tuple is considered to extend from its first component to its last defined component, i.e., its last component different from OM. That is, all tuples can be regarded as having an indefinitely long sequence of trailing OM components (standing for the arbitrary number of components that may be added to it), but the identity of the tuple is given by its defined components only. In particular, when a tuple is printed its trailing OM components are not displayed. For example,

[OM, OM, OM, OM]is printed as[ ]
[1, OM, 2, OM]is printed as[1, OM, 2]
[1, OM] with OMis printed as[1]

Some examples of the binary tuple operators are

print([1,2] + [3,4]);yields[1 2 3 4]
print([1,2] with [3,4]);yields[1 2 [3 4]]
print(2 * [1,2],[1,3] * 2);yields[1 2 1 2] [1 3 1 3]
print(1 in [1, 2, 3], [1, 2] in [1, 2, 3]);yieldstrue false
print(OM in [1, 2, 3], OM in [1, OM, 3]);yieldsfalse true
print([1, 2] = [2,1], [1, 2,1, 2] = [1, 21, 2]);yieldsfalse false
print([1, 2, 1, 2] /= [1, 2,1, 2,1], [1,1]/= [1,1,1], [1]/= [1,OM]);yieldstrue true false
print ([ ] /= {}); yieldstrue

3.6.2 Unary tuple operators

Unary tuple operators produce a value from a single tuple operand. The unary tuple operators are

#tyields the index of the last non-OM component of t
random tyields a component of t picked at random from its first to its last non-OM component. All components, including OM components in this range, have an equal chance to be picked. Note that successive uses of random t will generally yield different, independently chosen random components.

The following are examples of the unary tuple operators.

print(#[3], #[ ], #[1,OM]);yields 1 0 1
print(#[1,OM], #[OM, 1], #[1,1,1]);yields 1 2 3
print(#[1,OM,OM,OM,OM, 1]);yields 6
print( #[1, OM, OM, OM, OM] );yields 1
print(#[1,2,3,4], #[1, 2, [3,4]], #[[1,2,3,4]]);yields 4 3 1

3.6.3 Other tuple operators. Indexing and slicing.

As for sets, so for tuples the construct

[x1, x2,... ,xk]

forms a tuple by enumerating its elements explicitly. The various xj appearing in this construct can be arbitrary expressions. If some of the xj appearing at the end of this construct evaluate to OM, then a tuple of length less than k will be formed. For example, if t has the value [1, OM, OM, 2], then

[t(4), t(3), t(2), t(3)]

forms the tuple [2, OM, OM, OM], i.e., the tuple [2], whose length is of course 1. As was discussed in Section 3.2.1 the tuple of integers ranging from n to m (inclusive) can be written as

[n..m]

and the tuple of integers n, n + k, n + 2k, etc., up to m can be written

[n, n + k..m].

In this last form, the "step" k can be positive (producing an ascending sequence) or negative (producing a descending sequence). The quantity n + k need not actually be a sum but can be any integer-valued expression. If the m in [n..m] is less than n, then the null tuple results. Similarly, if m = m1, then [m,m1..n] will yield the null tuple. Similar rules apply to [n,n + k..m]. For example,

print([3,5..1]);yields[ ]
print([3,2..-3]);yields[3 2 1 0 -1 -2 -3]
print([3,2..4]);yields[ ]
print([3,4..2]);yields[ ]

Tuple indexing, slice, and assignment operators are similar to the string slice and assignment operators described in Section 2.5.3. The indexing and slice operators are as follows (we assume as before that t designates a tuple):

t(i)yields the i-th component of the tuple t. If i is zero or less, an error results; if i exceeds the index of the last non-OM component of t, then t(i) yields OM.
t(i..j)yields the section or slice of t extending from its i-th through its j-th components, inclusive. If i = j + 1, then t(i..j) always yields the null tuple. If i > j + 1, an error results. If i exceeds the last non-OM component of t, then a null tuple is returned.If j exceeds the last non-OM component of t, an error results.
t(i..)yields the section or slice of t extending from its i-th through its last non-OM component, inclusive. This operator is generally equivalent to t(i..#t). If i is zero or negative, or if i exceeds #t + 1, an error results. If i = #t + 1, then t(i..) yields the null tuple.

To give examples of these operators, we assume that t is the tuple [10,OM,30,OM,50,OM,70]. Then:

print(t(1),t(2),t(3));yields10 OM 30
print(t(7),t(8));yields70 OM
print(t(2..5),t(2..6));yields[OM 30 OM 50] [OM 30 OM 50]
print(t(2..8));yieldsan error
print(t(3..2));yields[ ]
print(t(8..11));yieldsan error
print(t(3..8));yieldsan error
print(t(9..));yieldsan error

It should also be noted that if the i-th component of t is itself a tuple or a string, then further indexing of t(i) is possible. Suppose, for example, that t is the following tuple of tuples of strings:

[["Tom","Dick","Harry"],["Peter","Paul","Mary"],["Mutt","Jeff"]]
print(t(2));yields [Peter Paul Mary]
print(t(2)(3));yields Mary
print(t(2)(3)(1));yields M
print(t(2.. 3)); yields [[Peter Paul Mary] [Mutt Jeff]]
print(t(2..3)(2..));yields [[Mutt Jeff]]
print(t(2..3)(2..)(1));yields [Mutt Jeff]
print(t(2..3)(2..) (1)(2..));yields [Jeff]
print(t(2..3)(2..)(1)(2..)(1)(2));yields e

The tuple assignment operators are as follows (we assume as before that the values of t and tt are tuples):

t(i) := x;modifies the i-th component of the tuple t, setting it equal to the value of x. If i is zero or negative, an error results. If i exceeds the index of the last non-OM component of t, then t will be extended with as many OM components as necessary, and then its i-th component will be set equal to x. (Therefore the assignment t(i) := x can increase the length of t by any amount up to i)
t(i..j) := tt;modifies the section of t extending from its i-th through its j-th component, setting this section equal to the components of tt. If i is zero or negative, or if i exceeds j + 1, an error results. If i = j + 1, then the components of tt will be inserted into t immediately before position i. If i exceeds the index of the last non-OM component of t, then t will be extended with as many OM components as necessary, and then tt will be appended.
t(i..) := tt;this assignment is equivalent to t(i..#t) := tt. Thus it modifies the section of t extending from its i-th component to its last non-OM component, setting it equal to the components of tt. If i is zero or negative, or if i exceeds #t + 1, an error results. If i = #t + 1, then the components of tt are appended to the end of t.

To give examples of these operators, suppose that t1, t2,..., t22 all have the value [1, 2, 3, OM, OM, 6]. Then,

t1(2) := OM;-- now t1 = [1,OM,3,OM,OM,6]
t2(4) := 40;-- now t2 = [1, 2, 3, 40, OM, 6]
t3(8) := 70;-- now t3 = [1,2,3,OM,OM,6,OM,70]
t4(9) := OM;-- now t4 = [1, 2, 3, OM, OM, 6]
t5(2..4) := [OM, 30, 40];-- now t5 = [1, OM, 30, 40, OM, 6]
t6(2..2) := [20];-- now t6 = [1, 20, 3, OM, OM, 6]
t7(2) := 20;-- now t7 = [1, 20, 3, OM, OM, 6]
t8(2) := [20];-- now t8 = [1, [20], 3, OM, OM, 6]
t9(2..2) := [20];-- now t9 = [1, 20, 3, OM, OM, 6]
t10(2..1) := [20, OM, 30];-- now t10 = [1,20,OM,30,2,3,OM,OM,6]
t11(6..5) := [20, OM, 30];-- now t11 = [1,2,3,OM,OM,20,OM,30,6]
t12(1..0) := [20, OM, 30];-- now t12 = [20, OM, 30, 1, 2, 3, OM, OM, 6]
t13(7..9) := [20, OM, 30];-- results in an error
t14(5..5) := [20, OM, 30];-- now t14 = [1,2,3,OM,20,OM,30,6]
t15(5..5) := [20, OM, OM];-- now t15 = [1,2,3,OM,20,6]
t16(4..5):=[ ];-- now t16 = [1,2,3,6]
t17(2..3) := [20];-- now t17 = [1, 20, OM, OM, 6]
t18(2..4) := [20];-- now t18 = [1,20,OM,6]
t19(6..) := [ ];-- now t19 = [1, 2, 3]
t20(5..) := [50,60,70,80];-- now t20 = [1,2,3,OM,50,60,70,80]
t21(7..) := [50,60,OM,80];-- now t21 = [1,2,3,OM,OM,6,50,60,OM,80]
t22(8..) := [20, OM, 30];-- results in an error

Repeatedly indexed tuple (and map) assignments such as

t(i)(j..k)(1) := tt;

are possible in some cases; see Section 3.12 for a general discussion of these assignments.

3.7 Tuple Formers; Simple Tuple and String Iterators

The construct

				[e:x in s | C] 					(1)

reads "the tuple of all values assumed by the expression e as x ranges over the elements of s for which the condition C has value true." It is similar to the set former

				{e:x in s | C},  		               	(2)

(see Section 3.5.3) except that (2) eliminates duplicates and builds a set, whereas (1) builds a tuple and does not eliminate duplicates. The order in which the components of the tuple (1) are arranged is determined by the order in which iteration proceeds over the elements x of the set s.

As in the case of set formers, the condition C appearing in (1) need not appear; i.e., one can write

				[e: x in s]  		               	(3)

read "the tuple of all values assumed by the expression e as x ranges over all the elements of s." Moreover, multiple iterations can be used in a tuple formers; i.e., constructs like

				[e:x in s1,y in s2]			(3a)
				[e:x in s1,y in s2, z in s3]		(3b)

etc., are allowed. Again, the order in which the components of (3a) or (3b) are arranged depends on the order in which iteration proceeds over the elements of s1, s2, etc. For example, in (3a) a complete iteration over s2 will always be made each time the variable x advances from one element of s1 to the next, and in (3b) a complete iteration over s3 will always take place each time the variable y advances from one element of s2 to the next.

[x in s | C]

is read "the tuple of all x in s for which the condition C evaluates to true." It is also possible to elide C, thereby writing

[x: x in s]

This simply arranges the elements of the set s in (arbitrary) order as a tuple.

As noted in Section 3.5.3, the iterator x in s appearing in such constructs as the set former

					{e:x in s | C}			(4)

and the existential quantifier

				...exists x in s | C..			(5)

iterates over the elements of s, assigning each one of them in turn as the value of x, until the iteration terminates, either because (as in (4)) all elements of s have been processed or because (as in (5)) an element x of s satisfying the condition C has been found. Since iterative constructions and searches of this kind are quite useful, corresponding iterators over tuples and strings are also provided. If t is a tuple, then the iterator x in t, which can be used in such contexts as

				{e: x in t | C}				(4a)

				...exists x in t | C..			(4b)

iterates over the components of t, in order, from its first component to its last non-OM component, assigning each component in turn as the value of the variable x, until the iteration terminates for one of the two possible reasons stated. The iteration advances over all components, including OM components, in turn, but components not satisfying the Boolean condition C appearing in (4a) and (4b) are bypassed. We emphasize that, even though the corresponding set iterator, e.g.,

...exists x in s | C

iterates over the elements of the set s in some unpredictable, arbitrary order, the tuple iterator (4b) always iterates over the components of t in a known order, namely from first component to last. Therefore, if the existential search (4b) finds any component x of t satisfying the condition C, it will always find the leftmost such component, which will become the value of x.

We can iterate over the successive characters of a string in similar fashion. If in (4a) t is a string, then (4a) iterates over its characters, in order, from its first character to its last, assigning each character in turn as the value of the variable x, until the iteration terminates for one of the two possible reasons stated. Characters not satisfying the condition C appearing in (4a) are bypassed. Similar remarks apply to the set former (4a) and to universal quantifiers which iterate over strings and tuples.

Note, as an easy application of all this, that the set s of all distinct components of a tuple t can be formed by writing

{x: x in t}.

If t is a string, this same expression will form the set of all its distinct characters.

A more general account of the iterator forms usable in set formers, tuple formers, compound operators, and for loops is given in Section 4.3.

By writing the iterator

x in [M..N]

as part of a set former or quantifier we can cause x to range over all the integers of the numerical range M through N inclusive in order. Similarly, by writing the iterator

x in [M,M + k..N]

we cause x to be iterated over integers lying between M and N, starting with M and proceeding by steps of k. This iteration will proceed either in increasing or in decreasing order, depending on whether k is positive or negative. (If k = O, the iteration will be terminated as soon as it is attempted.) For example, to find all the vowels in a string which are followed by other vowels and print the corresponding set of all double vowels or diphthongs, we can simply write:

vowels := {"a","e","i","o","u","y"};
print({s(i..i + 1): i in [1.. #s - 1] | (s(i) in vowels and s(i + 1) in vowels)});

Note that we could have defined vowels as

vowels := "aeiou";

because the construct (a in b) will test whether a is an element or a component of b, if b is either a set, a tuple, or a string. Similarly, to find the set of all places in a tuple of integers at which the sign of its component changes from + to - we can simply write

print({i in [1..#t-1] | t(i) > 0 and t(i + 1) < 0}).

3.8 Map Operations

Sets of a special kind, namely sets all of whose elements are tuples of length 2 (also called ordered pairs or simply pairs) have a very special importance in SETL because they can be used to record associations between pairs of objects, or between objects and attributes of these objects (for example, between the names of people and their birth dates, between words in one language and words in another, etc.).

Sets of this kind are called maps, and the most significant operators of SETL, its map operators, apply only to such sets. In this section, we will describe these operators and their use. Let us start by restating that maps are sets, and as such all set operations (union, membership, etc.) apply to maps, without exception.

3.8.1 The image-set operator f{x}.

Suppose that f is a map, i.e., a set of pairs:

		{[x1,y1], [x2,y2],..., [xk,yk]}.        	   (1)

Then f{x}, called the image set of f at the point x, is defined to be the set of all second components of pairs in f whose first component is x. Using the standard set former, we can write this set as

		{y(2): y in f | y(1) = x}       		       (2)

The significance of this operation lies in the fact that, if we regard f as representing a certain abstract relationship R, then f{x} is precisely the set of all elements which stand in the relationship R to the object x.

Suppose, for example, that f contains the pair [s, c] if and only if s is a student in a particular school and c is a course in which s is registered. Then f{s} designates the set of all courses in which student s is registered. Suppose next that g is another map, which contains the pair [c, s] if and only if f contains the pair [s, c]. (This map is called the inverse of the map f.) Then for each course c, g{c} is the set of all students registered.

For a still more specific example, suppose that f is the map

	{["Jones", "Tom"], ["Khalid", "Leila"], ["Smith", "Mary"], ["Khalid", "Fatima"] }           (3)
Then:

f{"Jones"} is {"Tom"};
f{"Smith"} is {"Mary"};
f{"Khalid"} is {"Leila","Fatima"}.

Moreover

f{"Chang"} is the null set, i.e. { }

since no pair beginning with "Chang" is present in the map (3).

In Section 3.3 we introduced the notions of domain and range of a map. In terms of the image-set operation, their meaning can be restated as follows: the domain of f, namely the set of all first components of pairs in f, is also the set of all x for which f{x} is different from { }, and the range of f, namely the set of all second components of pairs in f, is also the set of all y which belong to at least one set of the form f{x}.

3.8.2 The single-valued image operator f(x)

If the image set f{x} contains exactly one element y, that is, if f{x} is {y}, then we can also write f(x) for y (rather than having to write arb f{x}, which would give the same value). The quantity f(x) (read "f of x") is called the image (or sometimes, for additional emphasis, the single valued image), of the element x under the map f, and we say that the map f sends x into f(x). If x is not in the domain of f, so that f{x} is empty, or if f{x} contains more than one element, then f(x) yields OM.

This last rule can be understood as follows. If, as before, we regard f as representing an abstract relationship R, then f(x) represents the unique element y which stands in the relationship R to x. If x is not in domain f, then f(x) is obviously undefined, since no element stands in the relationship R to x. If f{x} contains more than one element, then f(x) is still undefined, since we cannot tell which one of the several elements of f{x} the expression f(x) is supposed to represent. We can only speak of the element standing in the relationship R to x if f{x} contains exactly one element, in which case f(x) gives a non-OM value.

For an example of all this, suppose once more that f is the map (3). Then

f("Jones") is "Tom"; f("Smith") is "Mary";
f("Chang") is OM, since "Chang" is not in the domain of f;
f("Khalid") is OM, since f{"Khalid"} is a set containing more than one element

A map f is called single-valued at x if f(x) is defined, but is called multiple-valued at x if f{x} contains more than one element. The map f is said to be a single-valued map (or simply to be single-valued) if it is single-valued at each element x of its domain.

Note that maps are also sets (namely sets all of whose elements are tuples of length 2), so that all set operations also apply to maps. In particular, we an form the union, intersection, and difference of maps; add elements to and subtract elements from a map using the with and less operators; evaluate #f where f is a map, etc. For example, if f and g are both maps, then f + g, f * g, and f - g are also maps since every element of any one of these sets will be a pair; the same remark applies to f less z for any z. Moreover, if f is a map and z is known to be a pair, then f with z is still a map since all its elements are pairs. For example, if f is the map (3) and we let f2 be f with ["Jones", "Sue"], then f2 is still a map; moreover f2{"Jones"} is {"Tom", "Sue"}, and f2("Jones") is OM.

SETL allows us not only to evaluate expressions like f{x} and f(x), but also to use such expressions as assignment targets. If the value of f is a map, the map assignment

                 	 f(x) := y;     			             (4)

is always legal. The effect of this assignment is to modify f, and, as the notation (4) is intended to suggest, to modify it in such a way as to cause the value of f(x) to become y immediately after the assignment (4) is executed. This is done by modifying f as follows:

  1. First, all pairs [x, z] whose first component is x are removed from f. (This has the effect of removing x from domain f).

  2. Next (if y has a value other than OM), the single pair [x, y] is inserted into f. Thus f will contain exactly one pair [x, y] whose first component is x, guaranteeing that now f(x) will evaluate to y.

  3. However, if y has the value OM, then only step (a), but not step (b), is performed. In this case x will simply have been removed from domain f, guaranteeing that f(x) will evaluate to OM.

Rules (a), (b), and (c) tell us that if y /= OM, then (4) has exactly the same effect as the assignment

			f := {z: z in f | z(1)/= x} with [x, y];			(5a)

while if y = OM, then (4) has the same effect as the assignment

			f:= {z:z in f | z(l)/= x}.			(5b)

The intuitive significance of the assignment (4) can be explained as follows: it directs us to drop any prior association to the element x that is recorded in f, and then to associate x with y (for which we insert the pair [x, y] into f if y /= OM, but simply leave x without any association if y = OM). This is exactly the effect of steps (a-c).

For example, suppose again that f is the map (3), and that we first perform the assignment

f("Jones") := "Thomas";

This changes f to

{["Jones", "Thomas"], ["Khalid", "Leila"], ["Smith", "Mary"], ["Khalid", "Fatima"]}

Suppose that the assignment

f("Chang") := "Zhong-Tien";

is performed next. In this case, no pairs need to be removed from f, but one pair is added, changing f to

{["Jones", "Thomas"], ["Khalid", "Leila"], ["Smith", "Mary"], ["Chang", "Zhong-Tien"], ["Khalid", "Fatima"]}

Next, suppose that the assignment

f("Cohen") := OM;

is performed. This will simply remove all pairs with first component "Cohen" from f, but since there are none such, it will actually leave f unchanged. After this, suppose that the assignment

f("Khalid") := "Nuri";

is performed. This removes the pairs ["Khalid","Leila"] and ["Khalid","Fatima"] from f and gives f the value

{["Jones", "Thomas"], ["Smith", "Mary"], ["Khalid", "Nuri"], ["Chang", "Zhong-Tien"]}

Assignments of the form (4), which change the element y associated with an element x, are generally used for one of three purposes:

  1. to update an attribute f(x) of x;

  2. to define an attribute of x which has previously been undefined;

  3. to drop an attribute f(x) that is no longer needed, which we do by executing f(x) := OM.

Suppose, for example, that a map called count is being used to keep track of the number of times that each word x has been seen in a body of text that is being scanned. On encountering a word, we test to see whether it has been seen before; if so, we simply increment its count. Otherwise, we must initialize its count attribute, which will be undefined, to the value 1. This is done by the following code, which uses several map assignment operations.

if count(x) = OM then				-- word is new.
	count(x) := 1;				-- establish initial count for new word
else
	count(x) := count(x) + 1;		-- increment count if previously seen.
end if;

Note that a map assignment f(x) := y begins (see (a)) by attempting to remove a certain set of pairs from f, which assumes that f is already a map. Hence the operation f(x) := y (like the operations y := f(x) and y := f{x}) can only be applied if f is already a map. The question then arises as to how to initialize a map f. This can be done in one of two ways:

  1. If f is initially supposed to be the ("everywhere undefined") map whose domain is null (so that initially f(x) = OM for all x and f{x} = { } for all x), we simply write

    f := { };

    This makes f the everywhere undefined map with null domain and null range.

  2. A map value can be built up directly using a set former, providing that all the elements thus formed are pairs. For example, we can write

    f := {[x, #x]: x in {"Tom","Dick","Harry"}};

    This makes f into the map {["Tom", 3], ["Dick", 4], ["Harry", 5]}; i.e., f is a map with domain {"Tom","Dick","Harry"}, and f maps each element x in this domain into the length of x.

The multivalued map assignment

		f{x} := y;                  (6)

also has meaning in SETL. As the notation (6) suggests, this assignment modifies f in such a way as to cause the value f{x} to be y immediately after the assignment (6) is executed. It follows that (6) makes sense only if the value of y is a set. Otherwise the operation (6) generates an error.

The multivalued map assignment (6) is performed as follows:

  1. We first check that f is a map (i.e., a set consisting of pairs only), and that y is a set. If either of these conditions is violated, an error is generated.

  2. All pairs x, y whose first component is x are removed from f (This has the effect of removing x from domain f.)

  3. After this, the set of all pairs [x,z], for all z in y, is added to f. This guarantees that f{x} will evaluate to y.

These rules tell us that (6) has exactly the same effect as the assignment

		f := {u: u in f | u(1) /= x} + {[x,z]: z in y}.         (7)

Note therefore that if y /= OM, the single-valued assignment (4) has exactly the same effect as the map assignment

		f{x} := {y};               				  (8a)

if y = OM, then the effect of (4) is exactly that of

		f{x} := { };               				  (8b)

The value (5b) given to f by either f(x) := OM or by f{x} := { } can also be written in another form, using the map operator lessf:

		f := f lessf x;               				  (9)

which occasionally is more convenient. The expression f lessf x means "f less all the pairs in f whose first element is x." Note that (9), like the map assignment operators, applies only to maps and will generate an error if applied to a set f which contains any non-pair elements.

As an example, suppose again that f is the map

	{["Jones", "Tom"], ["Khalid", "Leila"], ["Smith", "Mary"], ["Khalid", "Fatima"]} (3)

Then after the assignment

f{"Khalid"} := f{"Khalid"} with "Omar";

f will have the value

{["Jones", "Tom"], ["Khalid", "Leila"], ["Khalid", "Omar"], ["Smith", "Mary"], ["Khalid", "Fatima"]}
.

If we subsequently execute the assignment

f{"Jones"} := { };
.

then f will take on the value

{["Khalid", "Leila"], ["Khalid", "Omar"], ["Smith", "Mary"], ["Khalid", "Fatima"]}
.

Along with the general set former construct, the map operations f(x), f{x}, f(x) := y, and f{x} := y are the most characteristic and important operations of the SETL language. Their importance derives from the fact that they allow arbitrary objects x to appear as indices; i.e., any object can appear as the x in a construct like f(x) or f(x) := y. Of course, other, lower-level, programming languages, such as PL/I, PASCAL, and Ada, support constructs with exactly the syntax f(x) := y and with a very similar intended use. However, in these other languages, an f used in this way must be an array (an object much like a SETL tuple), and the x appearing in f(x) or in an assignment f(x) := y must be a discrete type, which means an integer or something easily described by an integer. This complicates the manipulation of attributes associated with arbitrary objects x (and attribute manipulation is basic to programming). To manipulate attributes of a non-integer object x (say string or a set) in these other languages, one must first find a way of associating an integer with x and then must use this integer, instead of x itself, whenever the attributes of x need to be used or manipulated. This introduces a layer of artifice into programs, making them less direct, less readable, and more error-prone. This objection applies even to a language as elegant and powerful as APL, which only allows integers (and arrays of integers) to appear as indices. The best known languages which support something like the map operations of SETL are JavaScript, Java, SNOBOL (through its TABLE feature), some versions of LISP, and Python.

8.3 Some remarks on multivalued maps

When the elements of the range of a map are all of the same type, which is the common case, we often use that type to describe the map. For example, we speak of an integer-valued map f if f(x) is always an integer value. Similarly we say that a map is set-valued if f(x) is always a set.

Set-valued maps can be handled (in SETL) in one of two nearly equivalent styles. Either style is acceptable, and neither has any overwhelming advantage, but they are different, and to avoid error it is important to distinguish clearly between them. These two possibilities are as follows:

  1. A set-valued map f can be represented as a single-valued map whose value f(x) is a set, but
  2. The same map can be represented by a multivalued map g such that g{x} = f(x).
If f is available, then g can be produced by writing

		g := {[x,y]: x in domain f, y in f(x)}    		      (10)

Note the double iteration: for every element x in the domain of f we obtain the set f(x), and for each element y in this set we add the pair [x,y] to the map g.)

Conversely, if g is available, then f can be produced by writing

		f := {[x,g{x}]: x in domain g}    		      (11)

Note however that elements x such that f(x) = { } contribute no pairs to the construction of y in (10). If there are such elements in f, and we perform (10) followed by (11) to recover f, then these elements will have disappeared, and f(x) will now be OM.

A new pair [x,y] can be added to g simply by writing

g := g with [x, y];

The equivalent transformation of f must be written

f(x) := f(x) with y;

which is a bit clumsier.

To initialize g to a set of pairs defined by a condition C, one normally writes something like

g:= {[x,y]: x in s1, y in s2 | C}

The corresponding initialization of f, namely

f:= {[x,{y in s2 | C}]: x in s1};

is somewhat clumsier.

These small technical differences sometimes lead one to prefer the "g" representation of set-valued maps to the "f" representation.

3.8.4 Two useful map operations

The inverse of a map g is the map h such that [x,y] in h if and only if [y,x] in g. (If g is single-valued, this is equivalent to the condition that y = h(x) if and only if x = g(y)). We can produce h from g simply by writing

h := {[p(2), p(1)]: p in g};

Here we treat g as a set and iterate over it in the usual fashion. Successive values of the bound variable p are elements of g, namely pairs. We simply reverse the two components of these pairs to construct the elements of the inverse map. This important construction occurs frequently.

The "product" or "composite" of two maps g1, g2 is the map G such that [x,y] in G if and only if there exists a z such that [x,z] in g1 and [z,y] in g2. (If g1 and g2 are both single-valued, this is equivalent to G(x) = g2(g1(x)).) To produce G from g1 and g2, we can simply write

G:= {[x,y]: x in domain g1, z in g1{x}, y in g2{z}};

In other words: g1 maps each element x of its domain into a set g{x}; each element z of this set may be mapped by g2 into a set g2{z}; and the composition of the two maps establishes a mapping from x to each one of the elements of g2{z}.

If g1 and g2 are single-valued, their composition is expressed more simply by

G := {[x,g2(g1(x))]: x in domain g1 | g2(g1(x)) /= OM};

This map product operation is also quite important. For example, if Fa maps each person x onto the father of x, and Mo maps each person y onto the mother of y, then the composite of Mo and Fa maps each person x onto x's paternal grandmother, while the composite of Fa and Mo maps each x onto x's maternal grandfather.

3.8.5 Multiparameter maps

As noted previously, maps f are used to associate attributes f(x) or sets f{x} of attributes with elements x. It is occasionally necessary to deal with attributes f(x1,...,xk) that depend on two or more objects x1, ..., xk. For this purpose, the generalized map operations

		f{x1,..., xk}							(la)

		f(x1,..., xk)							(1b)

and the corresponding map assignments

		f{x1,...,xk} := y							(2a)

		f(x1,..., xk) := y							(2b)

are provided. These simply abbreviate

		f {[x1,..., xk]}								(1a')

		f ([x1,..., xk])								(1b')

and

		f {[x1,..., xk]} := y							(2a')

		f ([xl,...,xk]) := y							(2b')

respectively. That is, a multiparameter map f(x1,...,xk) of k parameters is regarded simply as a map whose domain consists of tuples of length k.

All the map constructs of SETL can be used with multiparameter maps if they are regarded as one parameter maps whose domain elements are tuples. For example, if f is a k-parameter map, then domain f will be a set of tuples of length k.

3.9 Compound Operators

Many common calculations on collections of values (sets or tuples) involve the combination of all the elements of this collection by means of some binary operator. For example, if t is a tuple of integers, and we want to compute their average value, we will need first to evaluate their sum, that is to say:

			t(l) + t(2) + ... + t(n)					(1)

Such computations are common enough that SETL provides a special notation for them. For example, (1) can be written as

+/t

(read: "plus over t", or "sum over t"). The combination +/ is called a composite operator. In this case it is analogous to standard mathematical notation, which would be written:

(fill this equation in)

For any binary operator bop, and any composite object O, the notation

			bop/O							(1a)

has the same meaning as the expression

			e1 bop e2 bop...bop en					(2a)

obtained by combining all the elements or components ej of O together using the binary operator bop repeatedly. If O contains only one element or component, then the value of (2a) is e1, that is to say that single component. If O is empty, then the value of (2a) is undefined, i.e., OM. For example, if S is the set {1..5} then

+/Syields 15
*/Syields 120
+/{x in S | x < 4}yields 6
*/[x in S | x > 4]yields 5
+/{x in S | x < 0}yields OM

Not all binary operators can be used to construct composite operators: if a op b op c is to make sense, then the type of the expression (a op b) must also be a valid argument to op. The arithmetic operators satisfy this condition, but the comparison operators (for example, >= ) do not. Also, binary operations which are not associative (for example, mod) may give strange results if used in this context.

In addition to the "prefix" form (1a), composite operators can also be used in "infix" form:

			x bop/O       				           (1b)

where x is an expression of a valid type for the operator. (1b) is equivalent to the expression

			x bop e1 bop e2 bop...bop en,      		     (2b)

where again the ej are all the elements or components of O, and the argument x is used for the first application of bop. If O is empty, the value of (1b) is the left argument x.

Here are further examples of use of compound operators:

0 +/ t yields the sum of all elements of t, but yields 0 if t is null
max/s yields the maximum element in s, but yields OM if s is null

If t> is a tuple of strings, say ["first","next","last"], then

+/t yields "firstnextlast"

obtained by concatenating the strings. On the other hand, if S is a set of strings, say {"one","another"} then

+/S yields "oneanother", or maybe "anotherone"

because once again S has no definite ordering, and its elements are chosen in some arbitrary way when computing (1a) or (1b).

If a and b are tuples that represent vectors in some space, then

0 +/ [a(i) * b(i): t in [1..#a]]

is the scalar or dot product of vectors a and b. If t is a tuple of integers, then

			1 */ [x in t | x/=O]					(3)

is the product of all the nonzero components of t, or 1 if t is empty. If t is the tuple [0, 2, 0, 2, 5, 2], then the value of (3) is 40. Notice that

			1 */ {x in t | x /=0}					(3a)

which looks very similar to (3), has a different value, namely 10. (Can you see why, before reading further?)

(The reason is that {x in t | x /= 0} is the set {2,5} with no duplicate values, and [x in t | x /= 0] is the tuple [2, 2, 5, 2]. When you want to apply a composite operator over some collection of values, you should examine carefully whether the collection should be represented by a set without duplicate elements, or as a tuple. The results can differ, as the examples demonstrate.)

The compound operator bop/ can only be formed using the built-in binary.

3.10 Types and type-testing operators

As we have seen, the possible types of SETL values are atom, Boolean, integer, real, string, set, and tuple. Given a variable x, it is often useful to know what the type of its current value is. This type may change as the result of successive assignments. The built-in unary operator type applies to any value or expression and produces its type, as a capitalized string: i.e., for any x with a definite value, type x is either "ATOM", "BOOLEAN", "INTEGER", "REAL", "STRING", "SET", or "TUPLE". SETL also provides a set of built-in binary operators called is_atom, is_boolean, is_integer, is_string, is_set, is_tuple, each of which yields true if applied to an object of the corresponding type, false if applied to an object of any other type.

One additional unary operator, is_map, yields true when applied to a set all of whose elements are pairs, false otherwise.

The undefined value OM cannot be expected to have a type, and indeed the expression type(OM) yields OM itself. In addition, any of the type predicates, such as is_set(OM) or is_atom(OM), yields false.

3.11 The ? Operator

In certain situations undefined (i.e., OM) results can be expected to appear, and one will want to replace them by some other default values when they do appear. A typical situation of this kind was described in Section 3.8.2, when OM.

34. Generate about a hundred different pairs of tuples, tl and t2, of the same length, all of whose components are floating-point numbers. Then count the number of those t's which satisfy the following inequality:

         (+/[x * x: in tl]) * (+/[x * x: x in t2])
              >= abs(+/[tl(i) + t2(i): i in [1..#tl]])**2

(Be careful not be fooled by small errors in the computation; i.e., a pair of tuples that barely satisfies or fails to satisfy the preceding equality should be considered indeterminate and ignored). What percentage of the tuples tested satisfy this inequality? What do you deduce from this?

35. Build and print out the following sets, letting x vary over 10 floating-point numbers scattered over the range of 1.0 to 10.0:

(a) The set of x for which x ** 0.0 is different from 1.0.

(b) The set of x for which x ** 0.0 is different from 1.

(c) The set of all differences sqrt(x) - x ** 0.5.

(d) The set of all differences x ** 0.5 - x ** (1.0/2.0).

(e) The set of all differences x * x - x ** 2.0.

(f) The set of all differences x * x * x - x ** 3.0.

(g) The set of all differences x - (x ** 3) ** (1.0/3.0).

(h) The set of all differences sqrt(x * x) - x

(i) The set of all x such that sin(x) ** 2 + cos(x) ** 2 = 1.0.

3.12 General Form of the SETL Assignment: The Operators from, frome, and fromb

In preceding sections, we have observed that some of the constructs which can appear in an expression, and which retrieve values or parts of values, can also appear on the left hand side of an assignment, allowing the corresponding values to be assigned or modified. For example, when it appears in an expression the expression f{x} retrieves the image set of x under the map f, but when it appears to the left of an assignment, as in

f{x} := e;

then the image set of x becomes e. Similarly, when the expression s(i..j) appears in an expression it yields a string or tuple slice, but when it appears to the left of an assignment, as in

s(i..j) := e;

it causes the value of this string or tuple slice to become e.

Constructs which can appear to the left of an assignment operator can also appear in expressions, and the relationship between left-hand and right-hand appearances (i.e., ordinary appearances within an expression) of any such construct always exhibits an important logical symmetry. Specifically, if lhs denotes any construct which, like the constructs f{x} and s(i..j), can appear to the left of an assignment, then the effect of the assignment

lhs := e;

is to assure that immediately subsequent evaluation of lhs (within an expression, i.e., in a "right-hand" context) will yield the assigned value e.

The elementary constructs which are allowed to appear to the left of an assignment operator are the following:

  1. A variable identifier x. The assignment
    x := e;

    modifies the value of x.

  2. A tuple-former [x1,...,xk]. (Notice that the ellipsis (...) stands for some unspecified number of other components of the tuple. This should not be confused with the SETL substring operation s(x..y)). The assignment

    [x1,...,xk] := e;

    where e must be a tuple, modifies the value of each of x1, ..., sk and is equivalent to the series of simple assignments:

    x1 := e(1);
    x2:= e(2);
    ...
    xk:= e(k);

    In such an assignment, any of the xj can be replaced by the dummy symbol "-" (dash), in which case no assignment is performed for this particular xj. (This is the one exception to the general rule that any construct which can appear to the left of an assignment can also appear to its right.) As an example of this, note that the assignment

    			[x,-,y] := [1,2,3];        		      (1a)

    gives x the value 1 and y the value 3. Moreover, the assignment

    			[x,-, y] := [1,2,3,4];           		  (1b)

    has the same effect, since the fact that y occurs as the third component of the tuple on the left of (1b) means that the third component of the right-hand side of (1b) will be assigned to y. For the same reason, the assignment

    			[x,-,y,z,w] := [1,2,3,4];           		  (1c)

    gives x, y, z, and w the respective values 1, 3, 4, and OM.

    iii. A tuple, string, or map selection f(x). The assignment

    f(i):= e;

    modifies component i of f if f is a tuple, character i of f if f is a string, and the value f(i) if f is a map.

    iv. A multiparameter map selection f(x1,...,xk). This is equivalent to f([x1,...,xk]), and the assignment

    f(x1,...,xk):= e;

    is equivalent to f(x1,..., xk]) := e.

    v. A multivalued selection f{x}. The assignment

    f{x}:= e;

    modifies the set f{x} .

    vi. A multivalued, multiparameter map selection f{x1,...,xk}. This is equivalent to f{[x1,...,xk]}, and the corresponding assignment

    f{x1,...,xk}:= e;

    is equivalent to f{[x1,...,xk]} := e;

    vii. A string or tuple slice t(i..j) or t(i..). The effect of

    t(i..j):= e or t(i..):= e

    is to modify the portion t(i..j) of the string or tuple. Note that the value of the string or tuple expression e may have a length different from that of the subsection of t which e replaces, so these assignments can increase or decrease length of t. See Section 2.5.3 and 3.6.3, also Table 2.1, for a discussion of marginal cases of these assignments, e.g., j = i - 1, and i= #t+1.

    Simple expressions, of any of the types we have just listed, which can appear on the left of an assignment can also be compounded to build up more complex assignment targets that are also allowed to appear to the left of an assignment operator. For example, if f and g are maps, t is a tuple, and s is a string, then the assignment

    		[[x,y],f(u),g{v},s(j..)] := e;					(1a)

    is a legal assignment, whose effect is the same as that of the following sequence of assignments:

    		x	:=	e(1)(1); -- The first component of the first component
    		y	:=	e(1)(2);
    		f(u)	:=	e(2);							(1b)
    		g{v}	:=	e(3);
    		s(j..)	:=	e(4);
    

    Map and tuple component extraction operators can also be compounded; e.g., we are allowed to write h{u}(v)(i) if h is a map such that H1 = h{u} is also a map such that H1 (v) is a tuple whose i-th component can be extracted. The value x that h{u}(v)(i) produces is exactly that produced by the sequence

    		H1:= h{u};
    		H2:= Hl(v);
    		x:= H2(i);
    

    where H1 and H2 are otherwise unused, compiler-generated variables. Compounds of this sort can also be used to the left of assignment operators; for example we can write

    			h{u}(v)(i) := e;   				             (2a)

    This has exactly the same effect as the following sequence, into which the SETL compiler expands (2a):

    		H1 := h{u};
    
    H2 := H1 (v);
    H2(i) := e;
    H1(v) := H2;
    h{u} := H1;

    The general rules used to expand compound assignments can be stated as follows:

    i. An assignment of the form

    		[e1,...,ek] := x      			        (3a)

    is legal if, for each j between 1 and k, either ej is the sign "-" (dash), or else the assignment

    		ej := y;					

    is itself legal. If it is legal, (3a) is expanded into the sequence:

    		e1 := x(l);					(3b)
    		...	
    		ek := x(k);

    but in (3b) every assignment corresponding to an ej of the form "-" is omitted.

    ii. An assignment of one of the forms

    e(i) := x;(4a)
    e(i1,...,ik) := x;(4b)
    e{y} := x;(4c)
    e{y1,...,yk} := x;(4d)
    e(i..j) := x;(4e)
    e(i..) := x;(4f)

    is legal if and only if e is an expression, other than a tuple former [z1,...,zk], which can appear to the left of an assignment operator, and if in addition the corresponding code sequence

    temp_var := e; temp_var(i) := x; e := temp_var;(5a)
    temp_var := e; temp_var(i1,.., ik) := x; e := temp_var;(5b)
    temp_var := e; temp_var{y} := x; e := temp_var;(5c)
    temp_var := e; temp_var{y1,..,yk} := x; e := temp_var;(5d)
    temp_var := e; temp_var(i..j) := x; e := temp_var;(5e)
    temp_var := e; temp_var(i..) := x; e := temp_var;(5f)

    would be legal. When an operation (4a-f) is legal, it is expanded into the corresponding assignment sequence (5a-f). Of course, the final assignment of each of these sequences may itself require expansion; if necessary, this is performed recursively, leading to expansions like those shown in (1b) and (2b).

3.12.1 Assigning forms of infix operators

Assignments of the form

				lhs:= lhs op e;				(6)

where op designates any built-in (or user-defined, see Section 5.6.2) infix operator, and lhs designates any simple or compound expression which can legally appear to the left of an assignment operator, are extremely common. In SETL as in C, such assignments can be abbreviated as

				lhs op:= e;			(7)

For example, we can abbreviate

i := i + 1;

as

i +:= 1;

and

x := x max y;

as

x max:= y;

3.12.2 Assignment expressions

Simple assignments x := y (and even more complex assignments such as f{u} (v) := y) can be used as expressions. The value of such an expression is simply its right-hand-side y, but of course evaluation of such an expression always has a side effect; namely it modifies the value of the left-hand side. Because of this side effect, such an assignment is not strictly speaking an expression but can appear anywhere an expression can.

Assignment expressions of this sort are frequently used to abbreviate sequences of assignments which initialize a collection of variables by giving the same value to all of them. For example, the assignment

x := y := z := w := 0;

which is equivalent to

x := (y := (z := (w := 0)));

gives all four variables x, y, z, w the value zero. Another common use of assignment expressions is to save the value of quantities that one needs to use just past the point at which they are first evaluated. The code fragment

		if (x := f(u) + g(v)) in s then f(u) := x; else...       (8)

illustrates this. Since the quantity f(u) + g(v) is needed immediately after the test in which it is first evaluated, the programmer may find it convenient to assign this quantity as the value of an auxiliary variable x, saving reevaluation, and, equally important, abbreviating the program source text. A related example, showing another common use of the assignment expression construct, is

			if (x := y + z) >0 then
				positives with := x;
			else						(9)
				negatives with := x;
			end if;

Over-enthusiastic use of assignment expressions will lead to a crabbed programming style in which important operations flit by without sufficient syntactic emphasis. This will be bad if it deprives a program's reader of too much of the redundancy on which understanding of the program depends. A good rule of thumb is to use an assignment expression only when the subsequent target variable of the expression is used again within a few lines (less than 5, say) after the assigning expression itself.

3.12.3 Other positions in which assignment targets are allowed

There are a few of the other positions in which variables can occur that resemble the left-hand sides of assignment operators, in that new values are assigned to variables appearing in these positions when the contexts containing them are evaluated. These assigning positions are as follows:

i. The position of x in an iterator

x in s | ...

is assigning, since the iterator will assign successive values to x. The same remark applies to the position of x in an existential quantifier

exists x in s | ...

and in a universal quantifier

forall x in s | ...

This means that in such an iterator, the bound variable need not be a variable name, but can be any valid left-hand side. For example, if m is a map, then we know that in the iterator:

p in m

the successive values of p are tuples of length 2, p(1) is in the domain of m, and p(2) in its range. The components of p can be obtained if we write the iterator directly as

[x,y] in m

Now x and y are both bound by the iterator; in the course of the iteration, x will receive the value of successive elements of the domain of m, and y the corresponding element in the range. With this notation, the inverse map of m can be written simply as

m_inverse := {[y, x]: [x, y] in m};

Of course, the same remark applies to variables appearing in corresponding positions in multiple iterators, as in the case of the variables x, y, and x in

x in s, y in t, z in [1..n] |...

ii. The position of x and i in a map, tuple, or string iterator (to be discussed in Chapter 4)

x = f(i) |..

or in a multivalued map iterator

x = f{i} |..

is assigning. Of course, the corresponding positions in multiple iterators and in quantifiers are also assigning positions.

iii. Argument positions in function and procedure invocations corresponding to formal procedure or function parameters (see Chapter 5) that carry the read/write qualifier rw are also assigning positions.

Precisely the same expressions that can appear to the left of an assignment operator are allowed to appear in any other assigning position. Thus, for example, the construction

x + y in s |...

is illegal, since

x + y := e

would also be illegal; x + y is not a legal assignment target. On the other hand,

			[x,y] in s | C				(lOa)

			f(x) in s | C				(lOb)

			[[u,v],y] in s | C			(lOc)

are all legal, and have the same respective meanings as the code fragments

	(for temp_var in s) [x,y] := temp_var; if not C then quit; end; 	(11 a)

	(for temp_var in s) f(x) := temp_var; if not C then quit; end; 	   	(11b)

	(for temp_var in s) [[u,v],y] := temp_var; if not C then quit; end; 	(11c)

Much the same remark applies to quantifiers containing iterators in assigning positions, for example, in

			...exists [x,y] in s | C(x,y)...   	       (12)

The iteration implicit in the existential quantifier (12) will generate successive elements z of s and perform an implicit assignment [x,y] := z before the Boolean expression C(x,y) is evaluated.

As already noted, the position of i in

			for x = f(i) |... loop...				(13a)

and in

			for x = f{i} |... loop...				(13b)

also the positions of i1, ..., ik in

			for x = f(il,...,ik) |... loop...			(13c)

and in

			for x = f{il,...,ik} |... loop...			(13d)

are assigning.

Any expression which can appear to the left of an assignment operator can be substituted for the i in (13a) or (13b), or for any of i1 through ik in (13c) or (13d).

			for [x,y] = f([u,v]) | C(x,y,u,v) loop...			(14)

In (14), the iterator will generate successive elements z of the domain off and w of its range and then perform implicit assignments [x, y] := w and [u, v] := z before the Boolean expression C(x,y,u,v) is evaluated. Note also that (13c) and (13d) are equivalent to

			 for x = f([i1,...,ik]) |... loop...			(l5c)

and

			 for x = f{[i1,..., ik]}|... loop...		(15d)

respectively.


 

3.12.4 The operators from, frome, and fromb

As we will see in coming chapters, a very common model for solving problems involving sets has the following form: construct a set that details the work to be done; then remove some arbitrary element of the set, process it in some fashion, and continue until the set (often called a workpile) becomes empty. This model is also often used with tuples, in cases where the order in which we must tackle the work to be done is important. For this purpose, SETL provides three related operators: from, frome, and fromb. All these of them have an assignment-like effect. The first of these, namely

			x from s;					(16a)

where s is a set, is equivalent to the following code fragment:

			x := arb s; -- Choose any element of s.  	(16b)
			s := s less x; -- Remove it from s.

Thus the statement (16a) has an effect on both of its arguments: it assigns to the left argument and removes something from the right. The alert reader will easily guess that the left argument of from is in an assigning position and can therefore be any valid left-hand side. In fact so is the right-hand side of from, as (16b) makes clear. For example, if T is a tuple of maps, we might write

[f(a), g(b)] from T(x);

which is equivalent to


p := arb T(x);
f(a) := p(1);
g(b) := p(2);
T(x) less := p;

If s is the null set, then the effect of (16a) is to set x to OM and leave s untouched (there is nothing we can remove from it).

In the case of tuples, it turns out to be useful to be able to specify that the element being removed should come from either the beginning or the end of the tuple. Accordingly, the two operators forms provided are

		x frome t; -- take x 'from the end' of t      (17a)

		x fromb t; -- take x 'from the beginning' of t      (17b)

The effect of (17a) is to remove the last (non-OM) component of t and assign it to the variable x. If t is the null tuple, then x becomes OM and t remains null. The effect of (17b) is to remove the first component of t and assign it to the variable x. If this first component is OM, then x becomes OM, but t is reduced in length by 1 (its leading OM component is removed). If t is null, then x becomes OM and t remains null.

Like (16a), the forms (17a) and (17b) can be used as expressions. When used in this way they both yield the value assigned to x.

Note that, if t has OM components immediately preceding its last non-OM component, then (17a) can decrease the length of t by more than 1. For example, the sequence

		t := [1,2,OM,OM,3];
		x frome t; y frome t;
		print(x, y, #t);

will produce the output

3 2 1.

If a tuple is constructed solely by means of the with operator, which adds elements to the end of the tuple, and subsequently modified by means of with and frome, the last element that was added to the tuple is always the first one to be removed from it. Such a structure is known as a "last in, first out," or lifo, structure, or more commonly as a stack.

Conversely, if a tuple is built by means of with, and modified by means of with and fromb, then the first element that was added to the tuple is always the first one to be removed, and a tuple used in such a fashion is known as a"first in, first out," or fifo, structure, or also as a queue. If elements are added to a tuple at both ends and removed from either end, the resulting structure is called a deque.

3.13 Operator Precedence Rules

The following table shows the precedence rules which determine the order in which the operators in an expression are evaluated. If two operators share a common operand, then the one with the higher precedence is evaluated first. If both operators have the same precedence, then the left hand one is generally evaluated first (i.e., operators of a given precedence level are evaluated in a left-associative manner.) The only exception to this is the exponentiation operator '**': 2 ** 3 ** 4 means 2 ** (3 ** 4), not (2 ** 3) ** 4, and so evaluates to 2417851639229258349412352, not 4096.

Parentheses can be used freely to emphasize or alter the order of operations as determined by this table.

PrecedenceOperators

11:= (on left side)
assigning operators (on left side)
from (both sides)
10all unary operators except
not and the is_xx operators
9**
8* / mod
7+ - max, min
5= /= < <= > >= in notin subset incs
0:= (on right side)
assigning operators (on right side)

The following examples of equivalent expressions with and without parentheses illustrate the operation of these rules:

a + b + c * d

is the same as (a + b) + (c * d).

a + b +:= c / d

is the same as

a + (b + := (c / d)).

EXERCISES

1. Given that t is a tuple, explain the meaning of ?/(t + t).

2. Write a set former which will produce the set of all proper subsets of a set s, i.e., the set of all subsets sl of s which are different from s.

3. In how many ways can two pairs of parentheses be inserted into the expression

1 + 23 * 4 / 5
to produce a legal expression? Take 5 of these expressions and calculate their values. Do the same for

1 + 23 * 4/ 5 - 6.

4. Express #pow(s) in terms of #s. Is there any set such that s = pow(s)? For what sets is #pow(s) = 1? Is there any set that #s = #pow(s)? Is there any set such that #pow(s) = 2?

5. Given two sets s and t, their Cartesian product cprod(s,t) is { [x, y]: x in s, y in t}. Express #cprod(s, t) in terms of #s and #t. If cprod(s,t) = { }, what are s and t? Express #cprod(s,t) in terms of #cprod(t,s).

6. Determine the type of the value of x in each of the following code fragments, assuming that the code shown executes without causing any error.

	(a)		x := z + 1;
	(b)		x := z + "1"';
	(c)		x := z - {l};
	(d)		x := z - [1];
	(e)		read(x);
		if x > O then print(x); end if;
	
	(f)	x := arb s;
	for y in s | y > 0 loop print(y); end loop;

	(g)	if exists x in s | #x(i ..j) < j - i then 	
			print(x); 
	 	end if;

7. If f is a map and s is a set, then the image set of s under f, sometimes written f[s], is by definition the set {y: [x,y] in f | x in s}. The inverse image of s under f, sometimes written f_inv [s], is by definition the set {x:[x,y] in f | y in s}. Express f[s] in terms of the sets f{x}, using a compound operator. What is f[domain f]? What is f_inv[range f]?

8. Execute the programs


			[A, A, A] := [1, 2, 3]; print(A);

			[A,B,A,B] := [1,2,3,4]; print(A,B);
What results do you expect? What is going on?

9. Write expressions which will find the following positions in a string s:

(a) The position of the first occurrence of the letter "a"

(b) The position of the second occurrence of the letter "a"

(c) The position of the n-th occurrence of the letter "a"

(d) The position of the last occurrence of a that is preceded by no more than five occurrences of e

If the desired occurrences do not exist, your expression should return the value OM.

10. Write an expression which, given a tuple t of integers, forms the tuple t2 of all "partial sums" of the components of t. That is, the j-th component of t2 should be the sum of components 1 through j of t.

11. A tuple t of tuples, all of the same length n, can be regarded as an m x n rectangular array of items. Write a program which rearranges this array by turning it 90 degrees, so that it becomes an n x m rectangular array of items, represented by a tuple t2 of tuples all of length m. If this operation is repeated twice, what happens?

12. It can be shown that two set expressions el and e2 involving any number of variables x1, ..., xn and formed using only the set union, intersection, and difference operations are equal for all possible set values of the variables x1,.... xn if and only if they are equal whenever each of these variables has one of the two values { } and {1}. Therefore we can check a set-theoretic identity like x * y = y * x simply by evaluating

#{[x,y]: x in{{ },{1}},y in{{ },{1}} | x * y /= y * x}

and observing that its value is zero. Moreover since x incs y is equivalent to x * y = y, this same technique can be used to check inclusions of the form el incs e2. Using this technique, verify that the following set-theoretic identities and inclusions are true for all possible set values of x, y, and z:

	(a)		(x * y) = (y * x)
	(b)		(x+y) = (y+x)
	(c)		((x * y) * z) = (x * (y * z)), also ((x + y) + z) = (x + (y + z))
	(d)		((x + y) - z) = ((x - z) + (y - z))
	(e)		(x * x) = x, also (x + x) = x
	(f)		(x - x) = { }
	(g)		((x + y) * z) = (x * z + y * z), also ((x * y) + z) = ((x + z) * (y + z))
	(h)		(x + (y - x)) = (x + y)
	(i)		(x - (y + z)) = ((x - y) * (x - z))
	(i)		(x * { }) = { }, also (x + { }) = x

3.14 OM and Errors

When an illegal operation or an operation having an undefined result is evaluated during the running of a SETL program, one of two possible things will happen. Errors classified (somewhat arbitrarily) as "severe" will cause execution to terminate. In this case, a brief error indication will be written.

The following errors terminate execution:

  1. Type errors, e.g., an attempt to evaluate 1 + {0},1.0 + 2, [0] + {1}, "1" + 2, s{y} where s is a string or tuple, etc.

  2. Illegal use of OM, e.g., attempts to evaluate s + OM, OM * x, OM with x, etc.

  3. String or tuple parameters which are out of bounds, e.g., attempts to evaluate

    s(0) or s(-1),

    where s is a string or tuple.

  4. Illegal file operations, e.g., attempts to manipulate files which have not been opened (see Chapter 9).

"Mildly erroneous," deliberately intended, operations whose result is undefined will return the undefined value om. These include

  1. selection of an element from an empty set or tuple, as in

    x from { }, x from [ ], x frome [ ], or arb { }

  2. evaluation of a map at a point at which it is undefined or multiple valued, as in f(0) or f(1), where f is

{[1,1], [1,2]};

also evaluation of an undefined component of a tuple. Since in these cases execution is not immediately terminated, it is possible to test for an OM result in this case, giving greater semantic flexibility. Some typical constructs exploiting this flexibility are

		if (x from s) /= OM then... -- test a set for nullity and extract
		if f(x) /= OM then...    -- see if the map f is uniquely defined at x

On the other hand, since the legal uses of OM are severely restricted, unexpected OM values are likely to force error termination soon after they appear. Consequently, errors of this sort can generally be tracked down pretty easily.