The content of this notebook was presented in Dyalog Webinars: Boolean Scans and Reductions.
Logical values are the integers 0
and 1
, in contrast to distinct true
and false
values you might find in other programming languages.
Boolean data is widely used in APL.
Results of comparison functions:
_ ← ,⍥⊆ ⍝ Stranding function
¯2 ¯1 0 1 2 (↑ < _ ≤ _ = _ ≠ _ ≥ _ >) 0
Data-driven conditionals. For example, "increase the salaries in group A by 5%":
groups ← 'ABACC'
salaries ← 20000 25750 21000 32350 32400
salaries × 1.05 * groups = 'A'
Selecting from arrays:
values ← 4 10 6 8 16 24
names←'Anne' 'Ben' 'Charlie' 'Demi' 'Ella' 'Fiona'
(values≥10)/names
Simple statistics:
(+⌿÷≢)'the alphabet'∊'aeiou'
With modern APL, all sixteen logical functions for 2-bit inputs can be represented by either primitives, atops or constant functions.
The descriptions given below are either common names for digital logic gates or mnemonic descriptions of the functionality with regards to left and right Boolean arguments.
The binary column here represets the output of the logic gate for all combinations of two 1-bit inputs — also known as the truth table. For example, the truth table for an OR gate:
0 0 1 1 ∨ 0 1 0 1 0 1 1 1 2⊥0 1 1 1 7
Binary | Decimal 2⊥ |
Function f |
Description |
---|---|---|---|
0 0 0 0 |
0 |
0⍨ |
FALSE |
0 0 0 1 |
1 |
∧ |
AND |
0 0 0 1 |
2 |
> |
Left but not right |
0 0 1 1 |
3 |
⊣ |
Left |
0 0 0 1 |
4 |
< |
Right but not left |
0 1 0 1 |
5 |
⊢ |
Right |
0 0 1 1 |
6 |
≠ |
Exlusive OR |
0 1 1 1 |
7 |
∨ |
OR |
0 0 0 1 |
8 |
⍱ |
NOR |
1 0 0 1 |
9 |
= |
Exclusive NOR |
0 1 0 1 |
10 |
~⍤⊢ |
Not right |
1 1 0 1 |
11 |
≥ |
Left OR not right |
0 0 1 1 |
12 |
~⍤⊣ |
Not left |
1 0 1 1 |
13 |
≤ |
Right OR not left |
0 1 1 1 |
14 |
⍲ |
NAND |
1 1 1 1 |
15 |
1⍨ |
TRUE |
Note
[Phil Last's article](http://archive.vector.org.uk/art10501140) gives [scalar](https://aplwiki.com/wiki/Scalar_function) [dfn](https://aplwiki.com/wiki/Dfn) definitions, whereas some definitions in the previous table are not scalar functions.
Reductions summarise some property:
+/4 3 1 5 ⍝ Sum
Scans indicate a progression of that property along the array:
+\4 3 1 5 ⍝ Cumulative sum
Some reductions, and their related scans, are well known to APLers:
∧/1 1 1 1 1 1 1 1 ⍝ Are all true?
∧\1 1 0 1 1 1 0 1 ⍝ Were all true so far?
∨/0 0 1 0 0 0 0 0 ⍝ Are any true?
∨\0 0 1 0 0 0 1 0 ⍝ Were any true so far?
The value of a vector reduction f/⍵
is always the last value in the related scan f\⍵
.
+/3 4 1 5
⊃⌽+\3 4 1 5
In general, properties of boolean vectors for which 1≡f/⍵
for a function f
reflect properties in the related scans f\⍵
.
From here we will take a quick look at a couple of particular scans and reductions which have some interesting applications. At the end of this notebook is a table of reductions for the sixteen logical functions, as well as some references for further reading.
Less-than-scan leaves only the first true bit on, and all others are turned off:
<\0 0 1 0 1 1 1 1 1 1 0 1 0 1 1
This is because the reduction only returns true when the last bit is the one and only true bit:
</0 0 0 1
Too see the relationship, view the scan as a progression of reductions:
⍝ <\0 1 0 1 1
0
0<1
0<1<0
0<1<0<1
0<1<0<1<1
An example use is to mark the start of end-of-line comments:
code ← '+/⍳10 ⍝ sum⍝of⍝first⍝10⍝ints'
c ← code = '⍝'
↑ code c (<\c)
The pair-wise reduction can be used to mark the start of groups of similar cells:
v ← '⍟⎕⍟⍟⍟⎕⎕⍟⎕⎕⍟⍟⍟⍟⎕⎕⎕⎕⍟⍟⎕⎕⎕⎕⍟⍟⎕⍟'
q ← '⎕'=v
↑v q (2</0,q)
The not-equal reduction ≠/
only returns true if there are an odd number of 1
s
≠/1 1 1 1
2|+/1 1 1 1
≠/1 1 1 0
2|+/1 1 1 0
The related scan ≠\
then has the property that odd and even instances of 1
s are joined by a series of 1
s
b ← 0 0 1 0 0 0 1 0 0 0 0 1 0 0 1
↑ b (≠\b)
Again, looking at the progression can help us to see how this works:
⍝ ≠\0 1 0 0 1
0
0≠1
0≠1≠0
0≠1≠0≠0
0≠1≠0≠0≠1
An example use of this is to mark quoted parts of text:
quoted ← 'extract the "quoted" parts from this "text"'
q ← '"'=quoted
inc ← (~∧≠\)'"'=quoted ⍝ Including quote marks
exc ← (⊢∨≠\)'"'=quoted ⍝ Exclusing quote marks
↑ quoted q inc exc
These masks can easily be used to isolate the quoted parts:
inc/quoted
exc⊆quoted
With a bit more effort, we can mark C-style (Java, JavaScript, CSS etc.) comments:
css ← '* {color: #000; /* text */ background: #fff; /* bg */}'
cpos ← '/*'∘(≠\⍷∨¯1⌽∘⌽⍷∘⌽)css
↑ css cpos
Some of these examples, and more, can be found on APLcart.
For one last example, let's use something like a data table with successive fields of different lengths:
data ← (?9⍴100),(4 2 3⌿3 3⍴3/'ABC'),⍪?9⍴0
In data
, there is a field of length 4
containing the 'A'
markers, followed by a field of length 2
with the 'B'
markers and finally a field of length 3 with the 'C'
markers.
⎕←data
We would like to present the data above in a more human-readable format, so we are going to insert two blank lines between each field.
To start with, let's create some partition vectors. The function MTake
creates a boolean vector where 1
indicates the start of a field:
MTake
has a neat, nested-array style definition:
MTake ← ∊↑∘1¨
MTake 4 2 3
A more traditional definition has performance benefits for larger data:
MTake ← {¯1⌽(⍳+/⍵)∊+\⍵}
MTake 4 2 3
If f
indicates a field, and b
indicates a series of blank lines,
1 0 0 0 1 0 1 0 1 0 1 0 0 1 0
←--f--→ ←b→ ←f→ ←b→ ←-f-→ ←b→
we can use catenate-rank-zero to interleave the blank lines:
MTake,4 2 3,⍤0⊢2
And use not-equal-scan to switch sections on and off, creating an expansion vector:
⎕←e←≠\MTake,4 2 3,⍤0⊢2
Which we then use to expand a character matrix format of our data:
e⍀⍕data
A similar technique can be used to also label our fields:
labels ← (MTake 4 2 3)⍀(↑'Field '∘,¨'ABC')
e⍀⍕labels,data
The STSC publication Boolean Functions and Techniques contains a wealth of information on Boolean functions and techniques (duh).
In particular, there is a description of a general procedure for applying some function to parts of an array independently, where parts is defined similarly to the fields in the previous example.
The general procedure involves a loop. With nested arrays, the pattern is quite easy to show:
1 0 0 0 1 0 0 {∊⌽¨⍺⊂⍵} 'ABCDXYZ' ⍝ Reverse two parts independently
1 0 0 0 1 0 0 {∊+/¨⍺⊂⍵} ⍳7 ⍝ Sum two parts independently
This pattern is easily turned into a dop:
_P ← {∊⍺⍺¨⍺⊂⍵} ⍝ A partitioned-function-application operator
p ← 1 0 0 0 1 0 0
p ⌽ _P 'ABCDXYZ'
p +/_P ⍳7
However, this looping approach can become inefficient for large arrays with many parts. Page 10 of Boolean Functions and Techniques gives some definitions for more array-oriented solutions for use with specific functions. Here we will look at a couple of these, starting with partitioned-reverse PREVERSE
.
The plus-scan is useful with these types of Boolean partitioned vectors, as it causes partitions to be marked by groups of successive integers:
a ← 'ABCDXYZ'
↑ a p (+\p)
If we downgrade the plus-scan, the relative positions of indices for each partition relative to the whole array are reversed, but the positions of indices within each partition stay in ascending order:
⍒+\p
Reversing the plus-scan then restores the relative positions of partitions within the whole array, but the indices within each partition are reversed:
⌽⍒+\p
Indexing with this vector creates the desired effect:
a[⌽⍒+\p]
These kinds of techniques often give performance improvements compared to their general nested-array counterparts:
p ← 1,1↓1=?1e4⍴10
t ← ⎕A[?1e4⍴26]
]runtime -c "p ⌽_P t" "p {⍵[⌽⍒+\⍺]} t"
p ⌽_P t → 1.4E¯4 | 0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
p {⍵[⌽⍒+\⍺]} t → 3.5E¯5 | -75% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
The partitioned sum starts with the plus-scan of the whole vector:
+\⍳7
We then extract the progressive sums up to the end of each partition:
(1⌽p)/+\⍳7
Finally subtract each partition's sum from the sum of its preceding partition:
¯2-/0,(1⌽p)/+\⍳7
Once again, there is a speedup relative to the general case:
p ← 1,1↓1=?1e4⍴10
i ← ?1e4⍴1000
]runtime -c "p +/_P i" "p {¯2-/0,(1⌽⍺)/+\⍵} i"
p +/_P i → 6.8E¯5 | 0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
p {¯2-/0,(1⌽⍺)/+\⍵} i → 8.9E¯6 | -87% ⎕⎕⎕⎕⎕
Some languages implement boolean types as bytes (8 bits). The decision to use bit representations was made by Larry Breed for the first implementation of APL.
Not only do boolean arrays use the smallest amount of memory possible, they are also subject to SIMD parallelisations - even on hardware without special SIMD capabilities.
More on this topic can be found on the APL Wiki article on Booleans.
Finally let's return to the table of Boolean functions from earlier. In the Vector article, Phil Last gives plain English descriptions for properties of boolean vectors when their reduction using a Boolean function f
gives 1
.
Function f |
f/⍵ is true if ⍵ satisfies: |
---|---|
0⍨ |
Never |
∧ |
All ones |
> |
Odd leading ones |
⊣ |
First is one |
< |
Last is the only one |
⊢ |
Last is one |
≠ |
Odd ones |
∨ |
At least one one |
⍱ |
Odd leading zeros else the last is the only one |
= |
Even zeroes |
~⍤⊢ |
Last is parity of the length |
≥ |
Even leading zeros |
~⍤⊣ |
First is zero |
≤ |
Last is not the only zero |
⍲ |
Even leading ones else last is the only zero |
1⍨ |
Always |
For those not described in this notebook, you might want to experiment and show yourself the relationships between the descriptions given in the table, and the properties of the scans f\⍵
. APL expressions offering similar insights are given on page 23 of the Boolean Functions and Techniques publication.