@StefanKarpinski presented at Strange Loop on September 19, 2013
Tends to be written as function application:
f(a,b,c)
⟸ LIKE THISa.f(b,c)
⟸ NOT THISIn Java or C++ you can provide these virtual methods:
Parent this: f(Parent that)
Parent this: f(Child that)
Child this: f(Parent that)
Child this: f(Child that)
Dispatched on this
but not on that
However, quoting Double Dispatch is a Code Smell:
The presence of Double Dispatch generally means that each type in a hierarchy has special handling code within another hierarchy of types. This approach to representing variant behavior leads to code that is less resilient to future changes as well as being more difficult to extend.
Something smells, but it's not necessarily the double dispatch
f(a::Any, b) = "fallback"
f(a::Number, b::Number) = "a and b are both numbers"
f(a::Number, b) = "a is a number"
f(a, b::Number) = "b is a number"
f(a::Integer, b::Integer) = "a and b are both integers"
f (generic function with 5 methods)
f(1.5,2)
"a and b are both numbers"
f(1,"bar")
"a is a number"
f(1,2)
"a and b are both integers"
f("foo",[1,2])
"fallback"
f{T<:Number}(a::T, b::T) = "a and b are both $(T)s"
f (generic function with 6 methods)
f(big(1.5),big(2.5))
"a and b are both BigFloats"
f(big(1),big(2)) #<== integer rule is more specific
"a and b are both integers"
f("foo","bar") #<== still doesn't apply to non-numbers
"fallback"
f(args::Number...) = "$(length(args))-ary heterogeneous call"
f{T<:Number}(args::T...) = "$(length(args))-ary homogeneous call"
f (generic function with 8 methods)
f(1)
"1-ary homogeneous call"
f(1,2,3)
"3-ary homogeneous call"
f(1,1.5,2)
"3-ary heterogeneous call"
f() #==> why is it heterogeneous not homogeneous?
"0-ary heterogeneous call"
f(1,2) #<== previous 2-arg method is more specific
"a and b are both integers"
f("foo") #<== doesn't apply to non-numbers
no method f(ASCIIString,) at In[16]:1
Arithmetic operators:
Number + Number | String + String | Array + Array
Number - Number | Time - Time | Time - Number | Array - Array
Number * Number | Array * Integer | Array * String | String * Integer
Integer << Integer | String << String | String << Integer
Arrays, Hashes & Strings:
(Array|Hash).fetch(index,default|block)
(Array|Hash).new(object|block) | String.new(string)
(Array|Hash)[int|range] | String[int|range|regex|string]
(Array|Hash)[int|range]= | String[int|range|regex|string]=
Array.slice(int|range) | String.slice(int|range|regex|string)
Array.slice!(int|range) | String.slice!(int|range|regex|string)
Just Strings:
String.index(string|int|regex)
String.rindex(string|int|regex)
String.sub(pattern,replacement|block)
String.sub!(pattern,replacement|block)
String.gsub(pattern,replacement|block)
String.gsub!(pattern,replacement|block)
Related meanings:
"she goes (home|away)" go(subj::Noun, where::PlaceAdverb)
"it went (wrong|well)" go(subj::Noun, how::MannerAdverb)
Default arguments:
"go (home|away|well)" go(adv::Adverb) = go(Person("addressee"), adv)
"he goes" go(subj::Noun) = go(subj, PlaceAdverb("somewhere"))
"go" go() = go(PlaceAdverb("somewhere"))
Fragment of type hierarchy:
Person <: Noun
PlaceAdverb <: Adverb
MannerAdverb <: Adverb
Not surprising that o.o. languages are emulating multiple dispatch
But they're missing a really crucial ingredient
eval
without being able to manipulate code as dataA generic function is a first-class object
f
f (generic function with 8 methods)
methods(f)
# 8 methods for generic function "f": f(a::Integer,b::Integer) at In[1]:5 f{T<:Number}(a::T<:Number,b::T<:Number) at In[6]:1 f{T<:Number}(args::T<:Number...) at In[10]:2 f(a::Number,b::Number) at In[1]:2 f(args::Number...) at In[10]:1 f(a::Number,b) at In[1]:3 f(a,b::Number) at In[1]:4 f(a,b) at In[1]:1
typeof(f)
Function
f2(x) = f(x,x)
f2("foo")
"fallback"
f(a::String, b::String) = "a and b are both strings";
f2(x) = f(x,x) #<== to reset method cache (issue #265)
f2("foo")
"a and b are both strings"
Mads Torgersen's paper The Expression Problem Revisited:
To which degree can your application be structured in such a way that both the data model and the set of virtual operations over it can be extended without the need to modify existing code, without the need for code repetition and without runtime type errors.
Basically you want to be able to add:
immutable Interval{T<:Real} <: Number
lo::T
hi::T
end
(a::Real)..(b::Real) = Interval(a,b)
Base.show(io::IO, iv::Interval) = print(io, "($(iv.lo))..($(iv.hi))")
show (generic function with 96 methods)
1..2
(1)..(2)
typeof(ans)
Interval{Int64} (constructor with 1 method)
sizeof(1..2) #==> two 8-byte ints
16
(1.5)..(2.5)
(1.5)..(2.5)
typeof(ans)
Interval{Float64} (constructor with 1 method)
(1//2)..(2//3)
(1//2)..(2//3)
typeof(ans)
Interval{Rational{Int64}} (constructor with 1 method)
sizeof((1//2)..(2//3)) #==> just four 8-byte ints
32
methods(Interval)
# 1 method for generic function "Interval": Interval{T<:Real}(lo::T<:Real,hi::T<:Real)
#(0.5)..(1) #<== would be a no method error because of mixed types
Interval(lo::Real, hi::Real) = Interval(promote(lo,hi)...)
Interval{T<:Real} (constructor with 2 methods)
(0.5)..1
(0.5)..(1.0)
1..pi
(1.0)..(3.141592653589793)
e..pi
(2.718281828459045)..(3.141592653589793)
+
and -
operators¶Adding and subtracting intervals:
a::Interval + b::Interval = (a.lo + b.lo)..(a.hi + b.hi)
a::Interval - b::Interval = (a.lo - b.hi)..(a.hi - b.lo)
- (generic function with 107 methods)
(2..3) + (-1..1)
(1)..(4)
(2..3) - (1..2)
(0)..(2)
What about arithmetic mixing intervals and numbers?
Instead, we'll hook Intervals
into Julia's promotion system:
import Base: convert, promote_rule #<== allows extending them
convert{T<:Real}(::Type{Interval{T}}, x::Real) = (x = convert(T,x); x..x)
convert{T<:Real}(::Type{Interval{T}}, iv::Interval) =
convert(T,iv.lo)..convert(T,iv.hi)
promote_rule{A<:Real,B<:Real}(::Type{Interval{A}}, ::Type{B}) =
Interval{promote_type(A,B)}
promote_rule{A<:Real,B<:Real}(::Type{Interval{A}}, ::Type{Interval{B}}) =
Interval{promote_type(A,B)}
promote_rule (generic function with 126 methods)
Presto! Everything now Just Works™:
1..2 + 1
(2)..(3)
1..2 + 1.5
(2.5)..(3.5)
3 - 1..2 + 2//3
(5//3)..(8//3)
big(2)^100 + (1//3)..(2//3)
(3802951800684688204490109616129//3)..(3802951800684688204490109616130//3)
Let's examine the last computation in detail:
big(2)^100 + (1//3)..(2//3)
(3802951800684688204490109616129//3)..(3802951800684688204490109616130//3)
typeof(ans)
Interval{Rational{BigInt}} (constructor with 1 method)
@which big(2)^100 + (1//3)..(2//3)
+(x::Number,y::Number) at promotion.jl:148
The left-hand-side is a BigInt
:
big(2)^100
1267650600228229401496703205376
typeof(ans)
BigInt (constructor with 7 methods)
The right-hand-side is an interval of rationals of 64-bit integers:
typeof((1//3)..(2//3))
Interval{Rational{Int64}} (constructor with 1 method)
There are no methods for adding these specific types
This method is defined in base/promote.jl
:
+(x::Number, y::Number) = +(promote(x,y)...)
The promote
function for two arguments looks like this:
promote{T,S}(x::T, y::S) =
(convert(promote_type(T,S),x), convert(promote_type(T,S),y))
where promote_type
is defined in terms of promote_rule
— it's a bit complicated.
Now, the following promotion rules exist:
promote_rule{A<:Real,B<:Real}(::Type{Interval{A}}, ::Type{B}) =
Interval{promote_type(A,B)}
promote_rule{A<:Integer,B<:Integer}(::Type{Rational{A}}, ::Type{B}) =
Rational{promote_type(A,B)}
promote_rule{T<:Integer}(::Type{BigInt}, ::Type{T}) = BigInt
These all kick in when computing promote_type
, producing this result:
promote_type(Interval{Rational{Int64}}, BigInt)
Interval{Rational{BigInt}} (constructor with 1 method)
So promote
converts both big(2)^100
and (1//3)..(2//3)
to Interval{Rational{Int64}}
:
convert(Interval{Rational{BigInt}}, big(2)^100)
(1267650600228229401496703205376//1)..(1267650600228229401496703205376//1)
convert(Interval{Rational{BigInt}}, (1//3)..(2//3))
(1//3)..(2//3)
After that adding them is just a call to the +
method for same-typed Intervals
.
The syntax for ± already exists in the parser, but its behavior is undefined:
:(2 ± 1)
:(±(2,1))
2 ± 1
± not defined at In[55]:1
Let's define it:
a::Real ± b::Real = (a - b)..(a + b)
a::Interval ± b::Interval = (a.lo - b.hi)..(a.hi + b.hi)
a::Number ± b::Number = ±(promote(a,b)...)
± (generic function with 3 methods)
1 ± 2
(-1)..(3)
pi - (2 ± 1//2)
(0.6415926535897931)..(1.6415926535897931)
Notice that:
immutable Interval{T<:Real} <: Number
lo::T
hi::T
end
Interval(lo::Real, hi::Real) = Interval(promote(lo,hi)...)
(a::Real)..(b::Real) = Interval(a,b)
Base.show(io::IO, iv::Interval) = print(io, "($(iv.lo))..($(iv.hi))")
import Base: convert, promote_rule
convert{T<:Real}(::Type{Interval{T}}, x::Real) = (x = convert(T,x); x..x)
convert{T<:Real}(::Type{Interval{T}}, iv::Interval) =
convert(T,iv.lo)..convert(T,iv.hi)
promote_rule{A<:Real,B<:Real}(::Type{Interval{A}}, ::Type{B}) =
Interval{promote_type(A,B)}
promote_rule{A<:Real,B<:Real}(::Type{Interval{A}}, ::Type{Interval{B}}) =
Interval{promote_type(A,B)}
a::Interval + b::Interval = (a.lo + b.lo)..(a.hi + b.hi)
a::Interval - b::Interval = (a.lo - b.hi)..(a.hi - b.lo)
a::Real ± b::Real = (a - b)..(a + b)
a::Interval ± b::Interval = (a.lo - b.hi)..(a.hi + b.hi)
a::Number ± b::Number = ±(promote(a,b)...)
Generic functions elegantly and smoothly solve the expression problem:
Interval
type works with built-in +
and -
operators±
operator works with built-in Int
typeGeneric function in Julia aren't special – they're the default
+
are generic functions+
+ (generic function with 96 methods)
This means you're free to extend everything
code_llvm(+,(Int,Int))
define i64 @"julia_+2202"(i64, i64) { top: %2 = add i64 %1, %0, !dbg !10575 ret i64 %2, !dbg !10575 }
code_llvm(+,(Int,Float64))
define double @"julia_+2208"(i64, double) { top: %2 = sitofp i64 %0 to double, !dbg !10605 %3 = fadd double %2, %1, !dbg !10605 ret double %3, !dbg !10605 }
Let's define a scalar-valued function using the Interval
type we just defined:
lo(a,b) = (a ± b).lo
hi(a,b) = (a ± b).hi
hi (generic function with 1 method)
lo(3,2), hi(3,2)
(1,5)
code_llvm(lo,(Int,Int))
define i64 @julia_lo2227(i64, i64) { top: %2 = sub i64 %0, %1, !dbg !10684 ret i64 %2, !dbg !10684 }
code_llvm(hi,(Int,Int))
define i64 @julia_hi2228(i64, i64) { top: %2 = add i64 %1, %0, !dbg !10690 ret i64 %2, !dbg !10690 }
Even for complex combinations of types, generated code is very efficient:
code_native(hi,(Float64,Rational{Int}))
.section __TEXT,__text,regular,pure_instructions Filename: In[62] Source line: 2 push RBP mov RBP, RSP Source line: 2 vcvtsi2sd XMM1, XMM0, RSI vcvtsi2sd XMM2, XMM0, RDI vdivsd XMM1, XMM2, XMM1 vaddsd XMM0, XMM1, XMM0 pop RBP ret
Since generic functions are open:
We're forced to think much harder about the meaning of opearations
Example:
search(string, pattern, [start])
Search for the first occurance of the given pattern within the given string (or byte array). The pattern may be a single character, a vector or a set of characters, a string, or a regular expression (regular expressions are only allowed on contiguous strings, such as ASCII or UTF-8 strings). The third argument optionally specifies a starting index. The return value is a range of indexes where the matching sequence is found, such that
s[search(s,x)] == x
. The return value when the pattern is not found is0
when the pattern is a character, or0:-1
when searching for a substring or regular expression match.
The split
function is defined generically in terms of the search
abstraction:
function split(str::String, splitter, limit::Integer, keep_empty::Bool)
strs = String[]
i = start(str)
n = endof(str)
r = search(str,splitter,i)
j, k = first(r), nextind(str,last(r))
while 0 < j <= n && length(strs) != limit-1
if i < k
if keep_empty || i < j
push!(strs, str[i:prevind(str,j)])
end
i = k
end
if k <= j; k = nextind(str,j) end
r = search(str,splitter,k)
j, k = first(r), nextind(str,last(r))
end
if keep_empty || !done(str,i)
push!(strs, str[i:])
end
return strs
end
split(s::String, spl, n::Integer) = split(s, spl, n, true)
split(s::String, spl, keep::Bool) = split(s, spl, 0, keep)
split(s::String, spl) = split(s, spl, 0, true)
This gives the ability to:
search(str, pattern)
for the new patternsplit
functionality for freeRegex doesn't know anything about split
:
methods(split)
# 5 methods for generic function "split": split(str::String,splitter,limit::Integer,keep_empty::Bool) at string.jl:1250 split(s::String,spl,keep::Bool) at string.jl:1272 split(s::String,spl,n::Integer) at string.jl:1271 split(s::String,spl) at string.jl:1273 split(str::String) at string.jl:1277
Yet split works when with Regex
patterns:
split("foobarbazqux", r"[aeiou]+"i)
5-element Array{String,1}: "f" "b" "rb" "zq" "x"
Works because Regex
provides the search
function
split
is defined generically in terms of search
Meaning is hard and there are a lot of "protocols" that we still need to abstract out.
plot
functions in every plotting packageplot
functionThese kinds of abstract protocols are "narrow waists"
Figuring out the right ones is hard but essential work.