You are of course completely correct. Embarrassing that I didn’t notice the first error that we always tell people to avoid!!
Try wrapping everything in a function .
You are of course completely correct. Embarrassing that I didn’t notice the first error that we always tell people to avoid!!
Try wrapping everything in a function .
Thanks @dpsanders and @bkamins
So this the latest version I have so far
using Combinatorics
"counts the number of occurances of l in s "
count_o(s,l)=count(x->x==l,s)
"The set of ways of concatenating one item from collection A with one from B."
cross(A, B)=["$a$b" for a in A for b in B]
"All combinations of n items; each combo as a concatenated str."
function combos(items, n)
(join(combo," ") for combo in combinations(items,n))
end
"""The probability of an event, given a sample space of equiprobable outcomes.
event can be either a set of outcomes, or a predicate (true for outcomes in the event)."""
function P(event, space)
length(intersect(event,space))//length(space)
end
function P(event::Function, space)
count(event,space)//length(space)
end
suits = "SHDC"
ranks = "A23456789TJQK"
deck = cross(ranks, suits)
Hands = combos(deck, 5) #A massive list of all possible hands
function flush(hand)
s::String=suits
any(count_o(hand,suit) == 5 for suit in s)
end
P(flush, Hands) #Warm up
Total execution time for Julia is now down to 3.94
vs python at 4.36
. This is pretty decent and it also makes for nicer code. The use of generators and avoiding set operators really made a big difference.
However, I am still not fully satisfied. I am still surprised the python did better when full sets and set operators were used.
Can I infer form that the python’s handling of sets must be better that Julia’s at this stage?
For the sake of completeness, the implementation using sets is
using Combinatorics
"counts the number of occurances of l in s "
count_o(s,l)=count(x->x==l,s)
"The set of ways of concatenating one item from collection A with one from B."
cross(A, B)=["$a$b" for a in A for b in B]
"All combinations of n items; each combo as a concatenated str."
function combos(items, n)
Set(join(combo," ") for combo in combinations(items,n))
end
"""The probability of an event, given a sample space of equiprobable outcomes.
event can be either a set of outcomes, or a predicate (true for outcomes in the event)."""
function P(event, space)
event_ = such_that(event, space)
length(intersect(event_,space))//length(space)
end
#Making use of Julia's multiple dispatch
"The subset of elements in the collection for which the predicate is true."
function such_that(predicate::Function, collection)
filter(predicate,collection)
end;
"Default return for a collection"
function such_that(event, collection)
event
end
suits = "SHDC"
ranks = "A23456789TJQK"
deck = cross(ranks, suits)
Hands = combos(deck, 5) #A massive list of all possible hands
function flush(hand)
s::String=suits
any(count_o(hand,suit) == 5 for suit in s)
end
P(flush, Hands) #Warm up
This takes 14.1
seconds.
I continued with more experiments where I tried to preserve use of Sets so as not to deviate from the python code.
In the first variation, I used collect
instead of simply passing the generator to the Set
constructor.
function combos(items, n)
Set(collect(join(combo," ") for combo = combinations(items,n)))
end
The speed improved and execution time went down to 8.7
seconds.
Then I did another variation when I mapped to a preallocated array.
function combos(items, n)
combs=combinations(items,n)
res=Vector{String}(length(combs))
Set(map!(x->join(x," "),res,collect(combs)))
end
Speed improved further to 6.1
seconds.
Conclusions:
map!
always works faster than collect
Set
construction with a generator input.On the risk of getting off topic, I would like to encourage a non-string solution for this problem. I see two problems with the string approach:
suits
and ranks
don’t overlap, which is just a fortuitousness in the English language.But in Julia we can have abstractions at zero cost, which can also make the code safer. In this spirit I would like to offer the following version:
using Combinatorics
const suits = "SHDC"
const ranks = "A23456789TJQK"
immutable Card{TS,TR}
suit :: TS
rank :: TR
end
"""The probability of an event, given a sample space of equiprobable outcomes.
event can be either a set of outcomes, or a predicate (true for outcomes in the event)."""
function P(event::Function, space)
count(event,space)//length(space)
end
function flush(hand)
any( count(card -> card.suit==s, hand) == 5 for s in suits )
end
function main()
deck = [ Card(s,r) for r in ranks for s in suits ]
hands = combinations(deck, 5)
P(flush, hands) #Warm up
@time p = P(flush, hands)
println(p)
end
main()
With julia 0.5.0, this gets me:
0.624516 seconds (20.79 M allocations: 991.425 MB, 9.39% gc time)
The main speedup comes from not using strings, but using the dedicated Card
type, and having a hand simply be a list of Card
s (thus avoiding the strings). The other changes I made (explicitly marking global variables as const, and putting the top-level code into a main
function) have little impact on the speed. A little more speed can actually be gained by changing the defintion of suits
as follows:
@enum SUIT spades hearts diamonds clubs
const suits = instances(SUIT)
which gets me
0.531291 seconds (20.79 M allocations: 991.425 MB, 10.73% gc time)
(I guess on current architectures, Char
s are too small so that operating on them individually incurs too much overhead.)
Note: on my machine the original Python version takes 5.4 seconds.
Very nice!
Probably some more speed can be gained (also in the original version) by replacing the flush
function to just check if all the suits are equal to each other.
@dpsanders is right, in my version replacing flush
by
function flush(hand)
s1 = hand[1].suit
count(card -> card.suit==s1, hand) == 5
end
yields
0.388377 seconds (15.59 M allocations: 872.454 MB, 11.35% gc time)
i.e. another ~25% faster.
Instead of an enum, you can just use integers instead of Char
s for suits and ranks, and get a similar speedup. I am curious as to why there’s still so much allocation.
Or just
function flush(hand)
hand[1].suit == hand[2].suit == hand[3].suit == hand[4].suit == hand[5].suit
end
which is even slightly faster. Not sure how to write this more beautifully.
all(card.suit==s1 for card in hand)
Ah, the allocations come from traversing the generator of the combinations.
Indeed, if you actually do hands = collect(combinations(deck, 5))
, then the allocations go down to 0, and the goes down to 0.05 secs!
But that moves much of the work from the P
function into the collect
operation, so to be fair we should time that as well. Rewriting the bottom part to
function main()
deck = [ Card(s,r) for r in ranks for s in suits ]
hands = combinations(deck, 5)
p = P(flush, hands)
println(p)
end
@time main() # Warm up
@time main()
gives me
0.905246 seconds (18.43 M allocations: 961.952 MB, 27.77% gc time)
0.419223 seconds (18.19 M allocations: 951.772 MB, 14.00% gc time)
but adding the collect
call for the combinations gives me
1.461934 seconds (18.45 M allocations: 982.869 MB, 31.46% gc time)
1.061860 seconds (18.19 M allocations: 971.600 MB, 32.99% gc time)
Another way to speed up string processing code which I don’t think has been mentioned here is to use a different type of string. Julia defaults to UTF8, which is a variable length encoding and can slow some calculations down. You may want to look at LegacyStrings.jl or some of the stuff in JuliaString if you need even more speed.
It’s unclear (to me at least) which version of Julia you’re using here, @mbeltagy, so I should mention that strings on 0.6-alpha should be much faster than they were on 0.5. They have, until 0.6 been slower than Python, but should now be the same (or hopefully even faster). In the future, we have a few potential tricks that could make them even faster still, but they’re fairly tricky tricks, so it’s hard to say for sure. Of course, we want all aspects of Julia to be at least as fast any other language, but we haven’t quite gotten to that state of performance nirvana yet. Strings are something Python has been very good at for a long time (and Perl is even better); now at least Julia should be competitive.
I should also mention that although we surely can and will, we often haven’t spent that much time honing the performance of generic data structures like hashes and dictionaries. This isn’t because we don’t want them to be fast (we do), but because they’re not as important as they are in a language like Python where they’re your only option. In Julia, when performance counts, your best bet is often to use a different, better suited data structure. In this case, @traktofon’s post showed that you can get a huge speedup that way – far faster than you could ever get with generic data structures, no matter how much we hone them.
Thanks @StefanKarpinski
I really learned a great deal here. I am appreciating how Julia works and how best to put its power to effective use.
I tested on 0.5 and 0.6 (master branch). The results were very similar.
For the original code that I first posted, total execution was 52
seconds for 0.5 and 54
for 0.6.
For the one that uses sets (more comparable to Python) it was 5.3
and 5.4
seconds.
As far I as I can tell, the difference is quite minimal between the two versions.
In the code below, I just timed the main
function after warmup.
using Combinatorics
count_o(s,l)=count(x->x==l,s)
cross_(A, B)=["$a$b" for a in A for b in B]
function combos(items, n)
combs=combinations(items,n)
res=Vector{String}(length(combs))
map!(x->join(x," "),res,collect(combs))
end
function P(event, space)
event_ = such_that(event, space)
length(intersect(event_,Set(space)))//length(space)
end
function such_that(predicate::Function, collection)
filter(predicate,collection)
end;
suits = "SHDC"
ranks = "A23456789TJQK"
function flush(hand)
s::String=suits
any(count_o(hand,suit) == 5 for suit in s)
end
function main()
deck = cross_(ranks, suits)
Hands = combos(deck, 5) #A massive list of all possible hands
P(flush, Hands) #Warm up
end
6 posts were split to a new topic: ANN: Cards.jl
I’ve split the discussion of my new Cards.jl package into a new thread.