Trying to write a simple sampling function

Hi folks! I have what (I think) is an extremely basic question, and I’m not quite sure where to start. I’m moderately proficient in Python and R, but have never used Julia before.

I’m trying to make a simple function that generates made-up words; first I sample a length, and then select that number of characters at random from an array. I want to eventually turn this into a generative function using Gen.jl, but I figure I’d start small.

Here’s the code I have:

using Gen, StatsBase

alphabet = ["i","a","p","t"]

function make_a_word2(alphabet)
    length = (poisson(3))
    word = (StatsBase.sample!(alphabet,length))
    println(word)
end

make_a_word2(alphabet)

and I get the error:

julia> make_a_word2(alphabet)
ERROR: MethodError: no method matching sample!(::Array{String,1}, ::Int64)
Closest candidates are:
  sample!(::AbstractArray, ::AbstractWeights, ::AbstractArray; replace, ordered) at C:\Users\canaa\.julia\packages\StatsBase\Q76Ni\src\sampling.jl:925
  sample!(::AbstractArray, ::AbstractArray; replace, ordered) at C:\Users\canaa\.julia\packages\StatsBase\Q76Ni\src\sampling.jl:487
Stacktrace:
 [1] make_a_word2(::Array{String,1}) at .\REPL[29]:3
 [2] top-level scope at REPL[31]:1

Any suggestions on what’s going wrong here? Thanks a lot for your patience with this very basic question!

Since you are not passing it two arrays, and you are capturing the return value, I think you are looking for the non-mutating sample rather than the mutating sample!.
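
To make the difference concrete, here is a minimal sketch, assuming StatsBase is installed. The name make_word, the variable buffer, and the fixed length 5 are made up for illustration; in the original code the length would come from poisson(3). Note that the count is called n rather than length, so it does not shadow Base.length.

```julia
using StatsBase

alphabet = ["i", "a", "p", "t"]

# Non-mutating: sample(a, n) returns a new vector of n elements,
# drawn with replacement by default.
function make_word(alphabet, n)
    letters = sample(alphabet, n)
    return join(letters)   # concatenate the pieces into one String
end

word = make_word(alphabet, 5)

# Mutating: sample!(a, x) fills the existing array x in place,
# which is why it expects two arrays.
buffer = Vector{String}(undef, 5)
sample!(alphabet, buffer)
```

The mutating form is mainly useful when you want to reuse a preallocated array; for a one-off draw, sample is the simpler choice.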

Ah, so I am - thank you!

A further question in this vein then - if I want to turn this into a generative function in Gen.jl (https://www.gen.dev/), I do the following:


@gen function make_a_word(alphabet)
    length = @trace(poisson(3), :length)
    word = @trace(StatsBase.sample(alphabet,length), :word)
    println(word)
end

make_a_word(alphabet)

but get this error:

julia> make_a_word(alphabet)
ERROR: MethodError: no method matching traceat(::Gen.GFUntracedState, ::typeof(sample), ::Tuple{Array{String,1},Int64}, ::Symbol)
Closest candidates are:
traceat(::Gen.GFUntracedState, ::GenerativeFunction, ::Any, ::Any) at C:\Users\canaa\.julia\packages\Gen\thmFY\src\dynamic\dynamic.jl:84
traceat(::Gen.GFUntracedState, ::Distribution, ::Any, ::Any) at C:\Users\canaa\.julia\packages\Gen\thmFY\src\dynamic\dynamic.jl:87
traceat(::Gen.GFUpdateState, ::Distribution{T}, ::Tuple, ::Any) where T at C:\Users\canaa\.julia\packages\Gen\thmFY\src\dynamic\update.jl:19
...
Stacktrace:
[1] ##make_a_word#261(::Gen.GFUntracedState, ::Array{String,1}) at .\REPL[36]:3
[2] (::DynamicDSLFunction{Any})(::Array{String,1}) at C:\Users\canaa\.julia\packages\Gen\thmFY\src\dynamic\dynamic.jl:54
[3] top-level scope at REPL[37]:1

Any wisdom here? Thank you in advance again!

I’ve never used Gen, and I’m not entirely sure what it does, but isn’t the problem that sample draws its randomness from outside Gen? I would have guessed that @trace only works with Gen’s own sampling functions (like poisson).
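
This guess matches the error message, which lists only GenerativeFunction and Distribution as things traceat accepts. One possible way around it, as a sketch rather than the definitive Gen idiom: trace an index drawn from Gen's uniform_discrete distribution for each letter, then look the letter up in the alphabet. The address names :len and (:letter, i) are arbitrary choices made here for illustration.

```julia
using Gen

alphabet = ["i", "a", "p", "t"]

@gen function make_a_word(alphabet)
    # Called `len`, not `length`, so Base.length stays usable below.
    len = @trace(poisson(3), :len)
    letters = String[]
    for i in 1:len
        # Each random choice is a Gen distribution at its own address.
        idx = @trace(uniform_discrete(1, length(alphabet)), (:letter, i))
        push!(letters, alphabet[idx])
    end
    return join(letters)
end

# Run the model once and inspect the result.
trace = Gen.simulate(make_a_word, (alphabet,))
word = Gen.get_retval(trace)       # the generated word
choices = Gen.get_choices(trace)   # all traced random choices
```

With this layout the word length and every letter are separate addressable choices, which is what Gen needs for inference later on. A poisson(3) draw can be 0, in which case the word is the empty string.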

If you want to create strings rather than vectors of string/char, you may want to use randstring from the Random stdlib. Normally, it’s also cleaner to represent characters as Chars, rather than as unit-length Strings:

using Random: randstring
alphabet = ['i', 'a', 'p', 't']   # single quotes

jl> randstring(alphabet, 7)
"pttiitp"

(If your alphabet is short, you could even use a tuple, alphabet = ('i', 'a', 'p', 't'), for extra performance :wink: )
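
Tying this back to the original goal, a small sketch using only the Random stdlib: wrap randstring in a function that takes the word length as an argument. The name make_word and the seed 42 are illustrative; passing an explicit rng just makes runs repeatable, and the length would normally come from whatever distribution you sample it from.

```julia
using Random

alphabet = ['i', 'a', 'p', 't']   # Chars, not unit-length Strings

# Build a word of n characters drawn uniformly from `alphabet`.
make_word(rng, alphabet, n) = randstring(rng, alphabet, n)

rng = MersenneTwister(42)         # seeded generator for repeatable output
word = make_word(rng, alphabet, 7)
```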


Hi DNF,

Thanks a lot - this is very helpful. This resolves one issue, but I think there’s a Gen-related issue here too, since I’m still getting the problem even after telling it to use Gen’s distribution function.

Best,
Canaan