How to create columns in a loop

I have the following code:

using InMemoryDatasets
# create a demo data set of CAN bus messages

function demo_data()
    n = 100000
    time = 0.1:0.1:n*0.1
    addr = rand(Int16(0x101):Int16(0x12e), n)
    d1 = rand(UInt8(0):UInt8(255), n)
    d2 = rand(UInt8(0):UInt8(255), n)
    d3 = rand(UInt8(0):UInt8(255), n)
    d4 = rand(UInt8(0):UInt8(255), n)
    d5 = rand(UInt8(0):UInt8(255), n)
    d6 = rand(UInt8(0):UInt8(255), n)
    d7 = rand(UInt8(0):UInt8(255), n)
    d8 = rand(UInt8(0):UInt8(255), n)
    ds = Dataset(time=time, addr=addr, d1=d1, d2=d2, d3=d3, d4=d4, d5=d5, d6=d6, d7=d7, d8=d8)
    ds.addr[5] = missing
    ds
end

In this example I am creating 8 similar columns. How can I do that in a loop for m columns?

Maybe you could first create a Dict and then use the Dataset(::Dict) constructor, e.g.

d = Dict( "time" => 0.1:0.1:n*0.1, 
          "addr" => rand(Int16(0x101):Int16(0x12e), n))

for i in 1:42
    d[ "d" * string(i, pad=3)] = rand(UInt8(0):UInt8(255), n)
end

ds = Dataset(d)

(The leading zeros here are used to ensure that the columns are sorted by i.)
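To make the remark about leading zeros concrete, here is a minimal sketch using only Base Julia (no Dataset involved). It assumes the constructor orders columns by name, as the note above implies; the small `n`, the count of 12 columns, and the explicit `Dict{String, Any}` type are illustrative choices, not part of the original answer.

```julia
# Sketch: zero-padded names keep their numeric order when sorted as strings.
n = 5                                   # small row count, for illustration only
d = Dict{String, Any}("time" => 0.1:0.1:n*0.1)

for i in 1:12
    d["d" * string(i, pad=3)] = rand(UInt8, n)
end

# Column names with padding, sorted lexicographically
padded = sort([k for k in keys(d) if k != "time"])

# The same names without padding, sorted lexicographically
unpadded = sort(["d" * string(i) for i in 1:12])

println(padded[1:3])     # ["d001", "d002", "d003"]
println(unpadded[1:3])   # ["d1", "d10", "d11"]
```

Without padding, "d10" sorts before "d2", so the columns would not appear in order of i.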

I don’t know anything about Dataset, but, in case you don’t know, you can write

rand(UInt8, n)

instead of

rand(UInt8(0):UInt8(255), n)

It’s much faster, too:

julia> @btime rand(UInt8(0):UInt8(255), 1000);
  5.533 μs (1 allocation: 1.06 KiB)

julia> @btime rand(UInt8, 1000);
  171.714 ns (1 allocation: 1.06 KiB)

Oh, the difference gets even bigger for your array size:

julia> @btime rand(UInt8(0):UInt8(255), 100_000);
  549.300 μs (2 allocations: 97.73 KiB)

julia> @btime rand(UInt8, 100_000);
  5.380 μs (2 allocations: 97.73 KiB)
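
As a quick sanity check that the two forms are interchangeable here, the following sketch (plain Base Julia, variable names chosen for illustration) confirms that both calls return the same array type and that the explicit range covers exactly the full UInt8 domain:

```julia
n = 100_000

a = rand(UInt8(0):UInt8(255), n)   # sample from an explicit range
b = rand(UInt8, n)                 # sample from the whole type

same_type  = typeof(a) == typeof(b)                                  # both Vector{UInt8}
full_range = typemin(UInt8):typemax(UInt8) == UInt8(0):UInt8(255)

println(same_type)    # true
println(full_range)   # true
```

This shortcut only applies when the range spans the entire type. The `addr` column samples from the sub-range `Int16(0x101):Int16(0x12e)`, so it still needs the range form.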

If you don’t strictly need the loop, you can use a scheme like this:

julia> A=rand(UInt8, 6,5)
6×5 Matrix{UInt8}:
 0x44  0x47  0x5d  0xa1  0xa4
 0x5d  0xe4  0x22  0x27  0xc3
 0x25  0x4a  0x03  0x64  0xf7
 0x5b  0xd3  0x81  0x04  0x12
 0xce  0x4e  0xb9  0x2c  0xf6
 0x27  0xbe  0x6e  0x33  0x86

julia> ds=Dataset(A, :auto)
6×5 Dataset
 Row │ x1        x2        x3        x4        x5
     │ identity  identity  identity  identity  identity
     │ UInt8?    UInt8?    UInt8?    UInt8?    UInt8?
─────┼──────────────────────────────────────────────────
   1 │       68        71        93       161       164
   2 │       93       228        34        39       195
   3 │       37        74         3       100       247
   4 │       91       211       129         4        18
   5 │      206        78       185        44       246
   6 │       39       190       110        51       134

julia> rename(x-> replace(x, "x"=>"d"), ds)
6×5 Dataset
 Row │ d1        d2        d3        d4        d5
     │ identity  identity  identity  identity  identity
     │ UInt8?    UInt8?    UInt8?    UInt8?    UInt8?
─────┼──────────────────────────────────────────────────
   1 │       68        71        93       161       164
   2 │       93       228        34        39       195
   3 │       37        74         3       100       247
   4 │       91       211       129         4        18
   5 │      206        78       185        44       246
   6 │       39       190       110        51       134

Or use this scheme:

m,n=6, 7
A=rand(UInt8, m,n)
ds=Dataset(A, string.("d", 1:n))

PS
Why aren’t UInt8 integers shown as such?

I also don’t know anything about Dataset but this should do the same thing as your code:

function demo_data()
    n = 100000
    m = 8
    time = 0.1:0.1:n*0.1
    addr = rand(Int16(0x101):Int16(0x12e), n)
    columns = (Symbol("d", i) => rand(UInt8, n) for i in 1:m)
    ds = Dataset(;time, addr, columns...)
    ds.addr[5] = missing
    ds
end

Very nice! Seems to work!

But what is a Generator?

julia> columns = (Symbol("d", i) => rand(UInt8, n) for i in 1:m)
Base.Generator{UnitRange{Int64}, var"#5#6"}(var"#5#6"(), 1:8)

And why is it necessary to write columns... ?

It’s not needed; if you prefer, you can create an array with a comprehension instead:

columns = [Symbol("d", i) => rand(UInt8, n) for i in 1:m]

You can think of the generator as a “lazy” array which creates its elements on demand.

Documentation: Multi-dimensional Arrays · The Julia Language
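
On the question about the trailing `...`: a small sketch in plain Base Julia may help. Here `show_columns` is a hypothetical stand-in for a constructor that accepts columns as keyword arguments; it is not part of InMemoryDatasets, and the values are placeholders.

```julia
# Hypothetical stand-in: returns the names of the keyword arguments it receives.
show_columns(; kwargs...) = collect(keys(kwargs))

m = 3
gen       = (Symbol("d", i) => i * 10 for i in 1:m)   # generator: elements made on demand
pairs_vec = [Symbol("d", i) => i * 10 for i in 1:m]   # comprehension: a Vector of Pairs

# With `...`, each Symbol => value pair becomes its own keyword argument.
names_from_gen = show_columns(; time = 0.1, gen...)
names_from_vec = show_columns(; time = 0.1, pairs_vec...)

# Without `...`, `gen` is passed as ONE keyword argument named `gen`.
names_no_splat = show_columns(; gen)

println(names_from_gen)   # [:time, :d1, :d2, :d3]
println(names_no_splat)   # [:gen]
```

So the generator and the comprehension behave the same once splatted. The `...` is what spreads the pairs out into separate keyword arguments instead of passing the whole collection as a single argument.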