I have the following code:
using InMemoryDatasets
# create a demo data set of CAN bus messages
function demo_data()
    n = 100000
    time = 0.1:0.1:n*0.1
    addr = rand(Int16(0x101):Int16(0x12e), n)
    d1 = rand(UInt8(0):UInt8(255), n)
    d2 = rand(UInt8(0):UInt8(255), n)
    d3 = rand(UInt8(0):UInt8(255), n)
    d4 = rand(UInt8(0):UInt8(255), n)
    d5 = rand(UInt8(0):UInt8(255), n)
    d6 = rand(UInt8(0):UInt8(255), n)
    d7 = rand(UInt8(0):UInt8(255), n)
    d8 = rand(UInt8(0):UInt8(255), n)
    ds = Dataset(time=time, addr=addr, d1=d1, d2=d2, d3=d3, d4=d4, d5=d5, d6=d6, d7=d7, d8=d8)
    ds.addr[5] = missing
    ds
end
In this example I am creating 8 similar columns. How can I do that in a loop for m columns?
Maybe you could first create a Dict and then use the Dataset(::Dict) constructor, e.g.
d = Dict("time" => 0.1:0.1:n*0.1,
         "addr" => rand(Int16(0x101):Int16(0x12e), n))
for i in 1:42
    d["d" * string(i, pad=3)] = rand(UInt8(0):UInt8(255), n)
end
ds = Dataset(d)
(The leading zeros here are used to ensure that the columns are sorted by i.)
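To illustrate the remark about leading zeros, here is a minimal sketch in plain Base Julia (no Dataset involved). It only shows how lexicographic sorting treats padded and unpadded names; whether the Dataset(::Dict) constructor actually orders columns by sorted key is taken from the answer above and not verified here. The variable names `unpadded` and `padded` are made up for this example.

```julia
# Column names built without and with zero padding.
unpadded = ["d" * string(i) for i in 1:12]
padded   = ["d" * string(i, pad=3) for i in 1:12]

# Without padding, lexicographic order puts "d10" before "d2".
println(sort(unpadded)[1:4])     # ["d1", "d10", "d11", "d12"]

# With padding, lexicographic order matches numeric order.
println(sort(padded) == padded)  # true
```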
DNF
August 9, 2022, 6:45am
I don't know anything about Dataset, but, in case you don't know, you can write
rand(UInt8, n)
instead of
rand(UInt8(0):UInt8(255), n)
It's much faster, too:
julia> @btime rand(UInt8(0):UInt8(255), 1000);
  5.533 μs (1 allocation: 1.06 KiB)
julia> @btime rand(UInt8, 1000);
  171.714 ns (1 allocation: 1.06 KiB)
Oh, the difference gets even bigger for your array size:
julia> @btime rand(UInt8(0):UInt8(255), 100_000);
  549.300 μs (2 allocations: 97.73 KiB)
julia> @btime rand(UInt8, 100_000);
  5.380 μs (2 allocations: 97.73 KiB)
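As a quick sanity check that the two calls are interchangeable here, the sketch below (Base Julia only, with a smaller `n` chosen just for illustration) confirms that both return a `Vector{UInt8}` and that the explicit range already spans the whole type. The `addr` column is a different case: its range `Int16(0x101):Int16(0x12e)` covers only part of `Int16`, so the range form is still needed there.

```julia
n = 1_000

a = rand(UInt8(0):UInt8(255), n)  # sample from an explicit range
b = rand(UInt8, n)                # sample from the whole type

# Both calls produce the same container and element type.
println(typeof(a) == typeof(b) == Vector{UInt8})  # true

# The explicit range covers exactly the full UInt8 domain.
println((typemin(UInt8), typemax(UInt8)) == (UInt8(0), UInt8(255)))  # true
```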
If you don't strictly need the loop, you can use a scheme like this:
julia> A=rand(UInt8, 6,5)
6×5 Matrix{UInt8}:
 0x44  0x47  0x5d  0xa1  0xa4
 0x5d  0xe4  0x22  0x27  0xc3
 0x25  0x4a  0x03  0x64  0xf7
 0x5b  0xd3  0x81  0x04  0x12
 0xce  0x4e  0xb9  0x2c  0xf6
 0x27  0xbe  0x6e  0x33  0x86
julia> ds=Dataset(A, :auto)
6×5 Dataset
 Row │ x1        x2        x3        x4        x5
     │ identity  identity  identity  identity  identity
     │ UInt8?    UInt8?    UInt8?    UInt8?    UInt8?
─────┼──────────────────────────────────────────────────
   1 │       68        71        93       161       164
   2 │       93       228        34        39       195
   3 │       37        74         3       100       247
   4 │       91       211       129         4        18
   5 │      206        78       185        44       246
   6 │       39       190       110        51       134
julia> rename(x-> replace(x, "x"=>"d"), ds)
6×5 Dataset
 Row │ d1        d2        d3        d4        d5
     │ identity  identity  identity  identity  identity
     │ UInt8?    UInt8?    UInt8?    UInt8?    UInt8?
─────┼──────────────────────────────────────────────────
   1 │       68        71        93       161       164
   2 │       93       228        34        39       195
   3 │       37        74         3       100       247
   4 │       91       211       129         4        18
   5 │      206        78       185        44       246
   6 │       39       190       110        51       134
or this scheme
m,n=6, 7
A=rand(UInt8, m,n)
ds=Dataset(A, string.("d", 1:n))
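The `string.("d", 1:n)` call above relies on broadcasting to build the column names. A small sketch in Base Julia, independent of Dataset, shows what it produces; the `Symbol.` variant and the names `names_str` and `names_sym` are only added for illustration.

```julia
n = 3

# Broadcasting `string` over the range concatenates "d" with each index.
names_str = string.("d", 1:n)
println(names_str)  # ["d1", "d2", "d3"]

# The same pattern works with Symbol when symbols are preferred as names.
names_sym = Symbol.("d", 1:n)
println(names_sym)  # [:d1, :d2, :d3]
```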
PS
Why aren't UInt8 integers shown as such?
I also don't know anything about Dataset, but this should do the same thing as your code:
function demo_data()
    n = 100000
    m = 8
    time = 0.1:0.1:n*0.1
    addr = rand(Int16(0x101):Int16(0x12e), n)
    columns = (Symbol("d", i) => rand(UInt8, n) for i in 1:m)
    ds = Dataset(;time, addr, columns...)
    ds.addr[5] = missing
    ds
end
Very nice! Seems to work!
But what is a Generator?
julia> columns = (Symbol("d", i) => rand(UInt8, n) for i in 1:m)
Base.Generator{UnitRange{Int64}, var"#5#6"}(var"#5#6"(), 1:8)
And why is it needed to write columns... ?
It's not needed; if you prefer, you can create an array with a comprehension instead:
columns = [Symbol("d", i) => rand(UInt8, n) for i in 1:m]
You can think of the generator as a "lazy" array which creates its elements on demand.
Documentation: Multi-dimensional Arrays · The Julia Language
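To make the role of the trailing `...` concrete, here is a minimal sketch in Base Julia. It uses a hypothetical helper, `show_kwargs`, as a stand-in for any function that accepts arbitrary keyword arguments (it is not part of InMemoryDatasets). Splatting after the semicolon turns each `Symbol => value` pair into one keyword argument, and this works the same way for a generator and for an array of pairs.

```julia
# Hypothetical stand-in for a constructor taking arbitrary keyword arguments;
# it just reports the keyword names it received.
show_kwargs(; kwargs...) = collect(keys(kwargs))

m, n = 3, 4
lazy_cols  = (Symbol("d", i) => rand(UInt8, n) for i in 1:m)  # generator: elements made on demand
eager_cols = [Symbol("d", i) => rand(UInt8, n) for i in 1:m]  # comprehension: a Vector of pairs

println(lazy_cols isa Base.Generator)  # true
println(eager_cols isa Vector)         # true

# Each pair becomes one keyword argument, appended after the explicit ones.
println(show_kwargs(; time = 0.1:0.1:0.4, lazy_cols...))  # [:time, :d1, :d2, :d3]
println(show_kwargs(; eager_cols...))                     # [:d1, :d2, :d3]
```

Without the `...`, the whole collection would be passed as a single keyword argument named after the variable instead of as m separate columns.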