I am trying to port the following Python code to Julia:
forecast_out =int(math.ceil(0.1*len(df)))
In Julia, my attempt looks like this:
forecast_out=int(ceil(0.1.*something))
I tried:
length(df)
df[:,:]
but I don't know how to iterate over the whole data frame using a simple function.
Also, how do I shift a column in Julia? In Python it looks like this:
df['label'] = df['forecast_out'].shift(-forecast_out)
Use:
forecast_out = ceil(Int, 0.1 * nrow(df))
Can you please share sample input and output so that I can clearly understand what you want to achieve?
In general, to shift vectors use ShiftedArrays.jl.
"don't know how to iterate it over the whole data frame using a simple function"
I am not sure what you mean by "iterate" here.
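To make the translation of the Python line concrete, here is a minimal sketch using only Base Julia. The helper name forecast_horizon and its fraction keyword are hypothetical, and the plain integer n_rows stands in for nrow(df). Julia 1.x has no lowercase int function; passing Int as the first argument to ceil rounds and converts in one step.

```julia
# Hypothetical helper: number of rows to forecast, given a row count.
# Python equivalent: int(math.ceil(0.1 * len(df)))
forecast_horizon(n_rows::Integer; fraction = 0.1) = ceil(Int, fraction * n_rows)

# n_rows stands in for nrow(df); size(df, 1) would give the same number.
n_rows = 20
forecast_out = forecast_horizon(n_rows)

println(forecast_out)          # prints 2
println(forecast_out isa Int)  # prints true
```

With a real data frame the call would be forecast_horizon(nrow(df)), which is the same computation as the one-liner above.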
Input csv:
My code:
And what I'm trying to get:
As I understand it, sentdex intends to use the data for linear regression, so he shifted the data 10 days back in order to predict future values.
His code:
bkamins, October 29, 2020, 10:11am
You have copied my code incorrectly. Here is a working example:
julia> using DataFrames, ShiftedArrays
julia> df = DataFrame(x = rand(20))
20×1 DataFrame
│ Row │ x │
│ │ Float64 │
├─────┼───────────┤
│ 1 │ 0.075843 │
│ 2 │ 0.0600812 │
│ 3 │ 0.253931 │
│ 4 │ 0.968895 │
│ 5 │ 0.711306 │
│ 6 │ 0.101996 │
│ 7 │ 0.358225 │
│ 8 │ 0.671676 │
│ 9 │ 0.775928 │
│ 10 │ 0.570634 │
│ 11 │ 0.181218 │
│ 12 │ 0.3468 │
│ 13 │ 0.255522 │
│ 14 │ 0.666059 │
│ 15 │ 0.0278363 │
│ 16 │ 0.371866 │
│ 17 │ 0.436382 │
│ 18 │ 0.0969218 │
│ 19 │ 0.551443 │
│ 20 │ 0.588202 │
julia> forecast_out = ceil(Int, 0.1 * nrow(df))
2
julia> df.y = lag(df.x, forecast_out)
20-element ShiftedArray{Float64,Missing,1,Array{Float64,1}}:
missing
missing
0.075842978087997
0.060081213070150685
0.25393094533500715
0.9688954060450929
0.7113061474058524
0.10199632559770944
0.35822482674900225
0.6716760713442311
0.7759275692967518
0.5706337841740747
0.1812180540671371
0.34680012499114965
0.2555223672389133
0.6660586930982471
0.027836343894886317
0.3718658603118037
0.43638248519823497
0.09692176577640299
julia> df.z = lead(df.x, forecast_out)
20-element ShiftedArray{Float64,Missing,1,Array{Float64,1}}:
0.25393094533500715
0.9688954060450929
0.7113061474058524
0.10199632559770944
0.35822482674900225
0.6716760713442311
0.7759275692967518
0.5706337841740747
0.1812180540671371
0.34680012499114965
0.2555223672389133
0.6660586930982471
0.027836343894886317
0.3718658603118037
0.43638248519823497
0.09692176577640299
0.551442551906473
0.5882020370297223
missing
missing
julia> df
20×3 DataFrame
│ Row │ x │ y │ z │
│ │ Float64 │ Float64? │ Float64? │
├─────┼───────────┼───────────┼───────────┤
│ 1 │ 0.075843 │ missing │ 0.253931 │
│ 2 │ 0.0600812 │ missing │ 0.968895 │
│ 3 │ 0.253931 │ 0.075843 │ 0.711306 │
│ 4 │ 0.968895 │ 0.0600812 │ 0.101996 │
│ 5 │ 0.711306 │ 0.253931 │ 0.358225 │
│ 6 │ 0.101996 │ 0.968895 │ 0.671676 │
│ 7 │ 0.358225 │ 0.711306 │ 0.775928 │
│ 8 │ 0.671676 │ 0.101996 │ 0.570634 │
│ 9 │ 0.775928 │ 0.358225 │ 0.181218 │
│ 10 │ 0.570634 │ 0.671676 │ 0.3468 │
│ 11 │ 0.181218 │ 0.775928 │ 0.255522 │
│ 12 │ 0.3468 │ 0.570634 │ 0.666059 │
│ 13 │ 0.255522 │ 0.181218 │ 0.0278363 │
│ 14 │ 0.666059 │ 0.3468 │ 0.371866 │
│ 15 │ 0.0278363 │ 0.255522 │ 0.436382 │
│ 16 │ 0.371866 │ 0.666059 │ 0.0969218 │
│ 17 │ 0.436382 │ 0.0278363 │ 0.551443 │
│ 18 │ 0.0969218 │ 0.371866 │ 0.588202 │
│ 19 │ 0.551443 │ 0.436382 │ missing │
│ 20 │ 0.588202 │ 0.0969218 │ missing │
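For comparison with the pandas call in the question: shift with a negative argument moves values towards the start of the column, which is what lead does, while lag matches a positive shift. The sketch below assumes ShiftedArrays.jl is installed and uses a small made-up vector named adj_close in place of a real data frame column. The functions are called with the module prefix so the code does not depend on what other loaded packages export.

```julia
import ShiftedArrays  # qualified access avoids clashes with other packages' lag/lead

# Hypothetical values standing in for a data frame column such as df.adj_close.
adj_close = [10.0, 20.0, 30.0, 40.0, 50.0]
forecast_out = 2

# pandas: shift(-forecast_out)
# lead moves values towards the start: label[i] == adj_close[i + forecast_out]
label = ShiftedArrays.lead(adj_close, forecast_out)

# pandas: shift(forecast_out)
# lag moves values towards the end: lagged[i] == adj_close[i - forecast_out]
lagged = ShiftedArrays.lag(adj_close, forecast_out)

# collect materialises the lazy shifted view into an ordinary Vector.
println(collect(label))   # values: 30.0, 40.0, 50.0, missing, missing
println(collect(lagged))  # values: missing, missing, 10.0, 20.0, 30.0
```

The result can be assigned to a data frame column in the same way as df.y and df.z in the example above.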
Thank you. With your guidance I got as far as the lag/lead functions, but I still get this error:
MethodError: no method matching lag(::Array{Float64,1}, ::Int64)
Closest candidates are:
lag(!Matched::TimeSeries.TimeArray{T,N,D,A} where A<:AbstractArray{T,N} where D<:Dates.TimeType, ::Int64; padding, period) where {T, N} at C:\Users\ASHWANI\.julia\packages\TimeSeries\Kbq5K\src\apply.jl:12
1. top-level scope @ Local: 2
df.label = lag(df.adj_close, forecast_out)  # the call that raises the error
bkamins, October 29, 2020, 10:33am
Have you loaded the ShiftedArrays.jl package? You loaded the TimeSeries.jl package, which also exports lag, so you need to qualify the name as ShiftedArrays.lag, just as you would in Python when function names clash.
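The qualification mechanism can be illustrated without installing anything. In the sketch below, PkgA and PkgB are toy modules invented for this illustration; they stand in for two packages that both export a function called lag. Writing the module name in front of the function selects the intended implementation regardless of what else is loaded.

```julia
# Two toy modules (hypothetical) that both export a function named `lag`.
module PkgA
export lag
# Shift a vector towards the end, padding the start with missing.
lag(v::AbstractVector, n::Integer) = vcat(fill(missing, n), v[1:end-n])
end

module PkgB
export lag
# An unrelated function that happens to share the name.
lag(s::AbstractString, n::Integer) = "lagged $s by $n"
end

using .PkgA, .PkgB

v = [1, 2, 3, 4]

# The module prefix states explicitly which `lag` is meant.
shifted = PkgA.lag(v, 1)
println(shifted)           # values: missing, 1, 2, 3
println(PkgB.lag("x", 1))  # prints lagged x by 1
```

In the thread's case, the corresponding call is ShiftedArrays.lag(df.adj_close, forecast_out).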
Thank you, that was the error.