Implementing a ceil function over a complete DataFrame

I am trying to port this Python code to Julia:
forecast_out = int(math.ceil(0.1 * len(df)))

In Julia, it looks to me like it should be:
forecast_out=int(ceil(0.1.*something))

I tried:
length(df)
df[:,:]
but I don’t know how to iterate it over the whole data frame using a simple function.

Also, how do I shift columns in Julia? In Python it would be something like this:
df['label'] = df['forecast_out'].shift(-forecast_out)

Use:

forecast_out = ceil(Int, 0.1 * nrow(df))
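A minimal sketch of how this line behaves, using a made-up row count `n` in place of `nrow(df)` so that it runs without DataFrames.jl. Passing `Int` as the first argument makes `ceil` return an integer directly; lowercase `int` is not a function in current Julia.

```julia
n = 35                                # hypothetical row count, standing in for nrow(df)

as_float = ceil(0.1 * n)              # plain ceil keeps the floating-point type
forecast_out = ceil(Int, 0.1 * n)     # rounds up and converts to Int in one step

println(as_float)        # 4.0
println(forecast_out)    # 4
```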

Can you please share sample input and output so that I can clearly understand what you want to achieve?

In general, to shift vectors, use ShiftedArrays.jl.
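To make the meaning of such a shift concrete, here is a package-free sketch of what pandas' `shift(-k)` produces: every value moves up by `k` positions and the tail is padded with `missing`. The helper name `shift_up` and the `prices` data are made up for this illustration and are not part of ShiftedArrays.jl.

```julia
# Hypothetical helper: move values up by k rows, padding the end with `missing`.
function shift_up(v::AbstractVector, k::Integer)
    0 <= k <= length(v) || throw(ArgumentError("k must be between 0 and length(v)"))
    return vcat(v[k+1:end], fill(missing, k))
end

prices = [10.0, 11.0, 12.0, 13.0, 14.0]   # made-up data
label = shift_up(prices, 2)               # 12.0, 13.0, 14.0, then two missing values
println(label)
```

In the working example in this thread, the `z` column produced by `lead` follows the same pattern.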

don’t know how to iterate it over the whole data frame using a simple function

I am not clear what you mean by “iterate” here.

Input csv:
My code:


And what I’m trying to get:

From what I understood of sentdex's explanation, he will use the data for linear regression, so he shifted the data 10 days back in order to predict future values.
His code:

You have copied my code incorrectly. Here is a working example:

julia> using DataFrames, ShiftedArrays

julia> df = DataFrame(x = rand(20))
20×1 DataFrame
│ Row │ x         │
│     │ Float64   │
├─────┼───────────┤
│ 1   │ 0.075843  │
│ 2   │ 0.0600812 │
│ 3   │ 0.253931  │
│ 4   │ 0.968895  │
│ 5   │ 0.711306  │
│ 6   │ 0.101996  │
│ 7   │ 0.358225  │
│ 8   │ 0.671676  │
│ 9   │ 0.775928  │
│ 10  │ 0.570634  │
│ 11  │ 0.181218  │
│ 12  │ 0.3468    │
│ 13  │ 0.255522  │
│ 14  │ 0.666059  │
│ 15  │ 0.0278363 │
│ 16  │ 0.371866  │
│ 17  │ 0.436382  │
│ 18  │ 0.0969218 │
│ 19  │ 0.551443  │
│ 20  │ 0.588202  │

julia> forecast_out = ceil(Int, 0.1 * nrow(df))
2

julia> df.y = lag(df.x, forecast_out)
20-element ShiftedArray{Float64,Missing,1,Array{Float64,1}}:
  missing
  missing
 0.075842978087997
 0.060081213070150685
 0.25393094533500715
 0.9688954060450929
 0.7113061474058524
 0.10199632559770944
 0.35822482674900225
 0.6716760713442311
 0.7759275692967518
 0.5706337841740747
 0.1812180540671371
 0.34680012499114965
 0.2555223672389133
 0.6660586930982471
 0.027836343894886317
 0.3718658603118037
 0.43638248519823497
 0.09692176577640299

julia> df.z = lead(df.x, forecast_out)
20-element ShiftedArray{Float64,Missing,1,Array{Float64,1}}:
 0.25393094533500715
 0.9688954060450929
 0.7113061474058524
 0.10199632559770944
 0.35822482674900225
 0.6716760713442311
 0.7759275692967518
 0.5706337841740747
 0.1812180540671371
 0.34680012499114965
 0.2555223672389133
 0.6660586930982471
 0.027836343894886317
 0.3718658603118037
 0.43638248519823497
 0.09692176577640299
 0.551442551906473
 0.5882020370297223
  missing
  missing

julia> df
20×3 DataFrame
│ Row │ x         │ y         │ z         │
│     │ Float64   │ Float64?  │ Float64?  │
├─────┼───────────┼───────────┼───────────┤
│ 1   │ 0.075843  │ missing   │ 0.253931  │
│ 2   │ 0.0600812 │ missing   │ 0.968895  │
│ 3   │ 0.253931  │ 0.075843  │ 0.711306  │
│ 4   │ 0.968895  │ 0.0600812 │ 0.101996  │
│ 5   │ 0.711306  │ 0.253931  │ 0.358225  │
│ 6   │ 0.101996  │ 0.968895  │ 0.671676  │
│ 7   │ 0.358225  │ 0.711306  │ 0.775928  │
│ 8   │ 0.671676  │ 0.101996  │ 0.570634  │
│ 9   │ 0.775928  │ 0.358225  │ 0.181218  │
│ 10  │ 0.570634  │ 0.671676  │ 0.3468    │
│ 11  │ 0.181218  │ 0.775928  │ 0.255522  │
│ 12  │ 0.3468    │ 0.570634  │ 0.666059  │
│ 13  │ 0.255522  │ 0.181218  │ 0.0278363 │
│ 14  │ 0.666059  │ 0.3468    │ 0.371866  │
│ 15  │ 0.0278363 │ 0.255522  │ 0.436382  │
│ 16  │ 0.371866  │ 0.666059  │ 0.0969218 │
│ 17  │ 0.436382  │ 0.0278363 │ 0.551443  │
│ 18  │ 0.0969218 │ 0.371866  │ 0.588202  │
│ 19  │ 0.551443  │ 0.436382  │ missing   │
│ 20  │ 0.588202  │ 0.0969218 │ missing   │

Thank you. Following your direction I got as far as the lag/lead functions, but it still gives this error:

MethodError: no method matching lag(::Array{Float64,1}, ::Int64)

Closest candidates are:

lag(!Matched::TimeSeries.TimeArray{T,N,D,A} where A<:AbstractArray{T,N} where D<:Dates.TimeType, ::Int64; padding, period) where {T, N} at C:\Users\ASHWANI\.julia\packages\TimeSeries\Kbq5K\src\apply.jl:12

1. **top-level scope** @ *Local: 2*

df.label = lag(df.adj_close, forecast_out)  # the line that raises the error

Have you loaded the ShiftedArrays.jl package? You have loaded the TimeSeries.jl package, which also exports lag, so you need to qualify the name as ShiftedArrays.lag, just as you would in Python when function names clash.
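The clash can be reproduced without installing either package. In the sketch below, `SeriesTools` and `VectorTools` are toy modules invented for this illustration, standing in for TimeSeries.jl and ShiftedArrays.jl respectively; their `lag` implementations are simplified and are not the packages' real code.

```julia
module SeriesTools                     # hypothetical stand-in for TimeSeries.jl
export lag
struct Series
    values::Vector{Float64}
end
# Only defined for the module's own Series type.
lag(s::Series, n::Int) = Series(vcat(fill(NaN, n), s.values[1:end-n]))
end

module VectorTools                     # hypothetical stand-in for ShiftedArrays.jl
export lag
# Defined for any plain vector.
lag(v::AbstractVector, n::Int) = vcat(fill(missing, n), v[1:end-n])
end

using .SeriesTools                     # unqualified `lag` now means SeriesTools.lag
import .VectorTools                    # brings in the module name only, not its exports

x = [1.0, 2.0, 3.0]

# lag(x, 1) would throw a MethodError here, because SeriesTools.lag
# has no method for a plain Vector{Float64}.
y = VectorTools.lag(x, 1)              # the qualified call picks the intended function
println(y)
```

Using `import` rather than `using` for the second module is one way to keep the unqualified name unambiguous; qualifying the call works either way.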

Thank you, that was the error.