There is a problem with the minimum function, can anyone help me?

I want to find the min and max of a column of a matrix. I use the minimum function, but I get the problem below. Does anyone know what is happening?

image

thanks

It means you have a NaN in your vector. See for example

minimum([1,NaN,2])

NaN stands for Not a Number.

The problem is not in the call to the minimum function, but in the code that computed the S_Elastic3 vector.
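
To track down where the NaN comes from, it can help to check whether the vector contains one and at which positions. A minimal sketch, where v is a hypothetical stand-in for the vector from the question:

```julia
# Hypothetical stand-in for the vector from the question.
v = [2.5, NaN, 1.0, NaN]

has_nan = any(isnan, v)       # true if at least one entry is NaN
nan_idx = findall(isnan, v)   # indices of the NaN entries

println("contains NaN: ", has_nan)
println("NaN at indices: ", nan_idx)
```

The indices point at the entries whose computation is worth inspecting.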

You can do something like

julia> minimum(x for x ∈ [1, NaN, 2] if !isnan(x))
1.0

if you can’t change the fact that NaN values are occurring in your calculation but you’re still interested in the non-NaN minimum.
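
Applied to the original question (min and max of one column of a matrix), the same idea might look like the sketch below. The matrix A and the choice of column 2 are made up for illustration:

```julia
# Hypothetical 3×2 matrix; column 2 contains a NaN.
A = [1.0 4.0;
     2.0 NaN;
     3.0 0.5]

col = A[:, 2]                       # copy of the second column
clean = filter(!isnan, col)         # drop the NaN entries
col_min, col_max = extrema(clean)   # min and max in one pass

println("min = ", col_min, ", max = ", col_max)
```

Note that extrema throws an error on an empty collection, so a column consisting only of NaN would need separate handling.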

find the min & max of a column of a matrix

TL;DR: you should rather load DataFrames (using DataFrames) and call describe:

Missing values are filtered in the calculation of all statistics, however the column :nmissing will report the number of missing values of that variable.

I’m relying on whatever you use to import the data producing missing rather than NaN, because only the former is filtered. NaN can also arise in calculations, so it isn’t strictly a reliable marker for missing data, although I believe other languages, e.g. R, use it for that.

https://dataframes.juliadata.org/stable/man/comparisons/

Note that pandas skips NaN values in its analytic functions by default. By contrast, Julia functions do not skip NaN’s. If necessary, you can filter out the NaN’s before processing, for example, mean(Iterators.filter(!isnan, x)).

Pandas uses NaN for representing both missing data and the floating point “not a number” value. Julia defines a special value missing for representing missing data.
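
As a rough illustration of the two approaches in the quoted text, the sketch below first skips NaN lazily and then converts NaN to missing so that skipmissing can be used afterwards. The vector x is made up; Statistics is a standard library:

```julia
using Statistics

x = [1.0, NaN, 3.0]

# Option 1: skip NaN lazily, as in the quoted documentation.
m1 = mean(Iterators.filter(!isnan, x))

# Option 2: convert NaN to missing once, then rely on skipmissing.
y = [isnan(v) ? missing : v for v in x]   # Vector{Union{Missing, Float64}}
m2 = mean(skipmissing(y))

println(m1, " ", m2)
```

Converting once is only reasonable if NaN in your data really means "no value" rather than a failed calculation.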

Depending on the cause of NaNs, e.g. if an artifact of importing, you can filter or substitute them somehow (also something like interpolating may apply): Replacing *missing* and *NaN* values in dataframe - #2 by nilshg

Simply replacing NaN (or missing) with 0 isn’t good advice (replacing NaN with missing is likely better), but I noticed this blog post and it might be helpful: https://www.roelpeters.be/replacing-nan-missing-in-julia-dataframes/

Older text: You can do that in one go, at least this way:

julia> A = [NaN 3; 4 missing]
2×2 Matrix{Union{Missing, Float64}}:
 NaN    3.0
   4.0   missing

julia> extrema(x for x ∈ skipmissing(A) if !isnan(x))
(3.0, 4.0)

About column of a “matrix”, it seems clear you’re referring to a table, and would want to be using DataFrames (or Pandas.jl).

I intentionally showed that you could find the extrema (or just e.g. the minimum) of a full matrix (across columns), not just of one (or more) columns. You would want to slice one column (or row) at a time, as you know how to do. But I also looked a bit into doing that automatically for every column.
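
One way to do this for every column at once is to iterate over eachcol and skip both missing and NaN in each column. This is only a sketch; the matrix A and the helper name valid are made up for illustration:

```julia
# Hypothetical 3×2 matrix containing both NaN and missing.
A = [NaN 3.0; 4.0 missing; 1.0 7.0]

# Lazily keep only the entries that are neither missing nor NaN.
valid(col) = (x for x in skipmissing(col) if !isnan(x))

# One (min, max) tuple per column.
col_extrema = [extrema(valid(col)) for col in eachcol(A)]

println(col_extrema)
```

A column with no valid entries at all would make extrema throw, so that case needs its own check.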

I see you have the problem of a Vector{Any} because of “Sc_Young_Modulus”. If you see Any like that (an abstract type, the top one; you can’t rely on the Abstract prefix, but I think that’s the major (only?) exception), it’s likely going to kill performance. That’s one reason to use DataFrames or some other way to skip header rows. You want to see concrete types, e.g. a Vector of Float64 for your whole column; DataFrames also allows a different type for each column without performance problems. Julia is unusual in having this missing concept, which is similar to NaN but more general, since it works for all data types.
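
To illustrate the Vector{Any} issue, here is a small sketch. It assumes the column was read together with its header string; the numeric values are made up:

```julia
# Hypothetical column as it might look when the header row is read along with the data.
raw = Any["Sc_Young_Modulus", 210.0, NaN, 195.5]

# Drop the header entry and convert the rest to a concrete element type.
col = Float64.(raw[2:end])

println(typeof(raw), " -> ", typeof(col))

# With a concrete Float64 column, NaN can be filtered as usual.
lo = minimum(filter(!isnan, col))
println(lo)
```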

See “Handle Missing Data”, e.g. dropmissing! in the cheat sheet below.

I’m no expert on the package, so I’m not sure if it has functions as good as those in Pandas [EDIT: it seems as good]:

https://pandas.pydata.org/docs/getting_started/intro_tutorials/06_calculate_statistics.html

While there is the Pandas.jl wrapper, which would work with all Julia data that’s compatible with Python, I doubt it supports missing (because Python can’t support it; I also think the concept was introduced in Julia after that package). I’m not sure where you got NaN from, possibly an extra line when importing some data? You likely want to use CSV.jl to import. I’m not sure if it imports with missing, or possibly with both missing and NaN.

Even though I showed how to avoid both, just checking for missing (if you can rely on at most that occurring, or can even avoid expecting it) is going to be much faster, allocate less, and give simpler code:

julia> @time extrema(skipmissing(B))
  0.013557 seconds (11.08 k allocations: 612.520 KiB, 99.58% compilation time)

If you used readdlm and want to make minimal changes, then I would look into its non-default options: header=true, comments=true, comment_char='#'
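
A minimal sketch of the header=true option, assuming a comma-separated file with a single header row. The file contents are made up, and the comments options are not exercised here:

```julia
using DelimitedFiles

# Hypothetical file contents: one header row followed by numeric rows.
path, io = mktemp()
write(io, "Sc_Young_Modulus,S_Elastic3\n210.0,1.5\n195.5,0.5\n")
close(io)

# With header=true, readdlm returns the data and the header row separately,
# so the data matrix can have a concrete Float64 element type.
data, header = readdlm(path, ',', Float64; header=true)

col = data[:, 2]
col_min, col_max = extrema(col)

println(vec(header))
println("min = ", col_min, ", max = ", col_max)
```

Because the header string no longer ends up in the data, the column is a plain Vector{Float64} rather than a Vector{Any}.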
