I’ve placed beginning of a Histogram implementation here:
https://gist.github.com/nbecker/1c02aedd99ed5bfa72bf9284b108f57b
I’d like to hear any suggestions for improving this.
One thing I’m not sure about is my use of parametrics. Am I overdoing it?
I know that there is a hist.jl in base, but I’m doing this because 1) I want a histogram that behaves differently (behavior on overflow is specified by a parameter, rather than silently ignored), and more because it’s a learning exercise.
Thanks for any feedback.
A couple of observations:
- Type parameters are generally capitalized by convention in Julia
- Parameterizing the integer type in
cnt_t
is fine, but is it really necessary? Are you planning on using smaller integers than Int64
for memory conservation or larger integers for overflow issues?
- The type of
ranges::Tuple{Tuple}
is incompletely defined (it matches a tuple with one element, where that element is any kind of tuple). That won’t perform well, and is kind of a strange choice anyway. There’s probably a better choice for that variable. What do you want it to contain?
Thanks for the suggestions! I’ve updated the gist. Now it works correctly for N-d.
I’m not really planning to use anything other than Int64, so maybe that parameter is unnecessary.
Yes, I know about StatsBase.jl Histogram. It doesn’t do what I want (overflow is silently ignored), and I wanted to use this as a learning exercise.
Any more suggestions on the new (now corrected) version?
I’ve updated my gist https://gist.github.com/nbecker/1c02aedd99ed5bfa72bf9284b108f57b. I added a type (range_t) to represent the ranges of each axis, hopefully this would be more performant since now all the types in histogram are specified?
Any more suggestions?
I find it more reasonable to use the standard Julia convention T
for a type parameter, rather than flt_t
, which feels too C-like.
1 Like
Might I also suggest you open an issue/PR on StatsBase for this? The lack of overflow warning is worth discussing.