Please critique my Histogram.jl implementation


#1

I’ve placed beginning of a Histogram implementation here:
https://gist.github.com/nbecker/1c02aedd99ed5bfa72bf9284b108f57b

I’d like to hear any suggestions for improving this.
One thing I’m not sure about is my use of parametrics. Am I overdoing it?

I know that there is a hist.jl in base, but I’m doing this because 1) I want a histogram that behaves differently (behavior on overflow is specified by a parameter, rather than silently ignored), and more because it’s a learning exercise.

Thanks for any feedback.


#2

A couple of observations:

  • Type parameters are generally capitalized by convention in Julia
  • Parameterizing the integer type in cnt_t is fine, but is it really necessary? Are you planning on using smaller integers than Int64 for memory conservation or larger integers for overflow issues?
  • The type of ranges::Tuple{Tuple} is incompletely defined (it matches a tuple with one element, where that element is any kind of tuple). That won’t perform well, and is kind of a strange choice anyway. There’s probably a better choice for that variable. What do you want it to contain?

#3

StatsBase.jl has a Histogram type. See http://statsbasejl.readthedocs.io/en/latest/empirical.html#histograms


#4

Thanks for the suggestions! I’ve updated the gist. Now it works correctly for N-d.

I’m not really planning to use anything other than Int64, so maybe that parameter is unnecessary.

Yes, I know about StatsBase.jl Histogram. It doesn’t do what I want (overflow is silently ignored), and I wanted to use this as a learning exercise.

Any more suggestions on the new (now corrected) version?


#5

I’ve updated my gist https://gist.github.com/nbecker/1c02aedd99ed5bfa72bf9284b108f57b. I added a type (range_t) to represent the ranges of each axis, hopefully this would be more performant since now all the types in histogram are specified?

Any more suggestions?


#6

I find it more reasonable to use the standard Julia convention T for a type parameter, rather than flt_t, which feels too C-like.


#7

Might I also suggest you open an issue/PR on StatsBase for this? The lack of overflow warning is worth discussing.