That would work for some of the series, because a couple of them have ECDFs that lie pretty much on the same line when plotted as I show above. For most of them, though, the ECDFs are distinct, as the three shown above are.
I like the extreme value theory tips; this seems really promising based on what I’ve read so far. These data series have a lot of zeros, the bulk of the data lies within a pretty narrow range, and then you have these insanely long tails that cannot be ignored. I’ve been struggling to figure out what to use as a cutoff threshold for outliers and, after reading up on EVT, I now know why : )
The data are actually monetary values. It’s private data so I can’t disclose much about it, but you can almost think of them as purchase amounts from different stores selling different categories of goods/services (for the different series). Lots of people don’t buy anything, so there are lots of $0.00 values. Most people who do buy something don’t spend very much (maybe between $20 and a few thousand), but there are always a fair number of really large transactions, ranging anywhere from a few tens of thousands to hundreds of thousands to values in the millions.
I have no idea but I bet jewelry store transactions look a lot like this data set
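
For anyone following along on the threshold question: one common EVT diagnostic is the mean excess (mean residual life) function, which for a generalized Pareto tail is roughly linear in the threshold. Below is a minimal sketch on simulated data shaped loosely like the description above. The mix (60% zeros, a lognormal bulk, a Pareto tail), every parameter value, and the function names `mean_excess` and `simulate_amounts` are assumptions made up for illustration; none of it comes from the real data.

```python
import random


def mean_excess(values, threshold):
    """Average amount by which values exceed the threshold.

    Returns None when no value exceeds the threshold.
    """
    excesses = [v - threshold for v in values if v > threshold]
    if not excesses:
        return None
    return sum(excesses) / len(excesses)


def simulate_amounts(n, seed=0):
    """Hypothetical zero-inflated, heavy-tailed amounts (illustration only)."""
    rng = random.Random(seed)
    amounts = []
    for _ in range(n):
        u = rng.random()
        if u < 0.60:
            # Assumed share of customers who buy nothing.
            amounts.append(0.0)
        elif u < 0.97:
            # Assumed bulk: lognormal purchases with a median near e^5.
            amounts.append(rng.lognormvariate(5.0, 1.0))
        else:
            # Assumed tail: Pareto-distributed amounts starting at 10,000.
            amounts.append(10_000.0 * rng.paretovariate(1.5))
    return amounts


amounts = simulate_amounts(50_000)

# Handle the zeros separately and study only the positive amounts.
positive = sorted(a for a in amounts if a > 0)

# Evaluate the mean excess at a few empirical quantiles of the positive part.
for q in (0.50, 0.75, 0.90, 0.95, 0.99):
    u = positive[int(q * (len(positive) - 1))]
    n_exceed = sum(1 for a in positive if a > u)
    print(
        f"q={q:.2f}  threshold={u:12.2f}  "
        f"mean_excess={mean_excess(positive, u):14.2f}  exceedances={n_exceed}"
    )
```

In practice the values would be plotted against the threshold, and a candidate cutoff is the point above which the curve looks roughly linear while still leaving enough exceedances to fit a tail model. The estimates get noisy at high thresholds because few points remain, so this is a visual aid rather than an automatic rule.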