Learning statistics: seeking advice on how to model a 2d-size distribution of clothing items

Hi everyone, I’m new to Julia and have been trying it out for the past few weeks.

As I’m still only beginning to (re)learn statistics as well, I’ve set out to get some insight into the following problem (I didn’t want to start with neural nets :slight_smile: although if that is the answer to my question I will pursue that avenue).

I’m wondering how to model a clothing-size distribution for a shop that has to buy a multitude of clothing items. These items have categorical traits like brand, type and color, but they also have two sizes (i.e. length and width). I’ve been making contour plots of the frequencies with which these sizes occur, which already provided some insight (e.g. brandX doesn’t have the same 2d-histogram as brandY).

I’ve tried different approaches: a sampler using two Categoricals (from Distributions.jl) and I also tried fitting an MvNormal (although the sizes are of course discrete)

In order to fill an inventory of items I would then use said distribution and sample N sizes from it. But some items will have low cardinality (only a few recorded sales), and that is something I’d like to resolve. I’ve been reading about mixed-effects models, but I’m not sure I’m on the right track.
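
For reference, a minimal sketch of the sampling step, assuming made-up sales records and a single Categorical over the joint (waist, leg) cells rather than two independent ones. The names `sales`, `pairs_seen` and `inventory` are hypothetical, and this does nothing yet about the sparse items.

```julia
using Distributions, Random

# Hypothetical sales records as (waist, leg) pairs; not real data.
sales = [(32, 32), (32, 32), (32, 34), (34, 32), (34, 34), (32, 32), (36, 34)]

# Count each observed pair, so the dependence between the two sizes is kept.
counts = Dict{Tuple{Int,Int},Int}()
for s in sales
    counts[s] = get(counts, s, 0) + 1
end

pairs_seen = sort(collect(keys(counts)))
probs = [counts[p] for p in pairs_seen] ./ length(sales)

# One Categorical over the joint cells instead of two independent Categoricals.
joint = Categorical(probs)

# Sample N sizes to fill the inventory.
rng = MersenneTwister(1)
N = 10
inventory = pairs_seen[rand(rng, joint, N)]
```

Sampling the two sizes independently would throw away the correlation between them, which is why the cells are kept as pairs here.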

More so, there are some gaps in my statistical knowledge, maybe I’m missing something obvious.

I was thinking that a different coloured item might have roughly the same size distribution, but how would you quantify this? I keep running into walls because the problem is two-dimensional as well.

Furthermore: items from the same brand or type might also have size-distribution similarity. How would a model account for that?

As a naive approach I was thinking of computing a k-means index and using that to sample data for the sparser items. But I’m unsure this is a good direction to take.

Anything I can read or study? Pointers are more than welcome.

It seems the question is very ill defined. If I were a store, I’d be seeking a set of orders which would maximize my profit given historical shopping patterns. So if I tend to buy say all sizes equally but then have to remainder a lot of unusual sizes, then I’d want to adjust the distribution of my purchases to maximize profit (sales minus cost to purchase plus salvage value at remainder).

Since the sizes are discrete, you can use the quantity in each size as decision variables.
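
A rough sketch of what that could look like, assuming a common price, cost and salvage value for every size and a handful of historical demand figures per size. All numbers and names here are made up for illustration, and `argmax(f, itr)` needs Julia 1.7 or later.

```julia
# Hypothetical economics per item; all numbers are made up.
price, cost, salvage = 60.0, 25.0, 10.0

# Profit for ordering q units when demand turns out to be d:
# sales minus cost to purchase plus salvage value of the leftovers.
profit(q, d) = price * min(q, d) - cost * q + salvage * max(q - d, 0)

# Hypothetical historical demand per season for each size.
demand_history = Dict(
    "32x32" => [40, 55, 48, 60],
    "45x28" => [2, 0, 3, 1],
)

# Average profit over the historical seasons for a given order quantity.
expected_profit(q, history) = sum(profit(q, d) for d in history) / length(history)

# With nothing coupling the sizes (no shared budget), each quantity can be
# chosen separately by searching over the plausible order sizes.
best_order = Dict(
    label => argmax(q -> expected_profit(q, history), 0:maximum(history))
    for (label, history) in demand_history
)
```

A shared budget or shelf-space limit would tie the sizes together and turn this into a proper constrained optimization problem.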

Is this on the right track?

Hi @dlakelan thank you for taking the time to answer,

Sorry that the question is ill defined. The problem arises from the fact that there are a lot (100) of different sizes: with two size dimensions the number of combinations grows quickly, and once you add in a few colours and types we’re talking 1k items with the exact same price.

What I’m trying to accomplish: using distributions of older products for newer ones.

For example: What if I were to buy BrandX shoes knowing the size distribution of BrandY shoes.

The question I was pondering on: How much will these distributions be the same? How does one measure that (with only a few or no samples for BrandY).

And what about the known distribution of type of shoes (maybe both BrandX and BrandY are sporting shoes so then I could sample from a higher up class)

I’m not sure I’m using the correct jargon, but might these be prior and posterior distributions which I can then update using Bayes’ rule? If so, how does one measure the point at which the distribution is stable enough to use when buying a new batch?
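
To make the idea concrete, here is a minimal sketch of such an update, assuming a Dirichlet prior built from BrandX’s size shares and a multinomial model for the few BrandY sales. The counts are made up, and `strength` (how many pseudo-sales the BrandX history is worth) is a modelling assumption I picked arbitrarily.

```julia
# Size cells (waist x leg, flattened) and hypothetical sales counts.
cells    = ["30x32", "32x32", "34x32", "36x34"]
counts_x = [120, 300, 180, 60]   # dense history for BrandX (made up)
counts_y = [0, 2, 1, 2]          # a handful of BrandY sales (made up)

# Prior: BrandX proportions, weighted as if they were `strength` pseudo-sales.
strength = 20.0
alpha_prior = strength .* counts_x ./ sum(counts_x)

# Conjugate update: Dirichlet prior + multinomial counts -> Dirichlet posterior.
alpha_post = alpha_prior .+ counts_y

# Posterior mean share per cell and its marginal standard deviation.
a0 = sum(alpha_post)
post_mean = alpha_post ./ a0
post_sd = sqrt.(post_mean .* (1 .- post_mean) ./ (a0 + 1))
```

The posterior standard deviations shrink as more BrandY sales come in, which might be one way to judge whether the distribution is stable enough.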

Thanks again!

I guess I don’t understand what you mean by “distribution”. I imagine that say men’s pants come in some waist and length combos… You can buy any of the available sizes. So the distribution of sizes (how many of each size you should buy) is up to you. So it seems you need some reason to choose to buy more of size 32x32 and less of size 45x28 for example (because there are many more 32x32 people who would want to buy that, and very few 45x28 people who want to buy that). Is that what you are talking about?

Or are you talking about the distribution of sales? trying to estimate how much demand there is for each size?

Indeed the distribution of sales, trying to estimate the demand.

But, maybe BrandY has more demand for outliers (as it for example advertises for bigger people).

Using your example of pants, say that both brands carry wide, low hip and skinny jeans. Since I already know the size distributions for BrandX of those subcategories, I was wondering how to “transfer” those size-distributions to BrandY.

I think your question is really at the early stages, such that a forum on software is not really able to help you. As you point out, the people who shop for brand X are not the same group of people as those who shop for brand Y. So, this is an operations research question and has no definitive “correct” answer. If you can formulate a model, then people here can potentially help you to find appropriate computational packages to compute things from that model.

I run a small consulting company, if you want to private message me we can discuss a research project that might help build that model. I have actually worked for a company doing operations research on clothing mfg in the past (about 2005 or so).


I’m not really looking to solve this problem with a definitive answer and understand that no-one-size-fits-all :wink:

Thanks again for taking the time to respond, really appreciate it.

Sure, no worries. I like this kind of open ended stuff. It’s my favorite sort of thing.

If you simply want to take a distribution of sales for brand X and try to map the physical sizes of those objects (as say measured by a tape measure) to the nearest distribution of sizes in brand Y that gives closest agreement in physical sizes… this is perhaps a more concrete goal that people might help with. People tend to think that if you buy a 36x32 pair of pants from each brand, they should all be the same physical size… but that’s not true as I’m sure you know, “relaxed fit” vs “slim fit” and “big and tall” and “vanity sizing” are all things that vary from one mfg to another.

Glad to hear,

the problem you’re proposing (sizemapping tape measure sizes between brands) is also something I considered doing as well someday, don’t have much data on that though, but interesting as well.

But maybe we can simplify the original problem a bit, as I reckon I’ve been chewing off a bit too much. It’s just been fun revisiting all these wonderful statistical things I once learned about. I went down the rabbit hole a bit too deep I guess…

I’ll probably come up with better questions once I can clearly state the problem, so if you don’t mind me blabbering until put in the right direction I’d welcome the feedback.

So say that we pick a sale distribution of the following:

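using Distributions, DataFrames
# fit(MvNormal, X) expects a Float64 matrix with one observation per column
sizes(df) = permutedims(Matrix{Float64}(df[:, [:waist, :leg]]))
D_sales_global   = fit(MvNormal, sizes(data)) # all of the sales data
D_sales_series_x = fit(MvNormal, sizes(data[data.series .== "X", :]))
D_sales_series_y = fit(MvNormal, sizes(data[data.series .== "Y", :]))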

Is there a way to fit the β coefficients in the following totally made-up formula, knowing only a few points from the D_sales_series_z sales distribution?

D_sales_series_z = β₀ D_sales_global ⨁ β₁ D_sales_series_x ⨁ β₂ D_sales_series_y + ϵ 

I have no idea if such a thing exists hence the \bigoplus, but I do hope it brings across what my line of reasoning was:

  • taking D_sales_global as a basis would be better than a coinflip
  • taking a mixture of D_sales_global, D_sales_series_x and D_sales_series_y would probably still be better
  • knowing how much D_sales_series_z resembles the others might help in inferring even more

I’ve been playing with Gaussian Processes and I think something stuck from over there: knowing only a few points, and homing in with every extra point that becomes known. Might be totally off though.

I think what you’re talking about is called a Gaussian mixture model. Perhaps if you look into that topic, you can find some useful ideas. You can fit a Gaussian mixture model using a Bayesian methodology to find posterior distributions for the β values, for example.
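
As a simpler, non-Bayesian starting point, here is a minimal sketch that estimates only the mixture weights with EM while keeping the component distributions fixed. The component parameters and the series-Z points are made up, and `fit_weights` is a hypothetical helper, not something from Distributions.jl.

```julia
using Distributions

# Hypothetical, already-fitted components (means and covariances are made up).
components = [
    MvNormal([33.0, 32.0], [9.0 2.0; 2.0 4.0]),   # stands in for D_sales_global
    MvNormal([31.0, 32.0], [4.0 1.0; 1.0 3.0]),   # stands in for D_sales_series_x
    MvNormal([36.0, 33.0], [6.0 1.5; 1.5 4.0]),   # stands in for D_sales_series_y
]

# The few known series-Z sales as (waist, leg) points; made up as well.
z_points = [[36.0, 34.0], [35.0, 32.0], [37.0, 33.0], [32.0, 32.0]]

# Density of every point under every fixed component (points x components).
dens = [pdf(c, x) for x in z_points, c in components]

# EM for the mixture weights only; the components themselves stay fixed.
function fit_weights(dens; iters = 200)
    n, k = size(dens)
    w = fill(1 / k, k)
    for _ in 1:iters
        resp = dens .* w'                  # unnormalised responsibilities
        resp ./= sum(resp, dims = 2)       # normalise per point
        w = vec(sum(resp, dims = 1)) ./ n  # new weights = mean responsibility
    end
    return w
end

β = fit_weights(dens)
```

With only a handful of points the weights are very uncertain, which is where putting a prior on them and looking at the posterior would help.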


Thank you!

Is this an accurate summary of the problem?

  • The firm is considering the addition of a new product variant (brand) to its offering
  • All combinations of brand/size are the same cost to the firm (you said “we’re talking 1k items with the exact same price”)
  • Given that the only thing that varies from one brand to another is style and fit, you are trying to estimate demand for the potential new brand, based on sales data that you have for other brands

If this is accurate, what comes to mind is:

  • Is the new brand that’s being considered a substitute for existing brands that the firm sells? If so, you want to consider how much cannibalization will occur as a result of the introduction of the new brand.
  • If we’re talking about pants, for example, can you model the sales of brands X and Y using the product measurements/colors as explanatory variables and then use this to estimate sales for brand Z based on the available product measurements/colors? If so, and if you have enough data, there are many options for fitting linear and non-linear models in the Julia ecosystem. Consider GLM.jl, XGBoost.jl, MLJ.jl, Flux.jl, etc.

Hi, thank you for your response.

I’m trying to model how the size distribution would shift from one brand to a new brand, only having a small number of samples for the new brand.

For example, what would happen to the size distribution (of sales) when selling a new stretchy fabric? I’d imagine people would maybe buy (or keep) a wider range of sizes as they are more likely to fit.

In essence: can one use some parameters from a closely related distribution (with dense samples) to transfer them into another distribution where the samples are sparse.

Another example that came to mind: When a teacher has the math and language grades per student of past classes, is it possible to find the distribution of likely math grades of a new class only having done one language exam. I’d imagine using the correlation between both skills to start with, but since there has only been one language exam in the new class, how would one express/calculate the uncertainty that comes with the sparsity of only one sample?

Thank you for pointing me towards the packages you mentioned, I already used a few of them, will also look into others.

I am not sure I fully understand the problem, but I assume that this is about modeling demand for clothes. Specifically, you have consumers coming in and buying an item of a certain color/size/model/brand. There are several ways to model this; first I would go with a multilevel generalized linear model (Poisson family with a log link), e.g. Chapter 15 of Gelman and Hill: Data Analysis Using Regression and Multilevel/Hierarchical Models.

But keep in mind that price plays a key role in purchase decisions, and it is a choice variable for a firm, in addition to inventory. This becomes an optimization problem in a very large space, with a trade-off between exploration (experiment with prices to map the demand functions, which may have parameters that shift in time) and exploitation (optimal pricing given demand). These problems are usually handled via heuristics, and solving them requires some experience in operations research or a similar field.


Thank you for pointing out a book I could read to get a better grasp on the matter. I’ve seen the Poisson link you mentioned show up a few times when searching online, I’ll have a look at that.

I started exploring GLM.jl but AFAICT the @formula macro it uses does not support grouping terms such as (y ~ x|a), so I went looking elsewhere and found MixedModels.jl. I tried a few things in R as well (which has | and /), but would rather stay in Julia.

You’ve indeed hit the nail on the head that this is a very large problem space. I’ve chosen the problem to try and learn more about statistics, I’m in no hurry to solve it, but revisit it from time to time.

Thank you and everyone investing time to point me on my path, really appreciated!

I think the essential concern is this. While you can do all the statistics you like, there is no way to predict sales from statistics alone. You must also have a model of how the things that change from one situation to another affect the outcomes.

For example, if the original pants are honestly terrible but the people buy them so that other people can stare at the label on their butt, then when the new pants are made available without the old label and are otherwise completely identical (perhaps made on the same assembly line) then sales will plummet. On the other hand if people buy them for comfort and fit and durability and the label doesn’t matter then a competitor with improved color and sizing may sell even better than the original. Predicting what will happen requires a model of human behavior.

I would most likely use a Bayesian approach for this - however, I wouldn’t think a single observation is going to change anything. Let’s say the teacher has hundreds of past test grades and knows that they are distributed normally with a mean of 70 and a standard deviation of 25. Using a Bayesian approach, you could use that information as your prior and then estimate the new parameters (mean and std dev) based on the new test scores that come in for the new class. Once you have the posterior distributions for your mean and std dev, the credible intervals can be calculated from that.
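
A minimal sketch of that update for the class mean only, assuming the spread of individual grades (25) is known and assuming a prior standard deviation of 5 for how much class means vary. That 5 is my own made-up number, and `update_mean` is a hypothetical helper.

```julia
# Prior belief about the new class mean, from past classes.
prior_mean = 70.0
prior_sd   = 5.0    # assumed spread of class means; made up for illustration
obs_sd     = 25.0   # spread of individual grades, treated as known here

# Conjugate normal update for the mean given n grades with average ybar.
function update_mean(prior_mean, prior_sd, obs_sd, ybar, n)
    prior_prec = 1 / prior_sd^2
    data_prec  = n / obs_sd^2
    post_var   = 1 / (prior_prec + data_prec)
    post_mean  = post_var * (prior_prec * prior_mean + data_prec * ybar)
    return post_mean, sqrt(post_var)
end

# One observed grade barely moves the prior; thirty grades move it much more.
m1, s1   = update_mean(prior_mean, prior_sd, obs_sd, 90.0, 1)
m30, s30 = update_mean(prior_mean, prior_sd, obs_sd, 90.0, 30)

# Approximate 95% credible intervals for the class mean.
ci1  = (m1 - 1.96 * s1, m1 + 1.96 * s1)
ci30 = (m30 - 1.96 * s30, m30 + 1.96 * s30)
```

Estimating the standard deviation as well needs a joint prior on both parameters, which is where a probabilistic programming package becomes more convenient than a closed-form update.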


@dlakelan Yes indeed, human behaviour is a factor as you’ve explained. I understand that many other factors like price setting and the label being a vanity thing play a role in sales.

However, I’m not really looking to forecast sales, rather the distribution of sizes of said sales.
Maybe a ludicrous example: I would like to know if red-pants buyers buy bigger pants.

Thing is, there might be many such subspaces where size-distributions shift. I’m looking to find the ones that differ the most.

While I’m typing this, I’m thinking: is this just k-means over μ and σ? But that’s only part of the puzzle, updating for low cardinality in the subgroup is also a problem. How does one measure if the distribution has enough samples?
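
As a first stab at “which subgroups differ the most”, here is a minimal sketch that ranks subgroups by total variation distance from the pooled size distribution. The counts are made up, and the distance ignores sample size, so a small subgroup can look different purely by chance.

```julia
# Hypothetical sales counts per size cell for a few subgroups (made up).
cells = ["30x32", "32x32", "34x32", "36x34"]
subgroup_counts = Dict(
    "red"   => [5, 20, 30, 25],
    "blue"  => [30, 80, 50, 20],
    "black" => [60, 150, 100, 40],
)

# Pool everything to get the overall size distribution.
global_counts = reduce(+, values(subgroup_counts))
normalise(c) = c ./ sum(c)

# Total variation distance: half the summed absolute difference of two pmfs.
tv(p, q) = sum(abs.(p .- q)) / 2

p_global = normalise(global_counts)
distances = Dict(g => tv(normalise(c), p_global) for (g, c) in subgroup_counts)

# Subgroups ordered from most to least different from the pooled distribution.
ranking = sort(collect(keys(distances)), by = g -> distances[g], rev = true)
```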

I think I need to freshen up some basics as well, it’s been a while. Thank you for your patience (all).

Yes! I think I’ll try that. Thank you for your suggestion.


All but the first of those are available to download/read for free and have helped me immensely over the past few years as I’ve found myself doing a lot of applied statistical work for my employer after (like you) not having studied it for many years :slightly_smiling_face: