Why is the histogram for the WildType data normalised differently when plotted with other data compared to being plotted alone?
This plot command
plot(df[df.genotype .== "WildType", :], x=:level, Geom.histogram(bincount=200), Scale.x_log10, color=:genotype, Coord.cartesian(xmin=-2, xmax=1, ymin=0, ymax=8000))
produces this plot
while this plot command
plot(df, x=:level, Geom.histogram(bincount=200), Scale.x_log10, color=:genotype, Coord.cartesian(xmin=-2, xmax=1, ymin=0, ymax=8000))
produces
The WildType counts are roughly 3x larger in the second plot. This does not happen to the KO data if I plot it by itself. Am I missing something about how Gadfly histograms work?
Here is a usable example that reproduces the behaviour.
using Distributions, Gadfly, DataFrames
d1 = Normal(0, 1)
d2 = Normal(1, 0.5)
x = rand(d1, 10000)
df = DataFrame(x=x, class="Class 1")
x = 0.5 .* rand(d2, 10000)
append!(df, DataFrame(x=x, class="Class 2"))
spike = -2.0 .* ones(500)
append!(df, DataFrame(x=spike, class="Class 2"))
h = plot(df[df.class .== "Class 1", :], 
  x=:x, 
  color=:class, 
  Geom.histogram(bincount=200), 
  Coord.cartesian(xmin=-4, xmax=4, ymin=0, ymax=1000)
  )
display(h)
h = plot(df,
  x=:x, 
  color=:class, 
  Geom.histogram(bincount=200), 
  Coord.cartesian(xmin=-4, xmax=4, ymin=0, ymax=1000)
  )
display(h)
This produces:
It looks like it has plotted Class 1 as the sum of Class 1 and Class 2. Is this to be expected? Is there a way to turn this behaviour off?
@evan-wehi try to reach the Gadfly.jl maintainers in their community channels. They use Gitter instead of Discourse.
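My bad - I missed it in the documentation. This behaviour is intended and is controlled by the position argument of Geom.histogram, which defaults to :stack: the bars for each class are stacked on top of one another, so the top of each stack shows the combined count rather than the count for a single class. To get the classes drawn side by side I need to use position=:dodge.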
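For anyone who lands here later, this is a minimal sketch of the fix applied to the reproducible example from earlier in the thread. It assumes the same synthetic data (rebuilt here so the snippet runs on its own); the only change to the plot call is the position=:dodge keyword on Geom.histogram.

```julia
using Distributions, Gadfly, DataFrames

# Rebuild the synthetic data from the example earlier in the thread
df = DataFrame(x=rand(Normal(0, 1), 10000), class="Class 1")
append!(df, DataFrame(x=0.5 .* rand(Normal(1, 0.5), 10000), class="Class 2"))
append!(df, DataFrame(x=-2.0 .* ones(500), class="Class 2"))

# position=:dodge draws the bars for each class next to each other
# instead of stacking them, so bar heights are per-class counts
h = plot(df,
  x=:x,
  color=:class,
  Geom.histogram(bincount=200, position=:dodge),
  Coord.cartesian(xmin=-4, xmax=4, ymin=0, ymax=1000)
  )
display(h)
```

With this change the Class 1 bars should match the ones in the Class 1-only plot, since each bar now starts at zero rather than on top of the other class.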