Here is a reproducible example; I explain the bug at the end. I am copying and pasting my code as-is, so some of the steps may not be needed, but the code is self-contained, documented, and not very complicated. It looks like a lot, but I tend to be verbose to make it easier on everyone.
using Distributions
using Statistics
using DataFrames
using CSV
using HTTP
using Dates
## get the data
contstates = ("AL", "AZ", "AR", "CA", "CO", "CT", "DE", "DC", "FL", "GA", "ID", "IL", "IN", "IA", "KS", "KY", "LA", "ME", "MD", "MA", "MI", "MN", "MS", "MO", "MT", "NE", "NV", "NH", "NJ", "NM", "NY", "NC", "ND", "OH", "OK", "OR", "PA", "RI", "SC", "SD", "TN", "TX", "UT", "VT", "VA", "WA", "WV", "WI", "WY")
f = download("https://covidtracking.com/api/v1/states/daily.csv") |> CSV.File |> DataFrame!
f = f[:, (1:4)]   # select only the first four columns
f.date = Date.(string.(f.date), DateFormat("yyyymmdd")) # convert the date column 
filter!(row -> row[:state] in contstates, f)  # remove unwanted states
sort!(f, [:state, :date])  # sort the data
gd = groupby(f, :state)  ## SET UP A GROUPED DATA FRAME based on state
Next I do some operations on the grouping:
function calc_incidence(cuminc)
    _tmp = circshift(cuminc, 1)
    _tmp[1] = 0
    cuminc - _tmp
end
transform!(gd, :positive => calc_incidence => :incidence)
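To make it clear what calc_incidence computes for a single group, here is a minimal sketch on a made-up cumulative series (the numbers are invented for illustration, not taken from the data set). It only uses Base Julia, and the comparison against diff is just a sanity check, not part of my actual code.

```julia
# Same function as above, repeated so this snippet runs on its own.
function calc_incidence(cuminc)
    _tmp = circshift(cuminc, 1)  # shift right by one; the last value wraps to the front
    _tmp[1] = 0                  # overwrite the wrapped value so day 1 keeps its full count
    cuminc - _tmp                # per-day increments
end

toy_cumulative = [1, 3, 6, 10]   # hypothetical cumulative positives for one state
toy_incidence = calc_incidence(toy_cumulative)

# The same increments via diff, with the first cumulative value prepended.
toy_check = vcat(toy_cumulative[1], diff(toy_cumulative))

println(toy_incidence)  # [1, 2, 3, 4]
```

So the function turns a cumulative count into daily new counts, and the first day's incidence equals the first cumulative value.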
The transform! function modifies the original f data frame, adding an incidence column computed within each group (i.e. for each state). Next, I just want a summary of the grouped (state) data:
# for each state, get the incidence on last day + the total cumulative f
f_summary = combine([:positive] => (p) -> (positive=p[end]), gd)
This should give me one value per state, and it does, but the grouping column gets messed up. The result is:
49×2 DataFrame
│ Row │ date       │ positive_function │
│     │ Date       │ Int64             │
├─────┼────────────┼───────────────────┤
│ 1   │ 2020-03-07 │ 74212             │
│ 2   │ 2020-03-06 │ 36259             │
│ 3   │ 2020-03-04 │ 152944            │
⋮
│ 46  │ 2020-01-22 │ 49247             │
│ 47  │ 2020-03-04 │ 49669             │
│ 48  │ 2020-03-06 │ 5550              │
│ 49  │ 2020-03-07 │ 2346              │
Why did it give me arbitrary dates? gd is grouped on state. I expected the result to be:
49×2 DataFrame
│ Row │ state  │ positive_function │
│     │ String │ Int64             │
├─────┼────────┼───────────────────┤
│ 1   │ AL     │ 74212             │
│ 2   │ AR     │ 36259             │
│ 3   │ AZ     │ 152944            │
⋮
│ 46  │ WA     │ 49247             │
│ 47  │ WI     │ 49669             │
│ 48  │ WV     │ 5550              │
│ 49  │ WY     │ 2346              │
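For reference, the shape of summary I am after can be sketched on a small made-up table. The data, the toy_ names, and the output column name positive_last are all invented for illustration. The call uses the combine(grouped, source => function => name) form with the grouped data frame as the first argument; that is an assumption about how the summary could be written, not a confirmed fix for the behaviour above.

```julia
using DataFrames

# Toy data (made up), already sorted by state and then by date order.
toy = DataFrame(state = ["AL", "AL", "AZ", "AZ"],
                positive = [10, 25, 7, 30])
toy_gd = groupby(toy, :state)

# For each group, take the last value of :positive.
# The grouping column (:state) is kept in the result by default.
toy_summary = combine(toy_gd, :positive => last => :positive_last)
```

On this toy input the result has one row per state, with the state column first and the last cumulative value next to it, which is the layout I expected from the real data.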