Hi,
Suppose I have the following DataFrame with unspecified column types.
Q = DataFrame(datmonth = [], datstat = [], qrytime = [])
0×3 DataFrame
 Row │ datmonth  datstat  qrytime 
     │ Any       Any      Any     
─────┴────────────────────────────
julia> push!(Q,(datmonth = "2022-03", datstat = true, qrytime = now()))
1×3 DataFrame
 Row │ datmonth  datstat  qrytime                 
     │ Any       Any      Any                     
─────┼────────────────────────────────────────────
   1 │ 2022-03   true     2024-06-08T21:32:32.774
Once I append this to an .arrow file, the column types change:
Arrow.append("filepath.arrow", Q)
DataFrame(Arrow.Table("filepath.arrow"))
1×3 DataFrame
 Row │ datmonth  datstat  qrytime                 
     │ String    Bool     DateTime                
─────┼────────────────────────────────────────────
   1 │ 2022-03      true  2024-06-08T21:32:32.774
If a function generates new DataFrames whose columns hold different data types, or even the same data types as those in Q, they cannot be appended to the .arrow file.
# appending the same DataFrame a second time returns an error
Arrow.append("filepath.arrow", Q)
ERROR: ArgumentError: Table schema does not match existing arrow file schema
Stacktrace:
  [1] macro expansion
    @ ~/.julia/packages/Arrow/5pHqZ/src/append.jl:190 [inlined]
  [2] macro expansion
    @ ./task.jl:479 [inlined]
  [3] append(io::IOStream, source::DataFrame, arrow_schema::Tables.Schema{…}, compress::Nothing, largelists::Bool, denseunions::Bool, dictencode::Bool, dictencodenested::Bool, alignment::Int64, maxdepth::Int64, ntasks::Float64, meta::Nothing, colmeta::Nothing)
    @ Arrow ~/.julia/packages/Arrow/5pHqZ/src/append.jl:179
  [4] 
    @ Arrow ~/.julia/packages/Arrow/5pHqZ/src/append.jl:125
  [5] append
    @ ~/.julia/packages/Arrow/5pHqZ/src/append.jl:70 [inlined]
  [6] #149
    @ ~/.julia/packages/Arrow/5pHqZ/src/append.jl:64 [inlined]
  [7] open(::Arrow.var"#149#150"{@Kwargs{}, DataFrame}, ::String, ::Vararg{String}; kwargs::@Kwargs{})
    @ Base ./io.jl:396
  [8] open
    @ ./io.jl:393 [inlined]
  [9] #append#148
    @ ~/.julia/packages/Arrow/5pHqZ/src/append.jl:63 [inlined]
 [10] append(file::String, tbl::DataFrame)
    @ Arrow ~/.julia/packages/Arrow/5pHqZ/src/append.jl:62
 [11] top-level scope
    @ REPL[27]:1
Some type information was truncated. Use `show(err)` to see complete types.
I understand the workaround would be to specify the column types when initializing the DataFrame:
Q = DataFrame(datmonth = String[], datstat = Bool[], qrytime = DateTime[])
However, this does not work if values of different data types are generated and need to be appended to the arrow file. That is, even if the DataFrame is explicitly initialized as
Q = DataFrame(datmonth = Any[], datstat = Any[], qrytime = Any[])
creating a new arrow file via Arrow.append still converts each column to the concrete type of the data in the initial DataFrame, so the file schema no longer matches the Any columns on the next append.
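For the case where the new rows happen to have the same concrete types as the file, one thing that might help is narrowing the Any columns before each append. Below is a minimal sketch, assuming the DataFrames broadcasting idiom `identity.(df)` for narrowing element types; `narrow_columns` and `path` are hypothetical names used only for illustration. It does not address the case where the generated types genuinely differ from what is already in the file.

```julia
using Arrow, DataFrames, Dates

# Hypothetical helper: rebuild every column with the narrowest element type
# that fits its current values (here Any -> String, Bool, DateTime).
narrow_columns(df::DataFrame) = identity.(df)

Q = DataFrame(datmonth = [], datstat = [], qrytime = [])
push!(Q, (datmonth = "2022-03", datstat = true, qrytime = now()))

# Qn holds the same data as Q, but with concretely typed columns.
Qn = narrow_columns(Q)

# Hypothetical temporary file, so the sketch does not touch real data.
path = tempname() * ".arrow"

# The first call creates the file; the second is expected to pass the schema
# check because the narrowed column types match what the file stores.
Arrow.append(path, Qn)
Arrow.append(path, narrow_columns(Q))
```

Narrowing only works on a non-empty DataFrame, since an empty Any column has no values to infer a concrete type from.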
I'm wondering whether I am going about this the wrong way, or whether there is a kwarg to turn off this behavior. Thanks.