Dataframe encoding

question
#1
 using Pkg
    Pkg.add("CSV")
    Pkg.add("DataFrames")
    Pkg.add("SQLite")

    using CSV
    using DataFrames
    using SQLite

    path = "/Users/Desktop/Data"

    ref = CSV.File(path, "filename.csv"))
    println(size(ref))
    first(ref, 10)

    Ref = SQLite.DB() 
    SQLite.load!(filename, path) 

    DataFrame(SQLite.Query(ref,"""select * from tablename
          where date > '1999-01-01'
          """))       
0 Likes

#2

my guess - “Orderdate” type is a BLOB - and it is strange …

$ sqlite3 acme.sqlite 
SQLite version 3.22.0 2018-01-22 18:45:57
Enter ".help" for usage hints.
sqlite> PRAGMA table_info(orders);
0|OrderID|INT|0||0
1|CustomersID|INT|0||0
2|EmployeeID|INT|0||0
3|Orderdate|BLOB|0||0
4|ShipperID|INT|0||0
sqlite> 

created with (Julia 1.0.1; SQLite v0.7.0 )

using CSV, DataFrames, SQLite
orders = CSV.read("orders.csv")
acme = SQLite.DB("acme.sqlite") 
SQLite.load!(orders, acme, "orders") 
$ cat orders.csv
OrderID,CustomersID,EmployeeID,Orderdate,ShipperID
10248,90,5,1996-07-04,3
10249,81,6,1996-07-05,1
0 Likes

#3

maybe this is related: https://github.com/JuliaDatabases/SQLite.jl/issues/160

0 Likes

#4

How do I call string on the date values? Should I do it when reading the CSV file to Julia

0 Likes

#5

Tried this, didn’t work

    mydateformat = Dates.DateFormat("yyyy-mm-dd")
    Dates.Date.(["orders[:,4]",mydateformat])
0 Likes

#6

How do I call string on the date values?

simple workaround: try on the CSV reading part - reading as a String

  • orders = CSV.read( IOBuffer(csv) ; types=[Int64,Int64,Int64,String,Int64] )

My log ( Julia 1.1 )

julia> using CSV, DataFrames, SQLite
julia> csv = """OrderID,CustomersID,EmployeeID,Orderdate,ShipperID
       10248,90,5,1990-07-04,3
       10249,81,6,1996-07-05,1
       """
"OrderID,CustomersID,EmployeeID,Orderdate,ShipperID\n10248,90,5,1990-07-04,3\n10249,81,6,1996-07-05,1\n"
julia> orders = CSV.read( IOBuffer(csv) ; types=[Int64,Int64,Int64,String,Int64] )
2×5 DataFrame
│ Row │ OrderID │ CustomersID │ EmployeeID │ Orderdate  │ ShipperID │
│     │ Int64   │ Int64       │ Int64      │ String     │ Int64     │
├─────┼─────────┼─────────────┼────────────┼────────────┼───────────┤
│ 1   │ 10248   │ 90          │ 5          │ 1990-07-04 │ 3         │
│ 2   │ 10249   │ 81          │ 6          │ 1996-07-05 │ 1         │
julia> acme = SQLite.DB()
SQLite.DB(in-memory)
julia> SQLite.load!(orders, acme, "orders")
"orders"
julia> DataFrame(SQLite.Query(acme,"""select * from orders
                 where orderdate = '1996-07-05'
                 """))
1×5 DataFrame
│ Row │ OrderID │ CustomersID │ EmployeeID │ Orderdate  │ ShipperID │
│     │ Int64⍰  │ Int64⍰      │ Int64⍰     │ String⍰    │ Int64⍰    │
├─────┼─────────┼─────────────┼────────────┼────────────┼───────────┤
│ 1   │ 10249   │ 81          │ 6          │ 1996-07-05 │ 1         │

code:

using CSV, DataFrames, SQLite
csv = """OrderID,CustomersID,EmployeeID,Orderdate,ShipperID
10248,90,5,1990-07-04,3
10249,81,6,1996-07-05,1
"""
orders = CSV.read( IOBuffer(csv) ; types=[Int64,Int64,Int64,String,Int64] )
acme = SQLite.DB() 
SQLite.load!(orders, acme, "orders") 
DataFrame(SQLite.Query(acme,"""select * from orders
          where orderdate = '1996-07-05'
          """))  
0 Likes

#7

ImreSamu - You the man, thanks

1 Like