Read data from excel starting from second row

jacobcvt12 · May 16, 2018, 8:17pm

I’m trying to read data in an excel workbook that starts on the second row. Using ExcelFiles, I can I read into a dataframe if the data starts on the first row

DataFrame(load(fname, "sheetname")

But I’m not sure what to do when the data headers begin on the second (or later) row.

stillyslalom · May 16, 2018, 8:34pm

If you know the size of your data, you can explicitly specify the input range:

julia> DataFrame(load("test.xlsx", "Sheet1!A2:B5"))
3×2 DataFrames.DataFrame
│ Row │ test1 │ test2 │
├─────┼───────┼───────┤
│ 1   │ 1.0   │ 4.0   │
│ 2   │ 2.0   │ 5.0   │
│ 3   │ 3.0   │ 6.0   │

edit: looks like it might be better to use ExcelReaders.jl, which skips blank starting lines by default, or allows you to skip some number of lines:
readxlsheet("test.xlsx", "Sheet1", skipstartrows=1) |> DataFrame

davidanthoff · May 17, 2018, 12:12am

load(fname, "sheetname") should skip empty rows at the beginning. If not, it is a bug and it would be great if you could file an issue with a replication case.

You can also do things like load(fname, "sheetname", skipstartrows=1) to skip a row at the beginning. You would do that if the row actually contains stuff, i.e. if it is not blank.

The whole documentation for this is a bit in flux. I’m starting to consolidate things here, if you want to take a look at an early version of the new documentation.

jacobcvt12 · May 17, 2018, 2:18pm

Thanks all.

@davidanthoff This first row of my spreadsheet does have data, just not actually part of a tabular dataset, so I think this is the intended behavior. However, the skipstartrows keyword only works with readxlsheet, not load in my environment. Is this something that’s available in master but not version 0.3?

davidanthoff · May 17, 2018, 4:39pm

Ah, that is a bug! Fixed on master, once https://github.com/JuliaLang/METADATA.jl/pull/14692 is merged it will go out widely. Thanks for reporting!

jacobcvt12 · May 17, 2018, 4:42pm

Thanks @davidanthoff! And thanks for the great packages

Topic		Replies	Views
"Skip" parameter XLSX package New to Julia r , xlsx , excel	2	528	November 30, 2021
Reading in xlsx with multiple header rows using the XLSX Pkg General Usage xlsx	2	1662	May 24, 2021
Excel data to dataframes Data data , dataframes , xlsx , excel	16	4917	July 25, 2021
Read a .xlsx file General Usage question , xlsx	15	678	November 19, 2024
Not sure how to use XLSX General Usage	3	316	March 24, 2020

Read data from excel starting from second row

Related topics