Transpose columns to rows

davide445 · October 28, 2023, 2:40pm

To prepare a list of customers (each identified by an id) and their 10 years of historical system usage data for import into an ERP system, I needed to reshape the data from

customer1, y1:value1, y2:value2, … y10:value10
customer2, y1:value1, y2:value2…
…

to
customer1, y1:value1
customer1, y2:value2
…
customer1, y10:value10
customer2, y1:value1
…

There are 889 customers in the file. Not finding a way to do this in Excel, even using ChatGPT, I decided to dust off my Julia installation and try this exercise.

I came up with the solution below. For the sake of learning, I wanted to kindly ask whether there is (I'm sure there is) a better and easier way to get this done.

using DelimitedFiles

inv = DelimitedFiles.readdlm("/home/Documents/Cumulative_import.csv", ',';header=false, skipstart=0)

work = zeros(Float64, 8890,2)

for i = 1:10
    work[i,1]=inv[1,1]
    work[i,2]=inv[1,2]
end

for i = 2:889
    for j = ((i-1)*10)+1:((i-1)*10)+10
        work[j,1] = inv[i,1]
    end
end

v=inv[1,2:11]

work[1:10,2]=v

for i = 2:889
    v=inv[i,2:11]
    for j = ((i-1)*10)+1:((i-1)*10)+10
        if mod(j/10,1) > 0
            k=trunc(Int,round(mod(j/10,1);digits=1)*10)
        else
            k=10
        end
        work[j,2]=v[k]
    end
end

DelimitedFiles.writedlm("/home/Documents/Cumulative_import_export.csv", work,',')
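One possible simplification of the index bookkeeping above is to use integer arithmetic instead of `mod(j/10, 1)`: output row `j` comes from input row `cld(j, n)` and value number `mod1(j, n)`, where `n` is the number of values per customer. A minimal sketch on a made-up matrix (the names `toy` and `long` and all numbers are hypothetical, not the real file):

```julia
# Hypothetical stand-in for `inv`: 3 customers, 3 yearly values each.
toy = [101.0 1.0 2.0 3.0;
       102.0 4.0 5.0 6.0;
       103.0 7.0 8.0 9.0]

n = size(toy, 2) - 1                  # number of value columns per customer
long = zeros(Float64, size(toy, 1) * n, 2)

for j in axes(long, 1)
    i = cld(j, n)                     # source row (customer) for output row j
    k = mod1(j, n)                    # which of the n values, always in 1:n
    long[j, 1] = toy[i, 1]            # repeat the customer id
    long[j, 2] = toy[i, k + 1]        # values start in column 2
end
```

Because `cld` and `mod1` work on integers, no rounding or special case for the tenth value is needed, and the first customer does not have to be handled separately.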

If anyone is interested, I can provide the original CSV and the final result.

aplavin · October 28, 2023, 3:03pm

Probably the simplest no-dependency solution:

work = mapreduce(vcat, eachrow(inv)) do row
    stack(row[2:end], dims=1) do value
        (row[1], value)
    end
end

(using your inv variable, and writing to the work variable you want)
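`stack` was added to Base in Julia 1.9. On an older version, a comprehension can do roughly the same job. The following is only a sketch on made-up data (`toy` and `long` are illustrative names, not the real file): it builds one 1×2 row per (id, value) pair and concatenates them vertically.

```julia
# Hypothetical stand-in for `inv`: 2 customers, 2 values each.
toy = [101.0 1.0 2.0;
       102.0 3.0 4.0]

# For every row, pair the id (first element) with each remaining value,
# then stack the resulting 1x2 matrices on top of each other.
long = reduce(vcat, [[row[1] v] for row in eachrow(toy) for v in row[2:end]])
```

The rows come out grouped by customer, in the same order as the input.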

Dan · October 28, 2023, 3:05pm

Here is one function which might be fashioned into a working solution:

function inout(fname)
    for line in readlines(fname)
        fields = strip.(split(line, ','))
        for i in 2:length(fields)
            println(fields[1], ", ", fields[i])
        end
    end
end

With this I got:

julia> inout("in.txt")
customer1, y1:value1
customer1, y2:value2
customer1, y10:value10
customer2, y1:value1
customer2, y2:value2

from:

customer1, y1:value1, y2:value2, y10:value10
customer2, y1:value1, y2:value2

There are many ways to go about this… but quite probably the problem is not specified precisely enough.

davide445 · October 28, 2023, 4:04pm

To clarify, I think it will be useful to show the data.
This is an excerpt of the original data

And this is an excerpt of the resulting output

Dan · October 28, 2023, 6:41pm

It is best if you do this transformation in Excel (Excel is as powerful as any programming language, though it might not be as efficient).

I think functions such as VLOOKUP() and IF() might be enough.

davide445 · October 28, 2023, 7:00pm

If it were only a matter of shifting the position of the column data, I would have had no doubts about working in Excel.
But I need to generate new rows programmatically, and I have no skill in VB or any scripting language (not that I'm much more advanced in Julia, but I'm still able to achieve something).

Dan · October 28, 2023, 7:02pm

Well, Excel has many tricks (and I picked up just a few of them).
In any case, I’ve managed to make the transformation you wanted in Google Sheets (very similar), and it used: IF, VLOOKUP, MATCH, INDEX functions.
I’ll try to add some details shortly.

The data is in a table $A$1:$J$21 …

Then on row 23, below the table, the first row has:

A23        B23    C23
21002342   2      =vlookup(A23,$A$1:$J$21,B23,0)

These are the first customer id, the number of the first column with data, and the formula that fetches that value.

From then on, all the rows contain:

A24: =if(B23=10, index($A$1:$A$21,match(A23,$A$1:$A$21)+1), A23)
B24: =if(B23=10,2,B23+1)
C24: =vlookup(A24,$A$1:$J$21,B24,0)

The 10 in the first two formulas is the number of the last column with data (and a trigger to move on to the next row).

I hope you can understand this, because it is quite excruciating to write down these formulas in this post.

rafael.guerra · October 28, 2023, 7:46pm

You could also try DataFrames.jl:

using CSV, DataFrames
inv = CSV.read("input.csv", DataFrame; header=false)
sort(stack(inv,2:11), 1)

rocco_sprmnt21 · October 29, 2023, 3:57pm

You could also do this (though I prefer @rafael’s solution): line up columns 2 through 11 one after the other in a single column, make 10 copies of the id column, and finally put the two side by side.

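vals = reshape(inv[:, 2:11], :, 1)   # columns 2..11 laid end to end (named `vals` to avoid clashing with Base.values)
ids = repeat(inv[:, 1], 10)          # the whole id column, repeated 10 times
res = [ids vals]                     # ids and values side by side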
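Note that lining up whole columns this way orders the result by year first: every customer's first value, then every customer's second value, and so on. If the rows should instead be grouped by customer, as in the original question, one possible variation is to transpose the value block before flattening it and to repeat each id with `inner`. This is a sketch on made-up data; `toy`, `n`, `vals`, `ids` and `res` are illustrative names only.

```julia
# Hypothetical stand-in for `inv`: 2 customers, 3 values each.
toy = [101.0 1.0 2.0 3.0;
       102.0 4.0 5.0 6.0]

n = size(toy, 2) - 1                     # values per customer
vals = vec(permutedims(toy[:, 2:end]))   # flatten row by row instead of column by column
ids = repeat(toy[:, 1], inner=n)         # each id repeated n times in a row
res = [ids vals]                         # two columns: id, value
```

With `inner`, each id is repeated consecutively, which matches the row-by-row order of `vals`.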

davide445 · October 30, 2023, 6:37pm

I discovered that the CSV I produced has some formatting problem, so it was impossible to load the data correctly in Excel.
In the end I was able to generate the new rows in Excel using this simple method

using TRANSPOSE to shift the data from columns to rows, and also copying the data into blocks of 10 rows this way

and also a combination of IF and INDIRECT with some external ID to regenerate the ids in between

Testing Julia again was fun, but in the end Excel was more productive for this small activity
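The thread does not say what the formatting problem was. One possibility, offered only as a guess: the Julia code fills a `Float64` matrix, so `writedlm` writes the ids as floating-point numbers, and large ones come out in exponent notation rather than as plain integers. If that was the cause, a possible workaround is to convert the id column to `Int` before writing. A minimal sketch, with a made-up two-row result (`work` here is a hypothetical stand-in, and the values are invented):

```julia
using DelimitedFiles

# Hypothetical stand-in for the 2-column result: ids stored as Float64.
work = [21002342.0 1.5;
        21002342.0 2.5]

# Build a Matrix{Any} so each cell keeps its own type:
# integer ids in column 1, floating-point values in column 2.
out = Any[Int.(work[:, 1]) work[:, 2]]

# Write to stdout here; pass a file path instead to write a CSV file.
writedlm(stdout, out, ',')
```

`Int.(...)` throws an error if an id is not a whole number, which also serves as a sanity check on the id column.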
