Transpose columns to rows

To prepare the import into an ERP system of a list of customers (identified from an id) and their related 10 years historical usage data of a system, I needed to shift the data from

customer1, y1:value1, y2:value2, … y10:value10
customer2, y1:value1, y2:value2…

to
customer1, y1:value1
customer1, y2:value2

customer1, y10:value10
customer2, y1:value1

,
There are 889 customers in the file, not finding a way to do this in Excel even usig chatgpt I decided to retrieve from the dust my Julia installation and try to do this excercise.

I come up with this solution, for the sake of knowledge wanted to kindly ask if there is (I’m sure) a better and easier way to get this done

using DelimitedFiles

inv = DelimitedFiles.readdlm("/home/Documents/Cumulative_import.csv", ',';header=false, skipstart=0)

work = zeros(Float64, 8890,2)

for i = 1:10
    work[i,1]=inv[1,1]
    work[i,2]=inv[1,2]
end

for i = 2:889
    for j = ((i-1)*10)+1:((i-1)*10)+10
        work[j,1] = inv[i,1]
    end
end

v=inv[1,2:11]

work[1:10,2]=v

for i = 2:889
    v=inv[i,2:11]
    for j = ((i-1)*10)+1:((i-1)*10)+10
        if mod(j/10,1) > 0
            k=trunc(Int,round(mod(j/10,1);digits=1)*10)
        else
            k=10
        end
        work[j,2]=v[k]
    end
end

DelimitedFiles.writedlm("/home/Documents/Cumulative_import_export.csv", work,',')

If someone is interested I can provide the origin csv and final result.

Probably the simplest no-dependency solution:

work = mapreduce(vcat, eachrow(inv)) do row
    stack(row[2:end], dims=1) do value
        (row[1], value)
    end
end

(using your inv variable, and writing to the work variable you want)

1 Like

Here is one function which might be fashioned into a working solution:

function inout(fname)
    for line in readlines(fname)
        fields = strip.(split(line, ','))
        for i in 2:length(fields)
            println(fields[1], ", ", fields[i])
        end
    end
end

With this I got:

julia> inout("in.txt")
customer1, y1:value1
customer1, y2:value2
customer1, y10:value10
customer2, y1:value1
customer2, y2:value2

from:

customer1, y1:value1, y2:value2, y10:value10
customer2, y1:value1, y2:value2

There are many ways to go about this… but very probably the problem is not so accurately specified.

1 Like

To clarify I think will be useful to show the data.
This is the origin data excerpt


And this the resulting output excerpt

It is best if you do this transformation in Excel (Excel is powerful as any programming language but might not be so efficient).

I think functions such as VLOOKUP() and IF() might be enough.

If was only shifting the position of the column data I have had no doubt working in Excel
But need to generate new rows programmatically and I have no skill in VB or scripting language (not that in Julia I’m much more advanted, still I’m able to achieve something)

Well, Excel has many tricks (and I picked up just a few of them).
In any case, I’ve managed to make the transformation you wanted in Google Sheets (very similar), and it used: IF, VLOOKUP, MATCH, INDEX functions.
I’ll try to add a bit of details in a bit.

The data is in a table $A$1:$J$21 …

Then on row 23, below the table, the first row has:

A23        B23    C23
21002342   2      =vlookup(A23,$A$1:$J$21,B23,0)

which are the first customer, column number with first data and formula to obtain that data.

From then on, all the rows contain:

A24           B24         C24
=if(B23=10, index($A$1:$A$21,match(A23,$A$1:$A$21)+1), A23)
              =if(B23=10,2,B23+1)
                          =vlookup(A24,$A$1:$J$21,B24,0)

The 10 in the first two formulas is the number of the last column with data (and a trigger to move on to the next row).

I hope you can understand this, because it is quite excruciating to write down these formulas in this post.

1 Like

You could also try DataFrames,jl:

using CSV, DataFrames
inv = CSV.read("input.csv", DataFrame; header=false)
sort(stack(inv,2:11), 1)
4 Likes

You could also do this (it being understood that I prefer @rafael’s solution): line up the columns from 2 to the last one after the other; you make 10 copies of the index column and, finally, you put them side by side.

values=reshape(inv[:,2:11],:,1)
ids=repeat(inv[:,1],10)
res=[ids values]
1 Like

I discovered the produced csv has some formatting problem, so was impossible to load the data correctly in Excel.
In the end I was able to generate new rows using this simple method

using TRANSPOSE to shift data from column to rows, and also copying the data in the block of 10 rows using this way

and also a combination of IF and INDIRECT with some external ID to regenerate the ids in between


Testing again julia was fun, but in the end for this small activty Excel was more productive