How to import this kind of file properly?

Ahmed_Salih · September 11, 2020, 11:42am

Hi!

I have a bit of an unusual CSV file (for me at least). It is structured as follows:

Pos X/Y/Z [m]:	0	0	-1.2	0	0	-1.188	0	0	-1.176
Part	Time [s]	Vel_0.x [m/s]	Vel_0.y [m/s]	Vel_0.z [m/s]	Vel_1.x [m/s]	Vel_1.y [m/s]	Vel_1.z [m/s]	Vel_2.x [m/s]	Vel_2.y [m/s]	Vel_2.z [m/s]
0	0	0	0	0	0	0	0	0	0	0
1	0.0500427	0	0	0	0.00142405	0	-0.000190126	-0.000321171	0	0.000234272
2	0.100087	0	0	0	0.0204263	0	-0.00111015	0.0165089	0	-0.00108161
3	0.150025	0	0	0	0.00314608	0	0.00161472	-0.00802806	0	0.00136044
4	0.200064	0	0	0	0.0703902	0	-0.00361722	0.0535595	0	-0.00335195

The top row shows the x, y and z position of each particle, and the columns below hold the corresponding x, y and z velocities. Each new row is a new time step.

How would I import this in such a way that a point, P, is associated with a velocity, V, such that:

P(x,y,z) => V(v_x,v_y,v_z)

I hope it makes sense

Kind regards

tomerarnon · September 11, 2020, 11:57am

It’s not totally clear to me what the result of this processing step should be. From my understanding, the “header” of the table refers to the (x,y,z) position of a particle, and the column corresponding to the x header of the first particle (x=0[m] in the table above, if I understand correctly), is the x-velocity over time. Therefore there are multiple velocities associated with a particle (since it varies over time). How, then, would you like the final result to look?

mastrof · September 11, 2020, 1:06pm

I find it indeed a weird format to represent this kind of information.

Associating a “point” (that is, a triple of Float64) with a velocity vector is neither feasible (since you would need to check equality between Float64 values, which can result in a mess) nor meaningful (since you are not dealing with a continuous velocity field).
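To illustrate the floating-point concern, here is a minimal sketch. The values are made up for illustration and are not taken from the file. Positions parsed from the same text will compare equal, but as soon as a key is computed rather than read, an exact lookup can miss:

```julia
# Two positions that are mathematically equal but differ in Float64.
a = (0.1 + 0.2, 0.0, 0.0)
b = (0.3, 0.0, 0.0)

exact_match = a == b                  # false: 0.1 + 0.2 is not exactly 0.3
close_match = all(isapprox.(a, b))    # true: compares within a tolerance

# A Dict looks keys up by exact equality, so the computed key is not found.
d = Dict(b => "velocity history")
found = haskey(d, a)                  # false
```

Keying on an integer particle index avoids the issue entirely.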

I would infer that what you want to do is as follows:

read and store initial position of particle i
read velocity of particle i at each timestep t
map $(i, t) \mapsto \mathbf{v}_i(t)$ (and if needed also $(i, t) \mapsto \mathbf{r}_i(t)$)
repeat for all i.
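
The steps above can be sketched with plain Base Julia as follows. This is only one possible layout, assuming the file is tab-separated exactly as in the question; the names `raw`, `positions`, `times` and `vel` are hypothetical, and the sample is truncated to three time steps. Note that the indices here are 1-based, while the file labels its columns starting from `Vel_0`.

```julia
# Sample in the layout from the question (tab-separated).
raw = """
Pos X/Y/Z [m]:\t0\t0\t-1.2\t0\t0\t-1.188\t0\t0\t-1.176
Part\tTime [s]\tVel_0.x [m/s]\tVel_0.y [m/s]\tVel_0.z [m/s]\tVel_1.x [m/s]\tVel_1.y [m/s]\tVel_1.z [m/s]\tVel_2.x [m/s]\tVel_2.y [m/s]\tVel_2.z [m/s]
0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0
1\t0.0500427\t0\t0\t0\t0.00142405\t0\t-0.000190126\t-0.000321171\t0\t0.000234272
2\t0.100087\t0\t0\t0\t0.0204263\t0\t-0.00111015\t0.0165089\t0\t-0.00108161
"""

lines = split(strip(raw), '\n')

# Line 1: a label followed by x, y, z for each point.
posvals = parse.(Float64, split(lines[1], '\t')[2:end])
npoints = div(length(posvals), 3)
positions = [Tuple(posvals[3i-2:3i]) for i in 1:npoints]

# Line 2 holds the column names; the data starts on line 3.
rows = [parse.(Float64, split(l, '\t')) for l in lines[3:end]]

# Column 1 is the part/step counter, column 2 is the time.
times = [r[2] for r in rows]

# vel[i][t] is the velocity of point i at time step t.
vel = [[Tuple(r[3i:3i+2]) for r in rows] for i in 1:npoints]
```

With this layout, `vel[2][3]` is the velocity of the second point at the third time step, and `times[3]` is the matching time.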

Now there are several possible ways in which you can do this, depending on how you want your data to be organized. I think we need more information to help.

Ahmed_Salih · September 11, 2020, 2:40pm

Hi again!

Unfortunately I have no easy way to control the way this output is formulated, since it is done using a simulation package in an external program… therefore I have to work with this kind of data layout.

@tomerarnon I don’t know if what I am suggesting now makes complete sense but I have produced a small picture to explain a bit:

Imagine I have a lot of points (squares) and then at some red dots I am averaging results to get a result in the proximity. These red points are the headers, “0 0 -1.2” based on Pos X/Y/Z

What I would like, then, is for the velocity columns (x, y and z velocities over time) to be associated with their respective red point in the header. My best explanation in Julia terms is to imagine the red point as a struct:

struct Point
      pos  :: NTuple{3,Float64}   # (x, y, z)
      vel  :: NTuple{3,Float64}   # (v_x, v_y, v_z)
      time :: Float64
end

Of course saving the data this way means that I will have to construct a new identical point for each time step which is not nice… Instead I would like something where I know a specific point and then the time history of its velocity. The point (red circle) is always fixed in space and does not change over time.

I am not sure if I have explained it well enough, but I am struggling to find an easy way to do this using DataFrames etc. DataFrames is not a must, I just thought it would be easier than manual parsing…

Kind regards

tomerarnon · September 11, 2020, 2:57pm

Manual parsing sounds like overkill unless you expect this preprocessing step to be so time-consuming (very large data) or so frequent that doing it the “easy way” would be unacceptable.

I would read the data with CSV.jl (will return a DataFrame). Then construct your position tuples from the header (names(data) will give you the header, and you may have to do some string parsing on this to extract the positions as numbers). I.e., if I understand the structure of your table, something like Tuple(names(data)[3:5]) should get you the first position as a tuple.

Assuming the positions are unique, you can then do as @mastrof suggested and create a Dict with keys (point, time), i.e. ((x,y,z), t). You can iterate over the rows and add entries to the Dict as d[(p, t)] = (vx, vy, vz), using the same approach as for getting the positions from the header.
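
A minimal sketch of that Dict layout, assuming the header positions and the data rows have already been read into plain Julia values. The names `positions`, `times`, `rows` and `Vec3` are hypothetical, and the numbers are copied from the sample in the question (first two points, first two time steps):

```julia
positions = [(0.0, 0.0, -1.2), (0.0, 0.0, -1.188)]
times = [0.0, 0.0500427]

# One row per time step: vx, vy, vz for the first position, then for the second.
rows = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.00142405, 0.0, -0.000190126],
]

Vec3 = NTuple{3,Float64}
d = Dict{Tuple{Vec3,Float64},Vec3}()

for (t, row) in zip(times, rows)
    for (i, p) in enumerate(positions)
        # Columns 3i-2 to 3i hold the velocity of the i-th position.
        d[(p, t)] = Tuple(row[3i-2:3i])
    end
end

# Look up the velocity at a given position and time.
v = d[((0.0, 0.0, -1.188), 0.0500427)]
```

Lookups only succeed when the key matches the stored Float64 values exactly, so it is safest to take keys from `positions` and `times` rather than typing or computing them.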

pdeffebach · September 11, 2020, 4:05pm

I agree. I think you need two separate objects in this scenario. Don’t try to fit everything into one table.

lmiq · September 11, 2020, 4:20pm

I think what you need is a struct in which the vel field is an array, with the velocities of that point at each time step:

struct Point
   pos :: Vector{Float64}
   vel :: Vector{Vector{Float64}}
end
steps = 10 
p = Point(zeros(3),[ zeros(3) for i in 1:steps ])
p.vel[5] # velocity vector at step 5

I would not put the time in the same structure, as it is probably the same vector for all points.
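
As a rough sketch of how such a struct could be filled, assume the positions and the velocity columns have already been read into matrices. The names `Probe`, `pos`, `V` and `probes` are hypothetical, and the numbers come from the sample in the question (first two points, first three time steps). The time vector is kept separate and shared by all points:

```julia
struct Probe
    pos::Vector{Float64}
    vel::Vector{Vector{Float64}}   # vel[t] is [vx, vy, vz] at time step t
end

# Shared by every point, so it is stored once.
times = [0.0, 0.0500427, 0.100087]

# One row per point: x, y, z.
pos = [0.0 0.0 -1.2;
       0.0 0.0 -1.188]

# Velocity columns only: one row per time step, three columns per point.
V = [0.0 0.0 0.0 0.0 0.0 0.0;
     0.0 0.0 0.0 0.00142405 0.0 -0.000190126;
     0.0 0.0 0.0 0.0204263 0.0 -0.00111015]

# Point i owns columns 3i-2 to 3i of V.
probes = [Probe(pos[i, :], [V[t, 3i-2:3i] for t in axes(V, 1)])
          for i in axes(pos, 1)]

v = probes[2].vel[3]   # velocity of the second point at the third time step
t = times[3]           # the matching time
```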
