Comparison between time taken in Python and Julia

richie96 · August 29, 2019, 11:08am

Hi,

I am comparing Python and Julia by analyzing the same data set in both.
The data set contains nearly 800,000 rows. I load it from two CSV files, each containing an equal amount of data, and then merge them.
I then applied a Random Forest algorithm to the data.
Below are the times the Python and Julia code took for the same tasks.

Loading data: Python - 2.195 s, Julia - 15.232 s
Merging data: Python - 0.1505 s, Julia - 5.55 s
Prediction time: Python - 10.2617 s, Julia - 24.5291 s
Visualization: Python - 0.3434 s, Julia - 35.338 s

Can anyone help me understand why Julia is so much slower than Python in the tasks above?

Thanks.

Tamas_Papp · August 29, 2019, 11:15am

Probably not without your code.

stevengj · August 29, 2019, 11:39am

I’m guessing that this means you are including loading/compilation time. When you launch Julia and load a plotting package (e.g. using Plots) and run your first plot (e.g. plot(...)), you spend a lot of time waiting for everything to compile, after which point the code is fast.
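
One way to check this guess is to time the first call separately from later calls in the same session. Here is a minimal sketch in Python, since that is one of the two languages being compared. The names `time_first_and_warm` and `OneTimeSetup` are made up for illustration, and the `sleep` only simulates a one-time cost such as compilation. In Julia, the analogous check would be running the same `@time` expression twice in one session and comparing the two readings.

```python
import time


def time_first_and_warm(fn, repeats=5):
    """Return (first_call_seconds, best_warm_seconds) for a zero-argument callable."""
    start = time.perf_counter()
    fn()
    first = time.perf_counter() - start

    # Time the same call again; these runs no longer pay any one-time cost.
    warm = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        warm.append(time.perf_counter() - start)
    return first, min(warm)


class OneTimeSetup:
    """Hypothetical stand-in for a call that pays a one-time cost on first use."""

    def __init__(self, setup_seconds=0.2):
        self.setup_seconds = setup_seconds
        self.ready = False

    def __call__(self):
        if not self.ready:
            time.sleep(self.setup_seconds)  # simulated one-time cost (assumption)
            self.ready = True
        return sum(range(1000))  # the actual "work"


if __name__ == "__main__":
    first, warm = time_first_and_warm(OneTimeSetup())
    print(f"first call: {first:.3f}s, best warm call: {warm:.6f}s")
```

If the first reading is much larger than the warm one, the benchmark is mostly measuring startup cost rather than the task itself.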

Compilation time is irrelevant for large computational tasks because it scales with the code size, not the computation time. That is, if you are running something for an hour, waiting 30 seconds to compile at the beginning is irrelevant.
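
To put rough numbers on that, here is a small sketch of the arithmetic. It assumes the hour is pure computation (3600 s) on top of the 30 s compile delay; the 10 s run is an illustrative figure, not a measurement from this thread, and `overhead_fraction` is a made-up helper name.

```python
def overhead_fraction(compile_seconds, compute_seconds):
    """Share of total wall time spent on one-time compilation."""
    return compile_seconds / (compile_seconds + compute_seconds)


# One-hour computation with a 30 s compile delay.
print(f"{overhead_fraction(30, 3600):.2%}")  # prints 0.83%

# Short 10 s computation with the same delay (illustrative figure).
print(f"{overhead_fraction(30, 10):.2%}")  # prints 75.00%
```

The same fixed delay is under 1% of the long run but dominates the short one, which is why short benchmarks can make Julia look much slower than it is in steady state.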

For interactive usage, the compilation delay is annoying, and is something that will be improved in future Julia versions — it’s just a matter of caching compiled code, nothing fundamental. However, for interactive exploration I would typically recommend just opening a Jupyter notebook, loading Plots and whatever other modules you need, and leaving the notebook open as you work (creating and evaluating new notebook cells as needed). If you are working interactively for more than a few minutes, a 30s delay at the beginning quickly becomes irrelevant.

(Even if you are doing development work, the Revise package means that you rarely need to restart an interactive Julia session.)
