Why is copying using a loop much slower than `copy` for large arrays?

anon56330260 · November 22, 2022, 1:37pm

A possible explanation:
The built-in copy simply calls a built-in function, jl_array_copy, which in turn calls a built-in array constructor and memcpy. Your simple implementation cannot beat C's highly optimized memcpy (memcpy is really smart, and it can sometimes even use special features offered by the operating system). The small-array case may be related to alignment handling or overhead in memcpy: since your loop copies directly and LLVM has more information, some of the logic in memcpy can be skipped.
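To see the difference yourself, here is a minimal sketch that times a hand-written loop against `copy`. The name `loop_copy` is hypothetical and only stands in for the kind of loop discussed in the question; the array sizes are arbitrary. Timings depend heavily on your machine and Julia version, so treat the numbers as a rough indication only.

```julia
# Hypothetical helper: copy element by element, as a hand-written loop would.
function loop_copy(src::Vector{Float64})
    dst = similar(src)               # allocate an uninitialized array of the same size
    @inbounds for i in eachindex(src)
        dst[i] = src[i]
    end
    return dst
end

small = rand(16)       # arbitrary "small" size
large = rand(10^7)     # arbitrary "large" size

# Warm up so that compilation time is not included in the measurements.
loop_copy(small); copy(small)

for (name, src) in (("small", small), ("large", large))
    t_loop = @elapsed loop_copy(src)
    t_copy = @elapsed copy(src)
    println(name, ": loop = ", t_loop, " s, copy = ", t_copy, " s")
end
```

A single `@elapsed` call is noisy, especially for the small array. If you have BenchmarkTools installed, `@btime loop_copy($large)` and `@btime copy($large)` should give more stable numbers.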
