Custom (NumPy style) broadcasting rule that avoids iterating over elements (for GPU-acceleration) | 10 | 375 | August 24, 2025
Using getrf_batched to find matrix inverses | 2 | 36 | August 7, 2025
Bend: a new GPU-native language | 45 | 12464 | August 6, 2025
Sparse matrix multiplication for Metal | 15 | 328 | July 31, 2025
😤 Multi-line expressions aren't fully computed | 22 | 438 | July 11, 2025
PackageCompiler fails to create app for MadNLPGPU + ExaModels (CUDSS linear solver) | 11 | 342 | June 30, 2025
How to Manage Memory with Sequential, GPU-Intensive (e.g., PyTorch) Python Calls via PythonCall.jl | 0 | 67 | June 17, 2025
Julia (AcceleratedKernels) vs JAX time comparison | 21 | 805 | June 11, 2025
GPU/CPU Agnostic FFT code | 7 | 461 | June 10, 2025
Solving ODE on GPU from Python with DifferentialEquations.jl | 11 | 1036 | May 29, 2025
"I don't like NumPy" - Julia equivalents to the numpy code? | 19 | 1403 | May 21, 2025
``mod1`` based Periodic Indexing on GPUs | 1 | 70 | May 3, 2025
.== and .<= inside Zygote.gradient() are inaccurate on GPU | 10 | 205 | April 30, 2025
Block/Tile-Based GPU Programming (not Scratch) | 2 | 268 | April 6, 2025
Lightweight dependency for GPU programming | 7 | 251 | March 27, 2025
Inconsistency in `accumulate` between `Array` and `CuArray.` | 2 | 68 | March 26, 2025
I don't understand why it is slower with CuStaticSharedArray | 9 | 289 | March 17, 2025
Why is my kernel as slow in FP32 as in FP64 on A2000 Ada-based GPU? | 10 | 186 | March 11, 2025
[ANN] Introducing AlternateVectors.jl - A Library for Peculiar One-Dimensional Array Patterns | 0 | 190 | February 8, 2025
Any updates on using AMDGPU in WSL? | 8 | 475 | February 6, 2025
FFTW scales pretty well (some @btime benchmarks) | 1 | 1715 | February 4, 2025
How to develop code in Vulkan using Julia? | 1 | 204 | February 1, 2025
Batched Matrix Multiply | 11 | 3726 | January 31, 2025
Does the new LLVM SPIR-V backend help Julia in any way? | 2 | 296 | January 28, 2025
Lux, optimization on gpu | 8 | 332 | January 13, 2025
Broadcasting performance | 13 | 581 | January 6, 2025
CUDA async is not working properly | 4 | 170 | December 31, 2024
Cumulative sum on GPUArray using KernelAbstractions | 4 | 241 | December 24, 2024
Can I move an array asynchronously from main program to CUDA? | 7 | 221 | December 15, 2024
Symmetric view of sparse matrix CUDA.jl | 0 | 45 | December 13, 2024