Memory not released on HTTP POST requests?!


I was trying to find a memory leak in my application, but now I think either it is in HTTP.jl or I’m doing something wrong…

I created a test script to reproduce the issue (replace “path/to/a/large/file.iso” with the path to a large test file on your system):

using HTTP, Profile

function example(req::HTTP.Request)
    @info "request handling: \"Ping!\""
    return HTTP.Response(201, "Ping!")
end

const router = HTTP.Router()
HTTP.register!(router, "POST", "/", example)

@info "start server"
HTTP.serve!(router, HTTP.Sockets.localhost, 8080)

@info "send POST request"
data = Dict(
    "label" => "Large File",
    "file" => HTTP.Multipart("dvd_image.iso", open("path/to/a/large/file.iso"), "application/octet-stream")
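)
body = HTTP.Form(data)
# URL matches the server started above (HTTP.Sockets.localhost is 127.0.0.1)
HTTP.request("POST", "http://127.0.0.1:8080/", [], body)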

@info "force GC run"
GC.gc()

@info "create heap snapshot"
Profile.take_heap_snapshot()

After executing this, the heap snapshot still shows the file I sent via the HTTP POST request taking up all the memory.

The memory of the request payload should be released after the handler function returns, shouldn’t it?

Can anyone help? Is this an HTTP.jl issue, or am I using it wrong?

Here is what I think could be the problem:

@info "send POST request"
data = Dict(
    "label" => "Large File",
    "file" => HTTP.Multipart("dvd_image.iso", open("path/to/a/large/file.iso"), "application/octet-stream")
)

You open the (large) file but never close it, and open is documented to require a matching close. The GC should eventually take care of open files through finalizers, if possible, but might that actually take longer than for regular memory?

This is just a guess, and I’m not confident it’s the problem, since it’s not mentioned in the help (or extended help) of HTTP.Multipart.
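To illustrate the open/close point, here is a minimal sketch using only Base. The temporary file is just a stand-in for the large ISO, and the names handle, captured and nbytes are made up for the demo. It is not what HTTP.jl does internally; it only shows that the do-block form of open closes the handle for you.

```julia
# Stand-in for the large file: a small temporary file (assumption for the demo).
path, tmp = mktemp()
write(tmp, rand(UInt8, 1024))
close(tmp)

# Pattern from the script above: the handle stays open until close is called.
handle = open(path)
still_open = isopen(handle)
close(handle)

# do-block form: the handle is closed when the block exits, even on error.
captured = Ref{IOStream}()
nbytes = open(path) do io
    captured[] = io
    # With HTTP.jl, the Multipart/Form/request calls would go here,
    # so the upload finishes before the file is closed.
    length(read(io))
end
closed_after = !isopen(captured[])

rm(path)
```

Whether this changes what the heap snapshot shows is a separate question; it only rules out the dangling file handle.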

But what must happen behind the scenes, I believe, is that you’re uploading (POSTing) the file, so it must be opened, loaded into memory, and then sent. Are you perhaps just seeing it take up space, but not space you need to worry about, i.e. space that will be reused later?

I don’t see a lot of memory with:

julia> varinfo()  # e.g. for body or data; so I might well be wrong, or it doesn't show everything from some lower layer related to IOStream
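A small sketch of how one might check sizes from a script rather than the REPL. It assumes Base.summarysize is a fair proxy for what varinfo reports, and payload is a hypothetical in-memory buffer, not anything from the original script. It would also be consistent with varinfo showing little for an open file: the handle object itself is small, regardless of the file size.

```julia
using InteractiveUtils  # provides varinfo outside the REPL

payload = rand(UInt8, 1_000_000)           # hypothetical 1 MB in-memory payload
payload_bytes = Base.summarysize(payload)  # bytes reachable from this object

path, io = mktemp()                        # an open file handle for comparison
handle_bytes = Base.summarysize(io)        # size of the handle, not of the file

println("payload: ", payload_bytes, " bytes, handle: ", handle_bytes, " bytes")
display(varinfo(Main, r"^payload"))        # only list names starting with "payload"

close(io)
rm(path)
```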

Another thing to keep in mind is that what you did is equivalent to:

my_open_file = open("path/to/a/large/file.iso")
data = Dict(
    "label" => "Large File",
    "file" => HTTP.Multipart("dvd_image.iso", my_open_file, "application/octet-stream")
)

If data is in global scope, then my_open_file would be too. And the GC can’t do anything with global variables, since they are not yet dead.

Even if that were part of a function, with (the implicit) my_open_file as a local variable, the GC would take it into account; but if you return data from your function, then data holds on to my_open_file. The same goes for body, etc. And all of this applies if it were in global scope.
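A rough sketch of the two situations, using only Base. The function name send_and_forget and the variable big are hypothetical, and the actual HTTP.jl calls are left as a comment. In the local version nothing outlives the call; in the global version the binding has to be cleared before the GC is allowed to reclaim the memory.

```julia
# Stand-in for the large file (assumption for the demo).
path, tmp = mktemp()
write(tmp, rand(UInt8, 2048))
close(tmp)

# Local scope: io and data are unreachable once the function returns.
function send_and_forget(path)
    open(path) do io
        data = Dict("label" => "Large File", "file" => io)
        # A real client would build HTTP.Form(data) and send it here.
        return length(read(data["file"]))
    end
end

sent = send_and_forget(path)

# Global scope: the binding keeps the object alive until it is reassigned.
big = rand(UInt8, 1_000_000)
big = nothing   # drop the reference...
GC.gc()         # ...so that a collection may reclaim the array

rm(path)
```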

Does any of that make sense?

What is that seemingly interesting tool you show the heap snapshot with (and the string it’s showing)?

I do see Base.IntrusiveLinkedList{Task} there. It seems related to HTTP, i.e. you don’t use it (or Task) directly? Anyway, threads, and I believe tasks, have had some GC issues, so have you tried the latest master or 1.10-beta2, where I think it may already be fixed?


You are right: it was the payload of the client call, not the server (HTTP.jl) side.

I thought I had ruled that out earlier, because I made the client request from another Julia session, but now I noticed that Profile.take_heap_snapshot() shows not only the heap of the session you call it from, but of all running sessions. When I exit the client session before taking the snapshot, memory consumption goes down to ~150 MB.
I checked with 1.9.3 and 1.10 beta; both behave the same way, though the heap structure in 1.10 looks a bit different.
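For a quick per-session cross-check that doesn’t need a snapshot viewer, something like the following sketch might help. It relies on Base.gc_live_bytes(), which reports live heap bytes for the current process only; the helper names live_mb and allocate_and_drop and the 50 MB buffer are made up for illustration. The exact numbers will vary from run to run.

```julia
# Live heap of *this* Julia process, in MiB.
live_mb() = Base.gc_live_bytes() / 2^20

function allocate_and_drop()
    buf = rand(UInt8, 50_000_000)  # hypothetical ~50 MB payload
    return sum(buf) >= 0           # use the buffer, keep no reference to it
end

before = live_mb()
allocate_and_drop()
GC.gc()                            # full collection after the buffer is unreachable
after = live_mb()

println("live heap before: ", round(before; digits=1), " MiB, after: ", round(after; digits=1), " MiB")
```

Running this in the server session and in the client session separately would show which of the two is actually holding on to the payload.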

I wasn’t aware of varinfo(), that’s going to be helpful. :+1:

I’m using Chromium to inspect the heap snapshot; that feature was introduced with 1.9, see: Julia 1.9 Highlights

Thank you!