HTTP multipart/form-data processing by server

Is there any way to parse multipart/form-data with the HTTP.jl functions?

I’m trying to upload and process a file in a Bukdu app with

  <form method="post" enctype="multipart/form-data">...</form>

The request.body is correct for multipart, but I don’t see any way to convert it to HTTP.Multipart. There are only client-related methods at https://github.com/JuliaWeb/HTTP.jl/blob/master/src/multipart.jl

Should I implement that functionality myself, or is it already implemented somewhere? Any ideas why it is not part of HTTP?

I’m talking about a body like:

"-----------------------------189075738616438618431008876646\r\n
Content-Disposition: form-data; name=\"image\"; filename=\"1.png\"\r\nContent-Type: image/png
\r\n\r\n\x89PNG\r\n\x1a\n\0\0\0\rIHDR\0\0\xaeB`\x82\r\n
-----------------------------189075738616438618431008876646--\r\n"

A workaround for now (it works at least in my demo app):

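import HTTP, HTTP.Parsers

"""
    parse_multipart(data)

Return the payload of a single-part multipart body, or `nothing` if the
body does not have the expected shape.
"""
function parse_multipart(data::Vector{UInt8})
    # The first line of the body is the boundary delimiter, terminated by \r\n.
    chunk_end = HTTP.Parsers.find_end_of_chunk_size(data)
    if chunk_end == 0
        return nothing
    end

    boundary = data[1:chunk_end - 2] # without the trailing \r\n
    # The part headers end at the first blank line (\r\n\r\n).
    header_end = HTTP.Parsers.find_end_of_header(data)
    header_end == 0 && return nothing

    data_start = header_end + 1
    # The closing delimiter is boundary + "--" + "\r\n", i.e. chunk_end + 2 bytes.
    data_end = length(data) - chunk_end - 2
    data_end < data_start && return nothing
    # Check that the body ends with the same boundary plus two extra hyphens.
    if data[data_end + 1:end] == [boundary..., 0x2d, 0x2d, 0x0d, 0x0a]
        # The \r\n before the closing delimiter belongs to the delimiter, not the payload.
        return data[data_start:data_end - 2]
    end

    return nothing
end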

...
data = parse_multipart(c.conn.request.body)

It looks like there is a pull request with a similar function, but it has not been merged into HTTP yet: https://github.com/JuliaWeb/HTTP.jl/pull/264


The demo app is available at https://github.com/rssdev10/JWebImageDemo.jl

Known issue: the multipart data also contains the form parameters as separate parts, and the code above does not extract them. The actual HTTP data looks like:

POST /api/process_image HTTP/1.1
Host: 127.0.0.1:8080
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:67.0) Gecko/20100101 Firefox/67.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Referer: http://127.0.0.1:8080/
Content-Type: multipart/form-data; boundary=---------------------------182023285717490760841965583652
Content-Length: 92137
DNT: 1
Connection: keep-alive
Upgrade-Insecure-Requests: 1
Cache-Control: max-age=0

-----------------------------182023285717490760841965583652
Content-Disposition: form-data; name="image"; filename="file1.jpg"
Content-Type: image/jpeg

......JFIF.............C..........
-----------------------------182023285717490760841965583652
Content-Disposition: form-data; name="num"

2
-----------------------------182023285717490760841965583652--
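
One way to pick up the form parameters as well is to split the body on the boundary delimiter and parse the headers of each part separately. The following is only a minimal sketch in Base Julia (it assumes Julia 1.6 or later for `findnext` on byte vectors). `FormPart` and `split_multipart` are hypothetical names, not part of HTTP.jl, and the boundary is assumed to have been taken from the Content-Type header already. It also assumes the payload never contains the delimiter itself.

```julia
# Hypothetical container for one decoded part; not an HTTP.jl type.
struct FormPart
    name::String
    filename::Union{String,Nothing}
    content_type::Union{String,Nothing}
    data::Vector{UInt8}
end

# Split a multipart/form-data body into its parts (illustrative sketch).
function split_multipart(body::Vector{UInt8}, boundary::AbstractString)
    delim = Vector{UInt8}("--" * boundary)   # every delimiter line starts with "--" + boundary
    blank = UInt8[0x0d, 0x0a, 0x0d, 0x0a]    # \r\n\r\n separates part headers from part data
    parts = FormPart[]
    cur = findnext(delim, body, 1)
    while cur !== nothing
        pos = last(cur) + 1
        # "--" immediately after the delimiter marks the closing delimiter.
        if pos < length(body) && body[pos] == 0x2d && body[pos + 1] == 0x2d
            break
        end
        nxt = findnext(delim, body, pos)
        nxt === nothing && break             # truncated body: no further delimiter
        # Skip the \r\n after this delimiter and the \r\n before the next one.
        part = body[pos + 2:first(nxt) - 3]
        sep = findfirst(blank, part)
        if sep !== nothing
            headers = String(part[1:first(sep) - 1])
            name = match(r"\bname=\"([^\"]*)\""i, headers)
            filename = match(r"\bfilename=\"([^\"]*)\""i, headers)
            ctype = match(r"Content-Type:\s*([^\r\n]+)"i, headers)
            push!(parts, FormPart(
                name === nothing ? "" : String(name[1]),
                filename === nothing ? nothing : String(filename[1]),
                ctype === nothing ? nothing : String(ctype[1]),
                part[last(sep) + 1:end]))
        end
        cur = nxt
    end
    return parts
end
```

Applied to a request like the one above, this should yield one part for the `image` file and one for the `num` parameter. Loading the whole body into memory and copying each part is acceptable for a demo but probably not for large uploads.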

It works for me as well thanks!!!

What if you want to handle multiple files?

No, it doesn’t work for multiple files, nor for mixed multipart data such as form parameters and a file in the same request.

For now, I see a more general solution at https://github.com/JuliaWeb/HTTP.jl/pull/427, but it looks like the author needs some help to prepare proper tests and check it. It would also be good to run performance tests…


Thanks for pointing out the pull request! With a few minor modifications it works.

In case other people need it, here are the steps to incorporate it into a project (hopefully this will not be needed for much longer):

  1. Download parsemultipart.jl from the repository (https://github.com/JuliaWeb/HTTP.jl/blob/a833cefafe67e39b283281217163c1c803f469ea/src/parsemultipart.jl) and put it somewhere in your project
  2. Replace Line 92:
    push!(d, Multipart(string(filename), io, string(contenttype), string(""), string(name)))
    by the following:
    push!(d, Multipart(string(filename), io, string(contenttype), string("")))

By doing so, we don’t need to modify the struct HTTP.Multipart.

If using Mux you will also need this method so that you can pass the body and not the request itself:

function parse_multipart_form(body::Vector{UInt8}, content_type::String)
    m = match(r"multipart/form-data; boundary=(.*)$", content_type)
    m === nothing && return nothing
    parse_multipart_body(body, m[1])
end

You can then call parse_multipart_form using either the request or the request body.
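
The regular expression above expects the header to be exactly `multipart/form-data; boundary=...`. Browsers usually send it that way, but the boundary parameter may also be quoted or followed by further parameters. As a hedged sketch, a slightly more tolerant extraction could look like this; `extract_boundary` is a hypothetical helper name, not something provided by HTTP.jl or the pull request.

```julia
# Hypothetical helper: pull the boundary parameter out of a Content-Type value.
# Returns `nothing` when the value is not multipart/form-data or has no boundary.
function extract_boundary(content_type::AbstractString)
    # Media type names are case-insensitive.
    occursin(r"^\s*multipart/form-data\b"i, content_type) || return nothing
    # Accept both boundary="quoted value" and boundary=token.
    m = match(r";\s*boundary=(?:\"([^\"]+)\"|([^;\s]+))"i, content_type)
    m === nothing && return nothing
    # Exactly one of the two capture groups is set.
    return String(something(m[1], m[2]))
end
```

The result could then be passed as the second argument of `parse_multipart_body` in place of `m[1]`.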

If using Mux it looks like this:

# Look up the Content-Type header of the Mux request
contentType =
    filter(x -> x.first == "Content-Type",
           req[:headers])[1].second
contentType = string(contentType)

# parse_multipart_form returns nothing when the request is not multipart/form-data
parts = parse_multipart_form(req[:data], contentType)

for p in parts
    # Save the file in the temp directory
    destfile = tempname()
    @info "Saving $(p.filename) to $destfile"
    write(destfile, take!(p.data))
end # ENDOF for-loop on message multiparts

Uploading a 9 MB binary file to the directory specified on the server takes 32 minutes. Uploading it to the server’s RAM (i.e. a temporary location) takes less than 1 minute, but parsing the chunks of form data into the /var/tmp directory takes around 32 minutes.
How can we reduce the time for complete file transfer uploaded using HTTP POST request?
Linux version 2.4.31-uc0
gcc version 3.4.1 ( Xilinx EDK 8.1 Build EDK_I.17 121005 )