I have a loop that writes bytes to S3. It works about 95% of the time. Then, periodically, I get a cryptic error from the depths of HTTP.jl about SSL_ERROR_SYSCALL.
The function looks like this:
using Retry: @repeat
using AWSS3
using HTTP  # the catch block references HTTP.Exceptions.HTTPError; assumes HTTP.jl is added to the project
using AWS.AWSExceptions: AWSException

function write_and_log(path::S3Path, bytes::Vector{UInt8})
    @repeat 4 try
        @info "START: Running put to s3" path length(bytes)
        write(path, bytes)
        @info "END: successfully put file to s3" path
    catch e
        # Retry only on HTTP-level or AWS-level failures.
        @retry if e isa HTTP.Exceptions.HTTPError || e isa AWSException
            bt = catch_backtrace()
            @error "END: Failed to put file to s3. Will try again." path exception=(e, bt)
        end
        # Falls through to here when the error is not retried.
        @error "END: Failed to write file to s3" path=path exception=(e, catch_backtrace())
    end
end
The error thrown looks like this:
┌ Error: END: Failed to put file to s3. Will try again.
│ job.path = p"s3://ppad-processing-processed-datanode-output-prod/rollout2024q2-150/consolidated_rollout2024q2-150_innetwork/data_version_p=1.1.0/versioneddatasourceid_p=150/sourcefileid_p=1446234/batch_2_chunk_0.parq"
│ exception =
│ HTTP.RequestError:
│ HTTP.Request:
│ HTTP.Messages.Request:
│ """
│ PUT /ppad-processing-processed-datanode-output-prod/rollout2024q2-150/consolidated_rollout2024q2-150_innetwork/data_version_p%3D1.1.0/versioneddatasourceid_p%3D150/sourcefileid_p%3D1446234/batch_2_chunk_0.parq HTTP/1.1
│ Content-Type: application/octet-stream
│ User-Agent: AWS.jl/1.0.0
│ Host: s3.us-east-1.amazonaws.com
│ x-amz-date: 20240619T154349Z
│ x-amz-content-sha256: a167eba412e99ca1b5a51fab79f014a783fab46274cde03b5fe2b94a65434381
│ Content-MD5: Hf6y6YV/Vt4O6NhptirfpQ==
│ x-amz-security-token: # [redacted] because I don't know if this should be shared :)
│ Authorization: AWS4-HMAC-SHA256 Credential=[redacted]/us-east-1/s3/aws4_request, SignedHeaders=content-md5;content-type;host;user-agent;x-amz-content-sha256;x-amz-date;x-amz-security-token, Signature=b94a0794fdbe81efa864a85afb59b4c22697ae92f192294d5a1ef80750be9b78
│ Accept: */*
│ Content-Length: 302537168
│ Accept-Encoding: gzip
│
│
│ ⋮
│ 302537168-byte body
│ """Underlying error:
│ IOError: SSL_ERROR_SYSCALL
│ Stacktrace:
│ [1] (::HTTP.ConnectionRequest.var"#connections#4"{HTTP.ConnectionRequest.var"#connections#1#5"{HTTP.TimeoutRequest.var"#timeouts#3"{HTTP.TimeoutRequest.var"#timeouts#1#4"{HTTP.ExceptionRequest.var"#exceptions#2"{HTTP.ExceptionRequest.var"#exceptions#1#3"{typeof(HTTP.StreamRequest.streamlayer)}}}}}})(req::HTTP.Messages.Request; proxy::Nothing, socket_type::Type, socket_type_tls::Type, readtimeout::Int64, connect_timeout::Int64, logerrors::Bool, logtag::Nothing, kw::@Kwargs{iofunction::Nothing, decompress::Nothing, verbose::Int64})
│ @ HTTP.ConnectionRequest C:\Users\mrufsvold\.julia\packages\HTTP\Y2JKB\src\clientlayers\ConnectionRequest.jl:143
│ [2] (::HTTP.RetryRequest.var"#manageretries#3"{HTTP.RetryRequest.var"#manageretries#1#4"{HTTP.ConnectionRequest.var"#connections#4"{HTTP.ConnectionRequest.var"#connections#1#5"{HTTP.TimeoutRequest.var"#timeouts#3"{HTTP.TimeoutRequest.var"#timeouts#1#4"{HTTP.ExceptionRequest.var"#exceptions#2"{HTTP.ExceptionRequest.var"#exceptions#1#3"{typeof(HTTP.StreamRequest.streamlayer)}}}}}}}})(req::HTTP.Messages.Request; retry::Bool, retries::Int64, retry_delays::ExponentialBackOff, retry_check::Function, retry_non_idempotent::Bool, kw::@Kwargs{iofunction::Nothing, decompress::Nothing, verbose::Int64})
│ @ HTTP.RetryRequest C:\Users\mrufsvold\.julia\packages\HTTP\Y2JKB\src\clientlayers\RetryRequest.jl:35
│ [3] manageretries
│ @ C:\Users\mrufsvold\.julia\packages\HTTP\Y2JKB\src\clientlayers\RetryRequest.jl:30 [inlined]
│ [4] (::HTTP.CookieRequest.var"#managecookies#4"{HTTP.CookieRequest.var"#managecookies#1#5"{HTTP.RetryRequest.var"#manageretries#3"{HTTP.RetryRequest.var"#manageretries#1#4"{HTTP.ConnectionRequest.var"#connections#4"{HTTP.ConnectionRequest.var"#connections#1#5"{HTTP.TimeoutRequest.var"#timeouts#3"{HTTP.TimeoutRequest.var"#timeouts#1#4"{HTTP.ExceptionRequest.var"#exceptions#2"{HTTP.ExceptionRequest.var"#exceptions#1#3"{typeof(HTTP.StreamRequest.streamlayer)}}}}}}}}}})(req::HTTP.Messages.Request; cookies::Bool, cookiejar::HTTP.Cookies.CookieJar, kw::@Kwargs{iofunction::Nothing, decompress::Nothing, verbose::Int64, retry::Bool})
│ @ HTTP.CookieRequest C:\Users\mrufsvold\.julia\packages\HTTP\Y2JKB\src\clientlayers\CookieRequest.jl:42
│ [5] managecookies
│ @ C:\Users\mrufsvold\.julia\packages\HTTP\Y2JKB\src\clientlayers\CookieRequest.jl:19 [inlined]
│ [6] (::HTTP.HeadersRequest.var"#defaultheaders#2"{HTTP.HeadersRequest.var"#defaultheaders#1#3"{HTTP.CookieRequest.var"#managecookies#4"{HTTP.CookieRequest.var"#managecookies#1#5"{HTTP.RetryRequest.var"#manageretries#3"{HTTP.RetryRequest.var"#manageretries#1#4"{HTTP.ConnectionRequest.var"#connections#4"{HTTP.ConnectionRequest.var"#connections#1#5"{HTTP.TimeoutRequest.var"#timeouts#3"{HTTP.TimeoutRequest.var"#timeouts#1#4"{HTTP.ExceptionRequest.var"#exceptions#2"{HTTP.ExceptionRequest.var"#exceptions#1#3"{typeof(HTTP.StreamRequest.streamlayer)}}}}}}}}}}}})(req::HTTP.Messages.Request; iofunction::Nothing, decompress::Nothing, basicauth::Bool, detect_content_type::Bool, canonicalize_headers::Bool, kw::@Kwargs{verbose::Int64, retry::Bool})
│ @ HTTP.HeadersRequest C:\Users\mrufsvold\.julia\packages\HTTP\Y2JKB\src\clientlayers\HeadersRequest.jl:71
│ [7] defaultheaders
│ @ C:\Users\mrufsvold\.julia\packages\HTTP\Y2JKB\src\clientlayers\HeadersRequest.jl:14 [inlined]
│ [8] (::HTTP.RedirectRequest.var"#redirects#3"{HTTP.RedirectRequest.var"#redirects#1#4"{HTTP.HeadersRequest.var"#defaultheaders#2"{HTTP.HeadersRequest.var"#defaultheaders#1#3"{HTTP.CookieRequest.var"#managecookies#4"{HTTP.CookieRequest.var"#managecookies#1#5"{HTTP.RetryRequest.var"#manageretries#3"{HTTP.RetryRequest.var"#manageretries#1#4"{HTTP.ConnectionRequest.var"#connections#4"{HTTP.ConnectionRequest.var"#connections#1#5"{HTTP.TimeoutRequest.var"#timeouts#3"{HTTP.TimeoutRequest.var"#timeouts#1#4"{HTTP.ExceptionRequest.var"#exceptions#2"{HTTP.ExceptionRequest.var"#exceptions#1#3"{typeof(HTTP.StreamRequest.streamlayer)}}}}}}}}}}}}}})(req::HTTP.Messages.Request; redirect::Bool, redirect_limit::Int64, redirect_method::Nothing, forwardheaders::Bool, response_stream::Base.BufferStream, kw::@Kwargs{verbose::Int64, retry::Bool})
│ @ HTTP.RedirectRequest C:\Users\mrufsvold\.julia\packages\HTTP\Y2JKB\src\clientlayers\RedirectRequest.jl:17
│ [9] redirects
│ @ C:\Users\mrufsvold\.julia\packages\HTTP\Y2JKB\src\clientlayers\RedirectRequest.jl:14 [inlined]
│ [10] (::HTTP.MessageRequest.var"#makerequest#3"{HTTP.MessageRequest.var"#makerequest#1#4"{HTTP.RedirectRequest.var"#redirects#3"{HTTP.RedirectRequest.var"#redirects#1#4"{HTTP.HeadersRequest.var"#defaultheaders#2"{HTTP.HeadersRequest.var"#defaultheaders#1#3"{HTTP.CookieRequest.var"#managecookies#4"{HTTP.CookieRequest.var"#managecookies#1#5"{HTTP.RetryRequest.var"#manageretries#3"{HTTP.RetryRequest.var"#manageretries#1#4"{HTTP.ConnectionRequest.var"#connections#4"{HTTP.ConnectionRequest.var"#connections#1#5"{HTTP.TimeoutRequest.var"#timeouts#3"{HTTP.TimeoutRequest.var"#timeouts#1#4"{HTTP.ExceptionRequest.var"#exceptions#2"{HTTP.ExceptionRequest.var"#exceptions#1#3"{typeof(HTTP.StreamRequest.streamlayer)}}}}}}}}}}}}}}}})(method::String, url::URIs.URI, headers::Vector{Pair{SubString{String}, SubString{String}}}, body::Vector{UInt8}; copyheaders::Bool, response_stream::Base.BufferStream, http_version::HTTP.Strings.HTTPVersion, verbose::Int64, kw::@Kwargs{redirect::Bool, retry::Bool})
│ @ HTTP.MessageRequest C:\Users\mrufsvold\.julia\packages\HTTP\Y2JKB\src\clientlayers\MessageRequest.jl:35
│ [11] makerequest
│ @ C:\Users\mrufsvold\.julia\packages\HTTP\Y2JKB\src\clientlayers\MessageRequest.jl:24 [inlined]
│ [12] request(stack::HTTP.MessageRequest.var"#makerequest#3"{HTTP.MessageRequest.var"#makerequest#1#4"{HTTP.RedirectRequest.var"#redirects#3"{HTTP.RedirectRequest.var"#redirects#1#4"{HTTP.HeadersRequest.var"#defaultheaders#2"{HTTP.HeadersRequest.var"#defaultheaders#1#3"{HTTP.CookieRequest.var"#managecookies#4"{HTTP.CookieRequest.var"#managecookies#1#5"{HTTP.RetryRequest.var"#manageretries#3"{HTTP.RetryRequest.var"#manageretries#1#4"{HTTP.ConnectionRequest.var"#connections#4"{HTTP.ConnectionRequest.var"#connections#1#5"{HTTP.TimeoutRequest.var"#timeouts#3"{HTTP.TimeoutRequest.var"#timeouts#1#4"{HTTP.ExceptionRequest.var"#exceptions#2"{HTTP.ExceptionRequest.var"#exceptions#1#3"{typeof(HTTP.StreamRequest.streamlayer)}}}}}}}}}}}}}}}}, method::String, url::URIs.URI, h::Vector{Pair{SubString{String}, SubString{String}}}, b::Vector{UInt8}, q::Nothing; headers::Vector{Pair{SubString{String}, SubString{String}}}, body::Vector{UInt8}, query::Nothing, kw::@Kwargs{redirect::Bool, retry::Bool, response_stream::Base.BufferStream})
│ @ HTTP C:\Users\mrufsvold\.julia\packages\HTTP\Y2JKB\src\HTTP.jl:457
│ [13] #request#20
│ @ HTTP C:\Users\mrufsvold\.julia\packages\HTTP\Y2JKB\src\HTTP.jl:315 [inlined]
│ [14] macro expansion
│ @ C:\Users\mrufsvold\.julia\packages\Mocking\Q17aB\src\mock.jl:29 [inlined]
│ [15] (::AWS.var"#48#50"{AWS.Request, OrderedCollections.LittleDict{Symbol, Any, Vector{Symbol}, Vector{Any}}})()
│ @ AWS C:\Users\mrufsvold\.julia\packages\AWS\SchLh\src\utilities\request.jl:225
│ [16] (::Base.var"#96#98"{Base.var"#96#97#99"{AWS.AWSExponentialBackoff, AWS.var"#49#51", AWS.var"#48#50"{AWS.Request, OrderedCollections.LittleDict{Symbol, Any, Vector{Symbol}, Vector{Any}}}}})(; kwargs::@Kwargs{})
│ @ Base .\error.jl:308
│ [17] (::Base.var"#96#98"{Base.var"#96#97#99"{AWS.AWSExponentialBackoff, AWS.var"#49#51", AWS.var"#48#50"{AWS.Request, OrderedCollections.LittleDict{Symbol, Any, Vector{Symbol}, Vector{Any}}}}})()
│ @ Base .\error.jl:291
│ [18] _http_request(http_backend::AWS.HTTPBackend, request::AWS.Request, response_stream::IOBuffer)
│ @ AWS C:\Users\mrufsvold\.julia\packages\AWS\SchLh\src\utilities\request.jl:250
│ [19] macro expansion
│ @ C:\Users\mrufsvold\.julia\packages\Mocking\Q17aB\src\mock.jl:29 [inlined]
│ [20] (::AWS.var"#41#44"{AWS.AWSConfig, AWS.Request, IOBuffer, Vector{Int64}})()
│ @ AWS C:\Users\mrufsvold\.julia\packages\AWS\SchLh\src\utilities\request.jl:134
│ [21] (::AWS.var"#42#46"{AWS.var"#41#44"{AWS.AWSConfig, AWS.Request, IOBuffer, Vector{Int64}}, IOBuffer})()
│ @ AWS C:\Users\mrufsvold\.julia\packages\AWS\SchLh\src\utilities\request.jl:149
│ [22] (::Base.var"#96#98"{Base.var"#96#97#99"{AWS.AWSExponentialBackoff, AWS.var"#43#47"{AWS.AWSConfig, Vector{String}, Vector{String}, Int64}, AWS.var"#42#46"{AWS.var"#41#44"{AWS.AWSConfig, AWS.Request, IOBuffer, Vector{Int64}}, IOBuffer}}})(; kwargs::@Kwargs{})
│ @ Base .\error.jl:296
│ [23] (::Base.var"#96#98"{Base.var"#96#97#99"{AWS.AWSExponentialBackoff, AWS.var"#43#47"{AWS.AWSConfig, Vector{String}, Vector{String}, Int64}, AWS.var"#42#46"{AWS.var"#41#44"{AWS.AWSConfig, AWS.Request, IOBuffer, Vector{Int64}}, IOBuffer}}})()
│ @ Base .\error.jl:291
│ [24] submit_request(aws::AWS.AWSConfig, request::AWS.Request; return_headers::Nothing)
│ @ AWS C:\Users\mrufsvold\.julia\packages\AWS\SchLh\src\utilities\request.jl:200
│ [25] (::AWS.RestXMLService)(request_method::String, request_uri::String, args::Dict{String, Any}; aws_config::AWS.AWSConfig, feature_set::AWS.FeatureSet)
│ @ AWS C:\Users\mrufsvold\.julia\packages\AWS\SchLh\src\AWS.jl:287
│ [26] RestXMLService
│ @ C:\Users\mrufsvold\.julia\packages\AWS\SchLh\src\AWS.jl:251 [inlined]
│ [27] #put_object#172
│ @ C:\Users\mrufsvold\.julia\packages\AWS\SchLh\src\services\s3.jl:5754 [inlined]
│ [28] s3_put(aws::AWS.AWSConfig, bucket::SubString{String}, path::String, data::Vector{UInt8}, data_type::String, encoding::String; acl::String, metadata::Dict{String, String}, tags::Dict{String, String}, parse_response::Bool, kwargs::@Kwargs{})
│ @ AWSS3 C:\Users\mrufsvold\.julia\packages\AWSS3\8cxdr\src\AWSS3.jl:1037
│ [29] s3_put
│ @ AWSS3 C:\Users\mrufsvold\.julia\packages\AWSS3\8cxdr\src\AWSS3.jl:985 [inlined]
│ [30] write(fp::S3Path{Nothing}, content::Vector{UInt8}; part_size_mb::Int64, multipart::Bool, returns::Symbol, other_kwargs::@Kwargs{})
│ @ AWSS3 C:\Users\mrufsvold\.julia\packages\AWSS3\8cxdr\src\s3path.jl:696
│ [31] write
│ @ C:\Users\mrufsvold\.julia\packages\AWSS3\8cxdr\src\s3path.jl:674 [inlined]
│ [32] macro expansion
│ @ c:\Users\mrufsvold\Projects\DIL-price-transparency-psd\TableConsolidator.jl\src\jobs\GetSendJob.jl:17 [inlined]
│ [33] macro expansion
│ @ C:\Users\mrufsvold\.julia\packages\Retry\vS1bg\src\repeat_try.jl:192 [inlined]
│ [34] write_and_log(job::Main.TableConsolidator.SendJob)
│ @ Main.TableConsolidator c:\Users\mrufsvold\Projects\DIL-price-transparency-psd\TableConsolidator.jl\src\jobs\GetSendJob.jl:15
│ [35] (::Main.TableConsolidator.var"#60#61"{Channel{Main.TableConsolidator.SendJob}})()
│ @ Main.TableConsolidator c:\Users\mrufsvold\Projects\DIL-price-transparency-psd\TableConsolidator.jl\src\TableConsolidator.jl:71
└ @ Main.TableConsolidator c:\Users\mrufsvold\Projects\DIL-price-transparency-psd\TableConsolidator.jl\src\jobs\GetSendJob.jl:22
Some version info:
julia> versioninfo()
Julia Version 1.10.0
Commit 3120989f39 (2023-12-25 18:01 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Windows (x86_64-w64-mingw32)
CPU: 8 × 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-15.0.7 (ORCJIT, tigerlake)
Threads: 1 on 8 virtual cores
(TableConsolidator) pkg> status
Project TableConsolidator v0.1.0
Status `C:\Users\mrufsvold\Projects\DIL-price-transparency-psd\TableConsolidator.jl\Project.toml`
⌃ [fbe9abb3] AWS v1.90.3
[1c724243] AWSS3 v0.11.2
[336ed68f] CSV v0.10.14
[0f8b85d8] JSON3 v1.14.0
⌃ [e6f89c97] LoggingExtras v1.0.2
[98105f81] LoggingFormats v1.5.0
⌃ [98572fba] Parquet2 v0.2.19
⌃ [2dfb63ee] PooledArrays v1.4.2
[20febd7b] Retry v0.4.1
⌃ [bd369af6] Tables v1.10.1
⌃ [28d57a85] Transducers v0.4.78
⌃ [9d95f2ec] TypedTables v1.4.3
[56ddb016] Logging
[9a3f8284] Random
Info Packages marked with ⌃ have new versions available and may be upgradable.
The files are not huge, around 300 MB, and the retry loop usually succeeds on the second attempt. I notice that when a write is going to fail, it hangs for a long time before finally throwing this error.
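For comparison, here is a minimal sketch of the same retry-on-failure idea using only Base's `retry` and `ExponentialBackOff`, with no S3 involved. `flaky_write` and `write_with_retry` are hypothetical names, and the simulated `IOError` is only a stand-in for the real failure; this is not how AWSS3 or Retry.jl implement retries.

```julia
# Hypothetical stand-in for the S3 write: fails once, then succeeds.
const attempts = Ref(0)

function flaky_write(bytes::Vector{UInt8})
    attempts[] += 1
    # Simulate the transient failure seen in the log on the first attempt only.
    attempts[] == 1 && throw(Base.IOError("SSL_ERROR_SYSCALL", 0))
    return length(bytes)
end

# Base.retry wraps a function so that it is called again after a delay
# whenever `check` returns true for the thrown exception.
write_with_retry = retry(
    flaky_write;
    delays = ExponentialBackOff(n = 3, first_delay = 0.01, max_delay = 0.1),
    check = (state, e) -> e isa Base.IOError,
)

written = write_with_retry(rand(UInt8, 16))
@info "write finished" written attempts = attempts[]
```

The delays here are deliberately tiny so the sketch runs quickly; a real upload would likely want longer waits between attempts.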
Edit: I can't seem to reproduce this error in isolation. This write function is getting called in a program that has a number of async read operations. So maybe there is an issue with SSL being used from multiple threads authenticating the reads at the same time as my write loop?
Edit2: The Stack Overflow question "openssl - SSL_read failing with SSL_ERROR_SYSCALL error" indicates that S3 might be closing the connection before getting to the actual EOF. So maybe my write thread is getting interrupted, S3 drops the connection, and then when it tries to write again, we hit this error. If that's the case, I'm not sure how to convince Julia that a task should not be interrupted.
Edit3: I'm more and more convinced that this is happening because of task switching. Consistently, I hit this error when the start of a write step unblocks a new batch of reads. I think what might be happening is that I take! a send job, it starts, but that unblocks upstream put!s for read tasks. So then the writer is interrupted to consume the batch of files. And then S3 closes the connection. The retry succeeds immediately because upstream tasks are blocked again.
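To make the suspected scheduling pattern concrete, here is a minimal sketch in plain Julia with no S3 or HTTP involved. The channel, the task bodies, and the `events` log are all hypothetical; `sleep` stands in for the network I/O at which the writer task would yield. It only illustrates that a `take!` on a full channel wakes a blocked `put!`, and that the woken task can run as soon as the consumer yields.

```julia
# Bounded channel: with capacity 1, the producer blocks on its second put!.
jobs = Channel{Int}(1)
events = String[]

producer = @async begin
    for i in 1:3
        put!(jobs, i)              # blocks while the channel is full
        push!(events, "put $i")
    end
    close(jobs)
end

consumer = @async begin
    for job in jobs                # each take! frees a slot and wakes the producer
        push!(events, "start $job")
        sleep(0.01)                # stand-in for network I/O; the task yields here
        push!(events, "end $job")
    end
end

wait(consumer)
foreach(println, events)
```

Printing `events` shows where the producer's entries land relative to each start/end pair, which is the interleaving I suspect is happening around my writes.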
I see that there is no API to tell a task not to yield. So is this a bug in AWSS3, in that it doesn't handle retries of multipart uploads more gracefully?
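One thing the log does show: the failing request is a single PUT carrying the whole 302537168-byte body, and frame [30] of the stack trace lists `multipart` and `part_size_mb` keyword arguments on `write`. As a rough sketch of what splitting a body of that size into parts would look like, assuming a 50 MB part size chosen purely for illustration (`part_ranges` is a hypothetical helper, not an AWSS3 function):

```julia
# Hypothetical helper: byte ranges for uploading `total_bytes` in fixed-size parts.
function part_ranges(total_bytes::Integer, part_size_mb::Integer)
    part_size = part_size_mb * 1024^2
    # Each range starts one part further along; the last one is clipped to the total.
    return [start:min(start + part_size - 1, total_bytes)
            for start in 1:part_size:total_bytes]
end

# Body size taken from the Content-Length header in the log above.
ranges = part_ranges(302_537_168, 50)

for (i, r) in enumerate(ranges)
    println("part ", i, ": ", length(r), " bytes")
end
```

With smaller parts, a dropped connection would only cost one part's worth of re-sending rather than the full 300 MB, assuming the upload is retried per part.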