
Rewrite streamUpload #22

Closed
wants to merge 6 commits

Conversation

domenkozar
Contributor

@domenkozar commented Mar 24, 2020

I wanted to rewrite streaming to improve performance of chunking. DList is quite inefficient compared to vectorBuilder. See https://www.fpcomplete.com/blog/2014/07/vectorbuilder-packed-conduit-yielding

A few improvements along the way:

  • control flow is still a mess, but I think it's a bit better now
  • concurrent uploads during streaming

Needs testing.

Out of scope: rewriting concurrentUpload in terms of streamUpload

@domenkozar force-pushed the rewrite branch 2 times, most recently from 79a50ac to 59bc3b4, on March 24, 2020 at 18:41
@domenkozar changed the title from Rewrite to Rewrite streamUpload on Mar 24, 2020
@domenkozar marked this pull request as ready for review on March 24, 2020 at 18:43
@domenkozar
Contributor Author

cc @axman6

@domenkozar
Contributor Author

Something is wrong, it's leaking memory.

axman6 (Owner) left a comment


Thanks for the PR @domenkozar. Despite the mostly negative comments below, I do very much like the restructuring here: splitting things into conduits is something I probably should have already done, the use of concurrentMapM_ is quite nice, and I like the idea of being able to build up Parts while others are being uploaded.

It looks like there's a very serious issue with vectorBuilder: unless I'm really misunderstanding the code, it completely avoids streaming the data, which will lead to huge memory problems and defeat the purpose of this library.

src/Network/AWS/S3/StreamingUpload.hs (outdated, resolved)
import qualified Data.Vector.Storable as ScopedTypeVariables
import Conduit ( MonadUnliftIO(..), mapC, PrimMonad )
import Data.Conduit ( ConduitT, Void, await, catchC, handleC, (.|), leftover, yield, awaitForever)
import Data.Conduit.Combinators (vectorBuilder, sinkList)
import Data.Conduit.List ( sourceList )
Owner


Can we keep the formatting of imports consistent? (Looks like I wasn't very consistent myself, but I thought this repo had a stylish-haskell config, so running that should fix it automatically.)

Contributor Author


Done (I hope; they weren't consistent beforehand).

Comment on lines +113 to +127
startUpload :: (MonadUnliftIO m, MonadAWS m, MonadFail m, MonadResource m) =>
ConduitT
(Int, ByteString)
Void
m
(Either
(AbortMultipartUploadResponse, SomeException)
CompleteMultipartUploadResponse)
Owner


Suggested change
startUpload :: (MonadUnliftIO m, MonadAWS m, MonadFail m, MonadResource m) =>
ConduitT
(Int, ByteString)
Void
m
(Either
(AbortMultipartUploadResponse, SomeException)
CompleteMultipartUploadResponse)
startUpload :: (MonadUnliftIO m, MonadAWS m, MonadFail m, MonadResource m)
=> ConduitT (Int, ByteString) Void m
(Either (AbortMultipartUploadResponse, SomeException)
CompleteMultipartUploadResponse)

@@ -85,74 +92,107 @@ See the AWS documentation for more details.

May throw 'Network.AWS.Error'
-}
streamUpload :: (MonadUnliftIO m, MonadAWS m, MonadFail m)
streamUpload :: (MonadUnliftIO m, MonadAWS m, MonadFail m, MonadResource m, PrimMonad m)
Owner

axman6, Mar 24, 2020


We have been talking about replacing the MonadAWS m constraint with AWSConstraint r m; if possible, it'd be good to do that here too. That would bring in the MonadResource constraint, IIRC. Can any of these constraints be removed easily enough? MonadFail is a bit of a pain; I had thought about making the return type of this conduit a sum type instead of the Either, which could include a constructor for failures.

Contributor Author


I'd appreciate it if that could stay out of scope here.

(Four further review threads on src/Network/AWS/S3/StreamingUpload.hs, since resolved.)
@domenkozar
Contributor Author

domenkozar commented May 18, 2020

@axman6 I've updated the PR to use a raw memory buffer, which should become part of the conduit API at some point: snoyberg/conduit#438

The last blocking thing is an amazonka release, as we need some extra monad instances for AWST. See brendanhay/amazonka#574

mpickering and others added 3 commits May 20, 2020 10:30
This reduces GC time on the 100mb upload from 2s to 0.1s.

(cherry picked from commit 72c4628)
Signed-off-by: Domen Kožar <[email protected]>
This is approximately 80x faster.

(cherry picked from commit 5824b37)
Signed-off-by: Domen Kožar <[email protected]>
@domenkozar
Contributor Author

I can confirm that with this code I can upload to an S3 bucket at ~80MB/s.

@domenkozar
Contributor Author

@axman6 ping?

@axman6
Owner

axman6 commented Jun 5, 2020

Sorry, I hadn't seen the updates here. One of the goals I'd had for the conduit side of things was to avoid copying data if possible, which is why I was using DLists of bytestrings. I'm not sure that was a great idea though, and the use of a preallocated buffer is nice, but it makes me think that maybe we should just be using ByteString Builders if we're going to go that route, so we can be more sure that we're not going to have dangerous memory accesses. I might make a fork of this branch and play with it to see if I can come up with something I'm happier with.
Again, sorry this has taken so long; life has been hectic and maintaining libraries has been far from the front of my mind.

axman6 (Owner) left a comment


I'm having a go at rewriting this at the moment to use ByteString.Builder. Initial results are very promising, but I'm worried something is going wrong with my benchmarks: I'm unable to achieve the 80MB/s speeds you mentioned (for some reason I'm only getting 8MB/s), and there's definitely a memory leak; it seems to be holding onto all of the input data.

With the Builder version, I'm achieving 80MB/s from an EC2 host in the same region as the bucket. I'm also seeing a similar memory leak though.

Comment on lines +311 to +326
processChunk :: ChunkSize -> ByteString -> S -> IO ([ByteString], S)
processChunk chunkSize input =
loop id 0
where
loop front idxIn s@(S fptr ptr idxOut)
| idxIn >= B.length input = return (front [], s)
| otherwise = do
pokeByteOff ptr idxOut (unsafeIndex input idxIn)
let idxOut' = idxOut + 1
idxIn' = idxIn + 1
if idxOut' >= chunkSize
then do
let bs = PS fptr 0 idxOut'
s' <- newS chunkSize
loop (front . (bs:)) idxIn' s'
else loop front idxIn' (S fptr ptr idxOut')
Owner


This looks like quite a convoluted way to implement memcpy, which already exists as copyBytes. Also, does line 316 mean that the buffer might end up being filled to less than chunkSize? The multipart upload API requires that parts are at least 5MB, so we need to be sure that we always send at least chunkSize bytes in each request.
I'm struggling to assess the correctness of this code; for example, I can't tell without thinking very hard whether it's possible that some chunks might not be included in the output. What happens to input in the idxIn >= B.length input case? I think this is another good reason to use Builders: we can accumulate input bytestrings until we have at least chunkSize bytes, then output the builder and its length, and use https://hackage.haskell.org/package/bytestring-0.10.10.0/docs/Data-ByteString-Builder-Extra.html to produce a lazy bytestring with a single chunk. This avoids doing all the pointer mangling by hand.
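The Builder accumulation described above can be sketched as a small pure helper. This is only an illustration of the idea, not code from the PR: the names (Acc, feed, flush, render) are hypothetical, and it assumes only the bytestring package. Incoming ByteStrings are collected in a Builder alongside a running length; once at least chunkSize bytes have accumulated, they are rendered with toLazyByteStringWith and an untrimmedStrategy sized to the exact length, so the resulting lazy bytestring has a single chunk.

```haskell
{-# LANGUAGE BangPatterns #-}
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as BL
import Data.ByteString.Builder (Builder, byteString)
import Data.ByteString.Builder.Extra (toLazyByteStringWith, untrimmedStrategy)

-- Pending bytes and their total length.
data Acc = Acc !Builder !Int

emptyAcc :: Acc
emptyAcc = Acc mempty 0

-- Add one input ByteString. Once at least chunkSize bytes have
-- accumulated, emit them as a single part (possibly larger than
-- chunkSize, never smaller) and reset the accumulator.
feed :: Int -> B.ByteString -> Acc -> ([B.ByteString], Acc)
feed chunkSize bs (Acc b n)
  | n' < chunkSize = ([], Acc b' n')
  | otherwise      = ([render n' b'], emptyAcc)
  where
    !n' = n + B.length bs
    b'  = b <> byteString bs

-- At end of input, any trailing bytes below chunkSize become the final
-- (smaller) part, which the multipart API permits for the last part only.
flush :: Acc -> [B.ByteString]
flush (Acc _ 0) = []
flush (Acc b n) = [render n b]

-- Allocate one buffer of exactly the accumulated size, so the lazy
-- result has a single chunk and converting to strict is a cheap copy.
render :: Int -> Builder -> B.ByteString
render n = BL.toStrict . toLazyByteStringWith (untrimmedStrategy n n) BL.empty
```

Driving this from a conduit would replace processChunk: call feed on each awaited ByteString, yield every emitted part, and flush once upstream is exhausted, with no manual pointer arithmetic involved.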

@domenkozar
Contributor Author

@axman6 what would you like to do with this?

@domenkozar closed this Sep 9, 2021
@domenkozar
Contributor Author

domenkozar commented Sep 10, 2021

I'm using this in production for http://cachix.org/.

I understand the code is not ideal, but it improves throughput and CPU/memory usage over the current code.

I also understand you want to keep the code tidy and correct.

Closing as I don't see a way forward. That said, thanks a lot for this library!

@ghost

ghost commented Oct 1, 2021

Hey @domenkozar, would you be interested in taking over this package? I don't have a use for it or time to maintain it (not that I ever did really). If so, I am happy to add/make you the maintainer on Hackage and you can keep it going with releases if you like. It's great to hear it's being used in production for something so useful, and I'm even happier to know it's not using my atrocious code! 😅

@domenkozar
Contributor Author

Hey @axman6 / @axman6-da, I gave this some thought and I'm not sure I'm in a good place to maintain this library as well. I already have too many packages on my plate :(

Maybe you could make a call for maintainers and see if someone steps up?

@endgame
Collaborator

endgame commented Jul 5, 2024

If the memory leak is in request bodies like the Glacier issue linked upthread, snoyberg/http-client#538 seems pretty suspicious as a cause.
