I wrote a picture collector to gather the images taken at my sister’s wedding.
At some point during testing I ran into a timeout caused by the creation of the zip file. The archive was too large, or more precisely, it took too long to compile the zip before sending it off to the client. The advantage of having the zip archive on disk is that you don’t need to build it another time.
The disadvantage, on the other hand, is that you need to have it ready as soon as someone asks for it.
The logical solution was to create the archive on the fly.
Et voilà, it downloads like a charm.
This could be the end, and we could all live happily ever after, couldn’t we?
There is a catch. These images are going to be downloaded 50 or 100 times. Surely a modern CPU will handle it, and given the audience, a slow download won’t bother anybody, but why wrap the present 50 or 100 times if you can do it once and copy it?
So I added an update date to my data model, which I could compare with the file time of the zip, and started wondering how to write one stream into two output streams.
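The date comparison could look roughly like the following Python sketch. The function name `cached_zip_is_fresh` and its parameters are made up for illustration, and it assumes the update date is stored as a timezone-aware datetime.

```python
import os
from datetime import datetime, timezone


def cached_zip_is_fresh(zip_path, collection_updated_at):
    """Return True if the zip on disk is newer than the collection's last change.

    zip_path and collection_updated_at are hypothetical names. The datetime
    should be timezone-aware, because a naive one is interpreted as local time.
    """
    try:
        zip_mtime = os.path.getmtime(zip_path)  # seconds since the epoch
    except OSError:
        # No archive on disk yet (or not readable), so it has to be built.
        return False
    return zip_mtime > collection_updated_at.timestamp()
```

A missing file simply counts as "not fresh", so the very first download needs no special case.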
The answer was kindly provided by Google and a GitHub user called tenbits.
MultiStream is a simple wrapper class: you provide multiple streams, and all calls are passed down to them.
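To illustrate the idea, here is a minimal Python sketch of such a fan-out writer. It is not tenbits’ MultiStream, only the same principle; the names `MultiWriter` and `build_zip` are invented for this example. It relies on the standard `zipfile` module being able to write to a stream that offers nothing but `write` and `flush`.

```python
import io
import zipfile


class MultiWriter:
    """Write-only wrapper that forwards every call to all wrapped streams."""

    def __init__(self, *streams):
        self.streams = streams

    def write(self, data):
        for stream in self.streams:
            stream.write(data)
        return len(data)  # zipfile uses this to keep track of its offset

    def flush(self):
        for stream in self.streams:
            stream.flush()


def build_zip(files, *outputs):
    """Zip a {name: bytes} mapping once and send the bytes to every output."""
    with zipfile.ZipFile(MultiWriter(*outputs), "w") as archive:
        for name, data in files.items():
            archive.writestr(name, data)


# Usage: one pass over the pictures, two identical archives.
to_disk, to_client = io.BytesIO(), io.BytesIO()
build_zip({"a.jpg": b"first", "b.jpg": b"second"}, to_disk, to_client)
```

In a real application one output would be the file on disk and the other the response stream; the in-memory buffers here only keep the example self-contained.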
When the first user starts the download, the archive is written to disk and sent to the user. The next user gets the file from disk, provided it is newer than the latest change in its collection. The only time the zip is created again for the same set of pictures is while the first download is still running. I didn’t want to go down the rabbit hole of tapping into a file that is actively being written, possibly over a faster connection, so that the second downloader reaches the end of the file and ends up with a crippled archive.
Zip writes its table of contents at the end, and one missing byte will ruin your archive, so I didn’t take any chances.