Monday, July 20, 2009

Uploading large files using WCF

In WCF, uploading large files (or transferring large data, in general) can be achieved in two ways - one involves configuring the 'binding' to use 'streamed' transfer mode, while the other involves using the 'buffered' mode. In the 'buffered' mode, all the data to be sent is buffered in memory at the client and then put on the wire. On the service side, again, all the sent data is buffered before being delivered to the service. In the 'streamed' mode, the data is delivered to the service as it is received by the underlying transport. Microsoft recommends using the 'buffered' mode whenever possible. I'll be talking about 'buffered' mode in this post.
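For reference, the transfer mode is a property of the binding. A minimal app.config sketch (the binding name here is illustrative) that switches BasicHttpBinding to streamed mode looks like this - the default is transferMode="Buffered":

```xml
<bindings>
      <basicHttpBinding>
             <!-- transferMode accepts Buffered, Streamed, StreamedRequest or StreamedResponse -->
             <binding name="StreamedUpload" transferMode="Streamed" />
      </basicHttpBinding>
</bindings>
```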
 Let's take a simple service implementation as shown below:
//service contract
[OperationContract]
void UploadData(byte[] data, int length);

//service implementation
public void UploadData(byte[] data, int length)
{
...
.. 
       using (FileStream fs = new FileStream(_filePath, FileMode.Create, FileAccess.Write))
       {
           fs.Write(data, 0, length);
       }
...
..
}
Here, ‘data’ is a byte array that contains file data and 'length' is the amount of data to be written. A client will then read the file as a byte array and invoke the service. On the receiving side the service simply writes the received byte array to a local file. But there is a big catch here. If the file size is multiple GBs, for example, then WCF will attempt to buffer it in memory before delivering it to the service, thus consuming multiple GBs of RAM for one file transfer. Obviously, this is not the best solution.

One simple solution to avoid this issue is to split the file into multiple smaller chunks and send one chunk per service call. This way, only the memory required to hold a single chunk is consumed in each call. The service implementation then takes care of putting the chunks together to form the original file. Here's some code for doing this on the client side:

// client code
void SendFile(string fileToSend, UploadServiceClient client)
{
  // uniquely identifies the file to the service
  string guid = Guid.NewGuid().ToString();

  // amount of data to be sent in each service call
  const int chunkSize = 20 * 1024 * 1024; //20MB chunk
  byte[] fileData = new byte[chunkSize];

  using (FileStream fs = File.OpenRead(fileToSend))
  {
     int bytesRead = 0;
     // keep reading and sending the chunks over
     while ((bytesRead = fs.Read(fileData, 0, fileData.Length)) != 0)
     {
         client.UploadData(guid, fileData, bytesRead);
     }
  }
}
The service implementation code would need to be modified accordingly:

// service code
// note: _files must be shared across calls (e.g. a static field, or a
// service running with InstanceContextMode.Single) for the lookup below to work
Dictionary<string, FileStream> _files = new Dictionary<string, FileStream>();

public void UploadData(string guid, byte[] data, int noOfBytesToWrite)
{
   FileStream fs;
   if (_files.ContainsKey(guid))
   {
       fs = _files[guid];
   }
   else
   {
      // Append (not OpenOrCreate) so later chunks don't overwrite earlier ones
      fs = new FileStream(guid, FileMode.Append, FileAccess.Write);
      _files[guid] = fs; // remember the stream for subsequent chunks
   }
   fs.Write(data, 0, noOfBytesToWrite);
}
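One gap in the sketch above: the FileStreams held in _files are never closed. A hypothetical companion operation (FinishUpload is my own name, not from the original post) that the client could invoke after sending the last chunk might look like:

```csharp
// hypothetical: the client calls this once all chunks have been sent
// (it would also need a matching [OperationContract] on the service contract)
public void FinishUpload(string guid)
{
    FileStream fs;
    if (_files.TryGetValue(guid, out fs))
    {
        fs.Close();          // flush remaining data and release the file handle
        _files.Remove(guid); // forget the completed upload
    }
}
```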
The client invokes the service for each chunk it wants to send over, until the file is completely read. A unique 'guid' is created for each file the client wishes to send. On the service side, this 'guid' identifies the correct FileStream to use when writing the received data to a local file. Notice that we are using chunks of 20MB, but this is not going to work out-of-the-box.

 WCF standard bindings have a default limit (MaxReceivedMessageSize) of 64KB on the size of the 'message' that can be received. Note that the limit applies to the 'message' size, so the overhead of the SOAP headers and the XML elements/attributes that make up the SOAP message must also be taken into account. Also, by default, the HTTP-based standard bindings (BasicHttpBinding and WSHttpBinding) base64 encode binary data (the 'data' byte array in our case), which inflates the 'message' to approximately 4/3 of its original size. So, to send 20MB of data we need to set MaxReceivedMessageSize to at least a little over 26MB. The Net* bindings do not have this issue, but they are WCF-specific and not recommended by Microsoft where maximum interoperability is needed.
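As a quick sanity check on those numbers (my own arithmetic, not from the original post):

```csharp
int chunk = 20 * 1024 * 1024;                        // 20MB of raw bytes = 20971520
long encoded = (long)Math.Ceiling(chunk / 3.0) * 4;  // base64 turns every 3 bytes into 4 chars
// encoded = 27962028 characters, i.e. roughly 26.7MB, before any SOAP envelope overhead
```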

 Another important factor we need to take into consideration is the number of elements in the array being received. This is controlled by the MaxArrayLength value of the binding's ReaderQuotas property (of type XmlDictionaryReaderQuotas), which determines the maximum allowed length of any array that can be received. The default is 16384, which in our case must be raised to at least 20*1024*1024 (20971520) to accommodate a 20MB chunk. All these properties can also be set in app.config:
..
<bindings>
      <basicHttpBinding>
             <binding name="ChunkedUpload" maxReceivedMessageSize="30000000">
                       <readerQuotas maxArrayLength="20971520" />
             </binding>
      </basicHttpBinding>
...
Another factor to take into account is the possibility of timeouts when sending data over slower networks. For the standard WCF bindings, the default SendTimeout is 1 minute (and the default ReceiveTimeout is 10 minutes). These timeouts apply to each service call, so on an internal network (100/1000Mbps LAN) a 20MB chunk should easily be sent and received within a minute, but over slower links the timeouts may need to be raised.
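If longer timeouts are needed, they can be set on the same binding element in app.config. A minimal sketch - the ten-minute values are illustrative, not a recommendation:

```xml
<bindings>
      <basicHttpBinding>
             <!-- timeouts use TimeSpan format: hh:mm:ss -->
             <binding name="ChunkedUpload" sendTimeout="00:10:00" receiveTimeout="00:10:00" />
      </basicHttpBinding>
</bindings>
```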
