Persistent file transfer
A file upload or download may experience various types of transient failures:
- One or more data servers have failed.
- A network connection fails.
- The host PC is turned off or the core client quits.
BOINC uses a mechanism called persistent file transfer to recover efficiently from these conditions and to decide when a permanent failure has occurred. The FILE_XFER class encapsulates a single transfer session to a particular data server. If part of the file has already been transferred, FILE_XFER resumes at the appropriate point.
The PERS_FILE_XFER class encapsulates a persistent file transfer, which may involve a sequence of FILE_XFERs, possibly to different data servers.
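As a rough illustration of how the two classes relate, the sketch below models a persistent transfer as a series of single sessions. The names PersistentTransfer, run_session and flaky_session, the byte-offset bookkeeping, and the round-robin choice of server are all assumptions made for this example and are not taken from the BOINC source. Only the ideas of resuming a partial transfer and of moving on to another data server come from the text above.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PersistentTransfer:
    """Hypothetical stand-in for PERS_FILE_XFER: one file, many sessions."""
    urls: List[str]        # candidate data servers for this file
    total_bytes: int       # size of the file being transferred
    bytes_done: int = 0    # progress kept across sessions so a session can resume
    num_sessions: int = 0  # number of single sessions started so far

    def run_session(self, session: Callable[[str, int], int]) -> bool:
        """Run one session (the FILE_XFER role) against the next server.

        `session(url, offset)` is an assumed callback that returns how many
        bytes it moved before it finished or failed.
        Returns True once the whole file has been transferred.
        """
        # Assumption: servers are tried in round-robin order.
        url = self.urls[self.num_sessions % len(self.urls)]
        self.num_sessions += 1
        moved = session(url, self.bytes_done)  # resume from the saved offset
        self.bytes_done = min(self.total_bytes, self.bytes_done + moved)
        return self.bytes_done >= self.total_bytes


def flaky_session(url: str, offset: int) -> int:
    # Pretend every session moves 400 bytes before the connection drops.
    return 400


xfer = PersistentTransfer(
    urls=["http://a.example/file", "http://b.example/file"],
    total_bytes=1000,
)
while not xfer.run_session(flaky_session):
    pass  # a real client would wait for the backoff delay between sessions
```

Keeping the progress counter on the persistent object rather than on the session is what lets a later session, possibly against a different server, pick up where the previous one stopped.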
When a file is involved in a persistent file transfer, the state is saved in the client state file in the following XML element (included in the <file_info> element):
<persistent_file_xfer>
    <num_retries>2</num_retries>
    <first_request_time>1030665600</first_request_time>
    <next_request_time>1030665725</next_request_time>
</persistent_file_xfer>
- The num_retries element is the number of transfer sessions so far.
- The first_request_time element is the time the first transfer session started.
- The next_request_time element is the earliest time to start a new transfer session.
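The element described above can be read with a few lines of Python. This is a minimal sketch, not the client's own parser: the PersistentXferState class and the parse_persistent_file_xfer function are hypothetical names, and treating the two time fields as Unix timestamps in seconds is an assumption based on the sample values.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class PersistentXferState:
    num_retries: int           # transfer sessions so far
    first_request_time: float  # assumed: Unix time in seconds
    next_request_time: float   # assumed: Unix time in seconds


def parse_persistent_file_xfer(xml_text: str) -> PersistentXferState:
    """Parse one <persistent_file_xfer> element into a small record."""
    elem = ET.fromstring(xml_text)
    if elem.tag != "persistent_file_xfer":
        raise ValueError(f"unexpected element <{elem.tag}>")

    def field(name: str) -> str:
        text = elem.findtext(name)
        if text is None:
            raise ValueError(f"missing <{name}> element")
        return text.strip()

    return PersistentXferState(
        num_retries=int(field("num_retries")),
        first_request_time=float(field("first_request_time")),
        next_request_time=float(field("next_request_time")),
    )


sample = """
<persistent_file_xfer>
    <num_retries>2</num_retries>
    <first_request_time>1030665600</first_request_time>
    <next_request_time>1030665725</next_request_time>
</persistent_file_xfer>
"""
state = parse_persistent_file_xfer(sample)
```

For the sample element, state.num_retries is 2 and the two time fields are 125 seconds apart.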
When there is a transient failure, the core client increments num_retries and calculates a new next_request_time based on randomized exponential backoff, given by
next_request_time = current_time + max(MIN_DELAY, min(MAX_DELAY, exp(rand(0,1) * num_retries)))
where MIN_DELAY is 1 minute and MAX_DELAY is 4 hours.
The client classifies the transfer as a permanent failure if the current time becomes much later than first_request_time (by default, more than two weeks later). In this event, the file is deleted and the failure is reported to the scheduling server via a <download_error> or <upload_error> tag with error_code -114 in the result (see The BOINC scheduling server protocol). The client also records a "Giving up on upload" message.
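The backoff formula and the give-up rule can be sketched in Python as follows. Several details here are assumptions rather than facts from the BOINC source: all times are taken to be in seconds, the give-up period is measured from first_request_time, and the names compute_next_request_time, is_permanent_failure and GIVE_UP_PERIOD are invented for the example.

```python
import math
import random

MIN_DELAY = 60                      # 1 minute, in seconds (unit is an assumption)
MAX_DELAY = 4 * 60 * 60             # 4 hours, in seconds
GIVE_UP_PERIOD = 14 * 24 * 60 * 60  # two weeks, in seconds


def compute_next_request_time(current_time, num_retries, rng=random.random):
    """Earliest time for the next session, using randomized exponential backoff.

    `rng` returns a float in [0, 1); it is a parameter so tests can fix it.
    """
    try:
        backoff = math.exp(rng() * num_retries)
    except OverflowError:
        # math.exp overflows for very large exponents; the clamp below
        # would cap the value at MAX_DELAY anyway.
        backoff = math.inf
    delay = max(MIN_DELAY, min(MAX_DELAY, backoff))
    return current_time + delay


def is_permanent_failure(current_time, first_request_time,
                         give_up_period=GIVE_UP_PERIOD):
    """Assumed rule: give up once the transfer has been pending too long."""
    return current_time - first_request_time > give_up_period


# Example: schedule the retry after a third transient failure.
now = 1030665725
retry_at = compute_next_request_time(now, num_retries=3)
```

Under the seconds assumption, the exponential term only exceeds MIN_DELAY once rand(0,1) * num_retries is larger than ln 60 (about 4.1), so the first few retries are all scheduled exactly one minute out.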