Difference: ClientRetryStrategy (1 vs. 31)

Revision 31 2019-09-24 - DaveDykstra

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Frontier client retry strategy

Changed:
<
<
This section only describes frontier client version 2.8.15 (February 2016) and later. See below for the differences in older client versions. For comparison you could see the CVMFS network path selection description.
>
>
This section only describes frontier client version 2.8.21 (October 2019) and later. See below for the differences in older client versions. For comparison you could see the CVMFS network path selection description.
  Definitions:
  • proxy group - either all the IP addresses of one IP family (IPv4 or IPv6) in a round-robin DNS name in a proxyurl or a backupproxyurl, or all the proxyurls together if loadbalance=proxies is set.
Line: 30 to 30
 
  1. If there is a max-age exceeded condition, retry the same server and proxy with the specified maximum cache age N seconds (Cache-control: max-age=N). If the condition persists after the retry (should only happen on buggy versions of squid), ignore it and move on to the next strategy.
  2. If a server error happens, keep the same proxy and try the next server URL. This server is flagged as having had an error so it is not tried again until after a 30 minute reset (this is mainly relevant when loadbalance=servers). However, if this happens on the last server, clear all the server error flags, go back to the beginning of the server list and try the next proxy with the first server (in case a proxy was somehow returning a server error).
  3. If a connect error to a proxy happens, try the next proxy, including within the same proxy group if there are any left that hadn't had an error, with the same server URL. This proxy is flagged as having had an error so it is not tried again until after a 5 minute reset or after there are no more proxies.
Changed:
<
<
  1. If a protocol error happens and the soft retry of step 1 has already happened (which means there was a max-age exceeded condition), with the same proxy & server retry the request with a hard refresh (Pragma: no-cache). (The hard refresh is needed to clear the cache in cases that a soft refresh doesn't clear because of If-Modified-Since revalidation.) If that still returns a non-server error, try the next proxy with the same server as long as the proxy is in the same proxy group. If the end of the proxy group is reached, start again at the beginning of the proxy group and try the next server until there are no more servers. If there are no more servers, clear any server error flags, go back again to the beginning of the server list, and move to the next proxy group.
>
>
  1. If a protocol error happens and the soft retry of step 1 has already happened (which means there was a max-age exceeded condition), with the same proxy & server retry the request with a hard refresh (Cache-Control: no-cache). (The hard refresh is needed to clear the cache in cases that a soft refresh doesn't clear because of If-Modified-Since revalidation.) If that still returns a non-server error, try the next proxy with the same server as long as the proxy is in the same proxy group. If the end of the proxy group is reached, start again at the beginning of the proxy group and try the next server until there are no more servers. If there are no more servers, clear any server error flags, go back again to the beginning of the server list, and move to the next proxy group.
 
  1. For all other errors, do the same as a protocol error except without the hard refresh.
  2. When there are no more proxies to try, unless failovertoserver=no is set (and backupproxyurl implies that), try directly connecting to all servers including all the addresses in each round-robin DNS name. Note: this is the only time that all the addresses in a round-robin DNS server name are used, since when going through a proxy only the server URL is used, not individual IP addresses. So if it is important to guarantee that all servers be tried through proxies, the individual server names need to be included in the list of serverurls (typically after the round-robin name).
Line: 44 to 44
  Frontier clients prior to version 2.8.2 (June 2011) are not documented on this page.
Added:
>
>
For frontier clients 2.8.15 (February 2016) through 2.8.20, these are the differences compared to 2.8.21 (October 2019) above:
  1. Hard refreshes were done with Pragma: no-cache instead of Cache-Control: no-cache.
 For frontier clients 2.8.9 (January 2014) through 2.8.14, these are the differences compared to 2.8.15 (February 2016) above:
  1. Different IP families in the same DNS name were not considered separate groups (in fact IPv6 was not supported at all before 2.8.14).
  2. 404 was considered a protocol error, not a server error.
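
The hard-refresh header change between 2.8.20 and 2.8.21 noted above can be illustrated with a small sketch. This is not the frontier client's code (the client is not written in Python); the function names `parse_version` and `hard_refresh_header` are hypothetical and exist only for illustration.

```python
def parse_version(text):
    """Turn a dotted version such as '2.8.21' into a comparable tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def hard_refresh_header(client_version):
    """Return the (name, value) request header used for a hard refresh.

    Per the notes above: 2.8.21 and later use Cache-Control: no-cache,
    earlier documented versions used Pragma: no-cache.
    """
    if parse_version(client_version) >= (2, 8, 21):
        return ("Cache-Control", "no-cache")
    return ("Pragma", "no-cache")
```

Comparing tuples of integers rather than strings avoids ordering mistakes such as "2.8.9" sorting after "2.8.21".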

Revision 30 2018-03-09 - DaveDykstra

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Frontier client retry strategy

Line: 30 to 30
 
  1. If there is a max-age exceeded condition, retry the same server and proxy with the specified maximum cache age N seconds (Cache-control: max-age=N). If the condition persists after the retry (should only happen on buggy versions of squid), ignore it and move on to the next strategy.
  2. If a server error happens, keep the same proxy and try the next server URL. This server is flagged as having had an error so it is not tried again until after a 30 minute reset (this is mainly relevant when loadbalance=servers). However, if this happens on the last server, clear all the server error flags, go back to the beginning of the server list and try the next proxy with the first server (in case a proxy was somehow returning a server error).
  3. If a connect error to a proxy happens, try the next proxy, including within the same proxy group if there are any left that hadn't had an error, with the same server URL. This proxy is flagged as having had an error so it is not tried again until after a 5 minute reset or after there are no more proxies.
Changed:
<
<
  1. If a protocol error happens and the soft retry of step 1 has already happened (which means there was a max-age exceeded condition), with the same proxy & server retry the request with a hard refresh (Pragma: no-cache). (The hard refresh is needed to clear the cache in cases that a soft refresh doesn't clear because If-Modified-Since revalidation.) If that still returns a non-server error, try the next proxy with the same server as long as the proxy is in the same proxy group. If the end of the proxy group is reached, start again at the beginning of the proxy group and try the next server until there are no more servers. If there are no more servers, clear any server error flags, go back again to the beginning of the server list, and move to the next proxy group.
>
>
  1. If a protocol error happens and the soft retry of step 1 has already happened (which means there was a max-age exceeded condition), with the same proxy & server retry the request with a hard refresh (Pragma: no-cache). (The hard refresh is needed to clear the cache in cases that a soft refresh doesn't clear because of If-Modified-Since revalidation.) If that still returns a non-server error, try the next proxy with the same server as long as the proxy is in the same proxy group. If the end of the proxy group is reached, start again at the beginning of the proxy group and try the next server until there are no more servers. If there are no more servers, clear any server error flags, go back again to the beginning of the server list, and move to the next proxy group.
 
  1. For all other errors, do the same as a protocol error except without the hard refresh.
  2. When there are no more proxies to try, unless failovertoserver=no is set (and backupproxyurl implies that), try directly connecting to all servers including all the addresses in each round-robin DNS name. Note: this is the only time that all the addresses in a round-robin DNS server name are used, since when going through a proxy only the server URL is used, not individual IP addresses. So if it is important to guarantee that all servers be tried through proxies, the individual server names need to be included in the list of serverurls (typically after the round-robin name).

Revision 29 2018-01-19 - DaveDykstra

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Frontier client retry strategy

Line: 21 to 21
  The frontier client error retry strategy distinguishes between five types of non-fatal errors:
  1. A cache max-age exceeded condition -- the server specified a maximum cache age in its response, and the received "Age" http header (inserted by the squid proxy) is greater than that. The server sets this maximum age near the end of its response when it is too late to put it in the http header, which happens when a condition to limit the max age is found after the http header has already been sent by the server. The http header is sent either when a buffer full of data is ready or when a "keepalive" has to be sent because the server has been waiting for more than 5 seconds for that client's request to the database. A condition that the server might notice at that point is either some kind of database error or an empty result. If the server doesn't set the maximum cache age but it is a protocol error (see below), then the client uses a default maximum cache age of 5 minutes (this is to support servlet versions older than 3.33).
Changed:
<
<
  1. A server error -- a proxy or server responded but there was either an http error code from the server where it knew it had a problem, or the proxy immediately determined a server problem and returned an error code indicating so. In other words, there was an http error code of either 404 or in the 500 series. The result will not be cached.
>
>
  1. A server error -- a proxy or server responded but there was either an http error code from the server where it knew it had a problem, or the proxy immediately determined a server problem and returned an error code indicating so. In other words, there was an http error code of either 404 or in the 500 series. The result will not be cached. Server errors can also be caused by a variety of problems with downloading the host certificate when the signing security feature is enabled, and that result may be cached.
 
  1. A connect error - the socket couldn't be connected, meaning that a proxy or direct-connect server is down. This can be connection timeout, connection refused, or network unreachable.
Changed:
<
<
  1. A protocol error - either the response http code was not 200 OK, the code was not 404 or in the 500 server-error series, the response couldn't be completely parsed as expected, or the server signaled at the end of its response that it encountered a late error and should not be cached for a long time (even though the http headers said it was OK and could be cached).
>
>
  1. A protocol error - either the response http code was not 200 OK (and the code was not 404 or in the 500 server-error series, covered above), the response couldn't be completely parsed as expected, or the server signaled at the end of its response that it encountered a late error and should not be cached for a long time (even though the http headers said it was OK and could be cached).
 
  1. All other types of errors, for example networking problems, read timeouts because of overloading or non-response from a server, etc.
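
As a rough illustration of how the server-error and protocol-error categories above divide up HTTP responses, here is a minimal sketch. It assumes only the rules stated in this revision (404 and the 500 series are server errors; any other non-200 code, an unparseable response, or a late error signaled by the server is a protocol error). The names `ErrorType` and `classify_response` are hypothetical, and max-age conditions, connect errors, certificate download problems and "other" errors are deliberately not modeled.

```python
from enum import Enum


class ErrorType(Enum):
    NONE = "none"
    SERVER = "server error"
    PROTOCOL = "protocol error"


def classify_response(status, parsed_ok=True, late_error=False):
    """Classify a received HTTP response into one of the categories above.

    status     -- HTTP status code of the response
    parsed_ok  -- False if the body could not be completely parsed
    late_error -- True if the server signaled an error at the end of its response
    """
    # 404 and the 500 series are server errors (2.8.15 and later).
    if status == 404 or 500 <= status <= 599:
        return ErrorType.SERVER
    # Any other non-200 code, a parse failure, or a late error is a protocol error.
    if status != 200 or not parsed_ok or late_error:
        return ErrorType.PROTOCOL
    return ErrorType.NONE
```

The server-error test comes first so that 404 and 5xx codes are never reported as protocol errors, matching the "covered above" wording in the protocol error definition.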

These are the retry strategies:

Line: 45 to 45
 Frontier clients prior to version 2.8.2 (June 2011) are not documented on this page.

For frontier clients 2.8.9 (January 2014) through 2.8.14, these are the differences compared to 2.8.15 (February 2016) above:

Changed:
<
<
  • Different IP families in the same DNS name were not considered separate groups (in fact IPv6 was not supported at all before 2.8.14).
  • 404 was considered a protocol error, not a server error.
  • Connection refused and network unreachable were not considered connect errors, only connection timeout was.
  • There was a bug present since 2.8.6 that considered a whole proxy group as failed when the last one failed, even if previous ones hadn't failed. That could happen after load balancing within a proxy group.
  • There was a bug present since 2.8.9 where if a server did not send an Age header (because it was a fresh query) and the previous query on the same connection had an old enough Age, a max-age exceeded condition could be erroneously triggered; the Age from the previous query was applied to the one without an Age.

For frontier clients version 2.8.6 (April 2013) through 2.8.8, there are these differences compared to the 2.8.9 strategies above:

  • There was no concept of a max-age exceeded condition.
  • For a protocol error, the request would be immediately retried first with a soft retry (Cache-control: max-age=0) before the hard retry.
>
>
  1. Different IP families in the same DNS name were not considered separate groups (in fact IPv6 was not supported at all before 2.8.14).
  2. 404 was considered a protocol error, not a server error.
  3. Connection refused and network unreachable were not considered connect errors, only connection timeout was.
  4. There was a bug present since 2.8.6 that considered a whole proxy group as failed when the last one failed, even if previous ones hadn't failed. That could happen after load balancing within a proxy group.
  5. There was a bug present since 2.8.9 where if a server did not send an Age header (because it was a fresh query) and the previous query on the same connection had an old enough Age, a max-age exceeded condition could be erroneously triggered; the Age from the previous query was applied to the one without an Age.

For frontier clients version 2.8.7 (June 2013) and 2.8.8, there are these differences compared to the 2.8.9 strategies above:

  1. There was no concept of a max-age exceeded condition.
  2. For a protocol error, the request would be immediately retried first with a soft retry (Cache-control: max-age=0) before the hard retry.

For frontier client version 2.8.6 (April 2013) compared to the 2.8.7 strategies above:

  • There was no digital signature support so there were no server errors with downloading host certificates.
  For frontier client versions 2.8.2 through 2.8.5, these are the additional differences:

Revision 28 2018-01-17 - DaveDykstra

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Frontier client retry strategy

Line: 21 to 21
  The frontier client error retry strategy distinguishes between five types of non-fatal errors:
  1. A cache max-age exceeded condition -- the server specified a maximum cache age in its response, and the received "Age" http header (inserted by the squid proxy) is greater than that. The server sets this maximum age near the end of its response when it is too late to put it in the http header, which happens when a condition to limit the max age is found after the http header has already been sent by the server. The http header is sent either when a buffer full of data is ready or when a "keepalive" has to be sent because the server has been waiting for more than 5 seconds for that client's request to the database. A condition that the server might notice at that point is either some kind of database error or an empty result. If the server doesn't set the maximum cache age but it is a protocol error (see below), then the client uses a default maximum cache age of 5 minutes (this is to support servlet versions older than 3.33).
Changed:
<
<
  1. A server error -- a proxy or server responded but there was either an http error code from the server where it knew it had a problem, or the proxy immediately determined a server problem and returned an error code indicating so. In other words, there was an http error code was either 404 or in the 500 series. The result will not be cached.
>
>
  1. A server error -- a proxy or server responded but there was either an http error code from the server where it knew it had a problem, or the proxy immediately determined a server problem and returned an error code indicating so. In other words, there was an http error code of either 404 or in the 500 series. The result will not be cached.
 
  1. A connect error - the socket couldn't be connected, meaning that a proxy or direct-connect server is down. This can be connection timeout, connection refused, or network unreachable.
  2. A protocol error - either the response http code was not 200 OK, the code was not 404 or in the 500 server-error series, the response couldn't be completely parsed as expected, or the server signaled at the end of its response that it encountered a late error and should not be cached for a long time (even though the http headers said it was OK and could be cached).
  3. All other types of errors, for example networking problems, read timeouts because of overloading or non-response from a server, etc.

Revision 27 2016-02-04 - DaveDykstra

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Frontier client retry strategy

Line: 40 to 40
 
  • PROXY1+SERVER2: server error, no more servers so go back to the beginning of the server list and try next proxy
To avoid that situation, if the server list has been set back to its beginning by a server error while in a given proxy group since the last 5-minute full reset, do not try to start again at the beginning of that proxy group even if the strategy calls for it.
Deleted:
<
<
When there are no proxies in use either because they weren't configured or they all failed and the client directly connects to the server, the host names listed in serverurls also get resolved in the DNS like proxies, and most of the same load balancing and retry logic is applied to them directly where it makes sense; the same code is reused.
 

Older client version differences

Frontier clients prior to version 2.8.2 (June 2011) are not documented on this page.

Revision 26 2016-02-03 - DaveDykstra

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Frontier client retry strategy

Line: 22 to 22
 The frontier client error retry strategy distinguishes between five types of non-fatal errors:
  1. A cache max-age exceeded condition -- the server specified a maximum cache age in its response, and the received "Age" http header (inserted by the squid proxy) is greater than that. The server sets this maximum age near the end of its response when it is too late to put it in the http header, which happens when a condition to limit the max age is found after the http header has already been sent by the server. The http header is sent either when a buffer full of data is ready or when a "keepalive" has to be sent because the server has been waiting for more than 5 seconds for that client's request to the database. A condition that the server might notice at that point is either some kind of database error or an empty result. If the server doesn't set the maximum cache age but it is a protocol error (see below), then the client uses a default maximum cache age of 5 minutes (this is to support servlet versions older than 3.33).
  2. A server error -- a proxy or server responded but there was either an http error code from the server where it knew it had a problem, or the proxy immediately determined a server problem and returned an error code indicating so. In other words, there was an http error code was either 404 or in the 500 series. The result will not be cached.
Changed:
<
<
  1. A connect timeout - the socket couldn't be connected, meaning that a proxy or direct-connect server is down.
>
>
  1. A connect error - the socket couldn't be connected, meaning that a proxy or direct-connect server is down. This can be connection timeout, connection refused, or network unreachable.
 
  1. A protocol error - either the response http code was not 200 OK, the code was not 404 or in the 500 server-error series, the response couldn't be completely parsed as expected, or the server signaled at the end of its response that it encountered a late error and should not be cached for a long time (even though the http headers said it was OK and could be cached).
  2. All other types of errors, for example networking problems, read timeouts because of overloading or non-response from a server, etc.

These are the retry strategies:

  1. If there is a max-age exceeded condition, retry the same server and proxy with the specified maximum cache age N seconds (Cache-control: max-age=N). If the condition persists after the retry (should only happen on buggy versions of squid), ignore it and move on to the next strategy.
  2. If a server error happens, keep the same proxy and try the next server URL. This server is flagged as having had an error so it is not tried again until after a 30 minute reset (this is mainly relevant when loadbalance=servers). However, if this happens on the last server, clear all the server error flags, go back to the beginning of the server list and try the next proxy with the first server (in case a proxy was somehow returning a server error).
Changed:
<
<
  1. If a connect timeout to a proxy happens, try the next proxy, including within the same proxy group if there are any left that hadn't had an error, with the same server URL. This proxy is flagged as having had an error so it is not tried again until after a 5 minute reset or after there are no more proxies.
>
>
  1. If a connect error to a proxy happens, try the next proxy, including within the same proxy group if there are any left that hadn't had an error, with the same server URL. This proxy is flagged as having had an error so it is not tried again until after a 5 minute reset or after there are no more proxies.
 
  1. If a protocol error happens and the soft retry of step 1 has already happened (which means there was a max-age exceeded condition), with the same proxy & server retry the request with a hard refresh (Pragma: no-cache). (The hard refresh is needed to clear the cache in cases that a soft refresh doesn't clear because If-Modified-Since revalidation.) If that still returns a non-server error, try the next proxy with the same server as long as the proxy is in the same proxy group. If the end of the proxy group is reached, start again at the beginning of the proxy group and try the next server until there are no more servers. If there are no more servers, clear any server error flags, go back again to the beginning of the server list, and move to the next proxy group.
  2. For all other errors, do the same as a protocol error except without the hard refresh.
  3. When there are no more proxies to try, unless failovertoserver=no is set (and backupproxyurl implies that), try directly connecting to all servers including all the addresses in each round-robin DNS name. Note: this is the only time that all the addresses in a round-robin DNS server name are used, since when going through a proxy only the server URL is used, not individual IP addresses. So if it is important to guarantee that all servers be tried through proxies, the individual server names need to be included in the list of serverurls (typically after the round-robin name).
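
The server-error step in the strategies above can be sketched as a small state transition. This is a simplified illustration, not the client's implementation: it tracks proxies and servers by list index, the function name `after_server_error` is hypothetical, and it ignores the 30 minute reset, the skipping of already-flagged servers, and the loadbalance=servers case.

```python
def after_server_error(proxy_index, server_index, server_count, server_errors):
    """Return the (proxy_index, server_index) pair to try after a server error.

    server_errors is a set of server indexes currently flagged as having
    had an error; it is updated in place.
    """
    # Flag the server that just failed.
    server_errors.add(server_index)
    if server_index + 1 < server_count:
        # Keep the same proxy and try the next server URL.
        return proxy_index, server_index + 1
    # That was the last server: clear all flags, go back to the first
    # server, and move on to the next proxy.
    server_errors.clear()
    return proxy_index + 1, 0
```

For example, with two servers, an error on the first server keeps the proxy and moves to the second server, while an error on the second server moves to the next proxy with the first server.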
Line: 47 to 47
 Frontier clients prior to version 2.8.2 (June 2011) are not documented on this page.

For frontier clients 2.8.9 (January 2014) through 2.8.14, these are the differences compared to 2.8.15 (February 2016) above:

Changed:
<
<
  • IP families were not considered separate groups (in fact IPv6 was not supported at all before 2.8.14)
  • 404 was considered a protocol error, not a server error
>
>
  • Different IP families in the same DNS name were not considered separate groups (in fact IPv6 was not supported at all before 2.8.14).
  • 404 was considered a protocol error, not a server error.
  • Connection refused and network unreachable were not considered connect errors, only connection timeout was.
 
  • There was a bug present since 2.8.6 that considered a whole proxy group as failed when the last one failed, even if previous ones hadn't failed. That could happen after load balancing within a proxy group.
Changed:
<
<
  • There was a bug where if a server did not send an Age header (because it was a fresh query) and the previous query on the same connection had an old enough Age, a max-age exceeded condition could be erroneously triggered; the Age from a previous query was applied to the next one.
>
>
  • There was a bug present since 2.8.9 where if a server did not send an Age header (because it was a fresh query) and the previous query on the same connection had an old enough Age, a max-age exceeded condition could be erroneously triggered; the Age from the previous query was applied to the one without an Age.
  For frontier clients version 2.8.6 (April 2013) through 2.8.8, there are these differences compared to the 2.8.9 strategies above:
  • There was no concept of a max-age exceeded condition.

Revision 25 2016-02-02 - DaveDykstra

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Frontier client retry strategy

Changed:
<
<
This section only describes frontier client version 2.8.9 (January 2014) and later. See below for the differences in older client versions. For comparison you could see the CVMFS network path selection description.
>
>
This section only describes frontier client version 2.8.15 (February 2016) and later. See below for the differences in older client versions. For comparison you could see the CVMFS network path selection description.
  Definitions:
Changed:
<
<
  • proxy group - either all the IP addresses in a round-robin name in a proxyurl or a backupproxyurl, or all the proxyurls together if loadbalance=proxies is set.
>
>
  • proxy group - either all the IP addresses of one IP family (IPv4 or IPv6) in a round-robin DNS name in a proxyurl or a backupproxyurl, or all the proxyurls together if loadbalance=proxies is set.
  If any connections are already established and working, the frontier_client will continue to reuse that connection for subsequent queries until one of them returns an error or a response is larger than 16KB. Squid also drops connections from its end after they have been idle for 1 minute. When connections are dropped in one of these ways, the next query will attempt to reconnect to a proxy within the same proxy group that had not previously generated an error. Using a different proxy in the proxy group when possible is done to improve load balancing. The server URL will be the same unless loadbalance=servers is set, in which case another random server that hasn't had a server error will be chosen.
Line: 13 to 13
  Since there are usually many combinations of proxies and servers to try, it is important that the client does not spend much time on any one combination. For that reason, the default connect timeout time is 5 seconds and the default read timeout time is 10 seconds.
Added:
>
>
Order of proxy groups:
  • In general, proxy groups are tried in the order they appear in the configuration. The exceptions are:
    • if loadbalance=proxies is set, the proxyurls are randomized
    • backupproxyurls always come after proxyurls
  • If there are both IPv4 and IPv6 addresses in the same DNS name (either proxyurl or backupproxyurl), the IP family chosen by the preferipfamily option (default 4) will be the first group and the other family will be the next group. If preferipfamily=0, the family of the first address returned by the operating system will be the preferred family.
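
The IP family ordering described above can be sketched as follows. This is an illustration under stated assumptions rather than the client's code: the function name `split_into_family_groups` is hypothetical, the addresses are passed in the order the operating system returned them, and preferipfamily=6 is assumed to be a valid setting alongside the 4 and 0 values mentioned above.

```python
import ipaddress


def split_into_family_groups(addresses, preferipfamily=4):
    """Split the addresses of one DNS name into proxy groups, one per IP family.

    The preferred family's group comes first. With preferipfamily=0 the
    family of the first address in the list is preferred.
    """
    if preferipfamily not in (0, 4, 6):
        raise ValueError("preferipfamily must be 0, 4 or 6")
    if not addresses:
        return []
    by_family = {4: [], 6: []}
    for text in addresses:
        by_family[ipaddress.ip_address(text).version].append(text)
    if preferipfamily == 0:
        preferred = ipaddress.ip_address(addresses[0]).version
    else:
        preferred = preferipfamily
    other = 6 if preferred == 4 else 4
    # Drop a family's group entirely if the name has no address of that family.
    return [group for group in (by_family[preferred], by_family[other]) if group]
```

The example addresses in any usage of this sketch should come from documentation ranges such as 192.0.2.0/24 and 2001:db8::/32.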
 The frontier client error retry strategy distinguishes between five types of non-fatal errors:
  1. A cache max-age exceeded condition -- the server specified a maximum cache age in its response, and the received "Age" http header (inserted by the squid proxy) is greater than that. The server sets this maximum age near the end of its response when it is too late to put it in the http header, which happens when a condition to limit the max age is found after the http header has already been sent by the server. The http header is sent either when a buffer full of data is ready or when a "keepalive" has to be sent because the server has been waiting for more than 5 seconds for that client's request to the database. A condition that the server might notice at that point is either some kind of database error or an empty result. If the server doesn't set the maximum cache age but it is a protocol error (see below), then the client uses a default maximum cache age of 5 minutes (this is to support servlet versions older than 3.33).
Changed:
<
<
  1. A server error -- a proxy or server responded but there was either an http error code from the server where it knew it had a problem, or the proxy immediately determined a server problem and returned an error code indicating so. In other words, there was an http error code in the 500 series. The result will not be cached.
>
>
  1. A server error -- a proxy or server responded but there was either an http error code from the server where it knew it had a problem, or the proxy immediately determined a server problem and returned an error code indicating so. In other words, there was an http error code was either 404 or in the 500 series. The result will not be cached.
 
  1. A connect timeout - the socket couldn't be connected, meaning that a proxy or direct-connect server is down.
Changed:
<
<
  1. A protocol error - either the response http code was not 200 OK, the code was not in the 500 server-error series, the response couldn't be completely parsed as expected, or the server signaled at the end of its response that it encountered a late error and should not be cached for a long time (even though the http headers said it was OK and could be cached).
>
>
  1. A protocol error - either the response http code was not 200 OK, the code was not 404 or in the 500 server-error series, the response couldn't be completely parsed as expected, or the server signaled at the end of its response that it encountered a late error and should not be cached for a long time (even though the http headers said it was OK and could be cached).
 
  1. All other types of errors, for example networking problems, read timeouts because of overloading or non-response from a server, etc.

These are the retry strategies:

Line: 34 to 40
 
  • PROXY1+SERVER2: server error, no more servers so go back to the beginning of the server list and try next proxy
To avoid that situation, if the server list has been set back to its beginning by a server error while in a given proxy group since the last 5-minute full reset, do not try to start again at the beginning of that proxy group even if the strategy calls for it.
Added:
>
>
When there are no proxies in use either because they weren't configured or they all failed and the client directly connects to the server, the host names listed in serverurls also get resolved in the DNS like proxies, and most of the same load balancing and retry logic is applied to them directly where it makes sense; the same code is reused.
 

Older client version differences

Frontier clients prior to version 2.8.2 (June 2011) are not documented on this page.

Added:
>
>
For frontier clients 2.8.9 (January 2014) through 2.8.14, these are the differences compared to 2.8.15 (February 2016) above:
  • IP families were not considered separate groups (in fact IPv6 was not supported at all before 2.8.14)
  • 404 was considered a protocol error, not a server error
  • There was a bug present since 2.8.6 that considered a whole proxy group as failed when the last one failed, even if previous ones hadn't failed. That could happen after load balancing within a proxy group.
  • There was a bug where if a server did not send an Age header (because it was a fresh query) and the previous query on the same connection had an old enough Age, a max-age exceeded condition could be erroneously triggered; the Age from a previous query was applied to the next one.
 For frontier clients version 2.8.6 (April 2013) through 2.8.8, there are these differences compared to the 2.8.9 strategies above:
  • There was no concept of a max-age exceeded condition.
  • For a protocol error, the request would be immediately retried first with a soft retry (Cache-control: max-age=0) before the hard retry.

Revision 242015-11-23 - DaveDykstra

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Frontier client retry strategy

Line: 7 to 7
 Definitions:
  • proxy group - either all the IP addresses in a round-robin name in a proxyurl or a backupproxyurl, or all the proxyurls together if loadbalance=proxies is set.
Changed:
<
<
If any connections are already established and working, the frontier_client will continue to reuse that connection for subsequent queries until one of them returns an error or a response is larger than 16KB. Squid also drops connections from its end after they have been idle for 1 minute. When connections are dropped in one of these ways, the next query will attempt to reconnect to a proxy within the same proxy group that had not previously generated an error; using a different proxy in the proxy group when possible is done to improve load balancing. The server URL will be the same unless loadbalance=servers is set, in which case another random server that hasn't had a server error will be chosen.
>
>
If any connections are already established and working, the frontier_client will continue to reuse that connection for subsequent queries until one of them returns an error or a response is larger than 16KB. Squid also drops connections from its end after they have been idle for 1 minute. When connections are dropped in one of these ways, the next query will attempt to reconnect to a proxy within the same proxy group that had not previously generated an error. Using a different proxy in the proxy group when possible is done to improve load balancing. The server URL will be the same unless loadbalance=servers is set, in which case another random server that hasn't had a server error will be chosen.
  When any query needs to be done and more than 5 minutes have elapsed since the first connection, the connection is also dropped, the proxy list is reset to the beginning, all cached DNS names of proxies are considered to be invalid so they will need to be looked up again the next time they are referenced, all proxy errors are cleared so they will be retried, and the starting time of the first connection is reset so the same process can be done 5 minutes later. This is all in case a proxy that was having a problem had since been fixed. If it has been more than 30 minutes since the first connection (using a separate starting time variable) then the server list is also reset, server errors are cleared, and that starting time is reset. The reason for the longer time for servers is because of some of the details below (servers are tried with every proxy in a group, and read timeouts are longer, so for example a dead server tried with 4 proxies can take 40 seconds, which is a significant percentage of 5 minutes).
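
The two reset intervals described above can be sketched as a pair of independent timers. This is only an illustration under stated assumptions: the class name, method name, and returned labels are hypothetical, and the real client also drops the connection and invalidates cached DNS names when the proxy timer fires, which is omitted here.

```python
PROXY_RESET_SECS = 5 * 60    # proxy list, proxy errors, cached DNS names
SERVER_RESET_SECS = 30 * 60  # server list and server errors


class ResetTimers:
    """Hypothetical sketch of the two separate starting-time variables."""

    def __init__(self, now: float) -> None:
        self.proxy_start = now
        self.server_start = now

    def due_resets(self, now: float) -> list:
        """Return which lists should be reset before the next query."""
        resets = []
        if now - self.proxy_start > PROXY_RESET_SECS:
            resets.append("proxies")
            self.proxy_start = now  # so the check repeats 5 minutes later
        if now - self.server_start > SERVER_RESET_SECS:
            resets.append("servers")
            self.server_start = now
        return resets
```

Keeping the two starting times separate is what lets server errors persist across several proxy resets.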

Revision 232014-11-05 - DaveDykstra

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Frontier client retry strategy

Line: 24 to 24
 
  1. If there is a max-age exceeded condition, retry the same server and proxy with the specified maximum cache age N seconds (Cache-control: max-age=N). If the condition persists after the retry (should only happen on buggy versions of squid), ignore it and move on to the next strategy.
  2. If a server error happens, keep the same proxy and try the next server URL. This server is flagged as having had an error so it is not tried again until after a 30 minute reset (this is mainly relevant when loadbalance=servers). However, if this happens on the last server, clear all the server error flags, go back to the beginning of the server list and try the next proxy with the first server (in case a proxy was somehow returning a server error).
  3. If a connect timeout to a proxy happens, try the next proxy, including within the same proxy group if there are any left that hadn't had an error, with the same server URL. This proxy is flagged as having had an error so it is not tried again until after a 5 minute reset or after there are no more proxies.
Changed:
<
<
  1. If a protocol error happens, with the same proxy & server retry the request with a hard refresh (Pragma: no-cache). (The hard refresh is needed to clear the cache in cases that a soft refresh doesn't clear because If-Modified-Since revalidation.) If that still returns a non-server error, try the next proxy with the same server as long as the proxy is in the same proxy group. If the end of the proxy group is reached, start again at the beginning of the proxy group and try the next server until there are no more servers. If there are no more servers, clear any server error flags, go back again to the beginning of the server list, and move to the next proxy group.
>
>
  1. If a protocol error happens and the soft retry of step 1 has already happened (which means there was a max-age exceeded condition), with the same proxy & server retry the request with a hard refresh (Pragma: no-cache). (The hard refresh is needed to clear the cache in cases that a soft refresh doesn't clear because If-Modified-Since revalidation.) If that still returns a non-server error, try the next proxy with the same server as long as the proxy is in the same proxy group. If the end of the proxy group is reached, start again at the beginning of the proxy group and try the next server until there are no more servers. If there are no more servers, clear any server error flags, go back again to the beginning of the server list, and move to the next proxy group.
 
  1. For all other errors, do the same as a protocol error except without the hard refresh.
  2. When there are no more proxies to try, unless failovertoserver=no is set (and backupproxyurl implies that), try directly connecting to all servers including all the addresses in each round-robin DNS name. Note: this is the only time that all the addresses in a round-robin DNS server name are used, since when going through a proxy only the server URL is used, not individual IP addresses. So if it is important to guarantee that all servers be tried through proxies, the individual server names need to be included in the list of serverurls (typically after the round-robin name).
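
The soft and hard retries mentioned in the strategies above differ only in the request header that is added. A minimal sketch, assuming a hypothetical helper name and a plain dictionary for headers (neither is part of the frontier client API):

```python
from typing import Dict, Optional


def retry_headers(kind: str, max_age: Optional[int] = None) -> Dict[str, str]:
    """Return the extra request header for a soft or hard retry.

    Hypothetical helper for illustration; header values follow the text above.
    """
    if kind == "soft":
        if max_age is None:
            raise ValueError("a soft retry needs a maximum cache age")
        # Soft retry: ask caches to revalidate anything older than max_age.
        return {"Cache-control": f"max-age={max_age}"}
    if kind == "hard":
        # Hard refresh: bypass the cache even when If-Modified-Since
        # revalidation would otherwise keep the cached copy.
        return {"Pragma": "no-cache"}
    raise ValueError(f"unknown retry kind: {kind}")
```

A max-age exceeded condition with a server-specified limit of N seconds would use `retry_headers("soft", N)`, and a subsequent protocol error would use `retry_headers("hard")`.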

Revision 222014-08-05 - DaveDykstra

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Frontier client retry strategy

Line: 38 to 38
  Frontier clients prior to version 2.8.2 (June 2011) are not documented on this page.
Changed:
<
<
For frontier clients version 2.8.6 (April 2013) through 2.8.8, there is one difference compared to the 2.8.9 strategies above:
>
>
For frontier clients version 2.8.6 (April 2013) through 2.8.8, there are these differences compared to the 2.8.9 strategies above:
 
  • There was no concept of a max-age exceeded condition.
Changed:
<
<
  • For a protocol error, the request would be immediately be retried first with a soft retry (Cache-control: max-age=0) before the hard retry.
>
>
  • For a protocol error, the request would be immediately retried first with a soft retry (Cache-control: max-age=0) before the hard retry.
  For frontier client versions 2.8.2 through 2.8.5, these are the additional differences:

Revision 212014-01-02 - DaveDykstra

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Frontier client retry strategy

Changed:
<
<
This section only describes frontier client version 2.8.9 (December 2013) and later. See below for the differences in older client versions. For comparison you could see the CVMFS network path selection description.
>
>
This section only describes frontier client version 2.8.9 (January 2014) and later. See below for the differences in older client versions. For comparison you could see the CVMFS network path selection description.
  Definitions:
  • proxy group - either all the IP addresses in a round-robin name in a proxyurl or a backupproxyurl, or all the proxyurls together if loadbalance=proxies is set.

Revision 202013-12-11 - DaveDykstra

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Frontier client retry strategy

Line: 14 to 14
 Since there are usually many combinations of proxies and servers to try, it is important that the client does not spend much time on any one combination. For that reason, the default connect timeout time is 5 seconds and the default read timeout time is 10 seconds.

The frontier client error retry strategy distinguishes between five types of non-fatal errors:

Changed:
<
<
  1. A cache max-age exceeded condition -- the server specified a maximum cache age in its response, and the received "Age" http header (inserted by the squid proxy) is greater than that. The server sets this maximum age near the end of its response when it is too late to put it in the http header, which happens when a condition to limit the max age is found after the http header has already been sent by the server. The http header is sent either when a buffer full of data is ready or when a "keepalive" has to be sent because the server has been waiting for more than 5 seconds for the database. A condition that the server might notice at that point is either some kind of database error or an empty result. If the server doesn't set the maximum cache age but it is a protocol error (see below), then the client uses a default maximum cache age of 5 minutes (this is to support servlet versions older than 3.33).
>
>
  1. A cache max-age exceeded condition -- the server specified a maximum cache age in its response, and the received "Age" http header (inserted by the squid proxy) is greater than that. The server sets this maximum age near the end of its response when it is too late to put it in the http header, which happens when a condition to limit the max age is found after the http header has already been sent by the server. The http header is sent either when a buffer full of data is ready or when a "keepalive" has to be sent because the server has been waiting for more than 5 seconds for that client's request to the database. A condition that the server might notice at that point is either some kind of database error or an empty result. If the server doesn't set the maximum cache age but it is a protocol error (see below), then the client uses a default maximum cache age of 5 minutes (this is to support servlet versions older than 3.33).
 
  1. A server error -- a proxy or server responded but there was either an http error code from the server where it knew it had a problem, or the proxy immediately determined a server problem and returned an error code indicating so. In other words, there was an http error code in the 500 series. The result will not be cached.
  2. A connect timeout - the socket couldn't be connected, meaning that a proxy or direct-connect server is down.
  3. A protocol error - either the response http code was not 200 OK, the code was not in the 500 server-error series, the response couldn't be completely parsed as expected, or the server signaled at the end of its response that it encountered a late error and should not be cached for a long time (even though the http headers said it was OK and could be cached).
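
The max-age exceeded condition in the first item can be expressed as a simple comparison. The sketch below is an assumption-laden illustration: the function and parameter names are hypothetical, and ages are taken to be already parsed into seconds.

```python
from typing import Optional

DEFAULT_MAX_AGE_SECS = 5 * 60  # default used for protocol errors, per the text


def max_age_exceeded(age: Optional[int],
                     server_max_age: Optional[int],
                     protocol_error: bool = False) -> bool:
    """Return True if the received Age exceeds the applicable maximum.

    Hypothetical helper: `age` is the Age header value in seconds (None if
    absent), `server_max_age` is the limit the server set late in its
    response (None if it set none).
    """
    if age is None:
        return False  # no Age header, so nothing to compare
    limit = server_max_age
    if limit is None:
        if not protocol_error:
            return False  # no limit applies
        limit = DEFAULT_MAX_AGE_SECS
    return age > limit
```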

Revision 192013-12-11 - DaveDykstra

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Frontier client retry strategy

Line: 14 to 14
 Since there are usually many combinations of proxies and servers to try, it is important that the client does not spend much time on any one combination. For that reason, the default connect timeout time is 5 seconds and the default read timeout time is 10 seconds.

The frontier client error retry strategy distinguishes between five types of non-fatal errors:

Changed:
<
<
  1. A cache max-age exceeded condition -- the server specified a maximum cache age in its response, and the received "Age" http header (inserted by the squid proxy) is greater than that. The server sets this maximum age near the end of its response when it is too late to put it in the http header, which happens when a condition to limit the age is found after the http header has already been sent by the server. The http header is sent either when a buffer full of data is ready or when a "keepalive" has to be sent because the server has been waiting for more than 5 seconds for the database. A condition that the server can notice that late is either some kind of database error or an empty result. If the server doesn't set the maximum cache age but it is a protocol error (see below), then the client uses a default maximum cache age of 5 minutes (this is to support servlet versions older than 3.33).
>
>
  1. A cache max-age exceeded condition -- the server specified a maximum cache age in its response, and the received "Age" http header (inserted by the squid proxy) is greater than that. The server sets this maximum age near the end of its response when it is too late to put it in the http header, which happens when a condition to limit the max age is found after the http header has already been sent by the server. The http header is sent either when a buffer full of data is ready or when a "keepalive" has to be sent because the server has been waiting for more than 5 seconds for the database. A condition that the server might notice at that point is either some kind of database error or an empty result. If the server doesn't set the maximum cache age but it is a protocol error (see below), then the client uses a default maximum cache age of 5 minutes (this is to support servlet versions older than 3.33).
 
  1. A server error -- a proxy or server responded but there was either an http error code from the server where it knew it had a problem, or the proxy immediately determined a server problem and returned an error code indicating so. In other words, there was an http error code in the 500 series. The result will not be cached.
  2. A connect timeout - the socket couldn't be connected, meaning that a proxy or direct-connect server is down.
  3. A protocol error - either the response http code was not 200 OK, the code was not in the 500 server-error series, the response couldn't be completely parsed as expected, or the server signaled at the end of its response that it encountered a late error and should not be cached for a long time (even though the http headers said it was OK and could be cached).

Revision 182013-12-11 - DaveDykstra

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Frontier client retry strategy

Line: 13 to 13
  Since there are usually many combinations of proxies and servers to try, it is important that the client does not spend much time on any one combination. For that reason, the default connect timeout time is 5 seconds and the default read timeout time is 10 seconds.
Changed:
<
<
The frontier client error retry strategy distinguishes between four types of non-fatal errors:
>
>
The frontier client error retry strategy distinguishes between five types of non-fatal errors:
  1. A cache max-age exceeded condition -- the server specified a maximum cache age in its response, and the received "Age" http header (inserted by the squid proxy) is greater than that. The server sets this maximum age near the end of its response when it is too late to put it in the http header, which happens when a condition to limit the age is found after the http header has already been sent by the server. The http header is sent either when a buffer full of data is ready or when a "keepalive" has to be sent because the server has been waiting for more than 5 seconds for the database. A condition that the server can notice that late is either some kind of database error or an empty result. If the server doesn't set the maximum cache age but it is a protocol error (see below), then the client uses a default maximum cache age of 5 minutes (this is to support servlet versions older than 3.33).
 
  1. A server error -- a proxy or server responded but there was either an http error code from the server where it knew it had a problem, or the proxy immediately determined a server problem and returned an error code indicating so. In other words, there was an http error code in the 500 series. The result will not be cached.
  2. A connect timeout - the socket couldn't be connected, meaning that a proxy or direct-connect server is down.
Changed:
<
<
  1. A protocol error - either the response http code was not 200 OK or in the 500 server-error series, the response couldn't be completely parsed as expected, or the server signaled at the end of its transmission that its response had an error and should not be cached (even though the http headers said it was OK and could be cached).
>
>
  1. A protocol error - either the response http code was not 200 OK, the code was not in the 500 server-error series, the response couldn't be completely parsed as expected, or the server signaled at the end of its response that it encountered a late error and should not be cached for a long time (even though the http headers said it was OK and could be cached).
 
  1. All other types of errors, for example networking problems, read timeouts because of overloading or non-response from a server, etc.

These are the retry strategies:

Added:
>
>
  1. If there is a max-age exceeded condition, retry the same server and proxy with the specified maximum cache age N seconds (Cache-control: max-age=N). If the condition persists after the retry (should only happen on buggy versions of squid), ignore it and move on to the next strategy.
 
  1. If a server error happens, keep the same proxy and try the next server URL. This server is flagged as having had an error so it is not tried again until after a 30 minute reset (this is mainly relevant when loadbalance=servers). However, if this happens on the last server, clear all the server error flags, go back to the beginning of the server list and try the next proxy with the first server (in case a proxy was somehow returning a server error).
  2. If a connect timeout to a proxy happens, try the next proxy, including within the same proxy group if there are any left that hadn't had an error, with the same server URL. This proxy is flagged as having had an error so it is not tried again until after a 5 minute reset or after there are no more proxies.
Changed:
<
<
  1. If a protocol error happens, with the same proxy & server retry the request with a maximum cache age of 5 minutes (Cache-control: max-age=300). If that still returns a non-server error, try the next proxy with the same server as long as the proxy is in the same proxy group. If the end of the proxy group is reached, start again at the beginning of the proxy group and try the next server until there are no more servers. If there are no more servers, clear any server error flags, go back again to the beginning of the server list, and move to the next proxy group.
  2. For all other errors, do the same as a protocol error except without the retry with limited cache age.
>
>
  1. If a protocol error happens, with the same proxy & server retry the request with a hard refresh (Pragma: no-cache). (The hard refresh is needed to clear the cache in cases that a soft refresh doesn't clear because If-Modified-Since revalidation.) If that still returns a non-server error, try the next proxy with the same server as long as the proxy is in the same proxy group. If the end of the proxy group is reached, start again at the beginning of the proxy group and try the next server until there are no more servers. If there are no more servers, clear any server error flags, go back again to the beginning of the server list, and move to the next proxy group.
  2. For all other errors, do the same as a protocol error except without the hard refresh.
 
  1. When there are no more proxies to try, unless failovertoserver=no is set (and backupproxyurl implies that), try directly connecting to all servers including all the addresses in each round-robin DNS name. Note: this is the only time that all the addresses in a round-robin DNS server name are used, since when going through a proxy only the server URL is used, not individual IP addresses. So if it is important to guarantee that all servers be tried through proxies, the individual server names need to be included in the list of serverurls (typically after the round-robin name).

Since the first strategy resets the server list while others reset to the beginning of a proxy group, and since each try can independently return its own error, it is possible to alternate between the strategies and never terminate nor make progress. For example, if there are two proxies in the same group PROXY1 and PROXY2 and two servers SERVER1 and SERVER2, and SERVER1 is totally down (which translates to a read timeout, the last type of error) while SERVER2 always returns a server error, then the last two steps will keep repeating:

Line: 37 to 39
 Frontier clients prior to version 2.8.2 (June 2011) are not documented on this page.

For frontier clients version 2.8.6 (April 2013) through 2.8.8, there is one difference compared to the 2.8.9 strategies above:

Changed:
<
<
  • For a protocol error, the request would be immediately retried first with a soft retry (Cache-control: max-age=0) and if it still happened then the request would be tried again with a hard retry (Pragma: no-cache).
>
>
  • There was no concept of a max-age exceeded condition.
  • For a protocol error, the request would be immediately be retried first with a soft retry (Cache-control: max-age=0) before the hard retry.
  For frontier client versions 2.8.2 through 2.8.5, these are the additional differences:

Revision 172013-12-02 - DaveDykstra

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Frontier client retry strategy

Changed:
<
<
This section only describes frontier client version 2.8.9 (November 2013) and later. See below for the differences in older client versions. For comparison you could see the CVMFS network path selection description.
>
>
This section only describes frontier client version 2.8.9 (December 2013) and later. See below for the differences in older client versions. For comparison you could see the CVMFS network path selection description.
  Definitions:
  • proxy group - either all the IP addresses in a round-robin name in a proxyurl or a backupproxyurl, or all the proxyurls together if loadbalance=proxies is set.
Line: 39 to 39
 For frontier clients version 2.8.6 (April 2013) through 2.8.8, there is one difference compared to the 2.8.9 strategies above:
  • For a protocol error, the request would be immediately retried first with a soft retry (Cache-control: max-age=0) and if it still happened then the request would be tried again with a hard retry (Pragma: no-cache).
Changed:
<
<
For frontier client versions 2.8.2 through 2.8.5, these are these additional differences:
>
>
For frontier client versions 2.8.2 through 2.8.5, these are the additional differences:
 
  1. The entire proxy (and server) list and DNS name lookups were not reset at once after 5 (and 30) minutes. Instead, if any DNS name of a proxy or server (including round-robin DNS names) had not been looked up in at least 5 minutes and was attempted to be used again, that name was re-looked up in the DNS and all errors on the addresses in that name were considered cleared. Also, on every new connection it would go back to the beginning of the proxy & server list and look for one of each that hadn't been implicated in an error. The net result for proxies is not a lot different, other than being harder to explain. The net result for server errors is different because when going through proxies, server names are not looked up in the DNS so those errors never got cleared.
  2. A proxy connect timeout was not treated any differently from other network errors.

Revision 162013-11-13 - DaveDykstra

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Frontier client retry strategy

Changed:
<
<
This section only describes frontier client version 2.8.6 (April 2013) and later. See below for the differences in older client versions. For comparison you could see the CVMFS network path selection description.
>
>
This section only describes frontier client version 2.8.9 (November 2013) and later. See below for the differences in older client versions. For comparison you could see the CVMFS network path selection description.
  Definitions:
  • proxy group - either all the IP addresses in a round-robin name in a proxyurl or a backupproxyurl, or all the proxyurls together if loadbalance=proxies is set.
Line: 22 to 22
 These are the retry strategies:
  1. If a server error happens, keep the same proxy and try the next server URL. This server is flagged as having had an error so it is not tried again until after a 30 minute reset (this is mainly relevant when loadbalance=servers). However, if this happens on the last server, clear all the server error flags, go back to the beginning of the server list and try the next proxy with the first server (in case a proxy was somehow returning a server error).
  2. If a connect timeout to a proxy happens, try the next proxy, including within the same proxy group if there are any left that hadn't had an error, with the same server URL. This proxy is flagged as having had an error so it is not tried again until after a 5 minute reset or after there are no more proxies.
Changed:
<
<
  1. If a protocol error happens, with the same proxy & server try a soft retry (Cache-control: max-age=0) and if it still happens try a hard retry (Pragma: no-cache). If those two still return a non-server error, try the next proxy with the same server as long as the proxy is in the same proxy group. If the end of the proxy group is reached, start again at the beginning of the proxy group and try the next server until there are no more servers. If there are no more servers, clear any server error flags, go back again to the beginning of the server list, and move to the next proxy group.
  2. For all other errors, do the same as a protocol error except without the soft & hard retries.
>
>
  1. If a protocol error happens, with the same proxy & server retry the request with a maximum cache age of 5 minutes (Cache-control: max-age=300). If that still returns a non-server error, try the next proxy with the same server as long as the proxy is in the same proxy group. If the end of the proxy group is reached, start again at the beginning of the proxy group and try the next server until there are no more servers. If there are no more servers, clear any server error flags, go back again to the beginning of the server list, and move to the next proxy group.
  2. For all other errors, do the same as a protocol error except without the retry with limited cache age.
 
  1. When there are no more proxies to try, unless failovertoserver=no is set (and backupproxyurl implies that), try directly connecting to all servers including all the addresses in each round-robin DNS name. Note: this is the only time that all the addresses in a round-robin DNS server name are used, since when going through a proxy only the server URL is used, not individual IP addresses. So if it is important to guarantee that all servers be tried through proxies, the individual server names need to be included in the list of serverurls (typically after the round-robin name).

Since the first strategy resets the server list while others reset to the beginning of a proxy group, and since each try can independently return its own error, it is possible to alternate between the strategies and never terminate nor make progress. For example, if there are two proxies in the same group PROXY1 and PROXY2 and two servers SERVER1 and SERVER2, and SERVER1 is totally down (which translates to a read timeout, the last type of error) while SERVER2 always returns a server error, then the last two steps will keep repeating:

Line: 36 to 36
  Frontier clients prior to version 2.8.2 (June 2011) are not documented on this page.
Changed:
<
<
For frontier client versions 2.8.2 through 2.8.5, these are the differences compared to version 2.8.6 strategies above:
>
>
For frontier clients version 2.8.6 (April 2013) through 2.8.8, there is one difference compared to the 2.8.9 strategies above:
  • For a protocol error, the request would be immediately retried first with a soft retry (Cache-control: max-age=0) and if it still happened then the request would be tried again with a hard retry (Pragma: no-cache).

For frontier client versions 2.8.2 through 2.8.5, these are these additional differences:

 
  1. The entire proxy (and server) list and DNS name lookups were not reset at once after 5 (and 30) minutes. Instead, if any DNS name of a proxy or server (including round-robin DNS names) had not been looked up in at least 5 minutes and was attempted to be used again, that name was re-looked up in the DNS and all errors on the addresses in that name were considered cleared. Also, on every new connection it would go back to the beginning of the proxy & server list and look for one of each that hadn't been implicated in an error. The net result for proxies is not a lot different, other than being harder to explain. The net result for server errors is different because when going through proxies, server names are not looked up in the DNS so those errors never got cleared.
  2. A proxy connect timeout was not treated any differently from other network errors.

Revision 152013-07-03 - DaveDykstra

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Frontier client retry strategy

Changed:
<
<
This section only describes frontier client version 2.8.6 (April 2013) and later. See below for the differences in older client versions.
>
>
This section only describes frontier client version 2.8.6 (April 2013) and later. See below for the differences in older client versions. For comparison you could see the CVMFS network path selection description.
  Definitions:
  • proxy group - either all the IP addresses in a round-robin name in a proxyurl or a backupproxyurl, or all the proxyurls together if loadbalance=proxies is set.

Revision 142013-05-20 - DaveDykstra

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Frontier client retry strategy

Line: 40 to 40
 
  1. The entire proxy (and server) list and DNS name lookups were not reset at once after 5 (and 30) minutes. Instead, if any DNS name of a proxy or server (including round-robin DNS names) had not been looked up in at least 5 minutes and was attempted to be used again, that name was re-looked up in the DNS and all errors on the addresses in that name were considered cleared. Also, on every new connection it would go back to the beginning of the proxy & server list and look for one of each that hadn't been implicated in an error. The net result for proxies is not a lot different, other than being harder to explain. The net result for server errors is different because when going through proxies, server names are not looked up in the DNS so those errors never got cleared.
  2. A proxy connect timeout was not treated any differently from other network errors.
Changed:
<
<
  1. For non-server errors, the next proxy in the list was always tried using the same server until running out of proxies, then the proxy list was reset and the next server was used. The different addresses in the same proxy round-robin DNS name were not all tried; if one of them had an error, it would move on to the next proxy group. (On the other hand, with loadbalance=proxies all the listed proxyurls were tried, as is still the case).
>
>
  1. For non-server errors, the next proxy in the list was always tried using the same server until running out of proxies, then the proxy list was reset and the next server was used. The different addresses in the same proxy round-robin DNS name were not all tried; if one of them had an error, it would move on to the next proxy. (On the other hand, with loadbalance=proxies all the listed proxyurls were tried, as is still the case).
 
  1. For direct connections to servers, similar to proxies, only one address in each round-robin DNS name was tried when there were errors.

Revision 132013-04-24 - DaveDykstra

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Frontier client retry strategy

Line: 20 to 20
 
  1. All other types of errors, for example networking problems, read timeouts because of overloading or non-response from a server, etc.

These are the retry strategies:

Changed:
<
<
  1. If a server error happens, keep the same proxy and try the next server name. If this happens on the last server, go back to the beginning of the server list and try the next proxy with the first server (in case a proxy was somehow returning a server error).
  2. If a connect timeout to a proxy happens, try the next proxy, including within the same proxy group if there are any left that hadn't had an error, with the same server URL. This proxy is flagged as having had an error so it is not tried again until after a 5 minute reset.
  3. If a protocol error happens, with the same proxy & server try a soft retry (Cache-control: max-age=0) and if it still happens try a hard retry (Pragma: no-cache). If those two still return a non-server error, try the next proxy with the same server as long as the proxy is in the same proxy group. If the end of the proxy group is reached, start again at the beginning of the proxy group and try the next server until there are no more servers, and then go back again to the beginning of the server list and move to the next proxy group.
>
>
  1. If a server error happens, keep the same proxy and try the next server URL. This server is flagged as having had an error so it is not tried again until after a 30 minute reset (this is mainly relevant when loadbalance=servers). However, if this happens on the last server, clear all the server error flags, go back to the beginning of the server list and try the next proxy with the first server (in case a proxy was somehow returning a server error).
  2. If a connect timeout to a proxy happens, try the next proxy, including within the same proxy group if there are any left that hadn't had an error, with the same server URL. This proxy is flagged as having had an error so it is not tried again until after a 5 minute reset or after there are no more proxies.
  3. If a protocol error happens, with the same proxy & server try a soft retry (Cache-control: max-age=0) and if it still happens try a hard retry (Pragma: no-cache). If those two still return a non-server error, try the next proxy with the same server as long as the proxy is in the same proxy group. If the end of the proxy group is reached, start again at the beginning of the proxy group and try the next server until there are no more servers. If there are no more servers, clear any server error flags, go back again to the beginning of the server list, and move to the next proxy group.
 
  1. For all other errors, do the same as a protocol error except without the soft & hard retries.
  2. When there are no more proxies to try, unless failovertoserver=no is set (and backupproxyurl implies that), try directly connecting to all servers including all the addresses in each round-robin DNS name. Note: this is the only time that all the addresses in a round-robin DNS server name are used, since when going through a proxy only the server URL is used, not individual IP addresses. So if it is important to guarantee that all servers be tried through proxies, the individual server names need to be included in the list of serverurls (typically after the round-robin name).
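
The order in which proxy and server combinations are visited for the "all other errors" strategy can be sketched as nested loops. This is only an illustration under simplifying assumptions: every attempt fails with a non-server, non-protocol error, no error flags or timed resets are modeled, and direct connections are shown once per server URL rather than once per round-robin address. The function name `attempt_order` and its parameters are hypothetical and are not part of the frontier client.

```python
def attempt_order(proxy_groups, servers, failover_to_server=True):
    """Yield (proxy, server) pairs in the order they would be tried.

    proxy_groups: list of proxy groups, each a list of proxy names.
    servers: list of server URLs.
    A proxy of None stands for a direct connection to the server.
    """
    for group in proxy_groups:
        # Within one proxy group, each server is tried with every proxy
        # before moving on to the next server.
        for server in servers:
            for proxy in group:
                yield (proxy, server)
        # No more servers: move on to the next proxy group.
    if failover_to_server:
        # Out of proxies: fall back to direct server connections.
        for server in servers:
            yield (None, server)


if __name__ == "__main__":
    groups = [["PROXY1", "PROXY2"], ["BACKUP1"]]
    for pair in attempt_order(groups, ["SERVER1", "SERVER2"]):
        print(pair)
```

Passing `failover_to_server=False` corresponds to failovertoserver=no, in which case the direct connections at the end are skipped.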

Revision 122013-04-24 - DaveDykstra

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Frontier client retry strategy

Line: 20 to 20
 
  1. All other types of errors, for example networking problems, read timeouts because of overloading or non-response from a server, etc.

These are the retry strategies:

Changed:
<
<
  1. If a server error happens, keep the same proxy and try the next server name. If all servers are tried, go back to the beginning of the server list and try the next proxy with the first server.
  2. If a connect timeout to a proxy happens, try the next proxy, including within the same proxy group if there are any left that hadn't had an error, with the same server URL.
>
>
  1. If a server error happens, keep the same proxy and try the next server name. If this happens on the last server, go back to the beginning of the server list and try the next proxy with the first server (in case a proxy was somehow returning a server error).
  2. If a connect timeout to a proxy happens, try the next proxy, including within the same proxy group if there are any left that hadn't had an error, with the same server URL. This proxy is flagged as having had an error so it is not tried again until after a 5 minute reset.
 
  1. If a protocol error happens, with the same proxy & server try a soft retry (Cache-control: max-age=0) and if it still happens try a hard retry (Pragma: no-cache). If those two still return a non-server error, try the next proxy with the same server as long as the proxy is in the same proxy group. If the end of the proxy group is reached, start again at the beginning of the proxy group and try the next server until there are no more servers, and then go back again to the beginning of the server list and move to the next proxy group.
  2. For all other errors, do the same as a protocol error except without the soft & hard retries.
  3. When there are no more proxies to try, unless failovertoserver=no is set (and backupproxyurl implies that), try directly connecting to all servers including all the addresses in each round-robin DNS name. Note: this is the only time that all the addresses in a round-robin DNS server name are used, since when going through a proxy only the server URL is used, not individual IP addresses. So if it is important to guarantee that all servers be tried through proxies, the individual server names need to be included in the list of serverurls (typically after the round-robin name).

Revision 112013-04-23 - DaveDykstra

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Frontier client retry strategy

Line: 16 to 16
 The frontier client error retry strategy distinguishes between four types of non-fatal errors:
  1. A server error -- a proxy or server responded but there was either an http error code from the server where it knew it had a problem, or the proxy immediately determined a server problem and returned an error code indicating so. In other words, there was an http error code in the 500 series. The result will not be cached.
  2. A connect timeout - the socket couldn't be connected, meaning that a proxy or direct-connect server is down.
Changed:
<
<
  1. A protocol error - either the response couldn't be completely parsed as expected or the server signaled at the end of its transmission that its response had an error and should not be cached (even though the http headers said it was OK and could be cached).
>
>
  1. A protocol error - either the response http code was neither 200 OK nor in the 500 server-error series, the response couldn't be completely parsed as expected, or the server signaled at the end of its transmission that its response had an error and should not be cached (even though the http headers said it was OK and could be cached).
 
  1. All other types of errors, for example networking problems, read timeouts because of overloading or non-response from a server, etc.
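
As a rough illustration of how the http status code alone separates the first and third error types, here is a minimal sketch. The function name and the returned labels are hypothetical. Connect timeouts and other network errors never produce a status code, so they are outside this function, and a 200 response can still turn into a protocol error later if its body cannot be parsed or the server signals an error at the end of its transmission.

```python
def classify_http_status(status):
    """Map an http status code to the error type it implies.

    Returns "ok", "server error" or "protocol error".
    """
    if status == 200:
        # The headers say OK; the body may still reveal a protocol error.
        return "ok"
    if 500 <= status <= 599:
        # The 500 series indicates a server error.
        return "server error"
    # Any other code is treated as a protocol error.
    return "protocol error"
```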

These are the retry strategies:

Line: 24 to 24
 
  1. If a connect timeout to a proxy happens, try the next proxy, including within the same proxy group if there are any left that hadn't had an error, with the same server URL.
  2. If a protocol error happens, with the same proxy & server try a soft retry (Cache-control: max-age=0) and if it still happens try a hard retry (Pragma: no-cache). If those two still return a non-server error, try the next proxy with the same server as long as the proxy is in the same proxy group. If the end of the proxy group is reached, start again at the beginning of the proxy group and try the next server until there are no more servers, and then go back again to the beginning of the server list and move to the next proxy group.
  3. For all other errors, do the same as a protocol error except without the soft & hard retries.
Changed:
<
<
  1. When there are no more proxies to try, unless failovertoserver=no is set (and backupproxyurl implies that), try directly connecting to all servers including all the addresses in each round-robin DNS name. Note: this is the only time that all the addresses in a round-robin DNS server name are used, since when going through a proxy only the server URL is used, not individual IP addresses. So if it is important to guarantee that all servers be tried through proxies, the individual server names need to be listed (typically after the round-robin name).
>
>
  1. When there are no more proxies to try, unless failovertoserver=no is set (and backupproxyurl implies that), try directly connecting to all servers including all the addresses in each round-robin DNS name. Note: this is the only time that all the addresses in a round-robin DNS server name are used, since when going through a proxy only the server URL is used, not individual IP addresses. So if it is important to guarantee that all servers be tried through proxies, the individual server names need to be included in the list of serverurls (typically after the round-robin name).
  Since the first strategy resets the server list while others reset to the beginning of a proxy group, and since each try can independently return its own error, it is possible to alternate between the strategies and never terminate nor make progress. For example, if there are two proxies in the same group PROXY1 and PROXY2 and two servers SERVER1 and SERVER2, and SERVER1 is totally down (which translates to a read timeout, the last type of error) while SERVER2 always returns a server error, then the last two steps will keep repeating:
  • PROXY1+SERVER1: read timeout, advance proxy

Revision 102013-04-11 - DaveDykstra

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Frontier client retry strategy

Line: 9 to 9
  If any connections are already established and working, the frontier_client will continue to reuse that connection for subsequent queries until one of them returns an error or a response is larger than 16KB. Squid also drops connections from its end after they have been idle for 1 minute. When connections are dropped in one of these ways, the next query will attempt to reconnect to a proxy within the same proxy group that had not previously generated an error; using a different proxy in the proxy group when possible is done to improve load balancing. The server URL will be the same unless loadbalance=servers is set, in which case another random server that hasn't had a server error will be chosen.
Changed:
<
<
When any query needs to be done and more than 5 minutes has elapsed since the first connection, the connection is also dropped, the proxy list is reset to the beginning, all cached DNS names are considered to be invalid so they will need to be looked up again the next time they are referenced, all proxy errors are cleared so they will be retried, and the starting time of the first connection is reset so the same process can be done 5 minutes later. If it has been more than 30 minutes since the first connection (using a separate starting time variable) then the server list is also reset, server errors are cleared, and that starting time is reset. The reason for the longer time for servers is because of some of the details below (servers are tried with every proxy in a group, and read timeouts are longer, so for example a dead server tried with 4 proxies can take 40 seconds, which is a significant percentage of 5 minutes).
>
>
When any query needs to be done and more than 5 minutes has elapsed since the first connection, the connection is also dropped, the proxy list is reset to the beginning, all cached DNS names of proxies are considered to be invalid so they will need to be looked up again the next time they are referenced, all proxy errors are cleared so they will be retried, and the starting time of the first connection is reset so the same process can be done 5 minutes later. This is all in case a proxy that was having a problem had since been fixed. If it has been more than 30 minutes since the first connection (using a separate starting time variable) then the server list is also reset, server errors are cleared, and that starting time is reset. The reason for the longer time for servers is because of some of the details below (servers are tried with every proxy in a group, and read timeouts are longer, so for example a dead server tried with 4 proxies can take 40 seconds, which is a significant percentage of 5 minutes).
  Since there are usually many combinations of proxies and servers to try, it is important that the client does not spend much time on any one combination. For that reason, the default connect timeout time is 5 seconds and the default read timeout time is 10 seconds.
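
The two independent reset clocks and the 40-second figure above can be sketched as follows. This is a minimal model rather than the client's actual code: the class, function, and constant names are hypothetical, times are plain numbers of seconds, and the only values taken from the text are the 5 and 30 minute intervals and the 10 second default read timeout.

```python
PROXY_RESET_SECS = 5 * 60    # proxy list, proxy DNS names, proxy errors
SERVER_RESET_SECS = 30 * 60  # server list and server errors


class ResetTimers:
    """Hypothetical model of the separate proxy and server starting times."""

    def __init__(self, now):
        self.proxy_start = now
        self.server_start = now

    def due(self, now):
        """Return which resets are due at `now` and restart their clocks."""
        resets = set()
        if now - self.proxy_start > PROXY_RESET_SECS:
            resets.add("proxies")
            self.proxy_start = now
        if now - self.server_start > SERVER_RESET_SECS:
            resets.add("servers")
            self.server_start = now
        return resets


def dead_server_delay_secs(num_proxies, read_timeout=10):
    """Time spent on one dead server tried through every proxy in a group."""
    return num_proxies * read_timeout
```

With the default read timeout, `dead_server_delay_secs(4)` gives the 40 seconds mentioned above, which is why server errors are kept for 30 minutes rather than 5.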

Revision 92013-04-10 - DaveDykstra

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Frontier client retry strategy

Line: 7 to 7
 Definitions:
  • proxy group - either all the IP addresses in a round-robin name in a proxyurl or a backupproxyurl, or all the proxyurls together if loadbalance=proxies is set.
Changed:
<
<
If any connections are already established and working, the frontier_client will continue to reuse that connection for subsequent queries until one of them returns an error, a response is larger than 16KB, or after 100 smaller queries. Squid also drops connections from its end after they have been idle for 1 minute. When connections are dropped in one of these ways, the next query will attempt to reconnect to a proxy within the same proxy group that had not previously generated an error; using a different proxy in the proxy group when possible is done to improve load balancing. The server URL will be the same unless loadbalance=servers is set, in which case another random server that hasn't had a server error will be chosen.
>
>
If any connections are already established and working, the frontier_client will continue to reuse that connection for subsequent queries until one of them returns an error or a response is larger than 16KB. Squid also drops connections from its end after they have been idle for 1 minute. When connections are dropped in one of these ways, the next query will attempt to reconnect to a proxy within the same proxy group that had not previously generated an error; using a different proxy in the proxy group when possible is done to improve load balancing. The server URL will be the same unless loadbalance=servers is set, in which case another random server that hasn't had a server error will be chosen.
 
Changed:
<
<
When any connection is attempted to be established and more than 5 minutes has elapsed since the first connection, the proxy and server lists are reset to the beginning, all cached DNS names are considered to be invalid so they will need to be looked up again the next time they are referenced, all proxy and server errors are cleared so they will be retried, and the starting time of the first connection is reset so the same process can be done 5 minutes later.
>
>
When any query needs to be done and more than 5 minutes has elapsed since the first connection, the connection is also dropped, the proxy list is reset to the beginning, all cached DNS names are considered to be invalid so they will need to be looked up again the next time they are referenced, all proxy errors are cleared so they will be retried, and the starting time of the first connection is reset so the same process can be done 5 minutes later. If it has been more than 30 minutes since the first connection (using a separate starting time variable) then the server list is also reset, server errors are cleared, and that starting time is reset. The reason for the longer time for servers is because of some of the details below (servers are tried with every proxy in a group, and read timeouts are longer, so for example a dead server tried with 4 proxies can take 40 seconds, which is a significant percentage of 5 minutes).
  Since there are usually many combinations of proxies and servers to try, it is important that the client does not spend much time on any one combination. For that reason, the default connect timeout time is 5 seconds and the default read timeout time is 10 seconds.
Line: 38 to 38
  For frontier client versions 2.8.2 through 2.8.5, these are the differences compared to version 2.8.6 strategies above:
Changed:
<
<
  1. Connections were not dropped after 100 small queries.
  2. The entire proxy/server list and DNS name lookups were not reset at once after 5 minutes. Instead, if any DNS name of a proxy or server (including round-robin DNS names) had not been looked up in at least 5 minutes and was attempted to be used again, that name was re-looked up in the DNS and all errors on the addresses in that name were considered cleared. Also, on every new connection it would go back to the beginning of the proxy & server list and look for one of each that hadn't been implicated in an error. The net result for proxies is not a lot different, other than being harder to explain. The net result for server errors is different because when going through proxies, server names are not looked up in the DNS so those errors didn't get cleared every 5 minutes.
>
>
  1. The entire proxy (and server) list and DNS name lookups were not reset at once after 5 (and 30) minutes. Instead, if any DNS name of a proxy or server (including round-robin DNS names) had not been looked up in at least 5 minutes and was attempted to be used again, that name was re-looked up in the DNS and all errors on the addresses in that name were considered cleared. Also, on every new connection it would go back to the beginning of the proxy & server list and look for one of each that hadn't been implicated in an error. The net result for proxies is not a lot different, other than being harder to explain. The net result for server errors is different because when going through proxies, server names are not looked up in the DNS so those errors never got cleared.
 
  1. A proxy connect timeout was not treated any differently from other network errors.
  2. For non-server errors, the next proxy in the list was always tried using the same server until running out of proxies, then the proxy list was reset and the next server was used. The different addresses in the same proxy round-robin DNS name were not all tried; if one of them had an error, it would move on to the next proxy group. (On the other hand, with loadbalance=proxies all the listed proxyurls were tried, as is still the case).
  3. For direct connections to servers, similar to proxies, only one address in each round-robin DNS name was tried when there were errors.

Revision 82013-04-10 - DaveDykstra

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Frontier client retry strategy

Line: 39 to 39
 For frontier client versions 2.8.2 through 2.8.5, these are the differences compared to version 2.8.6 strategies above:

  1. Connections were not dropped after 100 small queries.
Changed:
<
<
  1. The entire proxy/server list and DNS name lookups were not reset at once after 5 minutes. Instead, if any DNS name of a proxy or server (including round-robin DNS names) had not been looked up in at least 5 minutes and was attempted to be used again, that name was re-looked up in the DNS and all errors on the addresses in that name were considered cleared. Also, on every new connection it would go back to the beginning of the proxy & server list and look for one of each that hadn't been implicated in an error. The net results for proxies is not a lot different, other than being harder to explain. The net results for server errors is different because when going through proxies, server names are not looked up in the DNS so those errors don't get cleared every 5 minutes.
>
>
  1. The entire proxy/server list and DNS name lookups were not reset at once after 5 minutes. Instead, if any DNS name of a proxy or server (including round-robin DNS names) had not been looked up in at least 5 minutes and was attempted to be used again, that name was re-looked up in the DNS and all errors on the addresses in that name were considered cleared. Also, on every new connection it would go back to the beginning of the proxy & server list and look for one of each that hadn't been implicated in an error. The net result for proxies is not a lot different, other than being harder to explain. The net result for server errors is different because when going through proxies, server names are not looked up in the DNS so those errors didn't get cleared every 5 minutes.
 
  1. A proxy connect timeout was not treated any differently from other network errors.
  2. For non-server errors, the next proxy in the list was always tried using the same server until running out of proxies, then the proxy list was reset and the next server was used. The different addresses in the same proxy round-robin DNS name were not all tried; if one of them had an error, it would move on to the next proxy group. (On the other hand, with loadbalance=proxies all the listed proxyurls were tried, as is still the case).
Deleted:
<
<
  1. For direct connections to servers, again only one address in round-robin DNS names were tried.
Added:
>
>
  1. For direct connections to servers, similar to proxies, only one address in each round-robin DNS name was tried when there were errors.

Revision 72013-04-03 - DaveDykstra

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Frontier client retry strategy

This section only describes frontier client version 2.8.6 (April 2013) and later. See below for the differences in older client versions.

Changed:
<
<
Definition:
  • proxy group - either all the IP addresses in a round-robin name of a proxyurl or a backupproxyurl, or all the proxyurls together if loadbalance=proxies is set.
>
>
Definitions:
  • proxy group - either all the IP addresses in a round-robin name in a proxyurl or a backupproxyurl, or all the proxyurls together if loadbalance=proxies is set.
 
Changed:
<
<
If any connections are already established and working, the frontier_client will continue to reuse that connection for subsequent queries until one of them returns an error, a response is larger than 16KB, or after 100 smaller queries. Squid also drops connections from its end after they have been idle for 1 minute. When connections are dropped in one of these ways, the next query will attempt to reconnect to a proxy within the same proxy group that had not previously generated an error, and with the same server URL; using a different proxy in the group when possible is done to improve load balancing.
>
>
If any connections are already established and working, the frontier_client will continue to reuse that connection for subsequent queries until one of them returns an error, a response is larger than 16KB, or after 100 smaller queries. Squid also drops connections from its end after they have been idle for 1 minute. When connections are dropped in one of these ways, the next query will attempt to reconnect to a proxy within the same proxy group that had not previously generated an error; using a different proxy in the proxy group when possible is done to improve load balancing. The server URL will be the same unless loadbalance=servers is set, in which case another random server that hasn't had a server error will be chosen.
 
Changed:
<
<
When any connection is attempted to be established and more than 5 minutes has elapsed since the first connection, the proxy and server lists are reset to the beginning, all cached DNS names are considered to be invalid so they will need to be looked up again the next time they are referenced, all proxy and server errors are cleared so they will all be retried, and the starting time of the first connection is reset so the same process can be done 5 minutes later.
>
>
When any connection is attempted to be established and more than 5 minutes has elapsed since the first connection, the proxy and server lists are reset to the beginning, all cached DNS names are considered to be invalid so they will need to be looked up again the next time they are referenced, all proxy and server errors are cleared so they will be retried, and the starting time of the first connection is reset so the same process can be done 5 minutes later.
  Since there are usually many combinations of proxies and servers to try, it is important that the client does not spend much time on any one combination. For that reason, the default connect timeout time is 5 seconds and the default read timeout time is 10 seconds.
Line: 21 to 21
  These are the retry strategies:
  1. If a server error happens, keep the same proxy and try the next server name. If all servers are tried, go back to the beginning of the server list and try the next proxy with the first server.
Changed:
<
<
  1. If a connect timeout to a proxy happens, try the next proxy, including within the same proxy group if there are any left that hadn't had an error, with the same server.
>
>
  1. If a connect timeout to a proxy happens, try the next proxy, including within the same proxy group if there are any left that hadn't had an error, with the same server URL.
 
  1. If a protocol error happens, with the same proxy & server try a soft retry (Cache-control: max-age=0) and if it still happens try a hard retry (Pragma: no-cache). If those two still return a non-server error, try the next proxy with the same server as long as the proxy is in the same proxy group. If the end of the proxy group is reached, start again at the beginning of the proxy group and try the next server until there are no more servers, and then go back again to the beginning of the server list and move to the next proxy group.
  2. For all other errors, do the same as a protocol error except without the soft & hard retries.
Changed:
<
<
  1. When there are no more proxies to try, unless failovertoserver=no is set (and backupproxyurl implies that), try directly connecting to all servers including all the addresses in each round-robin DNS name.
>
>
  1. When there are no more proxies to try, unless failovertoserver=no is set (and backupproxyurl implies that), try directly connecting to all servers including all the addresses in each round-robin DNS name. Note: this is the only time that all the addresses in a round-robin DNS server name are used, since when going through a proxy only the server URL is used, not individual IP addresses. So if it is important to guarantee that all servers be tried through proxies, the individual server names need to be listed (typically after the round-robin name).
  Since the first strategy resets the server list while others reset to the beginning of a proxy group, and since each try can independently return its own error, it is possible to alternate between the strategies and never terminate nor make progress. For example, if there are two proxies in the same group PROXY1 and PROXY2 and two servers SERVER1 and SERVER2, and SERVER1 is totally down (which translates to a read timeout, the last type of error) while SERVER2 always returns a server error, then the last two steps will keep repeating:
  • PROXY1+SERVER1: read timeout, advance proxy

Revision 62013-04-02 - DaveDykstra

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Frontier client retry strategy

Line: 39 to 39
 For frontier client versions 2.8.2 through 2.8.5, these are the differences compared to version 2.8.6 strategies above:

  1. Connections were not dropped after 100 small queries.
Changed:
<
<
  1. The entire proxy/server list and DNS name lookups were not reset after 5 minutes. Instead, if any DNS name of a proxy or server (including round-robin DNS names) had not been looked up in at least 5 minutes and was attempted to be used again, that name was re-looked up in the DNS and all errors on the addresses in that name were considered cleared.
>
>
  1. The entire proxy/server list and DNS name lookups were not reset at once after 5 minutes. Instead, if any DNS name of a proxy or server (including round-robin DNS names) had not been looked up in at least 5 minutes and was attempted to be used again, that name was re-looked up in the DNS and all errors on the addresses in that name were considered cleared. Also, on every new connection it would go back to the beginning of the proxy & server list and look for one of each that hadn't been implicated in an error. The net results for proxies is not a lot different, other than being harder to explain. The net results for server errors is different because when going through proxies, server names are not looked up in the DNS so those errors don't get cleared every 5 minutes.
 
  1. A proxy connect timeout was not treated any differently from other network errors.
  2. For non-server errors, the next proxy in the list was always tried using the same server until running out of proxies, then the proxy list was reset and the next server was used. The different addresses in the same proxy round-robin DNS name were not all tried; if one of them had an error, it would move on to the next proxy group. (On the other hand, with loadbalance=proxies all the listed proxyurls were tried, as is still the case).
  3. For direct connections to servers, again only one address in round-robin DNS names were tried.

Revision 52013-03-29 - DaveDykstra

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Frontier client retry strategy

Line: 30 to 30
 
  • PROXY1+SERVER1: read timeout, advance proxy
  • PROXY2+SERVER1: read timeout, go back to beginning of proxy group with next server
  • PROXY1+SERVER2: server error, no more servers so go back to the beginning of the server list and try next proxy
Changed:
<
<
To avoid that situation, if the server list has been set back to its beginning by a server error while in a particular proxy group since the last 5-minute full reset, do not try to start again at the beginning of that proxy group even if the strategy calls for it.
>
>
To avoid that situation, if the server list has been set back to its beginning by a server error while in a given proxy group since the last 5-minute full reset, do not try to start again at the beginning of that proxy group even if the strategy calls for it.
 

Older client version differences

Revision 42013-03-29 - DaveDykstra

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Frontier client retry strategy

This section only describes frontier client version 2.8.6 (April 2013) and later. See below for the differences in older client versions.

Changed:
<
<
Definitions: a proxy group is either all the IP addresses in a round-robin name of a proxyurl or a backupproxyurl, or all the proxyurls together if loadbalance=proxies is set. A server group is likewise all the IP addresses in a round-robin serverurl, or all the serverurls together if loadbalance=servers is set.
>
>
Definition:
  • proxy group - either all the IP addresses in a round-robin name of a proxyurl or a backupproxyurl, or all the proxyurls together if loadbalance=proxies is set.
 
Changed:
<
<
If any connections are already established and working, the frontier_client will continue to reuse that connection for subsequent queries until one of them returns an error, a response larger than 16KB, or after 100 smaller queries. Squid also drops connections from its end after they have been idle for 1 minute. When connections are dropped in one of these ways, the next query will attempt to reconnect to a proxy within the same proxy group that had not previously generated an error and a server within the same server group that had not previously generated an error (groups are used rather than the exact same proxy/server in order to improve load balancing).
>
>
If any connections are already established and working, the frontier_client will continue to reuse that connection for subsequent queries until one of them returns an error, a response is larger than 16KB, or after 100 smaller queries. Squid also drops connections from its end after they have been idle for 1 minute. When connections are dropped in one of these ways, the next query will attempt to reconnect to a proxy within the same proxy group that had not previously generated an error, and with the same server URL; using a different proxy in the group when possible is done to improve load balancing.
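The connection reuse rule above can be sketched as a small predicate. This is only an illustration under stated assumptions, not the frontier_client source: the names `ConnectionState` and `should_drop` are hypothetical, and "16KB" is assumed here to mean 16 * 1024 bytes.

```python
from dataclasses import dataclass

# Thresholds taken from the text; the exact byte value of "16KB" is an assumption.
MAX_RESPONSE_BYTES = 16 * 1024
MAX_SMALL_QUERIES = 100


@dataclass
class ConnectionState:
    """Hypothetical bookkeeping for one persistent connection."""
    small_queries: int = 0


def should_drop(conn: ConnectionState, response_bytes: int, had_error: bool) -> bool:
    """Return True if the connection should be closed after this query."""
    if had_error:
        return True
    if response_bytes > MAX_RESPONSE_BYTES:
        return True
    # Only small, successful queries count toward the 100-query limit.
    conn.small_queries += 1
    return conn.small_queries >= MAX_SMALL_QUERIES
```

The idle timeout of 1 minute is enforced by squid, not by the client, so it is not modelled here.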
 
Changed:
<
<
When any connection is attempted to be established and more than 5 minutes has elapsed since the job started or since the last reset happened, the proxy and server lists are reset to the beginning, all cached DNS names are considered to be invalid so they will need to be looked up again the next time they are referenced, and all proxy and server errors are cleared so they will all be retried.
>
>
When any connection is attempted to be established and more than 5 minutes has elapsed since the first connection, the proxy and server lists are reset to the beginning, all cached DNS names are considered to be invalid so they will need to be looked up again the next time they are referenced, all proxy and server errors are cleared so they will all be retried, and the starting time of the first connection is reset so the same process can be done 5 minutes later.

Since there are usually many combinations of proxies and servers to try, it is important that the client does not spend much time on any one combination. For that reason, the default connect timeout time is 5 seconds and the default read timeout time is 10 seconds.
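As a rough sketch of the 5-minute full reset described above (an assumption-laden illustration, not the client's real data structures), the state that gets cleared can be grouped in one object. `RetryState`, its field names, and the injectable `clock` are all hypothetical.

```python
import time
from typing import Callable, Dict, List, Optional, Set

RESET_INTERVAL_SECONDS = 5 * 60


class RetryState:
    """Hypothetical container for the state that the periodic reset clears."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self.clock = clock
        self.first_connect_time: Optional[float] = None
        self.proxy_index = 0
        self.server_index = 0
        self.dns_cache: Dict[str, List[str]] = {}
        self.proxy_errors: Set[str] = set()
        self.server_errors: Set[str] = set()

    def before_connect(self) -> bool:
        """Call before each connection attempt; return True if a reset happened."""
        now = self.clock()
        if self.first_connect_time is None:
            # First connection: just remember when it happened.
            self.first_connect_time = now
            return False
        if now - self.first_connect_time <= RESET_INTERVAL_SECONDS:
            return False
        # More than 5 minutes since the first connection: full reset.
        self.proxy_index = 0
        self.server_index = 0
        self.dns_cache.clear()       # names are looked up again on next use
        self.proxy_errors.clear()
        self.server_errors.clear()
        self.first_connect_time = now  # so the same check fires 5 minutes later
        return True
```

Passing the clock in as a parameter is a design choice of this sketch so the timing can be exercised without waiting.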

  The frontier client error retry strategy distinguishes between four types of non-fatal errors:
Changed:
<
<
  1. A server error -- a proxy or server responded but there was either an http error code from the server where it knew it had a problem, or the proxy immediately determined a server problem and returned an error code indicating so. In other words, there was an http error code in the 500 series. The result will not be cached.
  2. A connect timeout - the socket couldn't be connected, meaning that a proxy or direct-connect server is down.
  3. A protocol error - either the response couldn't be completely parsed as expected or the server signaled at the end of its transmission that its response had an error and should not be cached (even though the http headers said it was OK and could be cached).
>
>
  1. A server error -- a proxy or server responded but there was either an http error code from the server where it knew it had a problem, or the proxy immediately determined a server problem and returned an error code indicating so. In other words, there was an http error code in the 500 series. The result will not be cached.
  2. A connect timeout - the socket couldn't be connected, meaning that a proxy or direct-connect server is down.
  3. A protocol error - either the response couldn't be completely parsed as expected or the server signaled at the end of its transmission that its response had an error and should not be cached (even though the http headers said it was OK and could be cached).
 
  4. All other types of errors, for example networking problems, read timeouts because of overloading or non-response from a server, etc.
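The four error types above could be distinguished roughly as follows. This is a hedged sketch: `ErrorType`, `classify`, and its parameters are hypothetical names, and treating a non-2xx, non-5xx status as an "other" error is an assumption that the text does not confirm.

```python
from enum import Enum
from typing import Optional


class ErrorType(Enum):
    SERVER = "server error"
    CONNECT = "connect timeout"
    PROTOCOL = "protocol error"
    OTHER = "other error"


def classify(connected: bool, http_status: Optional[int],
             parsed_ok: bool, payload_error: bool) -> Optional[ErrorType]:
    """Map the outcome of one attempt to an error type, or None on success."""
    if not connected:
        return ErrorType.CONNECT
    if http_status is None:
        # Connected, but no complete HTTP response arrived (e.g. read timeout).
        return ErrorType.OTHER
    if 500 <= http_status <= 599:
        return ErrorType.SERVER
    if not 200 <= http_status <= 299:
        # Assumption: other unexpected statuses fall into the catch-all type.
        return ErrorType.OTHER
    if not parsed_ok or payload_error:
        # Headers said OK, but the body was unparseable or flagged an error.
        return ErrorType.PROTOCOL
    return None
```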

These are the retry strategies:

  1. If a server error happens, keep the same proxy and try the next server name. If all servers are tried, go back to the beginning of the server list and try the next proxy with the first server.
  2. If a connect timeout to a proxy happens, try the next proxy, including within the same proxy group if there are any left that hadn't had an error, with the same server.
Changed:
<
<
  3. If a protocol error happens, with the same proxy & server try a soft retry (Cache-control: max-age=0) and if it still happens try a hard retry (Pragma: no-cache). If those two still return a non-server error, try the next proxy with the same server as long as the proxy is in the same proxy group. On the other hand if the end of the proxy group is reached, start again at the beginning of the proxy group and try the next server until there are no more servers, and then go back again to the beginning of the server list and move to the next proxy group.
>
>
  3. If a protocol error happens, with the same proxy & server try a soft retry (Cache-control: max-age=0) and if it still happens try a hard retry (Pragma: no-cache). If those two still return a non-server error, try the next proxy with the same server as long as the proxy is in the same proxy group. If the end of the proxy group is reached, start again at the beginning of the proxy group and try the next server until there are no more servers, and then go back again to the beginning of the server list and move to the next proxy group.
 
  4. For all other errors, do the same as a protocol error except without the soft & hard retries.
Changed:
<
<
  5. When there are no more proxies to try, unless failovertoserver=no is set (and backupproxyurl implies that), try directly connecting to the first server and then to all servers in the server group before moving on to the next server group.
>
>
  5. When there are no more proxies to try, unless failovertoserver=no is set (and backupproxyurl implies that), try directly connecting to all servers including all the addresses in each round-robin DNS name.
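The soft and hard retries used for protocol errors in the list above can be sketched as an ordered pair of request headers tried against the same proxy and server. The helper name `retry_same_pair` and the `send` callback are hypothetical; only the two header values come from the text.

```python
from typing import Callable, Dict, Optional

# Order matters: the soft refresh is tried before the hard refresh.
REFRESH_STEPS = [
    ("soft", {"Cache-control": "max-age=0"}),
    ("hard", {"Pragma": "no-cache"}),
]


def retry_same_pair(send: Callable[[Dict[str, str]], bool]) -> Optional[str]:
    """Retry on the same proxy and server with soft, then hard, refresh headers.

    `send(headers)` is a caller-supplied function that returns True when the
    request succeeds. Returns the name of the step that succeeded, or None if
    both failed and the caller should move on to the next proxy.
    """
    for name, headers in REFRESH_STEPS:
        if send(headers):
            return name
    return None
```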
  Since the first strategy resets the server list while others reset to the beginning of a proxy group, and since each try can independently return its own error, it is possible to alternate between the strategies and never terminate nor make progress. For example, if there are two proxies in the same group PROXY1 and PROXY2 and two servers SERVER1 and SERVER2, and SERVER1 is totally down (which translates to a read timeout, the last type of error) while SERVER2 always returns a server error, then the last two steps will keep repeating:
  • PROXY1+SERVER1: read timeout, advance proxy
  • PROXY2+SERVER1: read timeout, go back to beginning of proxy group with next server
  • PROXY1+SERVER2: server error, no more servers so go back to the beginning of the server list and try next proxy
Changed:
<
<
To avoid that situation, if the server list has been set back to its beginning by a server error while in a particular proxy group since the last full 5-minute reset, do not try to start again at the beginning of that proxy group even if the strategy calls for it.
>
>
To avoid that situation, if the server list has been set back to its beginning by a server error while in a particular proxy group since the last 5-minute full reset, do not try to start again at the beginning of that proxy group even if the strategy calls for it.
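The PROXY1/PROXY2 and SERVER1/SERVER2 example above can be reproduced with a small simulation. It is a simplified sketch, not the client's algorithm: it models only one proxy group and the two error types in the example, the function name `simulate` is hypothetical, and what the client does once the guard triggers is not stated in the text, so the sketch simply stops at that point.

```python
from typing import List, Tuple

PROXIES = ["PROXY1", "PROXY2"]  # both in one proxy group
SERVERS = ["SERVER1", "SERVER2"]
# Fixed outcomes from the example: SERVER1 times out, SERVER2 returns a server error.
OUTCOME = {"SERVER1": "read timeout", "SERVER2": "server error"}


def simulate(guard: bool, max_tries: int = 12) -> List[Tuple[str, str]]:
    """Return the sequence of (proxy, server) attempts for the example."""
    p = 0
    s = 0
    server_list_reset_in_group = False
    tries: List[Tuple[str, str]] = []
    while len(tries) < max_tries:
        tries.append((PROXIES[p], SERVERS[s]))
        if OUTCOME[SERVERS[s]] == "server error":
            if s + 1 < len(SERVERS):
                s += 1  # same proxy, next server
            else:
                # Last server: back to the first server and on to the next proxy.
                s = 0
                server_list_reset_in_group = True
                p += 1
        else:  # read timeout, handled as an "other" error
            if p + 1 < len(PROXIES):
                p += 1  # next proxy in the group, same server
            elif guard and server_list_reset_in_group:
                break  # guard: do not restart at the beginning of this group
            elif s + 1 < len(SERVERS):
                p = 0  # restart the group with the next server
                s += 1
            else:
                break  # no more servers for this group
        if p >= len(PROXIES):
            break  # ran out of proxies
    return tries
```

Without the guard, `simulate(guard=False)` keeps alternating between PROXY2+SERVER1 and PROXY1+SERVER2 until the `max_tries` cap; with `guard=True` it stops after four attempts.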
 

Older client version differences

Revision 3 2013-03-29 - DaveDykstra

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Frontier client retry strategy

This section only describes frontier client version 2.8.6 (April 2013) and later. See below for the differences in older client versions.

Changed:
<
<
Definitions: a proxy group is either all the IP addresses in a round-robin name of a proxyurl or a backupproxyurl, or all the proxyurls together if loadbalance=proxies is set. A server group is likewise all the IP addresses in a round-robin serverurl, or all the serverurls together if loadbalance=servers.
>
>
Definitions: a proxy group is either all the IP addresses in a round-robin name of a proxyurl or a backupproxyurl, or all the proxyurls together if loadbalance=proxies is set. A server group is likewise all the IP addresses in a round-robin serverurl, or all the serverurls together if loadbalance=servers is set.
  If any connections are already established and working, the frontier_client will continue to reuse that connection for subsequent queries until one of them returns an error, a response larger than 16KB, or after 100 smaller queries. Squid also drops connections from its end after they have been idle for 1 minute. When connections are dropped in one of these ways, the next query will attempt to reconnect to a proxy within the same proxy group that had not previously generated an error and a server within the same server group that had not previously generated an error (groups are used rather than the exact same proxy/server in order to improve load balancing).

When any connection is attempted to be established and more than 5 minutes has elapsed since the job started or since the last reset happened, the proxy and server lists are reset to the beginning, all cached DNS names are considered to be invalid so they will need to be looked up again the next time they are referenced, and all proxy and server errors are cleared so they will all be retried.

The frontier client error retry strategy distinguishes between four types of non-fatal errors:

Changed:
<
<
  1. A connect timeout - the socket couldn't be connected, meaning that a proxy or direct-connect server was down.
  2. A server error -- a proxy correctly responded but there was either an error code from the server or the proxy immediately determined a server problem and returned an error code indicating a server problem. The result will not be cached.
>
>
  1. A server error -- a proxy or server responded but there was either an http error code from the server where it knew it had a problem, or the proxy immediately determined a server problem and returned an error code indicating so. In other words, there was an http error code in the 500 series. The result will not be cached.
  2. A connect timeout - the socket couldn't be connected, meaning that a proxy or direct-connect server is down.
  3. A protocol error - either the response couldn't be completely parsed as expected or the server signaled at the end of its transmission that its response had an error and should not be cached (even though the http headers said it was OK and could be cached).
Changed:
<
<
  4. All other types of errors, for example networking problem, read timeout because of overloading or non-response from a server, etc.
>
>
  4. All other types of errors, for example networking problems, read timeouts because of overloading or non-response from a server, etc.
 
Changed:
<
<
This is the retry strategy:
  1. If a server error happens, keep the same proxy and try the next server. If all servers are tried, reset the server list to the beginning and try the next proxy with the first server.
  2. If a connect timeout to a proxy happens, try the next proxy, including within the same proxy group, with the same server.
  3. If a protocol error happens, with the same proxy & server try a soft retry (Cache-control: max-age=0) and if it still happens try a hard retry (Pragma: no-cache). If those two still return a non-server error, try the next proxy with the same server as long as it is in the same proxy group. On the other hand if the end of the proxy group is reached, start again at the beginning of the proxy group and try the next server until there are no more servers, and then reset the server list and move to the next proxy group.
>
>
These are the retry strategies:
  1. If a server error happens, keep the same proxy and try the next server name. If all servers are tried, go back to the beginning of the server list and try the next proxy with the first server.
  2. If a connect timeout to a proxy happens, try the next proxy, including within the same proxy group if there are any left that hadn't had an error, with the same server.
  3. If a protocol error happens, with the same proxy & server try a soft retry (Cache-control: max-age=0) and if it still happens try a hard retry (Pragma: no-cache). If those two still return a non-server error, try the next proxy with the same server as long as the proxy is in the same proxy group. On the other hand if the end of the proxy group is reached, start again at the beginning of the proxy group and try the next server until there are no more servers, and then go back again to the beginning of the server list and move to the next proxy group.
 
  4. For all other errors, do the same as a protocol error except without the soft & hard retries.
  5. When there are no more proxies to try, unless failovertoserver=no is set (and backupproxyurl implies that), try directly connecting to the first server and then to all servers in the server group before moving on to the next server group.
Changed:
<
<
>
>
Since the first strategy resets the server list while others reset to the beginning of a proxy group, and since each try can independently return its own error, it is possible to alternate between the strategies and never terminate nor make progress. For example, if there are two proxies in the same group PROXY1 and PROXY2 and two servers SERVER1 and SERVER2, and SERVER1 is totally down (which translates to a read timeout, the last type of error) while SERVER2 always returns a server error, then the last two steps will keep repeating:
  • PROXY1+SERVER1: read timeout, advance proxy
  • PROXY2+SERVER1: read timeout, go back to beginning of proxy group with next server
  • PROXY1+SERVER2: server error, no more servers so go back to the beginning of the server list and try next proxy
To avoid that situation, if the server list has been set back to its beginning by a server error while in a particular proxy group since the last full 5-minute reset, do not try to start again at the beginning of that proxy group even if the strategy calls for it.
 

Older client version differences

Revision 2 2013-03-29 - DaveDykstra

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Frontier client retry strategy

Changed:
<
<
This page only covers frontier client version 2.8.2 (June 2011) and later.
>
>
This section only describes frontier client version 2.8.6 (April 2013) and later. See below for the differences in older client versions.
 
Changed:
<
<
Talk about persistent connections here.
>
>
Definitions: a proxy group is either all the IP addresses in a round-robin name of a proxyurl or a backupproxyurl, or all the proxyurls together if loadbalance=proxies is set. A server group is likewise all the IP addresses in a round-robin serverurl, or all the serverurls together if loadbalance=servers.
 
Changed:
<
<
First, if any DNS name has not been looked up in at least 5 minutes, the name is re-looked up and all errors on that name are considered cleared.
>
>
If any connections are already established and working, the frontier_client will continue to reuse that connection for subsequent queries until one of them returns an error, a response larger than 16KB, or after 100 smaller queries. Squid also drops connections from its end after they have been idle for 1 minute. When connections are dropped in one of these ways, the next query will attempt to reconnect to a proxy within the same proxy group that had not previously generated an error and a server within the same server group that had not previously generated an error (groups are used rather than the exact same proxy/server in order to improve load balancing).
 
Changed:
<
<
The frontier client retry strategy distinguishes between four types of errors:
>
>
When any connection is attempted to be established and more than 5 minutes has elapsed since the job started or since the last reset happened, the proxy and server lists are reset to the beginning, all cached DNS names are considered to be invalid so they will need to be looked up again the next time they are referenced, and all proxy and server errors are cleared so they will all be retried.

The frontier client error retry strategy distinguishes between four types of non-fatal errors:

  1. A connect timeout - the socket couldn't be connected, meaning that a proxy or direct-connect server was down.
 
  2. A server error -- a proxy correctly responded but there was either an error code from the server or the proxy immediately determined a server problem and returned an error code indicating a server problem. The result will not be cached.
  3. A protocol error - either the response couldn't be completely parsed as expected or the server signaled at the end of its transmission that its response had an error and should not be cached (even though the http headers said it was OK and could be cached).
Deleted:
<
<
  1. A connect timeout - the socket couldn't be connected, meaning that a proxy or direct-connect server was down.
 
  4. All other types of errors, for example networking problem, read timeout because of overloading or non-response from a server, etc.
Changed:
<
<
frontier client version 2.8.6 (April 2013) also introduced the concept of a proxy group, which is either all the IP addresses in a round-robin name of a proxyurl or a backupproxyurl, or all the proxyurls together if loadbalance=proxies.
>
>
This is the retry strategy:
  1. If a server error happens, keep the same proxy and try the next server. If all servers are tried, reset the server list to the beginning and try the next proxy with the first server.
  2. If a connect timeout to a proxy happens, try the next proxy, including within the same proxy group, with the same server.
  3. If a protocol error happens, with the same proxy & server try a soft retry (Cache-control: max-age=0) and if it still happens try a hard retry (Pragma: no-cache). If those two still return a non-server error, try the next proxy with the same server as long as it is in the same proxy group. On the other hand if the end of the proxy group is reached, start again at the beginning of the proxy group and try the next server until there are no more servers, and then reset the server list and move to the next proxy group.
  4. For all other errors, do the same as a protocol error except without the soft & hard retries.
  5. When there are no more proxies to try, unless failovertoserver=no is set (and backupproxyurl implies that), try directly connecting to the first server and then to all servers in the server group before moving on to the next server group.

Older client version differences

Frontier clients prior to version 2.8.2 (June 2011) are not documented on this page.

For frontier client versions 2.8.2 through 2.8.5, these are the differences compared to version 2.8.6 strategies above:

 
Changed:
<
<
  1. If a server error happens, keep the same proxy and try the next server. If all servers are tried, reset the server list to the beginning and try the next server.
  2. If a protocol error happens, with the same proxy & server try a soft retry (Cache-control: max-age=0) and if it still happens a hard retry (Pragma: no-cache). If those two still return a protocol error, prior to frontier client 2.8.6 always try the next proxy using the same server. Starting with frontier client 2.8.6, try the next proxy with the same server as long as it is in the same proxy group, but if the end of the group is reached, try the next server until there are no more and then reset the server list for the next proxy group.
  3. Starting with frontier client 2.8.6, if a connect timeout happens to a proxy, always move to the next proxy and keep the server the same.
  4. For all other types of errors, do the same as a protocol error except without the soft & hard retries.
>
>
  1. Connections were not dropped after 100 small queries.
  2. The entire proxy/server list and DNS name lookups were not reset after 5 minutes. Instead, if any DNS name of a proxy or server (including round-robin DNS names) had not been looked up in at least 5 minutes and was attempted to be used again, that name was re-looked up in the DNS and all errors on the addresses in that name were considered cleared.
  3. A proxy connect timeout was not treated any differently from other network errors.
  4. For non-server errors, the next proxy in the list was always tried using the same server until running out of proxies, then the proxy list was reset and the next server was used. The different addresses in the same proxy round-robin DNS name were not all tried; if one of them had an error, it would move on to the next proxy group. (On the other hand, with loadbalance=proxies all the listed proxyurls were tried, as is still the case).
  5. For direct connections to servers, again only one address in each round-robin DNS name was tried.
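For contrast with the later full reset, the per-name DNS expiry that versions 2.8.2 through 2.8.5 used (item 2 above) might be sketched like this. It is an illustration only: `PerNameDnsCache`, its methods, and the injected `resolve` and `clock` callables are hypothetical, and no real DNS lookup is performed.

```python
import time
from typing import Callable, Dict, List, Set, Tuple

DNS_MAX_AGE_SECONDS = 5 * 60


class PerNameDnsCache:
    """Hypothetical cache in which each name expires on its own schedule."""

    def __init__(self, resolve: Callable[[str], List[str]],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.resolve = resolve
        self.clock = clock
        self.entries: Dict[str, Tuple[float, List[str]]] = {}
        self.errors: Dict[str, Set[str]] = {}

    def mark_error(self, name: str, address: str) -> None:
        """Remember that one address behind `name` had an error."""
        self.errors.setdefault(name, set()).add(address)

    def addresses(self, name: str) -> List[str]:
        """Return addresses for `name`, re-resolving if the entry is stale."""
        now = self.clock()
        entry = self.entries.get(name)
        if entry is None or now - entry[0] >= DNS_MAX_AGE_SECONDS:
            # Not looked up in at least 5 minutes: look it up again and
            # clear the errors recorded on this name only.
            self.entries[name] = (now, self.resolve(name))
            self.errors.pop(name, None)
        return self.entries[name][1]
```

Unlike the later full reset, nothing here touches other names or the position in the proxy and server lists.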

Revision 1 2013-03-28 - DaveDykstra

Line: 1 to 1
Added:
>
>
META TOPICPARENT name="WebHome"

Frontier client retry strategy

This page only covers frontier client version 2.8.2 (June 2011) and later.

Talk about persistent connections here.

First, if any DNS name has not been looked up in at least 5 minutes, the name is re-looked up and all errors on that name are considered cleared.

The frontier client retry strategy distinguishes between four types of errors:

  1. A server error -- a proxy correctly responded but there was either an error code from the server or the proxy immediately determined a server problem and returned an error code indicating a server problem. The result will not be cached.
  2. A protocol error - either the response couldn't be completely parsed as expected or the server signaled at the end of its transmission that its response had an error and should not be cached (even though the http headers said it was OK and could be cached).
  3. A connect timeout - the socket couldn't be connected, meaning that a proxy or direct-connect server was down.
  4. All other types of errors, for example networking problem, read timeout because of overloading or non-response from a server, etc.

frontier client version 2.8.6 (April 2013) also introduced the concept of a proxy group, which is either all the IP addresses in a round-robin name of a proxyurl or a backupproxyurl, or all the proxyurls together if loadbalance=proxies.

  1. If a server error happens, keep the same proxy and try the next server. If all servers are tried, reset the server list to the beginning and try the next server.
  2. If a protocol error happens, with the same proxy & server try a soft retry (Cache-control: max-age=0) and if it still happens a hard retry (Pragma: no-cache). If those two still return a protocol error, prior to frontier client 2.8.6 always try the next proxy using the same server. Starting with frontier client 2.8.6, try the next proxy with the same server as long as it is in the same proxy group, but if the end of the group is reached, try the next server until there are no more and then reset the server list for the next proxy group.
  3. Starting with frontier client 2.8.6, if a connect timeout happens to a proxy, always move to the next proxy and keep the server the same.
  4. For all other types of errors, do the same as a protocol error except without the soft & hard retries.
 