Gridpack files and the Frontier infrastructure
Introduction
On the weekend of September 7-8, 2013, a workflow requested a large file (692 MB) many times, often aborting and restarting the requests. This put a heavy load on the CMS Launchpads and caused them to collapse. A study of the events follows.
Since every Launchpad is connected through a 1 Gbit/s link, each one could serve at most 119 MiB/s in theory. In practice, protocol overhead reduces the effective bandwidth, and the plots show that none of the Launchpads reached that transfer rate during the weekend.
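The 119 MiB/s figure can be checked with a short calculation. The sketch below also estimates the minimum time needed to push one copy of the file through a single link. It assumes that the 692 MB size is in decimal megabytes and that the link is exactly 10^9 bit/s with no protocol overhead. The function names are illustrative only.

```python
# Back-of-the-envelope check of the theoretical link capacity.
# Assumptions: the link is exactly 10**9 bit/s, no protocol overhead,
# and the 692 MB file size is in decimal megabytes (10**6 bytes).

LINK_BITS_PER_S = 10**9      # 1 Gbit/s link per Launchpad
FILE_BYTES = 692 * 10**6     # assumed decimal MB


def link_capacity_mib_per_s(bits_per_s):
    """Convert a raw link speed in bit/s to MiB/s (1 MiB = 2**20 bytes)."""
    return bits_per_s / 8 / 2**20


def min_transfer_seconds(file_bytes, bits_per_s):
    """Lower bound on the time to send file_bytes over the link."""
    return file_bytes / (bits_per_s / 8)


capacity = link_capacity_mib_per_s(LINK_BITS_PER_S)
seconds = min_transfer_seconds(FILE_BYTES, LINK_BITS_PER_S)
print(f"theoretical capacity: {capacity:.1f} MiB/s")
print(f"minimum time per file copy: {seconds:.2f} s")
```

Under these assumptions a single Launchpad cannot deliver more than roughly one full copy of the file every five and a half seconds, which gives a rough upper bound on how many complete requests a link could satisfy per minute.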
Reset events
The timeline of machine restarts, as recorded in the Tomcat and Squid logs, is:
To be included
Messages in the logs include:
INFO: The APR based Apache Tomcat Native library which allows optimal performance in production environments was not found on the java.library.path: /usr/lib/jvm/java-1.6.0-sun-1.6.0.43.x86_64/jre/lib/amd64/server:/usr/lib/jvm/java-1.6.0-sun-1.6.0.43.x86_64/jre/lib/amd64:/usr/lib/jvm/java-1.6.0-sun-1.6.0.43.x86_64/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
Sep 7, 2013 3:31:16 PM org.apache.coyote.http11.Http11Protocol init
INFO: Initializing Coyote HTTP/1.1 on http-8080
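One possible way to build the restart timeline from the Tomcat logs is to collect the timestamps of the connector initialisation lines, since Tomcat writes one on each start. The sketch below is only an illustration: it assumes every such line has the same layout as the excerpt above (timestamp, class name, then "init"), and the function name connector_init_times is hypothetical. Parsing the month abbreviation relies on an English or C locale.

```python
import re
from datetime import datetime

# Matches lines such as:
#   Sep 7, 2013 3:31:16 PM org.apache.coyote.http11.Http11Protocol init
# The layout is assumed from the single excerpt shown above.
STAMP_LINE = re.compile(
    r"^(\w{3} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M) (\S+) init$"
)


def connector_init_times(lines):
    """Return the timestamps of HTTP connector 'init' lines, in log order."""
    times = []
    for line in lines:
        match = STAMP_LINE.match(line.strip())
        if match and match.group(2).endswith("Http11Protocol"):
            stamp = datetime.strptime(match.group(1), "%b %d, %Y %I:%M:%S %p")
            times.append(stamp)
    return times


sample = [
    "Sep 7, 2013 3:31:16 PM org.apache.coyote.http11.Http11Protocol init",
    "INFO: Initializing Coyote HTTP/1.1 on http-8080",
]
for stamp in connector_init_times(sample):
    print(stamp.isoformat())
```

Running the same function over the full log of each Launchpad would give one candidate restart time per match, which could then be compared with the Squid logs.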