Notes from 24 April 2013 meeting of the Http Proxy Discovery Task Force

Present: Jakob Blomer, Brian Bockelman, Dan Bradley, Dave Dykstra, Ian Gable

Parts in green are relevant but weren't discussed at the meeting.

Discussed the standards suggested in Dave's Http Proxy Discovery Proposal. There was mostly agreement, except that Brian suggested adding an option to allow site administrators to put the PAC file into a file:// URL instead of the default http://wpad/wpad.dat web servers. This has been added as standards step 2 in the proposal wiki page. (Also, file:// support for the PAC file has been added to the frontier client and will be added to pacwget.) Note, however, that with file:// there's a technical caveat: the implementations will need to use a less reliable method for determining the IP address for the myIpAddress() function in the PAC files. With an http server, the client's IP address can be looked up from the socket to the http server, but without it the pacparser library uses a DNS lookup on the hostname, and if that fails,

Jakob was concerned that the PAC internet standard doesn't support client-based load balancing. There is a web page suggesting some PAC file JavaScript code that returns a different ordering of proxies depending on the low-order bits of the client IP address. It's crude, and it can get especially messy with larger numbers of proxies if it is to stay balanced when one of the proxies is down, but it could be sufficient. We discussed the possibility of extending the PAC standard to allow vertical bars in the proxy names, the way the CVMFS client uses them to indicate load balancing, but Brian thought that it would be better to rely on the JavaScript trick in order to stick with the standard, and nobody disagreed with him.
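The JavaScript trick mentioned above could look roughly like the following. This is only a minimal sketch, not the code from the web page that was discussed: the proxy names, the helper functions orderProxies and proxyString, and the choice of rotating the list by the last octet of an IPv4 address are all assumptions made for illustration. Only FindProxyForURL and myIpAddress() come from the PAC convention itself.

```javascript
// Hypothetical squid names; a real site would list its own proxies here.
var PROXIES = [
  "squid1.example.org:3128",
  "squid2.example.org:3128",
  "squid3.example.org:3128"
];

// Rotate the proxy list by an offset taken from the last octet of the
// client's IPv4 address, so different clients prefer different proxies.
// (Helper name and rotation scheme are assumptions for this sketch.)
function orderProxies(ip, proxies) {
  var parts = ip.split(".");
  var lastOctet = parseInt(parts[parts.length - 1], 10);
  if (isNaN(lastOctet)) {
    lastOctet = 0; // fall back to the default order on unparsable input
  }
  var start = lastOctet % proxies.length;
  return proxies.slice(start).concat(proxies.slice(0, start));
}

// Build the "PROXY a; PROXY b; ..." string that a PAC file returns.
function proxyString(ordered) {
  var out = [];
  for (var i = 0; i < ordered.length; i++) {
    out.push("PROXY " + ordered[i]);
  }
  return out.join("; ");
}

// Standard PAC entry point; myIpAddress() is supplied by the PAC runtime.
function FindProxyForURL(url, host) {
  return proxyString(orderProxies(myIpAddress(), PROXIES));
}
```

With a simple rotation like this, all the clients whose first choice is down fall through to the same second proxy, which illustrates why keeping the load balanced during a failure gets messy as the number of proxies grows.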

Ian talked about the use case that Shoal is trying to take care of. They are aiming toward clouds and automatically starting more proxies as necessary. They have added a PAC server interface so any client supporting it could find out the needed squids.

Dan is looking into (after a recent discussion with Brian and Dave) a similar but more dynamic use case of running opportunistically on a grid, where preemption and relatively short wall clock limits could cause squids to disappear more frequently than in clouds. It would probably be handled at the pilot job level, so at least there wouldn't be long delays waiting for jobs to find batch slots. They can't rely on virtual machines, but squids can run without root privileges. They would also have the limitation of not being able to probe the squids from offsite because of firewalls. Dave suggested using a DNS server to dynamically track squids coming and going throughout the life of jobs, since frontier and cvmfs clients frequently (every ~5 minutes) clear caches of DNS names but only read the names at start time (or reconfigure time in the cvmfs case). Squids can also be configured to query peers for cached items before going to an upstream server; this is not currently used at grid sites, but it might be more helpful in this type of environment. Brian wondered whether the clients could instead be changed to re-read PAC files every 5 minutes, rather than setting up another mechanism with dynamically updated DNS servers. After the meeting Dave concluded that re-reading would cause a huge increase in the load on the PAC file servers, and that DNS is an extremely efficient system with local caches everywhere, so it's probably better to use DNS. That's also better for squids finding their peers (need to verify whether squids use expiring names for cache_peer, since some names in squid expire and some don't).
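The DNS idea depends on clients keeping the same proxy name for the life of a job while periodically dropping the cached addresses behind it. A rough sketch of that behaviour follows. It is not taken from the frontier or cvmfs clients: the function makeSquidLookup, the injected resolve and now callbacks, and the name squids.example.org are all hypothetical, and the resolver is synchronous only to keep the sketch short.

```javascript
// Return a lookup function that caches the addresses for a name and
// re-resolves them once the cache is older than ttlMs.
// resolve(name) -> array of addresses, and now() -> milliseconds, are
// injected so the sketch does not depend on any particular DNS library.
function makeSquidLookup(resolve, ttlMs, now) {
  var cached = null;
  var fetchedAt = 0;
  return function (name) {
    var t = now();
    if (cached === null || t - fetchedAt >= ttlMs) {
      cached = resolve(name); // squids that came or went show up here
      fetchedAt = t;
    }
    return cached;
  };
}

// Example wiring with a fake resolver and clock (both assumptions):
var fakeTime = 0;
var fakeAnswers = { "squids.example.org": ["10.0.0.1"] };
var lookup = makeSquidLookup(
  function (name) { return fakeAnswers[name].slice(); },
  5 * 60 * 1000, // roughly the 5 minute expiry mentioned above
  function () { return fakeTime; }
);
```

In this arrangement the name is configured once, and only the address list changes, so a dynamically updated DNS server can add or remove squids without the clients re-reading a PAC file.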

It was recognized that the use cases of opportunistic grid use and cloud use do have a lot in common. Ian and Dan will talk more about the possibility of using one set of software tools for both cases.

We didn't meet with the whole Task Force team, so we will have to meet again. There's no big hurry to figure this out, because it depends on work based on the Squid Monitoring Task Force agreements that hasn't been started yet. Meanwhile we'll take some time to think more deeply about these things.

Topic revision: r2 - 2013-05-14 - DaveDykstra