Download sources from servers that don't respond to ping
Ryan Schmidt
ryandesign at macports.org
Sun Oct 4 16:20:34 PDT 2009
On Oct 4, 2009, at 14:39, Scott Haneda wrote:
> On Oct 3, 2009, at 3:18 PM, Ryan Schmidt wrote:
>
>> This is not perfect because as you say some servers might not
>> respond to pings (though hopefully that's rare), and some servers
>> that respond quickly to pings may not actually be fast at sending
>> the file, but it was easy enough to implement and better than what
>> we had. We're open to suggestions on further improvements for auto-
>> selecting fast download locations.
>
> I had the misfortune of some bad routes with my colocation provider
> a while back; traceroute was the main tool I was using. This,
> combined with Comcast doing strange things on their routers, gave me
> a chance to learn a little more about checking hosts.
>
> Rather than ping, what about "curl --head example.com"? Of course
> you can be as specific as you want and attach as deep a URI as you
> like.
>
> This may not work, as I believe it's a port 80 based test and I'm
> not sure what ports need testing. Though if it's currently using
> ping, as unreliable as that is, I can't see curl being anything but
> an improvement.
>
> I assume these are rsync servers, or are we talking about the actual
> servers that hold the distro? In the latter case, curl would work
> perfectly; in the former, there would be a requirement that http be
> active. Or perhaps curl could be told to hit another port.
>
> Where ping and traceroute fail me for basic up-status testing, curl
> with a --head flag always gives me more data, in more detail.
>
> Any reasons this would not be a better approach?
We are talking about distfile servers (the servers portfiles specify
in their master_sites line), which MacPorts currently checks
automatically for quality using ping, as I described. We are not
talking about rsync servers, which are used only to retrieve portfiles
during "sudo port sync" and MacPorts base code during "sudo port
selfupdate", and which the user must still select manually in
sources.conf.
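For reference, a portfile lists its candidate distfile servers on a
master_sites line, while sources.conf names the rsync source. The
mirror URLs below are placeholders for illustration, not real hosts:

    # in a Portfile: candidate distfile servers, tried in sorted order
    master_sites    http://downloads.example.com/foo/ \
                    http://mirror.example.org/pub/foo/

    # in sources.conf: the rsync server, chosen manually by the user;
    # the entry typically looks something like this
    rsync://rsync.macports.org/release/ports/ [default]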
"curl --head" could be used to determine whether we can connect to a
server, but it doesn't give you a quality indication that could be
used to rank multiple available servers. "ping" tells you how many
milliseconds it took to reach the server, which is how we sort
distfile servers today. You might be able to use "time" to time a curl
command, but I have a feeling that would be less accurate because it
would time the entire curl process, and not just the network access
part like ping's internal timing does. I'm also fairly sure doing an
HTTP or HTTPS or FTP request and response, like curl would do, would
take a lot more time than a ping. This is not to say ping is the only
solution we should use, but these are probably some of the reasons why
it was selected in the first place, and some things to consider when
thinking about if it should be replaced.
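As a rough illustration of the difference, one could compare the two
by hand; the host name below is a placeholder:

    # ping reports its own per-packet round-trip time in milliseconds
    ping -c 3 distfiles.example.org

    # timing a curl HEAD request measures the whole process: DNS lookup,
    # TCP handshake, the HTTP exchange, and curl's startup overhead
    time curl --head --silent --output /dev/null http://distfiles.example.org/

    # curl's --write-out variable isolates the transfer time somewhat,
    # but it still includes the full TCP and HTTP round trips
    curl --head --silent --output /dev/null \
        --write-out '%{time_total}\n' http://distfiles.example.org/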
If a server does not respond to pings, it will simply appear toward
the end of the available servers, and other servers, if available,
will be tried first. The 3 or 4 MacPorts distfile mirrors will almost
certainly be available, so the file should be retrievable from one of
those. And if no other server has the file (perhaps the distfiles
mirrors haven't mirrored it yet), then the server that doesn't respond
to pings will still be tried, albeit last. So at worst this should
result in slightly slower downloads for users close to the unpingable
server. I don't think that consequence is worth rewriting the MacPorts
base code for.
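To make the ranking concrete, here is a rough shell sketch of the
idea, not MacPorts' actual implementation (which lives in the Tcl base
code); the host names are placeholders, and unpingable hosts get a
large sentinel value so they sort to the end, exactly the fallback
behavior described above:

    # rank candidate hosts by average ping round-trip time
    for host in mirror1.example.org mirror2.example.org mirror3.example.org; do
        # -c 1: send one packet; -t 2: give up after 2 seconds (BSD/macOS ping)
        rtt=$(ping -c 1 -t 2 "$host" 2>/dev/null | awk -F/ '/round-trip/ {print $5}')
        echo "${rtt:-99999} $host"
    done | sort -n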
Note that I am unable to explain the symptom Jann experienced:
receiving a 0-byte file instead of a complete file or a failure
response. That sounds like a misconfigured server and would have
nothing to do with the nearby-server selection mechanism.