Download sources from servers that don't respond to ping

Eric Hall opendarwin.org at darkart.com
Sun Oct 4 21:16:19 PDT 2009


On Sun, Oct 04, 2009 at 06:20:34PM -0500, Ryan Schmidt wrote:
> On Oct 4, 2009, at 14:39, Scott Haneda wrote:
> 
> >On Oct 3, 2009, at 3:18 PM, Ryan Schmidt wrote:
> >
> >>This is not perfect because, as you say, some servers might not
> >>respond to pings (though hopefully that's rare), and some servers
> >>that respond quickly to pings may not actually be fast at sending
> >>the file, but it was easy enough to implement and better than what
> >>we had. We're open to suggestions on further improvements for
> >>auto-selecting fast download locations.
> >
[snip]
> >
> >Rather than ping, what about "curl --head example.com"? Of course
> >you can be as specific as you want and attach as deep a URI as you
> >like.
> >
> >This may not work, as I believe it's a port-80-based test and I'm
> >not sure which ports need testing. Though if it's currently using
> >ping, unreliable as that is, I can't see curl being anything but
> >an improvement.
[snip]
> 
> We are talking about distfile servers (the servers portfiles specify
> in their master_sites line), which MacPorts currently checks and
> ranks automatically using ping, as I described. We are not talking
> about rsync servers, which are used only to retrieve portfiles
> during "sudo port sync" and MacPorts base code during "sudo port
> selfupdate", and which the user must still select manually in
> sources.conf.
> 
> "curl --head" could be used to determine whether we can connect to a  
> server, but it doesn't give you a quality indication that could be  
> used to rank multiple available servers. "ping" tells you how many  
> milliseconds it took to reach the server, which is how we sort  
> distfile servers today. You might be able to use "time" to time a curl  
> command, but I have a feeling that would be less accurate because it  
> would time the entire curl process, and not just the network access  
> part like ping's internal timing does. I'm also fairly sure doing an  
> HTTP or HTTPS or FTP request and response, like curl would do, would  
> take a lot more time than a ping. This is not to say ping is the only  
> solution we should use, but these are probably some of the reasons why  
> it was selected in the first place, and some things to consider when  
> thinking about if it should be replaced.

[snip]
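
	As an aside, the "time a curl command" idea could also be
done in-process with internal timing, which at least avoids
counting curl's process startup.  A rough sketch in Python (the
hostname below is only an example):

import http.client
import time

host = "distfiles.macports.org"   # example host only
start = time.monotonic()
conn = http.client.HTTPConnection(host, 80, timeout=5)
conn.request("HEAD", "/")
response = conn.getresponse()
elapsed = time.monotonic() - start
conn.close()
print("HEAD / -> %d %s in %.1f ms"
      % (response.status, response.reason, elapsed * 1000.0))

	Note this still measures a DNS lookup plus a full
request/response round trip, which supports Ryan's point that it
will be slower than a bare ping.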

	Instead of using curl, why not open a socket connection
directly to "the right port(s)" for the service used to deliver
distfiles?  That will be tcp/80 for most distfile servers; some
will use different ports (svn or FTP, for example).  Start a TCP
connection and, once the socket is established, check the timing.
That also tells you whether the service you want to use is
actually available.
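
	A minimal sketch of that idea in Python follows; the mirror
list and ports here are purely illustrative, not how MacPorts
actually stores or selects master_sites:

import socket
import time

# Illustrative mirror list; a real implementation would take these
# from the port's master_sites.
MIRRORS = [
    ("distfiles.macports.org", 80),   # HTTP
    ("ftp.gnu.org", 21),              # FTP
]

def connect_time(host, port, timeout=5.0):
    """Seconds to complete the TCP handshake, or None if unreachable."""
    start = time.monotonic()
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return None
    elapsed = time.monotonic() - start
    sock.close()
    return elapsed

timed = [(connect_time(h, p), h, p) for h, p in MIRRORS]
ranked = sorted(t for t in timed if t[0] is not None)
for secs, host, port in ranked:
    print("%-28s port %-4d %6.1f ms" % (host, port, secs * 1000.0))
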
	In fact, if this were done "at the right place" (TM), the
"fastest" connection could become the connection used for the
actual download, and the other connections could simply be
dropped.  Note that I have not looked at the code that implements
the downloads; it may not be practical to insert timing checks and
connection-drop logic there.
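
	For what it's worth, the connection-reuse idea might look
roughly like the following sketch, where the first handshake to
complete wins and the rest are dropped; real code would hand the
winning socket to the download logic instead of printing:

import socket
from concurrent.futures import ThreadPoolExecutor, as_completed

MIRRORS = [("distfiles.macports.org", 80),   # examples only
           ("ftp.gnu.org", 21)]

def connect(addr):
    # Raises OSError if the mirror is unreachable.
    return addr, socket.create_connection(addr, timeout=5.0)

winner = None
with ThreadPoolExecutor(max_workers=len(MIRRORS)) as pool:
    futures = [pool.submit(connect, m) for m in MIRRORS]
    for done in as_completed(futures):
        try:
            addr, sock = done.result()
        except OSError:
            continue                # unreachable; try the next one
        if winner is None:
            winner = (addr, sock)   # fastest handshake wins
        else:
            sock.close()            # drop the slower connections

if winner:
    addr, sock = winner
    print("would download from %s:%d over this socket" % addr)
    sock.close()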



		-eric


