Communicating Network Topology Information
From Gnutella Developers
Capturing the network topology, whether to measure its overall diameter, analyze its connectivity, or compute other network metrics needed to tune protocol implementation parameters, is a difficult task and tricky to implement. In addition, any crawler can impose significant load on the network, so crawlers should not be deployed on a wide scale, as they may cause significant degradation of global network performance.
If you're interested in studying the Gnutella topology, contact User:Agthorr who can provide snapshots of the topology.
This document is a quick overview of how best to make your Gnutella 0.6 client compatible with the LimeWire crawler.
Initially, this topology was captured, with more or less success, by using the Gnutella Ping/Pong messages and other host discovery mechanisms (such as the now-obsolete global pong caches, Gwebcaches, or, today, UHC-capable hosts used for discovering alternate servents) to find hosts for new connection attempts. This method is still usable, but it is not fast enough to compute usable network topology snapshots in a reasonable time, because it requires initiating full Gnutella connections, which involves several round trips for header processing.
This page documents a simple connection method that saves many message round trips, allowing capturing crawlers to perform much better, with lower impact on the network.
This extension should still be considered experimental, and may be replaced by other mechanisms in the future, notably with UDP-based crawlers.
Upon a connection attempt, if you receive the header
Crawler: majorversion.minorversion
respond with the headers
Leaves: ipleaf1:port, ipleaf2:port, ...
Peers: ippeer1:port, ippeer2:port, ...
After finishing the connection process, just close the connection.
The majorversion.minorversion of the crawler header is currently 0.1.
Leaves specifies the IP address and port of each leaf node connected to your client if it is running as an ultrapeer. Peers specifies the non-leaf clients that you are connected to (i.e. ultrapeers, and 0.6 and 0.4 style clients).
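The client side of this exchange can be sketched in Python as follows. This is only an illustration, not code from any real servent: the function names, the ExampleServent user agent, and the choice to return None when no Crawler header is present are all hypothetical. It assumes the CRLF line endings and blank-line terminator of the Gnutella 0.6 handshake, and uses the comma-only separator shown in the example exchange.

```python
def format_hosts(hosts):
    """Join (ip, port) pairs as 'ip:port' entries separated by commas."""
    return ",".join(f"{ip}:{port}" for ip, port in hosts)


def build_crawler_response(request_headers, leaves, peers,
                           user_agent="ExampleServent/0.1"):
    """Build the handshake reply for a crawler.

    request_headers: dict of header name -> value from the incoming handshake.
    leaves, peers:   lists of (ip, port) tuples for current connections.
    Returns None when the request carries no Crawler header (hypothetical
    convention meaning "handle this as a normal connection").
    """
    # Header names are compared case-insensitively, as in HTTP.
    names = {name.lower() for name in request_headers}
    if "crawler" not in names:
        return None
    lines = [
        "GNUTELLA/0.6 200 OK",
        f"User-Agent: {user_agent}",
        f"Leaves: {format_hosts(leaves)}",
        f"Peers: {format_hosts(peers)}",
    ]
    # Assumption: CRLF line endings, terminated by an empty line.
    return "\r\n".join(lines) + "\r\n\r\n"


reply = build_crawler_response(
    {"User-Agent": "LimeWire (crawl)", "Crawler": "0.1"},
    leaves=[("127.0.0.1", 6346), ("127.0.0.2", 6346)],
    peers=[("127.0.0.4", 6346)],
)
```

A servent that is not an ultrapeer has no leaves, so it would pass an empty list and send an empty Leaves value.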
Crawler                          Client
-----------------------------------------------------------
GNUTELLA CONNECT/0.6
User-Agent: LimeWire (crawl)
X-Ultrapeer: False
Query-Routing: 0.1
Crawler: 0.1
                                 GNUTELLA/0.6 200 OK
                                 User-Agent: BearShare
                                 Leaves: 127.0.0.1:6346,127.0.0.2:6346,127.0.0.3:6346
                                 Peers: 127.0.0.4:6346,127.0.0.5:6346,127.0.0.6:6346
GNUTELLA/0.6 200 OK
This is the preferred method of delivering leaves and peers to the LimeWire crawler.
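From the crawler's side, reading these headers could look like the following Python sketch. The helper names are hypothetical and this is not the LimeWire crawler's actual code. It assumes CRLF-terminated header lines, accepts entries separated by commas with or without a following space, and silently skips malformed entries.

```python
def parse_host_list(value):
    """Parse 'ip:port,ip:port,...' into a list of (ip, port) tuples."""
    hosts = []
    for item in value.split(","):
        item = item.strip()
        ip, sep, port = item.rpartition(":")
        # Skip empty or malformed entries instead of failing the crawl.
        if not sep or not ip or not port.isdigit():
            continue
        hosts.append((ip, int(port)))
    return hosts


def parse_crawl_reply(reply):
    """Return (status_line, leaves, peers) from a handshake reply string."""
    lines = reply.split("\r\n")
    status = lines[0]
    headers = {}
    for line in lines[1:]:
        if not line:  # an empty line ends the header block
            break
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    leaves = parse_host_list(headers.get("leaves", ""))
    peers = parse_host_list(headers.get("peers", ""))
    return status, leaves, peers


example = (
    "GNUTELLA/0.6 200 OK\r\n"
    "User-Agent: BearShare\r\n"
    "Leaves: 127.0.0.1:6346,127.0.0.2:6346,127.0.0.3:6346\r\n"
    "Peers: 127.0.0.4:6346,127.0.0.5:6346,127.0.0.6:6346\r\n"
    "\r\n"
)
status, leaves, peers = parse_crawl_reply(example)
```

Each (ip, port) pair found this way gives one edge of the topology graph between the crawled host and a neighbour.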
The LimeWire crawler will currently wait 40 seconds before disconnecting. In fact, it always attempts to employ the following backup algorithm.
Traditional Crawler Pings
As a backup to the header-only algorithm, the LimeWire crawler establishes the connection as above and then sends a crawler ping with (hops=0, TTL=2) and reads your pong responses. After sending the crawler ping, it also sends a network ping with (hops=1, TTL=6) in an attempt to discover more hosts. All pongs are processed. Note that they should not be "Big" pongs with GGEP extensions.
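The two pings differ only in their TTL and hops fields. The following Python sketch builds them using the standard 23-byte Gnutella message header (16-byte message ID, payload type, TTL, hops, and a little-endian payload length). The function name is hypothetical, and using a fully random message ID is a simplification: real servents may set particular marker bytes in the ID.

```python
import os
import struct

PING = 0x00  # payload type of a Gnutella ping


def build_ping(ttl, hops):
    """Return a 23-byte ping message with an empty payload."""
    message_id = os.urandom(16)  # simplification: purely random ID
    # "<BBBI": payload type, TTL, hops (1 byte each), then the
    # payload length as a 4-byte little-endian integer (0 for a plain ping).
    return message_id + struct.pack("<BBBI", PING, ttl, hops, 0)


# Crawler ping: reaches the connected host and its direct neighbours.
crawler_ping = build_ping(ttl=2, hops=0)

# Network ping: sent afterwards in an attempt to discover more hosts.
network_ping = build_ping(ttl=6, hops=1)
```

Because the payload length is zero, neither ping carries a GGEP block.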
- UHC UDP-based Host Caching Protocol: the now preferred host-discovery method, based on Gnutella ping/pongs over UDP.
- HSEP Horizon Size Estimation Protocol: experimental extension.