Communicating Network Topology Information

From Gnutella Developers

<< Standard Message Architecture | Advertizing Shared Content >> | Main Page

Important Notice

Capturing the network topology, whether to measure its overall diameter, analyze its connectivity, or compute other metrics needed to tune protocol implementation parameters, is a difficult task and tricky to implement. In addition, any crawler can impose significant load on the network, so crawlers should not be deployed on a wide scale, as they may significantly degrade global network performance.

If you're interested in studying the Gnutella topology, contact User:Agthorr who can provide snapshots of the topology.

This document is a quick overview of how best to make your Gnutella 0.6 client compatible with the LimeWire crawler.

Initially, the topology was captured, with more or less success, by using Gnutella Ping/Pong messages and other host discovery mechanisms (such as the now-obsolete global pong caches, GWebCaches, or, today, UHC-capable hosts used for discovering alternate servents) to find hosts for new connection attempts. This method is still usable, but it is too slow to compute usable network topology snapshots in a reasonable time, because it requires initiating full Gnutella connections, which involves several round trips for header processing.

This page documents a simple connection method that saves many message round trips, giving capturing crawlers much better performance with lower impact on the network.

This extension should still be considered experimental, and may be replaced by other mechanisms in the future, notably with UDP-based crawlers.

Header Processing

Upon a connection attempt, if you receive the header

   Crawler: majorversion.minorversion

respond with the headers

   Leaves: ipleaf1:port, ipleaf2:port, ...
   Peers: ippeer1:port, ippeer2:port, ...

After finishing the connection process, just close the connection.

The majorversion.minorversion of the crawler header is currently 0.1.

The Leaves header lists the IP address and port of each leaf node connected to your client, if it is running as an ultrapeer. The Peers header lists the non-leaf clients that you are connected to (i.e. ultrapeers, and 0.6 and 0.4 style clients).
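The servent side of the header processing described above can be sketched as follows. This is a minimal illustration, not code from any real client: the function name crawler_response_headers, the dict representation of headers, and the (ip, port) tuples are all assumptions, as is the case-insensitive matching of the header name. Hosts are joined with a bare comma, as in the example handshake on this page.

```python
def crawler_response_headers(request_headers, leaves, peers):
    """Return the extra handshake headers for a crawler, or {} otherwise.

    request_headers: dict of header name -> value from the incoming handshake
                     (hypothetical representation).
    leaves, peers:   iterables of (ip, port) tuples for current connections.
    """
    # Assumption: header names are matched case-insensitively, HTTP-style.
    normalized = {name.strip().lower(): value.strip()
                  for name, value in request_headers.items()}
    if "crawler" not in normalized:
        return {}  # ordinary connection attempt: nothing to add

    def join(hosts):
        return ",".join(f"{ip}:{port}" for ip, port in hosts)

    return {"Leaves": join(leaves), "Peers": join(peers)}
```

The caller would append these headers to its handshake response and close the connection once the handshake completes.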

Example:

      Crawler                          Client
      -----------------------------------------------------------
      GNUTELLA CONNECT/0.6
      User-Agent: LimeWire (crawl)
      X-Ultrapeer: False
      Query-Routing: 0.1
      Crawler: 0.1

                                       GNUTELLA/0.6 200 OK
                                       User-Agent: BearShare
                                       Leaves: 127.0.0.1:6346,127.0.0.2:6346,127.0.0.3:6346
                                       Peers: 127.0.0.4:6346,127.0.0.5:6346,127.0.0.6:6346

      GNUTELLA/0.6 200 OK

      Disconnect                       Disconnect


This is the preferred method of delivering leaves and peers to the LimeWire crawler.
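On the crawler side, reading the response in the example above amounts to extracting the two host lists. The sketch below is illustrative only: parse_crawl_response and parse_host_list are hypothetical names, and it assumes the whole response header block has already been read into a string. It accepts both "," and ", " as separators, since both forms appear on this page, and silently skips malformed entries.

```python
def parse_host_list(value):
    """Split a 'ip:port,ip:port' header value into (ip, port) tuples."""
    hosts = []
    for item in value.split(","):
        item = item.strip()
        ip, _, port = item.rpartition(":")
        if not ip or not port.isdigit():
            continue  # skip empty or malformed entries
        hosts.append((ip, int(port)))
    return hosts


def parse_crawl_response(raw):
    """Parse a handshake response and return its status and host lists."""
    lines = raw.splitlines() or [""]
    headers = {}
    for line in lines[1:]:
        if not line:
            break  # a blank line ends the header block
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return {
        "ok": lines[0].startswith("GNUTELLA/0.6 200"),
        "leaves": parse_host_list(headers.get("leaves", "")),
        "peers": parse_host_list(headers.get("peers", "")),
    }
```

A client that does not support the Crawler header simply yields empty lists here, which is the case where the crawler has to fall back to another method.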

The LimeWire crawler currently waits 40 seconds before disconnecting. In practice, it always also attempts the following backup algorithm.

Traditional Crawler Pings

As a backup to the pure header-only algorithm, the LimeWire crawler establishes the connection as above, then sends a crawler ping with (hops=0, TTL=2) and reads your pong responses. After sending the crawler ping, it also sends a network ping with (hops=1, TTL=6) in an attempt to discover more hosts. All pongs are processed. Note that they should not be "Big" pongs with GGEP extensions.
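For illustration, the two pings described above could be built as shown below. The 23-byte descriptor header layout (16-byte GUID, payload type, TTL, hops, little-endian payload length) comes from the general Gnutella message format rather than from this page, and the function name build_ping and the use of a fully random GUID are assumptions of this sketch.

```python
import os
import struct

PING = 0x00  # payload descriptor for a Ping message


def build_ping(ttl, hops):
    """Return a 23-byte Gnutella Ping descriptor with an empty payload."""
    guid = os.urandom(16)  # assumption: a plain random message ID
    # type, TTL, hops as single bytes; payload length as little-endian uint32
    return guid + struct.pack("<BBBI", PING, ttl, hops, 0)


crawler_ping = build_ping(ttl=2, hops=0)  # answered by the host and its neighbours
network_ping = build_ping(ttl=6, hops=1)  # broader host discovery
```

Both messages carry no payload, so the payload length field is zero in each.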

See Also

  • UHC UDP-based Host Caching Protocol: the now preferred host-discovery method, based on Gnutella ping/pongs over UDP.
  • HSEP Horizon Size Estimation Protocol: experimental extension.
