Standard Message Architecture

From Gnutella Developers


Source - [Latest draft (http://rfc-gnutella.sourceforge.net/src/rfc-0_6-draft.html)]

Once a servent has connected successfully to the network, it communicates with other servents by sending and receiving Gnutella protocol messages. Each message is preceded by a Message Header with the byte structure given below.

  • Note 1: One IP packet may contain several Gnutella messages, and one Gnutella message may be split across multiple IP packets. A servent can therefore never assume that a Gnutella message ends where the chunk of data read from the socket ends.
  • Note 2: All fields in the following structures are in little-endian byte order unless otherwise specified.
  • Note 3: All IP addresses in the following structures are in IPv4 format. For example, the IPv4 byte array:


byte 0 byte 1 byte 2 byte 3
0xC0 0x00 0x02 0x85


represents the dotted address 192.0.2.133.
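
As a small illustration of Note 3, the sketch below converts the four address bytes from the example into dotted form. The helper name ipv4_from_bytes is hypothetical; it only wraps Python's standard ipaddress module.

```python
import ipaddress


def ipv4_from_bytes(raw: bytes) -> str:
    """Convert four address bytes, most significant first, to dotted form."""
    if len(raw) != 4:
        raise ValueError("an IPv4 address is exactly 4 bytes")
    return str(ipaddress.IPv4Address(bytes(raw)))


print(ipv4_from_bytes(bytes([0xC0, 0x00, 0x02, 0x85])))  # 192.0.2.133
```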



Message Header

The message header is 23 bytes divided into the following fields.


Bytes   Description
0-15    Message ID/GUID (Globally Unique ID)
16      Payload Type
17      TTL (Time To Live)
18      Hops
19-22   Payload Length


Message ID

A 16-byte string (GUID) uniquely identifying the message on the network.

Servents SHOULD store all 1's (0xff) in byte 8 of the GUID. (Bytes are numbered 0-15, inclusive.) This serves to tag the GUID as being from a modern servent.

Servents SHOULD initially store all 0's in byte 15 of the GUID. This is reserved for future use.

The other bytes SHOULD have random values.
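
A minimal sketch of generating a Message ID that follows the three rules above. The function name new_message_id is hypothetical, and os.urandom is just one possible source of random bytes.

```python
import os


def new_message_id() -> bytes:
    """Build a 16-byte GUID: random bytes, with bytes 8 and 15 fixed."""
    guid = bytearray(os.urandom(16))
    guid[8] = 0xFF   # tags the GUID as coming from a modern servent
    guid[15] = 0x00  # reserved for future use
    return bytes(guid)
```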

Payload Type

Indicates the type of message. Gnutella servents MUST accept all the following types:


Type   Message
0x00   Ping
0x01   Pong
0x02   Bye
0x40   Push
0x80   Query
0x81   Query Hit


Other Gnutella messages can be used, but if so the servent MUST first make sure that the remote host supports this new message type. This can be done using handshaking headers.

TTL

Time To Live. The number of times the message will be forwarded by Gnutella servents before it is removed from the network. Each servent decrements the TTL before passing the message on to another servent. Once the TTL reaches 0, the message MUST NOT be forwarded.

Hops

The number of times the message has been forwarded. As a message is passed from servent to servent, the TTL and Hops fields of the header must satisfy the following condition:

  TTL(0) = TTL(i) + Hops(i)

where TTL(i) and Hops(i) are the values of the TTL and Hops fields of the message after it has been forwarded i times, and TTL(0) is the maximum number of hops the message will travel (usually 7).
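
The forwarding rule can be sketched as a small function that keeps TTL + Hops constant. The name next_hop_fields is hypothetical, and the sketch assumes one reading of the text: a message whose TTL would be 0 after the decrement is not passed on.

```python
def next_hop_fields(ttl: int, hops: int):
    """Return the (ttl, hops) pair to send onward, or None to stop forwarding.

    Assumption: the TTL is decremented first, and a result of 0 means the
    message is not forwarded any further.
    """
    if ttl <= 1:
        return None
    return ttl - 1, hops + 1


# TTL + Hops stays equal to the initial TTL at every step.
fields = (7, 0)
while fields is not None:
    assert sum(fields) == 7
    fields = next_hop_fields(*fields)
```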

Payload Length

The length, in bytes, of the payload immediately following this header. The next message header begins exactly this number of bytes after the end of this header; that is, there are no gaps or pad bytes in the Gnutella data stream. Messages SHOULD NOT be larger than 4 kB.

The Payload Length field is the only reliable way for a servent to find the beginning of the next message in the input stream. Therefore, servents SHOULD rigorously validate the Payload Length field for each message received. If a servent becomes out of sync with its input stream, it SHOULD close the connection associated with the stream, since the upstream servent is either generating or forwarding invalid messages.
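
Because message boundaries do not line up with socket reads, a receiver typically accumulates bytes in a buffer and cuts complete messages out of it. The sketch below shows one way to do that. The name extract_messages and the 4096-byte limit (one reading of the 4 kB recommendation) are assumptions; the caller is expected to close the connection when ValueError is raised.

```python
import struct

HEADER_LEN = 23
MAX_PAYLOAD = 4096  # assumed limit, taken from the 4 kB recommendation


def extract_messages(buffer: bytearray):
    """Remove and return all complete (header, payload) pairs from buffer.

    Incomplete trailing data stays in the buffer for the next socket read.
    """
    messages = []
    while len(buffer) >= HEADER_LEN:
        # Payload Length is a little-endian 32-bit value at bytes 19-22.
        (length,) = struct.unpack_from("<I", buffer, 19)
        if length > MAX_PAYLOAD:
            raise ValueError("implausible Payload Length, stream out of sync")
        end = HEADER_LEN + length
        if len(buffer) < end:
            break  # wait for more data
        messages.append((bytes(buffer[:HEADER_LEN]), bytes(buffer[HEADER_LEN:end])))
        del buffer[:end]
    return messages
```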

Abuse of the TTL field in broadcast messages (Query) leads to unnecessary network traffic and poor network performance. Therefore, servents SHOULD carefully check the TTL fields of received Query messages and lower them as necessary. Assuming the servent's maximum admissible Query message life is 7 hops, then if TTL + Hops > 7, TTL SHOULD be decreased so that TTL + Hops = 7. Broadcast messages with very high TTL values (>15) SHOULD be dropped.
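
The TTL check described above amounts to a clamp plus a drop rule. A minimal sketch follows, using the example limits from the text (7 and 15) as constants. The name normalize_query_ttl is hypothetical, and the handling of messages whose Hops value already exceeds the limit (TTL forced to 0) is an assumption.

```python
MAX_QUERY_LIFE = 7  # maximum admissible TTL + Hops, as in the example above
DROP_ABOVE_TTL = 15  # TTL values above this are treated as abusive


def normalize_query_ttl(ttl: int, hops: int):
    """Return the TTL to keep for a received Query, or None to drop it."""
    if ttl > DROP_ABOVE_TTL:
        return None
    if ttl + hops > MAX_QUERY_LIFE:
        # Lower the TTL so that TTL + Hops equals the limit (never below 0).
        return max(MAX_QUERY_LIFE - hops, 0)
    return ttl
```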

Immediately following the message header is a payload consisting of one of the following messages.


Ping (0x00)

Ping messages MAY contain a GGEP extension block (see Section 2.3), but no other payload.


Pong (0x01)

Pong messages contain information about a Gnutella host. The message has the following payload:


Bytes Field name Description
0-1 Port Number The port number on which the responding host can accept incoming connections.
2-5 IP Address The IP address of the responding host. Note: This field is in big-endian format.
6-9 Number of shared files The number of files that the servent with the given IP address and port is sharing on the network.
10-13 Number of kilobytes shared The number of kilobytes of data that the servent with the given IP address and port is sharing on the network.
14- GGEP block OPTIONAL extension (see GGEP).
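
A minimal sketch of decoding the fixed part of a Pong payload. Note the mixed byte order: the port and the two counters are little-endian, while the IP address bytes are taken in the order they appear. The function name parse_pong and the dictionary keys are illustrative only.

```python
import ipaddress
import struct


def parse_pong(payload: bytes) -> dict:
    """Decode a Pong payload; any bytes after offset 14 are the GGEP block."""
    if len(payload) < 14:
        raise ValueError("a Pong payload is at least 14 bytes")
    (port,) = struct.unpack_from("<H", payload, 0)
    ip = ipaddress.IPv4Address(bytes(payload[2:6]))  # big-endian field
    files, kilobytes = struct.unpack_from("<II", payload, 6)
    return {
        "port": port,
        "ip": str(ip),
        "files": files,
        "kilobytes": kilobytes,
        "ggep": bytes(payload[14:]),
    }
```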


Pong messages are only sent in response to an incoming Ping message. It is common for a servent to send all recently-received Pongs in response to every single Ping message. This enables host caches to send cached servent address information in response to a Ping request.

The Message ID of a Pong message MUST be the Message ID of the Ping message it is sent in reply to.

The fields specifying the number of shared files and the number of kilobytes shared were intended to allow one to measure the amount of data available on the network. With a very large Gnutella network, and minimized Ping and Pong message traffic, this can no longer be done. Still, these fields SHOULD be filled out correctly.

Query (0x80)

Since Query messages are broadcast to many nodes, the total size of the message SHOULD NOT be larger than 256 bytes. Servents MAY drop Query messages larger than 256 bytes, and SHOULD drop Query messages larger than 4 KB.

A Query message has the following payload:


Bytes Field name Description
0-1 Minimum Speed (Flags) The minimum speed (in kb/second) of servents that should respond to this message. A servent receiving a Query message with a Minimum Speed field of n kb/s SHOULD only respond with a Query Hit if it is able to communicate at a speed >= n kb/s.
2- Search Criteria This field is terminated by a NUL (0x00). See section 2.2.7.3 for rules and information on how to interpret the Search Criteria.
Rest Extensions Block OPTIONAL. The rest of the query message is used for extensions to the original query format. The allowed extension types are GGEP, HUGE and XML (see Section 2.3 and Appendices 1 and 2).

If two or more of these extension types exist together, they are separated by a 0x1C (file separator) byte. Since GGEP blocks can contain 0x1C bytes, the GGEP block, if present, MUST be located after any HUGE and XML blocks.

The type of each block can be determined by looking for the prefixes "urn:" for a HUGE block, "<" or "{" for XML and 0xC3 for GGEP.

The extension block SHOULD NOT be followed by a null (0x00) byte, but some servents wrongly do that.
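
The rules above can be sketched as a small splitter: read the two-byte flags field, take the search criteria up to the first NUL, then split the rest on 0x1C until a GGEP block is reached, which consumes everything that remains. The function names are hypothetical, and stripping a stray trailing NUL from non-GGEP blocks is an assumed way of tolerating the misbehaving servents mentioned above.

```python
def classify_block(block: bytes) -> str:
    """Guess the extension type from its prefix, as described in the text."""
    if block.startswith(b"\xc3"):
        return "GGEP"
    if block.startswith(b"urn:"):
        return "HUGE"
    if block.startswith((b"<", b"{")):
        return "XML"
    return "unknown"


def parse_query(payload: bytes):
    """Return (flags_bytes, search_criteria, [(type, block), ...])."""
    nul = payload.find(b"\x00", 2)
    if len(payload) < 3 or nul == -1:
        raise ValueError("search criteria is not NUL-terminated")
    flags, criteria, rest = payload[:2], payload[2:nul], payload[nul + 1:]
    blocks = []
    while rest:
        if rest.startswith(b"\xc3"):
            # GGEP may contain 0x1C bytes, so it takes the remainder as-is.
            blocks.append(("GGEP", rest))
            break
        head, _, rest = rest.partition(b"\x1c")
        head = head.rstrip(b"\x00")  # assumption: tolerate a stray NUL
        if head:
            blocks.append((classify_block(head), head))
    return flags, criteria, blocks
```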


Flags field semantics

The first two bytes of the Query message payload were previously used to signal the minimum speed required of the sharing host. The value was in little-endian format. This use has now been deprecated.

The new semantic is in big-endian format. The highest bit in big-endian format (bit 15) is used as a flag to detect queries with the new semantic. This bit MUST be set. If bit 15 is not set, then this is a query with the legacy minimum speed semantic, and the field MAY be ignored, but servents MUST NOT ignore the entire query. If bit 15 is set, then this is a query with the new semantic. Note, however, that bit 15 in the new semantic was bit 7 in the legacy one (encoding 128 kbps), so a legacy minimum speed value with that bit set cannot be told apart from a query using the new semantic.

In the new semantic, each bit (except for bit 15) is used as a flag, mostly to indicate compatibility with new Gnutella extensions. The assignment of each bit is as follows:


Bit no Flag Description
15 MinSpeed/Flags Indicator MUST be set to 1 to indicate that the flags below are used instead of encoding the Minimum Speed.
14 Firewalled Indicator The host who sent the query is unable to accept incoming connections. This flag can be used by the remote servent to avoid returning Query Hits if it is itself firewalled, as the requesting servent will not be able to download any files.
13 XML Metadata Set this bit to 1 if you want the sharing servent to send XML Metadata in the Query Hit. This flag was assigned to save bandwidth by returning metadata in Query Hits only if the requester asks for it. If this bit is not set, the sharing host MUST NOT send XML metadata in return Query Hit messages.
12 Leaf Guided Dynamic Query When this bit is set to 1, the query was sent by a leaf that wants to control the dynamic query mechanism. This is part of the Leaf guidance of dynamic queries proposal. This information is only used by the ultrapeers shielding this leaf if they implement leaf guidance of dynamic queries. If this bit is set in a Query from a Leaf, it indicates that the Leaf will respond to Vendor Messages from its Ultrapeer about the status of the search results for the Query.
11 GGEP "H" Allowed If this bit is set to 1, then the sender is able to parse the GGEP "H" extension which is a replacement for the legacy HUGE GEM extension. This is meant to start replacing the GEM mechanism with GGEP extensions, as GEM extensions are now deprecated.
10 OOB Query This flag is used to recognize a Query which was sent using the Out Of Band Query extension.
9 ? Reserved for future use. Must be set to 0.
0-8 Maximum Query Hits Set when a maximum number of Query Hits is expected, 0 if no maximum. This does not mean that no more Query Hits may be returned, but that the query should be propagated in a way that will cause the specified number of hits.
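
A minimal sketch of decoding the two-byte field according to the table above. The field is read as big-endian; if bit 15 is clear, the same bytes are re-read as a legacy little-endian minimum speed. The function name decode_query_flags and the dictionary keys are illustrative.

```python
import struct


def decode_query_flags(field: bytes) -> dict:
    """Decode the first two bytes of a Query payload."""
    (value,) = struct.unpack(">H", field)
    if not value & 0x8000:
        # Legacy semantic: a little-endian minimum speed in kb/s.
        (min_speed,) = struct.unpack("<H", field)
        return {"legacy_min_speed": min_speed}
    return {
        "firewalled": bool(value & (1 << 14)),
        "xml_metadata": bool(value & (1 << 13)),
        "leaf_guided": bool(value & (1 << 12)),
        "ggep_h": bool(value & (1 << 11)),
        "oob_query": bool(value & (1 << 10)),
        "max_hits": value & 0x01FF,  # bits 0-8
    }
```

For example, the bytes 0xC0 0x0A decode as a new-style query from a firewalled host asking for at most 10 hits.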


Query Hit (0x81)

Query Hit messages have the following fields:


Bytes Field name Description
0 Number of Hits The number of query hits in the result set (see below).
1-2 Port The port number on which the responding host can accept incoming HTTP file requests. This is usually the same port as is used for Gnutella network traffic, but any port MAY be used.
3-6 IP Address The IP address of the responding host. Note: This field is in big-endian format.
7-10 Speed The speed (in kb/second) of the responding host.
11- Result Set A set of responses to the corresponding Query. This set contains Number_of_Hits elements. See below for the format.
x Extended QHD This block is not strictly required, but strongly recommended. It is sometimes called EQHD, or (incorrectly) just QHD.
x Private Data Undocumented vendor-specific data. This field continues till the servent Identifier, which uses the last 16 bytes of the message. See below for details.
Last 16 Servent Identifier A 16-byte string uniquely identifying the responding servent on the network. This SHOULD be constant for all Query Hit messages emitted by a servent and is typically some function of the servent's network address. The servent Identifier is mainly used for routing the Push Message (see below).


Query Hit Result Item

Each item contained in the query hit result is structured as follows:


Bytes Field name Description
0-3 File Index A number, assigned by the responding host, which is used to uniquely identify the file matching the corresponding query.
4-7 File Size The size (in bytes) of the file whose index is "File Index". For large files whose size cannot be expressed with a 32-bit integer, a GGEP LF block can be used in the extensions block.
8- File Name The name of the file whose index is "File Index". Terminated by a null byte (i.e. 0x00).
x Extensions block. Allowed extension types are HUGE, GGEP and plain text metadata. This field is terminated by a null (0x00), even if there are no extensions (resulting in a double null). Also, the extensions block itself MUST NOT contain any null bytes.

If two or more of these extension types exist together, they are separated by a 0x1C (file separator) byte. Since GGEP blocks can contain 0x1C bytes, the GGEP block, if present, MUST be located after any HUGE and plain text blocks.

The type of each block can be determined by looking for the prefixes "urn:" for a HUGE block, 0xC3 for GGEP and anything else is probably plain text metadata.

Plain text metadata is intended to be displayed directly to the user. It was first invented by Gnotella (a now discontinued Gnutella servent) to tag MP3 files. Examples:

 "192 kbps 44 kHz 3:23"
 "120 kbps(VBR) 44kHz 3:55" (variable bitrate)

Other plain text formats MAY be used.
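
The result item layout described above (two little-endian integers followed by two NUL-terminated fields) can be walked with a small cursor-based parser. This is a sketch only; the name parse_result_item is hypothetical, and the extensions block is returned unsplit.

```python
import struct


def parse_result_item(data: bytes, offset: int = 0):
    """Decode one result item starting at offset.

    Returns ((file_index, file_size, file_name, extensions), next_offset).
    bytes.index raises ValueError if a terminating NUL is missing.
    """
    file_index, file_size = struct.unpack_from("<II", data, offset)
    name_end = data.index(b"\x00", offset + 8)
    ext_end = data.index(b"\x00", name_end + 1)
    name = data[offset + 8:name_end]
    extensions = data[name_end + 1:ext_end]  # empty when there is a double NUL
    return (file_index, file_size, name, extensions), ext_end + 1
```

Calling the function repeatedly with the returned offset, Number of Hits times, walks the whole Result Set.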


Extended Query Hit Descriptor

The extended QHD has the following format:


Bytes Field name Description
0-3 Vendor Code Four case-insensitive characters representing a vendor code. For example "LIME" for LimeWire. See registered codes and register yours at the list of Known Distributors or at the old GDF database (http://groups.yahoo.com/group/the_gdf/database?method=reportRows&tbl=6) (requires GDF membership).
4 Open Data Size Contains the length (in bytes) of the Open Data field. Set to 2 in most current implementations, and 4 in those that support XML metadata outside GGEP (see Section 2.3 and Appendix 2). The Open Data area MAY be larger to allow future extensions.
x Open Data Contains two 1-byte flags fields with the following layout and in the specified order:
               bit:    Description:
               7,6     Reserved for future use
               5       flagGGEP
               4       flagUploadSpeed
               3       flagHaveUploaded
               2       flagBusy
               1       Reserved for future use
               0       flagPush

The first flag byte can be viewed as an enabler for the flags in the second byte, the setter. Only those bits that were enabled must be considered by the servent as being valid. This logic is reversed for flagPush, which is set in the first byte and enabled in the second. The enabling byte allows you to know which flags are supported by a given servent.

  • Bits 5,4,3,2 in the first byte MUST be set if and only if the corresponding flag in the second byte is meaningful.
  • Bit 0 in the second byte MUST be set if and only if the corresponding flag in the first byte is meaningful. Yes, the order is reversed for this flag.
  • flagUploadSpeed is set if and only if the Speed field of the Query Hit message contains the highest average transfer rate (in kbps) of the last 10 uploads. Otherwise, the Speed field contains the host's total upload speed as set by the user, and is therefore less reliable.
  • flagHaveUploaded is set if and only if the servent has successfully uploaded at least one file.
  • flagBusy is set if and only if all of the servent's upload slots are currently full.
  • flagPush is set if and only if the servent is firewalled or cannot accept incoming TCP connections for any other reason.
  • The reserved flags MUST NOT be set, unless they are used for a future extension.
  • If XML metadata (Appendix 2) is included in the current Query Hit message, the following 2 bytes of Open Data area will contain the size of the XML block. The XML block itself is placed in the private data area.
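
The enabler/setter logic for the two flag bytes can be sketched as follows. Each flag decodes to True or False when it is marked as meaningful, and to None otherwise. The function name decode_open_data_flags and the key names are illustrative; the reversed roles for flagPush follow the description above.

```python
# Flags whose enabler bit is in the first byte and value in the second.
_NORMAL_FLAGS = (("ggep", 5), ("upload_speed", 4), ("have_uploaded", 3), ("busy", 2))


def decode_open_data_flags(open_data: bytes) -> dict:
    """Decode the two flag bytes at the start of the Open Data area."""
    if len(open_data) < 2:
        raise ValueError("Open Data must hold at least the two flag bytes")
    first, second = open_data[0], open_data[1]
    flags = {}
    for name, bit in _NORMAL_FLAGS:
        mask = 1 << bit
        flags[name] = bool(second & mask) if first & mask else None
    # flagPush is reversed: the value is in the first byte, the enabler
    # bit is in the second byte.
    flags["push"] = bool(first & 0x01) if second & 0x01 else None
    return flags
```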


Private Data Area

If the flagGGEP in the open data block is set, this block contains a GGEP (see Section 2.3) extension block. The GGEP block starts with a 0xC3 byte. Any data before or after the GGEP block is vendor-specific data, and MUST be ignored, if not recognized.

Servents are NOT RECOMMENDED to use the private data area for vendor specific data. Servents SHOULD use GGEP extensions instead.

If the Open Data area indicates an XML block, it will also be placed in the private area (see Appendix 2). Assuming that the two bytes in the Open Data area specify an XML block of m bytes, that block can be found by extracting the last m bytes of the private area. Both GGEP and XML can exist in the same Private Data area, but XML SHOULD be implemented inside GGEP.

[TODO: How about the nul after the XML block? What is it good for?]

Push (0x40)

A Push message has the following fields:


Bytes Field name Description
0-15 Servent Identifier. The 16-byte string uniquely identifying the servent on the network who is being requested to push the file with index File_Index. The servent initiating the push request MUST set this field to the Servent_Identifier returned in the corresponding QueryHit message. This is used to route the Push message to the sender of the Query Hit message.
16-19 File Index The index uniquely identifying the file to be pushed from the target servent. The servent initiating the push request MUST set this field to the value of one of the File_Index fields from the Result Set in the corresponding QueryHit message.
20-23 IP Address The IP address of the host to which the file with File_Index should be pushed. This field is in big-endian format.
24-25 Port The port number the receiver of this message should push to.
26- OPTIONAL GGEP extension block (see Section 4.1)
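
A minimal sketch of building and decoding the fixed 26-byte part of a Push payload. It assumes the port and File Index are little-endian (per Note 2) and writes the IP address bytes most significant first. The names build_push and parse_push are hypothetical.

```python
import ipaddress
import struct

# Servent Identifier, File Index, raw IPv4 bytes, Port: 16 + 4 + 4 + 2 = 26.
_PUSH = struct.Struct("<16sI4sH")


def build_push(servent_id: bytes, file_index: int, ip: str, port: int) -> bytes:
    """Encode a Push payload without a GGEP block."""
    if len(servent_id) != 16:
        raise ValueError("Servent Identifier must be 16 bytes")
    return _PUSH.pack(servent_id, file_index, ipaddress.IPv4Address(ip).packed, port)


def parse_push(payload: bytes):
    """Return (servent_id, file_index, ip, port, ggep_bytes)."""
    if len(payload) < _PUSH.size:
        raise ValueError("a Push payload is at least 26 bytes")
    servent_id, file_index, raw_ip, port = _PUSH.unpack_from(payload)
    return servent_id, file_index, str(ipaddress.IPv4Address(raw_ip)), port, bytes(payload[26:])
```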

