How konspire works

Overview
konspire is a distributed file-sharing system. A konspire network is made up of two kinds of network hosts: clients and servers. Each client has a collection of files that it is sharing, and the servers index these files. To search for a file residing on any client connected to the network, a client connects to a server in the network and sends it a search query. The server searches the complete file database and sends the results back to the client that made the request. The "results" are a collection of file pointers, each containing a description of the matching file (name and size) as well as the address of the client that is hosting the file. The requesting client is then free to examine the results and select a file to download, at which point it connects directly to the client that is hosting the file. During the file transfer, the two clients are the only hosts involved: in fact, the rest of the konspire network could be completely wiped out and the transfer would continue.

Thus, konspire separates the file-sharing system into two distinct sets of tasks: 1. hosting and transferring files (the clients) and 2. indexing and searching files (the servers). Each set of tasks can be assigned to machines that are suited to it: for instance, a large memory and a fast network connection are needed to run an index server, while any machine with a hard drive and a modem will work as a client.
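The search step can be sketched in a few lines of Python. This is only an illustration, not konspire's actual code: the names FilePointer and IndexServer, the example addresses, and the case-insensitive substring matching rule are all assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilePointer:
    """One search result: a file description plus where to fetch it."""
    name: str   # file name
    size: int   # file size in bytes
    host: str   # address of the client hosting the file


class IndexServer:
    """Hypothetical index server holding a complete file database."""

    def __init__(self):
        self.database = []

    def add_file_list(self, pointers):
        # Index every file pointer in a client's file list.
        self.database.extend(pointers)

    def search(self, query):
        # Assumption: a match is a case-insensitive substring of the name.
        needle = query.lower()
        return [p for p in self.database if needle in p.name.lower()]


server = IndexServer()
server.add_file_list([
    FilePointer("holiday_photos.zip", 1_048_576, "10.0.0.5"),
    FilePointer("lecture_notes.txt", 20_480, "10.0.0.9"),
])

# The searching client receives pointers and then contacts the host directly.
results = server.search("NOTES")
```

Each returned pointer carries the hosting client's address, so the server plays no part once the results have been delivered.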

The server network and system
Servers are connected in a ring-like network. When a server receives a message that other servers need to get, it adds a list of servers to the message (if a list isn't present already) and checks itself off the list. Then the server sends the message on to one unchecked server on the list. As long as the message goes through to this server, the sender assumes that all other servers will eventually get the message. If the message fails to go through, the sender repeatedly selects another unchecked server on the list and tries to send the message to it. Thus, the ring network is partially fail-safe, but it will not tolerate a scenario where a server receives a message and then fails before it tries to send it on. In a konspire network, this is a rare case (and we're not dealing with life and death here). The message continues propagating around the ring until all servers have the message (all servers are checked off on the list packaged with the message).
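The propagation rule can be simulated in a single process. The sketch below is an assumption-laden illustration rather than the real protocol: the Server class, the message layout (a dict with "body" and "checklist" keys), and the use of ConnectionError to model a failed send are all invented for the example.

```python
class Server:
    """Hypothetical server that forwards messages along the checklist."""

    def __init__(self, name, alive=True):
        self.name = name
        self.alive = alive
        self.received = []

    def receive(self, message, network):
        if not self.alive:
            # Models a send that fails to go through.
            raise ConnectionError(self.name)
        self.received.append(message["body"])
        # Add the server list if the message does not carry one yet.
        if "checklist" not in message:
            message["checklist"] = {name: False for name in network}
        # Check ourselves off, then pass the message on.
        message["checklist"][self.name] = True
        self.forward(message, network)

    def forward(self, message, network):
        unchecked = [n for n, done in message["checklist"].items() if not done]
        for name in unchecked:
            try:
                network[name].receive(message, network)
                return  # one successful hand-off is enough
            except ConnectionError:
                continue  # try another unchecked server


network = {
    "a": Server("a"),
    "b": Server("b", alive=False),  # a crashed server
    "c": Server("c"),
    "d": Server("d"),
}
message = {"body": "new file list"}
network["a"].receive(message, network)
```

In this run the crashed server "b" is skipped by every sender and stays unchecked, while "a", "c", and "d" each receive the message exactly once.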

An example of a propagated message is a client file list. When a server receives a new file list from a client, it first propagates the message to one other live server and then adds the collection of file pointers to its database. Thus, all servers in the system are building *similar* databases. These databases are not identical for two reasons: 1. client file lists may propagate to different servers in different orders (thus, the way the databases are built may be different, though they are functionally equivalent) and 2. servers that start up at different times may receive different sets of client file lists (younger servers may have missed some of the first file lists propagated). However, in the limit in a random system like a konspire network, the databases will grow to be functionally identical. We can reasonably assume that in a 24-hour period, most clients will either reconnect or rescan their file lists. For the handful of clients that run without resending their file lists, the databases on servers started at different times will differ by only those clients' files.

Another message that is propagated around the server network is a "new server up" message. When a server connects to the network, the server that receives the connection propagates the message about this server so that all other servers can add the new server to their propagation lists.

Whenever a client disconnects from the konspire network, a message is propagated so that the client's files can be removed from each server database.
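One way to picture the per-server database is as a table keyed by client address, which makes both adding a file list and removing a disconnected client's files cheap. The FileDatabase class below is a hypothetical sketch; in particular, the rule that a resent file list replaces the client's previous list is an assumption, not something stated above.

```python
class FileDatabase:
    """Hypothetical per-server database, keyed by hosting client."""

    def __init__(self):
        self.by_client = {}

    def add_file_list(self, client, files):
        # Assumption: a resent list replaces the client's previous list.
        self.by_client[client] = list(files)

    def remove_client(self, client):
        # Applied when a "client disconnected" message arrives.
        self.by_client.pop(client, None)

    def all_files(self):
        # Order-independent view: (name, size, client) triples, sorted.
        return sorted(
            (name, size, client)
            for client, files in self.by_client.items()
            for name, size in files
        )


# Two servers receive the same file lists in different orders.
first, second = FileDatabase(), FileDatabase()
first.add_file_list("10.0.0.5", [("song.mp3", 4000)])
first.add_file_list("10.0.0.9", [("notes.txt", 200)])
second.add_file_list("10.0.0.9", [("notes.txt", 200)])
second.add_file_list("10.0.0.5", [("song.mp3", 4000)])
same_contents = first.all_files() == second.all_files()

# A disconnect removes only that client's files.
first.remove_client("10.0.0.5")
```

The two databases are built in different orders but hold the same set of file pointers, which is the sense in which they are functionally equivalent.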

Connecting
The process for connecting to a konspire network for the first time is similar for both clients and servers. Each sends a "request connect" message to a known (or guessed) live server. First, the live server sends the new host a list of all known live servers. Thus, even if the live server crashes at this point, the new host will still have a good list of other live servers to try (and will not be stranded without a server list). Since the number of servers in the system is likely to be far smaller than the number of clients, sending a server list around is not a problem in terms of network bandwidth. In the case of a client connection, the client may be denied a connection to the server after the server list is sent (if the server is full). A server can fill up because clients maintain a constant connection to their server for searching purposes. Servers, on the other hand, don't maintain connections to other servers. After the server list is sent to the new server (and after the live server propagates a "new server up" message), the connection to the new server is broken. The server will connect to another server again only when propagating a message.
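The order of events in the handshake can be sketched as follows. Everything here is hypothetical: the LiveServer class, the max_clients limit, and the reply fields ("servers", "keep_connection") are invented to illustrate the sequence, and propagation of the "new server up" message is reduced to a comment.

```python
class LiveServer:
    """Hypothetical live server answering "request connect" messages."""

    def __init__(self, known_servers, max_clients):
        self.known_servers = list(known_servers)
        self.max_clients = max_clients
        self.clients = set()

    def handle_request_connect(self, kind, address):
        # Step 1: the server list always goes out first, so the new host
        # has other servers to try even if it is turned away.
        reply = {"servers": list(self.known_servers)}
        if kind == "client":
            # Step 2a: clients hold a constant connection, so a full
            # server has to deny them.
            if len(self.clients) >= self.max_clients:
                reply["keep_connection"] = False
            else:
                self.clients.add(address)
                reply["keep_connection"] = True
        else:
            # Step 2b: record the new server (a real server would also
            # propagate "new server up" here), then drop the connection.
            self.known_servers.append(address)
            reply["keep_connection"] = False
        return reply


live = LiveServer(known_servers=["s1", "s2"], max_clients=1)
first_client = live.handle_request_connect("client", "c1")
second_client = live.handle_request_connect("client", "c2")
new_server = live.handle_request_connect("server", "s3")
```

The second client is denied because the server is full, yet it still leaves with the list of live servers.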

Distribution of searching
Because each konspire server maintains a complete copy of the file database, it can handle search requests from all clients that are connected to it. The load of search requests is spread out among servers because the clients are spread out among servers. In a stable system (for instance, on a fall evening after university classes are over), the main load on the system will be processing search requests and searching the file database. In a search-heavy scenario, the konspire network will perform well. In a scenario where many clients are connecting at the same time, performance will not be as good, since each server in the system needs to add *every* incoming file list to its database. However, for each file list entering the system, each server only receives and sends a single network message (propagating the message). Thus, network bandwidth during a login-heavy scenario will be reasonable.

File transfers and resumed transfers
After search results return to the client (a collection of file pointers, including client host address, file name, and file size for each matching file), the client can begin downloading several of the matching files. When initiating a download, the client sends a "connect request" message to the client hosting the desired file. If the hosting client has an open connection slot, the download connection is accepted; otherwise it is rejected. If the download is interrupted by a cancel (from either user) or a broken connection, information about the partial download is saved to disk in a "filename.partial" file. This ".partial" file records the file name and the file size, as well as how much of the download was completed when the transfer was interrupted. If the client later tries to download a file with the same name and size as specified by the ".partial" file, a resumed download is initiated. In a resumed download, the client sends a starting file position to the hosting client, and the host jumps to this position in the file before sending it. Thus, resumed downloads are supported even if the host that supplies the rest of the file is not the host that supplied the first part (since files are matched only by name and size [in bytes] and not by host). This could cause problems when two *different* files have identical names and sizes but different contents. Barring malicious and deceptive renaming of files, this is a rare case.
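The resume logic can be sketched as below. The on-disk layout of a ".partial" file is not described here, so the JSON format, the field names, and the helper functions save_partial, resume_position, and send_from are all assumptions made for illustration.

```python
import json
import os
import tempfile


def save_partial(directory, name, size, bytes_done):
    """Record an interrupted download in "<name>.partial" (assumed JSON)."""
    path = os.path.join(directory, name + ".partial")
    with open(path, "w") as f:
        json.dump({"name": name, "size": size, "bytes_done": bytes_done}, f)
    return path


def resume_position(directory, name, size):
    """Return the starting position to request from the hosting client."""
    path = os.path.join(directory, name + ".partial")
    try:
        with open(path) as f:
            info = json.load(f)
    except FileNotFoundError:
        return 0  # no earlier attempt: start from the beginning
    # Files are matched by name and size only, never by host.
    if info["name"] == name and info["size"] == size:
        return info["bytes_done"]
    return 0


def send_from(data, start):
    """Host side: jump to the requested position before sending."""
    return data[start:]


contents = b"0123456789"
with tempfile.TemporaryDirectory() as downloads:
    save_partial(downloads, "digits.bin", len(contents), 4)
    start = resume_position(downloads, "digits.bin", len(contents))
    remainder = send_from(contents, start)
    # Same name but a different size does not match the partial file.
    mismatch = resume_position(downloads, "digits.bin", 99)
```

Because the check uses only name and size, any host offering a matching file can supply the remainder, which is also why two different files with the same name and size would be spliced together incorrectly.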