Distributed chat (a.k.a. Chat Lobbies)

  • Component: chat lobbies
  • Development: Dec. 2011 – Mar. 2012
  • Status: released

Instant messaging is an essential feature in social networking software. Facebook has it, Google circles has it, etc. RetroShare offers both direct private chat to friends and conversations with the wider Retroshare community via Chat Lobbies.

So what makes RetroShare chat lobbies so different? The answer: it is distributed and anonymous, very much like IRC or XMPP. The main difference is that chat lobbies lie on top of the friend-to-friend RetroShare network, and therefore no login onto the network is required beyond being connected to your friends.

Messages are sent to everyone in the chat lobby by hopping onto direct friends participating in the chat lobby. People can therefore use it to talk to non friends. Peers in the lobby are known by their nickname, which can be arbitrary, hence providing anonymity even for direct friends.

The image below represents a graph of friends connected to each other. Friends of peer A are peers F,C,B,E and H. Peers in green participate to a chat lobby. Messages from A will be forwarded to C, B and H, and then to I and J.

Each peers knows the list of participating peers in the lobby. This list is built according to received messages, including a heart-beat packet that each peer broadcasts once every 5 minutes.

Chat lobbies exist in two forms: public and private. Public lobbies are advertised to friends, whereas private lobbies require an explicit invitation from a participating friend in order to join the lobby.

Implementation challenges

The main difficulties in implementing chat lobbies are to avoid duplicate messages and ensure a stable life of the lobby when peers disconnect.

Avoiding duplicates

Avoiding duplicate messages is not as easy as it sounds. Each participating peer stores a list of message IDs (a 64 bit random number) and a list of direct friends who participate in the lobby. Every time a message is received, its ID is searched for in the cache. If not found, the ID is stored and the message is forwarded to all other friends from the local participant list.

Because a lobby can provide arbitrary long routes, it’s fairly possible that, due to various delays, some message take longer than the maximum cache storage duration. This duration is indeed limited, to keep the cost of the search and the memory footprint tractable. Therefore we stamp each message sent in the lobby with a writing time. Any message that is received later than the cache duration plus the writing time, is discarded.

When the user reads the lobby list, RetroShare asks direct friends for existing lobbies, with usage count, to update the list of accessible public lobbies. A user leaving RetroShare can therefore re-join a public chat lobby. Private lobbies however require that the peer is re-invited by one of its participants.

Ensuring optimal message transport

Because friends may disconnect, a lobby may not be connected at all times. For instance if A disconnects from H, the lobby will be split into two parts and messages from one part will not be known by peers in the second group.

In order to keep the best connectivity in lobbies, RetroShare connects peers that participate in the lobby, while keeping the lobby safe from eves-dropping peers. In our example, it would be useful that B and C know that they can exchange lobby messages directly. To achieve this, we use a challenge system:

Let L be the lobby ID (a 64 bit number), B and C the peer IDs (128 bit numbers), and h is a non-reversible hashing function:

  • C picks up a recent message ID in the cache, N, and computes the challenge h(L,N,F_i) for all friends F_i, and send it to F_i;
  • B receives the challenge from C, and computes h(L_i,N_i,B) for all message N_i in the cache and all participating lobby L_i
  • if B finds a match, B sends an invitation to C for the lobby L, and adds C to participating friends
  • C receives the invitation and adds B to the list of participating friends.

Because the hash is not reversible, peers that are not participating in the lobby cannot deduce the lobby ID from the challenge message. They cannot return the challenge to C either (in order to have C send an invitation) because the challenge function is asymmetric.

Thanks to the high connectivity, the route that a message takes is not pre-determined, and depending on the connection speed, B might receive a lobby message from A through C, indirectly.

Message splitting

Because chat messages can get potentially large (e.g. when pasting certificates with signatures included), big messages are split into sub-packets, to obey packet size requirements of the SSL layer. In classical chat, sub-packets are always received in the same order they get sent, so no special care needs to be taken to handle them properly. In chat lobbies, sub-packets may take different routes and arrive in a random order. Each sub-packet thus contains the total number of sub-packets as well as its position in the final message. With this, we rebuild partial packets at the destination, in a way that is completely transparent to the end user.

Conclusion

Chat lobbies are a very convenient way of communicating with non friends in the RetroShare network. For the moment, we have observed chat lobbies of almost 100 peers, which seems to work fairly well.

Among future improvements, we plan to make the private lobby a bit more handy. One issue at the present state is that when a peer disconnects from a private lobby, he needs to be re-invited by one of his friends participating to the lobby. It would be easier if the friend kept advertising about this lobby to disconnected peers so that they can come again by themselves.

About Cyril

I'm sharing the lead of the RetroShare project with G10H4CK. I've been working on RetroShare for four years now.
This entry was posted in Software components. Bookmark the permalink.

4 Responses to Distributed chat (a.k.a. Chat Lobbies)

  1. devnewton says:

    > Avoiding duplicate messages is not as easy as it sounds.

    It could be simple, stupid if you use UUID as message id : http://en.wikipedia.org/wiki/Universally_unique_identifier

    There is a ready to use UUID generator in Qt: http://doc.qt.digia.com/latest/quuid.html

  2. Cyril says:

    That’s basically what we do. The difficulty is rather to ensure that messages IDs are kept in cache long enough to avoid duplicates and short enough to keep memory bounded, and search cost limited.

    • devnewton says:

      About using a database or file to store all the messages? Other chat software can store all conversation history and just show the last N messages unless you ask for more.

  3. ASmith says:

    The innovative and unique Retroshare 0.6 anonymous chat system is a crowning software development achievement on the Retroshare 0.6 core services, very well done Cyril et al.

Leave a comment