LARK: A Light-weight Resilient Application-Level Multicast Protocol

Srikanth Kandula; Jong-Kwon Lee; Jennifer C. Hou

LARK: A Light-weight Resilient Application-Level Multicast Protocol

Srikanth Kandula ,
Jong-Kwon Lee ,
Jennifer C. Hou

January 2003

IEEE CCW (Invited Paper)

Download BibTex

Application-level multicasting (ALM) has attracted a significant amount of attention, as it is a convincing alternative over traditional IP multicasting. While recent work has been focused, initially toward deploying an ALM alternative [8], [1], and later towards scalability in terms of sustaining the protocol for a much larger number of nodes [4], this has come at the expense of having to maintain sophisticated state at endhosts. Furthermore, the use of leader-election mechanisms and rendezvous points creates concentrated points of failure in the overlay network and seems to be alien to the objective of fully distributed overlay creation and maintenance. Deterministic selection of overlay edges localizes the domain of exposure of an overlay node causing the overlay to be susceptible to clustered failures.

In this paper, we present a simple, light-weight, yet scalable ALM protocol, called LARK, that allows the formation and maintenance of overlay topologies in a completely distributed fashion while maintaining only O(1) state at each node and ensuring robustness in the presence of a large number of node failures. Conceptually, members self-organize into cliques, where a clique is a cluster of end-hosts in which each end-host is aware of, and exchanges state with, every other end-host in the cluster. No control message is exchanged for clique maintenance beyond the necessary state update among members belonging to the same clique. In addition, members are allowed to peer with randomly selected members belonging to other distinct cliques. This ensures that in the event that one or more members leave or fail, the other members can re-join the group at other peers they are aware of. We elaborate on the components of LARK and derive certain theoretical bounds on its performance. We also validate our design through simulations.