Open Collections

UBC Theses and Dissertations


Paceline: toward high-bandwidth interactive continuous media applications over TCP. Tayarani Najaran, Mahdi, 2009


Full Text

Paceline: Toward High-Bandwidth Interactive Continuous Media Applications Over TCP

by

Mahdi Tayarani Najaran

B.Sc., Ferdowsi University of Mashhad, 2005
M.Sc., Sharif University of Technology, 2007

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in The Faculty of Graduate Studies (Computer Science)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

August 2009

© Mahdi Tayarani Najaran 2009

Abstract

Multimedia applications are known to have low end-to-end latency requirements for interactivity, suggesting UDP to be a suitable transport protocol. Traditionally, the bandwidth requirements were set to be well below the network capacity. Modern applications could evolve if more bandwidth were available without giving up interactivity. However, the lack of a suitable transport providing such high bandwidth with low latency has prevented the existence of the next generation of multimedia applications. A transport built on top of UDP can no longer traverse the network without being hit by major packet drops or severely damaging other traffic. TCP, having built-in congestion control, is safe to use for high bandwidth requirements, but has been associated with unexpectedly high latencies.

In this thesis, we go against conventional wisdom and design Paceline, a transport layer on top of TCP to support high-bandwidth low-latency applications. Paceline improves latency through application level rate control. It also detects when TCP connections exponentially back off and automatically replaces them with standby connections. Through evaluation we show Paceline improves TCP's shortcomings in terms of end-to-end latency in extremely congested networks, without sacrificing TCP's effectiveness in utilization or bandwidth sharing.

We design two different applications on top of Paceline for streaming live video. We present a preliminary evaluation of the applications' performance when using Paceline in congested networks and show significant improvements compared to TCP.

Table of Contents

Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgements
1 Introduction
2 Related Work
3 Transport: Latency Controller
  3.1 Methodology
  3.2 Application Level Measurements
    3.2.1 Latency Measurements
    3.2.2 Bandwidth Measurements
  3.3 Congestion Window Approximation
  3.4 Evaluation
    3.4.1 Experimental Setup
    3.4.2 Latency
    3.4.3 Fairness
  3.5 Incremental Deployment
  3.6 Summary
4 Transport: Failover
  4.1 Motivation
  4.2 Connection Management
  4.3 Implementation
    4.3.1 Mechanism
    4.3.2 Controller
  4.4 Evaluation
  4.5 Incremental Deployment
  4.6 Summary
5 Application: Live-In
  5.1 QStream and Live-In
  5.2 Application Level Evaluation
  5.3 Summary
6 Application: QPS
  6.1 Description
  6.2 Preferential Subscriptions
7 Conclusion and Future Work
Bibliography

List of Tables

3.1 Average CWND size in the experiments based on the number of flows
3.2 Oldest un-acked measurements for TCP
3.3 Oldest un-acked measurements for SOCK and ADAPT latency controllers vs. TCP
3.4 Jain fairness index measurements for SOCK and ADAPT latency controllers vs. TCP
3.5 Oldest un-acked measurements for ADAPT and mixed TCP
4.1 Worst case oldest un-acked measurements for SOCK and ADAPT latency controllers vs. TCP
4.2 Median and 99.9 percentile oldest un-acked measurements for the ADAPT latency controller with failover
4.3 Worst case oldest un-acked measurements for the ADAPT latency controller with failover
4.4 Jain fairness index measurements for ADAPT latency controller with failover vs. TCP
4.5 Oldest un-acked measurements for ADAPT with failover and mixed TCP

List of Figures

1.1 General structure of QStream
3.1 rtt_ewma vs. rtt_ewma- vs. TCP rtt measurements
3.2 Different rate measurement signals vs. bw_hat
3.3 Long-term vs. long-term+ vs. long-term- smoothing of bandwidth measurements
3.4 CWND_hat from Equation 3.4 compared to CWND
3.5 CWND_hat obtained from Equation 3.6 vs. CWND
3.6 Emulab setup used for evaluation
3.7 Oldest un-acked and fairness of ADAPT compared to TCP with 8, 16, 24 and 32 flows
3.8 Oldest un-acked of ADAPT with mixed number of TCP flows
3.9 Fairness of ADAPT with mixed number of TCP flows
3.10 Fairness of ADAPT flows vs. TCP with 16 flows and 50% mix
4.1 Message port session
4.2 Oldest un-acked and fairness of ADAPT with mixed number of TCP flows
5.1 Qvid structure
5.2 Live timeline
5.3 Live-In output for local streaming
5.4 Xmit window size versus application goodput
5.5 Average frames per second with an 8 frame xmit window size
6.1 Peer-to-peer conferencing vs. QPS conferencing
6.2 Qps session and viewers
6.3 Qps from a subscriber's view
6.4 Live TV coverage of an event

Acknowledgements

I would like to take the time to thank my supervisor Charles "Buck" Krasic for the inspiration, motivation and guidance throughout this work. I'd also like to thank my dear wife Sara, for all her support.

Chapter 1
Introduction

Multimedia applications have time sensitive data, requiring a transport protocol that can deliver data within strict time bounds over the Internet. For a long time, transports built on top of UDP have been the choice for time sensitive applications, in contrast to TCP, which is believed to be suitable for most other types of traffic, including bulk data transfer. Modern multimedia applications are increasing their bandwidth requirements, but the lack of a suitable transport in the Internet has prevented new generations of such applications from dominating the Internet. New applications such as multi-party high definition video conferencing or on-line multiplayer games with an extremely high number of players demand both high bandwidth and low latency from the network.

Based on traditional wisdom, application designers either have to choose UDP for low latency and try consuming more bandwidth, use TCP for high bandwidth with the hope that the network load is low enough for TCP to have acceptable performance, or design a new transport protocol for the Internet. None of these solutions seems convenient enough. Without congestion control, it is not responsible to use UDP for high bandwidth applications. Moreover, TCP's latency starts degrading drastically as the network usage closes upon the network capacity. Finally, TCP has certain features that have been fine tuned over the past decades: it is extremely efficient in consuming available bandwidth, its flows share bandwidth in a rather fair manner, it is extremely efficient in highly utilizing network links, and it provides an easy to use service model to the application layer.

Generally speaking, TCP has become sophisticated enough to dominate new protocols. This makes TCP, by far, the dominant transport protocol in the Internet, and the process of designing a successful new transport protocol overly complicated and, to some extent, impossible. Instead of trying to replace TCP, we decided to improve the performance of TCP using application level techniques. We call the transport we designed Paceline. We evaluated Paceline in the context of video streaming using QStream [18]. QStream is an event based video player that can adapt to disk speed, network speed, available CPU and available bandwidth. Figure 1.1 displays the relationship between three different parts of QStream: the application layer referred to as Qvid, the transport layer referred to as Paceline, and the event loop referred to as QSF.

Figure 1.1: General structure of QStream.

Message Port fragments application data units (ADUs) handed down from higher levels into messages. Each message is assigned a unique sequence number, and the messages are ordered by priority. Messages are sent over a TCP connection, re-assembled into ADUs in the receiving side Message Port and handed up to higher levels. At any point the higher levels can cancel ADUs they had previously written to Message Port, which is a basis of the adaptation to network bandwidth.
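The fragmentation, priority ordering, and cancellation just described can be made concrete with a small sketch. The C fragment below is illustrative only; the structure and function names are hypothetical and are not taken from the QStream sources.

    /* Illustrative sketch: a minimal "message port" send queue with
     * priority ordering and ADU cancellation. Names are hypothetical. */
    #include <stddef.h>
    #include <stdint.h>

    struct msg {
        uint64_t seq;        /* unique sequence number, assigned at creation */
        uint32_t adu_id;     /* ADU this fragment belongs to */
        int      priority;   /* larger value = more important */
        int      cancelled;  /* set when the ADU is cancelled by higher levels */
        size_t   len;
        const char *data;
    };

    /* Pick the most important message that has not been cancelled. */
    struct msg *next_to_send(struct msg *q, size_t n)
    {
        struct msg *best = NULL;
        for (size_t i = 0; i < n; i++) {
            if (q[i].cancelled)
                continue;
            if (best == NULL || q[i].priority > best->priority)
                best = &q[i];
        }
        return best;
    }

    /* Cancel every fragment of an ADU that is still waiting in the queue. */
    void cancel_adu(struct msg *q, size_t n, uint32_t adu_id)
    {
        for (size_t i = 0; i < n; i++)
            if (q[i].adu_id == adu_id)
                q[i].cancelled = 1;
    }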
We present a brief overview of related work in Chapter 2. The main contributions of our work consist of the next two chapters. We present our latency controller in Chapter 3. The latency controller adjusts the sending rate to minimize the delay experienced by messages. We evaluate our latency controller in various heavy network conditions and compare its performance against normal TCP. The results show decreases in median end-to-end latency over TCP by factors close to 3.

Chapter 4 describes the failover mechanism used to improve the worst case results of Chapter 3. Paceline detects when the TCP connection gets stuck in exponential back-offs and replaces it with a new standby TCP connection.

We present two different applications, Live-In and Qps (QStream Pub-Sub), in Chapters 5 and 6 respectively. Live-In was designed with live video conferencing intentions. We evaluate how Paceline can assist such applications and show clear improvements over TCP. Qps was created to assist Live-In with multi-party video conferencing, and thereafter, support for real-time video mixing was added. Evaluation of Qps is left to future work. Chapter 7 concludes and discusses what we presented, and provides guidelines for future work.

Chapter 2
Related Work

Support for high-bandwidth low-latency streams has been a core subject of a broad array of multimedia networking research. The main influences on our work have been work on congestion control and alternative transport protocols. TCP Vegas was a seminal work in congestion control that proposed the use of delay measurements to proactively respond to congestion [3]; Paceline's latency controller is part of the long line of work that has since employed similar techniques. Later work on slowly responsive TCP-friendly congestion control aimed to better suit the needs of multimedia applications [4]. Paceline reflects the general trend in multimedia toward user-level implementations that make do with TCP.

Aside from congestion control, alternative transport service models have also appeared, such as SCTP [13] and DCCP [9], and more recently Structured Streams (SST) [5]. Like SCTP, Paceline identifies head of line blocking as a major issue at the transport level. Paceline's use of multiple transport connections has some similarity to SCTP's support of multi-homing, but SCTP multi-homing is meant for use with redundant physical paths, while Paceline's failover is employed for connections on the same path. Paceline shares the datagram (message) orientation of DCCP, and like DCCP, Paceline was designed with multimedia applications such as video streaming as the main target. However, Paceline has major differences in design and implementation due to the fact that it works above TCP, rather than providing a complete replacement. SST, currently implemented over UDP, provides cheap connection setup and tear down for web data.

A major inspiration for Paceline was the work of Bhandarkar [1], which proposed to overcome obstacles facing active queue management (AQM) by emulating it at end hosts. Paceline is similar in that its latency controller can be viewed as a user-level emulation of TCP Vegas-like rate-based congestion control.

Even though the exponential back-off of TCP was known to be essential to TCP's stability, Mondal et al. [12] have proven removing it from TCP to be safe in today's Internet. Paceline's failover mechanism is an application level technique to detect and eliminate TCP exponential back-off.

RD network services [16] share our motivation of providing high bandwidth and low delay communication. However, applications have to choose either low delay or high bandwidth but not both. In addition, RD proposes changing routers to implement separate queues for different traffic types, while we employ a rate control algorithm above TCP to reduce queuing delays for all traffic types.

Recent work on fast paced large scale games, DonneyBrook [2], explains the high bandwidth requirements for emerging network games. Emerging games do not adhere to the old wisdom of network games having thin communication streams [15]. Interestingly, DonneyBrook's main contribution is reducing the bandwidth requirements by defining interest sets, which is a type of adaptation that can be improved using Paceline's adaptation primitives. Priorities can better capture the range of players' interests instead of using two discrete types of updates (important and less frequent). Moreover, cancellation of expired updates can enable rate adaptation based on the network conditions without using a complex reservation scheme.

The work done in [17] presents a potential solution to improve TCP's balancing of bandwidth fairly among competing flows. They propose using a variable maximum segment size in TCP.

Minbu [7] proves that a great portion of the latency experienced by TCP packets is incurred inside the kernel. Hence, they suggest the application limit its sending rate based on information provided by the kernel. They explore different criteria for limiting the rate, and find a balance between utilization and low latency. Paceline employs a similar technique at the application level to prevent data from spending excessive amounts of time in kernel buffers.

Chapter 3
Transport: Latency Controller

TCP provides in-order and reliable delivery of data, performs congestion control and is extremely aggressive in utilizing network bandwidth. Such properties, which have been fine-tuned over the past decades, make the process of designing a new transport protocol that is able to compete with TCP in all aspects extremely challenging. TCP is currently, by far, the dominant transport in the Internet [6].

However, TCP provides no end-to-end timing guarantees, requiring time sensitive applications using TCP to deal with such issues. The work done in [7] shows a noticeable portion of the delay experienced by data communicated via TCP is spent in kernel buffers. An adaptive application that can limit its sending rate can take advantage of TCP's properties while operating in a low latency zone.

In this chapter we describe how rate limiting is done by the application. We describe the logic behind the latency controller, along with a description of how the measurements needed by the controller are obtained. We then present an evaluation of our latency controller using emulation in an Emulab setup.

3.1 Methodology

An application limiting the socket fill level to CWND + 3 MSS should expect to experience the least amount of sender side buffering inside the kernel, without sacrificing utilization or claiming less bandwidth than its fair share, i.e. the amount of bandwidth it would claim without rate limiting. Inspired by work done in the kernel [7], where they limit the sending rate of TCP sockets inside the kernel, we designed a similar latency controller in the application layer, referred to here as SOCK. Using information directly provided by the kernel through the socket API, we have access to the instantaneous value of the congestion window size, CWND, of the underlying TCP connection. The application queries the socket fill level using the SIOCOUTQ ioctl, and writes the amount of data to the socket that keeps the fill level close to the aforementioned socket limit.
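As a concrete illustration of the SOCK policy, the following C sketch computes how many more bytes the application may write so that the socket fill level stays near CWND + 3 MSS. It assumes a Linux-style socket API (TCP_INFO and SIOCOUTQ); the helper name and its use are ours, not code from Paceline.

    /* Sketch of the SOCK write-budget check, assuming Linux's TCP_INFO
     * getsockopt and the SIOCOUTQ ioctl; not taken from Paceline itself. */
    #include <linux/sockios.h>   /* SIOCOUTQ */
    #include <netinet/in.h>      /* IPPROTO_TCP */
    #include <netinet/tcp.h>     /* TCP_INFO, struct tcp_info */
    #include <sys/ioctl.h>
    #include <sys/socket.h>

    /* Returns how many bytes may still be written, or -1 on error. */
    long sock_write_budget(int fd)
    {
        struct tcp_info info;
        socklen_t len = sizeof(info);
        int outq = 0;   /* bytes in the send queue not yet acknowledged */

        if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &info, &len) < 0)
            return -1;
        if (ioctl(fd, SIOCOUTQ, &outq) < 0)
            return -1;

        /* tcpi_snd_cwnd is in segments, tcpi_snd_mss in bytes, so the
         * fill-level target CWND + 3 MSS becomes (cwnd + 3) * mss bytes. */
        long limit  = ((long)info.tcpi_snd_cwnd + 3) * info.tcpi_snd_mss;
        long budget = limit - outq;
        return budget > 0 ? budget : 0;
    }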
We implemented and tested the SOCK latency controller inside Paceline (see Figure 1.1). Even though the performance of this controller was extremely satisfying, it has its own limitations, two of which are of great importance. First, the SOCK controller may not behave as expected when the end-to-end connection is split over multiple TCP connections. For example, when using a local proxy with SSH tunneling, the first TCP connection is to the local proxy and the second TCP connection is over the Internet. This means information provided by the local TCP connection's socket does not reflect the true behaviour of the network. In such a scenario SOCK may successfully prevent its socket buffer from filling, even though the downstream SSH socket buffer has become full, thus paralysing our latency controller. Second, the SOCK latency controller relies on information from the socket API that is not provided by all implementations in different operating systems, thus limiting portability of the SOCK latency controller to specific platforms.

To overcome the portability limitations mentioned above, we decided to design a portable latency controller that behaves very similarly to SOCK. The new controller, which we call the ADAPT latency controller and which is the main subject of this chapter, does not rely on any information provided by the kernel, to ensure portability.

An application can keep count of the number of bytes it writes to TCP sockets, and hence can keep track of each socket's fill level. This eliminates the need to query for socket fill levels through the kernel. If a latency controller is able to estimate the value of TCP's CWND at any time without the help of the kernel, it can act upon the same logic the SOCK controller does in limiting its sending rate. We hereafter call this estimation CWND_hat, and explain the process through which we achieved very accurate estimations. First, we explain how to measure the data required to estimate CWND.

3.2 Application Level Measurements

Considering the reliable and in-order service provided by TCP, implementing application level acknowledgements becomes very straightforward. Every message that is transmitted from the sender side is tagged with a unique sequence number, and the sender keeps a list of written sequence numbers. On the other side, when messages arrive, the receiver generates and sends a P-ACK (Paceline ACK) for the sequence number of the message that has arrived, or the last sequence number if multiple messages arrive at the same time. Upon receipt of the P-ACK by the sender, it is safe to consider all sequence numbers up to and including the P-ACKed sequence number in the list as acknowledged.

Application level acknowledgements can be used to measure two important properties of the network: 1) latency and 2) bandwidth.

3.2.1 Latency Measurements

We keep track of the write time of every message that is written to the socket, and the round trip time (rtt) it experienced is calculated as the time difference between the write time and acknowledge time of that message. Considering the fact that usually more than one message is acknowledged at a time, we face multiple round trip time measurements rather than one. This set of rtt calculations provides us with valuable information.

The measurements are done in the application layer and are subject to many types of noise, such as scheduling delays inside the kernel, that are independent of network conditions. However, such noise can only increase round trip time measurements, meaning the minimum time in the set of rtt calculations represents the best sample of round trip time possible at the application level. We use exponentially weighted moving averaging (EWMA) to keep track of the smoothed value of the minimum round trip time, namely rtt_ewma:

    ewma(avg, sample, alpha) {
        avg = (1 - alpha) * sample + alpha * avg        (3.1)
    }

TCP events, such as retransmits, are concealed from the application's view. It is not feasible to infer accurately about real network characteristics using only the rtt_ewma signal. However, there are occasions in which some messages go through the network with the least possible delay. In order to value such measurements, we use an asymmetric EWMA, namely ewma-, to obtain rtt_ewma-:

    ewma-(avg, sample, alpha) {
        if (sample <= avg)
            avg = sample                                (3.2)
        else
            avg = ewma(avg, sample, alpha)
    }

rtt_ewma- responds immediately to drops in round trip time measurements but smoothes increases. If the smoothing is done heavily, i.e. the smoothing factor alpha is high, rtt_ewma- can be representative of the real network round trip time. Figure 3.1 is a sample plot of rtt_ewma and rtt_ewma- compared to TCP's measurements of round trip time, and shows how rtt_ewma- is closer to TCP's measurements of rtt than rtt_ewma.

Figure 3.1: rtt_ewma versus rtt_ewma- versus TCP round trip time measurements. Notice how for a significant amount of time rtt_ewma- is less than TCP round trip measurements.

Finally, the highest round trip time measurement in the set illustrates how data flows through this TCP connection. We find the maximum of the current highest rtt measurement and the age of the oldest message written to the socket and not yet acknowledged. We call this signal oldest un-acked, and keep a profile of this signal as a metric of performance of the latency controller (see Section 3.4.2 for results). Higher values of oldest un-acked imply that the application on top of this connection experiences a "dead zone" and has to wait longer to send and receive data, an issue referred to as head of line blocking. We emphasize that rtt represents timing of data sent over the network, whereas oldest un-acked represents time periods where nothing is sent.
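A minimal C rendering of the two filters above (Equations 3.1 and 3.2) follows; the function and variable names are ours, and the smoothing factor shown is only a placeholder.

    /* Sketch of the EWMA filters of Equations 3.1 and 3.2; names and the
     * 0.9 smoothing factor are illustrative, not values from Paceline. */
    static double ewma(double avg, double sample, double alpha)
    {
        return (1.0 - alpha) * sample + alpha * avg;
    }

    /* Asymmetric EWMA: follow decreases immediately, smooth increases. */
    static double ewma_minus(double avg, double sample, double alpha)
    {
        return (sample <= avg) ? sample : ewma(avg, sample, alpha);
    }

    /* Per P-ACK update: feed in the minimum rtt of the acknowledged batch. */
    void update_rtt(double *rtt_ewma, double *rtt_ewma_minus, double min_rtt_sample)
    {
        const double alpha = 0.9;   /* heavy smoothing (placeholder value) */
        *rtt_ewma       = ewma(*rtt_ewma, min_rtt_sample, alpha);
        *rtt_ewma_minus = ewma_minus(*rtt_ewma_minus, min_rtt_sample, alpha);
    }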
3.2.2 Bandwidth Measurements

An incoming flow of P-ACKs can be used to measure network bandwidth as well. Each incoming P-ACK acknowledges one or more messages. Periodically counting the number of bytes acked in a constant time slot, e.g. every 10ms, divided by the size of the time slot gives an approximation bw_hat of the network bandwidth. We apply three different filters to bw_hat: ewma, ewma- and ewma+ (which is exactly the opposite of ewma- and is defined by Equation 3.3) to obtain three smoothed signals bw_ewma, bw_ewma- and bw_ewma+ respectively.

    ewma+(avg, sample, alpha) {
        if (sample >= avg)
            avg = sample                                (3.3)
        else
            avg = ewma(avg, sample, alpha)
    }

Figure 3.2 shows a sample plot of the three different bandwidth measurement signals compared to bw_hat, and Figure 3.3 illustrates how they relate to each other. As we will show, the value of calculating bw_ewma- (bw_ewma+) lies in the fact that it catches the bases (peaks) bw_ewma misses due to smoothing, and tends to under-estimate (over-estimate) bw_ewma. As we will show shortly, our strategy will be to combine these two signals in a way that matches the average, bw_ewma, but retains the ability to track peaks and bases. In the next section we explain how to use the measurements obtained so far to design the ADAPT latency controller.

Figure 3.2: Different rate measurement signals vs. bw_hat: (a) bw_ewma vs. bw_hat, (b) bw_ewma- vs. bw_hat, (c) bw_ewma+ vs. bw_hat.

Figure 3.3: bw_ewma versus bw_ewma- versus bw_ewma+. The invariant bw_ewma+ >= bw_ewma >= bw_ewma- holds at any time.

3.3 Congestion Window Approximation

TCP uses CWND as a limit on the number of bytes in flight in the network. Intuitively, having reasonably accurate measurements of bandwidth and latency, we expect the product of bandwidth and latency to approximate CWND. We used Equation 3.4 in the initial design of ADAPT.

    CWND_hat = rtt_ewma * bw_ewma                       (3.4)

We tested this controller and discovered it does not perform like SOCK (Figure 3.4). The main reason we smooth the signals is to reduce the effects of measurement noise, and we found a latency controller based on non-smoothed measurements to be unstable.

Figure 3.4: CWND_hat from Equation 3.4 compared to CWND: (a) steady state, where CWND_hat follows the trend of CWND but does not reset when CWND does; (b) start-up, where it takes around 110 seconds for CWND_hat to reach CWND.

However, using smoothed measurements results in a very slowly responsive controller. Figure 3.4(a) displays CWND_hat compared to CWND of the TCP connection, each plus 3 MSS. It is worth noting that the latency controller only works if it can approximate CWND. We can see how this controller is slow in following changes in CWND. This slow responsiveness of the controller impairs performance in two respects. First, when TCP drops its rate by resetting CWND to the initial value (i.e. when a TCP retransmission timeout occurs), the controller does not respond immediately, causing the socket to fill and latency to increase. Second, when bandwidth is available in the network, the controller is very weak in claiming that available bandwidth, causing the controller to take much longer to reach steady state than a normal TCP connection would. Notice how in the first 140 seconds of the same run, plotted in Figure 3.4(b), CWND_hat is nowhere close to CWND.

With the intention of increasing responsiveness, we first need an indicator of the network condition. Based on the continuous rate of incoming acknowledgments, we use alterations in bandwidth measurements to infer the state of the network. If packets are dropped due to congestion, there is a sudden decrease in bw_hat. On the contrary, if bandwidth is available in the network, bw_hat will start rising. However, in both cases, bw_ewma gradually follows the changes and is almost constant as these changes occur. Thus the term pressure, defined by the ratio of bw_hat to bw_ewma, can be used as the network condition indicator. When bandwidth is available, pressure climbs close to 1, and when congestion happens, pressure becomes very close to 0.

We can re-write Equation 3.4 in the form of Equation 3.5, where bw_ewma has been replaced with the average of bw_ewma- and bw_ewma+, the average of an under-estimate and an over-estimate of available bandwidth.

    CWND_hat = rtt_ewma * (0.5 * bw_ewma- + 0.5 * bw_ewma+)        (3.5)

Furthermore, using the network condition indicator, by replacing the constant factors 0.5 with pressure we get to Equation 3.6, where CLAMP(x, min, max) clamps the value of x to the range [min, max].

    pressure = CLAMP(bw_hat / (2 * bw_ewma), 0, 1)
    CWND_hat = rtt_ewma * (1 - pressure) * bw_ewma-
             + rtt_ewma * pressure * bw_ewma+                      (3.6)

What we achieve by this form of the equation is the ability to decide which bandwidth estimate to use as the dominant factor of our estimation. When bandwidth is available and pressure > 0.5, bw_ewma+ becomes the dominant factor, which is the highest rate recently observed in the network. This helps the underlying TCP connection grasp any available bandwidth and reach steady state like a normal TCP connection. On the other hand, when congestion happens in the network and pressure < 0.5, bw_ewma- has more weight in the equation, which is the lowest bandwidth observed. This acts almost the same way TCP does in decreasing or resetting its CWND. In steady state conditions where pressure is approximately 0.5, Equation 3.6 becomes a reasonable estimation of CWND. This is confirmed in Figure 3.5, which depicts CWND_hat compared to CWND using Equation 3.6.

Figure 3.5: CWND obtained from the kernel versus CWND_hat estimated by Equation 3.6.

As we will show in the next section, the accuracy of the estimation is high enough for the ADAPT latency controller to have results similar to the SOCK latency controller. However, under very heavy load the controller misbehaves. These are the conditions in which relatively small adjustments to rate can have large effects on queuing delay. This sometimes causes a positive feedback loop, where a small overestimate of bandwidth leads to an increase in rtt_ewma, which results in further over-estimation, manifesting as a persistently rising rtt_ewma. These are the conditions in which rtt_ewma rises without any significant increase in bw_ewma.

To detect this situation, we introduce a new stability check in the controller, which uses the ratio of rtt_ewma to rtt_ewma- as a conservative sign that a correction is necessary. This ratio is always greater than 1, but under normal operating conditions should not be much greater.

The ratio is greater than 1 because we expect some packets to experience retransmissions on a regular basis. However, we expect at least some packets not to undergo retransmission. For these packets we expect the ratio to be between 1 and 2. This implies that if we have rtt_ewma / rtt_ewma- > 2, we may take it as a sign that queueing delay in socket buffers is causing the delay, not retransmissions.

In the case where the controller becomes unstable, rtt_ewma is no longer a reliable signal since it contains a spurious amount of kernel buffering delay. To compensate for the over-estimation caused by this undesirable delay, we replace the first rtt_ewma in Equation 3.6 with rtt_ewma-. This results in a temporary under-estimation of CWND (recall Figure 3.1), until the controller re-stabilizes (Equation 3.7).

    if (rtt_ewma / rtt_ewma- >= 2)
        rtt_signal = rtt_ewma-
    else
        rtt_signal = rtt_ewma                                      (3.7)
    CWND_hat = rtt_signal * (1 - pressure) * bw_ewma-
             + rtt_ewma * pressure * bw_ewma+

Equation 3.7 is currently the best latency controller we have designed as ADAPT. In the next section, we present the results of evaluating the ADAPT latency controller under various conditions and comparing it to SOCK and to TCP with no latency controller. We collected the same results for Equations 3.4, 3.5 and 3.6. We do not present them as they are strictly worse than the results obtained from Equation 3.7.
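Putting Equations 3.6 and 3.7 together, the controller logic can be sketched as below. This is our own illustrative rendering of the published equations, with hypothetical names; it is not Paceline's implementation.

    /* Sketch of the ADAPT congestion-window estimate (Equations 3.6/3.7).
     * Illustrative only; variable names are ours. */
    static double clamp(double x, double lo, double hi)
    {
        return x < lo ? lo : (x > hi ? hi : x);
    }

    /* All rates in bytes/s, times in seconds; returns CWND_hat in bytes. */
    double adapt_cwnd_hat(double bw_inst,        /* bw_hat: latest 10ms rate */
                          double bw_ewma,
                          double bw_ewma_minus,
                          double bw_ewma_plus,
                          double rtt_ewma,
                          double rtt_ewma_minus)
    {
        double pressure = clamp(bw_inst / (2.0 * bw_ewma), 0.0, 1.0);

        /* Stability check: a large rtt_ewma / rtt_ewma- ratio indicates
         * self-inflicted queueing, so fall back to the conservative signal. */
        double rtt_signal = (rtt_ewma / rtt_ewma_minus >= 2.0)
                                ? rtt_ewma_minus : rtt_ewma;

        return rtt_signal * (1.0 - pressure) * bw_ewma_minus
             + rtt_ewma   * pressure         * bw_ewma_plus;
    }

The sender would then write to the socket only while its own count of unacknowledged bytes stays below CWND_hat + 3 MSS, mirroring the SOCK policy described in Section 3.1.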
3.4 Evaluation

We evaluate the performance of our latency controller in terms of:

- Oldest un-acked
- Fairness
- Utilization

Oldest un-acked is a representation of how responsive the transport is from the application's point of view. Our goal in designing the latency controllers was to improve this characteristic of TCP. However, TCP flows are very efficient in sharing bandwidth amongst themselves in a fair and distributed manner. It is very important that we do not sacrifice this fairness to achieve good latency.

In the following subsections, we compare the ADAPT latency controller to normal TCP without a latency controller to see how a controller can improve the quality of service provided by TCP. However, since ADAPT was meant to imitate SOCK, we also include SOCK in the evaluations as a metric of the optimum performance. We conduct our evaluation within an Emulab network testbed.

3.4.1 Experimental Setup

Our network setup uses the common dumbbell topology of Figure 3.6, where a set of clients on a LAN connect through a single bandwidth-delay constrained link, emulating a wide-area path, to a set of servers on a remote LAN. In the experiments, we vary the number of flows sharing the path. For the WAN delay, we use a 30ms round trip time, and the WAN bottleneck has a bandwidth of 16Mbps using drop-tail queueing with a queue size of twice the bandwidth-delay product of the WAN path. In congested conditions, this effectively adds two round trip times (60ms), and the sum of propagation and queueing delay yields a total network round trip time of 90ms.

Figure 3.6: Emulab setup used for evaluation.

We configure the number of clients and servers to ensure that the WAN path is the bottleneck, not some other client or server resource. Each experiment run lasts more than 6 minutes (380 seconds). To eliminate experimental startup and shutdown effects, our measurements come from the middle 300 seconds of each run. The node in the middle, node0 in Figure 3.6, does traffic shaping to emulate the bottleneck link and collects tcpdump data if required. We use a set of scripts (inside the QStream main repository) to automate the whole process. The nodes each run a separate instance of QStream. The client nodes request a video file from a server on the other side of the bottleneck link. Each node runs a local instance of Qmon, QStream's data monitoring application, to collect data. We set up various check-points between experiment runs, using emulab-sync, to ensure the nodes execute the experiment in synchronized fashion. Except where noted, we ensure that the bottleneck link is fully saturated.

Based on our choice of bandwidth and delay, Table 3.1 shows the average CWND size of TCP flows, in units of MSS, based on the number of flows sharing the link. The average CWND size is computed as: total bandwidth (bps) * rtt (s) / ((1500 * 8 bits) * num_flows). We believe 32 flows represent a highly congested network.

Table 3.1: Average CWND size in the experiments based on the number of flows

    Number of flows            2    4    8    16   24   32
    Average CWND size (MSS)    60   30   15   7.5  5    3.75

3.4.2 Latency

As mentioned in Section 3.2, when a P-ACK arrives we calculate the worst round trip time experienced by the messages that were acknowledged and refer to it as oldest un-acked. Every 100ms, we find the maximum oldest un-acked in the current time slot and store time traces of this signal for every flow. We then calculate the overall cumulative distribution function (CDF) over all the flows participating in the experiment as the oldest un-acked metric.

We expect our latency controller to be useful in network conditions where normal TCP performance is unacceptable for time sensitive applications. In order to find such conditions in our experimental setup, we ran experiments with the number of flows increasing from 2 to 8. Table 3.2 displays the results for 2, 4, and 8 flows when using normal TCP.

Table 3.2: Oldest un-acked measurements for 2, 4 and 8 TCP flows

    Number of flows    Median (ms)    99.9 Percentile (ms)    Worst Case (ms)
    2                  51.9           65.8                    65.8
    4                  452.5          879.8                   960.2
    8                  591.1          2588.5                  3245.0

The experiments have been set up in a way that the bandwidth requirement of each flow is more than 5Mbps. Two flows is the only case in which the network bandwidth is more than the total requirement of the flows, leaving the bottleneck link unsaturated. Hence, in this zone, TCP has adequate latency. However, when the number of flows doubles to four and saturates the link, the median oldest un-acked increases by an order of magnitude, to a zone where normal TCP is not usable for time sensitive applications. We chose eight flows as the minimum network load for the rest of the experiments, where we thoroughly test our latency controllers in zones where TCP latency is unacceptable. The latency results can be found in Table 3.3 and Figure 3.7(a) (we will discuss the 99.9 percentile further and consider worst cases in Chapter 4).

Table 3.3: Oldest un-acked measurements for SOCK and ADAPT latency controllers versus normal TCP without a latency controller. Notice how the latency controllers decrease the median latency.

    Number of flows    Median (ms)               99.9 percentile (ms)
                       TCP     SOCK    ADAPT     TCP       SOCK      ADAPT
    8                  591.1   140.1   140.0     2588.5    319.5     584.9
    16                 652.4   164.2   182.5     4580.1    868.1     1165.0
    24                 707.6   180.6   217.2     4640.0    1212.4    1470.9
    32                 748.6   211.7   264.9     10256.1   1402.1    2064.5

The SOCK and ADAPT controllers have close responses in terms of median oldest un-acked, and are well below that of TCP. Even in extremely congested network conditions with 32 flows, the median oldest un-acked is well below that of TCP with 8 flows.

3.4.3 Fairness

In order to investigate the effects of the latency controllers on bandwidth allocation, we use two measures to quantify bandwidth fairness. The first is a metric known as the Jain fairness index [8], defined by Equation 3.8, where n is the number of flows and x_i is the bandwidth allocated to flow number i in a given time slot, e.g. every 500ms. The overall Jain index is the average of the index over all time slots. This index ranges from 1/n (worst case) to 1 (best case). The bandwidth allocation of a specific flow is calculated from time traces of bw_hat every 100ms (see Section 3.2).

    fairness = (sum(x_i))^2 / (n * sum(x_i^2))          (3.8)

Table 3.4 has the Jain fairness index measurements for ADAPT and SOCK compared to TCP with 500ms time slots.

Table 3.4: Jain fairness index measurements for SOCK and ADAPT latency controllers versus normal TCP without a latency controller.

    Number of flows    TCP     SOCK    ADAPT
    8                  0.84    0.86    0.84
    16                 0.78    0.85    0.86
    24                 0.73    0.82    0.80
    32                 0.70    0.79    0.76

To understand fairness in greater detail than allowed by the Jain index, we also convert bandwidth measurements from the same experiments into a form of CDF. We subdivide the timeline into uniform time slots (e.g. every 500ms). For each time slot, we compute the ratio of each flow's bandwidth to the fair bandwidth share (i.e. total bandwidth / number of flows). We then plot the CDF of the bandwidth share of all the flows. We prefer these CDFs to a single metric because their shapes convey unfairness both in terms of per-flow bandwidth (x position) and degree of affected flows (y position). In the case of perfect fairness, the CDF would appear as a step function, i.e. all flows get 1 unit of fair share bandwidth in every time slot. As fairness decreases, so will the slope of the CDF line. Also, if the left y-intercept is non-zero, it indicates that some of the flows experienced periods of total starvation.

Figure 3.7(b) plots the CDFs of ADAPT and TCP as the number of flows varies between 8 and 32. The vertical line around 1 represents the ideal sharing of bandwidth. Observe that TCP with 8 flows and ADAPT with 8 and 16 flows are the fairest, and TCP and ADAPT with 32 flows are the least fair. However, none of the flows experience starvation, and the rest of the runs fall within the range of fairness between TCP with 8 flows and 32 flows. This means ADAPT flows are at least as fair to each other as TCP flows. We believe this level of fairness to be acceptable.

Figure 3.7: Oldest un-acked and fairness of ADAPT compared to TCP with 8, 16, 24 and 32 flows: (a) oldest un-acked, (b) fairness with 500ms time slots.

We also calculate the application level utilization of the bottleneck link to confirm we are not under-utilizing available bandwidth in favor of lower round trip times.
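For reference, Equation 3.8 and the per-slot share ratios can be computed as in the short sketch below (illustrative code, not the evaluation scripts themselves).

    /* Sketch: Jain fairness index (Equation 3.8) for one time slot.
     * x[i] is the bandwidth allocated to flow i in the slot. */
    double jain_index(const double *x, int n)
    {
        double sum = 0.0, sum_sq = 0.0;
        for (int i = 0; i < n; i++) {
            sum    += x[i];
            sum_sq += x[i] * x[i];
        }
        return (sum * sum) / (n * sum_sq);
    }

    /* Sketch: per-flow sharing ratios used for the fairness CDFs. */
    void sharing_ratios(const double *x, int n, double *ratio_out)
    {
        double total = 0.0;
        for (int i = 0; i < n; i++)
            total += x[i];
        double fair_share = total / n;
        for (int i = 0; i < n; i++)
            ratio_out[i] = x[i] / fair_share;
    }

The overall index reported in Table 3.4 is then the average of this per-slot value over all 500ms slots.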
3.5 Incremental Deployment

In the results presented so far, all the flows participating in an experiment run were of the same type, i.e. they were either all running the same latency controller, or were simply TCP without a latency controller. We would like to investigate how our latency controller would perform if there are other flows in the network competing for bandwidth. This would be a very common situation if we were to incrementally deploy our latency controller in popular Internet applications.

We conduct a series of experiments with 16 flows but with a mixture of TCP and ADAPT flows. We vary the fraction of flows using TCP from 0 to 75 percent, while the rest use ADAPT. Figure 3.8 shows the latency results and Table 3.5 shows the median latency, 99.9 percentile and worst case latency for the flows using ADAPT, for each mixture of flows. Figure 3.8(b) is a magnification of the last 0.5% of Figure 3.8(a) on a larger time scale. As the fraction of TCP flows in the mix increases, the median latency worsens, but only slightly. Our latency controller still operates in an acceptable range.

Table 3.5: Oldest un-acked measurements for the ADAPT latency controller and mixed TCP with 16 flows. Increase in percentage of TCP flows has little effect on the median.

    TCP flow %    Median (ms)    99.9 percentile (ms)    Worst Case (ms)
    0             178.8          1159.8                  1990.6
    25            200.9          1288.1                  1744.7
    50            214.9          1470.2                  2732.4
    75            219.9          1264.1                  1835.6

Figure 3.8: Oldest un-acked of ADAPT with mixed number of TCP flows: (a) oldest un-acked, (b) oldest un-acked 99.5% to 100%.

The overall fairness plot of Figure 3.9 also shows a fair sharing of bandwidth between the flows. This plot was generated from bandwidth measurements of all flows, not only the flows using ADAPT. We would also prefer that our flows do not harm existing TCP traffic unduly. To verify the effect our flows have on TCP flows, we chose the case where 50% of the flows are TCP and plotted two individual CDFs of fairness in Figure 3.10, one for bandwidth allocation to ADAPT flows and one for TCP flows. We notice that the pure TCP flows receive slightly more bandwidth than their fair share, leaving the ADAPT flows with less bandwidth. The fact that ADAPT flows do not consume more bandwidth than they should is a highly desirable property.

Figure 3.9: Fairness of ADAPT with mixed number of TCP flows.

Figure 3.10: Fairness of ADAPT flows versus normal TCP flows. The total number of flows is 16, where 50% are ADAPT. Notice how TCP flows receive slightly more bandwidth than their fair share.

3.6 Summary

In this chapter we designed a latency controller for TCP connections. The controller is based on application level measurements of network round trip time and bandwidth. Using time traces of oldest un-acked messages and bandwidth, we quantify how the controller behaves in terms of time response and bandwidth sharing. Through evaluating our controller under severe network conditions, we discovered we can achieve good time responses without disrupting TCP's properties of bandwidth sharing and utilization. We also explored the effect flows using our latency controller may have on normal TCP flows in the network. We observed our flows experience a small increase in median latency, whereas TCP flows are allocated more bandwidth than their fair share. This confirms our latency controller is safe to be incrementally deployed in the Internet, while maintaining its performance.

Chapter 4
Transport: Failover

In this chapter we present a technique used by Paceline, wherein Paceline can automatically detect when the TCP connection enters a period of inactivity, even when there is data ready to be sent, and replace it with a standby TCP connection. In the following, first we explain why we need this technique, then we explain how connection management is done in QStream. Finally, we explain how we implemented it and evaluated its usefulness.

4.1 Motivation

The results of the latency controller are extremely pleasing when compared to normal TCP. However, in the extreme cases tested in the previous chapter, there are worst case situations in which even using the latency controller may not be satisfactory for time sensitive applications. Table 4.1 lists worst case oldest un-acked measurements for the ADAPT and SOCK latency controllers along with TCP. Notice how the worst case increases as the number of flows increases. Even though the cases between the 99.9 percentile and the worst case happen at most 0.1% of the time, such delays could be unsatisfactory. For example, for a 100ms sampling period, 0.1% translates to one quality impairment every 100 seconds. We would prefer to minimize the magnitude of those impairments, if possible.

Table 4.1: Worst case oldest un-acked measurements for SOCK and ADAPT latency controllers versus normal TCP without a latency controller.

    Number of flows    Worst Case (ms)               Worst Case / Network RTT
                       TCP        SOCK      ADAPT    TCP      SOCK    ADAPT
    8                  3245.0     525.0     1085.3   36       5.8     12.1
    16                 5192.6     1191.8    2228.9   57.7     13.2    24.8
    24                 6536.7     1694.9    2311.0   72.6     18.8    25.7
    32                 18587.6    2239.9    3180.5   206.5    24.9    35.3

Under heavy congestion, TCP can experience back to back packet losses, leading to one or more retransmission timeouts. We observed such timeouts and exponential back-offs are correlated with many of the worst case round trip times. At some point, TCP spends a long time, on the order of hundreds of milliseconds and even seconds, waiting. Considering that the load on the network is constant, we believe these timeouts are spurious events, and there is no need for exponential back-off. We could effectively reduce some of the worst cases if in such a state we closed and switched from the current TCP connection to a fresh new connection, which is safe and will not disrupt the stability of the Internet [12]. This is similar to a web user clicking the stop button in a browser followed by reload, when loading a web page takes an unexpectedly long time. We call this mechanism failover, as we fail from one connection over to another one.

There are two components to failover: 1) connection management and 2) control. Our main contribution is in the control of failover, but first we overview connection management.

4.2 Connection Management

The connection management component implements the control plane functions associated with initiating and establishing the transport instances used by Message Port sessions (Figure 1.1). A session is a full duplex redundant channel.

For each session, the connection manager may employ several TCP sockets. The precise number of sockets and their roles depend on configuration settings, such as the number of standbys used for failover, as depicted in Figure 4.1.

Figure 4.1: Message port session.

Prior to the flow of application data, the server and client execute a mini-protocol to establish the initial set of session sockets. This protocol has two phases. The first phase exchanges a configuration greeting and response between the client process and server process, using the first socket. The client greeting sets the session-wide configuration parameters, such as a globally unique identifier (UUID) for the session, and the number of standby connections. The second phase of the protocol creates the remaining sockets, and exchanges a bind greeting and response on each of them. The bind greeting contains the UUID of the session, allowing the server to associate the socket with the correct session state. It also contains information necessary to synchronize the role of each socket between the client and server. We define four socket roles: client to server data, client to server acknowledgement channel, server to client data, and server to client acknowledgement channel. A data socket and an acknowledgement socket form a half duplex communication channel (from client to server or from server to client). After the initial set of channels is established, the connection manager may also remove and replace failed channels, as part of the failover functionality described next.

4.3 Implementation

4.3.1 Mechanism

The implementation of failover is fully transparent to the application. The failover mechanism works by maintaining a pool of channels between the pair of communicating applications. Each channel contains two TCP sockets. We only use one of these sockets (in each direction) for data delivery at a time. As the sender writes messages to the socket, it starts a timer set to failover_thresh for each message, indicating the maximum time frame allowed for that message to be acknowledged by the receiver. If a timer for a message fires, the sender migrates data delivery away from the current active channel to one of the available standby channels. All outstanding data has to be retransmitted over the new channel, since we have no information about how much of that data has successfully traveled through the network. Additionally, the receiver side needs some logic to suppress duplicate messages that may result from retransmission.
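The per-message timer and channel migration can be sketched as follows; this is an illustrative outline with hypothetical types and helpers, not the Paceline implementation.

    /* Sketch of the sender-side failover check: if the oldest unacknowledged
     * message has waited longer than failover_thresh, migrate to a standby
     * channel and queue all outstanding data for retransmission on it.
     * Types and field names here are hypothetical. */
    #include <stdbool.h>
    #include <stddef.h>

    struct channel;                       /* a data + ack socket pair */

    struct pending_msg {
        double sent_at;                   /* time the message was written */
        struct pending_msg *next;
    };

    struct session {
        struct channel *active;
        struct channel *standby;          /* pre-established spare */
        struct pending_msg *outstanding;  /* oldest message first */
        int failover_count;               /* echoed to keep both sides in sync */
    };

    bool failover_due(const struct session *s, double now, double failover_thresh)
    {
        return s->outstanding != NULL &&
               (now - s->outstanding->sent_at) > failover_thresh;
    }

    void do_failover(struct session *s)
    {
        struct channel *old = s->active;
        s->active = s->standby;           /* promote the standby channel */
        s->failover_count++;              /* carried in the failover message */
        /* Retransmit everything still outstanding on the new channel, close
         * the old one, and initiate a replacement standby (not shown). */
        (void)old;
    }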
As mentioned inthe previous section, client to server and server to client communication arehandled by separate channels, which allows failover to be managed indepen-dently in each direction as well.When replacing a failed channel, the connection mini-protocol sends afailover message from the sender to the receiver on the newly selected ac-tive channel sockets to ensure both sides are synchronized. This messageincludes a counter value, maintained by the sender side, incremented eachtime the sender fails the current active channel. If very severe congestioncauses more than one failover to occur back to back, then it is possible thatstandby channels will be established in an order di erent than how theywere initiated. The counter is used to ensure that the sender and receiverremain correctly synchronized on which channel is the newly active one.4.3.2 ControllerIf we manage to setup the failoverthresh to be greater than normal messagetransmission over TCP, but be less than TCP exponential back-o time-out, we are able to fail over at the right time. We calculate failoverthreshin a similar fashion that TCP calculates its retransmission timeout (RTO)[14]. We keep track of the EWMA smoothed values of maximum variancein round trip times experienced by messages, namely rttvar, when acknowl-edgements arrive. We then calculate the threshold using Equation 4.1, wherethresholdmin is set to 225ms, which is the minimum RTO threshold in TCP’scalculations.failoverthresh = MAX(thresholdmin;rttewma + 5 rttvar) (4.1)294.4. EvaluationNumber of  ows Median (ms) 99.9% (ms)ADAPT ADAPT + FO ADAPT ADAPT + FO8 140.0 140.0 584.9 419.316 182.5 179.7 1165.0 707.824 217.2 209.6 1470.9 840.332 264.9 239.4 2064.5 1023.5Table 4.2: Median and 99.9 percentile oldest un-acked measurements forthe ADAPT latency controller with failover compared to ADAPT withoutfailover. Notice how failover helps decrease the 99.9 percentile, withoute ecting the median.Number of  ows Worst Case (ms)ADAPT ADAPT + FO8 1085.3 611.816 2228.9 892.324 2311.0 1052.432 3180.5 1281.9Table 4.3: Worst case oldest un-acked measurements for the ADAPT la-tency controller with failover compared to ADAPT without failover. Noticehow failover helps decrease the worst case latency.Notice TCP uses four times the variance in round trip time, but havingtested both four and  ve, we use  ve. Failing over to a new connection is acostly function, since we are retransmitting some data, and also in the caseof false positives (i.e. failover timer  ring incorrectly), we are swapping aTCP connection with a large CWND with a fresh TCP connection withCWND = 1 in slow start. Considering the fact that our measurementsare done in the application level and are subject to noise, we add a safetymargin of an extra rttvar to decrease false positives. Our results con rm afactor of  ve has fewer false positive than four.4.4 EvaluationWe implemented failover as a complement to our latency controllers. How-ever, using failover with normal TCP does not seem like a reasonable solu-tion, since huge portions of the delay experienced by messages is inside thekernel bu ers, not in the network. This causes our failover threshold calcu-304.5. Incremental DeploymentNumber of  ows Jain fairness indexTCP ADAPT8 0.84 0.8616 0.78 0.8524 0.73 0.8132 0.70 0.76Table 4.4: Jain fairness index measurements for ADAPT latency controllerwith failover vs TCP.lation to mis-behave. 
We performed the same experiments we did with theADAPT latency controller in Section 3.4 armed with failover using the sameexperimental setup. Table 4.2 presents the median and 99.9 percentile ofADAPT with and without failover, Table 4.3 has the worst case. We noticehow failover has almost no e ect on the median, but e ectively decreasesthe worst case to less than half.Table 4.4 shows the Jain fairness index for ADAPT with failover com-pared with TCP. The fact that the fairness indices don’t change noticeablywith failover is a positive indication that our failover threshold calculationis not mis-behaving.4.5 Incremental DeploymentOnce again we evaluate our latency controller while competing with normalTCP  ows, this time aided with failover. Figure 4.2 shows the latencyresults, and Table 4.5 presents the median, 99.9 percentile and worst caselatencies for a mixture of 0%-75% TCP  ows. We notice the worst case hasimproved, as compared to results of Section 3.5, without major e ects onthe median.We do not present the fairness plots for this evaluation since they arevery similar to that of Section 3.5. This con rms correctness of the failovermechanism.4.6 SummaryIn this chapter we presented a supplementary technique, failover, to the la-tency controller designed in Chapter 3. Failover is designed to detect whenTCP su ers from exponential back-o , and close and move data transferfrom the current stuck TCP connection to an already open standby TCP314.6. Summary 0 0.2 0.4 0.6 0.8 1 0  100  200  300  400  500  600  700  800  900  1000FractionLatency (ms)0%-TCP25%-TCP50%-TCP75%-TCP(a) Oldest un-acked 0.995 0.996 0.997 0.998 0.999 1 0  500  1000  1500  2000  2500  3000FractionLatency (ms)0%-TCP25%-TCP50%-TCP75%-TCP(b) Oldest un-acked 99.5% to 100%Figure 4.2: Oldest un-acked and fairness of ADAPT with mixed number ofTCP  ows.324.6. SummaryTCP  ow % Median (ms) 99.9 percentile (ms) Worst Case (ms)0 179.4 679.9 892.525 188.5 699.6 1069.850 211.5 732.6 1012.375 208.3 827.3 1178.5Table 4.5: Oldest un-acked measurements for the ADAPT latency controllerwith failover and mixed TCP with 16  ows. Increase in percentage of TCP ows has little e ect on the median.connection. Results show how failover can e ectively decrease the worse caseoldest un-acked message to less than half. We also con rmed that our la-tency controller still behaves in an acceptable manner when competing withnormal TCP  ows, and observed we do not hurt other TCP  ows severely.This can be considered a green light towards incremental deployment of ourlatency controller in popular Internet applications.33Chapter 5Application: Live-InIn this chapter we focus on an application layer built on top of Paceline.Referring back to Figure 1.1, Live-In resides in QVid. Live-In was designedto provide QStream with the ability to stream, in real time, live video cap-tured and encoded directly from a camera, e.g. from a webcam as donein video conferencing. In the rest of this chapter, we  rst explain the ba-sic functionality of QStream and Live-In, then evaluate the application levelperformance of streaming live video over Paceline compared to using normalTCP.5.1 QStream and Live-inFigure 5.1 provides a detailed view of QVid. QVid is an adaptive videoplayer and consists of various components. QVid can either run in servermode, i.e. QServer, or in client mode, i.e. QClient. They communicatewith each other using Paceline via Pps Server/Pps Client that just handleseralization/de-serialization of application data units. 
QServer is responsible for locating and streaming video data to QClient, which in turn provides the means for decoding and displaying the video data using the QVid player, Decoder and Video-Out.

Figure 5.1: QVid structure.

Upon connection, QClient requests one or more videos from QServer. QServer initializes either File-In or Live-In based on the requested filename. A filename of "live" indicates that live data has been requested by the client; otherwise the filename is looked up on disk. Once the source of data has been found, QServer transmits the video frames to QClient through Pps Server. Upon notification of receipt of video data by Pps Client, QClient forwards the data to the player, which in turn decodes the data and prepares it for display via Video-Out.

Live-In lies between QServer and Video-In. Video-In initializes the capturing device and starts pumping out raw captured images at an almost regulated rate specified by Live-In. If the rate at which images are captured from the device is slower (faster) than the requested frame rate, Video-In duplicates (drops) some frames to maintain a constant frame rate. Live-In is notified each time a frame is output from Video-In through the on capture callback.

Raw frames are fed into an SPEG encoder as soon as they become available, and Live-In is notified that a frame has been encoded by the encoder produce callback, which supplies a bitstream of the encoded frame. The bitstream contains a header with meta data about the frame, a frame base layer, which is required for reconstructing the frame, and at most seven enhanced spatial layers that provide extra detail for the base frame.

Video-In tries to regulate the rate at which frames are output. However, the frames that are output are not always evenly spaced in time. To ensure frames are transmitted at a constant rate, Live-In does not transmit the frames as soon as they are encoded. Instead, once the bitstream has been parsed, two events are scheduled for the frame: an output event and an expire event. When the output event fires, the frame can be forwarded to Pps Server to be transmitted over the network. The output event schedule time is obtained from: video start time + frame number / frame rate.

QStream is based on adaptation. We expect our video applications to adapt to the rate at which data can be sent. This requires the application to be able to cancel frames when there is no longer any use in sending them. The expire event is scheduled for such occasions, and we cancel a frame when it fires. The expire event is scheduled using the live timeline.
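To make the two events concrete, here is a small Python sketch of how they could be computed from the timeline. It is our own illustration under stated assumptions: the parameter names (video_start_time, frame_rate, xmit_window) are not Live-In's actual identifiers, and the expire time shown here ignores the gop handling described below.

    # Sketch: compute the output and expire times for one encoded frame.
    # Times are in seconds; frame_rate is frames per second; xmit_window is in frames.
    def schedule_frame_events(frame_number, video_start_time, frame_rate, xmit_window):
        # The frame may be forwarded to Pps Server once the output event fires.
        output_time = video_start_time + frame_number / frame_rate
        # Once the xmit window has passed, the frame is too late to be useful
        # and the expire event cancels it.
        expire_time = output_time + xmit_window / frame_rate
        return output_time, expire_time

    # Example: frame 90 of a 30 fps stream with a 3-frame xmit window
    # fires its output event at t = 3.0 s and expires at t = 3.1 s.
    out_t, exp_t = schedule_frame_events(90, 0.0, 30.0, 3)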
QStream uses an adaptive timeline for playing stored video [10]. The adaptive window size grows up to 30 seconds for pre-fetching data from storage. However, the timing requirements for live video are slightly different. Figure 5.2 shows the timeline used by Live-In. Setup refers to when the image is captured, and xmit is the time at which the frame is encoded and can be forwarded to Pps Server for transmission. Decode is the time at which decoding of the frame should begin. At this point, the decoder starts decoding with whatever number of layers it has received from that frame so far, unless either the frame meta data or the base layer is missing; in that case, the entire frame is dropped. Decode is also the time at which it is too late to transmit any data for the current frame, which is why we refer to decode as xmit expire. Finally, output is the time at which we expect the frame to be displayed, which is the time by which the decoder should finish decoding.

Figure 5.2: Live timeline.

Two of the times between these different events, which are measured in number of frames, are set to constant values for a given session. We chose one frame time for setup to xmit and two frames for decode to output. The xmit window size, the time between xmit and xmit expire, is a user configurable parameter. We will get back to the xmit window size in Section 5.2.

In normal operation QStream does not assume clock synchronization. The server and client independently keep track of the timeline. This may cause the timelines to be out of phase or out of rate: out of phase because of network delay, and out of rate because of inaccurate hardware clocks. For the purpose of evaluation, we synchronize the timelines with NTP, but in normal operation a feedback loop inside the application synchronizes the timelines.

The frame types output from the encoder are not always I-frames. The bitrate of the video can be greatly reduced if a larger group of images (gop) is used. However, a gop size greater than one introduces inter-frame dependencies, and these dependencies must be handled with care. If all expire events of frames are set to xmit expire in the timeline and an I-frame is not transmitted by its deadline, it will be dropped by the sender. This means all subsequent P-frames in the gop, even if they make it through to the other side, are useless to the decoder. The same can happen for P-frames dependent upon earlier dropped P-frames. To handle this, we postpone expiring all frames in a gop to the xmit expire of the last frame. To ensure the frames are not re-ordered in Message Port (recall that Message Port re-orders data based on utility), the utility of each frame in the gop decreases as the frame approaches the end of the gop. One minor problem of this approach is that the gop output from the encoder is not always equal to the requested gop size: the encoder may have to end the current gop and start a new one if the captured video has sudden changes. A shorter gop will cause its frames to expire later than they should, and to be transmitted if bandwidth becomes available. Fortunately, this is a rare case, and we leave a more sophisticated solution for future work.
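A small Python sketch of this gop handling, under our own assumptions (the Frame structure and field names below are illustrative, not QStream's actual data structures):

    from dataclasses import dataclass

    @dataclass
    class Frame:                 # illustrative stand-in for a frame's metadata
        output_time: float       # seconds at which the output event fires
        expire_time: float = 0.0
        utility: int = 0

    def prepare_gop(frames, frame_time, xmit_window):
        """Postpone expiry to the end of the gop and assign decreasing utilities."""
        # Every frame expires when the last frame's xmit window closes, so an
        # I-frame is never dropped while P-frames that depend on it are still useful.
        gop_expire = frames[-1].output_time + xmit_window * frame_time
        for i, frame in enumerate(frames):
            frame.expire_time = gop_expire
            # Utility decreases toward the end of the gop so that Message Port,
            # which re-orders messages by utility, keeps the frames in decode order.
            frame.utility = len(frames) - i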
QServer creates a new instance of all the resources of Figure 5.1 for every client. Unlike other resources in the system that can be shared between different processes, only one instance of Video-In can exist at any time. Hence, for Live-In to support a multi-party video conference, different instances of Live-In must be able to share the same instance of Video-In. The first instance of Live-In initializes Video-In and encodes the captured frame. Once the encoded frame is ready, it is duplicated for each instance of Live-In, if more than one instance exists. Each Live-In re-numbers its own copy of the frame's data structures to match the current state of its session and forwards it to its own Pps Server upon the output event.

Live-In was implemented to be compatible with the QVid player. This implies Live-In should support commands such as pause, play and seek, even though they may not have a reasonable meaning for live video. In QServer, the response to a seek is usually followed by the first video frame. It is important to ensure the first frame sent to the other side is an I-frame, so Live-In refrains from sending a seek result until an I-frame is produced by the encoder. A big gop size in the encoder can cause a noticeable delay in such cases. A specific client pausing the video is equivalent to the corresponding instance of Live-In dropping all of its duplicated data. In the case where all viewers pause their video, the data is no longer encoded and is dropped right after being captured. The reason we do not stop capturing frames is that initializing the capture device takes a noticeably long time. So as long as there are viewers of the video, even paused viewers, we continue capturing and locally dropping captured video frames.

Figure 5.3 shows an image captured from the output of Live-In when streaming a 30 frame per second video locally. The clock on the bottom is system time. The Local Webcam window is the image displayed right after being captured, without going through the encoder. Notice how it is 100ms behind the local clock, even though we allowed one frame of encoding time. This is due to the delay the camera has in capturing and outputting the image, along with the delay of displaying the image. The Live window is the video that has gone through the process of encoding, transmission and decoding with an xmit window size of 3 frames. The 200ms delay between the two windows is due to a six frame timeline (one frame for setup, 3 for xmit and 2 for decode) at a 33ms frame time.

Figure 5.3: Live-In output for local streaming.

In the next section we investigate the application level performance when running over Paceline.

5.2 Application Level Evaluation

In this section we evaluate the performance of streaming live video over a congested network with and without a latency controller, using the same network setup as Section 3.4. However, since capturing directly from a camera was not possible in our Emulab setup, we instead stream a special stored video with the live timeline, which we believe closely resembles a live video. The video has an average bitrate of 5Mbps at 30 frames per second, representing a high definition video conference.

In the first experiment, we varied the xmit window size. When the client and server timelines are static and synchronized, the timeline configuration almost exactly determines the end-to-end delay, and frames that are displayed on the screen have a maximum end-to-end delay specified by the timeline (i.e. setup + xmit window + decode). Therefore, we define goodput to be the percentage of bandwidth which contributed to data used and displayed by the application, i.e. data that arrived on time for decoding. This quantifies the overhead of cancelling messages during transfer. In this experiment we increased the xmit window size to examine the trade-off between low latency interaction and goodput.
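Written out explicitly (in our notation, not the thesis's), the definition is:

goodput = 100 × (bytes that arrived in time to be decoded and displayed) / (total bytes acknowledged)

which is why the plots report goodput as a percentage of acked data.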
Figure 5.4: Xmit window size versus application goodput. The end-to-end latency is the xmit window size plus a constant 3 frames for setup and decode (1 frame = 33.3ms).

In Figure 5.4, we see that for 8 video flows with a total bitrate of 40Mbps going through a 16Mbps link, the goodput generally increases as the xmit window size increases. Notice that the end-to-end latency is equal to the xmit window size plus 3 frames for setup and decode, and each frame is 33.3ms. For normal TCP, goodput is significantly lower than ADAPT with failover in the low latency range below 1 second (30 frames). We also see that allowing applications more than 1 second of xmit window increases the goodput only very slowly, with negligible improvement. This means that in a congested network where a high definition video conference runs with a 5 frame xmit window size (166ms) and more than 60% goodput, the same application would have to allow more than 60 frames of delay (2 seconds) to achieve the same goodput on normal TCP. Recall from Table 4.2 that the median latency for 8 ADAPT flows with failover is 140ms.

Goodput represents how well the application is able to make use of bandwidth, but considering that only a few spatial layers are enough to reconstruct a frame, an application with low goodput may still be able to present a reasonable frame rate to the user. This is because our application allows configuration of a policy to balance the importance of spatial and temporal quality. These experiments were run with the default policy of preserving temporal quality over spatial quality, which is based on the subjective observation that a decrease in frames per second is more annoying than a decrease in spatial fidelity. To investigate how Paceline can affect the frame rate, using an xmit window size of 8 frames (267ms), we run a variable number of videos, from 3 (the lowest number of videos able to saturate the bottleneck link) to 16, using ADAPT with failover or TCP. We then calculate the average number of frames per second in Figure 5.5. Figure 5.5(a) shows the average frames per second and Figure 5.5(b) shows the average spatial layers used to reconstruct frames.

Figure 5.5: Average frames per second and average spatial layers with an 8 frame xmit window size: (a) average frames per second; (b) average spatial layers.

When the number of flows increases, as we expect, the average number of spatial layers used in decoding frames decreases for both ADAPT and TCP. However, with 5 videos TCP's average frame rate drops below 5fps, whereas ADAPT maintains an average of 25fps with 5 videos, and its frame rate drops to 8 only with 16 videos. Using ADAPT, we are able to stream a high definition live video in a congested network with a negligible decrease in frame rate, or equivalently, allow 3 times more users to use the network and experience the same quality of service they previously would when using TCP. Additionally, with ADAPT the frame rate decreases gracefully as the network undergoes higher loads.

5.3 Summary

In this chapter we presented an application layer inside QStream called Live-In. The main purpose of Live-In is to enable QStream applications to stream live video captured from a webcam over the Internet in real time. We evaluated Live-In over Paceline in terms of goodput and average frames per second under different network conditions. Our comparison with Live-In over TCP shows significant improvements. Further evaluation of Live-In is still ongoing work.

Chapter 6

Application: QPS

In this chapter we present another application called QStream Publish Subscribe (QPS). Our experience with Live-In over Paceline proved QStream to be very desirable for video conferencing: we can stream high definition video (5Mbps bitrate) over a congested network with an end-to-end latency of 200ms and very good quality, and our application also supports multi-party conferencing.
However, for normal Internet users there is one valuable resource that tends to become the bottleneck when video conferencing: upload bandwidth. Most ISPs provide much more download bandwidth than upload bandwidth, sometimes at a ratio of up to 10.

The peer-to-peer conferencing of Figure 6.1(a) (currently supported by Live-In) requires N times the bandwidth of one video stream, for both uploading and downloading, where N is the number of peers. To make our video applications more scalable, we decided to design a publish-subscribe application that could act as the broker in Figure 6.1(b), where each peer uploads at most one video. The download bandwidth requirements for the peers are still the same as before, and Qps requires N² times the bandwidth of one video stream.

Figure 6.1: Peer-to-peer conferencing vs. QPS conferencing: (a) video conference with peer-to-peer connections; (b) video conference using QPS.

The peer-to-peer approach might be acceptable if last mile bandwidth were not the limit. However, this seems unlikely in the foreseeable future, so even though the second architecture has a potential increase in end-to-end latency, it is currently a reasonable solution, especially for cloud computing. Amazon charges at most $0.17/GB for any data transfer. In a video conference with 5 peers with 1Mbps videos, Qps will use 30Mbps of bandwidth. This is equal to a total cost of $0.306/minute for video conferencing, or equivalently $0.061/minute for each peer. This rate is comparable to the rate of VoIP services provided by companies such as Skype.

6.1 Description

Qps was designed to be a general purpose pub-sub library, with the ability to support any form of real-time traffic, and is built on top of Paceline. Handling specific data messages, which depend on the type of application using Qps (i.e. video data, game data, web data, etc.), is done by the application itself via various callbacks. The application created to support video data is QpsDemo.

Qps creates a global listener that waits for publishers and subscribers. All Qps connections are initiated through this listener. Every connecting client must register itself with Qps with a unique name, which will be used for all future identification purposes, and the type of service it provides, i.e. whether it is a publisher or a subscriber. If the newly connected client is a subscriber, it receives the list of all current publishers from Qps. If the new client is a publisher, its name is added to the list of publishers. Each time a publisher is added to or removed from the list, all subscribers are notified immediately.

A subscriber may subscribe to any publisher available in the list. The pseudo code Qps uses for handling incoming subscribe messages can be found in Program 6.1. A subscriber sends a sub message to Qps, along with the name of the desired publisher. Qps looks for an active session from the requested publisher in a hash table of sessions. If no such session is found, Qps creates a new Pps Client and connects to the publisher. Once data is received from the publisher, a listener is created for the new session and the port number of the listener is forwarded to the subscriber. If an active session is found in the hash table, the port number is simply forwarded to the subscriber, since there is no need to re-create the listener. The subscriber then connects to the listener via its own Pps Client, as it normally would when directly connecting to QServer. Since QpsDemo was designed for live video only, a filename of "live" is used in all connections.
Previously, Live-In contained the logic for handling multiple clients of a live stream. It managed data duplication, re-numbering and forwarding via separate Pps Servers, or dropping data in the case where the video had been paused by the client. It also handled cancelling data, via the expire event, once it was too late to send it. Similar logic is required inside Qps for every session, and is handled by what we call a Qps viewer. Figure 6.2 shows the relation between a session and a viewer. A session includes one or more viewers, each one serving a single QClient via its own Pps Server. As before, the video flow is only paused when all viewers pause the video. Qps uses the same live timeline of Figure 5.2, with the difference that there is no need for the setup part of the timeline, since incoming data is already captured and encoded. One difference between Qps viewers and Live-In is that in Live-In an encoded frame becomes available all at once when the encoder finishes encoding, whereas in Qps different parts of a frame arrive from the network with some delay, and some do not make it at all. For this reason, we currently duplicate and forward data for a session as it arrives from the publisher.

Figure 6.2: Qps session and viewers.

Program 6.1 Pseudo code for handling incoming subscribe messages. Qps receives a subscribe message from subscriber sub requesting subscription to publisher pub.

function recv_subscribe(sub, pub):
    session = lookup_session(pub.name)
    if session == NULL:
        session = new_session()
        store_session(session)
        session.pps_client = create_pps_client()
        session.pps_client.connect(pub)
        session.pps_client.recv_data(pub)
        session.listener = new_listener()
    end
    sub.send(session.listener.port)
end

When a subscriber un-subscribes from a session, its corresponding viewer is deleted, but as long as there are active viewers in the session, the session is kept alive. Once all viewers un-subscribe, the listener is deleted and the session is terminated. If a publisher leaves the session, all viewers of the session are notified. Since we do not know the state of data in flight, we wait for the subscribers to un-subscribe before we terminate the session. Figure 6.3 shows a client's view of Qps in a simple setting where it has subscribed to two video streams provided by two different publishers.

Figure 6.3: Qps from a subscriber's view.
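The per-session fan-out described above can be summarized with a short Python sketch. It is our own illustration under stated assumptions: the session and viewer attribute names are invented, and the real Qps viewers also re-number frame data structures and track utilities and expire events.

    from copy import deepcopy

    def on_publisher_data(session, chunk):
        """Duplicate and forward publisher data to every non-paused viewer."""
        # Parts of a frame arrive from the network with some delay, so data is
        # forwarded as it arrives rather than waiting for a complete frame.
        active = [v for v in session.viewers if not v.paused]
        if not active:
            return  # every viewer has paused: drop the data
        for viewer in active:
            viewer.pps_server.send(deepcopy(chunk))  # each viewer gets its own copy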
6.2 Preferential Subscriptions

We have provided the ability for users of Qps to subscribe to the data they choose, but that is not all: what if Qps could make the decision for the users? The work done in the context of preference aware pub/sub [11] was the main motivation behind creating Qps. In traditional pub/sub systems, users subscribe to the events in which they are interested, and are notified when such events occur; the duties of the pub/sub system amount to forwarding events to the proper handlers. However, in some systems the average number of subscriptions may be high enough to overwhelm the subscribers. In such systems, the user provides the pub/sub system with preferential criteria to rank events in order of importance. The pub/sub system can then forward only a constant number of the most important events to any subscriber, based on their preference.

We added support for preferential subscriptions to Qps. Qps maintains a list of subscribers, and for each element in the list keeps a list of viewers, i.e. it keeps track of all the sessions a specific subscriber is a member of. Performing any kind of mix on the viewers then requires only minor changes to data structures, such as swapping data pointers. As a demonstration, the special option in Figure 6.3 enables preferential subscriptions on Qps. This causes Qps to flip the incoming data for the first two viewers of this subscriber every two seconds.

Such real-time video mixing can be useful on many occasions. Consider the live TV coverage of a sporting event, as in Figure 6.4. Many cameras cover such events, each one following a specific player or recording from a specific angle. All videos are fed into the mixing station, where they are multiplexed manually into a single video stream, which is then broadcast. However, the interests of the spectators are not always the same as the TV director's, and modern cameras provide information about the directions in which they are pointing. A more desirable solution would be to substitute the TV mixing station with a real-time video mixer like Qps. Spectators could specify their order of preference based on the area of coverage and location of each camera, and watch a video edited specifically for them.

Figure 6.4: Live TV coverage of an event.

Another direction in which we plan to take Qps is the realm of online multi-player games. Donnybrook [2] is an online multi-player game based on Quake3 with the ability to support up to 900 players. Since the game state grows as the number of players increases, not all clients have the ability to forward their data to every player, similar to the problem we had with peer-to-peer conferencing. Qps could be used to forward some of the game state data in such games.

Evaluation of Qps is ongoing work and is not in the scope of this thesis. In this chapter we presented Qps as an aid for video conferencing. Qps follows some of the same logic inside Live-In for handling multiple clients. We also added a simple form of preferential subscriptions to Qps, as an introduction to our future work.

Chapter 7

Conclusion and Future Work

A transport that can provide high bandwidth with low latency can make way for the next generation of multimedia applications; high definition video conferencing and massively multiplayer online games are two such applications. However, the lack of a suitable solution for such a transport has been a major issue for many years. In this thesis our main contribution was to move closer to bringing such a transport to life.

We designed an application level latency controller on top of TCP. The latency controller uses measurements of the network's bandwidth and latency to calculate the amount of data it allows to be passed down to TCP. We also aided our latency controller with a failover mechanism that automatically detects when a TCP connection becomes "stuck", i.e. suffers back to back packet drops and moves into the exponential back-off state. It helps improve latency by replacing the currently stuck TCP connection with a new standby connection.

We evaluated Paceline using an Emulab network setup, comparing it to TCP. We observed that it can noticeably improve the latency performance of TCP, while leaving TCP's utilization and fairness properties almost intact. We also confirmed that flows using Paceline do not harm other TCP traffic inside the network.

On top of Paceline, we designed two different applications for live video streaming. We evaluated the performance of the applications when using Paceline, compared to using normal TCP.
Our evaluation showed significant improvements over TCP.

The work presented here may be continued in many possible directions. In our evaluation we tested a normal droptail FIFO queue at the bottleneck link, which represents the majority of the Internet. However, it may be interesting to investigate how our results vary when the network provides active queue management for our flows.

We use SOCK as a reference for an optimal solution. Our results show that ADAPT's performance is better than SOCK's at lower network loads, and shifts to becoming worse under heavy load. Perhaps we could tune our latency controller to at least be in the same ballpark as SOCK under heavy load. Additionally, SOCK may in turn not be the best possible solution under extremely heavy load; such zones of operation may also be of interest. Finally, even with failover enabled, we still observe some high worst case latencies. Examining the worst cases in greater detail with the help of packet level analysis, e.g. using tools like tcpdump and wireshark, may point the way to further improvements.

Qps can still undergo further improvements, one of which stands out in importance. The preferential subscription criteria are currently hard-coded in the source. We would prefer the criteria to be more flexible and to give users the ability to define and change their criteria using a high level language.

Bibliography

[1] Sumitha Bhandarkar, A. L. Narasimha Reddy, Yueping Zhang, and Dimitri Loguinov. Emulating AQM from end hosts. In SIGCOMM '07: Proceedings of the 2007 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, pages 349-360, New York, NY, USA, 2007. ACM.

[2] Ashwin Bharambe, John R. Douceur, Jacob R. Lorch, Thomas Moscibroda, Jeffrey Pang, Srinivasan Seshan, and Xinyu Zhuang. Donnybrook: enabling large-scale, high-speed, peer-to-peer games. SIGCOMM Comput. Commun. Rev., 38(4):389-400, 2008.

[3] Lawrence S. Brakmo and Larry L. Peterson. TCP Vegas: end to end congestion avoidance on a global Internet. IEEE Journal on Selected Areas in Communications, 13(8):1465-1480, 1995.

[4] Deepak Bansal, Hari Balakrishnan, Sally Floyd, and Scott Shenker. Dynamic behavior of slowly-responsive congestion control algorithms. In Proceedings of SIGCOMM 2001, San Diego, CA, August 2001.

[5] Bryan Ford. Structured streams: a new transport abstraction. SIGCOMM Comput. Commun. Rev., 37(4):361-372, 2007.

[6] Chuck Fraleigh, Sue Moon, Bryan Lyles, Chase Cotton, Mujahid Khan, Deb Moll, Rob Rockell, Ted Seely, and Christophe Diot. Packet-level traffic measurements from the Sprint IP backbone. IEEE Network, 17:6-16, 2003.

[7] Ashvin Goel, Charles Krasic, and Jonathan Walpole. Low-latency adaptive streaming over TCP. ACM Transactions on Multimedia Computing, Communications, and Applications, 4(3):1-20, 2008.

[8] R. Jain, D. Chiu, and W. Hawe. A quantitative measure of fairness and discrimination for resource allocation in shared computer systems. Technical Report TR-301, DEC Research, September 1984.

[9] Eddie Kohler, Mark Handley, and Sally Floyd. Designing DCCP: congestion control without reliability. SIGCOMM Comput. Commun. Rev., 36(4):27-38, 2006.

[10] Charles Krasic and Jean-Sébastien Légaré. Interactivity and scalability enhancements for quality-adaptive streaming. In MM '08: Proceedings of the 16th ACM International Conference on Multimedia, pages 753-756, New York, NY, USA, 2008. ACM.
[11] Marina Drosou, Kostas Stefanidis, and Evaggelia Pitoura. Preference-aware publish/subscribe delivery with diversity. In DEBS '09: Proceedings of the 3rd ACM International Conference on Distributed Event-Based Systems. ACM, 2009.

[12] Amit Mondal and Aleksandar Kuzmanovic. Removing exponential backoff from TCP. SIGCOMM Comput. Commun. Rev., 38(5):17-28, 2008.

[13] L. Ong and J. Yoakum. An Introduction to the Stream Control Transmission Protocol (SCTP). RFC 3286, May 2002.

[14] V. Paxson and M. Allman. Computing TCP's Retransmission Timer. RFC 2988, November 2000.

[15] Andreas Petlund, Kristian Evensen, Pål Halvorsen, and Carsten Griwodz. Improving application layer latency for reliable thin-stream game traffic. In NetGames '08: Proceedings of the 7th ACM SIGCOMM Workshop on Network and System Support for Games, pages 91-96, New York, NY, USA, 2008. ACM.

[16] Maxim Podlesny and Sergey Gorinsky. RD network services: differentiation through performance incentives. SIGCOMM Comput. Commun. Rev., 38(4):255-266, 2008.

[17] Steven Wilson. An Adaptive Packet Size Approach to TCP Congestion Control. Master's thesis, University of British Columbia, Canada, October 2008.

[18] www.qstream.org.
