Abstract
As modern network applications (<italic>e.g.</italic>, large data analytics) become more distributed and can conduct application-layer traffic adaptation, they demand better network visibility to orchestrate their data flows. As a result, the ability to predict the available bandwidth for a set of flows has become a fundamental requirement of today’s networking systems. While previous studies address the case of non-reactive flows, prediction for <italic>reactive flows</italic>, <italic>e.g.</italic>, flows managed by TCP congestion control algorithms, remains an open problem. In this paper, we take the first step toward solving this problem in a data center network. To address both theoretical and practical challenges, we introduce a novel learning-based prediction system based on the network utility maximization (NUM) model, with two key techniques named <italic>fast factor learning</italic> (FFL) and <italic>efficient flow sampling</italic>. We adopt novel techniques to overcome practical concerns such as scalability, convergence, and unknown system parameters. We propose a system, Prophet, that leverages the emerging technologies of <italic>Software Defined Networking</italic> (SDN) to realize the model. Evaluations demonstrate that our solution achieves high prediction accuracy across a wide range of settings.
Index Terms
- Prophet: Toward Fast, Error-Tolerant Model-Based Throughput Prediction for Reactive Flows in DC Networks