research-article

Prophet: Toward Fast, Error-Tolerant Model-Based Throughput Prediction for Reactive Flows in DC Networks

Published: 15 December 2020

Abstract

As modern network applications (e.g., large data analytics) become more distributed and can conduct application-layer traffic adaptation, they demand better network visibility to better orchestrate their data flows. As a result, the ability to predict the available bandwidth for a set of flows has become a fundamental requirement of today's networking systems. While previous studies address the case of non-reactive flows, prediction for reactive flows, e.g., flows managed by TCP congestion control algorithms, remains an open problem. In this paper, we take the first step toward solving this problem in a data center network. To address both theoretical and practical challenges, we introduce a novel learning-based prediction system based on the NUM model, with two key techniques named fast factor learning (FFL) and efficient flow sampling. We adopt novel techniques to overcome practical concerns such as scalability, convergence, and unknown system parameters. A system, Prophet, is proposed that leverages the emerging technologies of Software Defined Networking (SDN) to realize the model. Evaluations demonstrate that our solution achieves significant accuracy in a wide range of settings.


Published in

IEEE/ACM Transactions on Networking, Volume 28, Issue 6
Dec. 2020
457 pages

1063-6692 © 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

Publisher

IEEE Press
