Adjusting ECN marking threshold in multi-queue DCNs with deep learning


Amanov A., Majidi A., Jahnabakhsh N., ÇETİN A.

JOURNAL OF SUPERCOMPUTING, vol.79, no.5, pp.5443-5468, 2023 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 79 Issue: 5
  • Publication Date: 2023
  • DOI Number: 10.1007/s11227-022-04893-7
  • Journal Name: JOURNAL OF SUPERCOMPUTING
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Applied Science & Technology Source, Compendex, Computer & Applied Sciences, INSPEC, zbMATH
  • Page Numbers: pp.5443-5468
  • Keywords: Congestion control, Multi queue, Explicit congestion notification, Cut payload, SIZE
  • Gazi University Affiliated: Yes

Abstract

Explicit Congestion Notification (ECN) was designed for single queues, but today's data center networks (DCNs) need multiple queues on each switch port. If some queues on a switch port exceed the ECN marking threshold in a multi-queue scenario, all packets on that port can receive the ECN mark. We propose Mapping-ECN as a systematic answer to this wrong-marking problem. First, we differentiate mice and elephant flows with a learning algorithm. Then, we prioritize mice flows while keeping the deadlines of the other flows in mind, so that those flows are not sacrificed. Second, a marked packet is given the privilege of using a faster path than other packets, so that network status is reported early. This gives a complete picture of the instantaneous requests from all senders. In the worst case, when the buffer has no capacity left to transmit packets that exceed its threshold, Mapping-ECN uses Cut Payload (CP): when a queue reaches the threshold, CP drops the payload of a packet rather than its metadata. Consequently, just one bit carrying the information of the packet is transmitted, and the sender retransmits that packet immediately instead of waiting for a timeout as TCP does. In an extremely low-latency network, this retransmission can arrive within a millisecond. Finally, Mapping-ECN explores different kinds of neural network techniques to avoid mis-marking in the output port buffer: packets that were already marked within the queue buffer are not considered again for marking decisions within the output port buffer. Compared with MQ-ECN, Mapping-ECN improves the overall Flow-Completion Time (FCT) for short flows by around 7%, the 99th percentile by around 52%, and the FCT for short flows by around 8%. Moreover, compared with MQ-ECN, Mapping-ECN improves the FCT for large flows, cache flows, and mice (web search) flows by 4%, 15%, and 6%, respectively. This improvement is also evident in comparison with DemePro and Priority-ECN.
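
The wrong-marking problem and the Cut Payload fallback described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' Mapping-ECN implementation and does not include the neural network component. The names `Packet`, `mark_per_port`, `mark_per_queue`, and `cut_payload`, as well as the thresholds and queue depths, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Packet:
    """Hypothetical packet model: header fields plus a payload."""
    flow_id: int
    payload: bytes
    ecn_marked: bool = False
    trimmed: bool = False  # True once the payload has been cut


def mark_per_port(occupancy: List[int], port_threshold: int) -> List[bool]:
    """Single-queue-style ECN applied to a multi-queue port.

    One threshold is compared against the total port occupancy, so every
    queue on the port gets the same marking decision.
    """
    congested = sum(occupancy) >= port_threshold
    return [congested for _ in occupancy]


def mark_per_queue(occupancy: List[int], thresholds: List[int]) -> List[bool]:
    """Per-queue marking: each queue is compared with its own threshold."""
    return [depth >= k for depth, k in zip(occupancy, thresholds)]


def cut_payload(pkt: Packet) -> Packet:
    """Cut Payload: drop the payload, keep the metadata (header fields)."""
    return replace(pkt, payload=b"", trimmed=True)


# Assumed scenario: queue 0 is congested, queue 1 is almost empty.
occupancy = [40, 2]

# Per-port marking also marks packets in the uncongested queue 1.
print(mark_per_port(occupancy, port_threshold=30))

# Per-queue marking only marks packets in the congested queue 0.
print(mark_per_queue(occupancy, thresholds=[30, 30]))

# A trimmed packet keeps its flow identity, so the sender can still learn
# which packet to retransmit without waiting for a timeout.
header_only = cut_payload(Packet(flow_id=7, payload=b"x" * 1000))
print(header_only.flow_id, len(header_only.payload), header_only.trimmed)
```

In this assumed scenario, the per-port rule marks both queues even though only the first is congested, which is the wrong marking that the abstract describes. The per-queue rule confines the mark to the congested queue.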