net_sched: move TCQ_F_THROTTLED flag
In commit 371121057607e (net: QDISC_STATE_RUNNING dont need atomic bit
ops) I moved QDISC_STATE_RUNNING flag to __state container, located in
the cache line containing qdisc lock and often dirtied fields.
I now move TCQ_F_THROTTLED bit too, so that we let first cache line read
mostly, and shared by all cpus. This should speedup HTB/CBQ for example.
Not using test_bit()/__clear_bit()/__test_and_set_bit allows to use an
"unsigned int" for __state container, reducing by 8 bytes Qdisc size.
Introduce helpers to hide implementation details.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Patrick McHardy <kaber@trash.net>
CC: Jesper Dangaard Brouer <hawk@diku.dk>
CC: Jarek Poplawski <jarkao2@gmail.com>
CC: Jamal Hadi Salim <hadi@cyberus.ca>
CC: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 3e86fd3..39db75c 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -865,7 +865,7 @@
/* try to dequeue direct packets as high prio (!) to minimize cpu work */
skb = __skb_dequeue(&q->direct_queue);
if (skb != NULL) {
- sch->flags &= ~TCQ_F_THROTTLED;
+ qdisc_unthrottled(sch);
sch->q.qlen--;
return skb;
}
@@ -901,7 +901,7 @@
skb = htb_dequeue_tree(q, prio, level);
if (likely(skb != NULL)) {
sch->q.qlen--;
- sch->flags &= ~TCQ_F_THROTTLED;
+ qdisc_unthrottled(sch);
goto fin;
}
}