[metrics] Add GC-work throughput metrics

Track work done (bytes processed) per second by the GC.
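
As a rough illustration of the metric (a sketch, not the code in this
change): bytes per microsecond is numerically equal to decimal MB/s,
which is one place a helper like NsToUs could be used. The function name
ThroughputMBPerSec and the choice of MB/s as the unit are assumptions
made for this example.

```cpp
#include <cstdint>

// Same conversion as the NsToUs helper added to time_utils.h in this change.
constexpr uint64_t NsToUs(uint64_t ns) {
  return ns / 1000;
}

// Hypothetical helper (name and unit are assumptions, not from the change):
// GC work throughput in decimal MB/s.
constexpr uint64_t ThroughputMBPerSec(uint64_t bytes_processed, uint64_t duration_ns) {
  const uint64_t duration_us = NsToUs(duration_ns);
  // Durations under one microsecond round down to zero; report 0 rather
  // than dividing by zero.
  if (duration_us == 0) {
    return 0;
  }
  // bytes / microsecond == 10^6 bytes / second == MB/s.
  return bytes_processed / duration_us;
}

// 2 MB processed in 1 ms corresponds to 2000 MB/s, i.e. 2 GB/s.
static_assert(ThroughputMBPerSec(2000000, 1000000) == 2000, "2 MB in 1 ms");
```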

Some other minor changes:
1) Reordered the ConcurrentCopying class members to make accesses to
them more cache-friendly. Counters accessed by the GC thread should
not share a cache line with counters meant for mutators if either
side modifies them.
2) Increased the max to 10'000 for throughput histograms in case
the throughput exceeds a GB/s.
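
The cache-line concern in item 1 can be sketched as follows. This change
reorders members; the sketch instead uses alignas to make the separation
explicit, which is a different technique with the same goal. The struct
and field names are hypothetical, and the 64-byte line size is an
assumption that does not hold on every target.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Assumed cache-line size; 64 bytes is common but not universal.
constexpr std::size_t kCacheLineSize = 64;

// Hypothetical stand-in for a collector's counters; these are not the
// actual ConcurrentCopying members.
struct CollectorCounters {
  // Group written by the GC thread.
  alignas(kCacheLineSize) std::atomic<uint64_t> gc_bytes_scanned{0};
  std::atomic<uint64_t> gc_objects_scanned{0};

  // Group written by mutator threads. alignas starts it on a fresh cache
  // line, so mutator writes do not invalidate the line holding the GC group.
  alignas(kCacheLineSize) std::atomic<uint64_t> mutator_bytes_moved{0};
  std::atomic<uint64_t> mutator_objects_moved{0};
};

// The two groups begin at least one full cache line apart.
static_assert(offsetof(CollectorCounters, mutator_bytes_moved) -
                      offsetof(CollectorCounters, gc_bytes_scanned) >=
                  kCacheLineSize,
              "GC and mutator counters must not share a cache line");
```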

Bug: 191404436
Test: manual
Change-Id: Iefaf1106690b6bae670a3a917f61194b3fcacfe0
diff --git a/libartbase/base/time_utils.h b/libartbase/base/time_utils.h
index fbf3e94..dd73b1c 100644
--- a/libartbase/base/time_utils.h
+++ b/libartbase/base/time_utils.h
@@ -77,6 +77,11 @@
   return ns / 1000 / 1000;
 }
 
+// Converts the given number of nanoseconds to microseconds.
+static constexpr uint64_t NsToUs(uint64_t ns) {
+  return ns / 1000;
+}
+
 // Converts the given number of milliseconds to nanoseconds
 static constexpr uint64_t MsToNs(uint64_t ms) {
   return ms * 1000 * 1000;