ext4: make ext4_block_in_group() much more efficient

Currently in when getting the block group number for a particular
block in ext4_block_in_group() we're using
ext4_get_group_no_and_offset() which uses do_div() to get the block
group and the remainer which is offset within the group.

We don't need all of that in ext4_block_in_group() as we only need to
figure out the group number.

This commit changes ext4_block_in_group() to calculate group number
directly. This shows as a big improvement with regards to cpu
utilization. Measuring fallocate -l 15T on fresh file system with perf
showed that 23% of cpu time was spend in the
ext4_get_group_no_and_offset(). With this change it completely
disappears from the list only bumping the occurrence of
ext4_init_block_bitmap() which is the biggest user of
ext4_block_in_group() by 4%. As the result of this change on my system
the fallocate call was approx. 10% faster.

However since there is '-g' option in mkfs which allow us setting
different groups size (mostly for developers) I've introduced new per
file system flag whether we have a standard block group size or
not. The flag is used to determine whether we can use the bit shift
optimization or not.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 73f3e60..cc58aa8 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -949,7 +949,7 @@
 #define EXT2_FLAGS_TEST_FILESYS		0x0004	/* to test development code */
 
 /*
- * Mount flags
+ * Mount flags set via mount options or defaults
  */
 #define EXT4_MOUNT_GRPID		0x00004	/* Create files with directory's group */
 #define EXT4_MOUNT_DEBUG		0x00008	/* Some debugging messages */
@@ -981,8 +981,16 @@
 #define EXT4_MOUNT_DISCARD		0x40000000 /* Issue DISCARD requests */
 #define EXT4_MOUNT_INIT_INODE_TABLE	0x80000000 /* Initialize uninitialized itables */
 
+/*
+ * Mount flags set either automatically (could not be set by mount option)
+ * based on per file system feature or property or in special cases such as
+ * distinguishing between explicit mount option definition and default.
+ */
 #define EXT4_MOUNT2_EXPLICIT_DELALLOC	0x00000001 /* User explicitly
 						      specified delalloc */
+#define EXT4_MOUNT2_STD_GROUP_SIZE	0x00000002 /* We have standard group
+						      size of blocksize * 8
+						      blocks */
 
 #define clear_opt(sb, opt)		EXT4_SB(sb)->s_mount_opt &= \
 						~EXT4_MOUNT_##opt