[PATCH] framebuffer: bit_putcs() optimization for 8x* fonts

This trivial patch gives a performance boost to the framebuffer console.

Constructing the bitmaps that are given to the bitblit functions of the
framebuffer drivers is time-consuming.  Here we avoid a call to the slow
fb_pad_aligned_buffer().  The patch replaces that call with a simple but
much more efficient bytewise copy.
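
As a rough user-space illustration of why the bytewise copy can stand in
for the generic routine when each glyph row is one byte wide, consider the
sketch below.  pad_copy_generic(), pad_copy_8px() and copies_match() are
hypothetical names made up for this example; pad_copy_generic() is only an
assumed model of a padded row copy, not the kernel's actual
fb_pad_aligned_buffer().

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a generic padded copy: copies `height` rows
 * of `s_pitch` bytes each; destination rows are `d_pitch` bytes apart. */
static void pad_copy_generic(unsigned char *dst, unsigned int d_pitch,
			     const unsigned char *src, unsigned int s_pitch,
			     unsigned int height)
{
	unsigned int i, j;

	for (i = 0; i < height; i++) {
		for (j = 0; j < s_pitch; j++)
			dst[j] = src[j];
		src += s_pitch;
		dst += d_pitch;
	}
}

/* Special case for one byte per glyph row (8-pixel-wide fonts), written
 * the same way as the loop added by the patch. */
static void pad_copy_8px(unsigned char *dst, unsigned int d_pitch,
			 const unsigned char *src, unsigned int height)
{
	unsigned int i;

	for (i = 0; i < height; i++)
		dst[d_pitch * i] = src[i];
}

/* Returns 1 if both routines produce identical output for an 8x16 glyph
 * written with a destination pitch of 1..8 bytes. */
static int copies_match(unsigned int d_pitch)
{
	unsigned char glyph[16], a[16 * 8], b[16 * 8];
	unsigned int i;

	assert(d_pitch >= 1 && d_pitch <= 8);
	for (i = 0; i < 16; i++)
		glyph[i] = (unsigned char)(i * 37 + 1);	/* arbitrary test pattern */

	memset(a, 0, sizeof(a));
	memset(b, 0, sizeof(b));
	pad_copy_generic(a, d_pitch, glyph, 1, 16);
	pad_copy_8px(b, d_pitch, glyph, 16);

	return memcmp(a, b, sizeof(a)) == 0;
}
```

With a source pitch of 1 the inner loop of the generic routine runs exactly
once per row, so both functions write the same bytes to the same offsets;
the special case merely drops the function call and the inner loop.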

The kernel spends a significant amount of time in this code if you use
8x* fonts.  Every pixel displayed on your screen is prepared here.

Some benchmark results:

Displaying a file of 2000 lines with 160 characters each takes 889 ms
system time using cyblafb on my system (I'm using a 1280x1024 video mode,
resulting in a 160x64 character console).

Displaying the same file with the enclosed patch applied to 2.6.13 takes
only 760 ms system time, saving 129 ms or 14.5%.

Font widths other than 8 are not affected.

The advantage and correctness of this patch should be obvious.

Signed-off-by: Knut Petersen <Knut_Petersen@t-online.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
diff --git a/drivers/video/console/bitblit.c b/drivers/video/console/bitblit.c
index 12eaf0a..6550875 100644
--- a/drivers/video/console/bitblit.c
+++ b/drivers/video/console/bitblit.c
@@ -114,7 +114,7 @@
 	unsigned int scan_align = info->pixmap.scan_align - 1;
 	unsigned int buf_align = info->pixmap.buf_align - 1;
 	unsigned int shift_low = 0, mod = vc->vc_font.width % 8;
-	unsigned int shift_high = 8, pitch, cnt, size, k;
+	unsigned int shift_high = 8, pitch, cnt, size, i, k;
 	unsigned int idx = vc->vc_font.width >> 3;
 	unsigned int attribute = get_attribute(info, scr_readw(s));
 	struct fb_image image;
@@ -175,7 +175,11 @@
 					src = buf;
 				}
 
-				fb_pad_aligned_buffer(dst, pitch, src, idx, image.height);
+				if (idx == 1)
+					for(i=0; i < image.height; i++)
+						dst[pitch*i] = src[i];
+				else
+					fb_pad_aligned_buffer(dst, pitch, src, idx, image.height);
 				dst += width;
 			}
 		}