ARM JPEG Benchmarks, Part 2
I thought it would be useful to re-run the tests with the C version of my JPEG code. The results suggest that memory bandwidth is the real limit on speed, and that the pixel colorspace conversion benefits the most from my optimized ARM assembly language. The OMAP also appears to gain more from the optimized ASM than the XScale does. Here are the numbers:
C code only:
PPC: thumbnail: 10.7 milliseconds, DC only: 968 milliseconds, full res: 3734 milliseconds.
SP: thumbnail: 25.1 milliseconds
Mixed C and ASM:
PPC: thumbnail: 8.8 milliseconds, DC only: 830 milliseconds, full res: 2700 milliseconds.
SP: thumbnail: 15.1 milliseconds
The times for the “DC only” and “full res” tests include the time taken to read 4.3 MB of data from RAM through the WinCE file system.
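To make the relative gains easier to compare, here is a small sketch that turns the timings above into speedup ratios. The function names (`speedup`, `percent_saved`, `print_speedups`) are my own and are only for illustration; the numbers are copied from the two result sets.

```c
#include <stdio.h>

/* Hypothetical helpers, not part of the JPEG codec itself. */

/* How many times faster the mixed C/ASM build is than the C-only build. */
double speedup(double c_ms, double mixed_ms)
{
    return c_ms / mixed_ms;
}

/* Percentage of the C-only run time removed by the mixed C/ASM build. */
double percent_saved(double c_ms, double mixed_ms)
{
    return 100.0 * (c_ms - mixed_ms) / c_ms;
}

/* Print one line per test using the timings quoted in this post. */
void print_speedups(void)
{
    static const struct {
        const char *name;
        double c_ms;      /* C code only */
        double mixed_ms;  /* mixed C and ASM */
    } tests[] = {
        { "PPC thumbnail", 10.7,   8.8    },
        { "PPC DC only",   968.0,  830.0  },
        { "PPC full res",  3734.0, 2700.0 },
        { "SP thumbnail",  25.1,   15.1   },
    };
    size_t i;

    for (i = 0; i < sizeof(tests) / sizeof(tests[0]); i++) {
        printf("%-14s %.2fx faster, %.1f%% less time\n",
               tests[i].name,
               speedup(tests[i].c_ms, tests[i].mixed_ms),
               percent_saved(tests[i].c_ms, tests[i].mixed_ms));
    }
}
```

The ratios work out to roughly 1.22x, 1.17x and 1.38x for the three PPC tests and about 1.66x for the SP thumbnail test, so every case stays well below a 2x gain.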
These results make sense: the real benefit of optimization comes from fixing the algorithms and reducing memory usage. The optimized ARM assembly code certainly helps speed things up, but it won’t offer an order-of-magnitude improvement over what the compiler generates.
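For readers who have not seen the colorspace conversion step, here is a minimal C sketch of the kind of inner loop involved. This is not the code from my codec. It assumes the standard JFIF YCbCr-to-RGB equations, 16.16 fixed-point constants, and an RGB565 output format; the names `ycbcr_to_rgb565` and `convert_row` are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert a 16.16 fixed-point channel value to an 8-bit value, clamped
 * to 0..255. Negative inputs are caught before the shift, so no signed
 * value is ever shifted right. */
static uint8_t clamp_fp(int32_t v)
{
    if (v < 0)
        return 0;
    v >>= 16;
    return (uint8_t)(v > 255 ? 255 : v);
}

/* One pixel, JFIF equations scaled by 65536:
 *   R = Y + 1.402    * (Cr - 128)
 *   G = Y - 0.344136 * (Cb - 128) - 0.714136 * (Cr - 128)
 *   B = Y + 1.772    * (Cb - 128)
 * RGB565 output is an assumption made for this sketch. */
uint16_t ycbcr_to_rgb565(uint8_t y, uint8_t cb, uint8_t cr)
{
    int32_t cbc = (int32_t)cb - 128;
    int32_t crc = (int32_t)cr - 128;
    int32_t yfp = ((int32_t)y << 16) + 32768;   /* + 0.5 for rounding */

    uint8_t r = clamp_fp(yfp + 91881 * crc);
    uint8_t g = clamp_fp(yfp - 22554 * cbc - 46802 * crc);
    uint8_t b = clamp_fp(yfp + 116130 * cbc);

    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

/* Convert one row of planar Y, Cb, Cr samples (no chroma subsampling,
 * to keep the sketch simple) into RGB565 pixels. */
void convert_row(const uint8_t *y, const uint8_t *cb, const uint8_t *cr,
                 uint16_t *out, size_t width)
{
    size_t i;
    for (i = 0; i < width; i++)
        out[i] = ycbcr_to_rgb565(y[i], cb[i], cr[i]);
}
```

Each pixel costs several multiplies, clamps, shifts and a 16-bit store, and the loop runs once for every pixel in the image. That makes it a natural place for hand-written assembly to help.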
1 Comment »
Great blog. Have you benchmarked your ARM code against the ARMv6 assembly IDCT in the Android repository or the libjpeg implementation on the iPhone?
I’m looking at optimizing the DCT routine for ARMv6 for some work we’re doing.