ARM Multiply Performance
Someone asked me to do some testing of the performance of the ARM multiply instruction. I hadn’t included it in my previous performance tests because it didn’t occur to me; I don’t use it in the inner loops of game emulators.
I decided to see if there was a difference in performance when working with different data values (e.g., multiplying by zero) and on the XScale vs. OMAP CPUs. The first test showed no difference in performance between zero and non-zero data. The second test showed that the XScale has a much faster implementation of multiply than the OMAP. On my 400 MHz PXA255 handheld, my tests showed that the unsigned multiply instruction (MUL) takes just 1 clock cycle, but on the OMAP 850 (used in many SmartPhones) it takes 2 clocks. I haven’t tested the 32×32 multiply because it’s in the ARM5 instruction set and the VS2005 C compiler generates ARM4 compatible code.
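For anyone who wants to try a similar comparison, here is a rough sketch in C of the general idea: time a long chain of dependent multiplies, once with a zero operand and once with a non-zero one, then convert the elapsed time to clocks per iteration. This is not the harness I used (my tests were written in assembly language); the function names `multiply_chain`, `time_multiply_chain` and `cycles_per_iteration` are made up for illustration, and the clock frequency is something you supply yourself.

```c
#include <time.h>

/* Hypothetical helper: runs a chain of dependent multiplies.
   The multiplier is passed through a volatile so the compiler cannot
   fold the multiply away at compile time. */
unsigned int multiply_chain(unsigned int multiplier, unsigned int iterations)
{
    volatile unsigned int opaque = multiplier;
    unsigned int m = opaque;   /* value is unknown to the optimizer */
    unsigned int acc = 1;
    unsigned int i;

    for (i = 0; i < iterations; i++) {
        /* Each result feeds the next multiply. On ARM the compiler
           may emit MLA (multiply-accumulate) rather than MUL here. */
        acc = acc * m + i;
    }
    return acc;
}

/* Hypothetical helper: returns the elapsed CPU time in seconds for one
   run of the chain. The time includes loop overhead, not just the multiply. */
double time_multiply_chain(unsigned int multiplier, unsigned int iterations)
{
    volatile unsigned int sink;   /* keeps the result (and the loop) alive */
    clock_t start;
    clock_t end;

    start = clock();
    sink = multiply_chain(multiplier, iterations);
    end = clock();
    (void)sink;

    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Converts elapsed seconds to clocks per loop iteration.
   clock_hz is an assumption supplied by the caller (e.g. 400e6 for a
   400 MHz part); iterations must be non-zero. */
double cycles_per_iteration(double seconds, double clock_hz, unsigned int iterations)
{
    return seconds * clock_hz / (double)iterations;
}
```

To compare data values, call `time_multiply_chain(0, n)` and `time_multiply_chain(12345, n)` with the same large `n` and see whether the times differ. Because the loop counter and branch are included in the measurement, you would need to subtract the time of an otherwise identical loop without the multiply to isolate the cost of the instruction itself.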
2 Comments
VS 2005 can generate code for ARM5; the switch is under C/C++ -> Advanced -> Compile for Architecture. Just select ARM5 or ARM5T. It would be great if you could compare the performance based on ARM5 code.
Bill,
I used assembly language to test the multiply instruction, not C. The ARM5 uses the same multiply as ARM2-4, so it would not make any difference in performance.
L.B.