Message
From: Damjan Lampret<lampret@o...>
Date: Thu Feb 19 12:54:18 CET 2004
Subject: [openrisc] Re: or1200 execution units
> > "mac" takes effectively one clock cycle if there is no data hazard.
> > Otherwise it takes 3.
> mac takes less time than mul? I thought mac uses the multiplier from
> mul. Isn't that why mac requires mul to be implemented?
Yes, mac uses the same hardware resources as mul. But because it has no GPR destination register, only MACLO/MACHI, it can effectively complete in 1 clock cycle (provided no l.macrc follows; if one does, it will take 3 clock cycles). Anyway, there is no definition for l.mac in gcc. I tried to add l.mac, but I never succeeded in implementing it in the .md file. I don't know why it didn't work. I tried copying mac definitions from other GCC ports, but it simply didn't emit the l.mac insn.
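To make the timing rule above concrete, here is a minimal C sketch that counts cycles for a straight-line instruction sequence. The enum tags and the function name `estimate_cycles` are hypothetical and not taken from the or1200 sources, and the 1-cycle cost for every instruction other than l.mac is an assumption made only to keep the model simple.

```c
#include <stddef.h>

/* Hypothetical instruction tags for this sketch only. */
enum insn_kind { INSN_MAC, INSN_MACRC, INSN_OTHER };

/*
 * Estimate cycles using the rule described above:
 *   - l.mac costs 1 cycle, or 3 when the next instruction is l.macrc;
 *   - every other instruction (including l.macrc itself) is ASSUMED
 *     to cost 1 cycle. This is a simplification, not a pipeline model.
 */
unsigned estimate_cycles(const enum insn_kind *seq, size_t n)
{
    unsigned cycles = 0;

    for (size_t i = 0; i < n; i++) {
        if (seq[i] == INSN_MAC && i + 1 < n && seq[i + 1] == INSN_MACRC)
            cycles += 3;   /* result must reach MACLO/MACHI before l.macrc reads it */
        else
            cycles += 1;
    }
    return cycles;
}
```

Under these assumptions, a run of back-to-back l.mac instructions pays the 3-cycle cost only on the last one before l.macrc, so longer accumulation loops would amortize it.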
> Optimization is done by gcc. So I believe that getting smaller code is a
> matter of describing the machine more precisely. I noticed that some
> instructions are not used by gcc yet, maybe there is some potential. I
> will examine the target definitions as I find some time for it.
I can think of a few optimizations; they would be speed optimizations as well:
- right now l.sfXX is always followed by a cond branch instruction (or a cond branch is always preceded by an l.sfXX insn). Splitting this pair into two separate insns would be good for speed and maybe also for size (?)
- right now l.sfXXi are not implemented (I tried to implement them and you can see some attempts in or32.c, but it didn't work properly all the time: it emitted code but sometimes gcc crashed)
- some optional insns like l.addc are not implemented
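As an illustration of where an add-with-carry insn such as l.addc would matter, here is a minimal C sketch of a 64-bit addition built from 32-bit halves. The struct `u64_pair` and the function `add64` are hypothetical names for this example; how gcc would actually lower this on or32 is not shown here and would depend on the target definitions.

```c
#include <stdint.h>

/* Hypothetical 64-bit value held as two 32-bit words. */
struct u64_pair { uint32_t lo, hi; };

/*
 * Add two 64-bit values word by word. Without an add-with-carry
 * instruction, the carry out of the low word has to be recomputed
 * with a compare; with one, the high-word add could consume the
 * carry flag directly.
 */
struct u64_pair add64(struct u64_pair a, struct u64_pair b)
{
    struct u64_pair r;
    uint32_t carry;

    r.lo = a.lo + b.lo;          /* wraps modulo 2^32 */
    carry = (r.lo < a.lo);       /* 1 if the low-word add overflowed */
    r.hi = a.hi + b.hi + carry;
    return r;
}
```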
regards, Damjan
> > Heiko
> _______________________________________________
> http://www.opencores.org/mailman/listinfo/openrisc