Message
From: caodayong at uestc.edu.cn<caodayong@u...>
Date: Fri Sep 7 14:02:00 CEST 2007
Subject: [oc] Why open processors are so much slower than commercial ones?
----- Original Message -----
From: goran.bilski@x...<goran.bilski@x...>
To:
Date: Tue Aug 10 16:06:43 CEST 2004
Subject: [oc] Why open processors are so much slower than commercial ones?
> Hi,
>
> Interesting thread.
>
> As the designer of MicroBlaze, I can provide a few more details.
>
> To get the best performance, you have to optimize every part of the
> system.
>
> 1. The instruction set
>    The instruction set has to be optimized for an FPGA design.
>    Example: for the logical instructions, I use one LUT for each bit
>    of the result. With a 4-input LUT, I need 2 inputs for the two
>    operands, which leaves 2 inputs to select the type of logical
>    instruction. With 2 inputs, I can select between 4 different
>    logical instructions.
>    MicroBlaze has exactly 4 logical instructions, no more, no less.
>    Implementing just 3 instructions would not save any area at all;
>    implementing 5 would cost twice the area.
>    The actual opcode values are chosen to minimize the control logic.
>    Example: I have a result mux which selects the source of the new
>    value written to the register file. Bits 0-1 are the actual
>    selector for that mux, which means 0 LUTs for that control logic.
>    Most of the opcode layout has been done for this purpose.
>    Bits 4-5 determine the operation of the ALU block, etc.
>
> 2. The actual datapath implementation has to match the FPGA.
>    For an ASIC design, the area cost of an ALU is very different from
>    that of a mux, but on an FPGA the area is actually the same.
>    A processor design has a lot of muxes, and they need to be
>    minimized since they cost both area and performance. Extreme
>    pipelines will run very slowly on an FPGA because of all the muxes
>    needed to resolve pipeline hazards.
>
> 3. The exact HDL coding of the processor has to be optimized.
>    I have used a lot of FPGA design tricks to optimize performance
>    and area. (The Xilinx carry chain is extremely powerful and
>    normally underused.)
>
> The source code for MicroBlaze exists, but not as open source.
> It can be purchased from Xilinx.
> An FPGA-specific version exists, and also a pure RTL version.
> The pure RTL version can be targeted at an ASIC, and when it is
> implemented on an FPGA, the synthesis tools and PAR are quite good at
> optimizing the design. The results are within 10% of the
> FPGA-specific version in area and performance.
>
> I have only hand-coded and hand-placed logic that is within the
> critical paths; very little has been needed for area, since I tend to
> write RTL code that is well suited to an FPGA.
>
> The best design tools are still pen and paper.
> Every design trick starts with a datasheet of the CLB and a blank
> piece of paper.
>
> Göran Bilski
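
The LUT argument in point 1 can be sketched in a few lines of Python. This is only an illustrative model, not MicroBlaze source: the choice of OR, AND, XOR and ANDN as the four operations, the 2-bit select encoding, and the names `build_lut4_init`, `lut4` and `logic_unit` are all assumptions made for the example. It shows one 16-entry truth table (one 4-input LUT) being reused for every result bit, with two inputs taken by the operand bits and two by the operation select.

```python
# Illustrative model of "one 4-input LUT per result bit".
# ASSUMPTION: the four operations and the 2-bit select encoding below are
# hypothetical; the post does not give the real opcode bits.
OPS = {
    0b00: lambda a, b: a | b,          # OR
    0b01: lambda a, b: a & b,          # AND
    0b10: lambda a, b: a ^ b,          # XOR
    0b11: lambda a, b: a & (b ^ 1),    # ANDN: a AND (NOT b), on single bits
}


def build_lut4_init():
    """Build the 16-bit truth table of one LUT.

    The LUT address is (sel << 2) | (b << 1) | a, so operand bits use
    two inputs and the operation select uses the other two.
    """
    init = 0
    for sel in range(4):
        for b in (0, 1):
            for a in (0, 1):
                index = (sel << 2) | (b << 1) | a
                init |= OPS[sel](a, b) << index
    return init


def lut4(init, a, b, sel):
    """Look up one output bit in the 16-entry truth table."""
    index = (sel << 2) | (b << 1) | a
    return (init >> index) & 1


def logic_unit(x, y, sel, width=32):
    """Apply the same LUT to every bit position of x and y."""
    init = build_lut4_init()
    result = 0
    for i in range(width):
        bit = lut4(init, (x >> i) & 1, (y >> i) & 1, sel)
        result |= bit << i
    return result


if __name__ == "__main__":
    for sel in range(4):
        print(f"sel={sel:02b} -> {logic_unit(0xF0F0, 0xFF00, sel):#06x}")
```

With four operations the select needs exactly two bits, so all four LUT inputs are used and a 32-bit logic unit costs 32 LUTs in this model. A fifth operation would need a third select bit, which no longer fits in a single 4-input LUT per bit; that is the "twice the area" step described in the message.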