Message
From: György 'nog' Jeney <nog@s...>
Date: Sun May 8 16:52:25 CEST 2005
Subject: [openrisc] [or1ksim #85] [RFC] Overhoul memory code
Hi,

This is quite a major rewrite of the memory code that makes the implementation more linear, i.e. it has fewer conditionals, which also leads to faster execution:
With the patch:
real 0m50.808s user 0m39.720s sys 0m11.090s
Without the patch:
real 1m0.258s user 0m48.570s sys 0m11.690s
I have changed the callback interface to provide a relative address, that is, an offset from the start of the registered area. This is done to avoid having to pass a struct mem pointer to the simmem_{read,write}* functions.
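A minimal sketch of what a relative-address callback can look like. All names here (struct area, bus_read32, ram_read32) are hypothetical and chosen for illustration; they are not taken from the patch or from or1ksim.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not the or1ksim API. */
typedef uint32_t (*read32_fn)(uint32_t rel_addr, void *dat);

struct area {
    uint32_t base;     /* first bus address covered by the area */
    uint32_t size;     /* size of the area in bytes */
    read32_fn read32;  /* callback, receives an area-relative address */
    void *dat;         /* opaque per-device state */
};

/* Backing store for a tiny RAM device. */
struct ram {
    uint32_t words[4];
};

/* The device indexes its own storage directly with the relative address. */
static uint32_t ram_read32(uint32_t rel_addr, void *dat)
{
    struct ram *r = dat;
    assert(rel_addr / 4 < sizeof r->words / sizeof r->words[0]);
    return r->words[rel_addr / 4];
}

/* The dispatcher subtracts the base once, so the callback never
 * needs a pointer to the area it was registered under. */
static uint32_t bus_read32(const struct area *a, uint32_t addr)
{
    assert(addr >= a->base && addr - a->base < a->size);
    return a->read32(addr - a->base, a->dat);
}
```

With an area based at 0x1000, a call such as bus_read32(&a, 0x1008) reaches the device as offset 8, so the device code needs no knowledge of where it is mapped.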
The memory controller was a real hack. It thought that it owned every single peripheral, and it was impossible to have more than one memory controller. With this patch, each peripheral that wants to be under the control of a memory controller will have to register with the appropriate one. I have also decoupled it from the innards of abstract.c, i.e. it doesn't mess with the dev_memarea list. In two places it still digs into the dev_memarea structures that are registered with it, which still needs to be fixed, but it's cleaner than before.
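The registration idea can be sketched roughly as follows. This assumes a simple linked list per controller; struct mc, struct mc_area, mc_reg_area and mc_find are hypothetical names, not the interface in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; this is not the or1ksim interface. */
struct mc_area {
    uint32_t base;          /* first bus address of the peripheral */
    uint32_t size;          /* number of bytes covered */
    struct mc_area *next;   /* next area registered with the same controller */
};

struct mc {
    struct mc_area *areas;  /* only the areas that registered with this controller */
};

/* A peripheral opts in by registering itself with one specific controller. */
static void mc_reg_area(struct mc *mc, struct mc_area *a)
{
    assert(mc != NULL && a != NULL);
    a->next = mc->areas;
    mc->areas = a;
}

/* Returns the area that contains addr, or NULL if this controller
 * does not own that address. */
static struct mc_area *mc_find(const struct mc *mc, uint32_t addr)
{
    for (struct mc_area *a = mc->areas; a != NULL; a = a->next)
        if (addr >= a->base && addr - a->base < a->size)
            return a;
    return NULL;
}
```

Because each controller keeps its own list, two controllers can coexist and neither has to walk a global list of device areas.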
With this patch it's simple to fix the debug unit to provide the weird behaviour of being able to load a program to flash (even when it's unwriteable). I've kept this out of this already very big patch. I would argue that this is wrong behaviour but people requested it...
There are some things I'm not really clear on, so I would really like to hear comments/opinions on them:
Granularity
-----------
I have changed the register_memory_area function (renamed to reg_mem_area) so that you can specify 32, 16 and 8 bit read/write functions, but only the memory peripheral uses them now (which makes it much faster). With this, only one of the read/write delays gets added to the cycle counter, unlike previously, where an 8-bit write to a 32-bit memory area added the expense of a write plus a read to the cycle counter. I don't know what the memory controller would do if, say, the data width of the memory sitting behind it was not that of the access. The question: What does the hardware do in this case?
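To illustrate the cycle-count difference, here is a small sketch that contrasts an 8-bit write emulated through 32-bit accessors with a dedicated 8-bit handler. The structure, the function names and the delay values are assumptions made for the example; they are not the code from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch; names and delay values are assumptions. */
struct mem_area {
    uint8_t bytes[16];
    int delay_r;   /* cycles charged per read */
    int delay_w;   /* cycles charged per write */
};

static long cycles;  /* stand-in for the simulator's cycle counter */

static uint32_t read32(struct mem_area *m, uint32_t rel)
{
    uint32_t v;
    assert(rel % 4 == 0 && rel + 4 <= sizeof m->bytes);
    memcpy(&v, &m->bytes[rel], 4);
    cycles += m->delay_r;
    return v;
}

static void write32(struct mem_area *m, uint32_t rel, uint32_t v)
{
    assert(rel % 4 == 0 && rel + 4 <= sizeof m->bytes);
    memcpy(&m->bytes[rel], &v, 4);
    cycles += m->delay_w;
}

/* Emulated 8-bit write: read the containing word, patch one byte,
 * write the word back. Charges a read delay plus a write delay. */
static void write8_via32(struct mem_area *m, uint32_t rel, uint8_t v)
{
    uint32_t word = read32(m, rel & ~3u);
    memcpy((uint8_t *)&word + (rel & 3u), &v, 1);
    write32(m, rel & ~3u, word);
}

/* Dedicated 8-bit handler: touches one byte, charges one write delay. */
static void write8(struct mem_area *m, uint32_t rel, uint8_t v)
{
    assert(rel < sizeof m->bytes);
    m->bytes[rel] = v;
    cycles += m->delay_w;
}
```

With delay_r = 2 and delay_w = 3, the emulated path charges 5 cycles for a byte write while the dedicated handler charges 3.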
Also, how do peripherals not under the control of a memory controller (uart, ethernet, etc.) react to writing less data to a register than the width of the register? As an example, what happens when you do a 16-bit write to a 32-bit wide register in the ethernet?
What happens when you do, say, a 16-bit read of a 32-bit wide register when the address of the read is not aligned on a 32-bit boundary?
The code in this patch mimics that of the previous implementation, so everything should still work the same as it did before, though I'm not really sure that this is always right.
Endianness
----------
This is a stupid topic that drives me insane. or1ksim, in effect, emulates a little-endian core (and a little-endian bus, etc.) on a little-endian machine and a big-endian core on a big-endian machine (though I'm not even convinced that the sim will work on a big-endian machine). This causes some real headaches when implementing a peripheral that is registered with more than one granularity. I'm also stumped as to how I should implement proper support for IDE, in which all data transfers are defined to be little-endian. When the ata peripheral is compiled on a little-endian machine, all data returned from it is (naturally) little-endian. But Linux compiled with the IDE driver byteswaps all data received from the ata, because OpenRISC is big-endian. Since we have a little-endian simulator, the sim then tries to do arithmetic with big-endian numbers, which fails in some interesting ways.
Linux and its userspace stuff just happen to work because, when the Linux ELF binary is loaded, everything is byteswapped into little-endian. But what would happen if you fed some code to Linux running inside the sim from an external source, say, an NFS share mounted from a remote machine? I'm pretty sure that it would fail to execute since it would not be byteswapped. I would test it myself, but I don't have a LAN and I couldn't figure out how to get the ethernet peripheral to work without a proper ethernet running.
The only way that I can think of to solve this is to byteswap all data moving between the registers and the memory, but then the peripherals will have to know that the data that they get (or must return) is (or must be) in big-endian order. This means that data will potentially have to be swapped twice, which is not a good thing for execution speed. It will also involve byte-swapping all the instruction reads. It may be possible to avoid doing this with clever tricks in or32.c, though I haven't looked at it. It may also be necessary to keep the data in the registers in big-endian order; again, I don't know. Any comments/ideas/help/corrections in this regard will be greatly appreciated.
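As a rough sketch of the approach described above, the accessors below keep simulated memory in big-endian byte order regardless of the host, while register values stay in host order. They assemble values with shifts instead of an explicit conditional byteswap, which is a substitution made here for portability. guest_mem, store32_be, load32_be and load16_be are hypothetical names, not or1ksim functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch; guest_mem stands in for simulated memory. */
static uint8_t guest_mem[16];

/* Store a host-order register value as four big-endian bytes. */
static void store32_be(uint32_t addr, uint32_t v)
{
    assert(addr + 4 <= sizeof guest_mem);
    guest_mem[addr]     = (uint8_t)(v >> 24);
    guest_mem[addr + 1] = (uint8_t)(v >> 16);
    guest_mem[addr + 2] = (uint8_t)(v >> 8);
    guest_mem[addr + 3] = (uint8_t)v;
}

/* Load four big-endian bytes into a host-order register value. */
static uint32_t load32_be(uint32_t addr)
{
    assert(addr + 4 <= sizeof guest_mem);
    return ((uint32_t)guest_mem[addr] << 24) |
           ((uint32_t)guest_mem[addr + 1] << 16) |
           ((uint32_t)guest_mem[addr + 2] << 8) |
            (uint32_t)guest_mem[addr + 3];
}

/* A 16-bit load of the same bytes sees the same big-endian layout. */
static uint16_t load16_be(uint32_t addr)
{
    assert(addr + 2 <= sizeof guest_mem);
    return (uint16_t)(((uint16_t)guest_mem[addr] << 8) | guest_mem[addr + 1]);
}
```

The cost mentioned above is visible here: every load and store does per-byte work, and a peripheral that keeps its own data in host order would have to convert again on its side.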
ChangeLog:
* Separate out the code used for handling the memory peripheral to peripheral/memory.c
* Mostly decouple the memory controller from the internals of the memory handling.
* Rewrite memory handling to be more linear and thus much faster.
nog.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: patch85.gz
Type: application/x-gunzip
Size: 22314 bytes
Desc: not available
Url : openrisc/attachments/20050508/86b236d2patch85-0001.bin