Message
From: György 'nog' Jeney <nog@s...>
Date: Thu May 12 17:17:35 CEST 2005
Subject: [openrisc] [or1ksim #85] [RFC] Overhoul memory code
> if you do a l.sb (store byte), from an architectural point of view only the
> write access delay should be incurred (though in principle l.sb could have a
> different delay than l.sw). what actually happens depends on the memory
> controller you are using.
>
> the most common thing with 'normal' memories is that 8-bit, 16-bit and
> 32-bit writes all take the same amount of time (the same goes for reads).
> this is assuming you have 32-bit wide access to the memory. if not, then the
> delays will be longer, depending on the mc width...

The previous implementation added both the read and the write delay if the access wasn't a 32-bit access, but I've changed it to add the read or write delay only once. Which behaviour do we want?

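To make the two options concrete, here is a minimal sketch of both ways of charging a store. The struct and function names are made up for illustration and are not the ones used in or1ksim.

```c
/* Hypothetical per-region timing; names are illustrative, not or1ksim's. */
struct mem_timing {
    unsigned read_delay;   /* cycles charged for one read  */
    unsigned write_delay;  /* cycles charged for one write */
};

/* Previous behaviour as described above: a store narrower than 32 bits
 * is charged the read delay plus the write delay. */
static unsigned store_delay_rmw(const struct mem_timing *t, unsigned width_bits)
{
    if (width_bits < 32)
        return t->read_delay + t->write_delay;
    return t->write_delay;
}

/* Changed behaviour: one write delay, whatever the access width. */
static unsigned store_delay_single(const struct mem_timing *t, unsigned width_bits)
{
    (void)width_bits;  /* width no longer affects the charged delay */
    return t->write_delay;
}
```

With a read delay of 2 and a write delay of 3, an l.sb costs 5 cycles under the first scheme and 3 under the second.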
> > Also, how do peripherals not under the control of a memory controller (uart,
> > ethernet, etc) react to writing less data to a register than the width of the
> > register? As an example, what happens when you do a 16-bit write to a 32-bit
> > wide register in the ethernet?
>
> this is core dependent. some devices might not even allow less than 32-bit
> accesses (or some other accesses). by not allowing i mean sending back a bus
> error, which raises the 0x200 exception.
[snip]
> device dependent, the developer could have decided to implement 16-bit
> access or not (so you could get the requested data back or a bus error). if
> the 16-bit access is supported (or 8-bit) the most natural assumption is
> that it takes 1 'write/read delay' for any of 8, 16 or 32 bit accesses.

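The behaviour described in the quoted reply (a device rejecting an unsupported access width with a bus error) could be sketched roughly as follows. The region struct, the GRAN_* flags and check_access() are hypothetical names for illustration only; the 0x200 vector is the one mentioned in the quote.

```c
#include <stdint.h>

/* Hypothetical flags for the access widths a device has registered. */
#define GRAN_8   0x1u
#define GRAN_16  0x2u
#define GRAN_32  0x4u

#define EXCEPT_BUSERR 0x200u  /* bus error vector, as given in the quoted reply */

/* Hypothetical description of one device's address window. */
struct dev_region {
    uint32_t base;           /* first address of the window */
    uint32_t size;           /* window size in bytes */
    unsigned granularities;  /* OR of the GRAN_* widths the device accepts */
};

/* Returns 0 if the access is allowed, otherwise the exception vector. */
static unsigned check_access(const struct dev_region *r, uint32_t addr,
                             unsigned width_bits)
{
    unsigned need;

    switch (width_bits) {
    case 8:  need = GRAN_8;  break;
    case 16: need = GRAN_16; break;
    case 32: need = GRAN_32; break;
    default: return EXCEPT_BUSERR;
    }
    if (addr < r->base || addr - r->base >= r->size)
        return EXCEPT_BUSERR;  /* outside the device's window */
    return (r->granularities & need) ? 0u : EXCEPT_BUSERR;
}
```

A device that registered only GRAN_32 would then answer a 16-bit write with the bus error instead of having the access emulated.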
Then the granularity `emulation' functions can be removed? And if an access happens to an area that did not register the appropriate granularity, raise a bus error?

> Am not a big fan of bit/byte swapping either... The way i see it is that
> internally to the simulator we can have the data in any bit/byte order we
> choose (maybe depending on the host), the only place we *can* have
> transformations is on the borders (ie input/output). for io we can have a
> bunch of read/write functions that while reading/writing the data can
> (depending on the host,...) do proper transformations.
>
> This is not entirely true though. The problems may also (as you pointed out)
> be at peripheral borders if peripherals want/have data in some special
> order. in this case some more swapping might be necessary.... the way i see
> this could be achieved is:
>
> - openrisc functions in big-endian format (architecturally). the
> representation of this in the simulator running on a real machine may or may
> not be the same. all we care about is that the end result is identical (we
> get the 'results' on outputs (which in some way depend on inputs, so it
> should only be necessary to change these two)). we also care about
> performance, so we'd like to achieve the correct result using as few
> transformations as possible

This could lead to some pretty ugly hacks...

> - the added complication of peripherals that architecturally define the
> order of data on their input and output to be different than that of an
> openrisc should be possible to avoid by some transformations that depend
> on the architectural order and its representation in the host machine. i'm
> not saying that this is trivial though...

I see. I think I have a sort of idea on what needs to be done then. I'll try and express it below.
For the ensuing discussion I'm going to use the following term:

Relative byte order: Byte ordering relative to either the openrisc architecture or the host running the sim. More simply, `opposite byte order' will mean big-endian if the byte order is relative to little-endian, and vice versa. And `identical byte order' will mean big-endian if the byte order is relative to big-endian byte order (and likewise for little-endian).

Now with the above definition it should be enough to define that the sim works with byte ordering relative to the host system. In other words, if the sim runs on a little-endian machine, all arithmetic done by the (simulated) cpu will be in little-endian and data passed to and from the peripherals will be in little-endian order (and we define clever macros to make it easy to deal with this). E.g. when data is returned from the ata peripheral it will be in big-endian order.

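As a rough idea of what such helpers might look like, here is a small sketch (written as functions rather than macros to keep it readable). All names are invented for illustration; none of this is existing or1ksim code.

```c
#include <stdint.h>

/* Returns 1 when the host stores the low-order byte first. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* Reverse the four bytes of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Convert between host order and big-endian (the openrisc architectural
 * order). The same function works in both directions. */
static uint32_t host_to_be32(uint32_t v)
{
    return host_is_little_endian() ? swap32(v) : v;
}

/* Read a big-endian word out of a byte buffer into a host-order value.
 * This form needs no knowledge of the host's byte order at all. */
static uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```

The cpu core would only ever see host-order values; a conversion like this would be applied where data crosses into something with a fixed external byte order.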
Then we define that each peripheral implements its own byte ordering. Meaning that it is peripheral dependent whether we return the big end or the little end of the word when doing an 8-bit read from address & 3 = 0.

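A small sketch of what "peripheral dependent" could mean for an 8-bit read of a 32-bit register, assuming the register value is held in host order inside the peripheral. The enum and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical per-peripheral choice of which end sits at offset 0. */
enum lane_order {
    LANE_BIG_END_FIRST,    /* offset 0 returns bits 31..24 */
    LANE_LITTLE_END_FIRST  /* offset 0 returns bits 7..0   */
};

/* 8-bit read from a 32-bit register whose value is held in host order.
 * Only the two low address bits select the byte. */
static uint8_t periph_read8(uint32_t reg, uint32_t addr, enum lane_order order)
{
    unsigned off = addr & 3u;
    unsigned shift = (order == LANE_BIG_END_FIRST) ? (3u - off) * 8u
                                                   : off * 8u;
    return (uint8_t)(reg >> shift);
}
```

For a register holding 0x11223344, a read at address & 3 = 0 gives 0x11 from a big-end-first peripheral and 0x44 from a little-end-first one, independent of the host's own byte order.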
With these two in place it appears that we can shove most of this under the carpet and let the peripherals sort out byte ordering either on their sim interface or however they communicate with the outside world.

Does this sound like something intelligent? (or am I just thinking about this too hard?)
nog.