The term "endian" comes to us from Jonathan Swift by way of Gulliver's Travels. It seems that the
Lilliputians were a divisive sort, having (rather brutally) divided themselves into two groups: those who ate their boiled eggs starting at the small, pointy end (the "little end"), and those who started at the much rounder end (the "big end"). These two groups, the "little endians" and the "big endians," simply could not bring themselves to see past this superficial difference and focus on their commonalities instead. As Wikipedia and Swift put it: "The differences between Big-Endians [...] and Little-Endians had given rise to 'six rebellions... wherein one Emperor lost his life, and another his crown.'"
But that's Gulliver's Travels. In a modern context,
endianness refers to something nearly as trivial as which end of an egg to start eating from. Specifically: Given a large quantity that needs to be broken into smaller pieces, do you write down the least significant piece first, or the most significant piece first?
The difference is that in 2009, this is far from a contentious topic. Everyone is largely content to do things their own way, and to let others do the same. It's not hard to convert between one format and the other, and especially flexible hardware (such as the DSP CPU I work on) can go either way. In the end, it's merely rather confusing. And it apparently is confusing, given how much time I spend explaining it at work.
So what is endianness in a computer context, and why is it confusing?
Consider the number "12345".* This is a 5-digit number. In that number, the digit '1' is the "most significant" digit, meaning that it's the digit in the position with the most value (the ten-thousands place in this case). The digit '5' is the "least significant," because it's in the position with the least value, the "ones place."
Suppose we had to break this number down into individual digits, and then speak those digits aloud so that someone else could hear them and write them down. In big endian convention, you'd start with the highest-valued digit, the '1', and work toward the lowest valued. You'd say '1', '2', '3', '4', '5'.
In little endian convention, you'd start at the least significant digit, the '5', and work toward the highest valued: '5', '4', '3', '2', '1'.
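If it helps to see that spelled out in code, here's a small C sketch (the function names are just mine, for illustration) that "speaks" the digits of a number in each convention:

```c
#include <stdio.h>

/* Speak the digits of n most-significant-first (big endian order).
 * Recursing first and printing on the way back out yields the
 * highest-valued digit before the lower ones. */
static void speak_big_endian(unsigned n)
{
    if (n >= 10)
        speak_big_endian(n / 10);
    printf("'%u' ", n % 10);
}

/* Speak the digits least-significant-first (little endian order). */
static void speak_little_endian(unsigned n)
{
    do {
        printf("'%u' ", n % 10);
        n /= 10;
    } while (n > 0);
}

int main(void)
{
    speak_big_endian(12345);    /* prints: '1' '2' '3' '4' '5' */
    printf("\n");
    speak_little_endian(12345); /* prints: '5' '4' '3' '2' '1' */
    printf("\n");
    return 0;
}
```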
Engineers and computer architects have their reasons for preferring one over the other. Some argue big endian is more intuitive for programmers, and others argue that little endian is more naturally suited to arithmetic. Those are just a couple of the arguments one might encounter. In the end, they don't generally matter all that much once you pick a convention and stick with it.
One fairly common convention, though, is to treat big endian as going "left to right", and little endian as going "right to left." You can see this in the example above: The big endian convention read 12345 starting at the leftmost digit, and the little endian convention read 12345 starting at the rightmost digit.
Where computers are involved, big vs. little endian most often refers to the order in which the bytes of a larger quantity get placed into memory; i.e., does the "big" end or the "little" end go at the lowest address? So, when it comes to putting byte addresses on the individual bytes of, say, a 32-bit word, the big endian guys write "0" on the leftmost byte and the little endian guys write "0" on the rightmost byte. The following diagrams illustrate:
Big Endian Byte Numbering

    +--------+--------+--------+--------+
    | byte 0 | byte 1 | byte 2 | byte 3 |   32-bit word
    +--------+--------+--------+--------+

Little Endian Byte Numbering

    +--------+--------+--------+--------+
    | byte 3 | byte 2 | byte 1 | byte 0 |   32-bit word
    +--------+--------+--------+--------+
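You can watch your own machine pick a side, too. Here's a minimal C sketch (not from any particular codebase, just an illustration) that stores a 32-bit word and then checks which byte landed at the lowest address:

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(void)
{
    uint32_t word = 0x11223344;
    uint8_t bytes[4];

    /* Copy the word into a byte array so we can inspect the byte
     * that lives at the lowest address. */
    memcpy(bytes, &word, sizeof(word));

    /* Big endian puts the "big end" (0x11) at byte 0; little
     * endian puts the "little end" (0x44) there. */
    printf("byte 0 = 0x%02x -> %s endian\n", bytes[0],
           bytes[0] == 0x11 ? "big" : "little");
    return 0;
}
```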
Exciting, eh?
Well, here's where it gets fun: As I mentioned briefly above, the DSP CPU I work with can handle either endian representation. There is a configuration input to the chip that switches it between little and big endian modes, and the CPU and all the peripherals on the die switch to the selected endianness.
Nifty, eh? It is, to a point. The folks who have to design these peripherals and connect them all together get confused rather quickly, though.
The first thing that confuses people is that endianness isn't just about byte ordering within a word. It's far more general than that. Endianness governs the following two operations:
- Given objects of size X and containers of some larger size Y: Endianness says at which end of the container I should place the first object when packing multiple objects into the container.
- Given containers of size Y holding objects of a smaller size X: Endianness says at which end of the container I can expect to find the first object when I unpack them.
Note the abstract terms. In most textbooks, the "objects of size X" are bytes, and the "containers" are larger quantities such as "words," "double-words," "pointers," "floating point numbers," and so on.
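To make those two rules concrete, here's a C sketch of the textbook case, with bytes as the objects and a 32-bit word as the container (the function names are mine, not anything standard):

```c
#include <stdint.h>
#include <stddef.h>

/* Pack 4 byte-sized objects into a 32-bit container.  Endianness
 * decides which end of the container receives objs[0]. */
static uint32_t pack_word(const uint8_t objs[4], int big_endian)
{
    uint32_t word = 0;
    for (size_t i = 0; i < 4; i++) {
        /* Big endian: the first object goes at the most significant
         * end.  Little endian: at the least significant end. */
        unsigned shift = big_endian ? (unsigned)(24 - 8 * i)
                                    : (unsigned)(8 * i);
        word |= (uint32_t)objs[i] << shift;
    }
    return word;
}

/* Unpack: the same rule tells us where to find the first object. */
static void unpack_word(uint32_t word, uint8_t objs[4], int big_endian)
{
    for (size_t i = 0; i < 4; i++) {
        unsigned shift = big_endian ? (unsigned)(24 - 8 * i)
                                    : (unsigned)(8 * i);
        objs[i] = (uint8_t)(word >> shift);
    }
}
```

Packing the bytes 0x11, 0x22, 0x33, 0x44 this way yields the word 0x11223344 in big endian and 0x44332211 in little endian.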
That works fine if your computer always communicates with everything in terms of byte streams. In our case, our Systems-on-a-Chip communicate internally over a range of buses that can, in theory, vary in width from 8 bits to 256 bits. Endianness plays a large role in how these buses coordinate.
It turns out that the rule is pretty simple: For all buses wider than 1 byte, the byte numbering for the "byte lanes" on the bus goes left-to-right or right-to-left depending on whether the chip is configured for big or little endian. Therefore byte 0 is the leftmost byte when operating as big endian, and the rightmost byte when operating as little endian. So far so good. This is just a straightforward generalization of the "bytes-in-a-word" diagrams I gave above.
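Expressed as code, that rule might look something like this sketch (my own formulation, not our actual bus logic), mapping a byte address to a physical lane on a bus whose lanes are labeled 0 through W-1 from left to right:

```c
#include <stddef.h>

/* Map a byte address to a physical byte lane on a bus that is
 * 'width' bytes wide.  Physical lanes are labeled 0..width-1
 * going left to right.
 *
 * Big endian: byte 0 is the leftmost lane, so the numbering runs
 * left to right.  Little endian: byte 0 is the rightmost lane,
 * so the numbering runs right to left. */
static size_t byte_lane(size_t addr, size_t width, int big_endian)
{
    size_t offset = addr % width;   /* position within one bus beat */
    return big_endian ? offset : (width - 1 - offset);
}
```

On a 64-bit (8-byte) bus, for example, byte address 0 lands on the leftmost lane (lane 0) in big endian and the rightmost lane (lane 7) in little endian.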
Now for the fly in the ointment that sends folks' heads spinning.
The peripherals in our SoCs provide configuration registers in the memory map. These memory mapped registers allow the CPU to control the peripheral with load and store instructions. Typically, these registers are 32 bits wide on our 32-bit machines.
To make things easier on our customers and our internal developers, we generally specify that the 32-bit register images for these peripherals look the same to a "load word" or "store word" instruction, regardless of the device's endian setting. This is where things get confusing.
If the CPU could only read and write 32-bit words, and there was a 32-bit bus between the CPU and the 32-bit register, the concept of "endianness" wouldn't matter: the 32-bit words traveling between the CPU and the peripheral's register would never get subdivided, nor packed into wider quantities.
The CPU can, however, read and write 8-bit bytes, 16-bit half-words, 32-bit words, and 64-bit double-words. We define our peripherals' register images to look the same in big and little endian for 32-bit loads and stores, but that means 8-bit, 16-bit, and 64-bit accesses all look different between the two endians. That's the first part that usually breaks people's brains.
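Here's a small C sketch of that effect (the register value and the byte_load helper are hypothetical, just modeling the behavior): same register, same byte offset, different data depending on the endian setting.

```c
#include <stdio.h>
#include <stdint.h>

/* A peripheral register whose 32-bit image is defined to be the
 * same in both endians: bits 31..24 are always 0x11, and so on. */
static const uint32_t reg_image = 0x11223344;

/* A byte load at 'offset' (0..3) within the register picks a byte
 * out of the fixed 32-bit image based on the machine's endianness. */
static uint8_t byte_load(unsigned offset, int big_endian)
{
    /* Big endian: offset 0 is the most significant byte.
     * Little endian: offset 0 is the least significant byte. */
    unsigned shift = big_endian ? (24 - 8 * offset) : (8 * offset);
    return (uint8_t)(reg_image >> shift);
}

int main(void)
{
    printf("big:    0x%02x\n", byte_load(0, 1)); /* prints 0x11 */
    printf("little: 0x%02x\n", byte_load(0, 0)); /* prints 0x44 */
    return 0;
}
```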
Conceptually, the peripheral is switching the order of the bytes within its 32-bit register words so that they look the same to 32-bit accesses regardless of endianness. The physical reality is that any register sitting on a 32-bit bus that follows the labeling convention I mentioned above doesn't need any physical byte swapping to work under this scheme. That's the part that leaves many engineers I work with scratching their heads. Why? Because the register's byte ordering changed at the same time as the machine's byte-lane numbering, and the two changes canceled.
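One way to convince yourself of the cancelation is to compute, for each byte of the register by significance, which physical lane it lands on under each endian setting. A sketch (again my own model, not actual hardware logic):

```c
#include <stdio.h>

int main(void)
{
    /* For each byte of a 32-bit register, by significance
     * (0 = least significant .. 3 = most significant), find the
     * physical lane it drives.  Lanes are labeled 0..3 left to
     * right, as in the diagrams above. */
    for (int sig = 0; sig <= 3; sig++) {
        /* Fixed 32-bit image: in big endian the MSB is byte
         * address 0; in little endian the LSB is byte address 0. */
        int be_addr = 3 - sig;
        int le_addr = sig;

        /* Address-to-lane rule from earlier: big endian counts
         * lanes left to right, little endian right to left. */
        int be_lane = be_addr;
        int le_lane = 3 - le_addr;

        printf("sig %d: big -> lane %d, little -> lane %d\n",
               sig, be_lane, le_lane);
    }
    return 0;
}
```

Every row prints the same lane for both settings: the byte of significance s always lands on lane 3 - s, so the register can simply hard-wire its bytes to the lanes.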
To get people really scratching their heads, though, you need to look at what this does for a series of registers accessed via a wider bus. Ordinarily, you'd have to swap all the bytes on the bus when switching between the two endians. That way, a byte access to address '0' would always go to the same place, even though in big endian the access goes at the left end of the bus, and in little endian the access goes at the right end of the bus.
When you have these 32-bit registers whose 32-bit images stay fixed, it turns out that all you need to do is swap word positions on the bus, but not the bytes within the words. It makes sense when you really think about it, but it's something I've seen confuse many engineers. The basic idea is that we're packing objects of size "32 bits" into a larger container (going back to the rules I stated before), not bytes.
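In other words, for fixed-image 32-bit registers, converting one beat of a 64-bit bus between the two endian configurations is just a swap of the 32-bit halves, where ordinary byte-addressed memory would need a full byte reversal. A sketch of the two fix-ups (hypothetical helpers, not our actual hardware):

```c
#include <stdint.h>

/* One beat of a 64-bit bus carrying two fixed-image 32-bit
 * registers.  Switching the chip's endian setting moves each
 * register to the other half of the bus, but the bytes within
 * each 32-bit register stay put.  So the "swap" is a 32-bit
 * rotate, not a byte reversal. */
static uint64_t endian_fixup_regs(uint64_t beat)
{
    return (beat << 32) | (beat >> 32);
}

/* Contrast with plain byte-addressed memory, where switching the
 * endian setting reverses all eight bytes of the beat. */
static uint64_t endian_fixup_memory(uint64_t beat)
{
    uint64_t out = 0;
    for (int i = 0; i < 8; i++) {
        out = (out << 8) | (beat & 0xff);
        beat >>= 8;
    }
    return out;
}
```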
Anyway... I had hoped to describe this in a less ramble-prone manner, but seeing as it's getting close to 1AM, I seem to have veered squarely into ramble territory. So, until next time: Aren't you glad I put this under a cut tag?
*Remind me to change the combination on my luggage.