Base-N Number System Primer

Sep 13, 2008 16:03

Written on sort-of-a-request from a person in my f-list. I hope it isn't too brain-frying.

For those of you who don't want to see a lot of math technobabble, skip right past this entry.



Base-N Number System Primer

1. Introduction

I actually learned the theory behind this stuff back in high school, but from what I saw in college, I was an exception. A lot of Computer Science freshmen are scared out of their wits at the first contact with base-n, and in one occasion my buddies even used it as prank lesson 1 material. I blame the education system for not preparing students well enough.

Anyway, enough rambling.

2. And Now For Something Completely Different... Not!

What is the number system you're used to? Hindu-Arabic numerals? Good. In that case, you might be surprised to find out that you've already been using base-n for a long time. Because in fact, all base-n systems use the same theoretical principle as the decimal system. Think about it: what does it mean when you append a zero to the right of a number? It means the number is now multiplied by ten. Which happens to be the base you're working with.

This works for every base-n number. Append a zero to the right of a binary number, and you're doubling it. Append a zero to the right of an octal number, and you're multiplying it by eight. And so on.

A better way to put this is to say that each position on a number represents a power of the base (starting from zero because we need to represent units somehow, and growing in magnitude from right to left just like in the decimal system), and the value at that position represents the factor by which you multiply the corresponding power of the base. This is why single digits in each base only represent from zero to the base minus one: if you have n×np, that's actually a different power (np+1), and thus a different position.

Although you probably won't see it for a while (in practical uses anyway), all this also applies to the negative powers of the base.

...d4d3d2d1d0.d-1d-2d-3d-4......n4n3n2n1n0.n-1n-2n-3n-4...
The common algorithms for the four basic operations will work regardless of base, as long as you work within the same base and remember where the "overflow" points are. You have to be aware of things like 1+1=10 in binary, or 6+7=15 in octal, or 2+8=a in hexadecimal. Multiplication tables need to be redone so that, for example, 4×7 maps to 34 in octal. And with multiplication tables at hand, you can actually use the division algorithm to obtain things like 64÷14=5 in hexadecimal without performing unnecessary base conversions.

3. Base-N Conversion

Having been involved with Computer Science for over eight years, I can safely claim that the main use of knowing how to do a base conversion is to cover for those situations where you don't have a computer around and need to have a clear idea of what amount is expressed by a base-n number. Or maybe you want to save writing space by converting that huge base-2 string of digits into a compact base-16 string of digits. Whatever the case is, doing a base conversion is just a matter of reencoding the number using the theory above: decode the source string by adding the amounts represented by each digit-position pair, and encode the number again by iteratively figuring out which order of magnitude you're working with, squeezing the highest digit you possibly can for that position and subtracting the partial amount from the number you're working with. You can also do it the other way around, walking from the lower orders to the higher ones, but that's best saved for a computer implementation. As an example, let's convert 34a8 (hexadecimal, henceforth prefixed with "0x") to decimal.

0x34a8=3×163+4×162+10×161+8×160=3×4096+4×256+10×16+8×1=13480

This is easy because we're used to thinking in decimal so we didn't have to go to the trouble of figuring out magnitudes, we just expressed the quantities in the target system. So let's do the opposite now.

13480 sits between 4096 and 65536 (164), so we'll have to make do with a multiple of 4096. Performing the integer division 13480÷4096 gets us the result 3, which will be the first digit we'll write down (in position 3 but that usually doesn't matter when you're using pen and paper). So now we have to get rid of that magnitude to continue the process, which is done by subtracting 3×4096 from 13480, or simply taking the remainder of the last division. The result is 1192. This sits between 256 and 4096 (if you got a number larger than 4096 here, you'd know there was a mistake in the previous step; on the other hand, a number smaller than 256 here means that the corresponding digit is a 0 and thus you don't have to subtract anything in order to proceed to the next step), so now we'll find out the digit for this position. The integer division 1192÷256 gets us a 4, so we'll use that. We're left with 168, which again sits between 16 and 256. The result of the division now is 10, which is written "a" in hexadecimal. Now we're left with 8, which goes as it is since we're down to the "0" position.

There, we got 0x34a8 back from 13480.

The octal-decimal and binary-decimal base pairs work similarly to the decimal-hexadecimal examples.

The octal-hexadecimal base pair could be where things get annoying, because you're most likely not fluent enough in either system and would have to use decimal as a middle point in conversions, but read on...

Something neat happens in the binary-octal and binary-hexadecimal base pairs. As you might have noticed, both log28 and log216 are integer. So for every point where you have a base-8 or base-16 magnitude change, you're guaranteed to have a corresponding base-2 magnitude change. In practice, this means you can convert from base-8 to base-2 (or base-16 to base-2) on a digit-by-digit basis, and vice-versa on a group-by-group basis if you group the base-2 digits. For example, any octal 5 will always translate to binary 101, any hexadecimal b will always translate to binary 1011 and so on. Any binary 110 that occupies the positions from 3×i to 3×i+2 will always translate to octal 6, any binary 1110 that occupies the positions from 4×i to 4×i+3 will always translate to hexadecimal e and so on. For example, let's convert 7315 (octal, henceforth prefixed with "0") to hexadecimal using binary as a middle point.

07315=111 011 001 101=111011001101=1110 1100 1101=0xecd

Of course, you might have to add some zeros for padding when you regroup the bits.

4. Exercises

Some practice if you're feeling up to it.

4.1. One of the harshest ways to structure objective questions in the college admission exams in Brazil is called "summation", and consists in presenting a short text and five or six statements about the text, which the candidate must then judge to be true or false, add the numeric values assigned to each true statement together, and give the answer in the form of the result of that sum. The values assigned to the statements are always 1, 2, 4, 8, 16 and so on.
4.1.a. Can you mess up the truth-values and still give a correct answer, assuming you don't make any arithmetic mistakes? Why?
4.1.b. Come up with one such question, point out the truth values of your statements and the numeric value that represents the correct answer.
4.1.c. Come up with a "summation" question where each statement can be true, false, or undecidable. Assign numeric values as necessary, point out the correct evaluation for each statement and the numeric value of the correct answer.

4.2. The MIPS 32-bit architecture defines that every single instruction must be 32 bits long, including the 6-bit opcode (thus instructions have at most 26 bits that can be used for arguments), and also defines a 32-bit address space for addressing individual blocks of 8 bits (byte). One of the instructions in the 32-bit MIPS instruction set is j, which is an unconditional jump (that is, an instruction that commands the program flow to continue from an instruction in a different place) using the single argument to obtain the target address. Seeing as 26 bits obviously cannot cover the entire address space, part of the target address comes from the program counter: that part is the 4 highest order bits. However, concatenating 4 bits with the argument of the j instruction still does not give you 32 bits. How does the MIPS get away with this?

4.3. What is the divisibility by 4 rule in a base-8 system?

4.4. Why does the divisibility by 3 rule still work in a base-16 system? Why doesn't it work in a base-8 one? For which bases does it work and why? (Consult this page if you must.)

4.5. Building on the previous question, which numbers have a divisibility rule based on single-digits sum in a base-13 system?

1For those not familiar with the concept of a prank lesson, in Brazil it's customary for the non-freshmen to play pranks on the freshmen as soon as they meet in college. These vary a lot in severity and some are even considered criminal these days. In my university, one thing Computer Science students did to their freshmen was getting an older student who could convincingly pass themself off as a professor to go to the classroom instead of the real professor for that time slot and scare the freshmen with difficult topics, a harsh grading system and a long list of heavy and expensive textbooks, none of which available in Portuguese.

Comments will be screened in case anyone wants to reply with the answers to the exercises.

computerscience

Previous post Next post
Up