Variable Type and Storage

Introduction

A variable is associated with memory that stores a value for later use. Computers are binary devices: ultimately, everything is stored as some collection of ones and zeros. The computer needs to know the data type of a variable in order to interpret that memory correctly. Different types of data can have different numbers of bits associated with them, and the way in which the bits should be interpreted depends on the type. For example, the interpretation of bits associated with a letter (or character) is different from the interpretation of bits used to represent a number. We won't get into all the details of data storage and type here, but one thing you should realize is that each data type has only a finite number of bits associated with it, and thus there is a limit on the size of numbers (and, for real numbers, a limit on the precision).

Bits and Bytes

Again, computers store and perform all operations in binary. What is binary? It is just a different counting system from the more familiar decimal system that we usually use. Instead of using the digits 0 through 9 as we do in the decimal system, in binary we use only the digits 0 and 1. With decimal values, in addition to the digits themselves, the location of a digit within a number has meaning. For example, the number 537 has 7 ones, 3 tens, and 5 hundreds: a 5 appearing in the third place (counting from the right) means something different than a 5 appearing anywhere else. Each place within a decimal number corresponds to a certain power of 10. The first place corresponds to the number of ones (the number of $1$'s = $10^0$'s); the second place corresponds to the number of tens (the number of $10$'s = $10^1$'s); the third place corresponds to the number of hundreds (the number of $100$'s = $10^2$'s); the fourth place corresponds to the number of thousands (the number of $1000$'s = $10^3$'s); and so on.

Similarly, in a binary number, the location of a binary digit has meaning. In this case, each place corresponds to a power of 2. So, for example, 1101 corresponds to, reading right to left, 1 one, 0 twos, 1 four, and 1 eight. (Note that one, two, four, and eight correspond to $2^0$, $2^1$, $2^2$, and $2^3$, respectively.) Adding these values ($8 + 4 + 0 + 1$), the binary number 1101 corresponds to the decimal number 13. A single binary digit is called a bit (obtained by combining the first letter and the last two letters of “binary digit”). A byte is a collection of eight bits. Speaking in terms of bytes is common because the underlying hardware in a computer is often structured in a way that deals with eight bits at a time (or some multiple of eight bits).
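
The place-value rule above can be sketched in a few lines of C. This is only an illustration: binary_to_decimal is a hypothetical helper name, not part of any board library, and it assumes the input is a string made up of the characters '0' and '1'.

```c
#include <stddef.h>

/* Hypothetical helper: convert a string of '0' and '1' characters to the
   number it represents. Conversion stops at the first character that is
   not a binary digit, so an empty string gives 0. */
unsigned long binary_to_decimal(const char *bits)
{
    unsigned long value = 0;
    for (size_t i = 0; bits[i] == '0' || bits[i] == '1'; i++) {
        /* Moving one place to the left doubles the running total, just as
           appending a digit in decimal multiplies the total by 10. */
        value = value * 2 + (unsigned long)(bits[i] - '0');
    }
    return value;
}
```

Calling binary_to_decimal("1101") returns 13, matching the sum of one eight, one four, no twos, and one one.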

Integers

Perhaps the most common type of variable is the integer. An integer is a number with no fractional part, written without a decimal point. For example, -1, 0, 1, and 42 are all integers. Integer variables are declared by writing int followed by the variable name. The following code creates the integer variables i and count and initializes them to 0 and 57, respectively.

                    int i = 0;
                    int count = 57;
                

In C/C++, integers can be either signed or unsigned. A signed integer may be positive or negative; effectively, one bit has to be used to represent the sign. An unsigned integer, on the other hand, is never negative. For an equal number of bits, the maximum value of an unsigned integer is therefore larger than that of a signed integer. In fact, the maximum essentially doubles, because the bit that would otherwise hold sign information is available to store size information. The following table shows the lower and upper limits on the values of signed and unsigned integers.

Integer Limits on the Uno32 and Max32
            Lower Limit                      Upper Limit
Signed      -2,147,483,648 (or $-2^{31}$)    2,147,483,647 (or $2^{31}-1$)
Unsigned    0 (or $2^0-1$)                   4,294,967,295 (or $2^{32}-1$)
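
The limits in the table follow directly from the number of bits. As a rough sketch, the helpers below (hypothetical names, written only for illustration) compute the limits for any width from 1 to 63 bits, using a 64-bit type so that the 32-bit results fit without overflow.

```c
#include <stdint.h>

/* Largest value of an unsigned integer that is `bits` wide: 2^bits - 1.
   This sketch is only valid for 1 <= bits <= 63. */
int64_t unsigned_max(int bits)
{
    return ((int64_t)1 << bits) - 1;
}

/* A signed integer gives up one bit to the sign, so its largest value
   is 2^(bits-1) - 1 and its smallest value is -2^(bits-1). */
int64_t signed_max(int bits)
{
    return ((int64_t)1 << (bits - 1)) - 1;
}

int64_t signed_min(int bits)
{
    return -((int64_t)1 << (bits - 1));
}
```

With bits set to 32, these return 4,294,967,295, 2,147,483,647, and -2,147,483,648, the values listed in the table.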

If a variable is declared using int, it is a signed variable. In order to obtain an unsigned integer, one declares the variable to be of type unsigned int. For example, in the following, count1 is a signed integer while pageHits is an unsigned integer:

                    int count1 = -234;
                    unsigned int pageHits = 3000000000;
                

One final note: In C/C++, you do not include commas when writing numbers. So, it would be an error in the second statement above to write 3,000,000,000.

Floating Point Numbers

Another common numeric data type is the float, or floating point number. This data type is called float because it represents numbers with a decimal point, e.g., 1.5, 3.141, and -0.33333, where the decimal point can “float” to where it is needed to represent the number. This is in contrast to fixed point numbers, which specify the number of digits before and after the decimal point (for example, in everyday usage, monetary values are typically specified as having two digits to the right of the decimal point). Unlike integers, floating point numbers can have a fractional part (if you are familiar with mathematical terms, a float is a form of real number). This fractional part may be zero, such as in the number 7.0, but even when the fractional part is zero, a float is stored in a fundamentally different way than an int.

You can specify a floating point number using scientific notation, where a number is scaled by a specified power of 10. Examples of numbers written in scientific notation are $-1.234\times 10^5$ and $1.234 \times 10^{-3}$, which correspond to $-123400.0$ and $0.001234$, respectively. However, when writing a sketch, we can't write numbers quite like this because we don't have the ability to write superscripts. Instead, we use the letter e to represent “$\times 10$ raised to the power”. Think of e as indicating the exponential part, where the number immediately following e is the power of 10. So, in C/C++, one would write the numbers mentioned at the start of this paragraph as -1.234e5 and 1.234e-3. (This representation of scientific notation is used on many calculators as well.)
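
As a small illustration of the e notation, the two functions below (hypothetical names, used only for this example) return the numbers written with an exponent. Each is the same value as the plain decimal form shown in its comment.

```c
/* -1.234 x 10^5, the same value as -123400.0 */
double large_example(void)
{
    return -1.234e5;
}

/* 1.234 x 10^-3, the same value as 0.001234 */
double small_example(void)
{
    return 1.234e-3;
}
```

The compiler treats -1.234e5 and -123400.0 as two spellings of the same constant, so which form to use is purely a matter of readability.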

Another data type that is used to store numbers that can have a fractional part (i.e., floating point or real numbers) is the double. There is an important difference between the Uno32 and the Max32 in how floats and doubles are handled. On an Uno32, float and double are the same size: both use 4 bytes (32 bits), so they have the same precision and the same limits on size. On a Max32, in contrast, a double has twice as many bits as a float and is stored using 8 bytes. If you are using an Uno32 and need the precision or size offered by an 8-byte representation of a floating point number, you can obtain that by declaring the type to be long double. The following table lists the number of bytes used to store floats and doubles on the two boards.

Bytes Used To Store Data
Board    float    double
Uno32    4        4
Max32    4        8
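
If you are unsure how many bytes a type occupies with your particular board and compiler, the sizeof operator reports it. The following is a minimal sketch; the function names are made up for this example, and the values returned depend on where the code is compiled (a desktop compiler will typically report 4 for float and 8 for double).

```c
#include <stddef.h>

/* Each function returns the number of bytes the compiler uses to store
   one value of the given type on the current target. */
size_t float_bytes(void)
{
    return sizeof(float);
}

size_t double_bytes(void)
{
    return sizeof(double);
}

size_t long_double_bytes(void)
{
    return sizeof(long double);
}
```

Checking these sizes once on each board is a simple way to confirm which row of the table applies to your setup.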

Although not critical to most programming tasks, it is a good idea to have a basic understanding of how a floating point number uses its bits to represent numbers. One bit is used to indicate the sign of the number, i.e., whether it is positive or negative. Then a number of bits (how many depends on the data type) are used for the exponent. These bits effectively specify where the decimal point is within the number. Finally, the rest of the bits are used for the mantissa (i.e., the non-exponential part of the number), which is also called the significand since these are the bits that are actually “significant.”

The method used to translate the bits of a floating point number into decimal and how the computer does arithmetic with these numbers is beyond the scope of this page, but you can easily find this information with an Internet search if you are interested in learning more. The following table lists the number of bits dedicated to each part of a float and an 8-byte double. Also, Fig. 1 gives a graphical representation of the allocation of these bits.

Bits Used for Each Component of a Floating Point Number
                    Sign    Exponent    Significand (mantissa)
float (4 bytes)     1       8           23
double (8 bytes)    1       11          52
Figure 1. Allocation of bits in a 4-byte float and 8-byte double representation of a floating point number.
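
To see the three fields of a 4-byte float directly, the bits can be copied into a 32-bit integer and separated with shifts and masks. The sketch below assumes that float uses the standard IEEE 754 single-precision layout (1 sign bit, 8 exponent bits, 23 significand bits); FloatFields and split_float are names invented for this example.

```c
#include <stdint.h>
#include <string.h>

/* The three fields of a 32-bit float, assuming IEEE 754 single precision. */
typedef struct {
    uint32_t sign;      /* 1 bit: 0 for positive, 1 for negative */
    uint32_t exponent;  /* 8 bits, exactly as stored             */
    uint32_t fraction;  /* 23 bits of the significand            */
} FloatFields;

FloatFields split_float(float x)
{
    uint32_t bits;
    FloatFields f;

    /* Copy the raw bits; a cast would convert the value instead. */
    memcpy(&bits, &x, sizeof bits);

    f.sign     = bits >> 31;            /* top bit      */
    f.exponent = (bits >> 23) & 0xFFu;  /* next 8 bits  */
    f.fraction = bits & 0x7FFFFFu;      /* low 23 bits  */
    return f;
}
```

For example, split_float(1.0f) gives a sign of 0, a stored exponent of 127, and a fraction of 0. In this format the stored exponent is the true exponent plus 127, which is why the value 1.0 (true exponent 0) shows 127.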

It may seem strange that there is only one sign bit and yet both the exponent and mantissa can be negative. The single sign bit gives the sign of the number as a whole. The exponent is stored in a biased (offset) form that does not require an explicit sign bit: a fixed constant (127 for a float, 1023 for an 8-byte double) is added to the true exponent so that the stored value is never negative. We will not cover the details of this representation here as it doesn't really concern us (and, again, you can do an Internet search to learn more if you're interested in further details). You might want to keep in mind that the 23 bits of the significand of a float correspond to roughly 7 digits of precision in decimal ($2^{23} = 8.388608 \times 10^6$) whereas the 52 bits of the significand of a double correspond to roughly 16 digits of precision in decimal ($2^{52} = 4.503599627370496 \times 10^{15}$). Thus a double actually has more than twice the precision of a float. This is because, rather than simply doubling the number of bits devoted to the exponent, some of the “additional” bits in a double are devoted to making the significand have more than twice as many bits as the significand of a float.
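
The limited precision of a float is easy to observe with whole numbers. The sketch below stores an integer in a floating point variable and reads it back; the function names are hypothetical. It assumes a 4-byte IEEE 754 float, which can hold every whole number up to 16,777,216 ($2^{24}$) exactly but not every whole number beyond that, and an 8-byte double for the second function.

```c
/* Store a whole number in a float and read it back. The volatile
   qualifier forces the value to really be stored as a float rather than
   kept at higher precision by the compiler. */
long through_float(long n)
{
    volatile float f = (float)n;
    return (long)f;
}

/* The same round trip through a double. */
long through_double(long n)
{
    volatile double d = (double)n;
    return (long)d;
}
```

With these assumptions, through_float(16777217) returns 16777216, because an eight-digit number needs more precision than a float has, while through_double(16777217) returns 16777217 unchanged.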

The following table lists the limits on the size of floats and doubles, i.e., the range of these types of numbers.

Floating Point Number Range
                    Minimum (positive number)               Maximum (positive number)
float (4 bytes)     $1.1754943 \times 10^{-38}$             $3.4028234 \times 10^{+38}$
double (8 bytes)    $2.2250738585072014 \times 10^{-308}$   $1.7976931348623157 \times 10^{+308}$
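
Rather than typing these limits by hand, a program can take them from the standard header float.h, which defines FLT_MIN, FLT_MAX, DBL_MIN, and DBL_MAX for the compiler in use. The wrapper functions below are a sketch with made-up names. The values match the table only where float is 4 bytes and double is 8 bytes, so on a board where double is 4 bytes the double limits would equal the float limits.

```c
#include <float.h>

/* Smallest positive normalized float and largest finite float. */
double smallest_normal_float(void)
{
    return FLT_MIN;
}

double largest_float(void)
{
    return FLT_MAX;
}

/* The corresponding limits for double. */
double smallest_normal_double(void)
{
    return DBL_MIN;
}

double largest_double(void)
{
    return DBL_MAX;
}
```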

Characters and Bytes

Another data type is the char, or character data type. This data type is only one byte long, 8 bits. It is typically used to represent a character (e.g., a letter, digit, or punctuation mark). The way bytes correspond to individual characters is specified by the American Standard Code for Information Interchange (ASCII). For example, in this code the character “A” is stored in the computer as the collection of bits 0100 0001. Thinking of these bits as a binary number, we can say that in the ASCII system the character “A” is represented by the decimal number 65.

If you want to initialize a char variable with a character when the variable is declared, you can either write the character enclosed in single quotes or give the numeric ASCII value. For example, the following statements initialize both the variables ch1 and ch2 to the letter “A”.

                    char ch1 = 'A';
                    char ch2 = 65;
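
Because a char is stored as a small number, it can be converted to an int or used in arithmetic. The following sketch uses two hypothetical helper functions and assumes the ASCII character codes described above.

```c
/* Return the numeric code stored for a character. */
int char_code(char c)
{
    return (int)c;
}

/* Return the character whose code is one greater. In ASCII the letters
   are numbered consecutively, so this steps from 'A' to 'B'. */
char next_char(char c)
{
    return (char)(c + 1);
}
```

Here char_code('A') returns 65 and next_char('A') returns 'B'.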
                

The byte data type is often used as an (unsigned) integer with 8 bits. It may seem rather odd, but a char is treated as a signed integer. (Strictly speaking, the C/C++ standards leave it up to the compiler whether a plain char is signed, so this can differ on other platforms.) In many applications a byte and a char can be used interchangeably, but one should be mindful that by default a char is considered a signed value while a byte is assumed to contain no sign information.
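
The difference shows up only when the top bit of the byte is set. The sketch below interprets the same 8 bits both ways, using the fixed-width types uint8_t and int8_t from stdint.h as stand-ins for an unsigned byte and a signed char (an assumption made for portability). Converting an out-of-range value to a signed type is implementation-defined in C; the code assumes the usual two's complement wraparound that GCC provides. The function names are hypothetical.

```c
#include <stdint.h>

/* Interpret the low 8 bits of `pattern` as an unsigned value (0 to 255). */
int as_unsigned_byte(unsigned int pattern)
{
    return (uint8_t)pattern;
}

/* Interpret the same 8 bits as a signed value (-128 to 127). This
   assumes two's complement wraparound on conversion. */
int as_signed_byte(unsigned int pattern)
{
    return (int8_t)pattern;
}
```

For the bit pattern 0100 0001 (0x41) both functions return 65, but for 1100 1000 (0xC8) the unsigned interpretation is 200 while the signed interpretation is -56.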


Other product and company names mentioned herein are trademarks or trade names of their respective companies. © 2014 Digilent Inc. All rights reserved.