Submitted by tushar pramanick on Sun, 03/10/2013 - 21:51

Using Unions

Now let's focus on the applications of unions. Basically, there are two kinds of union applications, which are introduced in the following two sections.
Referencing the Same Memory Location Differently

The first application of unions is to reference the same memory location with different union members.

To get a better idea about referencing the same memory with different union members, let's have a look at the program in Listing 20.4, which uses the two members of a union to reference the same memory location. (We assume that the char data type is 1 byte long, and the int data type is 2 bytes long, which are true on many machines.)

TYPE
Listing 20.4. Referencing the same memory location with different union members.


1:  /* 20L04.c: Referencing the same memory in different ways */
2:  #include <stdio.h>
3:
4:  union u{
5:     char ch[2];
6:     int num;
7:  };
8:
9:  int UnionInitialize(union u val);
10:
11: main(void)
12: {
13:    union u val;
14:    int x;
15:
16:    x = UnionInitialize(val);
17:
18:    printf("The two character constants held by the union:\n");
19:    printf("%c\n", x & 0x00FF);
20:    printf("%c\n", x >> 8);
21:
22:    return 0;
23: }
24: /* function definition */
25: int UnionInitialize(union u val)
26: {
27:    val.ch[0] = `H';
28:    val.ch[1] = `i';
29:
30:    return val.num;
31: }


The following output is printed on the screen after the executable 20L04.exe is created and executed:

OUTPUT

C:\app>20L04
The two character constants held by the union:
H
i
C:\app>

ANALYSIS

As you see from the program in Listing 20.4, a union called val is defined in line 13. It contains two members; one is a char array ch and the other is an int variable num. If a char data type is 1 byte long and an int data type is 2 bytes long, the ch array and the integer variable num have the same length of memory storage on those machines.

A function named UnionInitialize()is called and passed with the union name val in line 16. The definition of the UnionInitialize() function is shown in lines 25_31.

From the function definition, you can see that the two elements of the char array ch are initialized with two character constants, `H' and `i' (in lines 27 and 28). Because the char array ch and the int variable num share the same memory location, we can return the value of num that contains the same content as the ch array. (See line 30.) Here we've used the two members, ch and num, in the val union to reference the same memory location and the same contents of the union.

The value returned by the UnionInitialize() function is assigned to an int variable x in line 16 inside the main() function. The statements in lines 19 and 20 print out the 2 bytes of the int variable num. Each byte of num corresponds to a character that is used to initialize the ch array because num and ch are all in the same union and have the same content of the union. Line 19 displays the low byte of num returned by the x & 0x00FF expression. In line 20, the high byte of num is obtained by shifting the x variable to the right by 8 bits. That is, by using the shift-right operator in the x >> 8 expression. (The bitwise operator (&) and the shift operator (>>) are introduced in Hour 8, "More Operators.")

From the output, you can see that the content of the val union is shown on the screen correctly.

Figure 20.2 shows the locations of the two character constants in memory.


Figure 20.2. The memory locations of the two character constants.

NOTE

    There are two formats to store a multiple-byte quantity, such as the int variable num in Listing 20.4. One of the formats is called the little-endian format; the other is the big-endian format.

    For the little-endian format, the high bytes of a multiple-byte quantity are stored at higher memory addresses and the low bytes are saved at lower addresses. The little-endian format is used by Intel's 80x86 microprocessors. My computer's CPU is a Pentium microprocessor, which is one of the members in the 80x86 family. Therefore, in Listing 20.4, the character constant `H', which is a low byte, is stored at the lower address. `i' is stored at the higher address because it's a high byte.

    The big-endian format is just opposite. That is, the high bytes are stored at lower addresses; the low bytes are stored at higher addresses. Motorola's 68000 microprocessor family uses the big-endian format.

Comments

Related Items

The if statement

The if statement

If life were a straight line, it would be very boring. The same thing is true for programming. It would be too dull if the statements in your program could only be executed in the order in which they appear.

Mathematical Functions in C

Mathematical Functions in C

Basically, the math functions provided by the C language can be classified into three groups:

    Trigonometric and hyperbolic functions, such as acos(), cos(), and cosh().

Changing Data Sizes

Changing Data Sizes

Enabling or Disabling the Sign Bit

Enabling or Disabling the Sign Bit

As you know, it's very easy to express a negative number in decimal. All you need to do is put a minus sign in front of the absolute value of the number. But how does the computer represent a negative number in the binary format?

Question and Answer

Question and Answer

    Q Which bit can be used as the sign bit in an integer?