HP C/HP-UX Reference Manual: Version A.05.55.02 > Chapter 2 Program Organization

Lexical Elements

C language programs are composed of lexical elements: characters and white space that are grouped together into tokens. This section describes white space, newlines, and continuation lines; comments; identifiers; and keywords.

White Space, Newlines, and Continuation Lines

In C source files, blanks, newlines, vertical tabs, horizontal tabs, and form feeds are all considered to be white space characters.

The main purpose of white space characters is to format source files so that they are more readable. The compiler ignores white space characters, except when they are used to separate tokens or when they appear within string literals.

The newline character is not treated as white space in preprocessor directives. A newline character is used to terminate preprocessor directives. See “Overview of the Preprocessor” for more information.

The line continuation character in C is the backslash (\). Place it at the end of a line to continue a quoted string or preprocessor directive on the next line of source code.

Spreading Source Code Across Multiple Lines

You can split a string or preprocessor directive across two or more lines. To do so, you must end each line that is to be continued with the continuation character (\); for example:

#define foo_macro(x,y,z) ((x) + (y))\
* ((z) - (x))

printf("This is a very, very, very lengthy and \
very, very uninteresting string.");

Comments

A comment is any series of characters beginning with /* and ending with */. The compiler replaces each comment with a single space character.

HP C allows comments to appear anywhere in the source file except within identifiers or string literals. The C language does not support nested comments.

In the following example, a comment follows an assignment statement:

average = total / number_of_components; /* Find mean value. */

Comments may also span multiple lines, as in:

/*
This is a
multi-line comment.
*/

Identifiers

Identifiers, also called names, can consist of the following:

  • Letters (ISO Latin-1 decimal values 65-90 and 97-122)

  • Digits

  • Dollar sign ($) (HP C extension)

  • Underscore (_)

The first character must be a letter, underscore or a $ sign. Identifiers that begin with an underscore are generally reserved for system use. The ANSI/ISO standard reserves all names that begin with two underscores or an underscore followed by an uppercase letter for system use.

NOTE: HP C allows the dollar sign ($) in identifiers, including using the $ in the first character, as an extension to the language.

Identifiers cannot conflict with reserved keywords (see “Keywords”).

Legal Identifiers

meters
green_eggs_and_ham
system_name
UPPER_AND_lower_case
$name                   Legal in HP C, but non-standard

Illegal Identifiers

20_meters    Starts with a digit
int          int is a reserved keyword
no$#@good    Contains illegal characters

Length of Identifiers

HP C identifiers are unique up to 256 characters.

The ANSI/ISO standard requires compilers to treat at least the first 31 characters of a name as significant for internal identifiers (such as local variables) and at least the first 6 characters for external identifiers (such as global variables).

To improve portability, make your local variable names unique within the first 31 characters, and global variable names unique within the first 6 characters.

Case Sensitivity in Identifiers

In C, identifier names are always case-sensitive. An identifier written in uppercase letters is considered different from the same identifier written in lowercase. For example, the following three identifiers are all unique:

kilograms
KILOGRAMS
Kilograms

Some HP-UX programming languages (such as Pascal and FORTRAN) are case-insensitive. When writing an HP C program that calls routines from these other languages, you must be aware of this difference in sensitivity.

Strings are also case-sensitive. The system recognizes the following two strings as distinct:

"THE RAIN IN SPAIN"
"the rain in spain"

Predefined Identifier __func__

The HP ANSI C compiler defines the predefined identifier __func__, as specified in the C9X (C99) standard. This provides an additional means of debugging your code.

The predefined __func__ identifier lets you write more informative debugging statements that name the function in which they appear. This is useful for fatal errors and conditions that produce warnings. __func__ can also be used within debugging macros to keep track of information such as the function call stack.

The __func__ identifier is implicitly declared by the compiler in the following manner:

static const char __func__[] = "<function-name>";

The declaration is implicitly added immediately after the opening brace of any function that uses __func__. The value <function-name> is the name of the lexically enclosing function.

The following code fragment illustrates the use of the predefined identifier __func__:

#include <stdio.h>
void varfunc(void)
{
    printf("%s\n", __func__);
    /* ... */
}

Each time the varfunc function is called, it prints to the standard output stream:

varfunc

Keywords

HP C supports the list of keywords shown below. You cannot use keywords as identifiers; if you do, the compiler reports an error. You cannot abbreviate a keyword, and you must enter keywords in lowercase letters.

NOTE: The const, signed, and volatile keywords are part of the ANSI/ISO standard, but not part of the K&R language definition.

auto

Causes the variable to be dynamically allocated and initialized only when the block containing the variable is being executed. This is the default for local variables.

break

See “break”.

case

An optional element in a switch statement. The case label is followed by an integral constant expression and a colon (:). No two case constant expressions in the same switch statement can have the same value. For example:

switch (getchar())
{
case 'r':
case 'R':
    moveright();
    break;
...
}

char

The char keyword defines an integer type that is 1 byte long.

A char type has a minimum value of -128 and a maximum value of 127.

An unsigned char is also 1 byte long, with a minimum value of 0 and a maximum value of 255.

const

Specifies that an object of any type must have a constant value throughout the scope of its name. For example:

/* declare factor as a constant float */
const float factor = 2.54;

The value of factor cannot change after the initialization.

continue

See “continue”.

default

A keyword used within the switch statement to identify the catch-all-else statement. For example:

switch (grade) {
case 'A':
    printf("Excellent\n");
    break;
default:
    printf("Invalid grade\n");
    break;
}

double

A 64-bit data type for representing floating-point numbers.

The lower normalized bound is 2.225E-308. The lower denormalized bound is 4.941E-324. The upper bound is 1.798E+308.

Other floating-point types are float and long double.

else

See “if”.

extern

Used for declarations both within and outside of a function (except for function arguments). Signifies that the object is defined somewhere else.

float

A 32-bit data type for representing floating-point numbers.

The range for float is:

  • Min: Least normalized: 1.1755E-38; least denormalized: 1.4013E-45

  • Max: 3.4028E+38

Other floating-point types are double and long double.

for

See “for”.

goto

See “goto”.

if

See “if”.

int

A 32-bit data type for representing whole numbers.

The range for int is -2,147,483,648 through 2,147,483,647.

The range for unsigned int is 0 through 4,294,967,295.

long

A 32-bit integer data type in the HP-UX 32-bit data model. The range for long is -2,147,483,648 through 2,147,483,647. In the HP-UX 64-bit data model, the long data type is 64 bits wide and its range is the same as that of the long long data type.

The long long 64-bit data type is supported as an extension to the language when you use the -Ae compile-line option.

The range for long long is -9,223,372,036,854,775,808 through +9,223,372,036,854,775,807.

register

Indicates to the compiler that the variable is heavily used and may be stored in a register for better performance.

return

See “return”.

short

A 16-bit integer data type.

The range for short is -32,768 through 32,767.

The range for unsigned short is 0 through 65,535.

signed

All integer data types are signed by default. The high-order bit is used to indicate whether a value is greater than or less than zero. Use this modifier for better source code readability. The signed keyword can be used with these data types:

  • char

  • int

  • enum

  • long

  • long long

  • short

Whether char is signed or unsigned by default is implementation-defined. The signed keyword lets you explicitly declare a signed char in a portable way.

static

A variable that has memory allocated for it at program startup time. The variable is associated with a single memory location until the end of the program.

switch

See “switch”.

__thread

Beginning with the HP-UX 10.30 operating system release, this HP-specific keyword defines a thread-specific data variable, distinguishing it from other data items that are shared by all threads. With a thread-specific data variable, each thread has its own copy of the data item. These variables eliminate the need to allocate thread-specific data dynamically, thus improving performance.

This keyword is implemented as an HP-specific type qualifier, with the same syntax as const and volatile, but not the same semantics. Syntax examples:

__thread int j=2;
int main()
{
    j = 20;
}

Semantics for the __thread keyword:

  • Only variables of static duration can be thread-specific.

  • Thread-specific data objects cannot be initialized.

  • Pointers of static duration that are not thread-specific may not be initialized with the address of a thread-specific object, although assignment is permitted.

  • All global variables, thread-specific or not, are implicitly initialized to zero by the linker.

Only one of the compilation units that contribute to the program (including libraries linked into the executable) may contain the declaration itself, for example:

__thread int x;

All other declarations must be strictly references:

extern __thread int x;

Even though __thread has the same syntax as a type qualifier, it does not qualify the type, but is a storage class specification for the data object. As such, it is type compatible with non-thread-specific data objects of the same type. That is, a thread-specific int is type compatible with an ordinary int (unlike a const-qualified or volatile-qualified int).

Note that use of the __thread keyword in a shared library will prevent that shared library from being dynamically loaded (that is, loaded via an explicit call to shl_load()).

unsigned

A data type modifier that indicates that no sign bit will be used. The data is assumed to contain values greater than or equal to zero. All integer data types are signed by default. The unsigned keyword can be used to modify these data types:

  • char

  • int

  • enum

  • long

  • long long

  • short

void

The void data type has three important purposes:

  • to indicate that a function does not return a value

  • to declare a function that takes no arguments

  • to allow you to create generic pointers.

To indicate that a function does not return a value, you can write a function definition such as:

void func(int a, int b)
{
    . . .
}

This indicates that the function func() does not return a value. Likewise, on the calling side, you declare func() as:

extern void func(int, int);

volatile

Specifies that the value of a variable might change in ways that the compiler cannot predict. If volatile is used, the compiler will not perform certain optimizations on that variable.

while

See “while”.

© Hewlett-Packard Development Company, L.P.