scanf [ HP C/iX Library Reference Manual ] MPE/iX 5.0 Documentation
scanf
Reads externally formatted data from the standard input stream stdin.
Syntax
#include <stdio.h>
int scanf (const char *format [,item [,item]...] );
Parameters
format A pointer to a character string defining the format of the
data to be read (or the character string itself enclosed in
double quotes).
item Each item is the address of a variable into which the data
will be placed.
Return Values
>=0 The number of successfully matched and assigned input
items. The value is 0 if a matching error occurs before the
first assignment.
EOF An input failure occurred before any conversion (for
example, no input characters were available).
Description
The scanf function reads externally formatted data from the standard
input stream stdin, converts the data to internal format, and stores the
results in the variables addressed by the item arguments.
In the scanf function, format is a character pointer to a character
string (or the character string itself enclosed in double quotes), and
item is the address of a variable. The scanf function returns the number
of successfully matched and assigned input items, which is 0 if a
matching error occurs before the first assignment, or returns EOF if an
input failure (such as no input characters being available) occurs before
any conversion is made.
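The following is a minimal sketch of checking the return value. It uses sscanf() (listed under See Also) on a fixed string so that the result is reproducible; the return value is counted the same way for scanf(). The helper name count_two_ints is hypothetical and is not part of the library.

```c
#include <stdio.h>

/* Hypothetical helper: try to read two decimal integers from text.
   Returns whatever sscanf() returns: the number of items assigned,
   or EOF if the input ran out before the first conversion. */
int count_two_ints(const char *text, int *first, int *second)
{
    return sscanf(text, "%d%d", first, second);
}
```

A caller would normally compare the result with the number of items it asked for (2 here) and treat any other value as bad or incomplete input.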
The purpose of the format is to specify how the data to be read is
presented on stdin and what types of data are found there. The format
consists of white-space characters, conversion specifications, and
literal characters.
White-Space Characters
White-space characters (blanks, tabs, newlines, or form feeds) cause
input to be read up to the next non-white-space character.
Conversion Specifications
A conversion specification is a character sequence that tells scanf() how
to interpret the data received at that point in the input.
In the format, a conversion specification is introduced by a percent sign
(%), optionally followed by an asterisk (*) (called the assignment
suppression character), optionally followed by an integer value (called
the field width). The conversion specification is terminated by a
character specifying the type of data to expect; the terminating
characters are called conversion characters. The integer and
floating-point conversion characters may be optionally preceded by a
character indicating the size of the receiving variable.
When a conversion specification is encountered in a format, it is matched
up with the corresponding item in the item list. The data formatted by
that specification is then stored in the location pointed to by that
item. For example, if there are four conversion specifications in a
format, the first specification is matched up with the first item, the
second specification with the second item, and so on.
The number of conversion specifications in the format is directly related
to the number of items specified in the item list. With one exception,
there must be at least as many items as there are conversion
specifications in the format. If there are too few items in the item
list, the behavior is undefined; if there are too many items, the excess
items are ignored. The one exception occurs when the assignment suppression
character (*) is used. If an asterisk occurs immediately after the
percent sign (before the field width, if any), the data formatted by that
conversion specification is discarded. No corresponding item is expected
in the item list; this is useful for skipping over unwanted data in the
input.
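As a short illustration of assignment suppression, the sketch below keeps the first and third of three integers. It uses sscanf() on a string for reproducibility; the helper name first_and_third is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: from text holding three integers, keep the first
   and the third. "%*d" reads the middle one and discards it, so only
   two pointers are passed. Returns the number of items assigned. */
int first_and_third(const char *text, int *first, int *third)
{
    return sscanf(text, "%d %*d %d", first, third);
}
```

Note that the suppressed field is not counted in the return value, so a fully successful call returns 2, not 3.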
Conversion Characters
There are 14 conversion characters: five format integer data, three
format character data, three format floating-point data, and three
special characters.
The integer conversion characters are:
d A decimal integer is expected.
i An integer is expected; a leading 0 marks it as octal and
a leading 0x or 0X as hexadecimal, otherwise it is decimal.
o An octal integer is expected.
u An unsigned decimal integer is expected.
x A hexadecimal integer is expected.
The character conversion characters are:
c A single character is expected, normal skip over
leading white space is suppressed.
s A character string is expected.
[ A character string is expected, normal skip over
leading white space is suppressed.
The floating-point conversion characters are:
e, f, g A floating-point number is expected (the
capitalized forms of these characters are also
accepted).
The special characters are:
p Matches an implementation-defined set of sequences.
n No input is consumed. The corresponding argument
is a pointer to an integer into which is written
the number of characters read from the input stream
so far by this call to scanf().
% Matches a single %. No conversion or assignment
occurs. The complete conversion specification is
%%.
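One common use of the n conversion character is to find out where a conversion stopped. The sketch below is an assumption-laden example (the helper name int_with_length is hypothetical) that uses sscanf() so the character count is easy to check.

```c
#include <stdio.h>

/* Hypothetical helper: read one integer and report, through *used,
   how many characters of text were consumed. "%n" stores the count
   but does not add to the return value. */
int int_with_length(const char *text, int *value, int *used)
{
    *used = 0;
    return sscanf(text, "%d%n", value, used);
}
```

For the text "  42abc" the count stored is 4: two leading blanks plus the two digits.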
Integer Conversion Characters
The d, o, and x conversion characters read characters from stdin until an
inappropriate character is encountered, or until the number of characters
specified by the field width, if given, is exhausted (whichever comes
first).
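The effect of a field width can be sketched as follows. The helper name split_pairs is hypothetical, and sscanf() is used in place of scanf() so the input is fixed.

```c
#include <stdio.h>

/* Hypothetical helper: split four digits such as "1234" into two
   two-digit numbers. Each "%2d" stops after two characters even
   though the next character is still a digit. */
int split_pairs(const char *text, int *left, int *right)
{
    return sscanf(text, "%2d%2d", left, right);
}
```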
For d, an inappropriate character is any character except +, -, and 0
through 9. For o, an inappropriate character is any character except +,
-, and 0 through 7. For x, an inappropriate character is any character
except +, -, 0 through 9, and the characters a through f and A through F.
Note that negative octal and hexadecimal values are stored in their two's
complement form with sign extension. Thus, they might look unfamiliar if
you print them out later using printf().
These integer conversion characters can be preceded by an l to indicate
that a long int should be expected rather than an int. They can also be
preceded by h to indicate a short int. The corresponding items in the
item list for these conversion characters must be pointers to integer
variables of the appropriate length.
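A brief sketch of the size prefixes follows. The helper name read_sized is hypothetical; the hexadecimal value is stored through a pointer to unsigned int, which is the type ANSI C specifies for x.

```c
#include <stdio.h>

/* Hypothetical helper: read a short (decimal), a long (decimal) and
   an unsigned int (hexadecimal). Each prefix must agree with the
   type of the variable whose address is passed. */
int read_sized(const char *text, short *s, long *l, unsigned int *x)
{
    return sscanf(text, "%hd %ld %x", s, l, x);
}
```

Passing a pointer whose type does not agree with the prefix (for example, the address of an int for %ld) is an error that the compiler may not catch.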
Character Conversion Characters
The c conversion character reads the next character from stdin, no matter
what that character is. The corresponding item in the item list must be
a pointer to a character variable. If a field width is specified, the
number of characters indicated by the field width are read. In this
case, the corresponding item must refer to a character array large enough
to hold the characters read.
Note that strings read using the c conversion character are not
automatically terminated with a null character in the array. Because all
C library functions that use strings assume the existence of a null
terminator, be sure to add the '\0' character yourself. If you do not,
library functions are not able to tell where the string ends and you will
get unexpected results.
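Adding the terminator by hand might look like the sketch below. The helper name read_four is hypothetical, and sscanf() stands in for scanf() so the input is fixed.

```c
#include <stdio.h>

/* Hypothetical helper: copy exactly four characters into out (which
   must hold at least five bytes) and terminate the result by hand,
   because "%4c" stores no null character.
   Returns 1 on success, 0 otherwise. */
int read_four(const char *text, char out[5])
{
    if (sscanf(text, "%4c", out) != 1)
        return 0;
    out[4] = '\0';
    return 1;
}
```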
The s conversion character reads a character string from stdin which is
delimited by one or more space characters (blanks, tabs, or newlines).
If no field width is given, the input string consists of all characters
from the first nonspace character up to (but not including) the first
space character. Any initial space characters are skipped over. If a
field width is given, characters are read, beginning with the first
nonspace character, up to the first space character, or until the number
of characters specified by the field width is reached (whichever comes
first). The corresponding item in the item list must refer to a
character array large enough to hold the characters read, plus a
terminating null character which is added automatically.
The s conversion character cannot be made to read a space character as
part of a string. Space characters are always skipped over at the
beginning of a string, and they terminate reading whenever they occur in
the string. For example, suppose you want to read the first character
from the following input line:
"          Hello, there!"
(Ten spaces followed by "Hello, there!"; the double quotes are added for
clarity). If you use %c, you get a space character. However, if you use
%1s, you get "H" (the first nonspace character in the input).
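The difference between %c and %1s can be sketched as follows. The helper name first_char_two_ways is hypothetical; it scans the same text twice with sscanf().

```c
#include <stdio.h>

/* Hypothetical helper: fetch the first character of text two ways.
   *raw receives it through "%c" (no white-space skipping); word
   receives the result of "%1s" (leading white space skipped and a
   null character added). Returns 1 if both reads succeed. */
int first_char_two_ways(const char *text, char *raw, char word[2])
{
    int got_raw  = sscanf(text, "%c", raw);
    int got_word = sscanf(text, "%1s", word);
    return got_raw == 1 && got_word == 1;
}
```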
The [ conversion character also reads a character string from stdin.
However, you should use this character when a string is not to be
delimited by space characters. The left bracket is followed by a list of
characters, and is terminated by a right bracket. If the first character
after the left bracket is a circumflex (^), characters are read from
stdin until a character is read which matches one of the characters
between the brackets. If the first character is not a circumflex,
characters are read from stdin until a character not occurring between
the brackets is found. The corresponding item in the item list must
refer to a character array large enough to hold the characters read, plus
a terminating null character which is added automatically. In some
implementations, a minus sign (-) may specify a range of characters.
The three string conversion characters provide you with a complete set of
string-reading capabilities. The c conversion character can be used to
read any single character, or to read a character string when the exact
number of characters in the string is known beforehand. The s conversion
character enables you to read any character string which is delimited by
space characters, and is of unknown length. Finally, the [ conversion
character enables you to read character strings that are delimited by
characters other than space characters, and which are of unknown length.
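As a sketch of the [ conversion character, the hypothetical helper below splits text of the form key=digits. The field widths are an addition made here to keep the reads within the arrays; the bracket lists spell out every character rather than relying on ranges.

```c
#include <stdio.h>

/* Hypothetical helper: split "key=digits" text. "%15[^=]" reads up to
   15 characters that are not '=', the literal "=" matches the equals
   sign, and "%15[0123456789]" reads up to 15 digits. Both arrays
   need 16 bytes to leave room for the terminating null character. */
int split_key_digits(const char *text, char key[16], char digits[16])
{
    return sscanf(text, "%15[^=]=%15[0123456789]", key, digits);
}
```

Because [ does not skip or stop at blanks, the key may contain spaces, as in "max size=640K"; the trailing K is left unread.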
Floating-Point Conversion Characters
The e, f, and g (or E, F, and G, respectively) conversion characters read
characters from stdin until an inappropriate character is encountered, or
until the number of characters specified by the field width, if given, is
exhausted (whichever comes first).
The e, f, and g characters expect data in the following form: an
optionally signed string of digits (possibly containing a decimal point),
followed by an optional exponent field consisting of an E or e followed
by an optionally signed integer. Thus, an inappropriate character is any
character except +, -, ., 0 through 9, E, or e.
These floating-point conversion characters may be preceded by a lowercase
L (l), to indicate that a double value is expected rather than a float,
or by an uppercase L (in ANSI C) to indicate that a long double value is
expected rather than a float. The corresponding items in the item list
for these conversion characters must be pointers to floating-point
variables of the appropriate length.
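The sketch below reads one float and one double. The helper name read_float_double is hypothetical, and sscanf() is used so the input is fixed.

```c
#include <stdio.h>

/* Hypothetical helper: read a float with "%f" and a double with
   "%lf". The l prefix must be present exactly when the pointer
   refers to a double. */
int read_float_double(const char *text, float *f, double *d)
{
    return sscanf(text, "%f %lf", f, d);
}
```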
Literal Characters
Any characters included in the format which are not part of a conversion
specification are literal characters. A literal character is expected to
occur in the input at exactly that point. Note that since the percent
sign is used to introduce a conversion specification, you must type two
percent signs (%%) to get a literal percent sign.
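A sketch of literal matching, including a literal percent sign, follows. The helper name read_percent and the "rate:" label are hypothetical. Because a literal that follows the last conversion does not affect the return value, the sketch adds %n after it to confirm that the percent sign was actually matched.

```c
#include <stdio.h>

/* Hypothetical helper: parse text such as "rate: 50%".
   Returns 1 only if the label, the number and the trailing percent
   sign were all matched; otherwise returns 0. */
int read_percent(const char *text, int *value)
{
    int end = 0;   /* stays 0 unless "%n" is reached */

    if (sscanf(text, "rate: %d%%%n", value, &end) != 1)
        return 0;
    return end > 0;
}
```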
Suppose that you want to read the following line of data:
NAME: Joe Kool; AGE: 27; PROF: Elec Engr; SAL: 39550
To get the vital data, you must read two strings (containing spaces) and
two integers. You also have data that should be ignored, such as the
semicolons and the identifying strings ("NAME:"). To read the strings,
first note that the identifying strings are always delimited by space
characters. This suggests use of the s conversion character to read
them. Second, you can never know the exact sizes of the NAME and PROF
fields, but note that they are both terminated by a semicolon. Thus, you
can use [ to read them. Finally, the d conversion character can be used
to read both integers. (Note: On 16-bit processors, you probably need
to use a long int to read the salaries. Thus, ld should be used instead
of d.)
The following code fragment successfully reads this data:
char name[40], prof[40];
int age;
long int salary;
.
.
scanf("%*s%*[ ]%[^;]%*c%*s%d%*c%*s%*[ ]%[^;]%*c%*s%ld",name,&age,
prof,&salary);
For easier understanding, break the format into pieces:
%*s This reads the string "NAME:". Since an asterisk
is given the string is simply read and discarded.
%*[ ] This removes all blanks occurring between "NAME:"
and the employee's name. Note that this removes
one or more blanks, giving the format some
flexibility.
%[^;] This reads all characters from the current
character up to a semicolon, and assigns the
characters to the array name.
%*c This removes the semicolon left over after reading
the name.
%*s This reads the next identifying string, "AGE:", and
discards it.
%d This reads the integer age given, and assigns it to
age. The semicolon after the age terminates %d,
because that character is not appropriate for an
integer value. Note that the address of age is
given in the item list (&age) instead of the
variable name itself. If this is not done, a
memory fault occurs at runtime due to the attempt
of scanf() to use the parameter as a pointer.
%*c This removes the semicolon following the age.
%*s This reads the next identifying string, "PROF:",
and discards it.
%*[ ] This removes all blanks between "PROF:" and the
next string.
%[^;] This reads all characters up to the next semicolon,
and assigns them to the character array prof.
%*c This removes the semicolon following the profession
string.
%*s This reads the final identifying string, "SAL:",
and discards it.
%ld This reads the final integer and assigns it to the
long integer variable salary. Again, note that the
address of salary is given, not the variable name
itself.
Although somewhat confusing to read, this format is quite flexible,
because it allows for multiple spaces between items and varying
identifying strings (that is, "PROFESSION:" could be specified instead of
"PROF:"). The following scanf() call reads the same data, but is much
less flexible:
scanf("NAME: %[^;]; AGE:%d; PROF: %[^;]; SAL: %ld",name,&age,prof,&salary);
In this example, literal characters are used to exactly match the
characters in the input line. This only works if you can be sure that
the data always appears in this form. However, if a typing variation is
made, such as typing "SALARY:" instead of "SAL:", the scanf() fails.
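The flexible format shown earlier can be wrapped in a function that checks the return value. The sketch below makes two assumptions that are not in the original fragment: the struct employee type and the parse_employee name are hypothetical, and a field width of 39 has been added to each [ conversion so that an overlong field cannot overflow the 40-byte arrays. It uses sscanf() on a line already held in memory.

```c
#include <stdio.h>

/* Hypothetical record type for the sample input line. */
struct employee {
    char name[40];
    int  age;
    char prof[40];
    long salary;
};

/* Hypothetical helper: returns 1 if all four fields were read,
   0 otherwise. */
int parse_employee(const char *line, struct employee *e)
{
    int n = sscanf(line,
                   "%*s%*[ ]%39[^;]%*c%*s%d%*c%*s%*[ ]%39[^;]%*c%*s%ld",
                   e->name, &e->age, e->prof, &e->salary);
    return n == 4;
}
```

Comparing the result with 4 catches lines that are cut short or that contain a non-numeric age or salary.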
The scanf() function waits for more data as long as there are unsatisfied
conversion specifications in the format. Thus, the scanf() call
scanf("%f%f%f", &float1, &float2, &float3);
where float1, float2, and float3 are all variables of type float, allows
you to enter data in several ways. For example,
14.77 29.8 13.0
is read correctly by scanf(), as is
14.77
29.8
13.0
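That both layouts are read identically can be checked with a sketch such as the following. The helper name read_three is hypothetical, and sscanf() is given the text directly so that blanks and newlines can be compared side by side.

```c
#include <stdio.h>

/* Hypothetical helper: read three float values into v. Blanks and
   newlines between the values are skipped in the same way. */
int read_three(const char *text, float v[3])
{
    return sscanf(text, "%f%f%f", &v[0], &v[1], &v[2]);
}
```

Calling it with "14.77 29.8 13.0" and with "14.77\n29.8\n13.0" stores the same three values.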
Using decimal points in floating-point data is recommended whenever
floating-point variables are being read. However, scanf() converts
integer data to floating-point if the conversion specification so
demands. Thus, "13.0" in the previous example could have been entered as
"13" with no side effects.
As a final example, consider the input string:
abcdef137 d14.77ghijklmnop
Suppose the following code fragment is used to read this string:
char arr1[10], arr2[10], arr3[10], arr4[10];
float float1;
scanf("%4c%[^3]%6c%f%[ghijkl]",arr1,arr2,arr3,&float1,arr4);
%4c Reads four characters and assigns them to arr1.
Thus, the string abcd is assigned to arr1. Note
that a null character is not appended to the end of
the string.
%[^3] Reads all characters from the current character up
to the character 3. This assigns ef1, along with
an added null character, to the array arr2.
%6c Reads the next six characters and stores them in
the array arr3. Thus, 37 d14 is assigned to arr3.
A null character is not appended to the end of the
string.
%f Reads a floating-point value which, due to the lack
of a field width, is terminated by the first
inappropriate character. Thus, the value .77 is
assigned to float1.
%[ghijkl] Reads all characters up to the first character not
occurring between the brackets. This stores the
string ghijkl, along with an appended null
character, in the array arr4.
Note that there are some characters left in stdin that were not read.
Any characters left unread in the input remain there, which might cause
unexpected errors.
Suppose that, later in the above program fragment, you want to read a
string from stdin using %s. No matter what string you type in as input,
it will never be read, because the %s conversion specification is
satisfied by reading "mnop" (the characters left over from the previous
read operation). To solve this, be sure you have read the entire current
line of input before attempting to read the next. To fix this in the
previous scanf() example, add %*s%*c to the end of the format (%*s reads
characters up to the next white-space character, which here is the
newline, and %*c reads the newline). This reads and discards the excess
characters. If the rest of the line might contain blanks or tabs,
%*[^\n]%*c is needed instead, because %*s stops at the first white-space
character.
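An alternative to extending the format is to discard the remainder of the line in a separate step. The sketch below swaps in a plain getc() loop for that purpose; it is not the method described above, and the helper name skip_rest_of_line is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: read and discard characters from stream up to
   and including the next newline, or up to end of file. Returns the
   number of characters discarded. */
int skip_rest_of_line(FILE *stream)
{
    int c;
    int count = 0;

    while ((c = getc(stream)) != EOF) {
        count++;
        if (c == '\n')
            break;
    }
    return count;
}
```

Calling skip_rest_of_line(stdin) after a scanf() call leaves the stream positioned at the start of the next input line, whatever the unread characters were.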
You can use a minus character (-) between characters in the match list
inside a [ conversion specifier to indicate a range of characters. For
example, the conversion specifier [A-Z] matches all the characters from A
through Z.
See Also
fscanf(), sscanf(), getc(), setlocale(), printf(), strtod(), strtol(),
ANSI C 4.9.6.4, POSIX.1 8.1