DESCRIPTION
Dfd
reads D-records from the
input-file
and creates a set of field description records,
each of which reports the name and attributes of a field
in the input file.
Typically, the output of
Dfd
is piped to
Dpr
to read the result.
By default,
Dfd
writes a set of field description records
at the end of each
input-file.
Every field in the
input-file
is described by a field description record.
When the
-g
option is specified,
Dfd
creates field description records
for each group of D-records that have the same key value.
The group-by key fields themselves are not described
by field description records.
Only the fields that occur in the current group
are included in its field description records.
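The grouping rule can be pictured with a small sketch.
This is not the Dfd implementation; it only illustrates the two statements above
(key fields are not described, and only fields occurring in a group are listed).
It assumes that a D-record can be modeled as a list of (name, value) pairs,
and that all records with an equal key value form one group;
whether Dfd groups only adjacent records is not stated here.
The function name fields_by_group is hypothetical.

```python
def fields_by_group(records, key_fields):
    """Map each key value to the non-key field names seen in that group.

    records    -- list of records, each a list of (name, value) pairs
    key_fields -- set of field names used as the group-by key
    Assumption: records with an equal key form one group, adjacent or not.
    """
    groups = {}  # dicts keep insertion order, so groups appear as first seen
    for rec in records:
        key = tuple(value for name, value in rec if name in key_fields)
        names = groups.setdefault(key, [])
        for name, _ in rec:
            # Key fields are skipped: they get no description record.
            if name not in key_fields and name not in names:
                names.append(name)
    return groups


records = [
    [("dept", "A"), ("name", "x")],
    [("dept", "B"), ("name", "y"), ("tel", "1")],
    [("dept", "A"), ("age", "3")],
]
result = fields_by_group(records, {"dept"})
```

In this sample, the group for "A" lists only "name" and "age",
while "tel" is listed only for the group "B".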
The order of the records written by
Dfd
follows the field order of the input D-records.
If all input records have the same field order
(for example "a", "b", "c"),
the output records are in that sequence
(the first record for field "a",
the second record for field "b", and then "c").
When the field order varies among input records,
Dfd
tries to find the "natural" sequence.
OUTPUT RECORD
Each output record has the following fields, in this order.
- filename:
- input file name as given in the command argument,
after globbing by the shell.
This field does not exist when the input file is the standard input.
- group by key fields
- when the
-g
option is given;
fields listed in
key-field-list;
values may be altered depending on key flags.
- fieldname:
- name of the concerned field.
- min:
- minimum number of occurrences of the concerned field in a record.
If this value is zero,
one or more records lack the concerned field.
- max:
- maximum number of occurrences of the concerned field in a record.
If this value is greater than one,
the concerned field is a repeating field.
- exists:
- the value of this field consists of two numbers
separated by "/".
The first is the number of D-records that have
the concerned field,
and the second is the total number of D-records
in the input file.
The two numbers are equal if the
min
value is not zero.
- minlen:
- minimum length (number of characters) of the concerned field.
- maxlen:
- maximum length (number of characters) of the concerned field.
If
minlen
and
maxlen
have the same value,
the concerned field is fixed length.
- avglen:
- average length (number of characters) of the concerned field.
- attribute:
- shows numeric/string class attributes of
the concerned field.
This field exists only when the concerned field
has data (i.e.
maxlen
is not zero).
See the following subsection.
- NAsc:
- extended attribute; shows the character class of the field.
This field exists only when the
-e option is given
and the concerned field contains a non-ASCII character.
The character class depends on the operating system.
See the following subsection.
- position:
- denotes the relative position of the concerned field.
The value "split" means the concerned field appears more than once
in a record, in split positions.
This implies the concerned field may be used in a repeating group.
This field exists only when the concerned field is "split".
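As an illustration of how the counting fields relate to each other,
the following sketch computes min, max, exists, minlen, maxlen and avglen
for records modeled as lists of (name, value) pairs.
It is not the Dfd implementation.
The function name describe_fields is hypothetical,
and it is an assumption that avglen averages over every occurrence of the field
and that lengths count characters as Python's len() does.
The filename, attribute, NAsc and position fields are left out.

```python
def describe_fields(records):
    """Return one description dict per field, in first-seen field order.

    records -- list of records, each a list of (name, value) pairs
    Assumption: avglen is the mean length over all occurrences.
    """
    total = len(records)
    stats = {}  # field name -> per-record occurrence counts and value lengths
    for rec in records:
        counts = {}
        for name, value in rec:
            counts[name] = counts.get(name, 0) + 1
            entry = stats.setdefault(name, {"occ": [], "lens": []})
            entry["lens"].append(len(value))
        for name, n in counts.items():
            stats[name]["occ"].append(n)

    out = []
    for name, entry in stats.items():
        present = len(entry["occ"])  # number of records that have the field
        out.append({
            "fieldname": name,
            # min is zero as soon as one record lacks the field
            "min": min(entry["occ"]) if present == total else 0,
            "max": max(entry["occ"]),
            "exists": f"{present}/{total}",
            "minlen": min(entry["lens"]),
            "maxlen": max(entry["lens"]),
            "avglen": sum(entry["lens"]) / len(entry["lens"]),
        })
    return out


sample = [
    [("a", "1"), ("b", "xy"), ("b", "xyz")],
    [("a", "22")],
]
described = describe_fields(sample)
```

For this sample, field "a" has min 1 and exists "2/2",
while field "b" has min 0, max 2 (a repeating field) and exists "1/2".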
Attribute
Dfd
inspects all the values of the concerned field and
reports a numeric attribute when all of them are regarded as numeric.
Otherwise, it reports a string attribute.
Numeric values may have leading or trailing spaces.
See
Dintro
for details.
The numeric attribute is one of the following:
- Int
- unsigned decimal integer.
- Int-
- signed decimal integer.
- Num
- numeric value including decimal point
or floating point notation (e.g., .3141593e01).
- Hex
- hexadecimal numeric value, i.e. of the form
0x.....
The string attribute has a value consisting of one or more
words from the following list, possibly with a "+" sign.
- Asc
- means the value has ASCII characters.
- NAsc
- means the value has non-ASCII characters.
- Nprt
- means the value has a non-printing character.
- ?
- means the value has a character not defined in
the output character set.
This is not set when the UTF I/O feature
is applied to the output.
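The numeric/string decision described in this subsection can be sketched as follows.
This is a rough approximation under stated assumptions, not the rules Dfd applies;
the exact definition of a numeric value is given in Dintro.
The names classify_value and classify_field are hypothetical.
It is assumed that "signed" means an explicit leading sign,
that a mix of integer and decimal values is reported as the widest class,
and that "non-printing" means an ASCII control character.
The "+" sign and the "?" word are not modeled.

```python
import re

# Assumed patterns; the authoritative definition is in Dintro.
_INT = re.compile(r"\d+")
_SINT = re.compile(r"[+-]\d+")
_HEX = re.compile(r"0[xX][0-9a-fA-F]+")
_NUM = re.compile(r"[+-]?(\d+\.\d*|\.\d+|\d+)([eE][+-]?\d+)?")

_RANK = {"Int": 0, "Int-": 1, "Num": 2}  # assumed widening order


def classify_value(value):
    """Return the numeric class of one value, or None if it is not numeric."""
    v = value.strip(" ")  # leading or trailing spaces are allowed
    if _INT.fullmatch(v):
        return "Int"
    if _SINT.fullmatch(v):
        return "Int-"
    if _HEX.fullmatch(v):
        return "Hex"
    if _NUM.fullmatch(v):
        return "Num"
    return None


def classify_field(values):
    """Return ("numeric", name) or ("string", [words]) for all values of a field."""
    kinds = [classify_value(v) for v in values if v != ""]
    if kinds and all(k == "Hex" for k in kinds):
        return ("numeric", "Hex")
    if kinds and all(k in _RANK for k in kinds):
        return ("numeric", max(kinds, key=_RANK.get))
    text = "".join(values)
    words = []
    if any(ord(c) < 128 for c in text):
        words.append("Asc")
    if any(ord(c) >= 128 for c in text):
        words.append("NAsc")
    if any(ord(c) < 32 or ord(c) == 127 for c in text):
        words.append("Nprt")
    return ("string", words)
```

Under these assumptions, the values " 12" and "7 " give Int,
"1" together with ".3141593e01" gives Num,
and "abc" together with "12" gives a string attribute with the word Asc.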
Extended attribute: NAsc field
The extended attribute, which shows the character class,
depends on the operating system environment.
When the internal code is UCS (Windows, Linux,
and UTF-8 locales under Solaris),
character block names are used as the character class.
There may be one or more NAsc fields,
each of which has a character block name as its value.
The following is an example of NAsc fields.
NAsc:Hiragana
NAsc:CJK Unified Ideographs
Under Solaris, when a non-UTF-8 locale is used,
no NAsc field is given,
because the internal code composition
is no longer open to applications.