Dfill - Filling empty fields from the preceding records

SYNOPSIS

Dfill [ options ] field-list [ input-file.. ]

DESCRIPTION

Dfill fills empty fields listed in the field-list with values from the preceding non-empty records. Dfill is usually, though not necessarily, used with DfromHtml or DfromCsv to process tables. In the following examples, D-records are shown in their corresponding table forms.

In some tables, an empty field means ditto. For example:


year  month  rate
2008  10     100.73
      11     97.11
      12     91.44
2009  01     90.31
      02     92.51
      03     97.71

Result of Dfill year
year  month  rate
2008  10     100.73
2008  11     97.11
2008  12     91.44
2009  01     90.31
2009  02     92.51
2009  03     97.71

The first table shows rates from October 2008 to March 2009. In the corresponding D-records, the second and third records should be filled with year:2008, and the fifth and sixth records with year:2009.

Dfill year

does this operation.
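
The fill-down behaviour described above can be modelled in a few lines of Python. This is only an illustrative sketch, not Dfill's implementation: it assumes each D-record can be represented as a dict from field name to string value, the function name fill_down is hypothetical, and the output field order is not modelled.

```python
def fill_down(records, field):
    """Copy the last non-empty value of `field` into records that lack it.

    Assumption: a D-record is modelled as a dict {field name: string value}.
    """
    last = None
    filled = []
    for rec in records:
        rec = dict(rec)  # work on a copy; the caller's records stay untouched
        if rec.get(field, "") == "":
            if last is not None:
                rec[field] = last  # empty: take the value from a preceding record
        else:
            last = rec[field]      # non-empty: remember it for later records
        filled.append(rec)
    return filled


rows = [
    {"year": "2008", "month": "10", "rate": "100.73"},
    {"month": "11", "rate": "97.11"},
    {"month": "12", "rate": "91.44"},
    {"year": "2009", "month": "01", "rate": "90.31"},
    {"month": "02", "rate": "92.51"},
    {"month": "03", "rate": "97.71"},
]
result = fill_down(rows, "year")
print([r["year"] for r in result])
# ['2008', '2008', '2008', '2009', '2009', '2009']
```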

Empty field

An empty field here means either that the record has no such field (null value), or that the field value is a null string (when the field is repeated, all of its values must be null strings). With the -b option, however, only null values are regarded as empty.
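
The emptiness rule can be written down as a small predicate. The sketch below is an illustration under stated assumptions, not Dfill's code: a record is modelled as a dict, a repeated field is modelled as a list of strings, and the names is_empty and blank_is_data (standing for the -b option) are hypothetical.

```python
def is_empty(record, field, blank_is_data=False):
    """Return True if `field` counts as empty in `record`.

    Assumptions: `record` is a dict; a repeated field is a list of strings;
    blank_is_data=True stands for the -b option.
    """
    values = record.get(field)
    if values is None:
        return True            # no such field (null value)
    if blank_is_data:
        return False           # with -b, a null string is data
    if isinstance(values, str):
        values = [values]      # treat a single value like a one-element list
    return all(v == "" for v in values)  # every value must be a null string


print(is_empty({"month": "11"}, "year"))                    # True
print(is_empty({"year": ""}, "year"))                       # True
print(is_empty({"year": ""}, "year", blank_is_data=True))   # False
print(is_empty({"year": ["", "2008"]}, "year"))             # False
```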

The field-list may contain more than one field. In that case, by default, the order of the fields in the list defines a hierarchy: a change in the value of a superior field cancels the remembered values of its subordinate fields. For example:


year  month  currency  rate
2008  11     EUR       0.785
             JPY       97.11
      12     EUR       0.743
             JPY       91.44
2009         EUR       0.713
             JPY       93.17

Result of Dfill year,month
year  month  currency  rate
2008  11     EUR       0.785
2008  11     JPY       97.11
2008  12     EUR       0.743
2008  12     JPY       91.44
2009         EUR       0.713
2009         JPY       93.17

Here, given the field-list year,month, the month field of the fifth record is left empty (whatever that means), because the change of year cancels the month value. The fourth record is filled with year:2008 and month:12, and the sixth record is filled with year:2009. (Of course, the second record is filled with year:2008 and month:11, and the third record with year:2008.)

If you do not want hierarchical filling and want to fill the fields independently, run Dfill in tandem, giving each a single field as its field-list.
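
The hierarchical rule can be sketched as follows. This is a model of the behaviour described here, not Dfill's implementation: records are assumed to be dicts, the name fill_hierarchical is hypothetical, and "change" is taken to mean that a non-empty value differs from the last remembered value of that field.

```python
def fill_hierarchical(records, fields):
    """Fill `fields` in order; a change in a superior field cancels subordinates.

    Assumption: a D-record is modelled as a dict {field name: string value}.
    """
    last = {}   # last non-empty value seen for each listed field
    out = []
    for rec in records:
        rec = dict(rec)
        for i, field in enumerate(fields):
            value = rec.get(field, "")
            if value == "":
                if field in last:
                    rec[field] = last[field]   # fill from a preceding record
            else:
                if last.get(field) != value:
                    # superior value changed: forget subordinate values
                    for sub in fields[i + 1:]:
                        last.pop(sub, None)
                last[field] = value
        out.append(rec)
    return out


rows = [
    {"year": "2008", "month": "11", "currency": "EUR", "rate": "0.785"},
    {"currency": "JPY", "rate": "97.11"},
    {"month": "12", "currency": "EUR", "rate": "0.743"},
    {"currency": "JPY", "rate": "91.44"},
    {"year": "2009", "currency": "EUR", "rate": "0.713"},
    {"currency": "JPY", "rate": "93.17"},
]
result = fill_hierarchical(rows, ["year", "month"])
print([(r.get("year", ""), r.get("month", "")) for r in result])
# [('2008', '11'), ('2008', '11'), ('2008', '12'), ('2008', '12'), ('2009', ''), ('2009', '')]
```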

Dfill field-a input-file | Dfill field-b
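
To see how independent filling differs from the hierarchical default, the pipeline can be modelled with chained Python generators. This is a hedged sketch: records are assumed to be dicts, fill_stage is a hypothetical name standing for one Dfill process with a single-field field-list, and the result is what the description implies rather than captured Dfill output.

```python
def fill_stage(records, field):
    """One pipeline stage: fill `field` from the preceding non-empty record.

    Assumption: a D-record is modelled as a dict {field name: string value}.
    """
    last = None
    for rec in records:
        rec = dict(rec)
        if rec.get(field, "") != "":
            last = rec[field]      # remember the latest non-empty value
        elif last is not None:
            rec[field] = last      # fill the empty field
        yield rec


rows = [
    {"year": "2008", "month": "11", "currency": "EUR", "rate": "0.785"},
    {"currency": "JPY", "rate": "97.11"},
    {"month": "12", "currency": "EUR", "rate": "0.743"},
    {"currency": "JPY", "rate": "91.44"},
    {"year": "2009", "currency": "EUR", "rate": "0.713"},
    {"currency": "JPY", "rate": "93.17"},
]
# Models: Dfill year input-file | Dfill month
piped = list(fill_stage(fill_stage(rows, "year"), "month"))
print([(r["year"], r["month"]) for r in piped])
# [('2008', '11'), ('2008', '11'), ('2008', '12'), ('2008', '12'), ('2009', '12'), ('2009', '12')]
```

In this model the fifth and sixth records receive month:12, because the month stage knows nothing about the change of year.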

Aggregate fields

Another use of the field-list is for aggregate value fields, selected with the -a option. With this option, a change in any of the listed fields is regarded as a new value. For example:

ISBN        ISSN       tag          value
            0028-0836  title        Nature
                       description  v. 1- Nov. 4, 1869-
                       publisher    [London, etc., Macmillan Journals Ltd., etc.]
0131101633             title        The C programming language
                       author       Brian W. Kernighan, Dennis M. Ritchie
                       publisher    Englewood Cliffs, N.J. : Prentice-Hall, c1978

This table will be filled with Dfill -a ISBN,ISSN as follows.

Result of Dfill -a ISBN,ISSN
ISBN        ISSN       tag          value
            0028-0836  title        Nature
            0028-0836  description  v. 1- Nov. 4, 1869-
            0028-0836  publisher    [London, etc., Macmillan Journals Ltd., etc.]
0131101633             title        The C programming language
0131101633             author       Brian W. Kernighan, Dennis M. Ritchie
0131101633             publisher    Englewood Cliffs, N.J. : Prentice-Hall, c1978
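
One way to read the aggregate behaviour is that the listed fields form a single combined value: a record carrying any of them starts a new value, and a record carrying none of them receives the current one. The sketch below follows that reading, which is an assumption drawn from this example and not a statement of how Dfill is implemented. Records are modelled as dicts and the name fill_aggregate is hypothetical.

```python
def fill_aggregate(records, fields):
    """Treat `fields` as one aggregate value and fill it into empty records.

    Assumptions: a D-record is a dict; a record with any listed field
    non-empty defines a new aggregate value.
    """
    current = {}   # the aggregate value currently in effect
    out = []
    for rec in records:
        rec = dict(rec)
        present = {f: rec[f] for f in fields if rec.get(f, "") != ""}
        if present:
            current = present      # any listed field present: new value
        else:
            rec.update(current)    # all listed fields empty: fill them
        out.append(rec)
    return out


rows = [
    {"ISSN": "0028-0836", "tag": "title", "value": "Nature"},
    {"tag": "description", "value": "v. 1- Nov. 4, 1869-"},
    {"tag": "publisher", "value": "[London, etc., Macmillan Journals Ltd., etc.]"},
    {"ISBN": "0131101633", "tag": "title", "value": "The C programming language"},
    {"tag": "author", "value": "Brian W. Kernighan, Dennis M. Ritchie"},
    {"tag": "publisher", "value": "Englewood Cliffs, N.J. : Prentice-Hall, c1978"},
]
result = fill_aggregate(rows, ["ISBN", "ISSN"])
print([(r.get("ISBN", ""), r.get("ISSN", "")) for r in result])
```

Under this reading the last three records receive only the ISBN, and the ISSN of the first group is not carried over, which matches the result table.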

Output field order

The field order of an output record follows the shortest common supersequence of the two records involved: the record that provides the filling field value and the record to be filled. However, when the records have too many fields (either record has more than 512 fields, or the product of the two field counts is greater than 65536), the filling fields come first and the other fields of the filled record follow.
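
For readers unfamiliar with the term, a shortest common supersequence of two field-name lists can be computed with the standard dynamic-programming approach based on the longest common subsequence. The sketch below is illustrative only: the function names are hypothetical, field-name lists stand in for records, and the way Dfill breaks ties between equally short orders is not documented here, so the tie-break chosen below is an assumption.

```python
def shortest_common_supersequence(a, b):
    """Return one shortest list containing both `a` and `b` as subsequences."""
    a, b = list(a), list(b)
    n, m = len(a), len(b)
    # lcs[i][j] = length of the longest common subsequence of a[i:] and b[j:]
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                lcs[i][j] = 1 + lcs[i + 1][j + 1]
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])
    merged, i, j = [], 0, 0
    while i < n and j < m:
        if a[i] == b[j]:
            merged.append(a[i])        # shared field: emit once
            i += 1
            j += 1
        elif lcs[i + 1][j] >= lcs[i][j + 1]:
            merged.append(a[i])        # tie-break (assumption): prefer `a`
            i += 1
        else:
            merged.append(b[j])
            j += 1
    return merged + a[i:] + b[j:]


def output_field_order(source_fields, target_fields, filled_fields):
    """Model of the ordering rule, including the size limits given in the text."""
    n, m = len(source_fields), len(target_fields)
    if n > 512 or m > 512 or n * m > 65536:
        # too big: filling fields first, then the other fields of the filled record
        rest = [f for f in target_fields if f not in filled_fields]
        return list(filled_fields) + rest
    return shortest_common_supersequence(source_fields, target_fields)


order = output_field_order(["year", "month", "rate"], ["month", "rate"], ["year"])
print(order)  # ['year', 'month', 'rate']
```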

OPTIONS

-a
aggregate; a change in any of the listed fields is regarded as a new value. See Aggregate fields in the DESCRIPTION.
-b
blank is data; a field whose value is a null string is not regarded as empty. See Empty field in the DESCRIPTION.
-D [i/o]datautf=8|16|32
UTF I/O feature (see the manual page of the UTF I/O feature).

ENVIRONMENT

Ddatautf, Didatautf, Dodatautf
for UTF I/O feature.

DIAGNOSTICS

See the manual of D_msg.

SEE ALSO

Dintro, DfromHtml, DfromCsv, D_msg.

AUTHOR

MIYAZAWA Akira


miyazawa@nii.ac.jp
2011