DtoCsv [ -p field-list ] [ options ] [ input-file.. ]
DtoCsv converts D-records in the input-files into a csv (Comma Separated Values) file.
Each input D-file is converted into a label line followed by data lines. When more than one input file is given, the lines converted from the second input file directly follow the last line from the first input file. (There is no general way to identify the second or later label lines from the output alone.)
The label line consists of field names separated by the separator character (COMMA by default). Each data line is made from one D-record; its values are taken from the corresponding D-fields and quoted if necessary. Values are quoted with QUOTATION MARKs (") by default. When the D-record does not have a listed field, a null string is assumed. When the D-record has two or more fields of the same name (a repeating field), the values are joined with new line characters to form a single value, which is then quoted.
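The construction of a data line can be sketched in Python as follows. This is only an illustration under stated assumptions, not DtoCsv's actual code: a D-record is modelled as a list of (name, value) pairs, and the helper names quote_if_needed and data_line are hypothetical.

```python
def quote_if_needed(value, sep=","):
    # Quote when the value contains the separator, a quotation mark or a
    # new line, or when it starts or ends with a space.
    if any(ch in value for ch in (sep, '"', "\n")) or value != value.strip(" "):
        return '"' + value.replace('"', '""') + '"'
    return value


def data_line(record, field_names, sep=","):
    # record: list of (name, value) pairs (an assumed stand-in for a D-record)
    values = []
    for name in field_names:
        matches = [v for n, v in record if n == name]
        # A missing field yields "", a repeating field is joined with new lines.
        values.append(quote_if_needed("\n".join(matches), sep))
    return sep.join(values)


record = [("name", "Ann"), ("tel", "111"), ("tel", "222")]
line = data_line(record, ["name", "tel", "mail"])
```

Here the repeating field tel becomes one quoted value that spans two lines, and the absent field mail becomes an empty value at the end of the line.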
Because a D-file has no field name directory and may contain any type of D-record, it is not straightforward to decide which fields should be converted and in which order. For this purpose, DtoCsv runs an inspection phase before it writes any csv lines. In this phase, DtoCsv stores a certain amount of input data and inspects the field names and their order, without producing output. After the inspection phase, DtoCsv writes the label line and the data lines from the stored data, then converts the subsequent D-records from the input file. Thus, fields not found in the inspection phase are simply discarded.
Normally, DtoCsv reads D-records up to 8192 characters (including new line characters) or up to 1024 D-fields during the inspection phase. If the first D-record is unusually large and does not fit in this space, the space is expanded to accommodate at least one D-record.
You can give the irecs parameter with the -D option to specify the minimum number of inspection records. DtoCsv reads at least irecs records in the inspection phase. The default value of irecs is 1.
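The inspection phase described above might be sketched in Python like this. It is a rough model only: the function name inspect, the (name, value) pair representation of a D-record, the way a record's size is estimated, and the exact point at which buffering stops are all assumptions made for illustration, not DtoCsv's real implementation.

```python
from itertools import chain


def inspect(records, max_chars=8192, max_fields=1024, irecs=1):
    """Buffer leading records and collect field names in first-seen order."""
    buffered, names = [], []
    chars = fields = 0
    it = iter(records)
    for rec in it:
        # Assumed size estimate: name, value and two delimiter characters.
        size = sum(len(n) + len(v) + 2 for n, v in rec)
        # Always keep at least one record, and at least irecs records.
        forced = len(buffered) < max(irecs, 1)
        over = chars + size > max_chars or fields + len(rec) > max_fields
        if over and not forced:
            # rec does not fit: hand it back with the rest of the input.
            return names, buffered, chain([rec], it)
        buffered.append(rec)
        chars += size
        fields += len(rec)
        for n, _ in rec:
            if n not in names:
                names.append(n)
    return names, buffered, iter(())


records = [[("a", "1")], [("b", "2")], [("c", "3")]]
# A tiny budget forces the inspection to stop after the first record.
names_default, stored, rest = inspect(records, max_chars=5)
# With irecs=2 the second record is inspected even though it exceeds the budget.
names_two, _, _ = inspect(records, max_chars=5, irecs=2)
```

In this model, names_default contains only a, so fields b and c of the later records would be discarded, whereas raising irecs to 2 also picks up b.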
When a -p field-list option is given, the output csv file consists of only the listed fields, in the given order. There is no inspection phase in this case.
The CSV format specified here is this program's own interpretation. Though the output works with well-known software, there is no guarantee that it works with other applications. For other variations of csv files, DtoLine provides a more flexible (thus less simple) way of conversion.
In both label and data lines, values are separated by the separator character [default is COMMA (,)]. The label line and the following data lines have the same number of values.
When a value contains the separator character [default is COMMA (,)], the quoting character [default is QUOTATION MARK (")], or a new line character at any position, or has a SPACE at its beginning or end, the value is enclosed in quoting characters. The quoting character and the separator can be changed with the -z and -t options.
The quoting character is QUOTATION MARK (") by default; the -z Q option changes it to APOSTROPHE (').
By default, a quoting character within the data is escaped by doubling it. With the -z x option, a preceding REVERSE SOLIDUS (\) is used to escape the quoting character instead.
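The quoting and escaping rules above can be illustrated with a short Python sketch. The function quote_value and its parameters are hypothetical names chosen for this example; in particular, whether DtoCsv also escapes a REVERSE SOLIDUS that already appears in the data is not stated here, so the sketch leaves such characters untouched.

```python
def quote_value(value, sep=",", quote='"', backslash_escape=False):
    # A value needs quoting if it contains the separator, the quoting
    # character or a new line, or if it starts or ends with a space.
    needs_quoting = (
        sep in value
        or quote in value
        or "\n" in value
        or value.startswith(" ")
        or value.endswith(" ")
    )
    if not needs_quoting:
        return value
    if backslash_escape:
        # Models the -z x behaviour: precede the quoting character with "\".
        body = value.replace(quote, "\\" + quote)
    else:
        # Default behaviour: double the quoting character.
        body = value.replace(quote, quote + quote)
    return quote + body + quote


doubled = quote_value('say "hi"')
escaped = quote_value('a"b', backslash_escape=True)
apostrophe = quote_value("it's", quote="'")
```

Passing quote="'" models the -z Q option, and backslash_escape=True models -z x; a plain value such as abc is returned unchanged.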
Simple conversion from the D-file data.d to data.csv; the fields found in the first records, up to 8192 characters, are converted.
DtoCsv data.d > data.csv
When you know that all the fields appear in the first 100 records, you can make sure all fields are included by giving the -D irecs= parameter.
DtoCsv -D irecs=100 data.d > data.csv
Make a csv file only from the fields a, b and c.
DtoCsv -p a,b,c data.d > data.csv
N.B. If you misspell the field names, you may get no output, without any warning.
See the manual of D_msg.
MIYAZAWA Akira