Skip to main content
Version: 2.8

4 Utils


4.1 csv2evd

(since EVL 0.1)

Read <file.csv> or standard input, and guess:

  • data types,
  • field separator (unless option ‘--separator=<char>’ is used),
  • if strings are quoted (unless option ‘--quote=<char>’ or ‘--optional-quote=<char>’ is used),
  • end-of-line character(s) (unless option ‘--dos-eol’ or ‘--lin-eol’ or ‘--mac-eol’ is used)

and write EVD to standard output or to <file.evd>.

It uses header line for field names, spaces are replaced by underscores.

Separator is trying to be guessed in this order: ‘,’ (comma), ‘;’ (semi-colon), ‘|’ (pipe), ‘\t’ (tab), ‘:’ (colon), (space).

Quotation character is guessed in this order: double quotes, single quotes.

EVD is EVL data definition file, for details see man 5 evd.

Synopsis

csv2evd
[<file.csv>] [-o|--output=<file.evd>]
[--inline]
[-d|--date=<format>]
[-h|--header=<field_name>,...]
[-n|--no-header]
[-l|--null=<string>]
[-q|--quote=<char> | --optional-quote=<char>]
[-s|--separator=<char>]
[-t|--datetime=<format>]
[--timestamp=<format>]
[--dos-eol | --lin-eol | --mac-eol]
[-v|--verbose]

csv2evd
( --help | --usage | --version )

Options

-d, --date=<format>
by default it tries only ‘%Y-%m-%d’, then ‘%d.%m.%Y

-h, --header=<field_name>,...
use comma separated list of field names instead of header line, for example when there is no header in csv file (option ‘-n’ must be used) or when other field names would be used

--inline
output EVD in the inline format (for example to use EVD by other component with ‘-d’ option)

--dos-eol
do not guess end-of-line character(s), but suppose the input is text with CRLF as end of line,

--lin-eol
do not guess end-of-line character(s), but suppose the input is text with LF as end of line

--mac-eol
do not guess end-of-line character(s), but suppose the input is text with CR as end of line

-n, --no-header
with this option it suppose there is no header. Fields will be named ‘field_001’, ‘field_002’, etc.

-l, --null=<string>
to specify what string is used for NULL values in CSV, empty string is allowed

-o, --output=<file.evd>
write output into file <file.evd> instead of standard output

--optional-quote=<char>
suppose optional quote character <char>, must be used together with ‘--separator

-q, --quote=<char>
do not guess if fields are quoted, but suppose <char> as quotation character

-s, --separator=<char>
do not guess the separator, but use <char> instead

-t, --datetime=<format>
by default it tries only ‘%Y-%m-%d %H:%M:%S

--timestamp=<format>
by default it tries only ‘%Y-%m-%d %H:%M:%S.%E*f

-v, --verbose
print to STDERR info/debug messages

--help
print this help and exit

--usage
print short usage information and exit

--version
print version and exit

Examples

  1. Having table.csv:

    id;started;value
    1;2019-06-06;some string

    This command:

    csv2evd table.csv

    will try to guess data types, field separator and if strings are quoted or not, and use header line for field names, to produce EVD to standard output:

    id       int              null="" sep=";"
    started date("%Y-%m-%d") null="" sep=";"
    value string null="" sep="\n"
  2. Just an alternative invocation forwording output EVD to a file:

    csv2evd < table.csv > table.evd
  3. To skip header and use different field names:

    csv2evd --header="first_field,other_field,last_one" \
    table.csv > table.evd
  4. Case when there is no header in CSV file, but use specified field names:

    csv2evd --no-header --header="first_field,other_field,last_one" \
    table.csv > table.evd
  5. No header in CSV and use generated field names ‘field_001’, ‘field_002’, etc.:

    csv2evd --no-header table.csv > table.evd
  6. Consider specific date format, here day of year (‘001..366’), and ‘|’ as a field separator:

    csv2evd --date="%j" -s '|' table.csv > table.evd

4.2 csv2qvd

(since EVL 0.1)

Read <file.csv> with sturcture defined either in <evd> file or by <inline_evd> or guess

  • data types,
  • field separator (unless option ‘--separator=<char>’ is used),
  • if strings are quoted (unless option ‘--quote=<char>’ or ‘--optional-quote=<char>’ is used),
  • end-of-line character(s) (unless option ‘--dos-eol’ or ‘--lin-eol’ or ‘--mac-eol’ is used)

and write QVD file to <file.qvd> or standard output. For guessing data types (EVD) it uses utility ‘csv2evd’.

EVD is EVL data definition file, for details see man 5 evd.

Synopsis

csv2qvd
<file.csv>
[-o|--output=<file.qvd>]
[-d|--date=<format>]
[-h|--header=<field_name>,...]
[-n|--no-header]
[-l|--null=<string>]
[-q|--quote=<char> | --optional-quote=<char>]
[-s|--separator=<char>]
[-t|--datetime=<format>]
[--timestamp=<format>]
[--dos-eol | --lin-eol |--mac-eol]
[-v|--verbose]

csv2qvd
<file.csv> (<evd>|-d <inline_evd>)
[-o|--output=<file.qvd>]
[--dos-eol | --lin-eol |--mac-eol]
[-v|--verbose]

csv2qvd
( --help | --usage | --version )

Options

Standard options:

-d, --data-definition=<inline_evd>
either this option or the file <evd> must be presented to use already defined EVD

--dos-eol
do not guess end-of-line character(s), but suppose the input is text with CRLF as end of line,

--lin-eol
do not guess end-of-line character(s), but suppose the input is text with LF as end of line,

--mac-eol
do not guess end-of-line character(s), but suppose the input is text with CR as end of line,

-o, --output=<file.qvd>
write output into <file.qvd> instead of standard output

-v, --verbose
print to STDERR info/debug messages

--help
print this help and exit

--usage
print short usage information and exit

--version
print version and exit

EVD options:

--date=<format>
by default it tries only ‘%Y-%m-%d’, then ‘%d.%m.%Y

-h, --header=<field_name>,...
use comma separated list of field names instead of header line, for example when there is no header in csv file (option ‘-n’ must be used) or when other field names should be used

-n, --no-header
with this option it suppose there is no header. Fields will be named ‘field_001’, ‘field_002’, etc.

-l, --null=<string>
to specify what string is used for NULL values in CSV, empty string is allowed

--optional-quote=<char>
suppose optional quote character <char>, must be used together with ‘--separator

-q, --quote=<char>
do not guess if fields are quoted, but suppose <char> as quotation character

-s, --separator=<char>
do not guess the separator, but use <char> instead

-t, --datetime=<format>
by default it tries only ‘%Y-%m-%d %H:%M:%S

--timestamp=<format>
by default it tries only ‘%Y-%m-%d %H:%M:%S.%E*f

Examples

  1. Having ‘some.csv’:

    id;started;value
    1;2019-06-06;some string

    The command:

    csv2qvd --null="NULL" some.csv > some.qvd

    will produce some.qvd file with these field:

    id       int               null="NULL"  sep=";"
    started date("%Y-%m-%d") null="NULL" sep=";"
    value string null="NULL" sep="\n"
  2. Following invocation will guess data types, field separator and if strings are quoted or not, and use header line for field names:

    csv2qvd table.csv > table.qvd

    With the ‘--verbose’ option it will write to standard error the whole EVD file which was used:

    csv2qvd --verbose table.csv > table.qvd
  3. To skip header and use different field names:

    csv2qvd --header="first_field,other_field,last_one"
    table.csv > table.qvd
  4. Case when there is no header in CSV file, but use specified field names:

    csv2qvd --no-header --header="first_field,other_field,last_one" \
    table.csv > table.qvd
  5. No header in CSV and use generated field names ‘field_001’, ‘field_002’, etc.:

    csv2qvd --no-header table.csv > table.qvd
  6. Consider specific date format, here day of year (‘001..366’), and ‘|’ as a field separator:

    csv2qvd --date="%j" -s '|' table.csv > table.qvd
  7. To use own (specific or already generated) EVD file (i.e. data types definition):

    csv2qvd table.csv table.evd > table.qvd

Another example

Let’s have following sample_1.csv file with header and with semicolon as a field separator:

ID;Name;Code;Price;Created at
1;item_11;11;15,85;2016-05-04
2;item_12;12;21,25;2019-11-11
3;item_13;13;12,99;2019-05-05

Then by running:

csv2qvd -v < sample_1.csv > sample_1.qvd

will guess data types and use field names from the header and produce sample_1.qvd file.

And it will also write to standard error EVL data types which were used:

Used EVD:
---------
ID int null="" sep=";"
Name string null="" sep=";"
Code int null="" sep=";"
Price decimal(8,2) null="" sep=";"
Created_at date null="" sep="\n"
---------

More complex example

If guessed data types are not correct for some reason, for example the date format would be different or field which looks like an integer might be next time a string, then it is better to provide own EVD file with EVL data definition description.

Suppose following sample_2.csv file with header and with semicolon as a field separator:

ID;Name;Code;Price;Created at
1;item_11;11;15,85;05/04/2016
2;item_12;12;21,25;11/11/2019
3;item_13;13;12,99;05/05/2019

Let’s start by guessing EVD by csv2evd utility:

csv2evd sample_2.csv > sample_2.evd

It will result in sample_2.evd:

ID         int          null="" sep=";"
Name string null="" sep=";"
Code int null="" sep=";"
Price decimal(8,2) null="" sep=";"
Created_at string null="" sep="\n"

We can either use option --date="%m/%d/%Y" or simply edit sample_2.evd manually. Both will result in:

ID         int              null="" sep=";"
Name string null="" sep=";"
Code int null="" sep=";"
Price decimal(8,2) null="" sep=";"
Created_at date("%m/%d/%Y") null="" sep="\n"

4.3 qvd2csv

(since EVL 2.4)

Read <file.qvd> and write CSV file to <file.csv> or standard output. It uses data types from QVD header or from existing <evd> file or from <inline_evd>.

EVD is EVL data definition file, for details see man 5 evd.

Synopsis

qvd2csv
<file.qvd>
[-o|--output=<file.csv>]
[--all-as-string | --real-as-decimal[=<precision>,<scale>]]
[-d|--date=<format>]
[-h|--header=<field_name>,...]
[-n|--no-header]
[-l|--null=<string>]
[-q|--quote=<char>]
[-s|--separator=<char>]
[-t|--datetime=<format>]
[-a|--dos-eol | -b|--mac-eol]
[--filter=<condition>]
[--first-record=<n>]
[--guess-uniform-symbol-size]
[--low-memory]
[-v|--verbose]

qvd2csv
<file.qvd> (<evd>|-d <inline_evd>)
[-m|--match-fields]
[-o|--output=<file.csv>]
[-h|--header=<field_name>,...]
[-n|--no-header]
[-a|--dos-eol | -b|--mac-eol]
[--filter=<condition>]
[--first-record=<n>]
[--guess-uniform-symbol-size]
[--low-memory]
[-v|--verbose]

qvd2csv
( --help | --usage | --version )

Options

--all-as-string
interpret all fields as strings. (Since EVL 2.5.)

-d, --data-definition=<inline_evd>
either this option or the file <evd> must be presented to use already defined (custom) EVD

-a, --dos-eol
output DOS end-of-line, i.e. CR+LF (‘\r\n’)

-b, --mac-eol
output Mac end-of-line, i.e. CR (‘\r’)

--date=<format>
to specify a <format> for date data type

--filter=<condition>
read only records with given <condition>. (Since EVL 2.6.)

--first-record=<n>
start to read from the record number <n>. (Since EVL 2.6.)

--guess-uniform-symbol-size
might speed up indexing of dictionary, but it could not work in all cases. Use only in special cases when need really good performance. (Since EVL 2.6.)

-h, --header=<field_name>,...
use comma separated list of field names instead of header line, for example when you don’t want to use field names from QVD header.

--low-memory
do not read dictionary into memory. This could save memory consumption, but slows down reading the source file. (Since EVL 2.6.)

-l, --null=<string>
to specify what string is used for NULL values in CSV, empty string is allowed

-m, --match-fields
to read only a subset of fields from QVD file or to read them in different order

-n, --no-header
with this option it produces no header line

-o, --output=<file.csv>
write output into <file.csv> instead of standard output

-q, --quote=<char>
to use quoted fields for the CSV output. When data contains such <char>, all of them are escaped by duplicating them. For example using ‘--quote="\""’ will serve data like ‘some "text"’ as ‘"some ""text"""’.

--real-as-decimal[=<precision>,<scale>]
interpret ‘real’ data types as ‘decimal(<precision>,<scale>)’. When no <precision> or <scale> is specified, use values from environment variables ‘EVL_DEFAULT_DECIMAL_PRECISION’ and ‘EVL_DEFAULT_DECIMAL_SCALE’, which are by default set to 18 and 2. (Since EVL 2.5.)

-s, --separator=<char>
to use <char> as field separator for the CSV output

-t, --datetime=<format>
to specify a <format> for datetime data type

-v, --verbose
print to standard error output info/debug messages

--help
print this help and exit

--usage
print short usage information and exit

--version
print version and exit

Examples

  1. Having ‘some.qvd’, the command to produce CSV file with empty strings representing NULL values, dates in format ‘DD.MM.YYYY’ and with Windows end-of-line (i.e. CRLF):

    qvd2csv --null="" --date="%d.%m.%Y" --dos-eol some.qvd > some.csv
  2. To filter only particular records from ‘large.qvd’, for example we would like to read only latest records represented by field ‘invoice_id’:

    qvd2csv --filter="invoice_id>7654000" large.qvd > latest.csv
  3. To cut only particular columns from ‘large.qvd’, for example only column ‘invoice_id’:

    qvd2csv --match-fields -d 'invoice_id int null=""' large.qvd > latest.csv
  4. To read only after by some number of rows:

    qvd2csv --first-record=1234000 huge.qvd > latest.csv

    This could be quite useful when reading a huge QVD file.


4.4 qvd2evd

(since EVL 2.4)

Read header of <file.qvd> or standard input, guess data types, and write EVD to standard output or to <file.evd>.

EVD is EVL data definition file, for details see man 5 evd.

Synopsis

qvd2evd
[<file.qvd>] [-o|--output=<file.evd>]
[--all-as-string | --real-as-decimal[=<precision>,<scale>]]
[-d|--date=<format>]
[--inline]
[-l|--null[=<string>]]
[-q|--quote=<char>]
[-r|--record-separator=<char>]
[-s|--field-separator=<char>]
[-t|--datetime=<format>]
[-v|--verbose]

qvd2evd
( --help | --usage | --version )

Options

--all-as-string
produce EVD with all fields as strings. (Since EVL 2.5.)

-d, --date=<date_format>
use format argument for date data type

--inline
output EVD in the inline format (for example to use EVD by other component with ‘-d’ option)

-l, --null=<string>
to specify what string is used for NULL values in QVD, empty string is allowed

-o, --output=<file.evd>
write output into file <file.evd> instead of standard output

-q, --quote=<char>
to use a quote argument in EVD

--real-as-decimal[=<precision>,<scale>]
produce EVD with ‘decimal(<precision>,<scale>)’ instead of ‘double’. When no <precision> or <scale> is specified, it uses values from environment variables ‘EVL_DEFAULT_DECIMAL_PRECISION’ and ‘EVL_DEFAULT_DECIMAL_SCALE’, which are by default set to 18 and 2. (Since EVL 2.5.)

-r, --record-separator=<char>
use ‘sep="<char>"’ for last field

-s, --field-separator=<char>
add ‘sep="<char>"’ to each field, except the last one

-t, --datetime=<format>
use format for datetime data type by default it produces no format for datetime, so the EVL_DEFAULT_DATETIME_PATTERN is then used (which is by default set to ‘"%Y-%m-%d %H:%M:%S"’)

-v, --verbose
print to STDERR info/debug messages

--help
print this help and exit

--usage
print short usage information and exit

--version
print version and exit

Examples

  1. Having ‘some.qvd’, this command:

    qvd2evd --null -r '\n' -s ';' -d '%d.%m.%Y' some.qvd

    will produce:

    id       int              null="" sep=";"
    started date("%d.%m.%Y") null="" sep=";"
    value string null="" sep="\n"

4.5 qvd-header

(since EVL 0.2)

Take the header of <file.qvd> or standard input and produce to standard output particular information, for example EVL data definition file or number of records.

Synopsis

qvd-header
[<file.qvd>] --output=evd
[--all-as-string | --real-as-decimal[=<precision>,<scale>]]
[-d|--date=<format>]
[--inline]
[-l|--null=<string>]
[-q|--quote=<char>]
[-r|--record-separator=<char>]
[-s|--field-separator=<char>]
[-t|--datetime=<format>]
[-v|--verbose]

qvd-header
[<file.qvd>] --output=(json|xml)
[--fields]
[-v|--verbose]

qvd-header
[<file.qvd>]
[ --table-name | --no-of-records | --fields | --tag=<xml_tag_name> ]
[-v|--verbose]

qvd-header
( --help | --usage | --version )

Options

--no-of-records
return the value of ‘NoOfRecords’ tag

--fields
provide only fields’ information

--table-name
return the value of ‘TableName

--tag=<xml_tag_name>
return the value of <xml_tag_name>

Output Options:

--output=evd
return EVD data types definition

--output=json
return information as JSON

--output=xml
return information as XML

EVD options:

--all-as-string
produce EVD with all fields as strings. (Since EVL 2.5.)

-d, --date=<date_format>
use format argument for date data type

--inline
output EVD, XML or JSON in the inline format (for example to use EVD by other component with ‘-d’ option)

-l, --null=<string>
to specify what string is used for NULL values in EVD

-q, --quote=<char>
to use a quote argument in EVD

--real-as-decimal[=<precision>,<scale>]
produce EVD with ‘decimal(<precision>,<scale>)’ instead of ‘double’. When no <precision> or <scale> is specified, it uses values from environment variables ‘EVL_DEFAULT_DECIMAL_PRECISION’ and ‘EVL_DEFAULT_DECIMAL_SCALE’, which are by default set to 18 and 2. (Since EVL 2.5.)

-r, --record-separator=<char>
use ‘sep="<char>"’ for last field

-s, --field-separator=<char>
use ‘sep="<char>"’ for each field

-t, --datetime=<date_format>
use format for datetime data type

Standard options:

--help
print this help and exit

--usage
print short usage information and exit

-v, --verbose
print to stderr info/debug messages of the component

--version
print version and exit

Examples

  1. EVD example:

    qvd-header some.qvd --output=evd --record-separator="\n" \
    --null="" --date="%Y-%m-%d"

    will produce for example:

    id          int              null=""
    some_stamp datetime null=""
    some_date date("%Y-%m-%d") null=""
    value string null="" sep="\n"
  2. JSON example:

    qvd-header some.qvd --output=json --fields

    will produce for example:

    {
    "fields":
    [
    {
    name: "REQUEST_HOUR",
    type: "timestamp",
    format: "%Y-%m-%d"
    },
    ...
    ]
    }

    And:

    qvd-header some.qvd --output=json

    will produce for example:

    {
    "name": "Table1",
    "records": 3615,
    "fields":
    [
    {
    name: "REQUEST_HOUR",
    type: "timestamp",
    format: "%Y-%m-%d",
    "tags": [
    "$numeric",
    "$timestamp"
    ]
    },
    ...
    ]
    }