10 Read Components ¶
There is a generic ‘Read’ component (see Read), which reads source file(s) from a location given by the URI scheme and parses them based on the file suffix. It can also read a table based on a URI.
When you need to read and parse a particular file format from an input flow, there are also specialized components which parse that format:
Reading various file formats ¶
- Readevd – parse EVD,
- Readjson – parse JSON,
- Readparquet – read Parquet, a columnar file format,
- Readqvd – read QVD, Qlik’s file format,
- Readxls – parse old-style MS Excel,
- Readxlsx – read MS Excel,
- Readxml – parse XML.
When you need DBMS-specific options to read a table, there are also database-specific read components:
Reading tables and streams ¶
- Readkafka – read Kafka topic,
- Readmysql – read MySQL/MariaDB table,
- Readora – read Oracle table,
- Readpg – read PostgreSQL table,
- Readsqlite – read SQLite table,
- Readtd – read Teradata table.
10.1 Read ¶
(since EVL 1.0)
Reads <source>(s) (a file mask can be specified) and sends the data to output <f_out>.
Multiple <source>s are concatenated.
It automatically parses various file formats: ‘Avro’, ‘json’, ‘Parquet’, ‘QVD’,
‘xls’, ‘xlsx’ and ‘xml’, just based on file suffix.
When a compression suffix such as ‘gz’, ‘tar’, ‘bz2’, ‘zip’ or ‘Z’ is recognized,
the data are decompressed automatically.
In general the <source> has one of these forms:
[scheme:][//[user@]host[:port]]/path/basename[.format][.compression]
[scheme:][//[user@]host[:port]/]database?(table=[schema.]<table>|query=<query>)
When <source> starts with ‘file:’, ‘gdrive:’, ‘sftp:’, ‘hdfs:’, ‘s3:’, ‘gs:’
or ‘smb:’, the appropriate utility is used to get data from that location. If no URI scheme is
present, the local file system is read.
When <source> starts with ‘mysql:’, ‘mssql:’, ‘postgres:’, ‘oracle:’,
‘sqlite:’ or ‘teradata:’, the appropriate component is used to get data from that database.
Besides the options mentioned below, which change the file suffix behaviour, one can use the generic
‘--cmd=<cmd>’ option, which calls ‘echo <source>... | xargs <cmd>’ to obtain the input
for this component. <cmd> can also be a pipeline (that is the reason for xargs). See the examples below
for inspiration.
Read
is to be used in EVS job structure definition file. <f_out> is either output file or flow
name.
evl read
is intended for standalone usage, i.e. to be invoked from the command line and write to standard
output.
EVD is EVL data definition file and EVS defines EVL job structure, for details see evl-evd(5) and evl-evs(5).
URI Scheme for file: ¶
Based on the URI Scheme in the <source>, it calls appropriate utility to get files or
tables.
no scheme, ‘file:’
assumes the local filesystem
‘gdrive:’
calls ‘gdrive’ utility
‘gs:’
calls Google’s ‘gsutil’ utility
‘hdfs:’
calls ‘hdfs dfs’ utility
‘s3:’
calls AWS’s ‘aws s3’ utility
‘sftp:’
calls ‘ssh’ utility
‘smb:’
calls ‘smbclient’ utility
URI Scheme for table: ¶
‘mysql:’
calls Readmysql component to read MySQL/MariaDB table
‘mssql:’
calls Readmssql component to read Microsoft SQL Server table
‘postgres:’
calls Readpg component to read PostgreSQL table
‘oracle:’
calls Readora component to read Oracle table
‘sqlite:’
calls Readsqlite component to read SQLite table
‘teradata:’
calls Readtd component to read Teradata table
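The scheme dispatch in the two tables above can be pictured as a single `case` statement. This is only a sketch of the documented mapping, not EVL's actual implementation; the function name `source_utility` is hypothetical and the sample sources are made up for illustration.

```shell
# Sketch only: maps a <source> to the utility or component named in the
# tables above. The function name source_utility is hypothetical.
source_utility() {
    case "$1" in
        gdrive:*)   echo "gdrive" ;;
        gs:*)       echo "gsutil" ;;
        hdfs:*)     echo "hdfs dfs" ;;
        s3:*)       echo "aws s3" ;;
        sftp:*)     echo "ssh" ;;
        smb:*)      echo "smbclient" ;;
        mysql:*)    echo "Readmysql" ;;
        mssql:*)    echo "Readmssql" ;;
        postgres:*) echo "Readpg" ;;
        oracle:*)   echo "Readora" ;;
        sqlite:*)   echo "Readsqlite" ;;
        teradata:*) echo "Readtd" ;;
        file:*|*)   echo "local filesystem" ;;   # 'file:' or no scheme at all
    esac
}

source_utility "s3://some-bucket/data/sample.json.gz"   # prints: aws s3
source_utility "postgres://localhost/my_db?table=t"     # prints: Readpg
source_utility "/data/sample.csv"                       # prints: local filesystem
```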
Compression: ¶
Compressed file suffix behaviour (applied in the following order):
‘*.tgz’, ‘*.tar.gz’
calls ‘tar -zxO’
‘*.tar.Z’
calls ‘tar -ZxO’
‘*.tar.bz2’
calls ‘tar -jxO’
‘*.tar’
calls ‘tar -xO’
‘*.gz’, ‘*.GZ’, ‘*.Z’, ‘*.zip’, ‘*.bz2’
calls ‘gunzip -c’
‘*.zip’, ‘*.ZIP’
calls ‘unzip -p’
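Because the suffix rules are applied in order, the first matching line wins. The sketch below mirrors the list as a `case` statement, which has the same first-match behaviour. It is an illustration under assumptions, not EVL code: the function name `decompress_cmd` is hypothetical, and the final `cat` fallback for uncompressed files is an assumption.

```shell
# Sketch only: returns the decompression command documented above for a
# given file name. Patterns are tried top to bottom, so '*.tar.gz' wins
# over '*.gz'. The function name decompress_cmd is hypothetical.
decompress_cmd() {
    case "$1" in
        *.tgz|*.tar.gz)             echo "tar -zxO" ;;
        *.tar.Z)                    echo "tar -ZxO" ;;
        *.tar.bz2)                  echo "tar -jxO" ;;
        *.tar)                      echo "tar -xO" ;;
        *.gz|*.GZ|*.Z|*.zip|*.bz2)  echo "gunzip -c" ;;
        *.zip|*.ZIP)                echo "unzip -p" ;;
        *)                          echo "cat" ;;   # assumption: no compression suffix
    esac
}

decompress_cmd example.csv.tar.gz   # prints: tar -zxO
decompress_cmd sample.json.gz       # prints: gunzip -c
decompress_cmd report.ZIP           # prints: unzip -p
```

Note that with this ordering a lowercase ‘.zip’ name is caught by the ‘gunzip -c’ line before the ‘unzip -p’ line is reached; the sketch reproduces the list exactly as documented.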
File Type: ¶
Read component behaves according to the <source> suffix.
Specific file formats suffix behaviour:
‘*.avro’, ‘*.AVRO’
calls ‘evl readavro’
‘*.csv’, ‘*.CSV’, ‘*.txt’, ‘*.TXT’
reads file(s) with the ‘--text-input’ option; an end-of-line character other than the standard Unix one
(‘\n’) can be specified by option ‘--dos-eol’ or ‘--mac-eol’
‘*.json’, ‘*.JSON’
calls ‘evl readjson’
‘*.parquet’, ‘*.parq’, ‘*.PARQUET’, ‘*.PARQ’
calls ‘evl readparquet’
‘*.qvd’, ‘*.QVD’
calls ‘evl readqvd’
‘*.xls’, ‘*.XLS’
calls ‘evl readxls’
‘*.xlsx’, ‘*.XLSX’
calls ‘evl readxlsx’
‘*.xml’, ‘*.XML’
calls ‘evl readxml’
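A source name of the form ‘basename[.format][.compression]’ carries two independent suffixes: the compression suffix is peeled off first, then the format suffix selects the parser. The following sketch shows one way to split them with plain shell parameter expansion. It is not how EVL does it internally; the function name `split_suffixes` and its output layout are hypothetical.

```shell
# Sketch only: splits basename[.format][.compression] into its parts.
# The function name split_suffixes and the output layout are hypothetical.
split_suffixes() {
    ss_name=${1##*/}                      # strip the directory part
    ss_compression=""
    case "$ss_name" in
        *.tgz)
            ss_compression="tgz"
            ss_name=${ss_name%.tgz} ;;
        *.gz|*.GZ|*.Z|*.zip|*.ZIP|*.bz2)
            ss_compression=${ss_name##*.}
            ss_name=${ss_name%.*} ;;
    esac
    case "$ss_name" in
        *.tar)                            # tar may wrap the file as well
            ss_compression="tar${ss_compression:+.$ss_compression}"
            ss_name=${ss_name%.tar} ;;
    esac
    case "$ss_name" in
        *.*) ss_format=${ss_name##*.} ;;
        *)   ss_format="" ;;
    esac
    echo "format=$ss_format compression=$ss_compression"
}

split_suffixes /data/sample.json.gz   # prints: format=json compression=gz
split_suffixes example.csv.tar.gz     # prints: format=csv compression=tar.gz
split_suffixes table.parquet          # prints: format=parquet compression=
```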
Synopsis ¶
Read
<source>... <f_out> (<evd>|-d <inline_evd>)
[--footer=<n>] [--header=<n>] [--cmd=<cmd>]
[<file_type_options>]
[--ignore-suffix] [--allow-missing-file]
[-y|--text-output [--dos-eol | --mac-eol] ]
[-w|--where=<condition>] [--filter=<filter>]
[--validate]
evl read
<source>... (<evd>|-d <inline_evd>)
[--footer=<n>] [--header=<n>] [--cmd=<cmd>]
[<file_type_options>]
[--ignore-suffix] [--allow-missing-file]
[-y|--text-output [--dos-eol | --mac-eol] ]
[-w|--where=<condition>] [--filter=<filter>]
[--validate]
[-v|--verbose]
evl read
( --help | --usage | --version )
Options ¶
--allow-missing-file
don’t fail if <source> doesn’t exist, and produce empty output
-d, --data-definition=<inline_evd>
either this option or the file <evd> must be present. Example:
‘-d "user_name string, user_sum int"’
--filter=<filter>
when the ‘--where’ option is used and the automatic replacement of SQL syntax does not produce
a valid filter, use <filter> when reading file(s)
-f, --footer=<n>
skip last <n> records. With multiple files, skip last <n> records in each of them. Command
‘evl head -n-<n> --skip-parse’ is used for this job.
-h, --header=<n>
skip first <n> records. With multiple files, skip first <n> records in each of them. Command
‘evl tail -<n>+(N+1) --skip-parse’ is used for this job.
--cmd=<cmd>
bash command <cmd> is used to read the <source>s. In that case file suffix recognition is
switched off. See the examples below for inspiration.
--ignore-suffix
ignore <source>’s suffix, act only based on options.
--validate
without this option, no fields are checked against data types. With this option, all output fields
are checked
-w, --where=<condition>
use this where condition instead of reading the whole file/table. When reading a table, the query is
sent to the database with this where condition. When reading a file, the whole file is read and
the evl-filter component is applied right after. For the filter, these SQL operators are replaced
by their C++ counterparts:
- ‘AND’ -> ‘&&’
- ‘OR’ -> ‘||’
- ‘=’ -> ‘==’
- ‘<>’ -> ‘!=’
so one can also use SQL notation to specify a condition. It also removes quotes around field names and replaces single quotes by double quotes for proper string notation:
- ‘"field_name"’ -> ‘field_name’
- ‘'’ -> ‘"’
This can be useful to have the same syntax for files and for tables.
-x, --text-input
treat the input as text, not binary
--dos-eol
treat the input as text with CRLF as end of line
--mac-eol
treat the input as text with CR as end of line
-y, --text-output
write the output as text, not binary
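The operator replacement described for ‘--where’ can be approximated with a few `sed` substitutions. This is a rough sketch of the documented rules, not the code EVL runs; the function name `sql_where_to_filter` is hypothetical, and the sketch does not protect operators that appear inside string literals.

```shell
# Sketch only: approximates the documented SQL -> C++ rewriting of a
# where condition. The function name sql_where_to_filter is hypothetical.
# Order of the substitutions:
#   1. drop double quotes around field names
#   2. turn single quotes into double quotes
#   3. '<>' becomes '!='
#   4. a lone '=' becomes '==' (but '<=', '>=', '!=' and '==' are kept)
#   5. 'AND' becomes '&&', 'OR' becomes '||' (whole words only)
sql_where_to_filter() {
    printf '%s\n' "$1" | sed -E \
        -e 's/"([A-Za-z_][A-Za-z0-9_]*)"/\1/g' \
        -e "s/'/\"/g" \
        -e 's/<>/!=/g' \
        -e 's/(^|[^<>!=])=([^=]|$)/\1==\2/g' \
        -e 's/(^|[[:space:]])AND([[:space:]]|$)/\1\&\&\2/g' \
        -e 's/(^|[[:space:]])OR([[:space:]]|$)/\1||\2/g'
}

sql_where_to_filter "\"id\" = 1 AND name <> 'x'"
# prints: id == 1 && name != "x"
```

When the automatic rewriting is not enough for a given condition, the ‘--filter’ option described above lets one state the file-side filter explicitly.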
Standard options: ¶
--help
print this help and exit
--usage
print short usage information and exit
-v, --verbose
print to stderr info/debug messages of the component
--version
print version and exit
File type options: ¶
--avro
regardless of the <source> suffix, read it as ‘avro’ file format
--gz
regardless of the <source> suffix, read it as a ‘gz’, ‘Z’, ‘zip’ or ‘bz2’ compressed
file
--json
regardless of the <source> suffix, read it as ‘json’ file format
--parquet
regardless of the <source> suffix, read it as ‘parquet’ file format
--qvd
regardless of the <source> suffix, read it as Qlik’s ‘QVD’ file format
--xls, --xlsx
regardless of the <source> suffix, read it as MS Excel ‘xls’ or ‘xlsx’ file format
--tar
regardless of the <source> suffix, read it as a tar file
--xml
regardless of the <source> suffix, read it as ‘xml’ file format
QVD, XLS, XLSX, XML and JSON specific option: ¶
--match-fields
this option is ignored for files other than QVD, XLS(X), XML and JSON.
XML and JSON specific option: ¶
--all-fields-exist
this option is ignored for files other than XML and JSON.
XML specific options: ¶
--document-tag=<tag>
this option is ignored for files other than XML. Check ‘man evl readxml’ for details.
--record-tag=<tag>
this option is ignored for files other than XML. Check ‘man evl readxml’ for details.
--vector-element-tag=<tag>
this option is ignored for files other than XML. Check ‘man evl readxml’ for details.
XLS and XLSX specific options: ¶
--sheet-index=<n>
read <n>-th sheet, starting from number 0. ‘--sheet-index=0’ is default
--sheet-name=<name>
read sheet with name <name>
Examples ¶
Standard examples of standalone usage: ¶
- Read tar.gz, skip header line and validate data types. Write into ‘example.csv’ the content of the tarred and gzipped source without the header line and with validated data types:
evl read -d 'id int sep=";", value string sep="\n"' \
-h1 -vxy <example.csv.tar.gz >example.csv
- Gzipped json file:
evl read sample.json.gz sample.evd -y >sample.csv
As the file has the standard file suffixes ‘gz’ and ‘json’, it is automatically gunzipped and parsed as JSON.
Standard examples of usage in EVL Job: ¶
- Gzipped json file. The same as example 2, but to be used in an EVS file:
Read sample.json.gz SRC sample.evd
Write SRC sample.csv sample.evd
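The ‘--header’ and ‘--footer’ options of ‘Read’ skip records in each source separately before the sources are concatenated. The sketch below imitates that behaviour with plain `awk` on text files, as a stand-in for the ‘evl head’ and ‘evl tail’ commands that EVL actually uses. The function name `skip_header_footer` and the sample files are hypothetical.

```shell
# Sketch only: prints each file without its first <header> and last
# <footer> lines, then concatenates the results. Plain awk is used here
# instead of 'evl head' / 'evl tail'. The function name is hypothetical.
skip_header_footer() {
    shf_header=$1
    shf_footer=$2
    shift 2
    for shf_file in "$@"; do
        awk -v h="$shf_header" -v t="$shf_footer" '
            { line[NR] = $0 }                                   # buffer one file
            END { for (i = h + 1; i <= NR - t; i++) print line[i] }
        ' "$shf_file"
    done
}

# Two small sample files, each with one header and one footer line.
shf_dir=$(mktemp -d)
printf 'HEADER\na\nb\nFOOTER\n' > "$shf_dir/one.txt"
printf 'HEADER\nc\nFOOTER\n'    > "$shf_dir/two.txt"
skip_header_footer 1 1 "$shf_dir/one.txt" "$shf_dir/two.txt"   # prints a, b, c on separate lines
rm -rf "$shf_dir"
```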
10.2 Readevd ¶
(since EVL 2.5)
Reads an EVD file from stdin and writes its content using this EVD structure:
parents vector null=""
string
name string
data_type string
format string null=""
comment string null=""
null vector null=""
string
separator string null=""
quote struct null=""
char string(1)
optional uchar
options vector
struct
tag string
value string null=""
decimal struct null=""
precision uchar
scale uchar
decimal_separator string(1) null=""
thousands_separator string null=""
string struct null=""
length ulong null=""
locale string null=""
encoding string null=""
max_bytes ulong null=""
max_chars ulong null=""
ustring struct null=""
length ulong null=""
locale string null=""
encoding string null=""
max_bytes ulong null=""
max_chars ulong null=""
Readevd
is to be used in EVS job structure definition file. <f_out> is either output file or flow
name.
evl readevd
is intended for standalone usage, i.e. to be invoked from the command line and write to standard
output.
EVD and EVS are EVL definition files, for details see evl-evd(5) and evl-evs(5).
Synopsis ¶
Readevd
<f_in> <f_out> [-y|--text-output]
evl readevd
[-y|--text-output] [-v|--verbose]
evl readevd
( --help | --usage | --version )
Options ¶
-y, --text-output
write the output as text, not binary
Standard options: ¶
--help
print this help and exit
--usage
print short usage information and exit
-v, --verbose
print to stderr info/debug messages of the component
--version
print version and exit
10.3 Readjson ¶
(since EVL 1.2)
Parse <f_in> into <evd>.
In general not all fields need to exist in the input JSON, but if they do, then the option
‘--all-fields-exist’ will speed up the processing.
When the fields in the input JSON are not in the same order as defined in <evd>, then the option
‘--match-fields’ has to be used.
Usually when reading a JSON file written by ‘Writejson’, it is good to call ‘Readjson’ with
option ‘-a’, as all fields from <evd> are always there.
Readjson
is to be used in EVS job structure definition file. <f_out> is either output file or flow
name.
evl readjson
is intended for standalone usage, i.e. to be invoked from the command line and write to standard
output.
EVD and EVS are EVL definition files, for details see evl-evd(5) and evl-evs(5).
Synopsis ¶
Readjson
<f_in> <f_out> (<evd>|-d <inline_evd>)
[-a|--all-fields-exist] [-m|--match-fields] [-y|--text-output]
evl readjson
(<evd>|-d <inline_evd>)
[--all-fields-exist] [--match-fields] [-y|--text-output]
[-v|--verbose]
evl readjson
( --help | --usage | --version )
Options ¶
-d, --data-definition=<inline_evd>
either this option or the file <evd> must be present. Example: -d 'user_sum long'
-a, --all-fields-exist
when the input contains all fields (e.g. output of evl-writejson), then using this option increases
the performance
-m, --match-fields
when fields are not in the same order as in the EVD, this option must be used
-y, --text-output
write the output as text, not binary
Standard options: ¶
--help
print this help and exit
--usage
print short usage information and exit
-v, --verbose
print to stderr info/debug messages of the component
--version
print version and exit
10.4 Readkafka ¶
(since EVL 1.1)
The component calls the Kafka consumer command specified by ‘EVL_KAFKA_CONSUMER_COMMAND’, which is by
default ‘kafka-console-consumer.sh’, and runs it with options:
--bootstrap-server "<server>:<port>" --topic "<topic>" <kafka_consumer_opts>
and sends the output to <f_out>.
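To make the resulting command line concrete, the sketch below only prints what would be run, so it works without a Kafka installation. It is an illustration, not EVL's code: the function name `kafka_consumer_cmdline`, the topic and the server address are hypothetical, while ‘--from-beginning’ is a regular option of ‘kafka-console-consumer.sh’ used here as an example of <kafka_consumer_opts>.

```shell
# Sketch only: prints the consumer command line described above.
# The function name kafka_consumer_cmdline is hypothetical.
kafka_consumer_cmdline() {
    kc_topic=$1
    kc_server=$2
    shift 2
    # The command defaults to kafka-console-consumer.sh unless
    # EVL_KAFKA_CONSUMER_COMMAND is set.
    printf '%s --bootstrap-server "%s" --topic "%s"' \
        "${EVL_KAFKA_CONSUMER_COMMAND:-kafka-console-consumer.sh}" \
        "$kc_server" "$kc_topic"
    # Remaining arguments are passed through as <kafka_consumer_opts>.
    for kc_opt in "$@"; do
        printf ' %s' "$kc_opt"
    done
    printf '\n'
}

kafka_consumer_cmdline my_topic localhost:9092 --from-beginning
```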
Readkafka
is to be used in EVS job structure definition file. <f_out> is either output file or flow
name.
evl readkafka
is intended for standalone usage, i.e. to be invoked from the command line and write to standard
output.
EVS is an EVL job structure definition file, for details see evl-evd(5) and evl-evs(5).
Synopsis ¶
Readkafka
<topic> <f_out>
-s|--bootstrap-server <server:port>
[<kafka_consumer_opts>]
evl readkafka
<topic>
-s|--bootstrap-server <server:port>
[<kafka_consumer_opts>]
[-v|--verbose]
evl readkafka
( --help | --usage | --version )
Options ¶
Standard options: ¶
--help
print this help and exit
--usage
print short usage information and exit
-v, --verbose
print to stderr info/debug messages of the component
--version
print version and exit
10.5 Readmysql ¶
(since EVL 2.4)
Writes the MariaDB/MySQL <table> to stdout or <f_out>.
Password is taken from file ‘$EVL_PASSFILE’, which is by default ‘$HOME/.evlpass’. When
this file does not have permissions 600 (or 400), it is ignored! For details see ‘evl-password’.
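Since a password file with the wrong permissions is silently ignored, it can help to check the mode before running a job. The following is a minimal sketch of such a check, assuming only the rule stated above (mode 600 or 400); the function name `passfile_usable` is hypothetical and this is not the check EVL itself performs.

```shell
# Sketch only: succeeds when the file exists and its mode is exactly
# 600 or 400. The function name passfile_usable is hypothetical.
passfile_usable() {
    [ -f "$1" ] || return 1
    # The first ten characters of 'ls -ld' are the file type and mode bits.
    case "$(ls -ld "$1" | cut -c1-10)" in
        -rw-------|-r--------) return 0 ;;
        *)                     return 1 ;;
    esac
}

passfile=${EVL_PASSFILE:-$HOME/.evlpass}
if passfile_usable "$passfile"; then
    echo "$passfile would be used"
else
    echo "$passfile would be ignored (missing, or permissions are not 600/400)"
fi
```

Running ‘chmod 600’ on the file is the usual fix when the second message appears.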
Readmysql
is to be used in EVS job structure definition file. <f_out> is either output file or flow
name.
evl readmysql
is intended for standalone usage, i.e. to be invoked from command line and writing records to
standard output.
EVD and EVS are EVL definition files, for details see evl-evd(5) and evl-evs(5).
Synopsis ¶
Readmysql
[<schema>.]<table> <f_out> (<evd>|-d <inline_evd>)
[-b|--dbname=<database>] [-h|--host=<hostname>] [-p|--port=<port>]
[-q|--query=<query>] [-u|--username=<mysqluser>]
[--mysql=<mysql-options>] [-y|--text-output]
evl readmysql
[<schema>.]<table> (<evd>|-d <inline_evd>)
[-b|--dbname=<database>] [-h|--host=<hostname>] [-p|--port=<port>]
[-q|--query=<query>] [-u|--username=<mysqluser>]
[--mysql=<mysql-options>] [-y|--text-output]
[-v|--verbose]
evl readmysql
( --help | --usage | --version )
Options ¶
-d, --data-definition=<inline_evd>
either this option or the file <evd> must be present. Example:
‘-d 'id int, user_id string enc=iso-8859-1'’
-q, --query=<query>
Use SQL <query> instead of reading whole table. With this option <table> might be an empty string.
-y, --text-output
write the output as text, not binary
Standard options: ¶
--help
print this help and exit
--usage
print short usage information and exit
-v, --verbose
print to stderr info/debug messages of the component
--version
print version and exit
’mysql’ options: ¶
-b, --dbname=<database>
this option is provided to ‘mysql’ command as ‘--database=<database>’
-h, --host=<hostname>
this option is provided to ‘mysql’ command
-p, --port=<port>
using other than standard port 3306. This option is provided to ‘mysql’ command.
-u, --username=<mysqluser>
if not mentioned, then current system username is used as mysql user. This option is provided to
‘mysql’ command as ‘--user=<mysqluser>’.
--mysql=<mysql-options>
other mysql options can be specified here
10.6 Readora ¶
(since EVL 2.0)
Writes the Oracle <table> to standard output or <f_out>.
When <schema> is not present, environment variable ‘ORADATABASE’ is used.
Password is taken from file ‘$EVL_PASSFILE’, which is by default ‘$HOME/.evlpass’. When
this file does not have permissions 600 (or 400), it is ignored! For details see ‘evl-password’.
Readora
is to be used in EVS job structure definition file. <f_out> is either output file or flow
name.
evl readora
is intended for standalone usage, i.e. to be invoked from command line and writing records to
standard output.
EVD and EVS are EVL definition files, for details see evl-evd(5) and evl-evs(5).
SQL*Plus Field Separator
When the table is read by SQL*Plus, the field separator is the value of
‘$EVL_ORACLE_FIELD_SEPARATOR’, which is by default set to ‘\x1f’ (i.e. a Unit
Separator), and the last field in each record is terminated by ‘\n’.
SQL*Plus script hook
Custom options might be added to SQL*Plus script by environment variable
‘$EVL_ORACLE_SQLPLUS_HOOK’.
Synopsis ¶
Readora
[<schema>.]<table> <f_out> <evd>
[--query=<query>] [-w|--where=<condition>]
[ --connect=<connect_identifier> | -b|--dbname=<database> -h|--host=<hostname> [-p|--port=<port>] ]
[-u|--username=<oracle_user>] [-y|--text-output]
evl readora
[<schema>.]<table> <evd>
[--query=<query>] [-w|--where=<condition>]
[ --connect=<connect_identifier> | -b|--dbname=<database> -h|--host=<hostname> [-p|--port=<port>] ]
[-u|--username=<oracle_user>] [-y|--text-output]
[-v|--verbose]
evl readora
( --help | --usage | --version )
Options ¶
--query=<query>
use SQL <query> instead of reading whole table. With this option <table> might be an
empty string.
-w, --where=<condition>
use this where condition instead of reading whole table.
-y, --text-output
write the output as text, not binary
Standard options: ¶
--help
print this help and exit
--usage
print short usage information and exit
-v, --verbose
print to stderr info/debug messages of the component
--version
print version and exit
’sqlplus’ options: ¶
--connect=<connect_identifier>
sqlplus will be called in the form:
<username>/<password>@<connect_identifier>
where <connect_identifier> can be in the form
[<net_service_name> | [//]Host[:Port]/<service_name>]
without this option environment variable ‘ORACONN’ (if defined) is used as connection
identifier for sqlplus
-b, --dbname=<database>
either this or environment variable ‘ORADATABASE’ should be provided. If the
‘ORADATABASE’ environment variable is also set, this option has preference.
-h, --host=<hostname>
either this or environment variable ‘ORAHOST’ should be provided when connecting to other host
than localhost. If also ‘ORAHOST’ variable is set, this option has preference.
-p, --port=<port>
either this or environment variable ‘ORAPORT’ should be provided when using other than
standard port ‘1521’.
-u, --username=<oracle_user>
without this option environment variable ‘ORAUSER’ is used as user for sqlplus
10.7 Readparquet ¶
(since EVL 2.0)
Writes Parquet files from the <parquet> directory to stdout or <f_out>.
Readparquet
is to be used in EVS job structure definition file. <f_out> is either output file or flow
name.
evl readparquet
is intended for standalone usage, i.e. to be invoked from command line and writing records into
standard output.
EVD and EVS are EVL definition files, for details see evl-evd(5) and evl-evs(5).
Synopsis ¶
Readparquet
<parquet> <f_out> (<evd>|-d <inline_evd>) [-y|--text-output]
evl readparquet
<parquet> (<evd>|-d <inline_evd>) [-y|--text-output]
[-v|--verbose]
evl readparquet
( --help | --usage | --version )
Options ¶
-d, --data-definition=<inline_evd>
either this option or the file <evd> must be present. Example:
‘-d 'id int, name string, started timestamp'’
-y, --text-output
write the output as text, not binary
Standard options: ¶
--help
print this help and exit
--usage
print short usage information and exit
-v, --verbose
print to stderr info/debug messages of the component
--version
print version and exit
10.8 Readpg ¶
(since EVL 2.0)
Writes the PostgreSQL <table> to standard output or <f_out>.
Password is taken:
- from file ‘$EVL_PASSFILE’, which is by default ‘$HOME/.evlpass’,
- from file ‘$PGPASSFILE’, which is by default ‘$HOME/.pgpass’.
When such a file does not have permissions 600, it is ignored! For details see ‘evl-password’.
Readpg
is to be used in EVS job structure definition file. <f_out> is either output file or flow
name.
evl readpg
is intended for standalone usage, i.e. to be invoked from command line and writing records to
standard output.
EVD and EVS are EVL definition files, for details see evl-evd(5) and evl-evs(5).
Synopsis ¶
Readpg
[<schema>.]<table> <f_out> (<evd>|-d <inline_evd>)
[-q|--query=<query> | -w|--where=<condition>]
[-b|--dbname=<database>] [-h|--host=<hostname>] [-p|--port=<port>]
[-u|--username=<pguser>] [--psql=<psql_options>] [-y|--text-output]
evl readpg
[<schema>.]<table> (<evd>|-d <inline_evd>)
[-q|--query=<query> | -w|--where=<condition>]
[-b|--dbname=<database>] [-h|--host=<hostname>] [-p|--port=<port>]
[-u|--username=<pguser>] [--psql=<psql_options>] [-y|--text-output]
[-v|--verbose]
evl readpg
( --help | --usage | --version )
Options ¶
-d, --data-definition=<inline_evd>
either this option or the <evd> file must be present. Example:
‘-d 'id int, user_id string enc=iso-8859-1'’
-q, --query=<query>
Use SQL <query> instead of reading whole table. With this option <table> might be an
empty string.
-w, --where=<condition>
use this where condition instead of reading whole table
-y, --text-output
write the output as text, not binary
Standard options: ¶
--help
print this help and exit
--usage
print short usage information and exit
-v, --verbose
print to stderr info/debug messages of the component
--version
print version and exit
’psql’ options: ¶
-b, --dbname=<database>
either this or environment variable ‘PGDATABASE’ should be provided, if not, then current
system username is used as psql database. If also ‘PGDATABASE’ environment variable is set,
this option has preference. (This option is provided to ‘psql’ command.)
-h, --host=<hostname>
either this or environment variable ‘PGHOST’ should be provided when connecting to other host
than localhost. If also ‘PGHOST’ variable is set, this option has preference. (This option is
provided to ‘psql’ command.)
-p, --port=<port>
either this or environment variable ‘PGPORT’ should be provided when using other than standard
port ‘5432’. (This option is provided to ‘psql’ command.)
--psql=<psql_options>
all other options to be provided to the psql command. See ‘man psql’ for details.
-u, --username=<pguser>
either this or environment variable ‘PGUSER’ should be provided, if not, then current system
username is used as psql user. If variable ‘PGUSER’ is set, this option has preference. (This
option is provided to ‘psql’ command.)
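The precedence described for ‘--dbname’ (the option first, then ‘PGDATABASE’, then the current system username) can be written as a short fallback chain. This is a sketch of the documented order only, not Readpg's code; the function name `resolve_pg_dbname` is hypothetical. The same pattern would apply to ‘--username’ and ‘PGUSER’.

```shell
# Sketch only: resolves the database name in the documented order:
# option value, then $PGDATABASE, then the current system username.
# The function name resolve_pg_dbname is hypothetical.
resolve_pg_dbname() {
    if [ -n "$1" ]; then
        echo "$1"                 # value given by -b|--dbname
    elif [ -n "${PGDATABASE:-}" ]; then
        echo "$PGDATABASE"        # environment variable
    else
        id -un                    # current system username
    fi
}

PGDATABASE=my_db
resolve_pg_dbname other_db   # prints: other_db
resolve_pg_dbname ""         # prints: my_db
```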
Examples ¶
- To read a table from the default schema (mostly ‘public’) in an EVL job (i.e. in an EVS file) from localhost:5432:
export PGUSER=some_pg_user
export PGDATABASE=my_db
Readpg my_table MYTABLE evd/mytable.evd
Map MYTABLE ...
Password is taken from ~/.pgpass, which has 600 permissions and looks like this:
localhost:5432:my_db:some_pg_user:H+SCs9;_@D
10.9 Readqvd ¶
(since EVL 2.3)
Writes the content of <file.qvd> to standard output or <f_out>. It parses fields
as they are specified in the EVD file, unless ‘--match-fields’ is specified.
If there are fewer fields in the EVD file than in the QVD, only those fields are returned.
Readqvd
is to be used in EVS job structure definition file. <f_out> is either output file or flow
name.
evl readqvd
is intended for standalone usage, i.e. to be invoked from the command line and writing records to
standard output.
EVD and EVS are EVL definition files, for details see evl-evd(5) and evl-evs(5).
Synopsis ¶
Readqvd
<file.qvd> <f_out> (<evd>|-d <inline_evd>)
[-y|--text-output | -a|--text-output-dos-eol | -b|--text-output-mac-eol]
[-m|--match-fields]
[-n|--null-as-string[=<string>]]
[--filter=<condition>]
[--first-record=<n>]
[--guess-uniform-symbol-size]
[--low-memory]
evl readqvd
<file.qvd> (<evd>|-d <inline_evd>)
[-y|--text-output | -a|--text-output-dos-eol | -b|--text-output-mac-eol]
[-m|--match-fields]
[-n|--null-as-string[=<string>]]
[--filter=<condition>]
[--first-record=<n>]
[--guess-uniform-symbol-size]
[--low-memory]
[-v|--verbose]
evl readqvd
( --help | --usage | --version )
Options ¶
-d, --data-definition=<inline_evd>
either this option or the file <evd> must be present. Example:
‘-d 'id int, name string, started timestamp'’
-m, --match-fields
match fields between EVD and QVD, otherwise they are taken one by one from the input QVD file. If there
are fewer fields in the EVD file than in the QVD, only those fields are returned.
-n, --null-as-string[=<string>]
read <string> as a NULL value, without <string> specified it reads an empty string as
NULL
--filter=<condition>
read only records with given <condition>.
--first-record=<n>
start to read from the record number <n>.
--guess-uniform-symbol-size
might speed up indexing of the dictionary, but it may not work in all cases. Use only in special
cases when really good performance is needed.
--low-memory
do not read the dictionary into memory. This reduces memory consumption, but slows down reading the
source file.
-y, --text-output
write the output as text, not binary
--text-output-dos-eol
produce the output as text with CRLF as end of line
--text-output-mac-eol
produce the output as text with CR as end of line
Standard options: ¶
--help
print this help and exit
--usage
print short usage information and exit
-v, --verbose
print to stderr info/debug messages of the component
--version
print version and exit
10.10 Readsqlite ¶
(since EVL 2.7)
Writes the SQLite <table> to stdout or <f_out>.
It takes the whole table with columns in order defined by EVD, unless <query> and/or
<condition> is specified.
Path to the database file is taken from environment variable ‘$EVL_SQLITE_DATABASE’, unless
<db_file> is specified.
Readsqlite
is to be used in EVS job structure definition file. <f_out> is either output file or flow
name.
evl readsqlite
is intended for standalone usage, i.e. to be invoked from command line and writing records to
standard output.
EVD and EVS are EVL definition files, for details see evl-evd(5) and evl-evs(5).
Synopsis ¶
Readsqlite
<table> <f_out> (<evd>|-d <inline_evd>)
[--dbname=<db_file>] [--query=<query>] [-w|--where=<condition>]
[-y|--text-output]
evl readsqlite
<table> (<evd>|-d <inline_evd>)
[--dbname=<db_file>] [--query=<query>] [-w|--where=<condition>]
[-y|--text-output]
[-v|--verbose]
evl readsqlite
( --help | --usage | --version )
Options ¶
-d, --data-definition=<inline_evd>
either this option or the file <evd> must be present. Example:
‘-d 'id int, user_id string enc=iso-8859-1'’
--dbname=<db_file>
path to the SQLite database file; if this option is not used, database file is taken from
environment variable ‘$EVL_SQLITE_DATABASE’.
--query=<query>
Use SQL <query> instead of reading whole table. With this option <table> might be an empty string.
-w, --where=<condition>
use this where condition instead of reading whole table.
-y, --text-output
write the output as text, not binary
Standard options: ¶
--help
print this help and exit
--usage
print short usage information and exit
-v, --verbose
print to stderr info/debug messages of the component
--version
print version and exit
Examples ¶
- To read a table ‘my_table’ in an EVL job (i.e. in an EVS file) from ‘/home/myself/my_db.sqlite’:
export EVL_SQLITE_DATABASE="/home/myself/my_db.sqlite"
Readsqlite my_table MYTABLE evd/mytable.evd
Map MYTABLE ...
- Command line usage of sending table ‘my_table’ from ‘/home/myself/my_db.sqlite’ to standard output:
export EVL_SQLITE_DATABASE="/home/myself/my_db.sqlite"
evl readsqlite my_table evd/mytable.evd --text-output
or just
evl readsqlite --dbname="/home/myself/my_db.sqlite" my_table evd/mytable.evd --text-output
10.11 Readtd ¶
(since EVL 1.1)
Writes the Teradata <table> to stdout or <f_out>.
Readtd
is to be used in EVS job structure definition file. <f_out> is either output file or flow
name.
evl readtd
is intended for standalone usage, i.e. to be invoked from command line and writing records to
standard output.
EVD and EVS are EVL definition files, for details see evl-evd(5) and evl-evs(5).
Synopsis ¶
Readtd
<database>.<table> <f_out> (<evd>|-d <inline_evd>) [-y|--text-output]
evl readtd
<database>.<table> (<evd>|-d <inline_evd>) [-y|--text-output]
[-v|--verbose]
evl readtd
( --help | --usage | --version )
Options ¶
-d, --data-definition=<inline_evd>
either this option or the file <evd> must be present. Example:
‘-d 'id int, user_id string enc=iso-8859-1'’
-y, --text-output
write the output as text, not binary
Standard options: ¶
--help
print this help and exit
--usage
print short usage information and exit
-v, --verbose
print to stderr info/debug messages of the component
--version
print version and exit
10.12 Readxls ¶
(since EVL 2.2)
Read XLS sheet and write to <f_out>.
Unless ‘--sheet-index’ or ‘--sheet-name’ is specified, it reads only the first sheet from
the file.
It skips the header line, unless option ‘--no-header’ or ‘--match-fields’ is used.
Readxls
is to be used in EVS job structure definition file. <f_out> is either output file or flow
name.
evl readxls
is intended for standalone usage, i.e. to be invoked from command line and writing records to
standard output.
EVD and EVS are EVL definition files, for details see evl-evd(5) and evl-evs(5).
Synopsis ¶
Readxls
<file> <f_out> (<evd>|-d <inline_evd>)
[-m|--match-fields | --no-header]
[--sheet-index=<n> | --sheet-name=<name>]
[-y|--text-output]
evl readxls
<file> (<evd>|-d <inline_evd>)
[-m|--match-fields | --no-header]
[--sheet-index=<n> | --sheet-name=<name>]
[-y|--text-output]
[-v|--verbose]
evl readxls
( --help | --usage | --version )
Options ¶
-d, --data-definition=<inline_evd>
either this option or the file <evd> must be present. Example:
‘-d 'id int, name string, started timestamp'’
-m, --match-fields
read only fields specified by EVD, based on header. All characters other than ‘[a-zA-Z0-9_-]’
are replaced by underscore when matching with EVD field names.
--no-header
suppose there is no header
--sheet-index=<n>
read <n>-th sheet, starting from number 0 (i.e. ‘--sheet-index=0’ is the default
behaviour)
--sheet-name=<name>
read sheet with name <name>
-y, --text-output
write the output as text, not binary
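The header normalization described for ‘--match-fields’ (every character outside ‘[a-zA-Z0-9_-]’ becomes an underscore) is easy to preview with `sed`, which helps when choosing EVD field names for a spreadsheet. This is only a sketch of that one rule, not Readxls code; the function name `normalize_header` and the sample header are hypothetical.

```shell
# Sketch only: replaces every character outside [a-zA-Z0-9_-] with an
# underscore, as described for --match-fields. The function name
# normalize_header is hypothetical.
normalize_header() {
    # LC_ALL=C keeps the a-z and A-Z ranges limited to ASCII letters.
    printf '%s\n' "$1" | LC_ALL=C sed 's/[^a-zA-Z0-9_-]/_/g'
}

normalize_header 'Order Date (UTC)'   # prints: Order_Date__UTC_
normalize_header 'user-id_1'          # prints: user-id_1
```

An EVD field named ‘Order_Date__UTC_’ would therefore be the one to match a column headed ‘Order Date (UTC)’ under this rule.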
Standard options: ¶
--help
print this help and exit
--usage
print short usage information and exit
-v, --verbose
print to stderr info/debug messages of the component
--version
print version and exit
10.13 Readxlsx ¶
(since EVL 2.2)
Read XLSX sheet and write to <f_out>.
Unless ‘--sheet-index’ or ‘--sheet-name’ is specified, it reads only the first sheet from
the file.
It skips the header line, unless option ‘--no-header’ or ‘--match-fields’ is used.
Readxlsx
is to be used in EVS job structure definition file. <f_out> is either output file or flow
name.
evl readxlsx
is intended for standalone usage, i.e. to be invoked from command line and writing records to
standard output.
EVD and EVS are EVL definition files, for details see evl-evd(5) and evl-evs(5).
Synopsis ¶
Readxlsx
<file> <f_out> (<evd>|-d <inline_evd>)
[-m|--match-fields | --no-header]
[--sheet-index=<n> | --sheet-name=<name>]
[-y|--text-output]
evl readxlsx
<file> (<evd>|-d <inline_evd>)
[-m|--match-fields | --no-header]
[--sheet-index=<n> | --sheet-name=<name>]
[-y|--text-output]
[-v|--verbose]
evl readxlsx
( --help | --usage | --version )
Options ¶
-d, --data-definition=<inline_evd>
either this option or the file <evd> must be present. Example:
‘-d 'id int, name string, started timestamp'’
-m, --match-fields
read only fields specified by EVD, based on header. All characters other than ‘[a-zA-Z0-9_-]’
are replaced by underscore when matching with EVD field names.
--no-header
suppose there is no header
--sheet-index=<n>
read <n>-th sheet, starting from number 0 (i.e. ‘--sheet-index=0’ is the default
behaviour)
--sheet-name=<name>
read sheet with name <name>
-y, --text-output
write the output as text, not binary
Standard options: ¶
--help
print this help and exit
--usage
print short usage information and exit
-v, --verbose
print to stderr info/debug messages of the component
--version
print version and exit
10.14 Readxml ¶
(since EVL 1.3)
Parse XML <f_in> into <evd>.
In general not all fields need to exist in the input XML, but if they do, then the option
‘--all-fields-exist’ will speed up the processing.
When the fields in the input XML are not in the same order as defined in <evd>, then the option
‘--match-fields’ has to be used.
Usually when reading an XML file written by ‘Writexml’, it is good to call ‘Readxml’ with
option ‘-a’, as all fields from <evd> are always there.
Readxml
is to be used in EVS job structure definition file. <f_out> is either output file or flow
name.
evl readxml
is intended for standalone usage, i.e. to be invoked from the command line and write to standard
output.
EVD and EVS are EVL definition files, for details see evl-evd(5) and evl-evs(5).
Synopsis ¶
Readxml
<f_in> <f_out> (<evd>|-d <inline_evd>)
[-a|--all-fields-exist]
[-m|--match-fields]
[--document-tag=<tag>]
[--record-tag=<tag>]
[--vector-element-tag=<tag>]
[-y|--text-output]
evl readxml
(<evd>|-d <inline_evd>)
[-a|--all-fields-exist]
[-m|--match-fields]
[--document-tag=<tag>]
[--record-tag=<tag>]
[--vector-element-tag=<tag>]
[-y|--text-output]
[-v|--verbose]
evl readxml
( --help | --usage | --version )
Options ¶
-d, --data-definition=<inline_evd>
either this option or the file <evd> must be present. Example: -d 'user_sum long'
-a, --all-fields-exist
when the input contains all fields (e.g. output of evl-writexml), then using this option increases
the performance
-m, --match-fields
when fields are not in the same order as in the EVD, this option must be used
--document-tag=<tag>
specify a tag name of the main tag, by default it tries to guess it. XML file should look like
this:
<?xml version="1.0" encoding="UTF-8"?>
<document>
...
</document>
where the tag ‘document’ can be of any name.
--record-tag=<tag>
specify a tag name of a record, by default it tries to guess it. XML file should look like this:
<?xml version="1.0" encoding="UTF-8"?>
<document>
<record>
...
</record>
<record>
...
</record>
<record>
...
</record>
...
</document>
where the tag ‘record’ can be of any name, but the same across the file.
--vector-element-tag=<tag>
the name of the tag for vector elements, e.g. XML file with vector ‘someVector’:
...
<someVector>
<elem>1</elem>
<elem>2</elem>
<elem>3</elem>
</someVector>
...
should be read with option ‘--vector-element-tag=elem’.
-y, --text-output
write the output as text, not binary
Standard options: ¶
--help
print this help and exit
--usage
print short usage information and exit
-v, --verbose
print to stderr info/debug messages of the component
--version
print version and exit