Bigger Data,
Integrated Faster

EVL Generation Microservice

The EVL Data Generation Microservice provides a fast, automated, and cost-effective method for data generation. A proper test environment is a must in many areas, such as application development, implementing ETL processes, and stress testing. Often the tests can’t be run on existing production data, so data simulation is the only way to achieve the goal. The simulated data must comply with real-life data patterns, and data volumes should be close to the expected peaks.

EVL Generation advantages:

  • Configuration via Excel or CSV files pre-filled by “reading” the existing data structures
  • Automatic random data generation based on data types
  • Customization of data pattern based on filled-in parameters like min-max intervals, null values probability, string ranges …
  • Ability to include custom-made data generation functions
  • Extremely fast generation of vast amounts of data by using low-level IO techniques
  • Parallel running of jobs and workflow monitoring
  • Low implementation and operating costs

EVL Microservices are built on top of the core EVL software and retain its flexibility, robustness, high productivity, and ability to read data from various sources, including CSV files, databases (Oracle, Teradata, SQL Server, etc.), and Hadoop streaming data such as Kafka and Flume.
EVL Generation white paper. Printable function guide and examples.

EVL Generation Functions

Data    Null  Min  Max  Pattern  Description
String                  Pattern  like [a-zA-Z0-9]
Date                    Pattern  Values between 1970-01-01 and 2199-12-31
Number  30%   Min  Max           Intervals for integer and decimals and probability of null values
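
To make the parameters in the table concrete, here is a minimal Python sketch of how per-type generators driven by such parameters might behave. It is not EVL code: the function names (`expand_char_class`, `gen_string`, `gen_date`, `gen_number`) are hypothetical, the character-class handling covers only plain characters and simple ranges, and only integers are generated for the Number type.

```python
import random
import re
from datetime import date, timedelta


def expand_char_class(pattern):
    """Expand a simple character class such as [a-zA-Z0-9] into its characters.

    Simplification: only literal characters and ranges like a-z are handled.
    """
    match = re.fullmatch(r"\[(.+)\]", pattern)
    if match is None:
        raise ValueError(f"unsupported pattern: {pattern!r}")
    body = match.group(1)
    chars = []
    i = 0
    while i < len(body):
        # A range is three characters long: start, hyphen, end.
        if i + 2 < len(body) and body[i + 1] == "-":
            chars.extend(chr(c) for c in range(ord(body[i]), ord(body[i + 2]) + 1))
            i += 3
        else:
            chars.append(body[i])
            i += 1
    return chars


def gen_string(pattern, min_len, max_len, rng):
    """Random string whose characters all come from the given character class."""
    chars = expand_char_class(pattern)
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(chars) for _ in range(length))


def gen_date(rng, start=date(1970, 1, 1), end=date(2199, 12, 31)):
    """Random date in the inclusive range start..end."""
    span_days = (end - start).days
    return start + timedelta(days=rng.randint(0, span_days))


def gen_number(min_value, max_value, null_prob, rng):
    """Random integer in min_value..max_value, or None with probability null_prob."""
    if rng.random() < null_prob:
        return None
    return rng.randint(min_value, max_value)


rng = random.Random(42)  # fixed seed so repeated runs give the same data
print(gen_string("[a-zA-Z0-9]", 5, 10, rng))
print(gen_date(rng))
print(gen_number(1, 100, 0.30, rng))
```

Passing an explicit `random.Random` instance keeps the generators reproducible, which is convenient when a test data set has to be regenerated identically.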

EVL Generation Project

A data generation project consists of the following steps:
  1. unzipping the EVL distribution and defining a few variables and paths
  2. filling in an Excel or CSV file that defines the source type (e.g. csv, Oracle ODBC …), entity and attribute names, and the generation parameters to be applied
  3. automatic generation of EVL jobs for each entity
  4. running EVL jobs in a batch or individually
  5. monitoring and tuning
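
As an illustration of steps 2 and 3, the sketch below reads a small configuration in CSV form and groups its rows by entity, which is the unit for which one job is generated. This is only an assumption about how such grouping could work, not EVL's implementation: the CSV layout, the sample rows in `CONFIG_TEXT`, and the function `group_by_entity` are hypothetical, with column names borrowed from the example definition table in this document.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV layout; the real configuration file format may differ.
CONFIG_TEXT = """Src,Entity,Attribute,Data type,Null,Min,Max,Pattern,Description
csv,TEST1,ID,int,,1,5000,,"Setting number interval, no null allowed"
csv,TEST1,ACC,int,20%,,,,20% can be null
ORCL,TEST2,ID,int,,1,300000,100+,"List of values: 100, 200, 300"
"""


def group_by_entity(config_text):
    """Return {(source, entity): [attribute rows]}, preserving file order."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(config_text)):
        groups[(row["Src"], row["Entity"])].append(row)
    return dict(groups)


jobs = group_by_entity(CONFIG_TEXT)
for (src, entity), attributes in jobs.items():
    names = ", ".join(a["Attribute"] for a in attributes)
    print(f"{entity} ({src}): {names}")
# prints:
# TEST1 (csv): ID, ACC
# TEST2 (ORCL): ID
```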

Example

Set variables:

    # project directory
    CONFIG_FILE_DIRECTORY=$HOME/Project/Generation/
    # configuration file name
    CONFIG_FILE=generation-config.csv
    # jobs directory
    CONFIG_EVD=evd/config.generation.evd
    # global default parameters
    CONFIG_GEN_DEAFULTS=val/config.evl.gen

Data generation definition file TEST

Src   Entity  Attribute  Data type  Null  Min  Max     Pattern        Description
csv   TEST1   ID         int              1    5000                   Setting number interval, no null allowed
csv   TEST1   ACC        int        20%                               20% can be null
csv   TEST1   NOTE       string     70%                [a-zA-Z0-9,-]  Allowed characters
csv   TEST1   Sex        string           1    1       ['M', 'F']     List of values
ORCL  TEST2   ID         int              1    300000  100+           List of values: 100, 200, 300
ORCL  TEST2   Postcode   Number     5%    5    5                      Postcode must be 5 digits, 5% can be null
ORCL  TEST2   Text       string           3    80                     Mandatory minimal text
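
The following Python sketch shows one way the Null, Min, Max, and Pattern columns of such a definition could be interpreted when producing values for TEST1. It is an illustration under stated assumptions, not EVL's own logic: `parse_null_prob`, `generate_value`, and the fallback range `DEFAULT_MIN`/`DEFAULT_MAX` (used when Min and Max are empty, as for ACC) are hypothetical.

```python
import ast
import random

# Assumed fallback range for integers without Min/Max; not a documented EVL default.
DEFAULT_MIN = 0
DEFAULT_MAX = 999999


def parse_null_prob(text):
    """Convert a percentage such as '20%' to a probability; empty means never null."""
    text = text.strip()
    return float(text.rstrip("%")) / 100.0 if text else 0.0


def generate_value(spec, rng):
    """Generate one value for an attribute spec (a dict keyed by column name)."""
    if rng.random() < parse_null_prob(spec.get("Null", "")):
        return None
    pattern = spec.get("Pattern", "").strip()
    # A pattern written as a quoted list, e.g. ['M', 'F'], is treated as a value list.
    if pattern.startswith("['"):
        return rng.choice(ast.literal_eval(pattern))
    if spec["Data type"] == "int":
        low = int(spec.get("Min") or DEFAULT_MIN)
        high = int(spec.get("Max") or DEFAULT_MAX)
        return rng.randint(low, high)
    raise ValueError(f"unsupported spec: {spec!r}")


# Three of the TEST1 attributes from the definition table.
TEST1 = [
    {"Attribute": "ID", "Data type": "int", "Null": "", "Min": "1", "Max": "5000", "Pattern": ""},
    {"Attribute": "ACC", "Data type": "int", "Null": "20%", "Min": "", "Max": "", "Pattern": ""},
    {"Attribute": "Sex", "Data type": "string", "Null": "", "Min": "1", "Max": "1", "Pattern": "['M', 'F']"},
]

rng = random.Random(7)  # fixed seed for reproducible output
for _ in range(5):
    print({spec["Attribute"]: generate_value(spec, rng) for spec in TEST1})
```

Checking for null first means the null probability applies uniformly, whatever the data type or pattern of the attribute.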

Run:

    # generating evl jobs from the config file
    evl run/generate_jobs.evl
    # running the test generation job
    evl run/generation.test.evl