Difference between revisions of "Datasource Adapters"

From 3forge Documentation
Jump to navigation Jump to search
(Created blank page)
 
Tag: visualeditor
Line 1: Line 1:
 +
= Flat File Reader =
  
 +
= Overview =
 +
The AMI Flat File Reader Datasource Adapter is a highly configurable adapter designed to process extremely large flat files into tables at rates exceeding 100mb per second*. There are a number of directives which can be used to control how the flat file reader processes a file. Each line (delineated by a Line feed) is considered independently for parsing. Note the EXECUTE <sql> clause supports the full AMI sql language.
 +
 +
''*Using Pattern Capture technique (_pattern) to extract 3 fields across a 4.080 gb text file containing 11,999,504 records. This generated a table of 11,999,504 records x 4 columns in 37,364 milliseconds (additional column is the default linenum). Tested on raid-2 7200rpm 2TB drive''
 +
 +
Generally speaking, the parser can handle three (4) different methods of parsing files:
 +
 +
=== Delimited list or ordered fields ===
 +
Example data and query:  
 +
 +
''11232''|''1000''|''123.20''
 +
 +
''12412''|''8900''|''430.90''
 +
 +
CREATE TABLE mytable AS USE _file="myfile.txt" _delim="|"
 +
 +
_fields="String account, Integer qty, Double px"
 +
 +
EXECUTE SELECT * FROM file
 +
 +
=== Key value pairs ===
 +
Example data and query:
 +
 +
account=''11232''|quantity=''1000''|price=''123.20''
 +
 +
account=''12412''|quantity=''8900''|price=''430.90''
 +
 +
CREATE TABLE mytable AS USE _file="myfile.txt" _delim="|" _equals="="
 +
 +
_fields="String account, Integer qty, Double px"
 +
 +
EXECUTE SELECT * FROM file
 +
 +
=== Pattern Capture ===
 +
Example data and query:
 +
 +
Account ''11232'' has ''1000'' shares at $''123.20'' px
 +
 +
Account ''12412'' has ''8900'' shares at $''430.90'' px
 +
 +
CREATE TABLE mytable AS USE _file="myfile.txt"
 +
 +
_fields="String account, Integer qty, Double px"
 +
 +
_pattern="account,qty,px=Account (.*) has (.*) shares at \\$(.*) px"
 +
 +
EXECUTE SELECT * FROM file
 +
 +
=== Raw Line ===
 +
If you do not specify a _fields, _mapping nor _pattern directive then the parser defaults to a simple row-per-line parser. A "line" column is generated containing the entire contents of each line from the file
 +
 +
CREATE TABLE mytable AS USE _file="f.txt" EXECUTE SELECT * FROM FILE

Revision as of 22:51, 4 March 2021

Flat File Reader

Overview

The AMI Flat File Reader Datasource Adapter is a highly configurable adapter designed to process extremely large flat files into tables at rates exceeding 100mb per second*. There are a number of directives which can be used to control how the flat file reader processes a file. Each line (delineated by a Line feed) is considered independently for parsing. Note the EXECUTE <sql> clause supports the full AMI sql language.

*Using Pattern Capture technique (_pattern) to extract 3 fields across a 4.080 gb text file containing 11,999,504 records. This generated a table of 11,999,504 records x 4 columns in 37,364 milliseconds (additional column is the default linenum). Tested on raid-2 7200rpm 2TB drive

Generally speaking, the parser can handle three (4) different methods of parsing files:

Delimited list or ordered fields

Example data and query:  

11232|1000|123.20

12412|8900|430.90

CREATE TABLE mytable AS USE _file="myfile.txt" _delim="|"

_fields="String account, Integer qty, Double px"

EXECUTE SELECT * FROM file

Key value pairs

Example data and query:

account=11232|quantity=1000|price=123.20

account=12412|quantity=8900|price=430.90

CREATE TABLE mytable AS USE _file="myfile.txt" _delim="|" _equals="="

_fields="String account, Integer qty, Double px"

EXECUTE SELECT * FROM file

Pattern Capture

Example data and query:

Account 11232 has 1000 shares at $123.20 px

Account 12412 has 8900 shares at $430.90 px

CREATE TABLE mytable AS USE _file="myfile.txt"

_fields="String account, Integer qty, Double px"

_pattern="account,qty,px=Account (.*) has (.*) shares at \\$(.*) px"

EXECUTE SELECT * FROM file

Raw Line

If you do not specify a _fields, _mapping nor _pattern directive then the parser defaults to a simple row-per-line parser. A "line" column is generated containing the entire contents of each line from the file

CREATE TABLE mytable AS USE _file="f.txt" EXECUTE SELECT * FROM FILE