Difference between revisions of "Datasource Adapters"
(Created blank page) |
Tag: visualeditor |
||
Line 1: | Line 1: | ||
+ | = Flat File Reader = | ||
+ | = Overview = | ||
+ | The AMI Flat File Reader Datasource Adapter is a highly configurable adapter designed to process extremely large flat files into tables at rates exceeding 100mb per second*. There are a number of directives which can be used to control how the flat file reader processes a file. Each line (delineated by a Line feed) is considered independently for parsing. Note the EXECUTE <sql> clause supports the full AMI sql language. | ||
+ | |||
+ | ''*Using Pattern Capture technique (_pattern) to extract 3 fields across a 4.080 gb text file containing 11,999,504 records. This generated a table of 11,999,504 records x 4 columns in 37,364 milliseconds (additional column is the default linenum). Tested on raid-2 7200rpm 2TB drive'' | ||
+ | |||
+ | Generally speaking, the parser can handle three (4) different methods of parsing files: | ||
+ | |||
+ | === Delimited list or ordered fields === | ||
+ | Example data and query: | ||
+ | |||
+ | ''11232''|''1000''|''123.20'' | ||
+ | |||
+ | ''12412''|''8900''|''430.90'' | ||
+ | |||
+ | CREATE TABLE mytable AS USE _file="myfile.txt" _delim="|" | ||
+ | |||
+ | _fields="String account, Integer qty, Double px" | ||
+ | |||
+ | EXECUTE SELECT * FROM file | ||
+ | |||
+ | === Key value pairs === | ||
+ | Example data and query: | ||
+ | |||
+ | account=''11232''|quantity=''1000''|price=''123.20'' | ||
+ | |||
+ | account=''12412''|quantity=''8900''|price=''430.90'' | ||
+ | |||
+ | CREATE TABLE mytable AS USE _file="myfile.txt" _delim="|" _equals="=" | ||
+ | |||
+ | _fields="String account, Integer qty, Double px" | ||
+ | |||
+ | EXECUTE SELECT * FROM file | ||
+ | |||
+ | === Pattern Capture === | ||
+ | Example data and query: | ||
+ | |||
+ | Account ''11232'' has ''1000'' shares at $''123.20'' px | ||
+ | |||
+ | Account ''12412'' has ''8900'' shares at $''430.90'' px | ||
+ | |||
+ | CREATE TABLE mytable AS USE _file="myfile.txt" | ||
+ | |||
+ | _fields="String account, Integer qty, Double px" | ||
+ | |||
+ | _pattern="account,qty,px=Account (.*) has (.*) shares at \\$(.*) px" | ||
+ | |||
+ | EXECUTE SELECT * FROM file | ||
+ | |||
+ | === Raw Line === | ||
+ | If you do not specify a _fields, _mapping nor _pattern directive then the parser defaults to a simple row-per-line parser. A "line" column is generated containing the entire contents of each line from the file | ||
+ | |||
+ | CREATE TABLE mytable AS USE _file="f.txt" EXECUTE SELECT * FROM FILE |
Revision as of 22:51, 4 March 2021
Flat File Reader
Overview
The AMI Flat File Reader Datasource Adapter is a highly configurable adapter designed to process extremely large flat files into tables at rates exceeding 100mb per second*. There are a number of directives which can be used to control how the flat file reader processes a file. Each line (delineated by a Line feed) is considered independently for parsing. Note the EXECUTE <sql> clause supports the full AMI sql language.
*Using Pattern Capture technique (_pattern) to extract 3 fields across a 4.080 gb text file containing 11,999,504 records. This generated a table of 11,999,504 records x 4 columns in 37,364 milliseconds (additional column is the default linenum). Tested on raid-2 7200rpm 2TB drive
Generally speaking, the parser can handle three (4) different methods of parsing files:
Delimited list or ordered fields
Example data and query:
11232|1000|123.20
12412|8900|430.90
CREATE TABLE mytable AS USE _file="myfile.txt" _delim="|"
_fields="String account, Integer qty, Double px"
EXECUTE SELECT * FROM file
Key value pairs
Example data and query:
account=11232|quantity=1000|price=123.20
account=12412|quantity=8900|price=430.90
CREATE TABLE mytable AS USE _file="myfile.txt" _delim="|" _equals="="
_fields="String account, Integer qty, Double px"
EXECUTE SELECT * FROM file
Pattern Capture
Example data and query:
Account 11232 has 1000 shares at $123.20 px
Account 12412 has 8900 shares at $430.90 px
CREATE TABLE mytable AS USE _file="myfile.txt"
_fields="String account, Integer qty, Double px"
_pattern="account,qty,px=Account (.*) has (.*) shares at \\$(.*) px"
EXECUTE SELECT * FROM file
Raw Line
If you do not specify a _fields, _mapping nor _pattern directive then the parser defaults to a simple row-per-line parser. A "line" column is generated containing the entire contents of each line from the file
CREATE TABLE mytable AS USE _file="f.txt" EXECUTE SELECT * FROM FILE