
This is a thing to greatly speed up importing comma-separated (CSV) files
into TSV databases.

The core of this will ultimately become a C extension to NeoTcl.

Open up the CSV input file for reading.

Open up the TSV output file for create-truncate-append.

Open up all of the dbopen index files for create-truncate-read-write.

While not EOF on the CSV file

	Use "tell" on the TSV file to find out the byte number in the
	file that we're at.

	Get a line from the CSV file

	Split the line into a list

	Write the list out to the TSV file, prepending "+ " to the line.

	For each index file,
	    Obtain the value of the element of the list corresponding to
	    the element being indexed and write it into the dbopen file,
	    with it as the key and the "tell" value we fetched earlier
	    as the data (it will be the absolute index into the file
	    where the record starts.)
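
The split and write steps of the loop above might look roughly like this
in C. This is only a sketch: the names split_commas and write_tsv_record
are made up for illustration, the splitting is naive (no quoted fields or
embedded commas), and the dbopen write is left out.

```c
#include <stdio.h>
#include <string.h>

/* Split `line` in place on commas and store pointers to the pieces in
 * `fields`. Returns the number of fields stored. Naive: quoted fields
 * are not handled, and fields beyond max_fields are dropped. */
int split_commas(char *line, char **fields, int max_fields)
{
    int n = 0;
    char *p = line;

    line[strcspn(line, "\r\n")] = '\0';   /* strip the line terminator */
    while (n < max_fields) {
        fields[n++] = p;
        p = strchr(p, ',');
        if (p == NULL)
            break;
        *p++ = '\0';                      /* end this field, step past the comma */
    }
    return n;
}

/* Write one record to the TSV file as "+ " followed by the tab-separated
 * fields. Returns the byte offset at which the record starts (the "tell"
 * value to store in the index files), or -1 on error. */
long write_tsv_record(FILE *out, char **fields, int nfields)
{
    long offset = ftell(out);             /* where this record begins */
    int i;

    if (offset < 0)
        return -1;
    fputs("+ ", out);
    for (i = 0; i < nfields; i++) {
        if (i > 0)
            fputc('\t', out);
        fputs(fields[i], out);
    }
    fputc('\n', out);
    return ferror(out) ? -1 : offset;
}
```

The offset is taken before the write, so it points at the "+ " that
starts the record, which is what the index files need.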

Have a callout that is performed every certain number of records in order
to update your thermometer, etc.

Hmm....

Why not make a routine that writes a proc to do it?

Since we would create the proc from code, we would expand the constant
values (constant for this pass) one time, rather than having to lindex
them constantly, etc.

Hmm...

So, I'd say something like

    import_csv_database inFileName tableName "indexFieldName indexFieldName indexFieldName"

We're assuming field names are the first line of the file, or something else
is done behind our back to make it look that way.

This beastie reads in the field names and creates the output file and the 
dbopen index files.

It then configures and activates *something* that does the actual importing.
Like it feeds a C extension alternating field numbers and dbopen handles.
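
Turning the index field names into field numbers could be done once, up
front, from the header line. A rough sketch in C, assuming the header has
already been split into an array of names; field_number and
resolve_index_fields are hypothetical helper names, not anything that
exists yet.

```c
#include <string.h>

/* Return the zero-based position of `name` among the header fields,
 * or -1 if no field has that name. */
int field_number(const char *const *header, int nfields, const char *name)
{
    int i;

    for (i = 0; i < nfields; i++)
        if (strcmp(header[i], name) == 0)
            return i;
    return -1;
}

/* Resolve each index field name to its field number, storing the
 * results in numbers_out. Returns 0 on success, or -1 as soon as a
 * name is not found in the header. */
int resolve_index_fields(const char *const *header, int nfields,
                         const char *const *names, int nnames,
                         int *numbers_out)
{
    int i;

    for (i = 0; i < nnames; i++) {
        numbers_out[i] = field_number(header, nfields, names[i]);
        if (numbers_out[i] < 0)
            return -1;
    }
    return 0;
}
```

The resulting numbers are what would get paired up with the dbopen
handles and handed to whatever does the actual importing.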

Yah, here we go...


The C extension looks like this:

    csv_import_engine inputFileHandle outputFileHandle nRecordsPerCallback callbackProc [ fieldNumber dbopenHandle fieldNumber2 dbopenHandle2 ... ]


gets inputFileHandle

comma_split line

tell outputFileHandle (gives us the output file position)

write the line to the output file

for each dbopen file, write the element of the comma-split list corresponding
to the specified field number as the key, with the output file position we
fetched earlier as the data.
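
Put together, the engine might be shaped something like the following C
sketch. Lots of assumptions here: plain stdio FILE pointers stand in for
the Tcl file handles, a hypothetical index_put_fn callback stands in for
the dbopen write, field numbers are zero-based, the comma splitting is
naive (no quoting), and lines longer than MAX_LINE are not handled.
last_entry_put and print_progress are throwaway examples of the two
callbacks.

```c
#include <stdio.h>
#include <string.h>

#define MAX_FIELDS 256
#define MAX_LINE   8192

/* Hypothetical stand-in for a dbopen write: store key -> offset. */
typedef int (*index_put_fn)(void *handle, const char *key, long offset);
typedef void (*progress_fn)(long records_done);

/* One (fieldNumber, dbopenHandle) pair from the argument list. */
struct index_spec {
    int          field_number;   /* zero-based field to index */
    index_put_fn put;
    void        *handle;
};

/* Returns the number of records imported, or -1 on error. */
long csv_import_engine(FILE *in, FILE *out, long records_per_callback,
                       progress_fn callback,
                       const struct index_spec *specs, int nspecs)
{
    char line[MAX_LINE];
    char *fields[MAX_FIELDS];
    long count = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        int nfields = 0, i;
        char *p = line;
        long offset;

        /* comma_split, in place */
        line[strcspn(line, "\r\n")] = '\0';
        while (nfields < MAX_FIELDS) {
            fields[nfields++] = p;
            if ((p = strchr(p, ',')) == NULL)
                break;
            *p++ = '\0';
        }

        /* tell: remember where this record starts in the output */
        if ((offset = ftell(out)) < 0)
            return -1;

        /* write the line, prefixed with "+ " and tab-separated */
        fputs("+ ", out);
        for (i = 0; i < nfields; i++) {
            if (i > 0)
                fputc('\t', out);
            fputs(fields[i], out);
        }
        fputc('\n', out);
        if (ferror(out))
            return -1;

        /* index: field value is the key, record offset is the data */
        for (i = 0; i < nspecs; i++) {
            int f = specs[i].field_number;
            if (f >= 0 && f < nfields &&
                specs[i].put(specs[i].handle, fields[f], offset) != 0)
                return -1;
        }

        count++;
        if (callback != NULL && records_per_callback > 0 &&
            count % records_per_callback == 0)
            callback(count);
    }
    return ferror(in) ? -1 : count;
}

/* Example index stand-in: remembers only the last entry it was given. */
struct last_entry { char key[64]; long offset; int count; };

int last_entry_put(void *handle, const char *key, long offset)
{
    struct last_entry *e = handle;

    strncpy(e->key, key, sizeof e->key - 1);
    e->key[sizeof e->key - 1] = '\0';
    e->offset = offset;
    e->count++;
    return 0;
}

/* Example progress callback: a very plain thermometer. */
void print_progress(long records_done)
{
    fprintf(stderr, "%ld records imported\n", records_done);
}
```

A real version would swap last_entry_put for something that does the
put on the dbopen handle, and the callback for one that calls back into
the Tcl proc.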




