A few months ago, I started writing a tool to help deal with chopping up data in CSV format. I came back to it yesterday out of a need for quickly uploading the data I needed for a basic shopping cart. The awk | sed
incantations needed are well beyond the scope of my knowledge and patience. I’m also not a big user of Perl, so it would have taken me just as long to figure out something there, if ever. Short on time? The tool is called briggs and can be downloaded via Github at this link.
The idea of briggs is to be what most good Unix tools are, a filter that transforms some input data and gives you something nice to work with. In my case, I needed to map CSV columns to SQL pretty quickly and didn’t have the time to muck about with shell tools to get it done. Of course, the tool can output columnar data into other formats as well.
Assuming we’ve got some data like the following:
inv_cost,inv_qty,inv_item,inv_image,inv_description 0.99,-1,2020-11-04-Sunrise.mp3,,Mr. Doctor – Sunrise (MP3 encoded) 0.99,-1,2020-11-04-Scapes-v1.mp3,,Mr. Doctor – Scapes (MP3 encoded) 0.99,-1,2020-11-04-Respect.mp3,,Mr. Doctor – Respect (MP3 encoded)
And we have a table with the following schema:
CREATE TABLE galleria_dominica ( inv_id INTEGER PRIMARY KEY AUTO_INCREMENT NOT NULL , inv_uuid varchar(64) DEFAULT( UUID() ) NOT NULL , inv_cost FLOAT NOT NULL , inv_qty INTEGER NOT NULL , inv_item VARCHAR(128) NOT NULL , inv_description TEXT , inv_image varchar(2048) );
briggs can convert the SQL needed to load the table via this:
briggs -f myfile.csv --convert --sql galleria_dominica
Giving us, this:
INSERT INTO galleria_dominica ( inv_cost,inv_qty,inv_item,inv_image,inv_description ) VALUES ( '0.99','-1','2020-11-04-Sunrise.mp3','','Mr. Doctor – Sunrise (MP3 encoded)' ); INSERT INTO galleria_dominica ( inv_cost,inv_qty,inv_item,inv_image,inv_description ) VALUES ( '0.99','-1','2020-11-04-Scapes-v1.mp3','','Mr. Doctor – Scapes (MP3 encoded)' ); INSERT INTO galleria_dominica ( inv_cost,inv_qty,inv_item,inv_image,inv_description ) VALUES ( '0.99','-1','2020-11-04-Respect.mp3','','Mr. Doctor – Respect (MP3 encoded)' ); INSERT INTO galleria_dominica ( inv_cost,inv_qty,inv_item,inv_image,inv_description ) VALUES ( '0.99','-1','2020-11-04-QueenCityBlues.mp3','','Mr. Doctor – Queen City Blues (MP3 encoded)' ); ...
We can also convert to JSON…
briggs -f myfile.csv --convert --json --newline
And get something like this
{ "inv_cost": "0.99" ,"inv_qty": "-1" ,"inv_item": "2020-11-04-Sunrise.mp3" ,"inv_image": "" ,"inv_description": "Mr. Doctor – Sunrise (MP3 encoded)" } ,{ "inv_cost": "0.99" ,"inv_qty": "-1" ,"inv_item": "2020-11-04-Scapes-v1.mp3" ,"inv_image": "" ,"inv_description": "Mr. Doctor – Scapes (MP3 encoded)" } ,{ "inv_cost": "0.99" ,"inv_qty": "-1" ,"inv_item": "2020-11-04-Respect.mp3" ,"inv_image": "" ,"inv_description": "Mr. Doctor – Respect (MP3 encoded)" }, ... }
The tool can also spit out XML…
briggs -f myfile.csv --convert --xml --newline
Giving us this:
<inv_cost>0.99</inv_cost> <inv_qty>-1</inv_qty> <inv_item>2020-11-04-Sunrise.mp3</inv_item> <inv_image></inv_image> <inv_description>Mr. Doctor – Sunrise (MP3 encoded)</inv_description> <inv_cost>0.99</inv_cost> <inv_qty>-1</inv_qty> <inv_item>2020-11-04-Scapes-v1.mp3</inv_item> <inv_image></inv_image> <inv_description>Mr. Doctor – Scapes (MP3 encoded)</inv_description> <inv_cost>0.99</inv_cost> <inv_qty>-1</inv_qty> <inv_item>2020-11-04-Respect.mp3</inv_item> <inv_image></inv_image> <inv_description>Mr. Doctor – Respect (MP3 encoded)</inv_description>
…though admitttedly, both of those need a little more work to just drop right in to a program and get going. I still like old Unix tools. I’m just not going to spend the same amount of time with them as I might have in the past.