csvkit
Here we manipulate csv files with csvkit.
It's developed with Python. So install it with sudo pip install csvkit
;
Convert xls, xlsx file into csv file: in2csv input.xlsx > data.csv
;
List column titles of a csv file: csvcut -n data.csv
;
Print statistic information for each columns in a csv file: csvstat data.csv
;
Pretty print the first 3 columns, first 5 rows of data:
csvcut -c 1-3 agenda.csv | head -6 | csvlook
.
Print the first, second, 11th, 16th column of first 9 records
(another one for headline): csvcut -c 1,2,11,16 data.csv|head
;
If there are newline characters in records, add "-l" option to make it clear:
csvcut -l -c 1,2,11,16 data.csv|head
;
Convert to json file with csvjson:, head fairData150104.csv|csvjson -i 4
,
where "-i" specify how many indent spaces, default is no indent.
Or select some collomns to show:
csvcut -c 1,2,11,16 fairData150104.csv|head|csvjson -i 4
Select some columns and save to a new file:
csvcut -c <col-index> input.csv > output.csv
.
For example, save the 2nd and 3rd columns of file 276107.csv
to file windspeed.csv
: csvcut -c 2,3 276107.csv > windspeed.csv
.
csvtool
Install: sudo aptitude install csvtool
;
-
Pretty print the first 3 columns, first 5 rows of data:
csvtool col 1-3 agenda.csv | head -6 | csvtool readable -
. -
csvtool sub
data.csv: Take a square subset of the CSV, top left at row
, column , which is deep and wide. and count from 1, or from 0 if -z option is given. Example:
csvtool sub 1 16 1 1 data.csv
-
csvtool take
data.csv: take first rows of the csv, like "head" in shell; -
csvtool drop
data.csv: drop the first rows and return the rest (if any).
See csvtool -h
for details.
To print the nth record easily, add the following function into ~/.bash_alias file:
function ctls(){
if test $# -eq 3; then
csvtool drop $2 $1 | csvtool take $3 -
else
echo ctls: csvtool list specified record
echo Synopsis: ctls '<filename> <startNo> <count>'
fi
}
Now print the 3rd record of data.csv with ctls data.csv 3 1
.
The "-" in "csvtool take" means "get input from stdin instead of a file".
Select some columns and save to a new file:
csvtool col <col-index> input.csv > output.csv
.
For example, save the 1,2,3 and 6th columns of input file to output file:
csvtool col 1-3,6 input.csv > output.csv
.
Join some files column-wise:
$ cat aa.csv
col1,col2
11,21
12,22
$ cat bb.csv
col3,col4
31,41
32,42
$ csvtool paste aa.csv bb.csv
col1,col2,col3,col4
11,21,31,41
12,22,32,42
Put them together
-
Count the number of records which the 7th field is empty:
csvgrep -c 7 -r "^$" data.csv | csvtool height -
; -
Count the number of records which the 7th field is NOT empty:
csvgrep -c 7 -r "^.+$" data.csv | csvtool height -
; -
Pretty print all rows whose 3rd column is Bob in file agenda.csv:
csvgrep -c 3 -m 'Bob' agenda.csv| csvcut -c 1-3 | csvtool readable -
-
Print data on screen without line wrap:
less -S <filename>
;