Tools
xsv installed with asdf and cargo.
feedgnuplot installed with apt.
display (part of ImageMagick) installed with apt.
header and Rio are in the tools folder of
data-science-at-the-command-line.
Delete the --vanilla option in the Rio script to
use a customized R environment setup (for example, settings in ~/.Rprofile).
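A minimal sketch of that edit with sed, run on a stand-in file so nothing real is modified. The file name Rio.demo and its one-line content are hypothetical; the actual Rio script may invoke R differently, so inspect it before editing it.

```shell
# Stand-in for the Rio script (hypothetical content, for illustration only)
printf '%s\n' 'Rscript --vanilla -e "$SCRIPT"' > Rio.demo

# Drop the --vanilla flag so R reads the user's profile and environment files again
sed 's/ --vanilla//' Rio.demo > Rio.demo.new

# Show the edited line
cat Rio.demo.new
```

Writing to a new file first makes it easy to diff the result against the original before replacing it.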
Examples
git clone https://github.com/jeroenjanssens/data-science-at-the-command-line.git
cd data-science-at-the-command-line/data/ch07
< data/tips.csv Rio -ge 'g + geom_histogram(aes(bill))' | display
# press `q` to close the image window
< data/immigration.csv xsv select Period,Denmark,Netherlands,Norway,Sweden |
Rio -d',' -re 'reshape2::melt(df, id="Period", variable.name="Country", value.name="Count")' |
tee immigration-long.csv | head | xsv table
# note how `tee` saves the intermediate result to a file while passing it down the pipeline
# the `-d` option is unnecessary here
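The `tee` pattern works with any pipeline. A small sketch using only standard tools; the file name squares.csv and the column names are made up for the illustration:

```shell
# Write the full result to a file while still passing it down the pipeline
seq 5 | awk 'BEGIN { print "n,square" } { print $1 "," $1*$1 }' |
  tee squares.csv | head -n 3

# The file holds the header plus all five rows, although head printed only three lines
wc -l < squares.csv
```

With large inputs, `head` exiting early can stop `tee` before the file is complete, so this shortcut is safest on small data.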
< data/tips.csv xsv select size | header -d |
feedgnuplot --terminal 'dumb 80,25' --histogram 0 --with boxes --unset grid --exit
seq 5 | awk '{print 2*$1, $1*$1}' |
feedgnuplot --lines --points --legend 0 "data 0" --title "Test plot" \
--y2 1 --unset grid --terminal 'dumb 80,40' --exit
# a sine plot
seq -15 15 | awk '{print $1, sin($1)}' | feedgnuplot --domain --lines --points \
--unset grid --terminal 'dumb 120,30' --exit --legend 0 'sin(x)'
Note the difference between --domain and --dataid:
--domain uses the first column as the X values,
instead of the row number.
--dataid treats the 1st, 3rd, 5th, ... columns as the IDs of
the values in the 2nd, 4th, 6th, ... columns, respectively.
So you can put multiple curves in one column, distinguished by their IDs.
For example, with --dataid, the dataset below:
1 1.0
1 2.0
2 1.5
2 2.5
1 3.0
will be plotted as 2 curves:
1 1.0
1 2.0
1 3.0
and
2 1.5
2 2.5
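The grouping that --dataid implies can be reproduced with awk to see which points end up in which curve. This is only a sketch of the idea, not how feedgnuplot is implemented; the file names points.dat and curve_<ID>.dat are made up for the illustration.

```shell
# The sample dataset from above: column 1 is the curve ID, column 2 the value
printf '%s\n' '1 1.0' '1 2.0' '2 1.5' '2 2.5' '1 3.0' > points.dat

# Append each value to a per-ID file, mimicking the split into curves
rm -f curve_1.dat curve_2.dat
awk '{ print $2 >> ("curve_" $1 ".dat") }' points.dat

# Show the values collected for each curve
for id in 1 2; do
  echo "curve $id: $(paste -sd' ' "curve_$id.dat")"
done

# With feedgnuplot installed, the same input could be plotted directly:
# feedgnuplot --dataid --lines --points --terminal 'dumb 80,25' --exit < points.dat
```

Since --domain is not given in the commented feedgnuplot call, the row number would serve as the X value for each point.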