Tools
xsv, installed with asdf and cargo.
feedgnuplot, installed with apt.
display, part of ImageMagick, installed with apt.
header and Rio, from the tools folder of data-science-at-the-command-line.
Delete the --vanilla option in the Rio script to use a customized R environment setup.
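R's --vanilla flag combines --no-save, --no-restore, --no-site-file, --no-init-file and --no-environ, so with it R skips files such as .Rprofile and .Renviron. The following is a minimal sketch of removing the flag with sed. It works on a hypothetical stand-in file, because the exact R invocation line inside the real Rio script is an assumption here and may look different.

```shell
# Hypothetical stand-in for the Rio script; the real R invocation line may differ.
rio=$(mktemp)
printf '%s\n' '#!/usr/bin/env bash' 'Rscript --vanilla "$SCRIPT"' > "$rio"

# Remove every occurrence of the --vanilla flag and write the result to a patched copy.
patched=$(mktemp)
sed 's/ --vanilla//g' "$rio" > "$patched"

# Show what changed (diff exits non-zero when the files differ, hence "|| true").
diff "$rio" "$patched" || true
```

To apply the same substitution to a real checkout, point sed at the Rio file in the tools folder and review the diff before replacing the original.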
Examples
git clone https://github.com/jeroenjanssens/data-science-at-the-command-line.git
cd data-science-at-the-command-line/data/ch07
< data/tips.csv Rio -ge 'g + geom_histogram(aes(bill))' | display
# press `q` to close the image window
< data/immigration.csv xsv select Period,Denmark,Netherlands,Norway,Sweden |
Rio -d',' -re 'reshape2::melt(df, id="Period", variable.name="Country", value.name="Count")' |
tee immigration-long.csv | head | xsv table
# note how `tee` saves the intermediate result to a file while passing it down the pipeline
# the `-d` option is unnecessary here
< data/tips.csv xsv select size | header -d |
feedgnuplot --terminal 'dumb 80,25' --histogram 0 --with boxes --unset grid --exit
seq 5 | awk '{print 2*$1, $1*$1}' |
feedgnuplot --lines --points --legend 0 "data 0" --title "Test plot" \
--y2 1 --unset grid --terminal 'dumb 80,40' --exit
# plot a sine curve
seq -15 15 | awk '{print $1, sin($1)}' | feedgnuplot --domain --lines --points \
--unset grid --terminal 'dumb 120,30' --exit --legend 0 'sin(x)'
Note the difference between --domain and --dataid:
--domain means that the first column is used as the X column, instead of the row number.
--dataid means that the 1st, 3rd, 5th, ... columns hold the IDs of the values in the 2nd, 4th, 6th, ... columns, respectively.
So you can put multiple curves in one column by giving them different IDs.
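To make the effect of --domain concrete, here is a rough sketch that uses awk to print the (x, y) points each mode would produce for the same input. It is only a stand-in for feedgnuplot's column handling, not feedgnuplot itself. Curve numbers are 0-based, as in the `--legend 0` and `--y2 1` options above, and using awk's NR (starting at 1) for the row number is an assumption.

```shell
# Three rows, two columns: "10 1", "20 4", "30 9".
data=$(seq 3 | awk '{ print 10 * $1, $1 * $1 }')

# Without --domain: every column is a curve and x is the row number.
# NR is awk's row counter, standing in for feedgnuplot's row number (assumed to start at 1).
no_domain=$(printf '%s\n' "$data" |
  awk '{ for (i = 1; i <= NF; i++) printf "curve %d: x=%s y=%s\n", i - 1, NR, $i }')

# With --domain: column 1 supplies x, and the remaining columns are the curves.
with_domain=$(printf '%s\n' "$data" |
  awk '{ for (i = 2; i <= NF; i++) printf "curve %d: x=%s y=%s\n", i - 2, $1, $i }')

printf 'without --domain:\n%s\n' "$no_domain"
printf 'with --domain:\n%s\n' "$with_domain"
```

Without --domain the input yields two curves of three points each. With --domain it yields a single curve whose x values are 10, 20 and 30.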
For example, with --dataid, the dataset below:
1 1.0
1 2.0
2 1.5
2 2.5
1 3.0
will be plotted as 2 curves:
1 1.0
1 2.0
1 3.0
and
2 1.5
2 2.5
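The grouping above can be reproduced on the command line. The sketch below first uses awk to collect the values per ID, which only mirrors how --dataid assigns rows to curves, and then pipes the same data to feedgnuplot if it is installed. The variable names are illustrative, and labelling the curves with `--autolegend` is an assumption worth checking against `feedgnuplot --help`.

```shell
# The sample dataset: column 1 is the curve ID, column 2 is the value.
data='1 1.0
1 2.0
2 1.5
2 2.5
1 3.0'

# Collect the values belonging to each ID, one output line per curve.
curves=$(printf '%s\n' "$data" |
  awk '{ v[$1] = v[$1] " " $2 } END { for (id in v) print id ":" v[id] }' |
  sort)
printf '%s\n' "$curves"

# Plot the same data as two curves when feedgnuplot is available.
if command -v feedgnuplot >/dev/null 2>&1; then
  printf '%s\n' "$data" |
    feedgnuplot --dataid --autolegend --lines --points \
      --unset grid --terminal 'dumb 80,25' --exit
fi
```

The awk step prints one line per ID, so the order of rows in the input does not matter: the last row still joins the curve with ID 1.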