DarkMatter in Cyberspace

R Notes


Install R and run

On Linux:

$ sudo apt update
$ sudo apt install r-base
$ R

On Windows:

choco install r.project
choco install r.studio

Install RStudio Server on Ubuntu

$ sudo apt update
$ sudo apt install r-base
$ sudo apt-get install gdebi-core
$ wget https://download2.rstudio.org/rstudio-server-1.1.383-amd64.deb
$ sudo gdebi rstudio-server-1.1.383-amd64.deb
$ echo "r-libs-user=~/.RPackages" | sudo tee -a /etc/rstudio/rsession.conf
$ sudo rstudio-server restart

Log in to RStudio in a browser at ":8787" (port 8787 on the server) with your Ubuntu username and password.

Note 1: If you do not change the default personal package directory in /etc/rstudio/rsession.conf, newly installed packages are saved under ~/R. Changes in /etc/rstudio/rsession.conf affect only RStudio, not the R console (started with R in a shell). To change the installation path for both the R console and RStudio, run echo 'R_LIBS_USER=~/.RPackages' >> $HOME/.Renviron, then verify with Sys.getenv("R_LIBS_USER") and .libPaths() in the R console.
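
The check from Note 1 can also be run as a small script. This is only a sketch; one thing to keep in mind is that .libPaths() lists only directories that actually exist, so ~/.RPackages has to be created (for example with mkdir -p) before it shows up there.

```r
# Personal library location as R sees it (read from ~/.Renviron if set there)
user_lib <- Sys.getenv("R_LIBS_USER")
print(user_lib)

# Library search path actually in effect; non-existent directories are dropped
lib_paths <- .libPaths()
print(lib_paths)

# Expand "~" and check whether the personal library directory exists yet
print(dir.exists(path.expand(user_lib)))
```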

Note 2: List all rstudio-server subcommands with sudo rstudio-server help.

Note 3: To use a server port other than 8787, set the www-port option in /etc/rstudio/rserver.conf. See all configuration items in RStudio Server: Configuring the Server.

Compile Rmd file to docx

Install R base and relevant packages, build a new Rmd file and convert:

$ R
> chooseCRANmirror(graphics=FALSE)
> install.packages("rmarkdown")
> packageVersion("rmarkdown")
[1] ‘1.10’
> install.packages("knitr")
> packageVersion("knitr")
[1] ‘1.18’
> install.packages("devtools")
> devtools::install_github('rstudio/reticulate')

$ cat example.Rmd
---
title: "My Example Doc"
author: "Leo"
date: "2019.1.25"
output: html_document
---

This is a test file.

$ Rscript -e 'library(rmarkdown);render("example.Rmd", "word_document")'
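
The same conversion can be done from inside an R session. The sketch below assumes the rmarkdown package and pandoc are installed and silently does nothing otherwise; the temporary file location is chosen only for illustration. Note that the output_format argument overrides the output: html_document field in the YAML header.

```r
# Write a small Rmd file to a temporary directory
rmd <- file.path(tempdir(), "example.Rmd")
writeLines(c(
  "---",
  'title: "My Example Doc"',
  "output: html_document",
  "---",
  "",
  "This is a test file."
), rmd)

rendered <- NULL
if (requireNamespace("rmarkdown", quietly = TRUE) && rmarkdown::pandoc_available()) {
  # output_format takes precedence over the YAML `output:` field
  rendered <- rmarkdown::render(rmd, output_format = "word_document", quiet = TRUE)
  print(rendered)  # path of the generated .docx file
}
```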

Frequently used commands

version
.libPaths()
installed.packages()[,c(1,3:4)]
getwd()
setwd("temp")  # set working directory

Get the package name which a specific object belongs to

For example, to find the package in which the function acf() is defined, run ?acf in the R console. The package name (stats) appears in the first line of the function's documentation in the Help window, acf {stats}, and in the last line of the doc: [Package stats version 3.2.3 Index].

Or run ??acf to list similar functions in the Help window; stats::acf appears in that list.

These methods also work for data objects. For example, ?women and ??women show that women is defined in the datasets package.
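
The lookup can also be done without the Help window. A minimal sketch using base R functions, assuming the default packages (stats, datasets, utils) are attached:

```r
# Namespace in which the function was defined
fun_pkg <- environmentName(environment(acf))
print(fun_pkg)        # "stats"

# Entries on the search path that contain an object with this name
acf_where   <- find("acf")
women_where <- find("women")
print(acf_where)      # "package:stats"
print(women_where)    # "package:datasets"
```

environment() only works for functions, so find() is the option that covers data objects such as women.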

Explore data frame

str(women)           # list all column names, types and some data
head(women, n = 3)   # first n rows (here n = 3)
attributes(women)    # list all attributes (metadata) of an object,
                     #   such as names, dimension, class, row.names, etc.
names(women)         # get the *name* attribute of an object
dim(women)           # get the *dimension* attribute of an object (row and column number)
class(women)         # get the *class* attribute of an object
nrow(women)
ncol(women)
colnames(women)      # column names
summary(women)
women[sample(nrow(women), 3), ]  # sample 3 rows randomly
library("dplyr")
sample_n(women, 3)               # another way for sampling via package dplyr
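
Random sampling returns different rows on every run. If the sample needs to be reproducible, a seed can be fixed first; the sketch below uses base R only, and the seed value is arbitrary.

```r
set.seed(42)                    # arbitrary seed, chosen only for illustration
idx <- sample(nrow(women), 3)   # 3 distinct row indices out of 15
picked <- women[idx, ]
print(picked)
print(dim(picked))              # 3 rows, 2 columns
```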

names transition between data frame and vector

names is a basic attribute that any vector can carry, including a single scalar (in R, a scalar is just a vector of length 1):

scalar <- 3
names(scalar) <- 'sname'
df <- data.frame(scalar)
names(df)        # "scalar"
attributes(df)   # $names: "scalar"; $row.names: "sname"; $class: "data.frame"

A data frame is a kind of list in which each element is a vector. If a vector has a names attribute (here 'fir' and 'sec'), those names are converted to the row.names attribute of the created data frame:

> sv <- c(33,44)
> names(sv) <- c('fir', 'sec')
> sv
fir sec
 33  44
> sdf <- data.frame(sv)
> sdf
    sv
fir 33
sec 44
> class(sdf)
[1] "data.frame"
> mode(sdf)
[1] "list"
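
The transition also works in the other direction. As a sketch, the row names of the data frame can be put back onto the extracted column with setNames() to recover the original named vector:

```r
sv <- c(33, 44)
names(sv) <- c("fir", "sec")
sdf <- data.frame(sv)

# vector names became row names of the data frame
print(rownames(sdf))

# rebuild the named vector from the column and the row names
back <- setNames(sdf$sv, rownames(sdf))
print(back)
print(identical(back, sv))
```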

typeof(), mode() & class()

'mode' is a mutually exclusive classification of objects according to their basic structure. The 'atomic' modes are numeric, complex, character and logical. Recursive objects have modes such as 'list' or 'function' or a few others. An object has one and only one mode.

typeof() and mode() are basically the same with some differences, see ?mode for details. The return values of these functions are in a small set, such as "integer", "double", "logical", "character", "list", "closure"(function), etc.
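
To see where the two functions differ, a small sketch that tabulates typeof() and mode() side by side for a few sample objects (the objects are chosen only for illustration):

```r
objs <- list(int = 1:3, dbl = 1.5, chr = "a", lgl = TRUE,
             lst = list(1), fun = mean, sym = quote(x))

# character matrix with one row per object
cmp <- cbind(
  typeof = vapply(objs, typeof, character(1)),
  mode   = vapply(objs, mode, character(1))
)
print(cmp)
```

Integers and doubles share the mode "numeric", closures have mode "function", and symbols have mode "name"; for the other objects in this list the two values coincide.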

'class' is a property assigned to an object that determines how generic functions (such as summary()) operate with it. It is not a mutually exclusive classification. If an object has no specific class assigned to it, such as a simple numeric vector, its class is usually the same as its mode, by convention.
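
How the class steers a generic function can be sketched with a made-up S3 class. The class name "celsius" and its summary method are hypothetical and exist only for this illustration:

```r
# Attach a class attribute to a plain numeric vector
temp <- structure(c(21.5, 23.0, 19.8), class = "celsius")

# summary() dispatches on class(temp) and picks this method
summary.celsius <- function(object, ...) {
  sprintf("%d readings, mean %.1f degrees C",
          length(object), mean(unclass(object)))
}

out <- summary(temp)
print(out)
print(mode(temp))                 # the class did not change the mode
print(inherits(temp, "celsius"))
```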

Changing the mode of an object is often called 'coercion'. The mode of an object can change without necessarily changing the class. For example:

> x <- 1:8
> mode(x)
[1] "numeric"
> class(x)
[1] "integer"
> dim(x) <- c(2,4)
> mode(x)
[1] "numeric"
> class(x)    # R >= 4.0.0 prints "matrix" "array" here
[1] "matrix"
> is.numeric(x)
[1] TRUE
> mode(x) <- "character"
> mode(x)
[1] "character"
> class(x)    # R >= 4.0.0 prints "matrix" "array" here
[1] "matrix"

Ref: What is the difference between mode and class in R?

Data Types

1-dimensional: vector(c(...)), list(list(...));

2-dimensional: matrix(matrix(...)), data.frame(data.frame(...));

n-dimensional: array(array(...)).
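
A short sketch that builds one object of each kind and checks its dimensions; the values are arbitrary:

```r
v <- c(1, 2, 3)                                   # vector
l <- list(1, "a", TRUE)                           # list, elements may differ in type
m <- matrix(1:6, nrow = 2)                        # 2 rows, 3 columns
d <- data.frame(x = 1:3, y = c("a", "b", "c"))    # 3 rows, 2 columns
a <- array(1:24, dim = c(2, 3, 4))                # 3-dimensional array

print(dim(v))   # NULL: plain vectors and lists carry no dim attribute
print(dim(m))
print(dim(d))
print(dim(a))
```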



Published

Dec 21, 2017

Last Updated

Apr 23, 2019

Category

Tech

Tags

  • R 5
  • rlang 17
