Reading and Writing Data to and from R


Reading files into R

Usually we will be using data already in a file that we need to read into R in order to work on it. R can read data from a variety of file formats, for example files created as text, or in Excel, SPSS or Stata. We will mainly be reading files in text format, either .txt or .csv (comma-separated, usually created in Excel).

To read an entire data frame directly, the external file will normally have a special form:

  • The first line of the file should have a name for each variable in the data frame.
  • Each additional line of the file has as its first item a row label and the values for each variable.

Here we use the example dataset stored in the files airquality.csv and airquality.txt.

Input file form with names and row labels:

  Ozone Solar.R Wind Temp Month Day
1    41     190  7.4   67     5   1
2    36     118  8.0   72     5   2
3    12     149 12.6   74     5   3
4    18     313 11.5   62     5   4
5    NA      NA 14.3   56     5   5
   ...

By default numeric items (except row labels) are read as numeric variables. This can be changed if necessary.

The function read.table() can then be used to read the data frame directly:

> airqual <- read.table("C:/Desktop/airquality.txt")

Similarly, to read .csv files the read.csv() function can be used to read in the data frame directly:

[Note: I have noticed that occasionally you'll need to use a double slash (//) in your path. This seems to depend on the machine.]

> airqual <- read.csv("C:/Desktop/airquality.csv")
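As a minimal sketch of the two calls above, the block below writes a few rows of R's built-in airquality dataset to temporary files and reads them back. The tempfile() paths are stand-ins for the course files on the Desktop (an assumption made so the example runs on any machine). It also shows one way of changing the default column types, using the colClasses argument of read.table().

```r
# Write the first five rows of the built-in airquality data to temporary
# files; these paths stand in for the course files on the Desktop.
txt_path <- tempfile(fileext = ".txt")
csv_path <- tempfile(fileext = ".csv")
write.table(head(airquality, 5), file = txt_path, quote = FALSE)
write.csv(head(airquality, 5), file = csv_path)

# The header line of the .txt file has one fewer field than the data lines,
# so read.table() treats it as column names and the first column as row labels.
airqual_txt <- read.table(txt_path)

# read.csv() assumes a header line; row.names = 1 marks the first column
# as row labels.
airqual_csv <- read.csv(csv_path, row.names = 1)

# Override the default types: read every column as character.
airqual_chr <- read.table(txt_path, colClasses = "character")

sapply(airqual_txt, class)  # column types chosen by default
sapply(airqual_chr, class)  # column types after the override
```

Comparing the two sapply() results shows the effect of the override: the same file yields numeric columns by default and character columns when colClasses is given.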

In addition, you can read in files using the file.choose() function in R. After typing this command in R, you can manually select the directory and file where your dataset is located.
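One way to combine file.choose() with a reading function is sketched below. The helper name read_chosen is hypothetical (it is not part of R); it only wraps read.csv() so that the file dialog appears when no path is given.

```r
# read_chosen() is a hypothetical helper, not part of R. When called with no
# argument, file.choose() opens a file-selection dialog; passing a path
# skips the dialog because the default is only evaluated when it is needed.
read_chosen <- function(path = file.choose()) {
  read.csv(path)
}

# Interactive use (opens the dialog):
# airqual <- read_chosen()
```

The same pattern works directly at the prompt, for example by passing file.choose() as the first argument of read.csv().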

  1. Read the airquality.csv file into R using the read.csv command.
  2. Read the airquality.txt file into R using the file.choose() command.

Occasionally, you will need to read in data that does not already have column name information.  For example, the dataset BOD.txt looks like this:

1    8.3
2   10.3
3   19.0
4   16.0
5   15.6
7   19.8

Initially, there are no column names associated with the dataset.  We can use the colnames() command to assign column names to the dataset.  Suppose that we want to assign the column names "Time" and "demand" to the BOD.txt dataset.  To do so we do the following:

> bod <- read.table("BOD.txt", header=F)

> colnames(bod) <- c("Time","demand")

> colnames(bod)

[1] "Time"   "demand"

The first command reads in the dataset; the argument header=F specifies that there are no column names associated with the dataset.
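The steps above can be tried without the course file. The sketch below recreates a header-less file with the same six rows in a temporary location (the path is an assumption, used only so the example is self-contained). It spells out FALSE rather than F, since F is an ordinary variable that can be reassigned. It also shows the col.names argument of read.table() as an alternative to calling colnames() afterwards.

```r
# Recreate a header-less file like BOD.txt in a temporary location
# (the path is a stand-in so the example does not depend on the course file).
bod_path <- tempfile(fileext = ".txt")
writeLines(c("1 8.3", "2 10.3", "3 19.0", "4 16.0", "5 15.6", "7 19.8"), bod_path)

# Without a header, R supplies default column names.
bod <- read.table(bod_path, header = FALSE)
colnames(bod)

# Assign the names afterwards, as in the text ...
colnames(bod) <- c("Time", "demand")

# ... or supply them while reading, via the col.names argument.
bod2 <- read.table(bod_path, header = FALSE, col.names = c("Time", "demand"))
colnames(bod2)
```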

Read in the cars.txt dataset and call it cars1.  Make sure you use the "header=F" option to specify that there are no column names associated with the dataset.  Next, assign "speed" and "dist" as the first and second column names of the cars1 dataset.

The two videos below provide a nice explanation of different methods to read data from a spreadsheet into an R dataset.

Import Data, Copy Data from Excel to R, Both .csv and .txt Formats (R Tutorial 1.3) MarinStatsLectures

Importing Data and Working With Data in R (R Tutorial 1.4) MarinStatsLectures

Writing Data to a File


After working with a dataset, we might like to save it for future use. Before we do this, let's first set up a working directory so we know where we can find all our data sets and files later.

Setting up a Directory

In the R window, click on "File" and then on "Change dir". You should then see a box pop up titled "Choose directory". For this class, choose the directory "Desktop" by clicking on "Browse", then select "Desktop" and click "OK". In the future, you may want to create a directory on your computer where you keep your data sets and code for this class.

Alternatively, you can use the setwd() function to set the working directory.

> setwd("C:/Desktop")

To find out what your current working directory is, type

> getwd()
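A short sketch of how the two functions work together: setwd() returns the previous working directory invisibly, so it can be saved and restored later. Here tempdir() is used as a stand-in for "C:/Desktop" (an assumption, so that the example runs on any machine).

```r
# setwd() returns the previous working directory invisibly, which makes it
# easy to switch back afterwards.
old_dir <- setwd(tempdir())  # tempdir() is a stand-in for "C:/Desktop"
getwd()                      # now the temporary directory
setwd(old_dir)               # restore the original working directory
getwd()                      # back where we started
```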

Setting Up Working Directories in R (R Tutorial 1.8) MarinStatsLectures

In R, we can write data frames easily to a file, using the write.table() command.

> write.table(cars1, file="cars1.txt", quote=F)

The first argument refers to the data frame to be written to the output file, the second is the name of the output file. By default R will surround each character or factor entry in the output file (including the row and column names) with quotes, so we use quote=F.

Now, let's check whether R created the file on the Desktop, by going to the Desktop and clicking to open the file. You should see a file with three columns, the first giving the index (or row number) and the other two the speed and distance. R by default creates a column of row indices. If we wanted to create a file without the row indices, we would use the command:

> write.table(cars1, file="cars1.txt", quote=F, row.names=F)
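To see the difference between the two commands without opening the files by hand, the sketch below writes a small data frame both ways and prints the raw lines. The data frame cars1 here is a stand-in built from R's built-in cars dataset, and the output paths are temporary files rather than the Desktop; both are assumptions made for the example.

```r
# cars1 here is a stand-in built from R's cars dataset (speed, dist);
# the output paths are temporary files rather than the Desktop.
cars1 <- head(cars, 3)
with_rows <- tempfile(fileext = ".txt")
no_rows   <- tempfile(fileext = ".txt")

write.table(cars1, file = with_rows, quote = FALSE)
write.table(cars1, file = no_rows, quote = FALSE, row.names = FALSE)

# Print the raw lines of each file to compare them.
readLines(with_rows)  # header line, then each row starts with its index
readLines(no_rows)    # header line, then only speed and dist

# Without row labels in the file, header = TRUE is needed when reading back.
cars_back <- read.table(no_rows, header = TRUE)
```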

Datasets in R


Watch the video below for a concise introduction to working with the variables in an R dataset.

Working with Variables and Data in R (R Tutorial 1.5) MarinStatsLectures

Around 100 datasets are supplied with R (in the package datasets), and others are available.

To see the list of datasets currently available use the command:

> data()
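The listing can also be captured and searched instead of being read in a viewer. The sketch below relies on data() returning an object whose results component is a character matrix with Item and Title columns; the variable name available is chosen only for illustration.

```r
# Capture the listing for the datasets package instead of displaying it.
available <- data(package = "datasets")

# Show the name and description of the first few datasets.
head(available$results[, c("Item", "Title")])

# Look for entries whose name starts with "CO2".
grep("^CO2", available$results[, "Item"], value = TRUE)
```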

We will first look at a data set on CO2 (carbon dioxide) uptake in grass plants available in R.

> CO2

[ Note: capitalization matters here; also, it's the letter O, not zero. Typing this command should display the entire dataset called CO2, which has 84 observations (in rows) and 5 variables (columns).]

To get more information on the variables in the dataset, type in

> help(CO2)
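Besides the help page, a few base R functions give a quick overview of a dataset without printing all of its rows. The following is a small sketch using CO2; note that the variables are stored under the short names conc and uptake.

```r
dim(CO2)         # number of rows and number of columns
names(CO2)       # variable names
str(CO2)         # type of each variable and its first few values
head(CO2, 3)     # first three observations
CO2$uptake[1:5]  # a single variable is accessed with $
```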

Evaluate and report the mean and standard deviation of the variables "Concentration" and "Uptake" (stored in the columns conc and uptake).

Subsetting Data in R With Square Brackets and Logic Statements (R Tutorial 1.6) MarinStatsLectures