The data, chronological sequences (time series) of the price per bushel of wheat and the weekly wage of a skilled laborer in England between 1565 and 1821, were compiled by William Playfair and published in 1821 in his book Letter on our Agricultural Distresses, Their Causes and Remedies. This dataset is available “directly” to R users after installing the HistData package; it is also available in csv format from the Rdatasets site on GitHub. We’re going to use the latter. The full URL of the data is: https://raw.githubusercontent.com/vincentarelbundock/Rdatasets/master/csv/HistData/Wheat.csv. As it starts with https and not http, we can’t use the read.csv function directly; we need to download the data to our disk first. We can do this with the download.file function, specifying "wget" as the value of the formal method parameter:
download.file(url="https://raw.githubusercontent.com/vincentarelbundock/Rdatasets/master/csv/HistData/Wheat.csv",
              destfile="Wheat.csv",
              method="wget")
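When a script like this is run repeatedly, it can be convenient to skip the download if the file is already on disk. The following is only a minimal sketch: fetch_if_missing is a hypothetical helper name introduced here for illustration, not a function from base R or from any package, and it simply forwards extra arguments (such as method) to download.file.

```r
# Hypothetical helper (not part of base R): download a file only when it is
# not already present on disk. Extra arguments go to download.file().
fetch_if_missing = function(url, destfile, ...) {
  if (file.exists(destfile)) {
    return(invisible(FALSE))   # file already there, nothing downloaded
  }
  download.file(url = url, destfile = destfile, ...)
  invisible(TRUE)              # a download was attempted
}

# Example call, mirroring the command above (left commented out so that
# the sketch does not touch the network when sourced):
# fetch_if_missing(
#   "https://raw.githubusercontent.com/vincentarelbundock/Rdatasets/master/csv/HistData/Wheat.csv",
#   "Wheat.csv", method = "wget")
```

Note that method="wget" assumes the wget program is installed on the machine.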
We can now use read.csv to load the data into our workspace:
wheat = read.csv("Wheat.csv")
We then quickly check that the download and the subsequent reading of the data went well by looking at the variable names in the data.frame we’ve just created:
names(wheat)
## [1] "rownames" "Year" "Wheat" "Wages"
We can also examine the top of the table with:
head(wheat)
## rownames Year Wheat Wages
## 1 1 1565 41.0 5.00
## 2 2 1570 45.0 5.05
## 3 3 1575 42.0 5.08
## 4 4 1580 49.0 5.12
## 5 5 1585 41.5 5.15
## 6 6 1590 47.0 5.25
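The two visual checks above can also be written as assertions, so that a script stops early if the file did not load as expected. This is a sketch under assumptions: wheat_sample is a stand-in built by hand from the first three rows printed above (so that the example runs without the downloaded file), and the list of expected columns is our own choice.

```r
# Stand-in for the real table: the first three rows printed by head(wheat)
wheat_sample = data.frame(
  rownames = 1:3,
  Year     = c(1565, 1570, 1575),
  Wheat    = c(41.0, 45.0, 42.0),
  Wages    = c(5.00, 5.05, 5.08)
)

# Columns we expect to find (an assumption of this sketch)
expected = c("Year", "Wheat", "Wages")
missing_cols = setdiff(expected, names(wheat_sample))

# Stop with an error if a column is absent or is not numeric
stopifnot(length(missing_cols) == 0)
stopifnot(all(sapply(wheat_sample[expected], is.numeric)))
```

With the real data, the same two stopifnot calls would be applied to wheat instead of wheat_sample.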
We see that the first column (variable rownames) is useless, so we delete it and rename the remaining columns with:
wheat = wheat[,names(wheat)[-1]]
names(wheat) = c("Year", "Wheat", "Salary")
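The commands above rely on column positions. A variant that works by column name is sketched below; it is an alternative, not the method used in the rest of this example, and it is shown on a small hand-built stand-in (wheat_sample, an assumption of this sketch) rather than on the downloaded file.

```r
# Stand-in for the real table: first three rows of the dataset
wheat_sample = data.frame(
  rownames = 1:3,
  Year     = c(1565, 1570, 1575),
  Wheat    = c(41.0, 45.0, 42.0),
  Wages    = c(5.00, 5.05, 5.08)
)

# Drop the index column by name instead of by position
wheat_sample = wheat_sample[, setdiff(names(wheat_sample), "rownames")]

# Rename only the Wages column, leaving the others untouched
names(wheat_sample)[names(wheat_sample) == "Wages"] = "Salary"

names(wheat_sample)
## [1] "Year"   "Wheat"  "Salary"
```

Selecting by name keeps working if the column order in the csv file ever changes, which the positional version does not guarantee.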
Finally, we look at the bottom of the table:
tail(wheat)
## Year Wheat Salary
## 48 1800 79 28.5
## 49 1805 81 29.5
## 50 1810 99 30.0
## 51 1815 78 NA
## 52 1820 54 NA
## 53 1821 54 NA
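The NA values at the bottom of the table can also be located with is.na rather than by eye. The sketch below uses only the six rows printed above (wheat_tail is a hand-built stand-in), so the count it gives applies to that excerpt, not necessarily to the full table.

```r
# Stand-in: the last six rows printed by tail(wheat)
wheat_tail = data.frame(
  Year   = c(1800, 1805, 1810, 1815, 1820, 1821),
  Wheat  = c(79, 81, 99, 78, 54, 54),
  Salary = c(28.5, 29.5, 30.0, NA, NA, NA)
)

# Number of rows of the excerpt with no salary
n_missing = sum(is.na(wheat_tail$Salary))

# Last year for which a salary is recorded
last_known = max(wheat_tail$Year[!is.na(wheat_tail$Salary)])

n_missing
## [1] 3
last_known
## [1] 1810
```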
We can “finish” this example by reproducing the graph showing the co-evolution of wheat prices and wages, adapting (and correcting!) the example in the documentation for this dataset in the HistData package:
with(wheat,
     {
       known_salary = !is.na(Salary)
       plot(Year, Wheat, type="s", ylim=c(0,105),
            ylab="Price of a quarter of a wheat bushel (shillings)")
       polygon(c(Year[known_salary],rev(Year[known_salary])),
               c(Salary[known_salary],rep(0,sum(known_salary))),
               col="lightskyblue", border=NA)
       lines(Year[known_salary],
             Salary[known_salary], lwd=3, col="red")
       text(1625,10, "Weekly salary of a skilled worker",
            cex=0.8, srt=3, col="red")
     })
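The polygon call is the least obvious part of this code: the filled area under the salary curve is described by walking along the curve from left to right and then coming back along the horizontal axis from right to left. A small sketch with three points taken from the table makes the construction of the vertices visible; the variable names year, salary, x and y are ours.

```r
# Three (year, salary) points taken from the bottom of the table
year   = c(1800, 1805, 1810)
salary = c(28.5, 29.5, 30.0)

# x: years left to right, then the same years right to left
x = c(year, rev(year))
# y: the salary curve on the way out, the baseline (0) on the way back
y = c(salary, rep(0, length(year)))

# One vertex of the closed shape per row
cbind(x, y)
```

The NA salaries are filtered out beforehand (known_salary) so that every vertex of the shape has defined coordinates.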
We can compare our figure to the original one: