DATA MINING
Desktop Survival Guide by Graham Williams
A dataset is usually more complex than a simple vector. Indeed, often
we have several vectors making up the dataset, and refer to this as a
matrix. A matrix is a data structure containing items all of the same
data type. We construct a matrix with the matrix and
c functions. Rows and columns of a matrix can have
names, and the functions colnames and
rownames will list the current names. You can also assign a new
vector of names by placing a call to either function on the left-hand
side of an assignment:
> ds <- matrix(c(52, 37, 59, 42, 36, 46, 38, 21, 18, 32, 10, 67),
               nrow=3, byrow=TRUE)
> colnames(ds) <- c("Low", "Medium", "High", "VHigh")
> rownames(ds) <- c("Married", "Prev.Married", "Single")
> ds
             Low Medium High VHigh
Married       52     37   59    42
Prev.Married  36     46   38    21
Single        18     32   10    67
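
Once the names are in place they can also be used to look up values. The following is a minimal sketch, rebuilding the same ds matrix so that it runs on its own. Indexing by name is not covered in the text above and is shown here only as an illustration.

```r
# Rebuild the matrix from the example above.
ds <- matrix(c(52, 37, 59, 42, 36, 46, 38, 21, 18, 32, 10, 67),
             nrow=3, byrow=TRUE)
colnames(ds) <- c("Low", "Medium", "High", "VHigh")
rownames(ds) <- c("Married", "Prev.Married", "Single")

dim(ds)                   # number of rows, then number of columns
colnames(ds)              # list the current column names
rownames(ds)              # list the current row names

# Select by name rather than by position (illustrative only).
cell <- ds["Single", "VHigh"]   # a single value
low  <- ds[, "Low"]             # a whole column, as a named vector
```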
Of course, manually creating datasets in this way is only useful for
small data collections. A slightly easier approach is to manually
modify and add to the dataset using a simple spreadsheet-like
interface. The edit function returns the modified copy, which
must be assigned to a variable to be kept. The fix function
instead assigns the result of the edit back to the variable
being edited.
> ds <- edit(ds)
> fix(ds)
The cbind function combines each of its arguments,
column-wise (the c in the name is for column), into a
single data structure:
> age <- c(35, 23, 56, 18)
> gender <- c("m", "m", "f", "f")
> people <- cbind(age, gender)
> people
     age  gender
[1,] "35" "m"
[2,] "23" "m"
[3,] "56" "f"
[4,] "18" "f"
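
The quotes around the ages in this output follow from the rule that a matrix holds items of a single data type: combining a numeric vector with a character vector leaves everything as character. A minimal sketch of checking this, and of recovering the numbers, might look like the following (the variable name ages is introduced here only for illustration).

```r
age <- c(35, 23, 56, 18)
gender <- c("m", "m", "f", "f")
people <- cbind(age, gender)

is.numeric(age)         # the original vector is numeric
is.character(people)    # the combined matrix is character

# Convert the age column back to numbers before doing arithmetic.
ages <- as.numeric(people[, "age"])
mean(ages)
```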
The rbind function similarly combines its arguments, but in a
row-wise manner (the r in the name is for row). The result is the
same as transposing the column-wise matrix with the t function:
> t(people)
       [,1] [,2] [,3] [,4]
age    "35" "23" "56" "18"
gender "m"  "m"  "f"  "f"
> people <- rbind(age, gender)
> people
       [,1] [,2] [,3] [,4]
age    "35" "23" "56" "18"
gender "m"  "m"  "f"  "f"
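
The equivalence between the two results can also be checked directly rather than by eye. This is a small sketch using the same age and gender vectors; the names by.col and by.row are introduced here only for illustration.

```r
age <- c(35, 23, 56, 18)
gender <- c("m", "m", "f", "f")

by.col <- cbind(age, gender)   # 4 rows, 2 columns
by.row <- rbind(age, gender)   # 2 rows, 4 columns

dim(by.col)
dim(by.row)

# Compare the row-wise result with the transposed column-wise result.
same <- identical(by.row, t(by.col))
same
```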
Copyright © 2004-2005
Brought to you by Togaware.