This Jupyter notebook demonstrates how to cluster the iris2D data set using density-based methods. It uses the R language and can be run live with an R kernel.
Setup
The following code loads the iris data and creates the iris2D data set:
data("iris") # load the iris data set
x <- as.matrix(iris[,1:2]) # select the input attributes: sepal length and width
plot(x)
DBSCAN and OPTICS are implemented in the following package:
library(dbscan) # for DBSCAN and OPTICS
help(package="dbscan") # more information about the package
DBSCAN
DBSCAN is implemented by the function dbscan:
?dbscan
To apply DBSCAN to the iris data set with eps = 0.3 and minPts = 4:
db <- dbscan(x, eps = .3, minPts = 4)
db
To visualize the clustering solution, we can plot the points in different clusters with different colors:
pairs(x, col = db$cluster + 1L)
YOUR ANSWER HERE
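The `+ 1L` offset in the plotting call is there because dbscan labels noise points as cluster 0, and color 0 is the background color in base R graphics. As a small sketch of how the labels might be inspected (the names `x2d`, `fit`, `sizes`, `n_noise`, and `n_clusters` are chosen here for illustration and are not part of the notebook), the cluster sizes can be tabulated:

```r
library(dbscan)

data("iris")
x2d <- as.matrix(iris[, 1:2])          # the same two sepal attributes

fit <- dbscan(x2d, eps = 0.3, minPts = 4)

# Cluster label 0 marks noise; labels 1, 2, ... are the actual clusters.
sizes <- table(fit$cluster)
print(sizes)

n_noise <- sum(fit$cluster == 0)       # number of points left unclustered
n_clusters <- length(setdiff(unique(fit$cluster), 0))
cat("clusters:", n_clusters, " noise points:", n_noise, "\n")
```

A large share of noise points, or a single cluster that swallows almost everything, is usually a hint that eps or minPts deserves another look.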
For each data point, we can calculate the local outlier factor (LOF), which quantifies how much a point is locally an outlier using the reachability distance:
lof <- lof(x, minPts=5)
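One way such scores might be used is to flag points whose LOF is clearly above 1, since values near 1 indicate a density similar to that of the neighbors. The sketch below assumes a cutoff of 1.5, which is an arbitrary illustrative value rather than a recommendation, and uses the hypothetical names `x2d`, `lof_scores`, `outlier_idx`, and `flagged`:

```r
library(dbscan)

data("iris")
x2d <- as.matrix(iris[, 1:2])

lof_scores <- dbscan::lof(x2d, minPts = 5)

# Scores near 1 mean a point is about as dense as its neighbors;
# clearly larger scores suggest a local outlier.
threshold <- 1.5                        # assumed cutoff, for illustration only
outlier_idx <- which(lof_scores > threshold)

# Show the flagged rows together with their scores, highest first.
flagged <- data.frame(x2d[outlier_idx, , drop = FALSE],
                      lof = lof_scores[outlier_idx])
print(flagged[order(-flagged$lof), , drop = FALSE])
```

The appropriate cutoff depends on the data; inspecting a histogram of the scores is a reasonable first step before fixing one.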
pairs(x, cex = lof) # plot the points scaled by their LOF score
OPTICS
OPTICS is implemented by the function optics:
?optics
To apply OPTICS with eps = 1 and minPts = 4:
opt <- optics(x, eps=1, minPts = 4)
plot(opt)
opt
We can identify the clusters with a threshold, say 0.3, on the reachability distance:
opt <- extractDBSCAN(opt, eps_cl = .3)
plot(opt)
# YOUR CODE HERE
fail()
plot(opt)
hullplot(x,opt)
opt
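The threshold-based extraction with extractDBSCAN can be cross-checked against a direct DBSCAN run that uses the same threshold as eps. The following is a minimal sketch under that assumption; the names `x2d`, `opt_fit`, `opt_cut`, and `db_fit` are introduced here for illustration. The two labelings should be broadly similar, although border points may be assigned differently:

```r
library(dbscan)

data("iris")
x2d <- as.matrix(iris[, 1:2])

# OPTICS ordering with a generous eps, then a DBSCAN-style cut at 0.3.
opt_fit <- optics(x2d, eps = 1, minPts = 4)
opt_cut <- extractDBSCAN(opt_fit, eps_cl = 0.3)

# Direct DBSCAN run with the same threshold and minPts.
db_fit <- dbscan(x2d, eps = 0.3, minPts = 4)

# Cross-tabulate the two labelings. Label 0 is noise in both; the
# cluster numbers themselves need not match between the two methods.
print(table(optics = opt_cut$cluster, dbscan = db_fit$cluster))
```

Because a single OPTICS run can be cut at several thresholds, repeating the extractDBSCAN call with different eps_cl values is a cheap way to explore alternative clusterings without recomputing the ordering.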