Title: | Dimension Reduction for Outlier Detection |
---|---|
Description: | A dimension reduction technique for outlier detection. DOBIN, a Distance based Outlier BasIs using Neighbours, constructs a set of basis vectors for outlier detection. It is not an outlier detection method itself; rather, it is a pre-processing method for outlier detection. It brings outliers to the forefront using fewer basis vectors (Kandanaarachchi, Hyndman 2020) <doi:10.1080/10618600.2020.1807353>. |
Authors: | Sevvandi Kandanaarachchi [aut, cre] |
Maintainer: | Sevvandi Kandanaarachchi <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.0.4 |
Built: | 2024-11-20 03:36:06 UTC |
Source: | https://github.com/sevvandi/dobin |
Scatterplot of the first two columns in the dobin space.
## S3 method for class 'dobin'
autoplot(object, ...)
object |
The output of the function 'dobin'. |
... |
Other arguments currently ignored. |
A ggplot object.
X <- rbind(
  data.frame(x = rnorm(500), y = rnorm(500), z = rnorm(500)),
  data.frame(
    x = rnorm(5, mean = 10, sd = 0.2),
    y = rnorm(5, mean = 10, sd = 0.2),
    z = rnorm(5, mean = 10, sd = 0.2)
  )
)
dob <- dobin(X)
autoplot(dob)
This function computes a set of basis vectors suitable for outlier detection.
dobin(xx, frac = 0.95, norm = 1, k = NULL)
xx |
The input data in a dataframe, matrix or tibble format. |
frac |
The cut-off quantile for the Y space. Default is 0.95. |
norm |
The normalization technique. Default is Min-Max, which normalizes each column to values between 0 and 1. |
k |
Parameter k for the k nearest neighbours. |
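The Min-Max option described for 'norm' can be illustrated with a few lines of base R. This is only a sketch of the general technique, not dobin's internal code: the helper name `min_max` and the handling of constant columns are assumptions made for the illustration.

```r
# Min-Max scaling: map every column linearly onto [0, 1].
# Hypothetical helper for illustration; not part of the dobin package.
min_max <- function(x) {
  x <- as.matrix(x)
  rng <- apply(x, 2, range)      # 2 x p matrix: row 1 = minima, row 2 = maxima
  span <- rng[2, ] - rng[1, ]
  span[span == 0] <- 1           # assumption: constant columns map to 0 instead of dividing by zero
  sweep(sweep(x, 2, rng[1, ], "-"), 2, span, "/")
}

set.seed(1)
X <- cbind(a = rnorm(50, mean = 10, sd = 3), b = runif(50, 100, 200))
Xn <- min_max(X)
apply(Xn, 2, range)              # per-column minimum and maximum after scaling
```

Scaling each column to a common range keeps a variable measured in large units from dominating the distance computations.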
A list with the following components:
rotation |
The basis vectors suitable for outlier detection. |
coords |
The dobin coordinates of the data. |
Yspace |
The associated Y space. |
Ypairs |
The pairs in xx used to construct the Y space. |
zerosdcols |
Columns in xx with zero standard deviation. |
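Because dobin is a pre-processing step, the returned coords are meant to be passed on to a separate outlier detector. The sketch below scores points by their mean distance to the k nearest neighbours in the first two dobin coordinates. The scoring helper `knn_score` is hypothetical and written in base R; it is not part of the package, and any other detector could be substituted. The dobin call is guarded so that the block still runs when the package is not installed.

```r
# Mean distance to the k nearest neighbours, used here as a simple outlier score.
# Hypothetical helper for illustration; not part of the dobin package.
knn_score <- function(coords, k = 5) {
  d <- as.matrix(dist(coords))   # pairwise Euclidean distances
  diag(d) <- Inf                 # ignore each point's distance to itself
  apply(d, 1, function(row) mean(sort(row)[seq_len(k)]))
}

set.seed(1)
X <- rbind(
  matrix(rnorm(200 * 3), ncol = 3),
  matrix(rnorm(5 * 3, mean = 10, sd = 0.2), ncol = 3)
)

if (requireNamespace("dobin", quietly = TRUE)) {
  dob <- dobin::dobin(X)
  # Score the points using only the first two dobin coordinates.
  scores <- knn_score(dob$coords[, 1:2], k = 10)
  # Row indices with the highest scores, i.e. the strongest outlier candidates.
  print(head(order(scores, decreasing = TRUE), 5))
}
```

Here k is chosen larger than the size of the planted outlier cluster, so that the clustered outliers do not mask one another in the neighbour distances.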
# A bimodal distribution in six dimensions, with 5 outliers in the middle.
set.seed(1)
x2 <- rnorm(405)
x3 <- rnorm(405)
x4 <- rnorm(405)
x5 <- rnorm(405)
x6 <- rnorm(405)
x1_1 <- rnorm(400, mean = 5)
mu2 <- 0
x1_2 <- rnorm(5, mean = mu2, sd = 0.2)
x1 <- c(x1_1, x1_2)
X1 <- cbind(x1, x2, x3, x4, x5, x6)
X2 <- cbind(-1 * x1_1, x2[1:400], x3[1:400], x4[1:400], x5[1:400], x6[1:400])
X <- rbind(X1, X2)
labs <- c(rep(0, 400), rep(1, 5), rep(0, 400))
dob <- dobin(X)
autoplot(dob)