## Here is an example of why:

Simulate some data from a mixture of binomial distributions:

``````set.seed(1)
n <- 100  # sample size
ub <- 1000  # upper bound
p1 <- 0.8  # probability
p2 <- 0.99  # probability
X <- rbinom(n, ub, sample(c(p1, p2), size = 100, replace = TRUE))
``````

These data come from this density function:

``````xlim <- ub * c(0.7, 1.05)
xseq <- seq(xlim[1], xlim[2], 1)
den_tr <- 0.5 * dbinom(xseq, ub, p1) + 0.5 * dbinom(xseq, ub, p2)
``````
``````par(cex.lab = 2, cex.main = 2.2, cex.axis = 1.8, mar = c(5, 6, 2, 1))
plot(xseq, den_tr, type = "l", main = "True density", ylab = "", xlab = "X")
abline(v = ub)
``````

The vertical line gives the maximum possible value for the random variable.

Now try estimating the density using three methods

``````den_hs <- hist(X, plot = FALSE)
den_st <- density(X)
library(logspline)
den_ls <- logspline(X, lbound = 0, ubound = ub)
``````
``````## Warning: too much data close together
``````
``````## possible infinite density at upper end
## running program with fewer knots
``````
``````## Warning: re-ran with oldlogspline
``````
``````par(cex.lab = 2, cex.main = 2, cex.axis = 1.8, mar = c(5, 6, 2, 1))
plot(den_ls, xlim = xlim, main = "logspline estimate", ylab = "", xlab = "X")
abline(v = ub)
``````

In this example, the `logspline` function does much better than the default settings for `density` or `hist` functions.

I'm not saying that `logspline` will always be best, but it does have some pretty sweet features. In particular, you can set upper and lower bounds on the random variable (zero and 1000 in this case). Also, `logspline` provides nice info about the fit of the model (because that’s what density estimates are…they’re models),

``````den_ls
``````
``````##  knots A(1)/D(2) loglik    AIC minimum penalty maximum penalty
##      3         2 -568.1 1145.5          270.20             Inf
##      4         2 -433.0  879.9            4.44          270.20
##      5         2 -430.8  880.1            1.94            4.44
##      6         2 -429.9  882.7            1.61            1.94
##      7         2 -429.1  885.7            0.01            1.61
##      8         1 -429.0  890.3            0.00            0.01
## the present optimal number of knots is  4
## penalty(AIC) was the default: BIC=log(samplesize): log( 100 )= 4.61
``````

Is there a reason you don’t likewise constrain the `density()` function via `from = 0, to = ub`?