-
Notifications
You must be signed in to change notification settings - Fork 1
/
discretesample.m
109 lines (92 loc) · 3.14 KB
/
discretesample.m
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
function x = discretesample(p, n)
% Samples from a discrete distribution
%
% x = discretesample(p, n)
% independently draws n samples (with replacement) from the
% distribution specified by p, where p is a probability array
% whose elements sum to 1.
%
% Suppose the sample space comprises K distinct objects, then
% p should be an array with K elements. In the output, x(i) = k
% means that the k-th object is drawn at the i-th trial.
%
% Remarks
% -------
% - This function is mainly for efficient sampling in non-uniform
% distribution, which can be either parametric or non-parametric.
%
% - The function is implemented based on histc, which has been
% highly optimized by mathworks. The basic idea is to divide
% the range [0, 1] into K bins, with the length of each bin
% proportional to the probability mass. And then, n values are
% drawn from a uniform distribution in [0, 1], and the bins that
% these values fall into are picked as results.
%
% - This function can also be employed for continuous distribution
% in 1D/2D dimensional space, where the distribution can be
% effectively discretized.
%
% - This function can also be useful for sampling from distributions
% which can be considered as weighted sum of "modes".
% In this type of applications, you can first randomly choose
% a mode, and then sample from that mode. The process of choosing
% a mode according to the weights can be accomplished with this
% function.
%
% Examples
% --------
% % sample from a uniform distribution for K objects.
% p = ones(1, K) / K;
% x = discretesample(p, n);
%
% % sample from a non-uniform distribution given by user
% x = discretesample([0.6 0.3 0.1], n);
%
% % sample from a parametric discrete distribution with
% % probability mass function given by f.
% p = f(1:K);
% x = discretesample(p, n);
%
% Created by Dahua Lin, On Oct 27, 2008
%
%% parse and verify input arguments
assert(isfloat(p), 'discretesample:invalidarg', ...
'p should be an array with floating-point value type.');
assert(isnumeric(n) && isscalar(n) && n >= 0 && n == fix(n), ...
'discretesample:invalidarg', ...
'n should be a nonnegative integer scalar.');
%% main
% process p if necessary
K = numel(p);
if ~isequal(size(p), [1, K])
p = reshape(p, [1, K]);
end
% construct the bins
edges = [0, cumsum(p)];
s = edges(end);
if abs(s - 1) > eps
edges = edges * (1 / s);
end
% draw bins
rv = rand(1, n);
% old implementation with histc
% c = histc(rv, edges);
% ce = c(end);
% c = c(1:end-1);
% c(end) = c(end) + ce;
% new implementation with histcounts
c = histcounts(rv, edges);
% extract samples
xv = find(c);
if numel(xv) == n % each value is sampled at most once
x = xv;
else % some values are sampled more than once
xc = c(xv);
d = zeros(1, n);
dv = [xv(1), diff(xv)];
dp = [1, 1 + cumsum(xc(1:end-1))];
d(dp) = dv;
x = cumsum(d);
end
% randomly permute the sample's order
x = x(randperm(n));