SmoothCFTest
- class hyppo.ksample.SmoothCFTest(num_randfreq=5)
Smooth Characteristic Function test statistic and p-value
The Smooth Characteristic Function test is a two-sample test that compares the smoothed (analytic) characteristic functions of two data distributions to determine how different the two distributions are [1].
- Parameters
num_randfreq (int) -- Used to construct a random array of size (p, q), where p is the number of dimensions of the data and q is the number of random frequencies at which the test is performed. These are the random test points at which the test is evaluated (see Notes).
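To make the (p, q) shape concrete, the following is a minimal sketch of how such an array of random test points could be drawn and applied to an (n, p) data matrix. It assumes standard-normal frequencies and uses plain Python lists to stay dependency-free; the helper names `draw_frequencies` and `project` are hypothetical and are not part of hyppo.

```python
import random


def draw_frequencies(p, q, seed=None):
    """Return a p-by-q nested list of random test frequencies.

    Assumption: frequencies are drawn from a standard normal distribution.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(q)] for _ in range(p)]


def project(x, freqs):
    """Multiply an n-by-p data matrix by the p-by-q frequency matrix."""
    q = len(freqs[0])
    return [
        [sum(xi * f[j] for xi, f in zip(row, freqs)) for j in range(q)]
        for row in x
    ]


p, q = 10, 5  # 10 data dimensions, 5 random frequencies
freqs = draw_frequencies(p, q, seed=1234)

# Three illustrative samples with p dimensions each.
x = [[0.1 * (i + k) for k in range(p)] for i in range(3)]
xt = project(x, freqs)  # one column per random frequency
```

Each sample is reduced to q numbers, one per test point, which is why the cost of the test grows with num_randfreq rather than with the square of the sample size.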
Notes
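The test statistic takes the following form:

$$S_n = n \, W_n^T \Sigma_n^{-1} W_n$$

As this formulation shows, the test statistic takes the same form as the Hotelling $T^2$ statistic. However, the components are defined differently in this case. Given data sets X and Y, define $Z_i$, the vector of differences, as:

$$Z_i = \left( k(X_i, T_1) - k(Y_i, T_1), \ldots, k(X_i, T_J) - k(Y_i, T_J) \right) \in \mathbb{R}^J$$

This is the vector of differences between kernels evaluated at the test points $T_j$. The same formulation is used in the Mean Embedding Test. Moving forward, $W_n$ can be defined as the mean of the $Z_i$:

$$W_n = \frac{1}{n} \sum_{i=1}^{n} Z_i$$

This leaves $\Sigma_n$, the covariance matrix of the $Z_i$:

$$\Sigma_n = \frac{1}{n-1} \sum_{i=1}^{n} (Z_i - W_n)(Z_i - W_n)^T$$

In the specific case of the Smooth Characteristic Function test, the vector of differences is defined as follows, where $f$ is the smoothing function:

$$Z_i = \left( f(X_i)\sin(X_i T_j) - f(Y_i)\sin(Y_i T_j),\; f(X_i)\cos(X_i T_j) - f(Y_i)\cos(Y_i T_j) \right)_{j=1}^{J} \in \mathbb{R}^{2J}$$

Once $S_n$ is calculated, a threshold $r_\alpha$ corresponding to the $1 - \alpha$ quantile of a Chi-squared distribution with J degrees of freedom (one per component of $Z_i$, so $2J$ for the sine and cosine vector above) is chosen. The null hypothesis is rejected if $S_n$ is larger than this threshold.
References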
- [1] Kacper P. Chwialkowski, Aaditya Ramdas, Dino Sejdinovic, and Arthur Gretton. Fast two-sample testing with analytic representations of probability measures. Advances in Neural Information Processing Systems, 2015.
Methods Summary
| SmoothCFTest.statistic(x, y, random_state) | Calculates the smooth CF test statistic. |
| SmoothCFTest.test(x, y, random_state=None) | Calculates the smooth CF test statistic and p-value. |
- SmoothCFTest.statistic(x, y, random_state)
Calculates the smooth CF test statistic.
- Parameters
- Returns
stat (float) -- The computed Smooth CF statistic.
- SmoothCFTest.test(x, y, random_state=None)
Calculates the smooth CF test statistic and p-value.
- Parameters
- Returns
Examples
>>> import numpy as np
>>> from hyppo.ksample import SmoothCFTest
>>> np.random.seed(1234)
>>> x = np.random.randn(500, 10)
>>> y = np.random.randn(500, 10)
>>> stat, pvalue = SmoothCFTest().test(x, y, random_state=1234)
>>> '%.2f, %.3f' % (stat, pvalue)
'4.70, 0.910'
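For intuition about what the statistic measures, the following is a dependency-free sketch of the computation described in the Notes, restricted to one-dimensional samples and a single test frequency so that the covariance matrix is 2 x 2 and can be inverted in closed form. It is not hyppo's implementation: the function name `smooth_cf_1d`, the Gaussian smoothing function, the equal-sample-size requirement, and the fixed frequency are all assumptions made for illustration. With one frequency the difference vector has two components (sine and cosine), so the p-value is taken from a Chi-squared distribution with 2 degrees of freedom, whose survival function is exp(-s / 2).

```python
import math
import random


def smooth_cf_1d(x, y, freq):
    """Sketch of a smooth CF statistic for 1-D samples and one frequency.

    Assumptions (not taken from hyppo): equal sample sizes and the
    smoothing function f(v) = exp(-v**2 / 2).
    Returns (statistic, p-value under a chi-squared law with 2 d.o.f.).
    """
    n = len(x)
    if n != len(y):
        raise ValueError("this sketch assumes equal sample sizes")

    def f(v):
        return math.exp(-0.5 * v * v)

    # Z_i: differences of the smoothed sine and cosine features.
    z = [
        (
            f(a) * math.sin(a * freq) - f(b) * math.sin(b * freq),
            f(a) * math.cos(a * freq) - f(b) * math.cos(b * freq),
        )
        for a, b in zip(x, y)
    ]

    # W_n: mean of the difference vectors.
    w0 = sum(s for s, _ in z) / n
    w1 = sum(c for _, c in z) / n

    # Sigma_n: 2 x 2 sample covariance of the difference vectors.
    s00 = sum((s - w0) ** 2 for s, _ in z) / (n - 1)
    s11 = sum((c - w1) ** 2 for _, c in z) / (n - 1)
    s01 = sum((s - w0) * (c - w1) for s, c in z) / (n - 1)
    det = s00 * s11 - s01 * s01
    if det <= 0.0:
        raise ValueError("covariance matrix is singular")

    # S_n = n * W^T Sigma^{-1} W, using the closed-form 2 x 2 inverse.
    stat = n * (s11 * w0 * w0 - 2.0 * s01 * w0 * w1 + s00 * w1 * w1) / det

    # Survival function of a chi-squared variable with 2 d.o.f.
    pvalue = math.exp(-0.5 * stat)
    return stat, pvalue


rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(500)]
y_same = [rng.gauss(0.0, 1.0) for _ in range(500)]   # same distribution as x
y_shift = [rng.gauss(1.0, 1.0) for _ in range(500)]  # mean shifted by 1

stat_same, p_same = smooth_cf_1d(x, y_same, freq=1.0)
stat_shift, p_shift = smooth_cf_1d(x, y_shift, freq=1.0)
```

The shifted sample should produce a much larger statistic and a much smaller p-value than the sample drawn from the same distribution. For real analyses, SmoothCFTest handles multivariate data and several random frequencies.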