Addons/stats/base/univariate

stats/base/univariate - Univariate statistics

Definitions (v = verb, a = adverb):

mean            v  arithmetic mean (dyadic: weighted)
geomean         v  geometric mean
harmean         v  harmonic mean
commonmean      v  common mean
dev             v  deviation from mean (dyadic: weighted)
ssdev           v  sum of squared deviations (dyadic: weighted)
var             v  sample variance (dyadic: weighted)
stddev          v  standard deviation (dyadic: weighted)
varp            v  population variance
stddevp         v  population standard deviation
min             v  minimum
max             v  maximum
midpt           v  index of midpoint
median          v  median
quantiles       v  returns the quantiles of y at the specified probabilities x
nquantiles      v  returns the values which partition y into x quantiles
iqr             v  calculates the interquartile range (IQR) of y
ntiles          v  assign values of y to x quantiles
cile            v  assign values of y to x subsets of nearly equal size
rankOrdinal     a  ordinal ranking ("0 1 2 3") of array y
rankCompete     a  standard competition ranking ("0 0 2 3") of array y
rankDense       a  dense ranking ("0 0 1 2") of array y
rankFractional  a  fractional ranking ("0 1.5 1.5 3") of array y
dstat           v  descriptive statistics
freqcount       v  frequency count
Idotr           v  equivalent to I. but intervals closed on left, open on right
histogram       v  tally of the items in each bin
binnedData      a  applies verb u to the values of y after binning them in the intervals specified by x
kurtosis        v  4th moment coefficient
skewness        v  3rd moment coefficient

mean

mean (v) arithmetic mean (dyadic: weighted)

geomean

geomean (v) geometric mean

harmean

harmean (v) harmonic mean
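
For orientation, the three means listed so far can be sketched in plain J as follows. This is a minimal sketch for the monadic (unweighted) case only. The names meanSketch, geomeanSketch and harmeanSketch are hypothetical and are not claimed to be the addon's own definitions.

```j
NB. Hypothetical sketch verbs, monadic case only (no weights).
meanSketch=: +/ % #                  NB. sum divided by count
geomeanSketch=: # %: */              NB. n-th root of the product
harmeanSketch=: meanSketch&.:%       NB. reciprocal of the mean of the reciprocals

meanSketch 1 2 4                     NB. 2.33333
geomeanSketch 1 2 4                  NB. 2
harmeanSketch 1 2 4                  NB. 1.71429
```

For positive data the harmonic mean is never larger than the geometric mean, which is never larger than the arithmetic mean, as the three results show.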

commonmean

commonmean (v) common mean

dev

dev (v) deviation from mean (dyadic: weighted)

ssdev

ssdev (v) sum of squared deviations (dyadic: weighted)

var

var (v) sample variance (dyadic: weighted)

stddev

stddev (v) standard deviation (dyadic: weighted)

The "p" suffix marks the population definitions, which divide by n rather than n-1.

varp

varp (v) population variance

stddevp

stddevp (v) population standard deviation
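
The difference between the sample and population definitions comes down to the divisor. The following is a minimal sketch of the monadic case in plain J. All names ending in Sk are hypothetical, and the weighted dyadic forms are not covered.

```j
NB. Hypothetical sketch verbs, monadic case only.
meanSk=: +/ % #
devSk=: - meanSk                     NB. hook: y minus the mean of y
ssdevSk=: +/@:*:@:devSk              NB. sum of squared deviations
varSk=: ssdevSk % <:@#               NB. sample variance: divide by n-1
varpSk=: ssdevSk % #                 NB. population variance: divide by n
stddevSk=: %:@:varSk                 NB. square root of the sample variance
stddevpSk=: %:@:varpSk               NB. square root of the population variance

d=: 2 4 4 4 5 5 7 9
devSk d                              NB. _3 _1 _1 _1 0 0 2 4
ssdevSk d                            NB. 32
varSk d                              NB. 4.57143
varpSk d                             NB. 4
stddevSk d                           NB. 2.13809
stddevpSk d                          NB. 2
```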

min

min (v) minimum

max

max (v) maximum

midpt

midpt (v) index of midpoint

median

median (v) median
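
One way to relate midpt and median is sketched below, assuming the midpoint index is (n-1)/2 and that a fractional index means averaging the two neighbouring sorted values. The names midptSketch and medianSketch are hypothetical, not the addon's code.

```j
NB. Hypothetical sketch verbs for a list y.
midptSketch=: -:@<:@#                NB. (n-1)/2, fractional when n is even

medianSketch=: 3 : 0
  m=. midptSketch y                  NB. midpoint index into the sorted data
  idx=. (<. m) , >. m                NB. the index below and the index above m
  -: +/ idx { /:~ y                  NB. average of the two selected values
)

midptSketch 2 4 5 6 7 8 9            NB. 3
medianSketch 2 4 5 6 7 8 9           NB. 6
medianSketch 4 1 3 2                 NB. 2.5
```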

quantiles

quantiles (v) returns the quantiles of y at the probabilities specified in x

y is: numeric values to calculate quantiles for
x is: 0{:: probabilities at which to calculate quantiles (default 0.25 0.5 0.75)
      1{:: method for calculating quantiles, 4 to 9 (default 7)
EG: 0 0.25 0.5 0.75 1 quantiles 2 4 5 6 7 8 9
NB. There are a number of different methods for calculating quantiles
NB. https://en.wikipedia.org/wiki/Quantile , also Hyndman and Fan (1996)
qh4=. 4 : 'x * # y'               NB. alpha=0, beta=1
qh5=. 4 : '0.5 + x * # y'         NB. alpha=0.5, beta=0.5
qh6=. 4 : 'x * >:@# y'            NB. alpha=0, beta=0
qh7=. 4 : '1 + x * <:@# y'        NB. alpha=1, beta=1      ; default for R, NumPy & Julia
qh8=. 4 : '1r3 + x * 1r3 + # y'   NB. alpha=1/3, beta=1/3  ; recommended by Hyndman and Fan (1996)
qh9=. 4 : '3r8 + x * 0.25 + # y'  NB. alpha=3/8, beta=3/8  ; tends to be used for Normal QQ plots
Qh=: (qh4 f.)`(qh5 f.)`(qh6 f.)`(qh7 f.)`(qh8 f.)`(qh9 f.)
QuantileMethod=: 7               NB. default quantile method to use
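
To make the default method concrete, here is a minimal sketch of method 7 alone, written independently of the definitions above. The verb quantile7 is hypothetical. It uses the 0-based position p*(n-1), which is the 1-based position from qh7 minus one, and interpolates linearly between the two neighbouring sorted values.

```j
NB. Hypothetical sketch of quantile method 7 (linear interpolation).
NB. x is a list of probabilities in the range 0 to 1, y is a numeric list.
quantile7=: 4 : 0
  s=. /:~ y                          NB. sort the data ascending
  h=. x * <: # s                     NB. 0-based fractional position p*(n-1)
  lo=. <. h                          NB. index of the sorted value at or below h
  hi=. >. h                          NB. index of the sorted value at or above h
  (lo { s) + (h - lo) * (hi { s) - lo { s
)

0 0.25 0.5 0.75 1 quantile7 2 4 5 6 7 8 9    NB. 2 4.5 6 7.5 9
```

With 7 values the positions are 0 1.5 3 4.5 6, so the 0.25 quantile lies halfway between the sorted values 4 and 5.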

nquantiles

nquantiles (v) returns the values which partition y into x quantiles

Returns one fewer value than the number of quantiles specified, since x quantiles are separated by x-1 cut points.
y is: numeric values to calculate quantiles for
x is: number of quantiles (default 4)
EG: 4 nquantiles 2 4 5 6 7 8 9
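
The cut points correspond to equally spaced probabilities. The following minimal sketch shows how those probabilities could be derived from the number of quantiles. The verb cutProbs is hypothetical and is not part of the addon.

```j
NB. Hypothetical helper: probabilities of the n-1 cut points for n quantiles.
cutProbs=: }.@i. % ]                 NB. fork: (1 2 ... n-1) divided by n

cutProbs 4                           NB. 0.25 0.5 0.75
cutProbs 5                           NB. 0.2 0.4 0.6 0.8
```

Worked by hand with method 7, the probabilities 0.25 0.5 0.75 applied to 2 4 5 6 7 8 9 give the cut points 4.5 6 7.5.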

iqr

iqr (v) calculates the interquartile range (IQR) of y, that is, the 0.75 quantile minus the 0.25 quantile

ntiles

ntiles (v) assign values of y to x quantiles

x is: number of quantiles (default 4)
EG: 2 ntiles 2 4 5 6 7 8 9

cile

cile (v) assign values of y to x subsets of nearly equal size

eg: 3 cile i.12
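
One way to obtain subsets of nearly equal size is to scale the ordinal rank of each value into the range 0 to x-1. The sketch below follows that idea. The verb cileSketch is hypothetical, and its tie handling is an assumption that may differ from the addon's.

```j
NB. Hypothetical sketch: assign each value of y to one of x groups by rank.
cileSketch=: 4 : 0
  r=. /: /: , y                      NB. 0-based ordinal rank of each value
  ($ y) $ <. (x * r) % # r           NB. scale ranks to 0..x-1, restore the shape of y
)

3 cileSketch i. 12                   NB. 0 0 0 0 1 1 1 1 2 2 2 2
2 cileSketch 5 1 4 2                 NB. 1 0 1 0
```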

rankOrdinal

rankOrdinal (a) ordinal ranking ("0 1 2 3") of array y

Tied items are ranked in the order in which they appear in y.
eg: /: rankOrdinal 5 2 5 0 6 2 4  NB. rank ascending
eg: \: rankOrdinal 5 2 5 0 6 2 4  NB. rank descending

rankCompete

rankCompete (a) standard competition ranking ("0 0 2 3") of array y

eg: /: rankCompete 5 2 5 0 6 2 4  NB. rank ascending
eg: \: rankCompete 5 2 5 0 6 2 4  NB. rank descending

rankDense

rankDense (a) dense ranking ("0 0 1 2") of array y

eg: /: rankDense 5 2 5 0 6 2 4  NB. rank ascending
eg: \: rankDense 5 2 5 0 6 2 4  NB. rank descending

rankFractional

rankFractional (a) fractional ranking ("0 1.5 1.5 3") of array y

Items with the same ranking have the mean of their ordinal ranks.
eg: /: rankFractional 5 2 5 0 6 2 4  NB. rank ascending
eg: \: rankFractional 5 2 5 0 6 2 4  NB. rank descending
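
To compare the four ranking schemes on the example data, here is a minimal sketch for the ascending case only. The verbs below are hypothetical plain verbs, not the addon's adverbs, and they take the data directly rather than being applied to /: or \:.

```j
NB. Hypothetical sketch verbs: ascending ranks of a list y, 0-based.
rankOrdAsc=: /:@/:                   NB. ordinal: ties broken by position in y
rankCompAsc=: /:~ i. ]               NB. competition: first position in the sorted data
rankDenseAsc=: /:~@:~. i. ]          NB. dense: position in the sorted distinct values

rankFracAsc=: 3 : 0
  ord=. /: /: y                      NB. ordinal ranks
  m=. y (+/ % #)/. ord               NB. mean ordinal rank for each distinct value
  m {~ (~. y) i. y                   NB. give every item the mean rank of its group
)

d=: 5 2 5 0 6 2 4
rankOrdAsc d                         NB. 4 1 5 0 6 2 3
rankCompAsc d                        NB. 4 1 4 0 6 1 3
rankDenseAsc d                       NB. 3 1 3 0 4 1 2
rankFracAsc d                        NB. 4.5 1.5 4.5 0 6 1.5 3
```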

dstat

dstat (v) descriptive statistics

Returns a table of formatted descriptive statistics.

freqcount

freqcount (v) frequency count

Returns (value, frequency) pairs sorted by decreasing frequency.

Idotr

Idotr (v) Equivalent to I. but intervals are closed on the left and open on the right

Idotr : (0{x) <= y < (1{x)
   I. : (0{x) < y <= (1{x)
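
The difference only shows for data that fall exactly on a boundary. The sketch below contrasts the primitive I. with a hypothetical verb idotrSketch that counts the boundaries less than or equal to each value. It illustrates the left-closed convention and is not the addon's implementation.

```j
bounds=: 1 2 3
data=: 0 1 1.5 2 3 4

NB. Primitive I. : a value equal to a boundary falls in the lower interval.
bounds I. data                       NB. 0 0 1 1 2 3

NB. Hypothetical sketch: count the boundaries that are <= each value of y,
NB. so a value equal to a boundary falls in the upper interval.
idotrSketch=: 4 : '+/"1 y >:/ x'
bounds idotrSketch data              NB. 0 1 1 2 3 3
```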

histogram

histogram (v) tally of the items in each bin

x is a list of interval start/end points. The number of intervals is 1+#x, because values below the first point and values above the last point each form an interval.
y is an array of data
The result is a list of counts of the number of data points in each interval.
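
A tally of this kind can be sketched with the primitive I. as below. The verb histSketch is hypothetical. It assumes the right-closed intervals of I., and the entry above does not say whether the addon uses that convention or the one from Idotr.

```j
NB. Hypothetical sketch: count the data points falling in each of the 1+#x intervals.
histSketch=: 4 : 0
  bin=. x I. , y                     NB. interval index 0..#x for every data point
  <: #/.~ (i. >: # x) , bin          NB. prepend every index once so empty bins count as 0
)

1 2 3 histSketch 0 1 1.5 2 3 4       NB. 2 2 1 1
1 2 3 histSketch 0 4                 NB. 1 0 0 1
```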

binnedData

binnedData (a) Applies verb u to the values of y after binning them in the intervals specified by x

x is a list of interval start/end points. The number of intervals is 1+#x, because values below the first point and values above the last point each form an interval.
y is an array of data.
eg: < binnedData  NB. verb to box the binned data
eg: (+/ % #) binnedData  NB. verb to average the binned data
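
The idea of applying a verb per bin can be sketched as an explicit adverb built on I. and Key. The adverb binnedSketch is hypothetical. Unlike the description above, it returns results only for non-empty intervals, and it assumes the right-closed intervals of I.

```j
NB. Hypothetical sketch adverb: apply u to the data in each non-empty interval.
binnedSketch=: 1 : 0
:
  s=. /:~ , y                        NB. sort so that groups come out in interval order
  (x I. s) u/. s                     NB. group by interval index and apply u to each group
)

1 2 3 (+/ % #) binnedSketch 0 1 1.5 2 3 4    NB. 0.5 1.75 3 4
1 2 3 # binnedSketch 0 1 1.5 2 3 4           NB. 2 2 1 1
bx=: 1 2 3 < binnedSketch 0 1 1.5 2 3 4      NB. four boxes: 0 1 ; 1.5 2 ; 3 ; 4
```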

kurtosis

kurtosis (v) 4th moment coefficient

skewness

skewness (v) 3rd moment coefficient
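
As an illustration of the two moment coefficients, the sketch below uses population central moments: skewness as m3 divided by m2 to the power 1.5, and kurtosis as m4 divided by m2 squared, with no excess-kurtosis adjustment. These formulas and all names ending in Sk are assumptions for illustration and may differ from the addon's definitions.

```j
NB. Hypothetical sketch verbs based on population central moments.
meanSk=: +/ % #
cmomentSk=: 4 : 'meanSk (y - meanSk y) ^ x'                 NB. x-th central moment of y
skewSk=: 3 : '(3 cmomentSk y) % (2 cmomentSk y) ^ 1.5'      NB. 3rd moment coefficient
kurtSk=: 3 : '(4 cmomentSk y) % *: 2 cmomentSk y'           NB. 4th moment coefficient

d=: 2 4 4 4 5 5 7 9
2 cmomentSk d                        NB. 4
skewSk d                             NB. 0.65625
kurtSk d                             NB. 2.78125
```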