[pystatsmodels] GLM QMLE and scale estimates

j***@gmail.com

2018-10-01 13:30:00 UTC

Post by j***@gmail.com
While looking at the scale of GLM-NegativeBinomial, I found an old
notebook, "GLM, variance function, var_weights and scale or families
faking each other":
https://gist.github.com/josef-pkt/ae344b04b264ea3db02da9c363b6aa24
What are the minimal assumptions behind GLM params and cov_params?
The notebook illustrates that the family itself doesn't matter; all we
need is a link/mean function, a variance function and a scale estimator.
As a special case, `fit(scale='x2')` for Poisson or similar families
implements the Quasi-Poisson case in R, i.e. we assume a log link, the
Poisson variance function and excess dispersion, so the scale is
estimated from the data instead of being fixed at 1.
Aside: I just realized that we have a backwards-incompatible change in
the default scale in GLM-NegativeBinomial in the 0.9 release.
Nobody complained, but it changes the cov_params/bse and Wald tests
for GLM-NegativeBinomial compared to 0.8.
Before 0.9, the default scale was estimated by Pearson chi2, i.e. it
allowed for excess dispersion. In 0.9 the default is scale=1, and
cov_params is computed under the assumption that the NegativeBinomial
variance function is correctly specified.
This will not be reverted because this definition is consistent with
the rest of GLM.
Josef

I found another gist notebook of mine on dispersion versus HC0 in
GLM-Poisson for the simple two-sample case:
https://gist.github.com/josef-pkt/ff08f8c446576faa3654d17694da01fc
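
For the two-sample case the three variance estimates have closed forms,
so they can be sketched in plain Python. This is not code from the
notebook; the function name and the sample data are made up. It assumes
a Poisson GLM with log link and one mean per group, so the fitted means
are the group means, and it returns the standard error of the difference
in log means under scale=1, under the pooled Pearson chi2 scale, and
under the HC0 sandwich.

```python
from math import sqrt


def two_sample_poisson_se(y0, y1):
    """Standard errors of log(mean1) - log(mean0) in a two-sample Poisson GLM."""
    groups = [y0, y1]
    n_total = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    # Within-group sums of squared residuals.
    ss = [sum((y - m) ** 2 for y in g) for g, m in zip(groups, means)]

    # Model-based variance of each log mean: 1 / (n_g * mean_g).
    var_ml = [1.0 / (len(g) * m) for g, m in zip(groups, means)]

    # Pearson chi2 scale: one dispersion pooled over both groups.
    pearson_chi2 = sum(s / m for s, m in zip(ss, means))
    scale = pearson_chi2 / (n_total - 2)

    # HC0 sandwich: each group uses its own squared residuals.
    var_hc0 = [s / (len(g) * m) ** 2 for s, g, m in zip(ss, groups, means)]

    return {
        "scale": scale,
        "se_ml": sqrt(sum(var_ml)),
        "se_x2": sqrt(scale * sum(var_ml)),
        "se_hc0": sqrt(sum(var_hc0)),
    }


# Hypothetical data: the second group is much more dispersed than the first.
res = two_sample_poisson_se([1, 2, 3, 2, 2], [0, 10, 2, 8, 5])
for name, value in res.items():
    print(name, round(value, 4))
```

The Pearson scale applies a single dispersion factor to both groups,
while HC0 lets each group contribute its own residual variance, which is
why the two can disagree when the groups are unequally dispersed.
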

Josef