j***@gmail.com
2018-09-30 18:28:27 UTC
While looking at the scale of GLM-NegativeBinomial I found an old notebook
"GLM, variance function, var_weights and scale or families faking each other"
https://gist.github.com/josef-pkt/ae344b04b264ea3db02da9c363b6aa24
What are the minimal assumptions behind GLM params and cov_params?
The notebook illustrates that the family itself doesn't matter; all we
need is a link/mean function, a variance function and a scale estimator.
As a special case `fit(scale='x2')` for Poisson or similar implements
the Quasi-Poisson case in R, i.e. we assume log link, Poisson variance
function and excess dispersion, where scale is not assumed to be 1.
Aside: I just realized that we have a backwards-incompatible change in
the default scale of GLM-NegativeBinomial in the 0.9 release.
Nobody complained, but it changes the cov_params/bse and Wald tests
for GLM-NegativeBinomial compared to 0.8.
Before 0.9, the default scale was estimated from pearson-chi2, i.e. it
allowed for excess dispersion. In 0.9 the default is scale=1, and
cov_params is computed under the assumption that the NegativeBinomial
variance function is correctly specified.
This will not be reverted because this definition is consistent with
the rest of GLM.
Josef