Discussion:
[pystatsmodels] Unclear License
Brock Mendel
2018-09-07 22:19:58 UTC
There are several LICENSE and license-like files, not all of which match.
I don't know if this has any real-world consequences, but license files
seem like something Worth Doing Right.

LICENSE.txt has a copyright last dated 2012
statsmodels/LICENSE.txt has a copyright last dated 2013
(https://github.com/statsmodels/statsmodels/pull/4973)

statsmodels/datasets/COPYING is copy/pasted from somewhere; it is not really
clear what "code and descriptive text" it refers to.
Possibly related are the instructions about obtaining permission for datasets
in sandbox/dataset_notes.rst

---
There is also COPYRIGHTS.txt that looks like it is mostly copy/pasted
versions of other packages' licenses. It isn't clear which of these are
actually needed.

multivariate/factor_rotation has COPYRIGHTS.txt and LICENSE.txt which seem
OK. Should this be listed in repo-level COPYRIGHTS?

stats/libqsturng/LICENSE.txt _is_ copy/pasted in COPYRIGHTS

There is a BSD license copy/pasted into sandbox/tsa/diffusion2.py
----
Many modules have docstrings that list their Licenses as either BSD,
Simplified-BSD, or BSD-3. Does the top-level LICENSE.txt file not apply to
files listing non-BSD3 licenses?
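
For anyone who wants to repeat this audit, here is a minimal sketch of how the license-like files and the per-module license lines could be listed. The set of file names, the "License: ..." header pattern, and both function names are assumptions made for illustration, not anything statsmodels ships.

```python
import re
from pathlib import Path

# File names treated as "license-like"; this set is an assumption.
LICENSE_NAMES = {"LICENSE", "LICENSE.txt", "COPYING", "COPYRIGHTS.txt"}

# Matches header lines such as "License: BSD-3" or "# License : Simplified-BSD".
LICENSE_LINE = re.compile(
    r"^\s*#?\s*Licen[sc]e\s*:\s*(.+?)\s*$", re.IGNORECASE | re.MULTILINE
)


def find_license_files(root):
    """Return sorted relative paths of license-like files under root."""
    root = Path(root)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and p.name in LICENSE_NAMES
    )


def declared_licenses(root):
    """Map each .py file to the first 'License: ...' line found in it."""
    root = Path(root)
    found = {}
    for path in sorted(root.rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        match = LICENSE_LINE.search(text)
        if match:
            found[path.relative_to(root).as_posix()] = match.group(1)
    return found
```

Running both functions on a checkout gives one list of license files to reconcile and one mapping of modules to whatever license their headers claim.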
j***@gmail.com
2018-09-07 22:52:12 UTC
Post by Brock Mendel
There are several LICENSE and license-like files, not all of which match.
I don't know if this has any real-world consequences, but license files
seem like something Worth Doing Right.
LICENSE.txt has a copyright last dated 2012
statsmodels/LICENSE.txt has a copyright last dated 2013
(https://github.com/statsmodels/statsmodels/pull/4973)
statsmodels/datasets/COPYING is copy/pasted from somewhere, not really
clear what "code and descriptive text" it refers to
possibly related are instructions about obtaining permission for datasets
in sandbox/dataset_notes.rst
The structure of our dataset code was initially based on a dataset
package by David C.
Post by Brock Mendel
---
There is also COPYRIGHTS.txt that looks like it is mostly copy/pasted
versions of other packages' licenses. It isn't clear which of these are
actually needed.
This is required by Debian, and we added it to get statsmodels initially
into Debian.
We added the licenses of major code included from other sources;
factor_rotation should be added.
Post by Brock Mendel
multivariate/factor_rotation has COPYRIGHTS.txt and LICENSE.txt which seem
OK. Should this be listed in repo-level COPYRIGHTS?
stats/libqsturng/LICENSE.txt _is_ copy/pasted in COPYRIGHTS
It is an included package, so we also keep the license with the source.
Post by Brock Mendel
There is a BSD license copy/pasted into sandbox/tsa/diffusion2.py
As it says, it's translated from MATLAB, and the BSD license is for the
MATLAB File Exchange version.
Post by Brock Mendel
----
Many modules have docstrings that list their Licenses as either BSD,
Simplified-BSD, or BSD-3. Does the top-level LICENSE.txt file not apply to
files listing non-BSD3 licenses?
BSD-3 applies to all modules with a few explicit exceptions, e.g., we are
not copyright holders of data that is in the public domain.
We don't have a consistent policy on what should be in the header of modules.
The Simplified-BSD headers are most likely unchanged from the first two years
of statsmodels, before we coordinated on BSD-3 because Simplified BSD versus
Modified BSD is confusing.

Josef
Ralf Gommers
2018-09-12 05:57:04 UTC
Post by j***@gmail.com
Post by Brock Mendel
There are several LICENSE and license-like files, not all of which
match. I don't know if this has any real-world consequences, but license
files seem like something Worth Doing Right.
LICENSE.txt has a copyright last dated 2012
statsmodels/LICENSE.txt has a copyright last dated 2013
(https://github.com/statsmodels/statsmodels/pull/4973)
statsmodels/datasets/COPYING is copy/pasted from somewhere, not really
clear what "code and descriptive text" it refers to
possibly related are instructions about obtaining permission for datasets
in sandbox/dataset_notes.rst
the original structure of our dataset code was initially based on a
dataset package by David C.
Post by Brock Mendel
---
There is also COPYRIGHTS.txt that looks like it is mostly copy/pasted
versions of other packages' licenses. It isn't clear which of these are
actually needed.
This is required by Debian, and we added it to get statsmodels initially
into Debian.
We added the licenses of major code included from other sources,
factor_rotation should be added
Pretty sure this is no longer needed (numpy/scipy never had that). Having
everything in a single LICENSE(.txt) file is probably clearer. Not too long
ago we did clean up the licensing of numpy and scipy, as well as added all
the licenses for code included in the wheels per platform to the wheel
builds. Could be useful to use the same structure here.
Post by j***@gmail.com
Post by Brock Mendel
multivariate/factor_rotation has COPYRIGHTS.txt and LICENSE.txt which
seem OK. Should this be listed in repo-level COPYRIGHTS?
stats/libqsturng/LICENSE.txt _is_ copy/pasted in COPYRIGHTS
included package so we keep the license also with the source
This is useful; keeping them in the vendored package is best practice, I
think. Probably no need for a duplicate copy though. Here's what numpy does:

<BSD license text>
...
The NumPy repository and source distributions bundle several libraries that
are compatibly licensed. We list these here.

Name: Numpydoc
Files: doc/sphinxext/numpydoc/*
License: 2-clause BSD
For details, see doc/sphinxext/LICENSE.txt

Name: scipy-sphinx-theme
Files: doc/scipy-sphinx-theme/*
License: 3-clause BSD, PSF and Apache 2.0
For details, see doc/sphinxext/LICENSE.txt

Name: lapack-lite
Files: numpy/linalg/lapack_lite/*
License: 3-clause BSD
For details, see numpy/linalg/lapack_lite/LICENSE.txt
...
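
One side benefit of this layout is that it is easy to check mechanically. A minimal sketch of a parser for the stanzas above, assuming they are separated by blank lines and that each ends with a "For details, see ..." line; the function name `parse_bundled` and the `Details` key are made up for illustration:

```python
def parse_bundled(text):
    """Parse 'Key: value' stanzas separated by blank lines into dicts."""
    prefix = "For details, see "
    entries = []
    current = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line closes the current stanza, if any.
            if current:
                entries.append(current)
                current = {}
            continue
        if line.startswith(prefix):
            # Store the referenced license path under a hypothetical key.
            current["Details"] = line[len(prefix):]
        elif ":" in line:
            key, value = line.split(":", 1)
            current[key.strip()] = value.strip()
    if current:
        entries.append(current)
    return entries


sample = """Name: Numpydoc
Files: doc/sphinxext/numpydoc/*
License: 2-clause BSD
For details, see doc/sphinxext/LICENSE.txt

Name: lapack-lite
Files: numpy/linalg/lapack_lite/*
License: 3-clause BSD
For details, see numpy/linalg/lapack_lite/LICENSE.txt
"""

entries = parse_bundled(sample)
```

Each resulting dict could then be checked against the repository, for example by confirming that the file named under `Details` actually exists.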


Cheers,
Ralf