Skip to content

Conversation

@tornaria
Copy link
Contributor

@tornaria tornaria commented Oct 5, 2023

This PR adds support for running with python 3.12 from system. This has been tested on the python 3.12 branch of void linux (x86_64, x86_64-musl and i686).

The first two commits correspond to #36403. The rest is split is small pieces and I tried to add reasonable explanations in the commit messages. Reviewing by commit may be easier (ignoring the first two, already reviewed).

See also: #36181

⌛ Dependencies

#36403

@kiwifb
Copy link
Member

kiwifb commented Oct 6, 2023

I was thinking about what was needed to enable 3.12 so it is timely. It is mostly doctest fix apart from that stuff with pylong. Can you expand on that stuff? It mentions some headers from python 3.12, naively I would have thought that was stuff removed from 3.12 that was still needed. If it is supposed to be internal, why do we need to use it?

@tornaria
Copy link
Contributor Author

tornaria commented Oct 6, 2023

I was thinking about what was needed to enable 3.12 so it is timely. It is mostly doctest fix apart from that stuff with pylong. Can you expand on that stuff? It mentions some headers from python 3.12, naively I would have thought that was stuff removed from 3.12 that was still needed. If it is supposed to be internal, why do we need to use it?

As I mentioned in the commit message, the layout for python integers changed in python 3.12. This affects cypari2 and fpylll as well as sagemath, since all three projects implement functions to convert from/to python integers.

Since the change is forward and backward incompatible, I created a module pycore_long which implements some api to python integers. This is all declared in pycore_long.pxd and implemented in pycore_long.h. Note that these two files are shared verbatim with cypari2 (sagemath/cypari2#138) and fpylll (fplll/fpylll#254). Arguably this should be moved to cython itself (there is a cython module cpython.longintrepr) but for now this is the simpler solution.

I suggest looking at the raw files: in pycore_long.pxd there are a few functions declared, and this is all a user of this module needs to know. These declarations are meant to complement the ones in cpython.longintrepr.

In pycore_long.h is the implementation: one for python 3.12, one for < 3.12. The function names and prototypes, and the implementation for 3.12, is taken verbatim from cpython 3.12 source code. This way, if a later version of cpython decides to change again the representation of integers, we can just copy the corresponding code. The fall-back implementation for < 3.12 is mine, but you'll see everything is trivial (the function names are already very clear on intent, this is just "impedance matching" code).

The only function not taken from python is ob_digit() which is just an abstraction around getting the ob_digit struct member which they moved around.

As for the low level details: the main difference (besides moving the ob_digit member around) is that in the previous implementation they abused the size of the integer to place the sign. In the current implementation they added an additional member lv_tag to hold the sign. I expect they want to use lv_tag for other stuff in the future, presumably to have a "short" representation for integers that fit in one word.

The layout for python integers changed in python 3.12.
We add a module `sage.cpython.pycore_long` which copies the new
(internal) api of PyLong from python 3.12. We also implement fallback
version of these functions suitable for python pre-3.12.

Note the files implementing the `pycore_long` module (`pycore_long.pxd`
and `pycore_long.h`) are shared with fpylll and cypari2.
In python 3.12, the `struct atexit_callback` was renamed to
`struct atexit_py_callback`. The easiest workaround is to add
`#define atexit_callback atexit_py_callback` in the right place when
building using python 3.12 or newer.
Tracing has changed in python 3.12 in such a way that cython doesn't
support it properly anymore. This one file sets `profile=true` for
cython which won't work anymore (and it fails to build, at least with
cython 0.29). We just disable that line.

See also: cython/cython#5450
To use some (internal) declarations related to dict type, we have to
include `<internal/pycore_dict.h>` which needs `#define Py_BUILD_CORE`
to be loaded. This causes trouble when `Python.h` was included before
defining `Py_BUILD_CORE`, due to a macro `_PyGC_FINALIZED`. We fix it
by undefining said macro.
Some changes in ast, the old `node.n` and `node.s` are deprecated in
favour of a common `node.value`. Making this change seems better than
just ignoring the deprecation warning.
This adds some filterwarnings that trigger with python 3.12.

 - deprecation of `datetime.datetime.utcfromtimestamp()` this is
   triggered by python modules `dateutil` and `sphinx`.
 - `os.fork()` and `os.ptyfork()` are deprecated when running
   multi-threaded; I don't see an easy way out of this, so ignore it.
 - itertools won't support pickling in python 3.14; let's ignore this
   for now with the hope there's an alternative before 3.14.
Is deprecated, and it can be replaced just fine with
`importlib.util.find_spec()`.
Since `importer.find_module(...)` was removed in 3.12.

We just follow the suggestion from

https://docs.python.org/3/library/importlib.html#importing-a-source-file-directly

Note that the last three added lines here could be replaced instead by

    spec.loader.load_module(module_name)

which works; however this is deprecated so it's better to use the
recommended way using `importlib.util.module_from_spec(...)` and
`spec.loader.execute_module(...)`.
In python < 3.12 we have

    sage: a = delta_qexp(1000)
    sage: sum(a[n]/float(n)^14 for n in range(1,1000))
    0.9985830631627459

This changed in python 3.12 to

    sage: sum(a[n]/float(n)^14 for n in range(1,1000))
    0.9985830631627461

The latter is the correct one as can be seen using rationals:

    sage: float(sum(a[n]/n^14 for n in range(1,1000)))
    0.9985830631627461

As a workaround, we do the sum in reverse (from small to large terms),
which gives the correct result in any case:

    sage: sum(a[n]/float(n)^14 for n in reversed(range(1,1000)))
    0.9985830631627461
In python 3.12 the printing of OrderedDict has been changed.

As of Python 3.7, regular dicts are guaranteed to be ordered, so it's
safe to replace OrderedDict by dict.

Maybe convenient to replace other uses of OrderedDict, although this is
out of scope of python 3.12 support.
Running

    sage: g = Polyhedron().face_generator()
    sage: g.__next__()
    A -1-dimensional face of a Polyhedron in ZZ^0
    sage: g.__next__()

is supposed to raise `StopIteration`. However in python 3.12 the second
call to `__next__()` leads to a crash.

This is caused by a `return -1` in `next_face_loop()` which is supposed
to mean `raise StopIteration`. Using raise explicitly fixes the crash.
@tornaria
Copy link
Contributor Author

tornaria commented Oct 9, 2023

rebased to beta6

@github-actions
Copy link

Documentation preview for this PR (built with commit 3a7cd3f; changes) is ready! 🎉

@tornaria
Copy link
Contributor Author

FTR: before I was only testing this with cython 0.29 (this PR applied to sagemath 10.1).

Now I've tested this with cython 3:

In summary, when actually updating python to 3.12 something has to be done with cython.

Everything seems to work ok when using cython 3.0.2 + cython/cython#5725.

A further note: this PR is about supporting using python 3.12. Please review from that point of view: the important thing is that this makes sense and it doesn't cause any problem with older python.

I'd appreciate if this can be merged for 10.2 even if it might make more sense to wait a bit more to actually update the python in sage-the-distro. We are already shipping python 3.12 in void linux so this is well tested at least in the architectures we support.

Copy link
Contributor

@mkoeppe mkoeppe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@vbraun vbraun merged commit 9144356 into sagemath:develop Oct 21, 2023
@dimpase
Copy link
Member

dimpase commented Apr 23, 2024

_Py_Isimmortal, which we see in the backtrace, is new in Python 3.12.
So this might be something to fix in cysignals, or
in Sage guts somewhere

https://peps.python.org/pep-0683/

@antonio-rojas
Copy link
Contributor

_Py_Isimmortal, which we see in the backtrace, is new in Python 3.12. So this might be something to fix in cysignals, or in Sage guts somewhere

But op is a nullptr long before reaching _Py_Isimmortal, so this may be a previously existing issue that only now is being detected by _Py_Isimmortal trying to dereference it.

@antonio-rojas
Copy link
Contributor

Calling any libgap function in three arguments triggers the crash.

sage: libgap.AbelianGroup(0,0,0)
-> crash

@dimpase
Copy link
Member

dimpase commented Apr 23, 2024

Can you attach your sage/libs/gap/element.c here?

@antonio-rojas
Copy link
Contributor

Can you attach your sage/libs/gap/element.c here?

element.tar.gz

@tornaria
Copy link
Contributor Author

libgap.AbelianGroup(0,0,0)

Can't reproduce this one either:

$ sage
┌────────────────────────────────────────────────────────────────────┐
│ SageMath version 10.3, Release Date: 2024-03-19                    │
│ Using Python 3.12.3. Type "help()" for help.                       │
└────────────────────────────────────────────────────────────────────┘
sage: libgap.AbelianGroup(0,0,0)
---------------------------------------------------------------------------
GAPError                                  Traceback (most recent call last)
Cell In[1], line 1
----> 1 libgap.AbelianGroup(Integer(0),Integer(0),Integer(0))

File /usr/lib/python3.12/site-packages/sage/libs/gap/element.pyx:2514, in sage.libs.gap.element.GapElement_Function.__call__ (build/cythonized/sage/libs/gap/element.c:26294)()
   2512 try:
   2513     sig_GAP_Enter()
-> 2514     sig_on()
   2515     if n == 0:
   2516         result = GAP_CallFunc0Args(self.value)

GAPError: Error, usage: AbelianGroup( [<filter>, ]<ints> )
sage: 

FWIW, we are on gap 4.12.2, cython 3.0.10, cysignals 1.11.4, ... (what else would be relevant?)

We don't have mpdecimal.

@dimpase
Copy link
Member

dimpase commented Apr 23, 2024

Calling any libgap function in three arguments triggers the crash.
Do you mean, any libgap function which does not take 3 arguments?

sage: libgap.AbelianGroup(0,0,0)
-> crash

same crash place?

Probably, to reproduce one needs to know compiler version, cython version, maybe even glibc and pthreads version, cysignals version...

@tornaria
Copy link
Contributor Author

For me gcc is 13.2.0, glibc is 2.39 (and gap 4.12.2, cython 3.0.10, cysignals 1.11.4, ... )

Unrelated, but how do you deal with #37616 if you are on gap 4.13?

@antonio-rojas
Copy link
Contributor

same crash place?

Probably, to reproduce one needs to know compiler version, cython version, maybe even glibc and pthreads version, cysignals version...

Yes, same backtrace.
GCC 13.2.1, cython 3.0.10, glibc 2.39, cysignals 1.11.4.
Tried rebuilding python and sage without any Arch specific build flags, same result.

What baffles me is that the definition of Py_XDECREF is

static inline void Py_XDECREF(PyObject *op)
{
    if (op != _Py_NULL) {
        Py_DECREF(op);
    }
}

so this part of the backtrace doesn't make any sense to me:

#8  Py_DECREF (op=0x0) at /usr/include/python3.12/object.h:701
#9  Py_XDECREF (op=0x0) at /usr/include/python3.12/object.h:799

@antonio-rojas
Copy link
Contributor

antonio-rojas commented Apr 24, 2024

For me gcc is 13.2.0, glibc is 2.39 (and gap 4.12.2, cython 3.0.10, cysignals 1.11.4, ... )

Unrelated, but how do you deal with #37616 if you are on gap 4.13?

I ignore the test failures. None of them is a serious issue. In any case, this is also reproducible with 4.12.

@antonio-rojas
Copy link
Contributor

antonio-rojas commented Apr 24, 2024

Narrowed this down to GCC optimization. Compiling with -O2 or higher triggers this, compiling with -O1 works fine. This would explain why this is only reproducible in some distros.

Compiling with clang works OK with any optimization level.

@dimpase
Copy link
Member

dimpase commented Apr 24, 2024

Is this compiling libgap? or element.pyx ?

If this is compiling libgap with different -O values then it's GAP bug.

@antonio-rojas
Copy link
Contributor

Is this compiling libgap? or element.pyx ?

If this is compiling libgap with different -O values then it's GAP bug.

It's compiling the previously attached element.c

@tornaria
Copy link
Contributor Author

Narrowed this down to GCC optimization. Compiling with -O2 or higher triggers this, compiling with -O1 works fine. This would explain why this is only reproducible in some distros.

Compiling with clang works OK with any optimization level.

This is our cc line for that file:
cc -fno-strict-overflow -Wsign-compare -DNDEBUG -g -O3 -Wall -fstack-clash-protection -D_FORTIFY_SOURCE=2 -mtune=generic -O2 -pipe -g -fstack-clash-protection -D_FORTIFY_SOURCE=2 -mtune=generic -O2 -pipe -g -fstack-clash-protection -D_FORTIFY_SOURCE=2 -mtune=generic -O2 -pipe -ffile-prefix-map=/builddir/sagemath-10.4.beta3/pkgs/sagemath-standard=. -fPIC -Isage/cpython -I/usr/lib/python3.12/site-packages/cysignals -I/builddir/sagemath-10.4.beta3/pkgs/sagemath-standard -I/usr/lib/python3.12/site-packages/numpy/core/include -I/usr/include/python3.12 -Ibuild/cythonized -I/usr/include/python3.12 -c build/cythonized/sage/libs/gap/element.c -o build/temp.linux-x86_64-cpython-312/build/cythonized/sage/libs/gap/element.o -fno-strict-aliasing -DCYTHON_CLINE_IN_TRACEBACK=1

@dimpase
Copy link
Member

dimpase commented Apr 24, 2024

What baffles me is that the definition of Py_XDECREF is
[...]

// Static inline functions should use _Py_NULL rather than using directly NULL
// to prevent C++ compiler warnings. On C23 and newer and on C++11 and newer,
// _Py_NULL is defined as nullptr.
#if (defined (__STDC_VERSION__) && __STDC_VERSION__ > 201710L) \
        || (defined(__cplusplus) && __cplusplus >= 201103)
#  define _Py_NULL nullptr
#else
#  define _Py_NULL NULL
#endif

Py_NULL and NULL are the same thing, IMHO. So probably some kind of optimisation causes a corruption of call frames.

Something to do with _Py_IsImmortal - Py_XDECREF and Py_DECREF are kicking the can down the road to _Py_IsImmortal for some reason.

Can you insert prints, or otherwise see the actual values of op at the calling moments, not post mortem?

Also, can you add print in the body of the sig_GAP_Enter macro (defined close to the top of element.c)

#define sig_GAP_Enter()  {int t = GAP_Enter(); if (!t) sig_error();}

to see what value of t we're getting from this libgap.Sum(1,2,3), or whatever the crash's trigger is.

Maybe t must be declared volatile there? (to prevent it being optimized away)

Of course, sometimes print statements cause sudden changes in how the code operates, but let's try anyway...

@dimpase
Copy link
Member

dimpase commented Apr 24, 2024

something like

--- a/src/sage/libs/gap/gap_includes.pxd
+++ b/src/sage/libs/gap/gap_includes.pxd
@@ -29,7 +29,7 @@ cdef extern from "gap/calls.h" nogil:
 
 cdef extern from "gap/libgap-api.h" nogil:
     """
-    #define sig_GAP_Enter()  {int t = GAP_Enter(); if (!t) sig_error();}
+    #define sig_GAP_Enter()  {volatile int t = GAP_Enter(); if (!t) sig_error();}
     """
     ctypedef void (*GAP_CallbackFunc)()
     void GAP_Initialize(int argc, char ** argv,

@antonio-rojas
Copy link
Contributor

This is a regression in the GCC stable branch. I downgraded to the previous Arch snapshot (commit 860b0f0ef787f756c0e293671b4c4622dff63a79) and it works fine. I will try to create a minimal test case and report upstream, so far my naive attempts are not working.

@antonio-rojas
Copy link
Contributor

Upstream is understandably having trouble diagnosing the issue, with such a complex setup needed to reproduce it, and I am not able to help much further. If someone more knowledgeable could chime in and help make this more digestable for upstream, that would be helpful.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872

@antonio-rojas
Copy link
Contributor

Let's move the discussion to #37026 since this is not directly caused by Python 3.12 (it only makes the issue manifest itself as a crash, but likely still exists on 3.11)

@dimpase
Copy link
Member

dimpase commented May 3, 2024

Calling any libgap function in three arguments triggers the crash.

sage: libgap.AbelianGroup(0,0,0)
-> crash

I am able to reproduce this on Gentoo (with Python 3.12.3, and gcc version 13.2.1 20240210 (Gentoo 13.2.1_p20240210 p14)).

In view of https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872#c12 it's of course a tough task to produce a "small" example (it could be that gcc code generation only breaks down on huge files like element.c)

@dimpase
Copy link
Member

dimpase commented May 14, 2024

This is a regression in the GCC stable branch. I downgraded to the previous Arch snapshot (commit 860b0f0ef787f756c0e293671b4c4622dff63a79) and it works fine. I will try to create a minimal test case and report upstream, so far my naive attempts are not working.

Upstream has concluded it's not a gcc bug. It's actually their optimiser getting more efficient, holding more local variables in registers. So on can either go the route of adding volatile all over the place (#37951), or, better, do not blindly create ref-counted GAP objects, which need postprocessing at sign_on/off recovery, and use as much as possible temporary GAP objects (currently not supported by out interface, but it's certainly doable).

vbraun pushed a commit to vbraun/sage that referenced this pull request May 15, 2024
sagemathgh-37951: declare the last arg to GAP_CallFunc3Args volatile
    
This appears to fix sagemath#37026 (segfaults in src/sage/libs/gap/element.pyx
with Python 3.12 and gcc 13.2.1)

See also
sagemath#36407 (comment)

The corresponding gcc 13.2.1's bug (or feature) is being dealt with
here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872
### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [x] The title is concise and informative.
- [x] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [ ] I have created tests covering the changes.
- [ ] I have updated the documentation and checked the documentation
preview.

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on. For example,
-->
<!-- - sagemath#12345: short description why this is a dependency -->
<!-- - sagemath#34567: ... -->
    
URL: sagemath#37951
Reported by: Dima Pasechnik
Reviewer(s): Dima Pasechnik, Gonzalo Tornaría
vbraun pushed a commit to vbraun/sage that referenced this pull request May 18, 2024
sagemathgh-37951: declare the last arg to GAP_CallFunc3Args volatile
    
This appears to fix sagemath#37026 (segfaults in src/sage/libs/gap/element.pyx
with Python 3.12 and gcc 13.2.1)

See also
sagemath#36407 (comment)

The corresponding gcc 13.2.1's bug (or feature) is being dealt with
here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872
### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [x] The title is concise and informative.
- [x] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [ ] I have created tests covering the changes.
- [ ] I have updated the documentation and checked the documentation
preview.

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on. For example,
-->
<!-- - sagemath#12345: short description why this is a dependency -->
<!-- - sagemath#34567: ... -->
    
URL: sagemath#37951
Reported by: Dima Pasechnik
Reviewer(s): Dima Pasechnik, Gonzalo Tornaría
vbraun pushed a commit to vbraun/sage that referenced this pull request May 18, 2024
sagemathgh-37951: declare the last arg to GAP_CallFunc3Args volatile
    
This appears to fix sagemath#37026 (segfaults in src/sage/libs/gap/element.pyx
with Python 3.12 and gcc 13.2.1)

See also
sagemath#36407 (comment)

The corresponding gcc 13.2.1's bug (or feature) is being dealt with
here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872
### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [x] The title is concise and informative.
- [x] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [ ] I have created tests covering the changes.
- [ ] I have updated the documentation and checked the documentation
preview.

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on. For example,
-->
<!-- - sagemath#12345: short description why this is a dependency -->
<!-- - sagemath#34567: ... -->
    
URL: sagemath#37951
Reported by: Dima Pasechnik
Reviewer(s): Dima Pasechnik, Gonzalo Tornaría
vbraun pushed a commit to vbraun/sage that referenced this pull request May 24, 2024
sagemathgh-37951: declare the last arg to GAP_CallFunc3Args volatile
    
This appears to fix sagemath#37026 (segfaults in src/sage/libs/gap/element.pyx
with Python 3.12 and gcc 13.2.1)

See also
sagemath#36407 (comment)

The corresponding gcc 13.2.1's bug (or feature) is being dealt with
here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872
### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [x] The title is concise and informative.
- [x] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [ ] I have created tests covering the changes.
- [ ] I have updated the documentation and checked the documentation
preview.

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on. For example,
-->
<!-- - sagemath#12345: short description why this is a dependency -->
<!-- - sagemath#34567: ... -->
    
URL: sagemath#37951
Reported by: Dima Pasechnik
Reviewer(s): Dima Pasechnik, Gonzalo Tornaría
vbraun pushed a commit to vbraun/sage that referenced this pull request Feb 18, 2025
sagemathgh-39423: Use import_module instead of find_spec
    
Otherwise the test would fail with meson editable install. See https://g
ithub.com/sagemath/sage/actions/runs/13003203795/job/36265539648 .

Looks like the function was last changed in sagemath#36407. There was no
discussion why the simple implementation is not used.

This is part of the fix for this test. The other part needed is
sagemath#39498


### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [x] The title is concise and informative.
- [x] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [ ] I have created tests covering the changes.  (can't really test,
but see sagemath#39369)
- [ ] I have updated the documentation and checked the documentation
preview.  (no documentation change)

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on. For example,
-->
<!-- - sagemath#12345: short description why this is a dependency -->
<!-- - sagemath#34567: ... -->
    
URL: sagemath#39423
Reported by: user202729
Reviewer(s): Tobias Diez
vbraun pushed a commit to vbraun/sage that referenced this pull request Feb 21, 2025
sagemathgh-39423: Use import_module instead of find_spec
    
Otherwise the test would fail with meson editable install. See https://g
ithub.com/sagemath/sage/actions/runs/13003203795/job/36265539648 .

Looks like the function was last changed in sagemath#36407. There was no
discussion why the simple implementation is not used.

This is part of the fix for this test. The other part needed is
sagemath#39498


### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [x] The title is concise and informative.
- [x] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [ ] I have created tests covering the changes.  (can't really test,
but see sagemath#39369)
- [ ] I have updated the documentation and checked the documentation
preview.  (no documentation change)

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on. For example,
-->
<!-- - sagemath#12345: short description why this is a dependency -->
<!-- - sagemath#34567: ... -->
    
URL: sagemath#39423
Reported by: user202729
Reviewer(s): Tobias Diez
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

disputed PR is waiting for community vote, see https://groups.google.com/g/sage-devel/c/IgBYUJl33SQ p: critical / 2

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants