Revised PR: Simplify `PyArray_DescrFromScalar` per maintainer feedback

Maintainer's suggestion (decoded)

The maintainer (Sebastian Berg) suggests replacing most of PyArray_DescrFromScalar with a two-step pattern:

Get the DType class via PyArray_DescrFromTypeObject (robust, handles subclasses)
Call NPY_DT_CALL_discover_descr_from_pyobject(DType, sc) unconditionally

And "delete everything else" — meaning the special cases for Void, unsized string/unicode, and the new-style dtype block we added.

He also explains why PyArray_DescrFromTypeObject is preferable to PyArray_DiscoverDTypeFromScalarType: the latter fails for subclasses of scalar types (e.g., someone subclassing np.float64), while DescrFromTypeObject handles these via MRO walking.

Analysis: what can we actually delete?

Void special case (lines 408-412) — CAN DELETE

void_discover_descr_from_pyobject (dtypemeta.c:542) handles void scalars correctly:

if (PyArray_IsScalar(obj, Void)) {
    Py_INCREF(void_obj->descr);
    return (PyArray_Descr *)void_obj->descr;
}

Same behavior as the current special case. Safe to remove.

Unsized string/unicode/other handling (lines 461-487) — CAN DELETE

string_discover_descr_from_pyobject (dtypemeta.c:510) checks PyBytes_Check and PyUnicode_Check and computes itemsize correctly. Since np.bytes_ subclasses bytes and np.str_ subclasses str, these isinstance checks pass. Same effective behavior. Safe to remove.

Our new-style dtype block (lines 434-455) — CAN DELETE

Subsumed by the unconditional discover_descr_from_pyobject call. This was our fix; the maintainer's approach generalizes it to all dtypes. Safe to remove.

Datetime/Timedelta special case (lines 414-432) — MUST KEEP

Circular recursion: discover_datetime_and_timedelta_from_pyobject (dtypemeta.c:572) calls back into PyArray_DescrFromScalar(obj):

// dtypemeta.c:577
PyArray_Descr *descr = PyArray_DescrFromScalar(obj);

If we removed the datetime/timedelta special case from PyArray_DescrFromScalar, the new general path would call discover_descr_from_pyobject, which calls PyArray_DescrFromScalar, which calls discover_descr_from_pyobject → infinite recursion, stack overflow.

The special case acts as a base case that breaks this recursion. It must stay.

Note: The maintainer may have been speaking aspirationally when they said "delete everything else." They acknowledged "I have to think about how that is best." The datetime recursion is a concrete blocker that prevents full deletion.

Impact on behavior

Scalar type	Before	After	Changed?
Builtin non-parametric (float64, int32, ...)	`DescrFromTypeObject` → singleton	`DescrFromTypeObject` → DType → `discover` → singleton	No (same result)
Void	Special case → `void_obj->descr`	`DescrFromTypeObject` → DType → `void_discover` → `void_obj->descr`	No (same result)
Datetime/Timedelta	Special case (kept)	Special case (kept)	No
String/Unicode (unsized)	`DescrFromTypeObject` + manual `elsize`	`DescrFromTypeObject` → DType → `string_discover` → sized descr	No (same result)
New-style non-parametric user dtype	Our block → `discover` → `default_descr`	`DescrFromTypeObject` → DType → `discover` → `default_descr`	No
New-style parametric user dtype	Our block → `discover` → correct descr	`DescrFromTypeObject` → DType → `discover` → correct descr	No (same fix)
Scalar subclass (e.g., subclass of float64)	`DescrFromTypeObject` → `_descr_from_subtype`	`DescrFromTypeObject` → DType → `discover` → singleton	No (same base dtype)

Performance: For builtin dtypes, the new path does one extra function call (discover_descr_from_pyobject → nonparametric_discover_descr_from_pyobject which just increfs and returns the singleton). Negligible. The maintainer explicitly said "I wouldn't worry about it."

Proposed new code

/*NUMPY_API
 * Return descr object from array scalar.
 *
 * New reference
 */
NPY_NO_EXPORT PyArray_Descr *
PyArray_DescrFromScalar(PyObject *sc)
{
    /*
     * Datetime and timedelta must remain here because
     * discover_datetime_and_timedelta_from_pyobject calls back into
     * PyArray_DescrFromScalar — removing this would create infinite recursion.
     */
    if (PyArray_IsScalar(sc, Datetime) || PyArray_IsScalar(sc, Timedelta)) {
        PyArray_DatetimeMetaData *dt_data;
        PyArray_Descr *descr;

        if (PyArray_IsScalar(sc, Datetime)) {
            descr = PyArray_DescrNewFromType(NPY_DATETIME);
        }
        else {
            /* Timedelta */
            descr = PyArray_DescrNewFromType(NPY_TIMEDELTA);
        }
        if (descr == NULL) {
            return NULL;
        }
        dt_data = &(((PyArray_DatetimeDTypeMetaData *)((_PyArray_LegacyDescr *)descr)->c_metadata)->meta);
        memcpy(dt_data, &((PyDatetimeScalarObject *)sc)->obmeta,
               sizeof(PyArray_DatetimeMetaData));

        return descr;
    }

    /*
     * Use PyArray_DescrFromTypeObject to get the DType class — this is more
     * robust than PyArray_DiscoverDTypeFromScalarType because it handles
     * subclasses via MRO walking.  Then call discover_descr_from_pyobject
     * unconditionally, which extracts the correct descriptor from the scalar
     * instance (important for parametric dtypes like Void, String, and
     * new-style user dtypes).
     */
    PyArray_Descr *descr = PyArray_DescrFromTypeObject((PyObject *)Py_TYPE(sc));
    if (descr == NULL) {
        return NULL;
    }
    PyArray_DTypeMeta *DType = NPY_DTYPE(descr);
    Py_INCREF(DType);
    Py_DECREF(descr);

    PyArray_Descr *result = NPY_DT_CALL_discover_descr_from_pyobject(DType, sc);
    Py_DECREF(DType);
    return result;
}

What this removes

Void special case (6 lines) — void_discover_descr_from_pyobject handles it
Our new-style dtype block (14 lines) — subsumed by the unconditional pattern
Unsized string/unicode/other handling (27 lines) — string_discover_descr_from_pyobject handles it

Total: ~47 lines removed, ~12 lines added. Net simplification of ~35 lines.

What this keeps

Datetime/Timedelta special case — required to prevent infinite recursion

Risks

Low risk: Some edge case in void subtype handling where void_discover_descr_from_pyobject behaves slightly differently from the old hand-rolled code. Mitigated by: existing tests cover void scalars, and void_discover_descr_from_pyobject is already the canonical path used by array coercion.
Low risk: Minor performance regression for builtin scalars (one extra function call through discover_descr_from_pyobject). Maintainer explicitly said not to worry.

Tests

The existing tests in test_custom_dtypes.py (our TestDescrFromScalarNewStyleDType class) should still pass. We should also verify:

np.void scalar .astype() still works
np.bytes_ and np.str_ scalar descriptor extraction still gives correct sizes
np.datetime64 and np.timedelta64 scalar descriptor extraction is unchanged
All existing scalar tests pass (test_scalar_methods, test_multiarray, etc.)

MaartenBaert/numpy_fix_generic_astype_feedback.md

Select an option

No results found

Select an option

No results found

Revised PR: Simplify `PyArray_DescrFromScalar` per maintainer feedback

Maintainer's suggestion (decoded)

Analysis: what can we actually delete?

Void special case (lines 408-412) — CAN DELETE

Unsized string/unicode/other handling (lines 461-487) — CAN DELETE

Our new-style dtype block (lines 434-455) — CAN DELETE

Datetime/Timedelta special case (lines 414-432) — MUST KEEP

Impact on behavior

Proposed new code

What this removes

What this keeps

Risks

Tests

MaartenBaert/numpy_fix_generic_astype_feedback.md

Revised PR: Simplify PyArray_DescrFromScalar per maintainer feedback

Maintainer's suggestion (decoded)

Analysis: what can we actually delete?

Void special case (lines 408-412) — CAN DELETE

Unsized string/unicode/other handling (lines 461-487) — CAN DELETE

Our new-style dtype block (lines 434-455) — CAN DELETE

Datetime/Timedelta special case (lines 414-432) — MUST KEEP

Impact on behavior

Proposed new code

What this removes

What this keeps

Risks

Tests

Revised PR: Simplify `PyArray_DescrFromScalar` per maintainer feedback