@ManifoldFR
Last active August 27, 2025 15:41
Milestone 5.0 MR data
{
"supported": {
"other_improved": [
"[!1938](https://gitlab.com/libeigen/eigen/-/merge_requests/1938): Corrected documentation typos in Eigen's Core module by removing duplicated 'for' keywords in MathFunctionsImpl.h and QuickReference.dox files.",
"[!1937](https://gitlab.com/libeigen/eigen/-/merge_requests/1937): Improved compiler warnings and vectorized cast handling in Eigen's core packet math implementation by modifying CoreEvaluators.h and GenericPacketMath.h to suppress array bounds warnings and fix an edge case in segment loading.",
"[!1936](https://gitlab.com/libeigen/eigen/-/merge_requests/1936): Improved GenericPacketMath.h by renaming variables to address -Wshadow compiler warnings, reducing potential naming conflicts in the Eigen core library.",
"[!1934](https://gitlab.com/libeigen/eigen/-/merge_requests/1934): Improved SuperLU support in Eigen by introducing `GlobalLU_t` pointer and addressing API incompatibility issues for ILU in SuperLUv7.0.1, enhancing interface consistency.",
"[!1932](https://gitlab.com/libeigen/eigen/-/merge_requests/1932): Updates CMakeLists.txt by setting CMake policy CMP0177 to NEW, improving build configuration management for the Eigen library.",
"[!1930](https://gitlab.com/libeigen/eigen/-/merge_requests/1930): Improved sparse matrix-vector dot product computation in SparseDot.h by utilizing numext::fma, which reduces computational error and provides a small performance boost for specific workloads.",
"[!1928](https://gitlab.com/libeigen/eigen/-/merge_requests/1928): Improved CI/CD infrastructure by migrating default Linux builds and tests to GitLab runners, reducing dependency on a single machine and enhancing testing efficiency.",
"[!1927](https://gitlab.com/libeigen/eigen/-/merge_requests/1927): Improved CI configuration for PPC architecture by replacing g++-10 with g++-14 in Linux build and test pipeline files, resolving compiler compatibility issues.",
"[!1924](https://gitlab.com/libeigen/eigen/-/merge_requests/1924): Improved ARM SIMD vector handling in Eigen by modifying alignment requirements and load/store operations for ARM architecture, specifically targeting NEON-related header files.",
"[!1923](https://gitlab.com/libeigen/eigen/-/merge_requests/1923): Improved GPU-related code organization by moving HIP/CUDA defines from the unsupported Tensor module to the core Eigen utility directory, enhancing maintainability and code structure.",
"[!1917](https://gitlab.com/libeigen/eigen/-/merge_requests/1917): Improved CI testing infrastructure by transitioning ARM and PPC tests to use QEMU, enhancing test environment reliability and portability across different architectures.",
"[!1910](https://gitlab.com/libeigen/eigen/-/merge_requests/1910): Improves half-precision floating point comparisons in Eigen's core math libraries by implementing faster scalar and intrinsic comparison methods with better handling of edge cases like sign-magnitude conversions.",
"[!1909](https://gitlab.com/libeigen/eigen/-/merge_requests/1909): Improves OpenBLAS matrix multiplication support by adding `sbgemm` function for bfloat16 matrix operations, with a new macro to enable flexible OpenBLAS optimization in Eigen's core product generation code.",
"[!1902](https://gitlab.com/libeigen/eigen/-/merge_requests/1902): Improves the performance of `maxCoeff` and related coefficient finding functions in Eigen's Core library by implementing a vectorized approach that delays index determination and optimizes linear access evaluators.",
"[!1900](https://gitlab.com/libeigen/eigen/-/merge_requests/1900): Improves the `operator[]` return type for `Map<const Vector>` in Eigen's `DenseCoeffsBase.h` to ensure type correctness by returning a `const Scalar&` instead of a `Scalar`.",
"[!1899](https://gitlab.com/libeigen/eigen/-/merge_requests/1899): Improved packet reduction operations in Eigen's AVX, AVX512, and SSE architectures by enabling default behavior for `PropagateFast` and removing redundant specializations for integral types.",
"[!1898](https://gitlab.com/libeigen/eigen/-/merge_requests/1898): Improved Intel intrinsics reduction operations by reorganizing AVX and SSE packet reduction code into separate files and adding missing predux operations with NaN propagation support.",
"[!1894](https://gitlab.com/libeigen/eigen/-/merge_requests/1894): Improves scalar construction in Eigen's SelfAdjointEigenSolver by adding explicit support for non-implicitly convertible types like ceres::Jet, enhancing type flexibility in eigenvalue computations.",
"[!1888](https://gitlab.com/libeigen/eigen/-/merge_requests/1888): Improved solver base classes by implementing consistent `info()` methods across multiple linear algebra solver types to prevent potential infinite recursion and enhance code reliability.",
"[!1886](https://gitlab.com/libeigen/eigen/-/merge_requests/1886): Improves BDCSVD and JacobiSVD computational efficiency by adding an overload to `compute()` that avoids unnecessary matrix copies, reducing memory allocation during SVD computations.",
"[!1885](https://gitlab.com/libeigen/eigen/-/merge_requests/1885): Improves CMake configuration by conditionally creating the uninstall target only when Eigen is the top-level project, preventing potential conflicts during library integration.",
"[!1881](https://gitlab.com/libeigen/eigen/-/merge_requests/1881): Improved the `slerp()` quaternion interpolation function in Eigen's Geometry module by replacing the implementation with a more performant approach that avoids expensive trigonometric function calls.",
"[!1880](https://gitlab.com/libeigen/eigen/-/merge_requests/1880): Improved the `cbrt` function in GenericPacketMathFunctions.h by implementing a more conservative method for detecting non-finite inputs, reducing the risk of unintended compiler substitutions.",
"[!1879](https://gitlab.com/libeigen/eigen/-/merge_requests/1879): Improved Eigen's core math functions by adding vectorized implementations of `cbrt` for float and double across multiple architectures (SSE/AVX/AVX512/NEON/AltiVec), achieving better performance and accuracy.",
"[!1878](https://gitlab.com/libeigen/eigen/-/merge_requests/1878): Improved Eigen's partial redux operations by adding packet segment support, optimizing rowwise sum performance across various matrix sizes and packet configurations.",
"[!1877](https://gitlab.com/libeigen/eigen/-/merge_requests/1877): Improved packet segment implementation in XprHelper.h by adding a check for DiagonalWrapper to enhance compatibility and robustness of the core Eigen library.",
"[!1875](https://gitlab.com/libeigen/eigen/-/merge_requests/1875): Improves memory optimization for `std::complex` types in Eigen's Core module by extending memset optimization support to `std::complex<float>` and `std::complex<double>` for enhanced performance.",
"[!1868](https://gitlab.com/libeigen/eigen/-/merge_requests/1868): Improved CTest configuration in Eigen's CMake testing infrastructure by modifying default thread settings to use `j0` for better performance and stability.",
"[!1867](https://gitlab.com/libeigen/eigen/-/merge_requests/1867): Improved AVX512 packet math by adding the missing `pmadd` function for `Packet16bf` in the PacketMath header, which resolved flaky packetmath test stability.",
"[!1866](https://gitlab.com/libeigen/eigen/-/merge_requests/1866): Improves packet output functionality in Eigen's GenericPacketMath by adding a more reliable `postream` function and removing the `packet_ostream.h` test file.",
"[!1864](https://gitlab.com/libeigen/eigen/-/merge_requests/1864): Improves CMake testing configuration by modifying `EigenConfigureTesting.cmake` to run ctests in parallel across all available cores by default, enhancing testing efficiency.",
"[!1863](https://gitlab.com/libeigen/eigen/-/merge_requests/1863): Improved floating-point operations for Half and BFloat16 types by implementing fallback FMA (fused multiply-add) using float when native FMA is not available, preventing potential overflow issues.",
"[!1861](https://gitlab.com/libeigen/eigen/-/merge_requests/1861): Improved GitLab CI configuration to enable full test suite triggering via the `all-tests` label for merge requests, enhancing test execution control across Linux and Windows build environments.",
"[!1858](https://gitlab.com/libeigen/eigen/-/merge_requests/1858): Improved Eigen::half packet math functions by adding FMA support and adjusting test parameters to reduce numerical instability in AVX512 scenarios.",
"[!1857](https://gitlab.com/libeigen/eigen/-/merge_requests/1857): Improved packet math operations by adding support for `numext::fma` and implementing missing `pmadd` functions for float16 and bfloat16 types across multiple architecture-specific headers.",
"[!1855](https://gitlab.com/libeigen/eigen/-/merge_requests/1855): Enhances the Eigen ForkJoin scheduler by generalizing the ThreadPool interface in ForkJoin.h, enabling more flexible threading behavior and improved portability across different platforms and configurations.",
"[!1853](https://gitlab.com/libeigen/eigen/-/merge_requests/1853): Optimized matrix operations in multiple Eigen library components by adding more `.noalias()` directives to reduce unnecessary memory accesses and improve computational efficiency.",
"[!1850](https://gitlab.com/libeigen/eigen/-/merge_requests/1850): Improved x86 complex vectorization performance by fixing FMA operations in AVX and SSE complex math implementations, targeting core vectorization functionality in Eigen's architecture-specific headers.",
"[!1846](https://gitlab.com/libeigen/eigen/-/merge_requests/1846): Refactored AssignmentFunctors.h to unify assignment functors with existing scalar operations, reducing code redundancy and improving consistency between compound and simple assignment operations.",
"[!1843](https://gitlab.com/libeigen/eigen/-/merge_requests/1843): Improved STL feature detection in Eigen's core utility headers to enhance compatibility with C++20 and resolve compilation issues in libraries like JAX/TensorFlow.",
"[!1841](https://gitlab.com/libeigen/eigen/-/merge_requests/1841): Fixed CI configuration files for documentation job in nightlies, correcting an accidental overwrite of nightly rules to ensure proper documentation workflow.",
"[!1839](https://gitlab.com/libeigen/eigen/-/merge_requests/1839): Improved the `ConstexprTest` struct in `test/constexpr.cpp` by adding a deduction guide to suppress a compiler warning related to class template argument deduction.",
"[!1838](https://gitlab.com/libeigen/eigen/-/merge_requests/1838): Improved Eigen's ThreadPool functionality by enforcing binary functor requirements in ParallelFor and ParallelForAsync methods, and ensuring test coverage through CMake configuration updates.",
"[!1837](https://gitlab.com/libeigen/eigen/-/merge_requests/1837): Improves documentation deployment infrastructure by modifying CI configuration files to build docs on push and prevent automatic expiration, enhancing documentation preservation and reliability.",
"[!1832](https://gitlab.com/libeigen/eigen/-/merge_requests/1832): Improved CMakeLists.txt by disabling the `-fno-check-new` flag for Clang, reducing build system warning noise.",
"[!1829](https://gitlab.com/libeigen/eigen/-/merge_requests/1829): Refactored the AssignEvaluator header in Eigen's Core module to improve code clarity by removing legacy syntax and outdated patterns without changing functionality.",
"[!1826](https://gitlab.com/libeigen/eigen/-/merge_requests/1826): Improved documentation configuration by modifying Doxyfile.in to add proper MathJax and LaTeX package settings for better documentation rendering.",
"[!1824](https://gitlab.com/libeigen/eigen/-/merge_requests/1824): Improves the FullPivLU decomposition routine in Eigen by adding logic to return a condition number of zero when a matrix is not invertible, enhancing the robustness of the LU decomposition handling.",
"[!1823](https://gitlab.com/libeigen/eigen/-/merge_requests/1823): Improved documentation build configuration by adding graphviz to resolve graph visualization issues in the Eigen library's documentation.",
"[!1821](https://gitlab.com/libeigen/eigen/-/merge_requests/1821): Improved the BiCGSTAB iterative linear solver by modifying restart conditions and initialization to enhance numerical stability and address convergence issues in edge cases.",
"[!1820](https://gitlab.com/libeigen/eigen/-/merge_requests/1820): Improved Eigen's vectorization and traversal logic in Core modules by modifying Meta.h, AssignEvaluator.h, and vectorization_logic.cpp to better handle compile-time size considerations and reduce unnecessary compiler warnings.",
"[!1818](https://gitlab.com/libeigen/eigen/-/merge_requests/1818): Improved Eigen's documentation infrastructure by enabling nightly documentation generation, configuring Doxygen to fail on warnings, and removing external page dependencies.",
"[!1817](https://gitlab.com/libeigen/eigen/-/merge_requests/1817): Improved CI testing configuration by adding `EIGEN_CI_CTEST_ARGS` to enable custom test timeouts across multiple CI script and configuration files.",
"[!1815](https://gitlab.com/libeigen/eigen/-/merge_requests/1815): Improves configuration detection in ConfigureVectorization.h by adding a check for std::hardware_destructive_interference_size, enhancing compatibility with newer GCC versions.",
"[!1814](https://gitlab.com/libeigen/eigen/-/merge_requests/1814): Improved PPC architecture support in Eigen's Complex class by adding missing return statements, enhancing code reliability and consistency for the AltiVec architecture.",
"[!1813](https://gitlab.com/libeigen/eigen/-/merge_requests/1813): Improves memory alignment support in Eigen's core utilities by increasing maximum alignment to 256 bytes, enhancing compatibility with modern ARM architectures and cache line requirements.",
"[!1812](https://gitlab.com/libeigen/eigen/-/merge_requests/1812): Improved CI infrastructure by adding a script to build and deploy Doxygen documentation, integrating documentation generation into the GitLab CI pipeline for automatic nightly documentation updates.",
"[!1811](https://gitlab.com/libeigen/eigen/-/merge_requests/1811): Improved CI/CMake configuration to enable emulated testing for LoongArch64 architecture by explicitly configuring QEMU and adjusting test execution settings in Eigen's build infrastructure.",
"[!1807](https://gitlab.com/libeigen/eigen/-/merge_requests/1807): Improved Eigen's documentation infrastructure by fixing Doxygen warnings, updating configuration files, and removing outdated documentation elements across multiple library modules.",
"[!1804](https://gitlab.com/libeigen/eigen/-/merge_requests/1804): Improved the `NonBlockingThreadPool` class by making the `spin_count_` member variable `const` and initializing it in the constructor to eliminate potential data races in multi-threaded scenarios.",
"[!1802](https://gitlab.com/libeigen/eigen/-/merge_requests/1802): Improved the NonBlockingThreadPool header by fixing initialization order and removing unused variables to enhance code clarity and reduce potential initialization-related bugs.",
"[!1801](https://gitlab.com/libeigen/eigen/-/merge_requests/1801): Improved the Simplicial Cholesky factorization routine in Eigen's sparse matrix module by implementing advanced pattern analysis algorithms, reducing runtime for large benchmark problems from 7.5 minutes to less than 0.5 seconds.",
"[!1800](https://gitlab.com/libeigen/eigen/-/merge_requests/1800): Improved documentation in the ForkJoin.h file by fixing typos and enhancing comment clarity, with no changes to the underlying code implementation.",
"[!1797](https://gitlab.com/libeigen/eigen/-/merge_requests/1797): Attempts to fix CI configuration for LoongArch architecture by modifying the Linux GitLab CI test configuration file.",
"[!1796](https://gitlab.com/libeigen/eigen/-/merge_requests/1796): Improved Eigen block documentation to clarify that block objects can have non-square dimensions, enhancing clarity in the Tutorial_BlockOperations example.",
"[!1794](https://gitlab.com/libeigen/eigen/-/merge_requests/1794): Improved documentation for the cross product operation in Eigen's geometry module, clarifying the behavior for complex numbers.",
"[!1788](https://gitlab.com/libeigen/eigen/-/merge_requests/1788): Improved CI configuration by removing the Ubuntu ToolChain repository from the Linux CI script, reducing unnecessary dependencies in the build environment.",
"[!1787](https://gitlab.com/libeigen/eigen/-/merge_requests/1787): Improved CUDA device support in Eigen's DiagonalMatrix and PlainObjectBase by adding device function qualifiers and replacing std::copy with manual iteration to resolve device function call issues.",
"[!1786](https://gitlab.com/libeigen/eigen/-/merge_requests/1786): Improves Eigen's parallelization behavior in Parallelizer.h by using `omp_get_max_threads` when `setNbThreads` is not explicitly set, ensuring consistent multi-threading performance.",
"[!1779](https://gitlab.com/libeigen/eigen/-/merge_requests/1779): Improves Eigen's core assignment and construction logic by optimizing fill_n and memset operations for matrix expressions, enhancing performance for constant and zero-based assignments.",
"[!1778](https://gitlab.com/libeigen/eigen/-/merge_requests/1778): Improves documentation installation by adding a new `install-doc` target in CMake configuration, enabling more straightforward placement of documentation files in the standard CMake documentation directory.",
"[!1776](https://gitlab.com/libeigen/eigen/-/merge_requests/1776): Improved deployment configuration by switching to Alpine image in the CI pipeline, reducing dependencies and enhancing deployment speed.",
"[!1775](https://gitlab.com/libeigen/eigen/-/merge_requests/1775): Simplified the nightly tag job configuration in the GitLab CI/CD pipeline by removing branch name references from the deployment process, improving clarity and maintainability.",
"[!1774](https://gitlab.com/libeigen/eigen/-/merge_requests/1774): Improves matrix equality comparison in Eigen's core module by adding support for comparing matrices of different sizes, addressing issue #1061 and enhancing matrix comparison flexibility.",
"[!1773](https://gitlab.com/libeigen/eigen/-/merge_requests/1773): Improves GitLab CI deployment configuration by modifying `deploy.gitlab-ci.yml` to use specific commit tags instead of branch references, enhancing build consistency and deployment precision.",
"[!1772](https://gitlab.com/libeigen/eigen/-/merge_requests/1772): Improved CI/CD deployment configuration by modifying the GitLab CI deployment file to ensure correct branch management and cloning strategy.",
"[!1771](https://gitlab.com/libeigen/eigen/-/merge_requests/1771): Updates the deploy job configuration in the GitLab CI pipeline, modifying the deployment process for the Eigen library.",
"[!1770](https://gitlab.com/libeigen/eigen/-/merge_requests/1770): Experiments with Alpine Linux in the CI configuration for formatting checks, modifying the checkformat GitLab CI configuration to explore alternative formatting options.",
"[!1768](https://gitlab.com/libeigen/eigen/-/merge_requests/1768): Updates the Linux GitLab CI configuration for ROCm build pipeline, likely adjusting Docker-related settings for the continuous integration environment.",
"[!1767](https://gitlab.com/libeigen/eigen/-/merge_requests/1767): Improved CI configuration by switching the Ubuntu Docker image from 20.04 to 22.04 in the Linux build pipeline to resolve image corruption issues and stabilize build processes.",
"[!1766](https://gitlab.com/libeigen/eigen/-/merge_requests/1766): Updates the ROCm Docker image configuration in the GitLab CI pipeline to improve the Linux build environment for Eigen library development.",
"[!1765](https://gitlab.com/libeigen/eigen/-/merge_requests/1765): Improved the GitLab CI configuration by adding a deploy stage that tags the latest nightly pipeline when it passes successfully, enhancing the continuous integration workflow.",
"[!1763](https://gitlab.com/libeigen/eigen/-/merge_requests/1763): Improved documentation for move constructors and move assignment operators in Eigen's core Array, Matrix, and PlainObjectBase classes by updating their documentation strings.",
"[!1761](https://gitlab.com/libeigen/eigen/-/merge_requests/1761): Improved map fill logic in Eigen's Core module by modifying stride handling in Fill.h, enhancing memory access for map operations with non-linear strides.",
"[!1759](https://gitlab.com/libeigen/eigen/-/merge_requests/1759): Refactored the pow() function's special case handling for float and int types in Eigen's default packet math functions, improving robustness and simplifying code structure by reverting to repeated squaring.",
"[!1756](https://gitlab.com/libeigen/eigen/-/merge_requests/1756): Improved Eigen's pow() function performance by optimizing log2() operator and integer exponent handling, achieving a 25% speedup for float calculations while maintaining high accuracy.",
"[!1755](https://gitlab.com/libeigen/eigen/-/merge_requests/1755): Optimized Eigen's core assignment operations by implementing `fill_n` and `memset` techniques for `setConstant` and `setZero`, resulting in performance gains up to 57% for certain matrix sizes.",
"[!1754](https://gitlab.com/libeigen/eigen/-/merge_requests/1754): Improved Eigen's pow() performance by simplifying and optimizing power operations in the default packet math functions, reducing computational overhead and enhancing speed by 5-6% in AVX2+FMA mode.",
"[!1753](https://gitlab.com/libeigen/eigen/-/merge_requests/1753): Restored vectorized error function (erf) support for SSE and AVX architectures in Eigen's PacketMath headers, fixing a performance regression accidentally introduced in a previous merge request.",
"[!1752](https://gitlab.com/libeigen/eigen/-/merge_requests/1752): Improved the exp() function in Eigen's GenericPacketMathFunctions.h to prevent premature overflow and achieve a 3-4% performance speedup across double and float data types.",
"[!1750](https://gitlab.com/libeigen/eigen/-/merge_requests/1750): Improves exponential function performance in Eigen's packet math implementations for SSE and AVX architectures, achieving a 30-35% speedup by optimizing `pexp` calculations and reducing unnecessary subnormal result processing.",
"[!1749](https://gitlab.com/libeigen/eigen/-/merge_requests/1749): Improved MSVC performance optimization in AssignEvaluator.h by disabling fill_n optimization to address potential issues with std::_Is_all_bits_zero function.",
"[!1748](https://gitlab.com/libeigen/eigen/-/merge_requests/1748): Simplified the NullaryFunctors.h component by removing an unnecessary `HasBlend` trait check, potentially reducing code complexity and improving performance.",
"[!1746](https://gitlab.com/libeigen/eigen/-/merge_requests/1746): Simplified exception handling macros across multiple Eigen library core components by replacing custom `EIGEN_NOEXCEPT` and related macros with standard C++ `noexcept` keywords, improving code consistency and maintainability.",
"[!1745](https://gitlab.com/libeigen/eigen/-/merge_requests/1745): Improved the EigenBase header to resolve C++20 constexpr test compilation failures, enhancing the library's compatibility with modern C++ standards.",
"[!1744](https://gitlab.com/libeigen/eigen/-/merge_requests/1744): Improved Eigen library's macro usage by systematically replacing `EIGEN_CONSTEXPR` with standard `constexpr` across multiple core headers, enhancing code consistency and compiler compatibility.",
"[!1743](https://gitlab.com/libeigen/eigen/-/merge_requests/1743): Improved vectorized error function (erf) computation across multiple hardware architectures by optimizing PacketMath.h implementations for SSE, AVX, AVX2, AVX512, and AltiVec, resulting in significant performance speedups for double-precision calculations.",
"[!1742](https://gitlab.com/libeigen/eigen/-/merge_requests/1742): Improved the Assign_MKL.h header by casting enum types to int, resolving potential compilation issues with enum comparisons in C++26 and later standards.",
"[!1741](https://gitlab.com/libeigen/eigen/-/merge_requests/1741): Improved Eigen's MatrixBase destructor to resolve symbol resolution issues with lldb, ensuring better compatibility when evaluating expressions involving Eigen matrices.",
"[!1739](https://gitlab.com/libeigen/eigen/-/merge_requests/1739): Improved the Memory.h utility in Eigen's core module by replacing C99 size_t macros with more portable numeric limits, enhancing code compatibility and maintainability.",
"[!1737](https://gitlab.com/libeigen/eigen/-/merge_requests/1737): Improves fixed-size matrix handling in Eigen's core memory management by modifying Memory.h and DenseStorage.h to conform to std::is_standard_layout, enhancing compatibility and scalar management for fixed-size matrix types.",
"[!1736](https://gitlab.com/libeigen/eigen/-/merge_requests/1736): Improved device function decorations in Eigen's Core module by adding missing `EIGEN_DEVICE_FUNCTION` annotations to ensure proper device-side execution and compatibility.",
"[!1735](https://gitlab.com/libeigen/eigen/-/merge_requests/1735): Improved Eigen's core element accessors by making `operator()` and `operator[]` constexpr-compatible, enhancing performance and usability in template code across multiple core header files.",
"[!1734](https://gitlab.com/libeigen/eigen/-/merge_requests/1734): Improved AVX and AVX512 packet math operations in Eigen's core library by enhancing vectorized instruction support and performance for linear algebra computations.",
"[!1733](https://gitlab.com/libeigen/eigen/-/merge_requests/1733): Improved AVX vector operations in Eigen's Core library by adding missing `predux_any` function implementations, enhancing performance for vectorized computations.",
"[!1731](https://gitlab.com/libeigen/eigen/-/merge_requests/1731): Improves StlIterators header by replacing `__cplusplus` with `EIGEN_CPLUSPLUS` macro, simplifying compiler version compatibility checks in the Eigen core module.",
"[!1727](https://gitlab.com/libeigen/eigen/-/merge_requests/1727): Enhances fixed-size Eigen objects by making them trivially move assignable, modifying core matrix and array header files to improve move semantics performance and memory efficiency.",
"[!1722](https://gitlab.com/libeigen/eigen/-/merge_requests/1722): Improved matrix passing in the reshape test file to address GCC arm test failures, focusing on data alignment issues in matrix operations.",
"[!1720](https://gitlab.com/libeigen/eigen/-/merge_requests/1720): Improved CUDA compatibility by modifying core utility headers in Eigen, addressing build warnings and assignment operator issues for CUDA 10+ environments.",
"[!1719](https://gitlab.com/libeigen/eigen/-/merge_requests/1719): Improved test coverage for `sizeof()` function by adding test cases that specifically examine dynamic dimension scenarios in the Eigen library's test suite.",
"[!1712](https://gitlab.com/libeigen/eigen/-/merge_requests/1712): Improved the `reverseInPlace` method in Eigen's Core module by adding compile-time information to suppress ARM array out of bounds compiler warnings for fixed-size matrices.",
"[!1709](https://gitlab.com/libeigen/eigen/-/merge_requests/1709): Improved polynomial evaluation in Eigen's core and special functions modules by converting manual polynomial calculations to the more efficient `ppolevl` helper function, enhancing performance and code clarity.",
"[!1701](https://gitlab.com/libeigen/eigen/-/merge_requests/1701): Improved CUDA compatibility in Eigen's Core module by adding missing `EIGEN_DEVICE_FUNC` annotations to header files, resolving build issues for CUDA platforms.",
"[!1700](https://gitlab.com/libeigen/eigen/-/merge_requests/1700): Improved testing infrastructure by adding extra debugging information to float_pow_test_impl and cleaning up array_cwise test code, enhancing diagnostic capabilities for Eigen's test suite.",
"[!1697](https://gitlab.com/libeigen/eigen/-/merge_requests/1697): Improved SSE implementation in PacketMath.h by removing an unnecessary call to _mm_setzero_si128, potentially reducing computational overhead in SSE-based operations.",
"[!1696](https://gitlab.com/libeigen/eigen/-/merge_requests/1696): Improves fixed-size matrices and arrays in Eigen by enabling `trivially_default_constructible` and removing unnecessary constructor variants, simplifying matrix and array implementations in release mode.",
"[!1694](https://gitlab.com/libeigen/eigen/-/merge_requests/1694): Improves fixed-size matrices and arrays in Eigen's core by making their copy and move constructors trivially constructible, enabling better compiler optimizations and compatibility with C++ standards.",
"[!1692](https://gitlab.com/libeigen/eigen/-/merge_requests/1692): Optimized the dot product implementation in InnerProduct.h to improve performance for small matrix sizes by refining bounds calculations and simplifying scalar loop handling.",
"[!1691](https://gitlab.com/libeigen/eigen/-/merge_requests/1691): Improved NonBlockingThreadPool.h by replacing plain asserts with eigen_plain_assert to enhance compatibility with projects using older compilers and maintain consistent assert macro usage.",
"[!1684](https://gitlab.com/libeigen/eigen/-/merge_requests/1684): Improved Eigen's atanh vectorized implementation across multiple architectures (SSE, AVX2, AVX512) to optimize performance and ensure standard compliance for inputs with |x| >= 1.",
"[!1683](https://gitlab.com/libeigen/eigen/-/merge_requests/1683): Improves complex number performance in Eigen's SSE and AVX architectures by implementing optimized fused-multiply-add (FMA) operations that reduce instruction count and enhance computational efficiency.",
"[!1681](https://gitlab.com/libeigen/eigen/-/merge_requests/1681): Improved NumTraits for complex numbers by implementing HasSign and fixing signedness inheritance in Eigen's core numeric traits, along with updating related packet math tests.",
"[!1679](https://gitlab.com/libeigen/eigen/-/merge_requests/1679): Improved BDCSVD and JacobiSVD implementations by suppressing potential memory-related warnings related to uninitialized memory in the SVD modules.",
"[!1677](https://gitlab.com/libeigen/eigen/-/merge_requests/1677): Improved the `patan()` function by consolidating float and double implementations, reducing code duplication and enhancing accuracy across different CPU architectures.",
"[!1676](https://gitlab.com/libeigen/eigen/-/merge_requests/1676): Improved documentation for the GeneralizedEigenSolver::eigenvectors() method by adding missing double quotes to ensure correct visibility in the Eigen documentation.",
"[!1675](https://gitlab.com/libeigen/eigen/-/merge_requests/1675): Improved Eigen's tanh<double> implementation by adding vectorized performance optimizations across multiple architectures (SSE, AVX, AVX512, NEON, AltiVec), demonstrating significant speedups up to 22x for AVX512.",
"[!1673](https://gitlab.com/libeigen/eigen/-/merge_requests/1673): Improves SVE (Scalable Vector Extension) intrinsics performance in Eigen's PacketMath and TypeCasting headers by using \"_x\" suffix to reduce compiler-generated overhead and optimize instruction efficiency.",
"[!1672](https://gitlab.com/libeigen/eigen/-/merge_requests/1672): Improved Eigen's complex number support by vectorizing the squaredNorm() function in UnaryFunctors.h and Dot.h, reducing computational overhead for complex type operations.",
"[!1671](https://gitlab.com/libeigen/eigen/-/merge_requests/1671): Improves dot product performance in Eigen's core library by adding a new inner product evaluator and implementing explicit unrolling for small vectors, with enhanced support for AVX2+FMA instructions.",
"[!1670](https://gitlab.com/libeigen/eigen/-/merge_requests/1670): Improved the tanh implementation in Eigen's default packet math functions, introducing a new rational approximation for float that reduces maximum error and boosts performance by 20-50% on SSE and AVX2+FMA architectures.",
"[!1668](https://gitlab.com/libeigen/eigen/-/merge_requests/1668): Improved Eigen/Core by adding the <thread> header to enable proper compilation and usage of std::this_thread::yield() in C++11 contexts.",
"[!1667](https://gitlab.com/libeigen/eigen/-/merge_requests/1667): Improved StableNorm performance in Eigen's Core module by optimizing computation for non-trivial matrix sizes and enhancing consistency between aligned and unaligned input handling.",
"[!1665](https://gitlab.com/libeigen/eigen/-/merge_requests/1665): Improved threaded product code in Parallelizer.h and product_threaded.cpp by cleaning up implementation and enhancing code clarity and maintainability.",
"[!1663](https://gitlab.com/libeigen/eigen/-/merge_requests/1663): Improved SSE/AVX complex multiplication kernels by utilizing `vfmaddsub213ps` instructions in Complex.h, reducing latency and optimizing performance for complex multiplication operations.",
"[!1662](https://gitlab.com/libeigen/eigen/-/merge_requests/1662): Improved complex matrix multiplication performance in GeneralBlockPanelKernel.h by adjusting block panel size, resulting in 8-33% speedup for complex * complex matrix operations across different backends.",
"[!1661](https://gitlab.com/libeigen/eigen/-/merge_requests/1661): Improves the `hlog` symbol lookup in the Half.h header by removing namespace restrictions, enhancing flexibility for symbol resolution in non-global namespaces.",
"[!1660](https://gitlab.com/libeigen/eigen/-/merge_requests/1660): Updated the documentation navigation tree JavaScript file (eigen_navtree_hacks.js) with minor modifications to improve documentation infrastructure.",
"[!1659](https://gitlab.com/libeigen/eigen/-/merge_requests/1659): Updated .clang-format configuration file to potentially adjust code formatting standards for the Eigen library.",
"[!1656](https://gitlab.com/libeigen/eigen/-/merge_requests/1656): Improved documentation and code quality across multiple Eigen library components by fixing typos in header files, build scripts, and documentation files.",
"[!1650](https://gitlab.com/libeigen/eigen/-/merge_requests/1650): Improved MSVC compatibility in Eigen's BFloat16 and Half headers by removing unnecessary C++23 deprecation suppression checks, resolving warning issues for floating-point type implementations.",
"[!1649](https://gitlab.com/libeigen/eigen/-/merge_requests/1649): Fixed compiler warnings in Eigen's SVD implementation by using placement new to construct small SVD objects in BDCSVD.h and JacobiSVD.h, reducing uninitialized variable warnings without changing core functionality.",
"[!1641](https://gitlab.com/libeigen/eigen/-/merge_requests/1641): Improved AVX512F type casting support in Eigen by adding efficient double to int64_t conversion instructions, enhancing performance of integer conversion operations.",
"[!1640](https://gitlab.com/libeigen/eigen/-/merge_requests/1640): Improved the CI README.md markdown formatting to enhance readability in the GitLab web interface.",
"[!1636](https://gitlab.com/libeigen/eigen/-/merge_requests/1636): Enhances the `pointer_based_stl_iterator` in Eigen's Core module to conform to the C++20 `contiguous_iterator` concept, enabling better compatibility with range operations on `std::span`.",
"[!1632](https://gitlab.com/libeigen/eigen/-/merge_requests/1632): Improved the `allFinite()` function in Eigen's core module by adding AVX vectorization, which enhances performance for large arrays by up to 2.7x.",
"[!1626](https://gitlab.com/libeigen/eigen/-/merge_requests/1626): Improved Eigen core data() functions by refactoring multiple header files to use constexpr, reducing runtime overhead and potentially enhancing compile-time performance across core modules.",
"[!1625](https://gitlab.com/libeigen/eigen/-/merge_requests/1625): Improves memory allocation in Eigen's core utility header by utilizing the built-in `__builtin_alloca_with_align` function when available, potentially enhancing allocation performance.",
"[!1624](https://gitlab.com/libeigen/eigen/-/merge_requests/1624): Improved memory utility in Eigen's Core module by addressing Clang tidy warnings about pointer casting in the `aligned_alloca` function within the Memory.h header.",
"[!1623](https://gitlab.com/libeigen/eigen/-/merge_requests/1623): Improved Eigen's static assert macro formatting across multiple core and tensor header files, introducing a new formatting script to enhance code consistency and readability.",
"[!1621](https://gitlab.com/libeigen/eigen/-/merge_requests/1621): Improves SparseMatrix::insert method by adding index validation checks to prevent out-of-bounds access, enhancing robustness and preventing potential runtime errors.",
"[!1619](https://gitlab.com/libeigen/eigen/-/merge_requests/1619): Suppresses C++23 deprecation warnings in BFloat16 and Half floating-point type headers by modifying compiler-specific type trait handling to reduce unnecessary warnings.",
"[!1618](https://gitlab.com/libeigen/eigen/-/merge_requests/1618): Improved documentation for the Matrix class by correcting a grammatical error in the class documentation, enhancing clarity without changing functionality.",
"[!1615](https://gitlab.com/libeigen/eigen/-/merge_requests/1615): Improves PowerPC-specific predux behavior in Packet4i by modifying the AltiVec PacketMath implementation to prevent element sum saturation, aligning with other architecture implementations.",
"[!1610](https://gitlab.com/libeigen/eigen/-/merge_requests/1610): Improved GPU nearest integer operations by modifying core Eigen packet math functions, enhancing support and performance for GPU-based integer rounding methods.",
"[!1609](https://gitlab.com/libeigen/eigen/-/merge_requests/1609): Improved unitary-ness test robustness in eigensolver_selfadjoint.cpp by adjusting error tolerance to better handle scaling effects, reducing test flakiness.",
"[!1605](https://gitlab.com/libeigen/eigen/-/merge_requests/1605): Improved Eigen's Core utility files by removing unnecessary semicolons in SymbolicIndex.h and RandomImpl.h to reduce potential build errors in downstream projects.",
"[!1600](https://gitlab.com/libeigen/eigen/-/merge_requests/1600): Improved Eigen's transpose product operations by optimizing memory allocations and reducing computational overhead in matrix transposition expressions.",
"[!1599](https://gitlab.com/libeigen/eigen/-/merge_requests/1599): Improved CI configuration by adding a \"cross-compiler\" tag to prevent the PPC runner from attempting cross-compilation, reducing build errors for non-PPC targets.",
"[!1595](https://gitlab.com/libeigen/eigen/-/merge_requests/1595): Improved CI scripts for Windows, adding AVX tests and new MSVC/CUDA build scripts while addressing cache and folder issues in the continuous integration environment.",
"[!1594](https://gitlab.com/libeigen/eigen/-/merge_requests/1594): Improved the `tridiagonalization_inplace_selector::run()` method by adding CUDA device function compatibility, ensuring correct execution in CUDA contexts.",
"[!1593](https://gitlab.com/libeigen/eigen/-/merge_requests/1593): Improved Eigen's ternary evaluator by specializing scalar boolean select operations to reduce dependency on output scalar type and enhance vectorized comparison support.",
"[!1592](https://gitlab.com/libeigen/eigen/-/merge_requests/1592): Improved vectorization support for PPC and ARM architectures by adding psincos implementation for double and fixing integer_packet scalar support on 32-bit ARM.",
"[!1590](https://gitlab.com/libeigen/eigen/-/merge_requests/1590): Improved AVX and SSE packet math performance by optimizing pblend operations with enhanced bitmask generation and better loop unrolling for GCC and Clang compilers.",
"[!1584](https://gitlab.com/libeigen/eigen/-/merge_requests/1584): Optimized bit masking operations in Eigen's SIMD packet math implementations by replacing floating-point comparisons with more efficient integer arithmetic and mask conversion techniques across AVX, AVX512, and SSE architectures.",
"[!1583](https://gitlab.com/libeigen/eigen/-/merge_requests/1583): Optimized the `pldexp_generic` function in Eigen's generic packet math functions, improving performance by up to 6% across SSE4.2, AVX2, and AVX512 instruction sets.",
"[!1582](https://gitlab.com/libeigen/eigen/-/merge_requests/1582): Refactored IndexedView template definitions to resolve MSVC 14.16 compiler warnings by reorganizing code in IndexedViewHelper.h and IndexedViewMethods.inc.",
"[!1581](https://gitlab.com/libeigen/eigen/-/merge_requests/1581): Improved accessors in DenseBase, Quaternions, and Translations by adding constexpr to enable compile-time computations, enhancing performance potential for these core Eigen components.",
"[!1580](https://gitlab.com/libeigen/eigen/-/merge_requests/1580): Enhances AVX512 packet operations by adding support for Packet8l in the Eigen Core architecture, improving performance and compatibility for linear algebra computations.",
"[!1578](https://gitlab.com/libeigen/eigen/-/merge_requests/1578): Updated Geometry_SIMD.h file with minor modifications, suggesting a small refinement to the SIMD geometry implementation without significant functional changes.",
"[!1574](https://gitlab.com/libeigen/eigen/-/merge_requests/1574): Improved AVX packet math implementation by adding safeguards to the `Packet4l` definition in the AVX PacketMath header, ensuring more robust handling of packet operations.",
"[!1572](https://gitlab.com/libeigen/eigen/-/merge_requests/1572): Improves AVX2 vectorization performance by fully vectorizing double to int64_t casting operations in the Eigen Core AVX architecture implementation, reducing code complexity and enhancing throughput by approximately 70%.",
"[!1569](https://gitlab.com/libeigen/eigen/-/merge_requests/1569): Improves performance of SparseMatrix and SparseVector move operations by optimizing constructors and enabling more efficient memory swapping and rvalue handling.",
"[!1564](https://gitlab.com/libeigen/eigen/-/merge_requests/1564): Improved cross product vectorization in Eigen's geometry module by modifying AVX type casting and orthogonal methods, resolving MSVC compilation issues and enhancing performance.",
"[!1562](https://gitlab.com/libeigen/eigen/-/merge_requests/1562): Improved the TriangularMatrixVector component in Eigen's Core module by adding protection against alloca usage on 32-bit ARM systems to prevent potential compatibility issues.",
"[!1557](https://gitlab.com/libeigen/eigen/-/merge_requests/1557): Improved documentation for the Jacobi module by modifying the tag placement for the `applyOnTheRight` method to ensure correct documentation rendering.",
"[!1556](https://gitlab.com/libeigen/eigen/-/merge_requests/1556): Improved CMake configuration for Eigen by reorganizing build settings, reducing configuration time, and enabling better target installation for non-top-level builds.",
"[!1555](https://gitlab.com/libeigen/eigen/-/merge_requests/1555): Improved Matrix functions by extending constexpr support in core Matrix and PlainObjectBase classes, enabling more compile-time optimizations for Matrix operations.",
"[!1549](https://gitlab.com/libeigen/eigen/-/merge_requests/1549): Improved CwiseUnaryView in Eigen's Core module by modifying const access functions to prevent unintended matrix mutations and reduce build failures.",
"[!1547](https://gitlab.com/libeigen/eigen/-/merge_requests/1547): Improved const handling in Eigen's unary views by preserving const-ness of input scalars and updating type resolution mechanisms to enhance C++20 compatibility.",
"[!1546](https://gitlab.com/libeigen/eigen/-/merge_requests/1546): Improved SSE and AVX2 vectorization support by adding optimized casting operations between double and int64_t data types, enhancing performance for tensor cast expressions across various data sizes.",
"[!1544](https://gitlab.com/libeigen/eigen/-/merge_requests/1544): Enhances SSE vectorization support for int64_t operations by modifying PacketMath.h and GenericPacketMath.h to improve performance of 64-bit integer math functions.",
"[!1543](https://gitlab.com/libeigen/eigen/-/merge_requests/1543): Improved the incomplete Cholesky decomposition by adding a method to handle diagonal element insertion in sparse matrices and enhancing parameter verification functionality.",
"[!1539](https://gitlab.com/libeigen/eigen/-/merge_requests/1539): Improved the TRMV (triangular matrix-vector product) operation by adding support for aligned assignment and ensuring static vector allocations are properly aligned, enhancing stability for fixed-sized vectors.",
"[!1535](https://gitlab.com/libeigen/eigen/-/merge_requests/1535): Improved Eigen library's enum-enum conversions by modifying core utility and matrix header files to address and eliminate deprecated compiler warnings.",
"[!1531](https://gitlab.com/libeigen/eigen/-/merge_requests/1531): Improves BLAS product routines by adding degenerate case checks in matrix and vector operation files, preventing potential crashes with zero-sized inputs.",
"[!1530](https://gitlab.com/libeigen/eigen/-/merge_requests/1530): Improved CMake configuration by eliminating a FindCUDA warning in the CMakeLists.txt file, reducing unnecessary build warnings without changing functionality.",
"[!1527](https://gitlab.com/libeigen/eigen/-/merge_requests/1527): Improved Eigen core files by resolving shadowed typedefs in multiple header files, specifically in IndexedViewHelper.h, ArithmeticSequence.h, ColPivHouseholderQR.h, and FullPivHouseholderQR.h to enhance code stability and type resolution.",
"[!1525](https://gitlab.com/libeigen/eigen/-/merge_requests/1525): Improved sparse x dense dot product performance in Eigen's SparseCore module by applying a small optimization and adding inline keywords to methods in SparseDot.h, reducing computation time for SparseQR operations.",
"[!1523](https://gitlab.com/libeigen/eigen/-/merge_requests/1523): Improved the SparseQR implementation in Eigen's linear algebra module by optimizing the algorithm, reducing execution time from 256s to 200s.",
"[!1520](https://gitlab.com/libeigen/eigen/-/merge_requests/1520): Improves Eigen's BLAS headers by removing `using namespace Eigen` from `common.h`, preventing potential symbol collisions and reducing namespace pollution across multiple BLAS and LAPACK implementation files.",
"[!1519](https://gitlab.com/libeigen/eigen/-/merge_requests/1519): Improves Eigen's array size calculation by converting `array_size` from enum to `constexpr` in utility header files, enhancing type safety and resolving comparison issues.",
"[!1516](https://gitlab.com/libeigen/eigen/-/merge_requests/1516): Improved GPU support for the `ptanh_float` function by modifying declarations in `GenericPacketMathFunctions.h` and `MathFunctions.h` to enable correct compilation on GPU architectures.",
"[!1511](https://gitlab.com/libeigen/eigen/-/merge_requests/1511): Improves IndexedView functionality by adding direct access methods and strides, enhancing performance and flexibility for indexed views in Eigen's core library.",
"[!1510](https://gitlab.com/libeigen/eigen/-/merge_requests/1510): Improved the real Schur decomposition algorithm by adjusting shift application frequency and adding validation to the polynomial solver, enhancing numerical stability and robustness in edge cases.",
"[!1509](https://gitlab.com/libeigen/eigen/-/merge_requests/1509): Improved Eigen's math functions by renaming `generic_fast_tanh_float` to `ptanh_float` and relocating it to appropriate header files, enhancing code organization and consistency across different architectures.",
"[!1506](https://gitlab.com/libeigen/eigen/-/merge_requests/1506): Improves Eigen's trait-based system by replacing `Matrix::Options` with `traits<Matrix>::Options` across multiple eigenvalue, linear algebra, and sparse matrix header files to enhance code consistency and compatibility.",
"[!1505](https://gitlab.com/libeigen/eigen/-/merge_requests/1505): Improves AVX512 float16 packet casting in TypeCasting.h by conditionally disabling unnecessary casting when native AVX512 f16 support is available, potentially enhancing performance and preventing undefined behavior.",
"[!1503](https://gitlab.com/libeigen/eigen/-/merge_requests/1503): Improved the `digits()` function in Eigen's MathFunctions to ensure `constexpr` compatibility for custom scalar types, resolving limitations in precomputation of mantissa bits.",
"[!1499](https://gitlab.com/libeigen/eigen/-/merge_requests/1499): Improved the test/packetmath.cpp file by eliminating a compiler warning related to byte writing, using a `void*` cast to address type casting concerns.",
"[!1491](https://gitlab.com/libeigen/eigen/-/merge_requests/1491): Improved code formatting for BLAS and LAPACK C files in the Eigen library by applying consistent clang-format styling across multiple source files in the blas/f2c and lapack directories.",
"[!1483](https://gitlab.com/libeigen/eigen/-/merge_requests/1483): Improved the ComplexEigenSolver by incorporating stableNorm() to enhance numerical stability during eigenvalue computations.",
"[!1481](https://gitlab.com/libeigen/eigen/-/merge_requests/1481): Improved CI configuration for Eigen's Linux build and test environments by modifying GitLab CI files to ensure consistent GLIBC versions and compatibility with clang-6 in cross-compiled builds.",
"[!1473](https://gitlab.com/libeigen/eigen/-/merge_requests/1473): Improved documentation for LAPACK routines `second` and `dsecnd` in their respective source files, enhancing code clarity and understanding of these timing functions.",
"[!1461](https://gitlab.com/libeigen/eigen/-/merge_requests/1461): Improved the Eigen failtest suite by removing unused warnings in several const-qualified method return value test files, enhancing code clarity and reducing potential warning noise.",
"[!1459](https://gitlab.com/libeigen/eigen/-/merge_requests/1459): Improved the PlainObjectBase class in Eigen's Core module by adding the `constexpr` qualifier, enhancing the library's compile-time expression capabilities and consistency.",
"[!1456](https://gitlab.com/libeigen/eigen/-/merge_requests/1456): Improves memory safety in Eigen's Core memory utility by adding pointer validation checks before freeing memory, preventing potential invalid memory access.",
"[!1454](https://gitlab.com/libeigen/eigen/-/merge_requests/1454): Improved HVX architecture support in Eigen by modifying PacketMath.h to add half and quarter vector types, enabling better vectorization for smaller matrix sizes on Snapdragon XR2 Gen 2.",
"[!1452](https://gitlab.com/libeigen/eigen/-/merge_requests/1452): Improved documentation for basic slicing examples in Eigen's documentation, specifically updating the TutorialSlicingIndexing.dox file to enhance clarity and correctness of slicing-related explanations.",
"[!1450](https://gitlab.com/libeigen/eigen/-/merge_requests/1450): Improved the `stableNorm` implementation in Eigen's Core module to suppress a GCC warning related to potentially uninitialized memory, without changing the function's core behavior.",
"[!1443](https://gitlab.com/libeigen/eigen/-/merge_requests/1443): Updated CI configuration by adding new Linux and Windows testing scripts, replacing old CI files with a more comprehensive testing framework across different platforms.",
"[!1441](https://gitlab.com/libeigen/eigen/-/merge_requests/1441): Improved CI configuration in `checkformat.gitlab-ci.yml` to enable non-interactive `clang-format` mode, reducing manual intervention in the formatting pipeline.",
"[!1438](https://gitlab.com/libeigen/eigen/-/merge_requests/1438): Improved documentation for SparseLU, clarifying the interaction between `compute`, `analyzePattern`, and `factorize` methods to enhance user understanding of the sparse linear solver implementation.",
"[!1437](https://gitlab.com/libeigen/eigen/-/merge_requests/1437): Improved random number generation for 64-bit scalars by modifying the random generation mechanism to ensure sufficient entropy across different platforms, addressing entropy limitations with `std::rand()`.",
"[!1435](https://gitlab.com/libeigen/eigen/-/merge_requests/1435): Improved GPU testing infrastructure by modifying `test/gpu_common.h` to protect kernel launch syntax from clang-format errors across versions 13-18.",
"[!1433](https://gitlab.com/libeigen/eigen/-/merge_requests/1433): Updated .git-blame-ignore-revs file to modify git blame configuration, likely for improved code attribution or repository management.",
"[!1432](https://gitlab.com/libeigen/eigen/-/merge_requests/1432): Optimized matrix multiplication performance by implementing Eigen's internal optimizations and adding a new strongly typed algebraic matrix multiplication function across multiple benchmark and performance-related files.",
"[!1428](https://gitlab.com/libeigen/eigen/-/merge_requests/1428): Improved CI infrastructure by adding a clang-format check to ensure consistent code formatting across Eigen library commits.",
"[!1424](https://gitlab.com/libeigen/eigen/-/merge_requests/1424): Optimized the GeneralMatrixVector.h header to improve matrix-vector multiplication performance for packet sizes that are powers of two, while maintaining optimal behavior across different configurations.",
"[!1421](https://gitlab.com/libeigen/eigen/-/merge_requests/1421): Optimized the GeneralMatrixVector implementation in Eigen's core module by explicitly defining loop bounds and improving bitwise rounding operations to reduce compiler warnings and minimize performance overhead.",
"[!1404](https://gitlab.com/libeigen/eigen/-/merge_requests/1404): Improves CMake documentation build configuration to avoid unnecessary documentation generation during cross-compilation, enhancing build efficiency for cross-compilation environments.",
"[!1400](https://gitlab.com/libeigen/eigen/-/merge_requests/1400): Improves the `div_ceil` function in Eigen's Core module by passing arguments by value to prevent potential ODR-usage errors and ensure safer implicit conversion handling.",
"[!1399](https://gitlab.com/libeigen/eigen/-/merge_requests/1399): Improved warning handling in Eigen's utility headers by disabling denorm deprecation warnings for MSVC C++23, reducing build noise in compiler configurations.",
"[!1393](https://gitlab.com/libeigen/eigen/-/merge_requests/1393): Updated ROCm build configuration by replacing HIP_PATH with ROCM_PATH in CMakeLists.txt files to improve compatibility with ROCm 6.0 directory structure.",
"[!1392](https://gitlab.com/libeigen/eigen/-/merge_requests/1392): Improved CUDA device function compatibility in Transform.h by adding EIGEN_DEVICE_FUNC attribute to static run methods, resolving issues with operator * calls on device functions.",
"[!1389](https://gitlab.com/libeigen/eigen/-/merge_requests/1389): Improves GEMM MMA performance for AltiVec architecture by adding new panel modes for real and complex matrix operations, delivering significant speedups across different matrix sizes and numeric types.",
"[!1387](https://gitlab.com/libeigen/eigen/-/merge_requests/1387): Improved block expression handling in Eigen's core module by adding an explicit method to convert block of block expressions to simple blocks and removing implicit conversion operators to prevent unwinding issues.",
"[!1385](https://gitlab.com/libeigen/eigen/-/merge_requests/1385): Improved Eigen plugin headers by renaming several `.h` files to `.inc` to prevent unintended tool interactions and clarify header usage in build processes.",
"[!1381](https://gitlab.com/libeigen/eigen/-/merge_requests/1381): Updated the Boost multiprecision test file to reference new SVD tests, ensuring compatibility and maintaining test suite alignment.",
"[!1373](https://gitlab.com/libeigen/eigen/-/merge_requests/1373): Improved NumTraits by adding max_digits10 function to enhance precision handling for double types in serialization contexts, ensuring consistent behavior with standard library implementations.",
"[!1365](https://gitlab.com/libeigen/eigen/-/merge_requests/1365): Improved x86 type casting support in Eigen by adding missing pcasts for float and int conversions, simplifying the pcast enabling mechanism, and cleaning up array_cwise implementation to reduce warnings.",
"[!1364](https://gitlab.com/libeigen/eigen/-/merge_requests/1364): Optimized the `check_rows_cols_for_overflow` function in Eigen's core matrix utilities by adding partial template specialization for compile-time dimension checks, improving performance for specific matrix types with known dimensions.",
"[!1361](https://gitlab.com/libeigen/eigen/-/merge_requests/1361): Improved Altivec support by fixing compilation compatibility with C++20 and C++23 standards in the MatrixVectorProduct header, removing an unnecessary constructor name.",
"[!1357](https://gitlab.com/libeigen/eigen/-/merge_requests/1357): Improved Altivec architecture support in Eigen's Matrix-Matrix Multiplication operations by modifying the MatrixProduct.h file to comply with the EIGEN_ALTIVEC_MMA_DYNAMIC_DISPATCH compilation flag.",
"[!1356](https://gitlab.com/libeigen/eigen/-/merge_requests/1356): Improved Eigen's Macros.h by ensuring EIGEN_HAS_ARM64_FP16_VECTOR_ARITHMETIC is always defined on ARM architectures, eliminating compilation warnings for clang.",
"[!1355](https://gitlab.com/libeigen/eigen/-/merge_requests/1355): Restricts FP16 arithmetic support in Eigen's NEON architecture implementation by disabling intrinsics for arm32, ensuring compatibility with Arm's developer guidelines.",
"[!1354](https://gitlab.com/libeigen/eigen/-/merge_requests/1354): Enhances AltiVec complex and packet math implementations by adding optional offset parameters to `ploadu_partial` and `pstoreu_partial` functions, improving memory access flexibility.",
"[!1351](https://gitlab.com/libeigen/eigen/-/merge_requests/1351): Improved SVD (Singular Value Decomposition) test suite by removing deprecated behavior tests and reducing resource consumption in test files.",
"[!1347](https://gitlab.com/libeigen/eigen/-/merge_requests/1347): Enhances the `Ref<const...>` construction in Eigen's Core module by adding compile-time assertions to prevent potential runtime errors and improve error detection during object construction.",
"[!1346](https://gitlab.com/libeigen/eigen/-/merge_requests/1346): Improved the `Ref<const...>` class in Eigen's Core module by adding a move constructor to reduce memory copy operations for dynamically allocated data.",
"[!1344](https://gitlab.com/libeigen/eigen/-/merge_requests/1344): Improved the `prsqrt` function in MathFunctionsImpl.h to prevent underflow errors by enhancing numerical stability for small input values.",
"[!1342](https://gitlab.com/libeigen/eigen/-/merge_requests/1342): Improved the `rsqrt` function in MathFunctionsImpl.h by reducing the maximum relative error from 3 to 2 ulps, enhancing numerical precision for floating-point square root calculations.",
"[!1341](https://gitlab.com/libeigen/eigen/-/merge_requests/1341): Improved GPU tensor benchmarks by replacing CudaStreamDevice with GpuStreamDevice in benchmark files, resolving a stream device usage issue.",
"[!1338](https://gitlab.com/libeigen/eigen/-/merge_requests/1338): Improved error handling in scalar_unary_pow_op for integer base and exponent operations, reducing code complexity and optimizing performance in Eigen's core mathematical functions.",
"[!1336](https://gitlab.com/libeigen/eigen/-/merge_requests/1336): Improved linear access evaluators in Eigen's Redux (reduction) module by implementing new traversal methods for scalar and vectorized unrolled traversals, enhancing performance and simplifying traversal logic.",
"[!1334](https://gitlab.com/libeigen/eigen/-/merge_requests/1334): Improved the unrolled assignment evaluator in Eigen's Core module by fixing data access interfaces and changing template parameter names to reduce unpredictable access patterns.",
"[!1325](https://gitlab.com/libeigen/eigen/-/merge_requests/1325): Improved the array_cwise test file by suppressing compiler warnings and renaming the test to avoid potential naming conflicts with tensor array() function.",
"[!1321](https://gitlab.com/libeigen/eigen/-/merge_requests/1321): Improved the array_cwise test file by addressing MSVC compiler warnings and removing redundant shift tests, enhancing code clarity and robustness for Windows builds.",
"[!1317](https://gitlab.com/libeigen/eigen/-/merge_requests/1317): Improves F32 to BF16 conversion performance in AltiVec architecture by unrolling conversion loops, achieving 1.8X faster conversions for LLVM and implementing vector pair optimizations for GCC.",
"[!1316](https://gitlab.com/libeigen/eigen/-/merge_requests/1316): Improved SSE packet math support by adding `pcmp`, `pmin`, and `pmax` functions to `Packet4ui` in the SSE PacketMath header, enabling better compilation for SSE4.1 vector types.",
"[!1313](https://gitlab.com/libeigen/eigen/-/merge_requests/1313): Improved AVX2 packet math support by adding `pmul` and `abs2` operations for 4 unsigned 64-bit integers in the `Packet4ul` implementation, resolving compilation issues and enhancing vectorized operation performance.",
"[!1311](https://gitlab.com/libeigen/eigen/-/merge_requests/1311): Improved sparse matrix iterator compatibility by making `StorageRef` move-able and addressing deprecated `std::random_shuffle` warnings in sparse matrix test files.",
"[!1307](https://gitlab.com/libeigen/eigen/-/merge_requests/1307): Improved VSX BF16 GEMV performance for Power architectures, achieving up to 6.7X faster matrix-vector multiplication operations through optimized implementations in AltiVec matrix processing files.",
"[!1304](https://gitlab.com/libeigen/eigen/-/merge_requests/1304): Improved type casting performance in Eigen's Core module by optimizing AVX instruction handling, specializing scalar cast evaluators, and reducing overhead for complex cast expressions.",
"[!1301](https://gitlab.com/libeigen/eigen/-/merge_requests/1301): Improves Eigen's Euler angles functionality by implementing canonical range enforcement for Tait-Bryan and proper Euler angle transformations, with a new optional parameter to maintain backward compatibility.",
"[!1295](https://gitlab.com/libeigen/eigen/-/merge_requests/1295): Refactored IndexedView implementation to reduce SFINAE complexity and improve maintainability, simplifying the public API and re-enabling raw, fixed-size array access.",
"[!1293](https://gitlab.com/libeigen/eigen/-/merge_requests/1293): Improves the AVX512 GEMM kernel by enabling the new kernel as a default option in the Eigen Core module, enhancing performance for matrix operations on modern CPU architectures.",
"[!1288](https://gitlab.com/libeigen/eigen/-/merge_requests/1288): Updated documentation files in the 3.4.x branch to reflect recent code changes, focusing on examples and tutorial documentation for matrix operations and indexing.",
"[!1286](https://gitlab.com/libeigen/eigen/-/merge_requests/1286): Enhances symbolic indexed view safety in Eigen by adding an explicit l-value qualifier to prevent incorrect view usage and fix potential type mismatches in Map expressions.",
"[!1284](https://gitlab.com/libeigen/eigen/-/merge_requests/1284): Cleaned up packet math implementations across multiple architecture-specific files by removing unused traits and adding missing specializations for `pselect` and `pblend` operations.",
"[!1279](https://gitlab.com/libeigen/eigen/-/merge_requests/1279): Refactored IndexedViewMethods to reduce code duplication and enhance flexibility, enabling non-const reference access for indexed views with symbolic indices in the Eigen core module.",
"[!1276](https://gitlab.com/libeigen/eigen/-/merge_requests/1276): Improved the `generic_rsqrt_newton_step` function in MathFunctionsImpl.h by optimizing operation order and reducing floating point comparisons, resulting in better accuracy and slightly faster AVX path performance.",
"[!1275](https://gitlab.com/libeigen/eigen/-/merge_requests/1275): Improved x86 vectorized type casting by adding missing int vectorization support and removing redundant unit tests in Eigen's architecture-specific type casting headers.",
"[!1274](https://gitlab.com/libeigen/eigen/-/merge_requests/1274): Improved AVX2 float-to-bool type casting performance in Eigen's TypeCasting.h by optimizing the conversion routine, resulting in significant speedups for large data sizes.",
"[!1273](https://gitlab.com/libeigen/eigen/-/merge_requests/1273): Improved Eigen's core utility files by replacing internal pointer type definitions with standard C++ pointer types, enhancing compatibility with CHERI/Morello architecture and removing unnecessary Intel compiler workarounds.",
"[!1272](https://gitlab.com/libeigen/eigen/-/merge_requests/1272): Optimized type casting operations for x86_64 architectures in Eigen's AVX, SSE, and AVX512 TypeCasting headers, improving performance for tensor operations and bool casting on x86_64 platforms.",
"[!1267](https://gitlab.com/libeigen/eigen/-/merge_requests/1267): Improved documentation quality by fixing typos across multiple Eigen documentation files, including Constants.h, example code, and documentation pages.",
"[!1264](https://gitlab.com/libeigen/eigen/-/merge_requests/1264): Improved MathFunctions.h in Eigen's Core module by using the EIGEN_NOT_A_MACRO macro to resolve build configuration conflicts with TensorFlow.",
"[!1262](https://gitlab.com/libeigen/eigen/-/merge_requests/1262): Improved GitLab CI configuration for PowerPC builds by limiting the number of build and link jobs to reduce out-of-memory (OOM) issues during continuous integration.",
"[!1260](https://gitlab.com/libeigen/eigen/-/merge_requests/1260): Improved Eigen's math functions by adding C++11 standard features for detecting Inf and NaN, enhancing compiler compatibility in core numeric detection methods.",
"[!1259](https://gitlab.com/libeigen/eigen/-/merge_requests/1259): Improved the MatrixProductMMAbfloat16.h file by adding deadcode checks to prevent unused code from being optimized away, maintaining code integrity in the Eigen library's core architecture.",
"[!1257](https://gitlab.com/libeigen/eigen/-/merge_requests/1257): Improved the minmax visitor in Eigen's Visitor.h to handle PropagateFast consistently with PropagateNaN, ensuring correct propagation logic for matrices with all NaN values.",
"[!1255](https://gitlab.com/libeigen/eigen/-/merge_requests/1255): Improves GEMV performance for BF16 data types on Power architecture by implementing MMA (Matrix Multiply Assist) instructions in AltiVec-specific matrix vector product files.",
"[!1254](https://gitlab.com/libeigen/eigen/-/merge_requests/1254): Improved the Select implementation in Eigen's core module by swapping template argument order to maintain backwards compatibility with legacy code.",
"[!1253](https://gitlab.com/libeigen/eigen/-/merge_requests/1253): Improved packetmath specializations across multiple backend architectures by introducing a macro to reduce code duplication and enhance maintainability in Eigen's core math functions.",
"[!1251](https://gitlab.com/libeigen/eigen/-/merge_requests/1251): Improved the CommonCwiseBinaryOps.h header file by adding a newline character at the end, ensuring proper file formatting and consistency.",
"[!1244](https://gitlab.com/libeigen/eigen/-/merge_requests/1244): Enhances LU decomposition classes by adding support for specifying permutation index types, improving compatibility with Lapacke ILP64 interfaces and providing more flexibility in handling matrix decompositions.",
"[!1242](https://gitlab.com/libeigen/eigen/-/merge_requests/1242): Optimizes eigenvalue computation in SelfAdjointEigenSolver by pre-allocating workspace columns and improving memory allocation for in-place tridiagonalization.",
"[!1241](https://gitlab.com/libeigen/eigen/-/merge_requests/1241): Improves CMake configuration in Eigen's build system by conditionally setting cache variables only when Eigen is the top-level project, preventing unintended side effects in external project builds.",
"[!1236](https://gitlab.com/libeigen/eigen/-/merge_requests/1236): Improved bfloat16 GEMM MMA performance on Power architecture by adding partial linear access for LHS and Output, reducing memory loads and achieving 30% faster execution.",
"[!1234](https://gitlab.com/libeigen/eigen/-/merge_requests/1234): Improved BLAS/LAPACK header organization by removing unused declarations and restructuring header files into more logical directories for better maintainability.",
"[!1233](https://gitlab.com/libeigen/eigen/-/merge_requests/1233): Improved Eigen's visitor performance by vectorizing `any()` and `all()` operations, enabling short-circuit evaluation and linear access for more efficient matrix traversals.",
"[!1232](https://gitlab.com/libeigen/eigen/-/merge_requests/1232): Improved GPU device support by guarding long double usage in core Eigen utility files, preventing warnings and ensuring consistent behavior across CUDA/HIP implementations.",
"[!1226](https://gitlab.com/libeigen/eigen/-/merge_requests/1226): Optimized the `pow()` function in Eigen's generic packet math functions by using `pmsub` instruction, achieving a ~1% performance improvement on Skylake architecture.",
"[!1224](https://gitlab.com/libeigen/eigen/-/merge_requests/1224): Improved Power10 architecture support in Eigen by adding packet integer division operations to the AltiVec PacketMath header, enhancing performance for integer packet computations.",
"[!1223](https://gitlab.com/libeigen/eigen/-/merge_requests/1223): Improves mathematical functions in Eigen's core library by adding vectorized implementations of atanh, completing atan support for half-precision floats, and enhancing unit tests for unary mathematical functors across multiple architectures.",
"[!1221](https://gitlab.com/libeigen/eigen/-/merge_requests/1221): Improved complex sqrt functionality in AVX512 Complex header by adding compiler compatibility guards for older MSVC versions, preventing potential compilation failures on legacy systems.",
"[!1219](https://gitlab.com/libeigen/eigen/-/merge_requests/1219): Optimized the `pasin_float` function in the core Eigen library, reducing runtime by ~11% with AVX support, and fixed `psqrt_complex` to handle special cases more accurately.",
"[!1215](https://gitlab.com/libeigen/eigen/-/merge_requests/1215): Reduced compiler warnings in Eigen test files across multiple test modules, addressing potential compilation issues and improving code quality.",
"[!1214](https://gitlab.com/libeigen/eigen/-/merge_requests/1214): Optimized BF16 to F32 array conversions in Power architecture's MatrixProductMMAbfloat16.h, reducing vector instruction overhead and improving conversion performance.",
"[!1213](https://gitlab.com/libeigen/eigen/-/merge_requests/1213): Improved compiler warning handling across multiple Eigen core source files by addressing potential warnings in BinaryFunctors.h, TriangularMatrixVector.h, PlainObjectBase.h, Jacobi.h, and unalignedcount.cpp.",
"[!1210](https://gitlab.com/libeigen/eigen/-/merge_requests/1210): Improves bfloat16 Matrix Multiply-Accumulate (MMA) performance in Eigen's AltiVec architecture by optimizing accumulator usage, resulting in up to 10% speed improvement through more efficient column calculations and increased accumulator count.",
"[!1209](https://gitlab.com/libeigen/eigen/-/merge_requests/1209): Improves diagonal matrix printing functionality in Eigen's Core module by adding direct support for printing diagonal matrix expressions without requiring dense object assignment.",
"[!1208](https://gitlab.com/libeigen/eigen/-/merge_requests/1208): Improved AltiVec matrix product implementation by reverting ODR changes and inlining `gemm_extra_cols` and `gemm_complex_extra_cols` functions to reduce external function calls and enhance performance.",
"[!1207](https://gitlab.com/libeigen/eigen/-/merge_requests/1207): Optimized the `psign` function in Eigen's core architecture by reducing logical operations and improving AVX2 instruction performance.",
"[!1206](https://gitlab.com/libeigen/eigen/-/merge_requests/1206): Improved the ColPivHouseholderQR_LAPACKE.h header by enhancing LAPACKE complex type compatibility through specialized type translation mechanisms for std::complex types.",
"[!1199](https://gitlab.com/libeigen/eigen/-/merge_requests/1199): Improved Eigen library headers by adding Include What You Use (IWYU) export pragmas across multiple modules to enhance tooling support and header inclusion management.",
"[!1198](https://gitlab.com/libeigen/eigen/-/merge_requests/1198): Improved Power library performance by replacing `eigen_assert` with `eigen_internal_assert` in AltiVec complex and packet math files, reducing unnecessary error checking and assertions.",
"[!1191](https://gitlab.com/libeigen/eigen/-/merge_requests/1191): Improved LAPACKE configuration by modifying complex type definitions and adding support for 64-bit integer LAPACK bindings, enhancing compatibility with external LAPACK libraries.",
"[!1190](https://gitlab.com/libeigen/eigen/-/merge_requests/1190): Improved test code in `test/array_for_matrix.cpp` by replacing comparison methods with `VERIFY_IS_EQUAL` macro for better clarity and consistency in zero comparisons.",
"[!1189](https://gitlab.com/libeigen/eigen/-/merge_requests/1189): Improved the SkewSymmetricMatrix3.h header by adding CUDA device function qualifiers, enabling CUDA kernel compatibility for the SkewSymmetric<> template.",
"[!1176](https://gitlab.com/libeigen/eigen/-/merge_requests/1176): Improved Eigen's packet mathematical operations by fixing edge cases and optimizing functions like atan, pow, and acos, with specific attention to handling special values like -0.",
"[!1175](https://gitlab.com/libeigen/eigen/-/merge_requests/1175): Improved Eigen's atan2 implementation across multiple math-related header files, adding `numext::atan2` and `patan2` support while fixing a test case bug in binary operations.",
"[!1174](https://gitlab.com/libeigen/eigen/-/merge_requests/1174): Improved bfloat16 Matrix Multiply Accumulate (MMA) performance on AltiVec by optimizing packing and processing for non-aligned row and column sizes, specifically modifying matrix product handling in AltiVec architecture files.",
"[!1172](https://gitlab.com/libeigen/eigen/-/merge_requests/1172): Refactored SparseMatrix.h to improve code consistency by removing encapsulation wrappers and enabling more direct access to internal data members.",
"[!1170](https://gitlab.com/libeigen/eigen/-/merge_requests/1170): Improved sparse matrix insertion performance by optimizing memory allocation and insertion strategies in CompressedStorage and SparseMatrix, reducing overhead for large matrix operations.",
"[!1169](https://gitlab.com/libeigen/eigen/-/merge_requests/1169): Updated CMake configuration in EigenTesting.cmake by replacing the deprecated `$<CONFIGURATION>` generator expression with the newer `$<CONFIG>` to ensure compatibility with CMake 3.0 and later versions.",
"[!1168](https://gitlab.com/libeigen/eigen/-/merge_requests/1168): Improves thread safety for the `is_malloc_allowed()` function in Eigen's Memory.h by adding thread-local state support, with an optional compiler flag to disable thread-local behavior for single-threaded applications.",
"[!1167](https://gitlab.com/libeigen/eigen/-/merge_requests/1167): Improved the ColPivHouseholderQR implementation in Eigen by modifying QR decomposition files to avoid move assignment compiler issues, reducing potential compilation errors.",
"[!1166](https://gitlab.com/libeigen/eigen/-/merge_requests/1166): Improves Eigen's assert mechanism by adding a custom ODR-safe assert implementation in Core utility headers, supporting multiple compilers and resolving potential ODR violations in header files.",
"[!1165](https://gitlab.com/libeigen/eigen/-/merge_requests/1165): Improved Eigen's core utility functions by adding missing `EIGEN_DEVICE_FUNC` annotations and removing an outdated GCC 4.7 workaround to reduce potential undefined behavior in assert contexts.",
"[!1160](https://gitlab.com/libeigen/eigen/-/merge_requests/1160): Improves SparseMatrix insertion strategy by modifying the insertion process to optimize performance, reducing compression overhead and providing more flexible matrix manipulation.",
"[!1158](https://gitlab.com/libeigen/eigen/-/merge_requests/1158): Improved the help message in spbenchsolver to clarify matrix naming conventions for SPD matrices, enhancing user guidance in the benchmarking tool.",
"[!1154](https://gitlab.com/libeigen/eigen/-/merge_requests/1154): Improved Power10 MMA bfloat16 GEMM performance in Eigen's AltiVec architecture by optimizing matrix product handling, including data packing, indexing, and hardware conversions, resulting in significant speedups across different compilers.",
"[!1148](https://gitlab.com/libeigen/eigen/-/merge_requests/1148): Improves Eigen's memory allocation handling by adding runtime checks for malloc, realloc, and free() functions in the Core memory utility, preventing unexpected heap crashes and providing better memory management visibility.",
"[!1146](https://gitlab.com/libeigen/eigen/-/merge_requests/1146): Improved NEON architecture support by enabling complex number operations like `pcmp`, `plset`, and `psqrt`, enhancing performance for complex number handling in the Eigen library's NEON implementation.",
"[!1145](https://gitlab.com/libeigen/eigen/-/merge_requests/1145): Adjusted thresholds in bfloat16 product tests to improve test reliability by increasing comparison tolerances in the test/product.h file.",
"[!1141](https://gitlab.com/libeigen/eigen/-/merge_requests/1141): Improved NEON packet math support by enabling `pabs` operations for unsigned integer types (uint16_t, uint32_t, uint64_t), addressing a discrepancy in packet traits implementation.",
"[!1140](https://gitlab.com/libeigen/eigen/-/merge_requests/1140): Improved SparseLU implementation by removing deprecated code, fixing a subtle bug in SparseLUTransposeView, and enhancing compatibility with BLAS backends to reduce numerical instability in sparse LU computations.",
"[!1139](https://gitlab.com/libeigen/eigen/-/merge_requests/1139): Improved the CompressedStorageIterator by adding comparison and arithmetic operators, enhancing its usability and compatibility with RandomAccessIterator requirements.",
"[!1138](https://gitlab.com/libeigen/eigen/-/merge_requests/1138): Improved the `numext::signbit` test suite by adding a new test case in `test/numext.cpp` to enhance function behavior validation and test coverage.",
"[!1137](https://gitlab.com/libeigen/eigen/-/merge_requests/1137): Improved test suite compatibility by replacing std::signbit with numext::signbit in array_cwise.cpp to support bfloat16 type testing.",
"[!1136](https://gitlab.com/libeigen/eigen/-/merge_requests/1136): Improved compiler version compatibility across Eigen's vectorization architectures (AVX, SSE, NEON, etc.) by reviewing and cleaning up compiler version checks in core library and test files.",
"[!1135](https://gitlab.com/libeigen/eigen/-/merge_requests/1135): Improves Eigen's Core module by replacing `std::raise()` with a more compatible division by zero handling mechanism, ensuring better support for embedded systems and platforms without `<csignal>`.",
"[!1134](https://gitlab.com/libeigen/eigen/-/merge_requests/1134): Optimized the `equalspace` packet operation in the `NullaryFunctors.h` file to improve performance of the Eigen library's core functionality.",
"[!1131](https://gitlab.com/libeigen/eigen/-/merge_requests/1131): Improved Power10 cache size configuration in GeneralBlockPanelKernel.h to optimize matrix operation performance, specifically enhancing sub-matrix splitting and GEMM operation efficiency.",
"[!1128](https://gitlab.com/libeigen/eigen/-/merge_requests/1128): Improves NestByValue functionality by enabling direct access support, reducing overhead and enhancing performance for expressions with direct access capabilities.",
"[!1122](https://gitlab.com/libeigen/eigen/-/merge_requests/1122): Reduced compiler warnings in Eigen test files by modifying test/adjoint.cpp, test/array_cwise.cpp, and test/visitor.cpp to address narrowing conversions and deprecated function usage.",
"[!1119](https://gitlab.com/libeigen/eigen/-/merge_requests/1119): Improved AltiVec architecture source code by adding brackets around unsigned type names in the PacketMath.h file, enhancing code readability and formatting consistency.",
"[!1114](https://gitlab.com/libeigen/eigen/-/merge_requests/1114): Improves the BiCGSTAB solver in Eigen's iterative linear solvers by modifying parameter initialization to support custom types, enhancing flexibility for users implementing specialized linear solving methods.",
"[!1110](https://gitlab.com/libeigen/eigen/-/merge_requests/1110): Improved the DenseStorage.h file by removing an unused parameter name, enhancing code clarity and reducing potential confusion in the implementation.",
"[!1101](https://gitlab.com/libeigen/eigen/-/merge_requests/1101): Improves memory management in Eigen's Memory.h utility by modifying aligned allocation handling to store a 1-byte offset instead of the absolute address, enhancing allocation compatibility and alignment precision.",
"[!1100](https://gitlab.com/libeigen/eigen/-/merge_requests/1100): Improved matrix resizing functionality in Eigen's core storage module by adding support for resizing empty dynamic matrices, fixing dimension reporting and improving flexibility for linear algebra operations.",
"[!1099](https://gitlab.com/libeigen/eigen/-/merge_requests/1099): Improves the SparseMap class by explicitly requiring sorted indices, enhancing the consistency and correctness of sparse matrix operations in the Eigen library.",
"[!1095](https://gitlab.com/libeigen/eigen/-/merge_requests/1095): Improved Eigen's mathematical function testing by refactoring special values tests for pow and atan2 functions in BinaryFunctors.h and array_cwise.cpp, enhancing test coverage and code maintainability.",
"[!1093](https://gitlab.com/libeigen/eigen/-/merge_requests/1093): Improved the atan2 function in Eigen's Core module to handle NaN inputs more robustly, ensuring consistent mathematical behavior when encountering undefined values.",
"[!1091](https://gitlab.com/libeigen/eigen/-/merge_requests/1091): Improved clang-format configuration by adding new attribute macros to enhance automated code formatting consistency in the Eigen library.",
"[!1090](https://gitlab.com/libeigen/eigen/-/merge_requests/1090): Improves Eigen's core matrix and array constructors by enabling std::initializer_list support in constexpr expressions, enhancing compile-time initialization capabilities across multiple core header files.",
"[!1089](https://gitlab.com/libeigen/eigen/-/merge_requests/1089): Improved Eigen's math support by unconditionally enabling CXX11 math features for C++14 and newer compilers, modifying core header files to ensure consistent and enhanced mathematical functionality.",
"[!1088](https://gitlab.com/libeigen/eigen/-/merge_requests/1088): Improved Eigen's assertion mechanism by systematically replacing standard `assert` with `eigen_assert` across multiple source files, enabling more consistent and controllable assertion handling.",
"[!1087](https://gitlab.com/libeigen/eigen/-/merge_requests/1087): Improved the `atan<float>()` implementation in Eigen's default packet math functions by simplifying the range reduction strategy, resulting in 20-40% performance gains on x86 architectures.",
"[!1086](https://gitlab.com/libeigen/eigen/-/merge_requests/1086): Improved Altivec vectorization in MathFunctions.h by conditionally enabling atan<double> vectorization only when VSX is available, reducing unnecessary vectorization.",
"[!1084](https://gitlab.com/libeigen/eigen/-/merge_requests/1084): Improved Eigen's vectorized math functions by optimizing the `atan()` implementation for double-precision arguments across multiple architecture-specific headers, enhancing computational performance and accuracy.",
"[!1083](https://gitlab.com/libeigen/eigen/-/merge_requests/1083): Improved the GEBP (General Block Panel) kernel in GeneralBlockPanelKernel.h to reduce memory usage for non-ARM targets, specifically addressing potential heap memory issues when building with MSVC.",
"[!1079](https://gitlab.com/libeigen/eigen/-/merge_requests/1079): Improved the GEBP (General Block Panel) kernel compilation process by applying `EIGEN_IF_CONSTEXPR` in AVX512 and kernel-related files to reduce compilation time and memory usage.",
"[!1078](https://gitlab.com/libeigen/eigen/-/merge_requests/1078): Improved the NEON GEBP kernel by adding a macro to set the `nr` trait, enhancing support for matrix operations on NEON architecture in the Eigen library.",
"[!1075](https://gitlab.com/libeigen/eigen/-/merge_requests/1075): Improves complex number sign function handling in Eigen's core math functions by optimizing generic packet math functions to avoid inefficient sign calculations when vectorization is not supported.",
"[!1066](https://gitlab.com/libeigen/eigen/-/merge_requests/1066): Improves Eigen's pow() functionality by modifying UnaryFunctors.h and NumTraits.h to support mixed type power operations with better type handling and flexibility.",
"[!1058](https://gitlab.com/libeigen/eigen/-/merge_requests/1058): Improved GPU packet comparison operators in PacketMath.h to enable vectorized implementation of psign, resolving a build issue for CUDA compilation.",
"[!1057](https://gitlab.com/libeigen/eigen/-/merge_requests/1057): Improved pow test bounds in array_cwise.cpp to prevent integer overflow, ensuring more stable CI test execution.",
"[!1056](https://gitlab.com/libeigen/eigen/-/merge_requests/1056): Reduced compiler warnings in Eigen's test infrastructure by modifying GenericPacketMathFunctions.h and array_cwise.cpp to improve build stability and code clarity.",
"[!1055](https://gitlab.com/libeigen/eigen/-/merge_requests/1055): Improves memory allocation handling in Eigen's Memory.h by adding a runtime malloc check in the aligned_realloc() function to prevent reallocation when EIGEN_RUNTIME_NO_MALLOC is defined.",
"[!1052](https://gitlab.com/libeigen/eigen/-/merge_requests/1052): Improved CMake configuration by removing default benchmark building and fixing test dependency issues in sparse library environments.",
"[!1050](https://gitlab.com/libeigen/eigen/-/merge_requests/1050): Improves IndexedView robustness by adding index-out-of-bounds assertion checks in the IndexedView class to prevent invalid memory accesses.",
"[!1049](https://gitlab.com/libeigen/eigen/-/merge_requests/1049): Improved documentation in the slicing tutorial by correcting two typos in the 3rd table, replacing \"vector `v`\" with \"matrix `A`\" to enhance clarity and accuracy.",
"[!1048](https://gitlab.com/libeigen/eigen/-/merge_requests/1048): Improved complex power operations in Eigen's UnaryFunctors by fixing return type handling and removing unnecessary const qualifiers for better compatibility with ScalarBinaryOpTraits.",
"[!1046](https://gitlab.com/libeigen/eigen/-/merge_requests/1046): Improved Eigen's complex number support by re-enabling the `pow()` function for complex types across multiple core header files, expanding mathematical operation capabilities for complex number handling.",
"[!1043](https://gitlab.com/libeigen/eigen/-/merge_requests/1043): Improves Eigen's vectorized pow operations for integer types by adding support for negative exponents and preventing undefined behavior during integer overflow across multiple architecture-specific packet math implementations.",
"[!1040](https://gitlab.com/libeigen/eigen/-/merge_requests/1040): Improves AVX2 performance by specializing the `psign<Packet8i>` function with a more efficient implementation and removing unnecessary vectorization for `psign<bool>`.",
"[!1038](https://gitlab.com/libeigen/eigen/-/merge_requests/1038): Improved Eigen's vectorized math functions by adding optimized implementations of acos(), asin(), and atan() for float across multiple hardware architectures, delivering performance speedups of up to 29.5x.",
"[!1037](https://gitlab.com/libeigen/eigen/-/merge_requests/1037): Improved AVX PacketMath implementation by protecting the new pblend implementation with EIGEN_VECTORIZE_AVX2, ensuring compatibility with AVX2 architectures.",
"[!1036](https://gitlab.com/libeigen/eigen/-/merge_requests/1036): Improves Sparse Core memory management by replacing malloc/free with conditional_aligned storage in CompressedStorage, SparseAssign, and SparseMatrix, enabling better heap allocation tracking and potential vectorization performance gains.",
"[!1035](https://gitlab.com/libeigen/eigen/-/merge_requests/1035): Improved AVX512 packetmath intrinsics by removing unnecessary FP16C flag checks in PacketMath.h, enabling better performance for users with AVX512F support.",
"[!1034](https://gitlab.com/libeigen/eigen/-/merge_requests/1034): Improved the `pow<double>` implementation in GenericPacketMathFunctions.h by replacing the approximate reciprocal division algorithm with a more accurate method, resulting in an 11-15% performance speedup.",
"[!1032](https://gitlab.com/libeigen/eigen/-/merge_requests/1032): Improved BDCSVD warning handling by modifying Macros.h and BDCSVD.h to disable unnecessary deprecated warnings when no computation options are set in the SVD computation.",
"[!1026](https://gitlab.com/libeigen/eigen/-/merge_requests/1026): Improves performance of the sign operator in Eigen by vectorizing operations for real and complex types across SSE, AVX, and AVX512 architectures, targeting core mathematical functions and packet math implementations.",
"[!1024](https://gitlab.com/libeigen/eigen/-/merge_requests/1024): Improved PowerPC GEMM real-only operations by adding partial packet support and optimizing code, resulting in up to 40% reduction in binary size for AltiVec architecture.",
"[!1021](https://gitlab.com/libeigen/eigen/-/merge_requests/1021): Updated AccelerateSupport documentation to align with recent code changes, ensuring accuracy of the module's reference materials.",
"[!1020](https://gitlab.com/libeigen/eigen/-/merge_requests/1020): Improves the ConjugateGradient solver by adding support for `numext::sqrt`, enabling more flexible usage with custom floating-point types like `__float128`.",
"[!1019](https://gitlab.com/libeigen/eigen/-/merge_requests/1019): Improved Eigen's Core and SparseCore modules to avoid including <sstream> when EIGEN_NO_IO is defined, enabling better compatibility with embedded environments using libc++ without localization support.",
"[!1018](https://gitlab.com/libeigen/eigen/-/merge_requests/1018): Improved arm64-neon matrix multiplication performance by implementing larger gebp_kernel sizes (3px8, 2px8, 1px8) in GeneralBlockPanelKernel to optimize register usage and data reuse.",
"[!1016](https://gitlab.com/libeigen/eigen/-/merge_requests/1016): Improved Eigen's vectorization support for Emscripten by adding the `immintrin.h` header to the ConfigureVectorization.h file, enabling better performance for vectorized operations.",
"[!1015](https://gitlab.com/libeigen/eigen/-/merge_requests/1015): Improved AVX512 GEMM kernels by disabling them by default to prevent potential segfaults in applications, modifying core architecture-specific header files.",
"[!1013](https://gitlab.com/libeigen/eigen/-/merge_requests/1013): Improved AVX512 GEBP kernels by adding compiler flag options to enable or disable AVX512 support in GemmKernel.h and TrsmKernel.h, reducing related warnings.",
"[!1012](https://gitlab.com/libeigen/eigen/-/merge_requests/1012): Improved Jacobi rotation vectorization in Eigen by modifying the vectorization check logic in Jacobi.h and jacobi.cpp, enabling better compiler optimization for fixed-size code paths.",
"[!1011](https://gitlab.com/libeigen/eigen/-/merge_requests/1011): Improved AVX implementation of pblend in PacketMath.h by removing vcvtdq2ps instruction and optimizing integer-based operations, resulting in a 24.84% performance boost for blend operations.",
"[!1009](https://gitlab.com/libeigen/eigen/-/merge_requests/1009): Improved doxygen group definitions in PlainObjectBase.h to ensure consistent and correct documentation generation for Eigen's core classes.",
"[!1005](https://gitlab.com/libeigen/eigen/-/merge_requests/1005): Improved GPU unit tests by enabling device side malloc functionality for ROCm 5.2, specifically modifying the test/gpu_example.cu file to restore compatibility with the latest ROCm version.",
"[!1003](https://gitlab.com/libeigen/eigen/-/merge_requests/1003): Reduces compiler warnings in the TriangularSolverMatrix header by adding conditional macro checks for AVX512 support, preventing undefined macro warnings during non-AVX512 builds.",
"[!1000](https://gitlab.com/libeigen/eigen/-/merge_requests/1000): Optimized GEMV performance for Power10 architecture by improving vector pair load and store operations in the AltiVec matrix-vector product implementation.",
"[!999](https://gitlab.com/libeigen/eigen/-/merge_requests/999): Improves the Householder module by replacing a custom square root function with Eigen's `numext::sqrt`, simplifying implementation and reducing potential header include order issues.",
"[!998](https://gitlab.com/libeigen/eigen/-/merge_requests/998): Improved VSX architecture performance by vectorizing tanh and erf mathematical functions in the AltiVec PacketMath header, optimizing computational efficiency for these specific operations.",
"[!997](https://gitlab.com/libeigen/eigen/-/merge_requests/997): Improved AVX512 TRSM (triangular solver matrix) kernels to conditionally use alloca for workspace allocation when EIGEN_NO_MALLOC is requested, optimizing memory management without sacrificing performance.",
"[!996](https://gitlab.com/libeigen/eigen/-/merge_requests/996): Improved Eigen's Constants.h to align with SYCL-2020 specification for kernel names, ensuring C++ type compliance and forward declarability in kernel definitions.",
"[!995](https://gitlab.com/libeigen/eigen/-/merge_requests/995): Improved documentation for the DiagonalBase class in the Eigen Core module by adding comprehensive documentation and cleaning up code formatting in the DiagonalMatrix.h header file.",
"[!994](https://gitlab.com/libeigen/eigen/-/merge_requests/994): Improved the `index_remap` function in `Reshaped.h` by marking it with `EIGEN_DEVICE_FUNC`, enabling its usage in GPU code and resolving compatibility issues.",
"[!993](https://gitlab.com/libeigen/eigen/-/merge_requests/993): Corrected a documentation typo in the Matrix class tutorial, fixing the description of row and column vector roles to improve clarity for users.",
"[!992](https://gitlab.com/libeigen/eigen/-/merge_requests/992): Improves AVX512 TRSM kernels by modifying memory allocation to respect EIGEN_NO_MALLOC, splitting kernel implementations for better control, and enhancing compatibility with Eigen's memory management constraints.",
"[!985](https://gitlab.com/libeigen/eigen/-/merge_requests/985): Improved SVE architecture's logical shift operations in PacketMath.h, fixing a typo and enhancing vector operation implementations.",
"[!984](https://gitlab.com/libeigen/eigen/-/merge_requests/984): Simplified header files in Eigen's core and unsupported modules by removing the executable file-permission flag, primarily from MKL support and utility headers.",
"[!972](https://gitlab.com/libeigen/eigen/-/merge_requests/972): Improved AVX512 performance optimizations for Eigen's GEMM and matrix operations by adding new kernel implementations and modifying related architecture-specific files for better computational efficiency.",
"[!969](https://gitlab.com/libeigen/eigen/-/merge_requests/969): Improves CMakeLists.txt by adding a safety check to prevent duplicate `uninstall` target definitions, ensuring better compatibility with FetchContent and other build systems.",
"[!968](https://gitlab.com/libeigen/eigen/-/merge_requests/968): Improves DiagonalMatrix by adding constexpr to its cols() and rows() methods, enhancing compile-time usability and consistency with Eigen's existing constexpr support.",
"[!967](https://gitlab.com/libeigen/eigen/-/merge_requests/967): Improved AltiVec GEMM and GEMV implementations by optimizing vector pair loading and simplifying scalar operations, focusing on performance enhancements in matrix-matrix and matrix-vector product computations.",
"[!966](https://gitlab.com/libeigen/eigen/-/merge_requests/966): Simplified Accelerate support for symmetric matrix operations by removing the need to explicitly supply the Symmetric flag for LLT and LDLT solvers, reducing boilerplate code complexity.",
"[!962](https://gitlab.com/libeigen/eigen/-/merge_requests/962): Optimized Householder sequence implementation in BlockHouseholder.h and HouseholderSequence.h to reduce memory allocations and improve performance when applying Householder transformations to vectors with fixed column sizes.",
"[!960](https://gitlab.com/libeigen/eigen/-/merge_requests/960): Improved AVX512 support in Eigen's Core module by removing AVX512VL dependency in trsm implementation, switching to more generic intrinsics that maintain performance with `-march=native`.",
"[!959](https://gitlab.com/libeigen/eigen/-/merge_requests/959): Improved AVX512 implementation by restricting trsm to AVX512VL and renaming related files to align with Eigen's naming conventions, enhancing code organization and consistency in the Core module.",
"[!953](https://gitlab.com/libeigen/eigen/-/merge_requests/953): Improves DiagonalMatrix constructor by resolving ambiguity in initializer list construction, ensuring more consistent and predictable initialization behavior for the DiagonalMatrix class.",
"[!952](https://gitlab.com/libeigen/eigen/-/merge_requests/952): Improved test compatibility by modifying several test files to ensure all tests pass when explicit vectorization is disabled, addressing alignment-related issues in Eigen's test suite.",
"[!951](https://gitlab.com/libeigen/eigen/-/merge_requests/951): Improved the GEMV (General Matrix-Vector) operation in the AltiVec MatrixVectorProduct header by optimizing the predux order of operations, reducing instruction count from 20 to 7 and fixing GCC inline assembly compatibility.",
"[!944](https://gitlab.com/libeigen/eigen/-/merge_requests/944): Improves Eigen's array reshaping functionality by introducing a constexpr helper function in ReshapedHelper.h and ReshapedMethods.h, simplifying the reshaping logic and enabling more efficient compile-time array manipulation.",
"[!943](https://gitlab.com/libeigen/eigen/-/merge_requests/943): Improved Eigen's helper functions by converting template metaprogramming to constexpr functions across multiple core and utility header files, reducing code complexity and enhancing compile-time performance.",
"[!942](https://gitlab.com/libeigen/eigen/-/merge_requests/942): Improved Eigen documentation navbar by modifying JavaScript and CSS to resolve scrollbar and TOC positioning issues in the documentation navigation.",
"[!939](https://gitlab.com/libeigen/eigen/-/merge_requests/939): Improved Eigen's LAPACK module by renaming implementation files from `.cpp` to `.inc` to better separate implementation details from headers and enhance module organization.",
"[!936](https://gitlab.com/libeigen/eigen/-/merge_requests/936): Improved GEMM performance for Power architecture by optimizing vector loads and reducing computational passes in AltiVec matrix operations, resulting in significant speedups for matrix multiplication.",
"[!931](https://gitlab.com/libeigen/eigen/-/merge_requests/931): Improved CI pipeline configuration to enable Aarch64 architecture support by modifying build and test GitLab CI configuration files.",
"[!929](https://gitlab.com/libeigen/eigen/-/merge_requests/929): Improves the AltiVec matrix-vector product interface by splitting the general matrix-vector product macro into separate ColMajor and RowMajor implementations, resolving TensorFlow compilation issues.",
"[!921](https://gitlab.com/libeigen/eigen/-/merge_requests/921): Optimizes visitor traversal in Eigen's Core module by modifying the Visitor.h implementation to efficiently handle RowMajor matrix layouts, reducing unnecessary column-major traversals.",
"[!919](https://gitlab.com/libeigen/eigen/-/merge_requests/919): Fixed a syntax error in the Eigen tutorial documentation by adding a missing parenthesis in the TutorialSlicingIndexing.dox file, improving code readability.",
"[!916](https://gitlab.com/libeigen/eigen/-/merge_requests/916): Improves Altivec MMA flag handling in Eigen's matrix product implementations by enabling explicit control over configuration flags and updating related documentation.",
"[!913](https://gitlab.com/libeigen/eigen/-/merge_requests/913): Improved PowerPC MMA (Matrix-Multiply Assist) support in Eigen by adding a dynamic dispatch build option and modifying matrix product handling to address LTO and build compatibility issues.",
"[!907](https://gitlab.com/libeigen/eigen/-/merge_requests/907): Improved PowerPC MMA (Matrix-Multiply Assist) support in Eigen's AltiVec architecture by adding a dynamic dispatch build option and defaulting MMA usage for Power10 configurations.",
"[!904](https://gitlab.com/libeigen/eigen/-/merge_requests/904): Improved Eigen library's static class members by converting `static const` to `static constexpr` across multiple core and tensor-related header files, enhancing compile-time constant handling and code clarity.",
"[!903](https://gitlab.com/libeigen/eigen/-/merge_requests/903): Improves bit calculation in Eigen's default packet math functions by converting calculations to constexpr and eliminating unnecessary casts, enhancing code clarity and performance.",
"[!899](https://gitlab.com/libeigen/eigen/-/merge_requests/899): Improved Eigen's Map and core functionality by adding constexpr support for initialization and basic operations, enabling more compile-time evaluation capabilities in C++14.",
"[!895](https://gitlab.com/libeigen/eigen/-/merge_requests/895): Improved SparseSolverBase and IterativeSolverBase by adding move constructor support, enabling more flexible and efficient solver manipulation in Eigen's linear solver components.",
"[!892](https://gitlab.com/libeigen/eigen/-/merge_requests/892): Improves constant evaluation handling in Eigen's core utilities by adding a wrapper for `std::is_constant_evaluated` and adjusting alignment check assertions across core header files.",
"[!891](https://gitlab.com/libeigen/eigen/-/merge_requests/891): Improved SVD test suite by splitting and reducing matrix sizes in bdcsvd.cpp and jacobisvd.cpp to optimize memory usage and enhance test stability across different compilers.",
"[!890](https://gitlab.com/libeigen/eigen/-/merge_requests/890): Improved the BooleanRedux.h header by removing a duplicate IsRowMajor declaration, which eliminates potential compiler warnings in the Eigen core module.",
"[!889](https://gitlab.com/libeigen/eigen/-/merge_requests/889): Improved Eigen's memory management by adding standard library wrappers `construct_at` and `destroy_at` across multiple core and support modules, replacing manual placement new and destructor calls with safer, more standardized approaches.",
"[!888](https://gitlab.com/libeigen/eigen/-/merge_requests/888): Improved the Least Squares Conjugate Gradient solver in Eigen by adding `.noalias()` to optimize performance and reduce unnecessary memory copies.",
"[!887](https://gitlab.com/libeigen/eigen/-/merge_requests/887): Improved vectorization logic tests by modifying test/vectorization_logic.cpp to support better platform compatibility and more informative test cases across different SIMD architectures.",
"[!886](https://gitlab.com/libeigen/eigen/-/merge_requests/886): Improves the denormal test in packetmath.cpp by adding conditional logic to skip the test when the packet operation is not present, reducing unnecessary test execution.",
"[!885](https://gitlab.com/libeigen/eigen/-/merge_requests/885): Improved the BooleanRedux module in Eigen's Core library by addressing enum conversion warnings, enhancing code clarity and compiler compatibility.",
"[!879](https://gitlab.com/libeigen/eigen/-/merge_requests/879): Improved reduction operations for row-major layout in Eigen's core module, optimizing performance by fixing inefficiencies in the BooleanRedux.h implementation.",
"[!877](https://gitlab.com/libeigen/eigen/-/merge_requests/877): Reduced build log clutter by disabling deprecated warnings for SVD tests on MSVC in bdcsvd.cpp and jacobisvd.cpp test files.",
"[!873](https://gitlab.com/libeigen/eigen/-/merge_requests/873): Improved SVD test cases by disabling deprecated warnings in bdcsvd.cpp, jacobisvd.cpp, and svd_common.h to reduce build noise and improve test clarity.",
"[!872](https://gitlab.com/libeigen/eigen/-/merge_requests/872): Improved sqrt and rsqrt implementations in Eigen's AVX and AVX512 math functions to better handle denormal numbers, enhancing numerical stability and performance for AVX512 architectures.",
"[!868](https://gitlab.com/libeigen/eigen/-/merge_requests/868): Optimized SQRT/RSQRT implementations for x86 Skylake and Zen2 processors by removing specialized internal functions and improving test coverage for IEEE special values in packet math functions.",
"[!865](https://gitlab.com/libeigen/eigen/-/merge_requests/865): Improved SVD (Singular Value Decomposition) component by adding runtime assert checks in BDCSVD, JacobiSVD, and SVDBase headers to enhance error handling for thin U edge cases.",
"[!864](https://gitlab.com/libeigen/eigen/-/merge_requests/864): Improved Eigen architecture-specific math function headers by removing unnecessary `EIGEN_UNUSED` decorations from multiple header files, reducing potential build warnings and enhancing code clarity.",
"[!862](https://gitlab.com/libeigen/eigen/-/merge_requests/862): Improved SVD (Singular Value Decomposition) implementation in Eigen by restoring fixed-sized U/V matrix sizes for fixed-sized inputs, ensuring consistent matrix sizing behavior.",
"[!861](https://gitlab.com/libeigen/eigen/-/merge_requests/861): Improved the `IntegralConstant.h` header by making `FixedInt` constexpr-compatible and resolving One Definition Rule (ODR) violations related to `fix<N>`.",
"[!857](https://gitlab.com/libeigen/eigen/-/merge_requests/857): Restored the deprecated `svd::compute(Matrix, options)` method in SVD-related header files to maintain backward compatibility with external projects that rely on the older SVD computation approach.",
"[!854](https://gitlab.com/libeigen/eigen/-/merge_requests/854): Enhances the Scaling function in Eigen's Geometry module by adding an overload for rvalue reference vectors, enabling more flexible diagonal matrix creation from temporary vectors.",
"[!850](https://gitlab.com/libeigen/eigen/-/merge_requests/850): Improved documentation for Matrix typedefs in Eigen/src/Core/Matrix.h by adding explicit descriptions to ensure better doxygen documentation clarity.",
"[!849](https://gitlab.com/libeigen/eigen/-/merge_requests/849): Improved Eigen documentation by adding details for `MatrixXNt` and `MatrixNXt` matrix patterns and fixing namespace issues in the linear algebra tutorial example.",
"[!847](https://gitlab.com/libeigen/eigen/-/merge_requests/847): Cleaned up compiler warnings in PowerPC-specific GEMM and GEMV implementations, improving code clarity and maintainability in the AltiVec architecture files.",
"[!846](https://gitlab.com/libeigen/eigen/-/merge_requests/846): Improved the GeneralizedEigenSolver in Eigen by modifying the alphas() and betas() methods to return const references, reducing memory allocations and aligning with documentation.",
"[!845](https://gitlab.com/libeigen/eigen/-/merge_requests/845): Improved Eigen's numeric_limits implementation for half-precision floating point types by moving static data members into a class template to avoid One Definition Rule (ODR) violations.",
"[!844](https://gitlab.com/libeigen/eigen/-/merge_requests/844): Updated COPYING.MPL2 file to use https protocol, ensuring a secure connection for the project's licensing documentation.",
"[!842](https://gitlab.com/libeigen/eigen/-/merge_requests/842): Corrected documentation for the `matrixT()` method in the `CompleteOrthogonalDecomposition` class, fixing a typo to improve clarity and consistency.",
"[!841](https://gitlab.com/libeigen/eigen/-/merge_requests/841): Improved mathematical functions in Eigen's core architecture by consolidating and standardizing fast square root implementations across SSE, AVX, and AVX512, ensuring correct handling of edge cases like zero, infinity, and negative inputs.",
"[!838](https://gitlab.com/libeigen/eigen/-/merge_requests/838): Improved AVX512 support in Eigen's PacketMath module by defining the EIGEN_HAS_AVX512_MATH macro and fixing operation order to ensure compatibility and performance with AVX512 instructions.",
"[!836](https://gitlab.com/libeigen/eigen/-/merge_requests/836): Improved SSE PacketMath implementation by restricting a GCC < 6.3 workaround to only apply to GCC compilers, preventing unnecessary code generation for non-GCC compilers.",
"[!834](https://gitlab.com/libeigen/eigen/-/merge_requests/834): Improved AVX512 optimizations for triangular solve kernels by adding specialized implementations in Eigen's Core module, targeting performance enhancements for smaller problem sizes with AVX512 instructions.",
"[!832](https://gitlab.com/libeigen/eigen/-/merge_requests/832): Improved AVX512 math functions in Eigen's architecture-specific files to resolve consistency issues and enable support for Intel C++ Compiler (ICC).",
"[!828](https://gitlab.com/libeigen/eigen/-/merge_requests/828): Improved PowerPC GEMV performance in AltiVec architecture by modifying the matrix-vector product implementation to prevent cache overflow in the MatrixVectorProduct.h file.",
"[!827](https://gitlab.com/libeigen/eigen/-/merge_requests/827): Improves Newton-Raphson implementation in Eigen's core math functions by adding IEEE-compliant handling of edge cases like 1/0 and 1/inf, with performance optimizations leveraging SSE/FMA and AVX/FMA instructions.",
"[!825](https://gitlab.com/libeigen/eigen/-/merge_requests/825): Improved floating-point comparison handling across multiple Eigen library modules by introducing strict comparison utilities and reducing warnings related to float comparisons and implicit type conversions.",
"[!824](https://gitlab.com/libeigen/eigen/-/merge_requests/824): Improved AVX and FMA packet operations by removing inline assembly and adding new packet extensions (pmsub, pnmadd, pnmsub) to enhance low-level matrix multiplication performance.",
"[!821](https://gitlab.com/libeigen/eigen/-/merge_requests/821): Improves diagonal matrix performance by modifying DiagonalMatrix traits to prevent heap allocation during product operations, reducing memory overhead in linear algebra computations.",
"[!820](https://gitlab.com/libeigen/eigen/-/merge_requests/820): Improves Eigen's packet math operations by adding optimized reciprocal packet operations for SSE, AVX, and AVX512 architectures, enhancing performance and accuracy for float-based inverse calculations.",
"[!819](https://gitlab.com/libeigen/eigen/-/merge_requests/819): Improves warning suppression mechanism in Eigen's DisableStupidWarnings.h by adding logic to check warning support before suppressing, reducing unnecessary warning suppressions.",
"[!818](https://gitlab.com/libeigen/eigen/-/merge_requests/818): Improved MSVC compiler warnings in Eigen's Memory.h utility by silencing specific uninitialized variable and unreachable code warnings without changing functionality.",
"[!816](https://gitlab.com/libeigen/eigen/-/merge_requests/816): Improved Eigen's optimization barrier macro in Core/util/Macros.h to support soft float ARM architecture by removing \"w\" inline assembly constraint and enabling compatibility with ARMv6j+nofp flags.",
"[!814](https://gitlab.com/libeigen/eigen/-/merge_requests/814): Updated comment in Eigen/src/Geometry/Umeyama.h to reference a new constexpr function instead of a removed macro, improving documentation clarity.",
"[!813](https://gitlab.com/libeigen/eigen/-/merge_requests/813): Improved documentation for the Least Squares Conjugate Gradient (LSCG) solver by correcting mathematical descriptions and clarifying the solver's problem formulation in the Eigen library.",
"[!808](https://gitlab.com/libeigen/eigen/-/merge_requests/808): Improved type casting in the LU determinant calculation by adding explicit type conversion for the `pmadd` function to resolve compiler errors with custom scalar types.",
"[!799](https://gitlab.com/libeigen/eigen/-/merge_requests/799): Improves the log computation performance for float in Eigen's packet math functions by replacing a polynomial approximation with a rational approximation and fixing denormalized argument handling, resulting in a 20% speedup for AVX2.",
"[!797](https://gitlab.com/libeigen/eigen/-/merge_requests/797): Improved Eigen's serializer by adding bounds checking to prevent out-of-bounds access in the serialization and deserialization mechanisms, enhancing overall library safety.",
"[!796](https://gitlab.com/libeigen/eigen/-/merge_requests/796): Improves fixed-size Matrix and Array type traits by adding trivial copyability support for C++20 compilers, enabling more efficient memory operations and simplified special member function selection.",
"[!795](https://gitlab.com/libeigen/eigen/-/merge_requests/795): Improved Eigen library's naming conventions across multiple header files by reducing usage of reserved names and avoiding potential naming conflicts with implementation-specific identifiers.",
"[!792](https://gitlab.com/libeigen/eigen/-/merge_requests/792): Enhances CWiseUnaryView by adding support for specifying inner and outer strides, improving flexibility in stride management for Eigen's core functionality.",
"[!790](https://gitlab.com/libeigen/eigen/-/merge_requests/790): Improved vectorization logic test cases by adding missing internal namespace qualifiers in the test/vectorization_logic.cpp file, enhancing test coverage for namespace-related issues.",
"[!788](https://gitlab.com/libeigen/eigen/-/merge_requests/788): Improved documentation and code quality across multiple Eigen source files by fixing documentation formatting, removing unnecessary semicolon warnings, and updating literal type usage.",
"[!786](https://gitlab.com/libeigen/eigen/-/merge_requests/786): Improved the GDB pretty printer code by renaming variables, removing unused imports, and enhancing code formatting in the debug/gdb/printers.py file.",
"[!783](https://gitlab.com/libeigen/eigen/-/merge_requests/783): Simplified the `logical_xor()` function for `bool` types in Eigen's Core utility header by replacing the complex logical expression with a more concise `!=` comparison.",
"[!780](https://gitlab.com/libeigen/eigen/-/merge_requests/780): Improved the logistic sigmoid function in Eigen's UnaryFunctors.h by implementing a hybrid range reduction method, enhancing accuracy and handling of large negative x values in the float32 implementation.",
"[!779](https://gitlab.com/libeigen/eigen/-/merge_requests/779): Improved the exp<float>() implementation in Eigen's default packet math functions, reducing polynomial approximant degree and achieving a 4% AVX2 speedup while maintaining accuracy for denormalized values.",
"[!776](https://gitlab.com/libeigen/eigen/-/merge_requests/776): Improved CMake configuration by converting `EIGEN_TEST_CUSTOM_CXX_FLAGS` to a proper CMake list using semicolon separation and `separate_arguments` with MODE option in build system files.",
"[!774](https://gitlab.com/libeigen/eigen/-/merge_requests/774): Improved CMake configuration in EigenTesting.cmake to enhance compatibility with the latest CMake version, specifically enabling HIP unit tests.",
"[!773](https://gitlab.com/libeigen/eigen/-/merge_requests/773): Improved sparse-dense matrix product performance in RowMajor matrices by modifying SparseDenseProduct.h to use two accumulation variables, enabling better instruction-level parallelism.",
"[!764](https://gitlab.com/libeigen/eigen/-/merge_requests/764): Improves PowerPC GEMV performance by adding VSX and MMA acceleration support in matrix-vector operations, achieving up to 4X speedup for MMA and 2.5X for VSX.",
"[!763](https://gitlab.com/libeigen/eigen/-/merge_requests/763): Improved CMake configuration by removing deprecated `COMPILE_FLAGS` macro and replacing it with modern `target_compile_options` and `target_compile_definitions` in Eigen's build system files.",
"[!762](https://gitlab.com/libeigen/eigen/-/merge_requests/762): Updated documentation snippets for Eigen slicing operations, improving code examples and clarifying usage across different data types and contexts.",
"[!760](https://gitlab.com/libeigen/eigen/-/merge_requests/760): Improved Eigen documentation examples by removing `using namespace Eigen` from multiple example files, promoting better C++ coding practices and reducing potential namespace pollution.",
"[!758](https://gitlab.com/libeigen/eigen/-/merge_requests/758): Improved CMake testing infrastructure by adding support for HIP GPU unit tests and enabling C++14 compliance in the Eigen library's test suite.",
"[!756](https://gitlab.com/libeigen/eigen/-/merge_requests/756): Improved Eigen's Core module by conditionally including <atomic> header, enabling compilation in toolchains without atomic support and reducing unnecessary header inclusion.",
"[!753](https://gitlab.com/libeigen/eigen/-/merge_requests/753): Improves Eigen's type safety by converting computational macros to constexpr functions across multiple core library files, introducing stricter type checking and reducing potential runtime errors.",
"[!748](https://gitlab.com/libeigen/eigen/-/merge_requests/748): Improved Lapacke bindings for HouseholderQR and PartialPivLU by replacing binding macros with C++ code and factoring common binding logic into a new helper file, reducing memory usage and enhancing code maintainability.",
"[!742](https://gitlab.com/libeigen/eigen/-/merge_requests/742): Improved CMake configuration by updating minimum version requirements and removing outdated testing options, enhancing compatibility with modern Linux distributions.",
"[!737](https://gitlab.com/libeigen/eigen/-/merge_requests/737): Improves the Lapacke LLT macro in Eigen's Cholesky module by splitting a large macro into smaller, more readable parts to enhance code maintainability.",
"[!736](https://gitlab.com/libeigen/eigen/-/merge_requests/736): Improved const-qualification for SelfAdjoint and Triangular views in Eigen by removing unnecessary non-const transpose overloads when views do not refer to lvalues, enhancing type safety and error messaging.",
"[!734](https://gitlab.com/libeigen/eigen/-/merge_requests/734): Improves AVX2 vectorization support in XprHelper.h by allowing vector operations even when data size is not a multiple of 8, with corresponding test updates to verify the new behavior.",
"[!727](https://gitlab.com/libeigen/eigen/-/merge_requests/727): Improved Eigen's numeric limits handling by making numeric_limits members constexpr in BFloat16 and Half type implementations, enhancing compliance with modern C++ standards.",
"[!722](https://gitlab.com/libeigen/eigen/-/merge_requests/722): Improved the Umeyama.h header by optimizing computational logic and clarifying the usage of `src_var` when scaling is disabled, reducing unnecessary calculations in the Eigen geometry module.",
"[!718](https://gitlab.com/libeigen/eigen/-/merge_requests/718): Improves SparseMatrix handling by ensuring consistent StorageIndex usage across SparseMatrix and its derived classes, specifically updating Map and TransposedSparseMatrix to maintain uniform index type.",
"[!717](https://gitlab.com/libeigen/eigen/-/merge_requests/717): Improved sparse vector implementation by moving pruning code from CompressedStorage.h to SparseVector.h, enhancing code modularity and preparing for future sparse matrix storage features.",
"[!716](https://gitlab.com/libeigen/eigen/-/merge_requests/716): Improved Eigen's warning pragmas by converting diag pragmas to nv_diag in utility header files, enhancing compatibility with NVIDIA GPU code and maintaining consistent diagnostic handling.",
"[!712](https://gitlab.com/libeigen/eigen/-/merge_requests/712): Improved Quaternion constructor documentation in Eigen/src/Geometry/Quaternion.h to clarify the order of matrix elements, addressing potential user confusion about input format.",
"[!702](https://gitlab.com/libeigen/eigen/-/merge_requests/702): Improves AVX vector path for float2half and half2float conversions in Eigen's linear algebra library, optimizing matrix multiplication performance by introducing vectorized conversion methods.",
"[!701](https://gitlab.com/libeigen/eigen/-/merge_requests/701): Improved ZVector alignment specifications by moving `alignas` qualifier to the first position in Complex.h and PacketMath.h, reducing compiler warnings related to vector type alignments.",
"[!700](https://gitlab.com/libeigen/eigen/-/merge_requests/700): Improved Neon architecture support by vectorizing fp16 tanh and logistic functions, adding optimized implementations in the Core module's NEON headers.",
"[!698](https://gitlab.com/libeigen/eigen/-/merge_requests/698): Improved the CommaInitializer in Eigen's Core module to ensure proper handling of fixed-dimension blocks during initialization, preventing potential sizing compatibility issues.",
"[!697](https://gitlab.com/libeigen/eigen/-/merge_requests/697): Improved CMake configuration for Eigen by optimizing build scripts to better support subprojects and reduce unnecessary test building.",
"[!693](https://gitlab.com/libeigen/eigen/-/merge_requests/693): Improved documentation for the Stride class by adding a note clarifying inner stride behavior for compile-time vectors, addressing potential user confusion about vector stride implementation.",
"[!692](https://gitlab.com/libeigen/eigen/-/merge_requests/692): Improved Qt support in Eigen's Transform.h by extending compatibility to Qt6 while maintaining Qt5 functionality, resolving compatibility issues for users building with different Qt versions.",
"[!687](https://gitlab.com/libeigen/eigen/-/merge_requests/687): Improved Eigen array and matrix plugins by adding nan-propagation options for elementwise min/max operations, enhancing numerical value handling across the library.",
"[!686](https://gitlab.com/libeigen/eigen/-/merge_requests/686): Improved bit_cast implementation in NumTraits.h for CUDA by reverting to memcpy, avoiding potential undefined behavior with reinterpret_cast and enhancing code safety for CUDA environments.",
"[!680](https://gitlab.com/libeigen/eigen/-/merge_requests/680): Improved PowerPC matrix packing performance by inverting rows and depth in the non-vectorized portion, resolving data retrieval issues and achieving up to 10% speed gains in specific test cases.",
"[!678](https://gitlab.com/libeigen/eigen/-/merge_requests/678): Reorganized Eigen's CUDA/GPU architecture by moving Complex.h to the GPU directory and removing the deprecated TensorReductionCuda.h file to improve code maintainability.",
"[!677](https://gitlab.com/libeigen/eigen/-/merge_requests/677): Improved GPU type punning in NumTraits.h by replacing memcpy with reinterpret_cast for more efficient CUDA-based bit_cast operations.",
"[!673](https://gitlab.com/libeigen/eigen/-/merge_requests/673): Improved Visitor.h by adding vectorized codepaths for matrix coefficient operations, delivering up to 5x performance gains on AVX2-enabled machines through optimized matrix decomposition functions.",
"[!668](https://gitlab.com/libeigen/eigen/-/merge_requests/668): Simplified Windows CMake configuration by removing deprecated OS version detection scripts and updating compiler version detection in EigenTesting.cmake for improved cross-platform compatibility.",
"[!664](https://gitlab.com/libeigen/eigen/-/merge_requests/664): Improved CUDA complex operations compatibility by disabling complex compound assignment operators for MSVC in the Eigen/src/Core/arch/CUDA/Complex.h header, preventing potential compilation issues.",
"[!663](https://gitlab.com/libeigen/eigen/-/merge_requests/663): Improved the DisableStupidWarnings.h header by adding more CUDA warning suppressions for versions 9.2 and 11.4, reducing warning noise in the Eigen core utilities.",
"[!662](https://gitlab.com/libeigen/eigen/-/merge_requests/662): Improved test infrastructure by reorganizing the main test file and extracting random matrix generators into a separate helper header, enhancing code modularity and maintainability.",
"[!661](https://gitlab.com/libeigen/eigen/-/merge_requests/661): Improved documentation and code comments across multiple Eigen library files by correcting spelling errors and typos, focusing on enhancing readability without changing functional behavior.",
"[!657](https://gitlab.com/libeigen/eigen/-/merge_requests/657): Improved tuple test suite by addressing implicit conversion warnings in tuple_test.cpp, reducing compiler warnings without changing functionality.",
"[!655](https://gitlab.com/libeigen/eigen/-/merge_requests/655): Improved CI infrastructure by enabling parallel test execution across all available CPU cores in GitLab CI configuration files, enhancing test performance and resource utilization.",
"[!654](https://gitlab.com/libeigen/eigen/-/merge_requests/654): Silenced a string overflow warning in the initializer list construction test for GCC, improving compiler compatibility in the Eigen test suite.",
"[!653](https://gitlab.com/libeigen/eigen/-/merge_requests/653): Improved GPU testing in Eigen's test/gpu_example.cu by disabling specific subtests that fail on HIP due to missing device-side malloc/free functionality.",
"[!651](https://gitlab.com/libeigen/eigen/-/merge_requests/651): Improved AVX512 build configuration by removing the unnecessary `-fabi-version=6` flag from CMakeLists.txt, reducing potential compilation issues.",
"[!647](https://gitlab.com/libeigen/eigen/-/merge_requests/647): Improved Eigen's static assertion mechanism by transitioning to standard C++11 static_assert, removing runtime checks and breaking large static assertions into more readable, individual checks across multiple core library files.",
"[!646](https://gitlab.com/libeigen/eigen/-/merge_requests/646): Improved GPU testing infrastructure in Eigen by adding new CMake targets `buildtests_gpu` and `check_gpu`, which simplify the process of building and running GPU-specific tests in the continuous integration workflow.",
"[!645](https://gitlab.com/libeigen/eigen/-/merge_requests/645): Improves the `eigen_packet_wrapper` in GenericPacketMath.h by adding a default constructor to enable easier memory copying operations.",
"[!638](https://gitlab.com/libeigen/eigen/-/merge_requests/638): Improved AVX packet types in PacketMath.h by adding missing integer packet types for the pset1 call, enhancing packet type support in the Eigen library.",
"[!635](https://gitlab.com/libeigen/eigen/-/merge_requests/635): Improved the tridiagonalization function in Eigen's Eigenvalues module by introducing a flexible template parameter `CoeffVectorType` to resolve build errors related to mismatched coefficient vector types.",
"[!634](https://gitlab.com/libeigen/eigen/-/merge_requests/634): Improves CMake configuration by defaulting package registry population for CMake 3.15+ in the project's CMakeLists.txt, reducing configuration complexity for users upgrading from older CMake versions.",
"[!633](https://gitlab.com/libeigen/eigen/-/merge_requests/633): Improves CMake versioning in Eigen by adding support for the `ARCH_INDEPENDENT` option, simplifying package configuration and potentially removing legacy versioning code.",
"[!618](https://gitlab.com/libeigen/eigen/-/merge_requests/618): Fixed CUDA 9 compatibility in Eigen's core headers by adding `EIGEN_DEVICE_FUNC` macro annotations to `Macros.h` and `Block.h`, resolving compilation issues for the `gpu_basic` test.",
"[!614](https://gitlab.com/libeigen/eigen/-/merge_requests/614): Improved LAPACK test compilation in CMakeLists.txt by addressing Fortran argument type mismatches, enabling compatibility with GNU Fortran 10 and reducing legacy code compilation errors.",
"[!610](https://gitlab.com/libeigen/eigen/-/merge_requests/610): Improved CMake configuration across Eigen documentation and test directories by updating C++ standard settings to centralize and simplify C++11/14 standard management.",
"[!609](https://gitlab.com/libeigen/eigen/-/merge_requests/609): Optimized predux reduction operations for aarch64 architecture in Eigen's NEON intrinsics implementation, improving performance by using more efficient vector instructions.",
"[!603](https://gitlab.com/libeigen/eigen/-/merge_requests/603): Improved documentation for the `squaredNorm()` function in Eigen's Core module, clarifying its behavior for calculating the squared Frobenius norm of matrices to prevent potential user confusion.",
"[!600](https://gitlab.com/libeigen/eigen/-/merge_requests/600): Improved PPC packet comparisons in Eigen's AltiVec architecture support by adding missing packet comparison operations to the PacketMath.h header file.",
"[!598](https://gitlab.com/libeigen/eigen/-/merge_requests/598): Fixed documentation in Map.h with a minor documentation correction, improving code clarity in the Eigen Core module.",
"[!597](https://gitlab.com/libeigen/eigen/-/merge_requests/597): Improved documentation for matrix decompositions and least squares solvers by updating key documentation files to enhance clarity and provide more comprehensive information for users.",
"[!596](https://gitlab.com/libeigen/eigen/-/merge_requests/596): Improved AltiVec/PacketMath.h for Power8 architecture by adding reverse compare logic in F32ToBf16 to support clang10 compilation on older hardware platforms.",
"[!595](https://gitlab.com/libeigen/eigen/-/merge_requests/595): Improved AltiVec matrix product header files by initializing pointers to NULL to eliminate uninitialized variable warnings in GCC 11+ compilers.",
"[!588](https://gitlab.com/libeigen/eigen/-/merge_requests/588): Improved test infrastructure by conditionally setting the `AnnoyingScalar::dont_throw` flag in `conservative_resize.cpp` and `sparse_block.cpp` to prevent potential undefined behavior when the `EIGEN_TEST_ANNOYING_SCALAR_DONT_THROW` macro is not defined.",
"[!584](https://gitlab.com/libeigen/eigen/-/merge_requests/584): Optimized the tridiagonalization process in Eigen's SelfAdjointEigenSolver by reducing memory allocations in the in-place selector implementation, improving memory efficiency for eigenvalue computations.",
"[!582](https://gitlab.com/libeigen/eigen/-/merge_requests/582): Improved the 3x3 matrix inverse computation in Eigen's LU module, reducing execution time by 88.67% and resolving a GCC uninitialized memory warning.",
"[!581](https://gitlab.com/libeigen/eigen/-/merge_requests/581): Improved documentation for `middleCol` and `middleRow` block operations in the Eigen tutorial, enhancing user understanding of these matrix manipulation methods.",
"[!580](https://gitlab.com/libeigen/eigen/-/merge_requests/580): Improved BF16 performance in AVX and AVX512 architectures by removing denormal flushing in FP32ToBF16 conversion routines, reducing overhead in BF16 operations.",
"[!575](https://gitlab.com/libeigen/eigen/-/merge_requests/575): Improved template identifier naming conventions across multiple Eigen library header files to avoid leading underscore followed by capital letters, enhancing code consistency and readability.",
"[!573](https://gitlab.com/libeigen/eigen/-/merge_requests/573): Corrected a typo in the Constants.h documentation, improving the grammar and clarity of the Eigen library's documentation.",
"[!568](https://gitlab.com/libeigen/eigen/-/merge_requests/568): Improved vectorization support for comparison functors in Eigen's NEON architecture by modifying PacketMath.h and BinaryFunctors.h, enabling more efficient SIMD-based comparison operations.",
"[!567](https://gitlab.com/libeigen/eigen/-/merge_requests/567): Improved GPU support by enabling equality comparisons across different device types in Eigen's core comparison functors and matrix operations.",
"[!566](https://gitlab.com/libeigen/eigen/-/merge_requests/566): Improved Eigen documentation by applying monospace formatting to code snippets across multiple documentation files, enhancing readability of code examples.",
"[!563](https://gitlab.com/libeigen/eigen/-/merge_requests/563): Improved CMake configuration files by fixing package detection warnings, renaming files to correct case mismatches, and resolving package name inconsistencies in Eigen's build system.",
"[!557](https://gitlab.com/libeigen/eigen/-/merge_requests/557): Improved HIP GPU backend support by fixing build issues in BlasUtil.h, enabling submatrix extraction and related operations for HIP GPU compilation.",
"[!556](https://gitlab.com/libeigen/eigen/-/merge_requests/556): Improved dense matrix filling performance in Eigen's AssignEvaluator by deferring to std::fill_n for constant value assignments, reducing execution time for large matrix operations.",
"[!545](https://gitlab.com/libeigen/eigen/-/merge_requests/545): Improves the Eigen library's MatrixProduct implementation for PPC architecture by adding the ability to disable specialized `gemm_pack_rhs` operations, specifically to optimize TensorFlow performance.",
"[!542](https://gitlab.com/libeigen/eigen/-/merge_requests/542): Improved documentation for the main header file by adding Doxygen-style comments to enhance function descriptions in `main.h`.",
"[!541](https://gitlab.com/libeigen/eigen/-/merge_requests/541): Improves DenseStorage by adding trivially_copyable trait, enabling safer and more efficient memory copying for Eigen's dense storage types.",
"[!537](https://gitlab.com/libeigen/eigen/-/merge_requests/537): Improved Eigen's complex number handling by reducing code duplication across multiple architecture-specific complex conjugate helper implementations, simplifying the codebase and potentially enhancing performance.",
"[!535](https://gitlab.com/libeigen/eigen/-/merge_requests/535): Improves CMake build configuration by conditionally building shared libraries and static targets for BLAS and LAPACK, preventing build errors on platforms without shared library support.",
"[!533](https://gitlab.com/libeigen/eigen/-/merge_requests/533): Improved solver reference management in Eigen's Core module by using internal::ref_selector to avoid holding references to RHS expressions, enhancing code efficiency and solver interface consistency.",
"[!532](https://gitlab.com/libeigen/eigen/-/merge_requests/532): Corrects NEON declarations in PacketMath.h for aarch64-pc-windows-msvc configuration by preventing compatibility issues between clang and MSVC compiler declarations.",
"[!529](https://gitlab.com/libeigen/eigen/-/merge_requests/529): Improves data loading operations in Eigen's core architecture by replacing `pset` with `ploadu` for safer unaligned data handling across multiple header files, reducing potential segmentation fault risks.",
"[!527](https://gitlab.com/libeigen/eigen/-/merge_requests/527): Improved AltiVec matrix product performance by changing inline directives from EIGEN_STRONG_INLINE to EIGEN_ALWAYS_INLINE in critical matrix multiplication functions to optimize TensorFlow integration.",
"[!525](https://gitlab.com/libeigen/eigen/-/merge_requests/525): Improved PPC architecture support in Eigen's packetmath by adding a missing `pcmp_lt_or_nan` test and definition for Packet8bf in the AltiVec PacketMath header.",
"[!524](https://gitlab.com/libeigen/eigen/-/merge_requests/524): Improved Eigen library's core and sparse modules by removing deprecated enum arithmetic to ensure compatibility with C++20 standards across multiple source files.",
"[!522](https://gitlab.com/libeigen/eigen/-/merge_requests/522): Improved MinGW compiler version detection in Eigen's Meta.h and CMake testing infrastructure by adding support for extracting version numbers and handling type compatibility for MinGW builds.",
"[!519](https://gitlab.com/libeigen/eigen/-/merge_requests/519): Improved floating-point handling across multiple SIMD architectures by using `bit_cast` to consistently create `-0.0` values in Eigen's PacketMath and related implementation files.",
"[!518](https://gitlab.com/libeigen/eigen/-/merge_requests/518): Improved Eigen library's core components to address C++20 warnings related to enum arithmetic expressions, ensuring better compiler compatibility without breaking existing functionality.",
"[!511](https://gitlab.com/libeigen/eigen/-/merge_requests/511): Improved NEON packet transpose implementation by unifying code and using `vzip` for more efficient vector operations across different packet types.",
"[!510](https://gitlab.com/libeigen/eigen/-/merge_requests/510): Improved Eigen evaluators in CoreEvaluators.h to support non-class types like raw function pointers, maintaining existing logic and assembly performance.",
"[!509](https://gitlab.com/libeigen/eigen/-/merge_requests/509): Improved the CwiseBinaryOp's default copy constructor by removing the EIGEN_DEVICE_FUNC annotation, eliminating CUDA compiler warnings for explicitly defaulted functions.",
"[!501](https://gitlab.com/libeigen/eigen/-/merge_requests/501): Improved Eigen's complex number math functions by adding a device implementation of the log function for std::complex types, enhancing performance for device-based computations.",
"[!486](https://gitlab.com/libeigen/eigen/-/merge_requests/486): Improved CUDA complex division implementation in Eigen's Complex.h by implementing Smith's algorithm, enhancing numerical stability and resolving edge cases with subnormal numbers.",
"[!478](https://gitlab.com/libeigen/eigen/-/merge_requests/478): Improved DenseStorage copy and swap operations for dynamic matrices with fixed-sized storage, ensuring safer and more efficient handling of initialized elements by modifying core memory management functions."
],
"other_fixed": [
"[!1935](https://gitlab.com/libeigen/eigen/-/merge_requests/1935): Fixed self-adjoint matrix-vector product handling in Eigen's core product evaluators. Corrected compile-time vector processing in SelfadjointMatrixVector.h to resolve issues with selfadjoint_eigensolver tests.",
"[!1931](https://gitlab.com/libeigen/eigen/-/merge_requests/1931): Fixed a bug in the 1x1 selfadjoint matrix-vector product within the Eigen Core module. The modification addresses a specific issue related to matrix-vector multiplication for 1x1 matrices.",
"[!1921](https://gitlab.com/libeigen/eigen/-/merge_requests/1921): Fixed VSX packetmath type casting and vector operations in AltiVec architecture, resolving type mismatches and improving compatibility with clang and QEMU environments.",
"[!1920](https://gitlab.com/libeigen/eigen/-/merge_requests/1920): Fixed Bazel-related test failures across multiple Eigen components, addressing GPU support, exception handling, and test numbering issues to improve overall library compatibility and testing robustness.",
"[!1912](https://gitlab.com/libeigen/eigen/-/merge_requests/1912): Fixed a potential memory management vulnerability in the Eigen Memory utility macro by protecting the SIZE argument in `ei_declare_aligned_stack_constructed_variable` to prevent buffer overflow risks.",
"[!1911](https://gitlab.com/libeigen/eigen/-/merge_requests/1911): Fixed MSVC compiler warnings in FindCoeff.h by addressing type truncation issues, improving compatibility for MSVC users of the Eigen library.",
"[!1906](https://gitlab.com/libeigen/eigen/-/merge_requests/1906): Fixed a compilation bug in the NEON implementation of PacketMath.h, resolving an architecture-specific build issue in Eigen's core architecture support.",
"[!1904](https://gitlab.com/libeigen/eigen/-/merge_requests/1904): Fixed NEON packet math operations in Eigen by adding native implementations of `pnmadd` and correcting the `pnmsub` intrinsic function for float, double, and half vector types.",
"[!1903](https://gitlab.com/libeigen/eigen/-/merge_requests/1903): Fixed a compile warning in the test/packetmath.cpp file related to multiplication with boolean values, addressing a potential compiler warning without changing actual code functionality.",
"[!1891](https://gitlab.com/libeigen/eigen/-/merge_requests/1891): Fixes vectorwise operation scalar argument handling in Eigen's core module by adding support for right-hand-side scalar arguments in multiplication and division operators.",
"[!1890](https://gitlab.com/libeigen/eigen/-/merge_requests/1890): Fixed LAPACKe bindings for BDCSVD and JacobiSVD SVD components by modifying their header files to align with the updated API, resolving compatibility and potential compilation issues.",
"[!1889](https://gitlab.com/libeigen/eigen/-/merge_requests/1889): Fixed potential MSAN errors in Eigen's vectorized casting evaluator by zeroing out unused packets in CoreEvaluators.h, preventing undefined behavior during intermediate operations.",
"[!1883](https://gitlab.com/libeigen/eigen/-/merge_requests/1883): Fixed undefined behavior in the `ploaduSegment` function within GenericPacketMath.h by adding safeguards to prevent out-of-bounds memory access, improving the safety of the Eigen core library.",
"[!1882](https://gitlab.com/libeigen/eigen/-/merge_requests/1882): Fixed the `noexcept` specifier in the CommaInitializer header to restore test functionality, ensuring proper behavior of Eigen's comma initialization mechanism.",
"[!1876](https://gitlab.com/libeigen/eigen/-/merge_requests/1876): Fixed constexpr usage in CoreEvaluators.h to improve compiler compatibility and correctness of the implementation.",
"[!1874](https://gitlab.com/libeigen/eigen/-/merge_requests/1874): Fixed ArrayWrapper and MatrixWrapper partial redux expressions by addressing compatibility issues with `.array()` method in the Eigen Core module.",
"[!1872](https://gitlab.com/libeigen/eigen/-/merge_requests/1872): Fixed a potential deadlock in Eigen's thread pool by improving task notification and stealing mechanisms. Ensures more robust thread pool behavior to prevent synchronization issues.",
"[!1870](https://gitlab.com/libeigen/eigen/-/merge_requests/1870): Fixed type errors in the ForkJoin.h thread environment handling, improving type safety and compatibility for custom thread environments in Eigen's parallel processing infrastructure.",
"[!1869](https://gitlab.com/libeigen/eigen/-/merge_requests/1869): Fixed a compiler warning in the Eigen Parallelizer implementation when using GCC 11.4.0 with OpenMP, addressing a type conversion issue in the `Parallelizer.h` file.",
"[!1862](https://gitlab.com/libeigen/eigen/-/merge_requests/1862): Fixed packet math operations by replacing NaN with Scalar(1) in several core Eigen math function files to improve compatibility with fast-math enabled modes.",
"[!1856](https://gitlab.com/libeigen/eigen/-/merge_requests/1856): Fixed mathematical functions and packet math implementations in Eigen's core modules, resolving a reported issue in the library's computational routines.",
"[!1854](https://gitlab.com/libeigen/eigen/-/merge_requests/1854): Fixes the `allFinite` method in Eigen's `DenseBase` to correctly handle integer arrays and platform-specific `std::isfinite` behavior, ensuring accurate finite value checking across different integer types.",
"[!1847](https://gitlab.com/libeigen/eigen/-/merge_requests/1847): Removed an extra semicolon from the DeviceWrapper.h header file to resolve a compiler warning when using specific compiler flags.",
"[!1842](https://gitlab.com/libeigen/eigen/-/merge_requests/1842): Fixes CMake configuration warning in test/CMakeLists.txt related to Boost library, ensuring proper build system compliance.",
"[!1835](https://gitlab.com/libeigen/eigen/-/merge_requests/1835): Fixes a bitwise operation error in the Eigen Geometry module when compiling with C++26, specifically modifying the OrthoMethods.h header to resolve compilation compatibility.",
"[!1834](https://gitlab.com/libeigen/eigen/-/merge_requests/1834): Fixed matrix initialization in the bicgstab test to ensure proper element initialization and improve test reliability.",
"[!1833](https://gitlab.com/libeigen/eigen/-/merge_requests/1833): Fixed an array bounds issue in the Eigen inner product implementation, improving safety and correctness of the core product operation.",
"[!1831](https://gitlab.com/libeigen/eigen/-/merge_requests/1831): Fixed build configuration in AltiVec/PacketMath.h to resolve compilation errors for systems without VSX and POWER8 support, improving cross-platform compatibility.",
"[!1825](https://gitlab.com/libeigen/eigen/-/merge_requests/1825): Fixes type-punning undefined behavior in Eigen::half implementation by modifying the Half.h header to use a proper bit-cast approach, ensuring safer memory operations.",
"[!1816](https://gitlab.com/libeigen/eigen/-/merge_requests/1816): Fixed Android compatibility issue in Eigen's core configuration headers by removing the `__cpp_lib_hardware_interference_size` macro for NDK versions r25 and lower, resolving a macro definition problem that prevented correct library functionality.",
"[!1810](https://gitlab.com/libeigen/eigen/-/merge_requests/1810): Fixed midpoint calculation in Eigen::ForkJoinScheduler to prevent index out-of-bounds errors when granularity is greater than one, ensuring proper range selection during parallel forking operations.",
"[!1808](https://gitlab.com/libeigen/eigen/-/merge_requests/1808): Fixed typos in the `ForkJoin.h` file within the Eigen ThreadPool module, addressing minor textual errors without impacting functionality.",
"[!1806](https://gitlab.com/libeigen/eigen/-/merge_requests/1806): Fixed UTF-8 encoding errors in the SimplicialCholesky implementation file, resolving invalid character issues that were causing build problems with MSVC and Apple Clang compilers.",
"[!1803](https://gitlab.com/libeigen/eigen/-/merge_requests/1803): Fixed threadpool compatibility by replacing C++17-specific initializers with C++14-compliant code in the NonBlockingThreadPool header, improving compiler support for older versions of g++ and MSVC.",
"[!1799](https://gitlab.com/libeigen/eigen/-/merge_requests/1799): Fixed a typo in the NonBlockingThreadPool's task stealing logic within the thread pool implementation, correcting the spin loop to properly retrieve tasks from other threads' work queues.",
"[!1792](https://gitlab.com/libeigen/eigen/-/merge_requests/1792): Fixed reference handling for std::fill_n in Eigen's Core and SparseCore modules, addressing compatibility issues with device code and standard namespace usage.",
"[!1790](https://gitlab.com/libeigen/eigen/-/merge_requests/1790): Fixed an uninitialized memory read issue in the SparseQR module by removing unnecessary access to the `m_threshold` variable during factorization, improving code safety and maintainability.",
"[!1785](https://gitlab.com/libeigen/eigen/-/merge_requests/1785): Fixed a build configuration issue in Eigen's Core utility header by adding a missing `#include <new>` directive, resolving compatibility problems with recent LLVM commits.",
"[!1764](https://gitlab.com/libeigen/eigen/-/merge_requests/1764): Fixed the checkformat CI stage in GitLab configuration by addressing a Docker Hub Ubuntu image version compatibility issue.",
"[!1762](https://gitlab.com/libeigen/eigen/-/merge_requests/1762): Fixes an alignment issue in the IOFormat computation within the Eigen/src/Core/IO.h file by modifying how row spacing is calculated for matrix outputs.",
"[!1760](https://gitlab.com/libeigen/eigen/-/merge_requests/1760): Fixed undefined behavior in the `setZero` function by adding a null pointer check in the `memset` specialization within the Core module, preventing potential runtime errors for zero-sized blocks.",
"[!1751](https://gitlab.com/libeigen/eigen/-/merge_requests/1751): Reverted a problematic commit in the Eigen/src/Core/EigenBase.h file to resolve debug mode build failures, restoring previous functionality.",
"[!1726](https://gitlab.com/libeigen/eigen/-/merge_requests/1726): Fixed GPU build compatibility in IndexedViewHelper.h by adding initializers for constexpr globals, ensuring proper support for CUDA contexts.",
"[!1725](https://gitlab.com/libeigen/eigen/-/merge_requests/1725): Fixes SIMD geometry code in Eigen to resolve clang6 compilation issues on ARM architectures by modifying the Geometry_SIMD.h header to handle last scalar component zeroing.",
"[!1723](https://gitlab.com/libeigen/eigen/-/merge_requests/1723): Fixed compiler optimization bugs in Clang 6 for SSE and Geometry SIMD implementations, addressing vector rearrangement and floating-point mask operation issues in specific functions.",
"[!1721](https://gitlab.com/libeigen/eigen/-/merge_requests/1721): Fixed Memory.h utility in Eigen's core module to resolve compilation compatibility with nvc++ by replacing `__builtin_alloca_with_align` with a fallback implementation.",
"[!1718](https://gitlab.com/libeigen/eigen/-/merge_requests/1718): Fixed out-of-bounds access vulnerability in Eigen's triangular matrix multiplication implementation within the Core module. Ensures safe memory access during matrix multiplication operations to prevent potential runtime errors.",
"[!1716](https://gitlab.com/libeigen/eigen/-/merge_requests/1716): Fixed stack allocation static assert in DenseStorage by moving the assert back into the constructor, specifically when `EIGEN_NO_DEBUG` is defined, to improve construction behavior for `VectorBlock`.",
"[!1711](https://gitlab.com/libeigen/eigen/-/merge_requests/1711): Fixed a bug in DenseBase::tail method for dynamic template arguments, improving compatibility and resolving compilation issues with runtime size arguments.",
"[!1708](https://gitlab.com/libeigen/eigen/-/merge_requests/1708): Fixed the `atan` test in `array_cwise.cpp` for 32-bit ARM architectures by adjusting polynomial expansion inputs to prevent flushed zero results.",
"[!1703](https://gitlab.com/libeigen/eigen/-/merge_requests/1703): Fixes the inverse evaluator in Eigen's Core module to support CUDA device execution by marking the function as a host+device function, resolving compatibility issues with GPU computations.",
"[!1693](https://gitlab.com/libeigen/eigen/-/merge_requests/1693): Fixed the SSE2 implementation of `pceil` to correctly handle rounding of negative numbers, ensuring consistency with standard C++ `std::ceil` behavior.",
"[!1690](https://gitlab.com/libeigen/eigen/-/merge_requests/1690): Fixed a bug in the `atanh` function implementation within the Eigen library's default packet math functions header, addressing potential incorrect behavior in certain cases.",
"[!1689](https://gitlab.com/libeigen/eigen/-/merge_requests/1689): Fixed SVE intrinsics in Eigen's PacketMath.h by correcting the use of `svnot_b_x` to `svnot_b_z` and adding support for float square root operations with `svsqrt_f32_x`.",
"[!1688](https://gitlab.com/libeigen/eigen/-/merge_requests/1688): Fixed a bug in the Eigen library's atanh function to correctly handle the edge case when the input is -1, improving numerical computation robustness.",
"[!1685](https://gitlab.com/libeigen/eigen/-/merge_requests/1685): Fixed a bug in the SSE complex permutation function `_mm_permute_pd` to correctly handle out-of-range arguments, preventing potential runtime errors in vectorized computations.",
"[!1653](https://gitlab.com/libeigen/eigen/-/merge_requests/1653): Fixed numerous typos across multiple Eigen library source files, including architecture-specific headers (AVX, NEON, SSE) and core utility files, without impacting core functionality.",
"[!1651](https://gitlab.com/libeigen/eigen/-/merge_requests/1651): Fixed AVX512 floating-point conversion handling by adding `as_float16` conversion function for `Eigen::half` and addressing compilation issues with `_Float16` in AVX512FP16 intrinsics.",
"[!1648](https://gitlab.com/libeigen/eigen/-/merge_requests/1648): Fixed an overflow warning in the AVX512 PacketMathFP16 implementation by adding an explicit cast to `short` for the `_mm512_mask_set1_epi16` intrinsic function, resolving potential compilation issues with specific compiler flags.",
"[!1639](https://gitlab.com/libeigen/eigen/-/merge_requests/1639): Fixes AVX512FP16 vectorization compatibility in Eigen's Core module by adding vectorized cast specializations for `packet16h` and `packet16f` types when AVX512FP16 is enabled.",
"[!1637](https://gitlab.com/libeigen/eigen/-/merge_requests/1637): Fixes scalar comparison behavior in GenericPacketMath.h for MSVC, addressing NaN value propagation to ensure consistent mathematical operations across different compiler settings.",
"[!1635](https://gitlab.com/libeigen/eigen/-/merge_requests/1635): Fixed a compiler warning in Eigen's ProductEvaluators header by addressing deprecated enumeration comparison operators, improving code compatibility.",
"[!1633](https://gitlab.com/libeigen/eigen/-/merge_requests/1633): Fixed warnings in the Eigen Core utility header Meta.h by addressing conflicts from previous warning resolution efforts, reducing overall warning noise in the library.",
"[!1631](https://gitlab.com/libeigen/eigen/-/merge_requests/1631): Fixed warnings in Eigen's Core and AutoDiff utilities by suppressing enum comparison warnings on GCC, reducing compilation warning noise.",
"[!1630](https://gitlab.com/libeigen/eigen/-/merge_requests/1630): Fixed macro definition warnings in Eigen's ThreadPool and test files by resolving repeated macro definitions, improving build consistency and reducing potential warnings.",
"[!1628](https://gitlab.com/libeigen/eigen/-/merge_requests/1628): Fixed threading tests in Eigen's CoreThreadPoolDevice by adjusting header inclusion order and addressing C++20 extension warnings, improving the stability of threading-related code.",
"[!1622](https://gitlab.com/libeigen/eigen/-/merge_requests/1622): Fixed a potential undefined behavior issue in the `array_for_matrix` test suite by modifying the test/array_for_matrix.cpp file to address a UBSAN failure related to integer types.",
"[!1620](https://gitlab.com/libeigen/eigen/-/merge_requests/1620): Fixed compilation issues with constexpr matrices in DenseBase for GCC 14 by adding a trivial default constructor and modifying related header and test files to ensure proper handling of constexpr matrix initialization.",
"[!1616](https://gitlab.com/libeigen/eigen/-/merge_requests/1616): Fixed a GCC6 compilation issue in the `test/array_cwise.cpp` test file by resolving namespace prefixing problems in struct specializations, improving build compatibility.",
"[!1611](https://gitlab.com/libeigen/eigen/-/merge_requests/1611): Fixes CMake package configuration by modifying `CMakeLists.txt` to correctly set the include path for the Eigen library, ensuring proper include directory inheritance for dependent packages.",
"[!1606](https://gitlab.com/libeigen/eigen/-/merge_requests/1606): Fixed a signed integer overflow issue in the predux_mul test within packetmath.cpp, preventing undefined behavior during input generation.",
"[!1604](https://gitlab.com/libeigen/eigen/-/merge_requests/1604): Fixed AVX512 `predux_mul` implementation on MSVC to correctly handle negative values in the PacketMath header, ensuring proper computation for AVX512 packet operations.",
"[!1601](https://gitlab.com/libeigen/eigen/-/merge_requests/1601): Fixed sine and cosine function implementation for PowerPC architecture in Eigen's AltiVec packet math header to resolve incorrect function selection due to missing comparison functions.",
"[!1598](https://gitlab.com/libeigen/eigen/-/merge_requests/1598): Fixes matrix product performance in Eigen's core library by modifying transpose and product handling to eliminate unnecessary memory allocations during matrix operations.",
"[!1591](https://gitlab.com/libeigen/eigen/-/merge_requests/1591): Fixed compilation issues with PacketI on PowerPC by modifying AltiVec and generic packet math header files, improving architecture-specific compatibility and stability.",
"[!1588](https://gitlab.com/libeigen/eigen/-/merge_requests/1588): Fixed AVX architecture support in Eigen's MathFunctions and PacketMath headers by adjusting implementation for environments without AVX2, improving build compatibility for AVX-based functions.",
"[!1585](https://gitlab.com/libeigen/eigen/-/merge_requests/1585): Fixed AVX512 intrinsic handling in Eigen's PacketMath.h, resolving a GCC-related issue with the `pfirst<Packet16i>` functionality in the AVX512 implementation.",
"[!1577](https://gitlab.com/libeigen/eigen/-/merge_requests/1577): Fixed preverse implementation in Eigen's AltiVec/PacketMath.h for PowerPC architecture, addressing compatibility and correctness issues specific to this platform.",
"[!1576](https://gitlab.com/libeigen/eigen/-/merge_requests/1576): Fixed preprocessor condition in Eigen's UnaryFunctors to correctly enable fast float logistic implementation, resolving macro mismatches that previously prevented the optimized implementation from being used.",
"[!1575](https://gitlab.com/libeigen/eigen/-/merge_requests/1575): Fixed long double random number generation in Eigen's core random implementation by correcting the mantissa bit calculation and removing redundant static asserts.",
"[!1573](https://gitlab.com/libeigen/eigen/-/merge_requests/1573): Fixed compiler warnings in Eigen's core arithmetic operations, addressing unary minus and type casting issues primarily on MSVC by modifying packet math and core functional implementations.",
"[!1570](https://gitlab.com/libeigen/eigen/-/merge_requests/1570): Fixed type casting in Eigen's SSE implementation by modifying the `TypeCasting.h` file to use truncation instead of rounding when converting `Packet2d` to `Packet2l`, improving numerical operation correctness.",
"[!1567](https://gitlab.com/libeigen/eigen/-/merge_requests/1567): Fixed SSE architecture support for 32-bit systems by improving double-to-int64 conversions and adding Windows build smoketests for 32-bit and 64-bit configurations.",
"[!1566](https://gitlab.com/libeigen/eigen/-/merge_requests/1566): Fixes an issue with the `Packet2l` implementation in the SSE PacketMath header on Windows, addressing a potential compatibility problem in the SSE architecture support.",
"[!1559](https://gitlab.com/libeigen/eigen/-/merge_requests/1559): Fixed AVX and SSE packet math implementations to support 32-bit builds on Linux and Windows, adding workarounds for specific 64-bit extraction and conversion instructions.",
"[!1552](https://gitlab.com/libeigen/eigen/-/merge_requests/1552): Fixed compatibility issue in CwiseUnaryView for MSVC by modifying the default parameter handling in the CwiseUnaryViewImpl implementation within the Core module.",
"[!1550](https://gitlab.com/libeigen/eigen/-/merge_requests/1550): Fixes GPU compatibility in EmulateArray.h by removing unnecessary guarding of rbegin/rend methods, resolving compile errors for device code.",
"[!1545](https://gitlab.com/libeigen/eigen/-/merge_requests/1545): Fixed CwiseUnaryView functionality in Eigen's core module by addressing direct-access issues for const objects and improving view access and modification capabilities.",
"[!1541](https://gitlab.com/libeigen/eigen/-/merge_requests/1541): Fixed packetmath test compatibility on Windows by replacing `std::log` with `numext::log` in the `test/packetmath.cpp` file to ensure correct behavior with MSVC.",
"[!1540](https://gitlab.com/libeigen/eigen/-/merge_requests/1540): Fixed a pexp test in packetmath.cpp to handle 32-bit ARM subnormal value flushes, resolving a test failure specific to ARM architecture.",
"[!1538](https://gitlab.com/libeigen/eigen/-/merge_requests/1538): Fixes volume calculation for empty AlignedBox in Eigen's Geometry module by ensuring 0 volume is returned when the box is empty, correcting previous incorrect behavior.",
"[!1536](https://gitlab.com/libeigen/eigen/-/merge_requests/1536): Fixed an unaligned access issue in the Eigen Core library's triangular matrix-vector multiplication (trmv) function, resolving a test failure related to memory access on certain hardware architectures.",
"[!1533](https://gitlab.com/libeigen/eigen/-/merge_requests/1533): Fixed complex number edge cases in the `pexp` function within Eigen's packet math implementation, addressing test failures related to complex number exponential calculations.",
"[!1532](https://gitlab.com/libeigen/eigen/-/merge_requests/1532): Fixed a warning in the Eigen library's Macros.h header related to C++14 requirement, reducing compatibility-related warnings in the codebase.",
"[!1529](https://gitlab.com/libeigen/eigen/-/merge_requests/1529): Fixed a warning in triangular matrix-vector multiplication by removing `const_cast` and avoiding potential uninitialized memory issues in the Eigen Core module.",
"[!1528](https://gitlab.com/libeigen/eigen/-/merge_requests/1528): Fixed a compilation warning in the QR column pivoting test by using `numext::abs` instead of `abs` for floating-point types, resolving a test failure.",
"[!1526](https://gitlab.com/libeigen/eigen/-/merge_requests/1526): Fixed GPU build compatibility in Eigen's Core and SVD modules by modifying MathFunctions.h and JacobiSVD.h to resolve MSVC and NVCC compilation issues.",
"[!1524](https://gitlab.com/libeigen/eigen/-/merge_requests/1524): Fixed signed integer overflow issues in Eigen's random number generation module by modifying MathFunctions.h and rand.cpp to prevent undefined behavior during random number generation.",
"[!1521](https://gitlab.com/libeigen/eigen/-/merge_requests/1521): Fixes a crash in the IncompleteCholesky algorithm by modifying the handling of zero diagonal entries in the sparse matrix implementation, ensuring stability when processing matrices with zero diagonals.",
"[!1518](https://gitlab.com/libeigen/eigen/-/merge_requests/1518): Fixed header guards in GeneralMatrixMatrix.h to resolve build inconsistencies and ensure proper header protection across Eigen's core components.",
"[!1514](https://gitlab.com/libeigen/eigen/-/merge_requests/1514): Fixes a test in packetmath.cpp by replacing an index with an integer, addressing a potential type-related issue in the complex exponential test.",
"[!1513](https://gitlab.com/libeigen/eigen/-/merge_requests/1513): Fixes a test case in the packetmath.cpp test file, likely addressing an issue with complex exponential function testing.",
"[!1507](https://gitlab.com/libeigen/eigen/-/merge_requests/1507): Fixed BDCSVD implementation by correcting deflation issues and improving numeric stability through better index alignment and use of hypot function for diagonal element comparisons.",
"[!1504](https://gitlab.com/libeigen/eigen/-/merge_requests/1504): Fixed undefined behavior in the `pabsdiff` function for ARM architectures by adding overflow prevention checks, improving library stability on recent compilers.",
"[!1500](https://gitlab.com/libeigen/eigen/-/merge_requests/1500): Fixes a scalar conversion issue in ternary expressions within the Eigen Core module, addressing a specific bug related to type handling in matrix operations.",
"[!1496](https://gitlab.com/libeigen/eigen/-/merge_requests/1496): Fixed division by zero undefined behavior in packet size logic within the GeneralBlockPanelKernel header, improving robustness of internal calculations without changing the public API.",
"[!1494](https://gitlab.com/libeigen/eigen/-/merge_requests/1494): Fixes a segmentation fault in CholmodBase::factorize() when handling zero matrices by adding robust checks in the Cholmod support module, preventing potential crashes during sparse matrix factorizations.",
"[!1492](https://gitlab.com/libeigen/eigen/-/merge_requests/1492): Fixed C++20 compatibility in Eigen's GeneralBlockPanelKernel header by resolving enumeration type arithmetic errors, ensuring proper type promotion for multiply operations.",
"[!1490](https://gitlab.com/libeigen/eigen/-/merge_requests/1490): Fixed undefined behavior in Eigen's packet math operations by modifying SSE and generic packet math implementations to use valid boolean values in the `pselect` function.",
"[!1489](https://gitlab.com/libeigen/eigen/-/merge_requests/1489): Fixed undefined behavior in the `getRandomBits` function within `MathFunctions.h` by adding a check to return 0 when no random bits are requested and optimizing mask calculation to prevent potential issues.",
"[!1488](https://gitlab.com/libeigen/eigen/-/merge_requests/1488): Fixed test cases for bfloat16 and half scalar types in Eigen's test suite, addressing constexpr behavior issues across multiple test files.",
"[!1487](https://gitlab.com/libeigen/eigen/-/merge_requests/1487): Fixed skew symmetric matrix test in Eigen by modifying the test logic to avoid a problematic case causing test failures, specifically addressing the scenario where `k == 1` leads to catastrophic cancellation.",
"[!1486](https://gitlab.com/libeigen/eigen/-/merge_requests/1486): Fixed a GCC-6 compiler optimization issue in the random number test by applying the `noinline` attribute to prevent value elision, ensuring the test passes correctly.",
"[!1485](https://gitlab.com/libeigen/eigen/-/merge_requests/1485): Fixed PPC architecture-specific packet math issues in Eigen's AltiVec implementation, addressing random integer overflow problems and improving test compatibility for PowerPC platforms.",
"[!1482](https://gitlab.com/libeigen/eigen/-/merge_requests/1482): Fixed the preshear transformation in Eigen's Geometry module by correcting its internal implementation and adding a validation test case to ensure proper functionality.",
"[!1478](https://gitlab.com/libeigen/eigen/-/merge_requests/1478): Fixed a comparison bug in the array_cwise.cpp test file, addressing a minor issue with subnormal number checking.",
"[!1468](https://gitlab.com/libeigen/eigen/-/merge_requests/1468): Fixed ARM32 architecture-specific issues in Eigen's core mathematical functions by replacing `fpclassify` and modifying `mlaq` to improve accuracy and compatibility with the half-precision floating point type.",
"[!1460](https://gitlab.com/libeigen/eigen/-/merge_requests/1460): Fixed performance regression in Eigen's stableNorm function by reverting a previous implementation change, restoring optimal performance for large vectors.",
"[!1458](https://gitlab.com/libeigen/eigen/-/merge_requests/1458): Fixed the `stableNorm` function in Eigen's core module to correctly handle zero-sized input, preventing potential edge-case errors and ensuring consistent behavior across different input sizes.",
"[!1451](https://gitlab.com/libeigen/eigen/-/merge_requests/1451): Fixed a build error in the SPQR module by resolving a type mismatch between Index and StorageIndex when using SuiteSparseQR() with SparseMatrix<double>, addressing compiler compatibility issues.",
"[!1449](https://gitlab.com/libeigen/eigen/-/merge_requests/1449): Fixed GPU-related memory access issues in GenericPacketMath.h by addressing function pointer handling, improving stability for clang and ASAN debugging scenarios.",
"[!1444](https://gitlab.com/libeigen/eigen/-/merge_requests/1444): Fixes potential overflow in Eigen's CompressedStorage by using smaller index types when determining maximum size during resize operations in sparse matrix handling.",
"[!1439](https://gitlab.com/libeigen/eigen/-/merge_requests/1439): Fixed bit manipulation functions in Eigen's MathFunctions.h for MSVC, correcting the `_BitScanReverse` implementation to accurately return the index of the first set bit and align with expected behavior.",
"[!1434](https://gitlab.com/libeigen/eigen/-/merge_requests/1434): Fixed a CUDA syntax error in the `test/gpu_common.h` test file that was introduced by clang-format, ensuring correct compilation of GPU-related test code.",
"[!1431](https://gitlab.com/libeigen/eigen/-/merge_requests/1431): Fixed scalar logistic function handling for complex inputs by updating comparison logic in UnaryFunctors.h to prevent overflow issues and improve robustness of complex input processing.",
"[!1425](https://gitlab.com/libeigen/eigen/-/merge_requests/1425): Fixed typecasting in Eigen's NEON implementation for ARM32 architecture, addressing compatibility and correctness issues in type casting operations.",
"[!1422](https://gitlab.com/libeigen/eigen/-/merge_requests/1422): Fixed ARM architecture type casting in Eigen's TypeCasting.h to correctly convert 64-bit integers to 32-bit floats, preventing potential data truncation during conversion.",
"[!1419](https://gitlab.com/libeigen/eigen/-/merge_requests/1419): Fixed a potential dimension validation issue in GeneralMatrixMatrixTriangular.h by ensuring `mc` is not smaller than `Traits::nr`, preventing potential out-of-bounds errors in matrix operations.",
"[!1417](https://gitlab.com/libeigen/eigen/-/merge_requests/1417): Fixed a bug in the `getNbThreads()` function within Eigen's parallelization infrastructure, ensuring it correctly returns 1 when not parallelizing to improve thread count retrieval reliability.",
"[!1416](https://gitlab.com/libeigen/eigen/-/merge_requests/1416): Fixed a compiler warning in the Eigen gemm parallelizer by addressing integer type shortening in the Parallelizer.h file, maintaining existing functionality.",
"[!1415](https://gitlab.com/libeigen/eigen/-/merge_requests/1415): Links pthread library for the product_threaded test in the CMakeLists.txt, ensuring proper compilation and linking of threaded test cases.",
"[!1413](https://gitlab.com/libeigen/eigen/-/merge_requests/1413): Fixed the `Ref` class implementation to correctly handle stride construction for contiguous memory layout objects, improving performance and correctness when creating `Ref` objects from mutable types.",
"[!1411](https://gitlab.com/libeigen/eigen/-/merge_requests/1411): Fixed a typo in the AVX512 implementation's runtime malloc configuration macro, resolving a build issue that prevented the nomalloc test from passing on AVX512 architectures.",
"[!1402](https://gitlab.com/libeigen/eigen/-/merge_requests/1402): Fixed a compiler-specific issue in the Block expression type within Eigen's Core module, addressing MSVC attribute handling by removing a dependent typedef in Block.h.",
"[!1401](https://gitlab.com/libeigen/eigen/-/merge_requests/1401): Fixed a typo in a comment within the Block.h file of the Eigen library's Core module, with no functional impact on the code.",
"[!1398](https://gitlab.com/libeigen/eigen/-/merge_requests/1398): Fixed macro conflict in Eigen's matrix product and sparse computation headers by eliminating use of _res, resolving potential compilation errors with resolv.h.",
"[!1396](https://gitlab.com/libeigen/eigen/-/merge_requests/1396): Fixed sparse triangular view iterator in SparseTriangularView.h by correcting the row() and col() functions, resolving a long-standing bug that could cause incorrect results and potential segfaults.",
"[!1394](https://gitlab.com/libeigen/eigen/-/merge_requests/1394): Fixed an extra semicolon in the XprHelper header file to resolve compilation errors when using the `-Wextra-semi` compiler flag.",
"[!1388](https://gitlab.com/libeigen/eigen/-/merge_requests/1388): Fixes Pardiso support in PardisoSupport.h by adjusting stage validation logic to ensure a stage is considered valid only when Pardiso returns a success status.",
"[!1386](https://gitlab.com/libeigen/eigen/-/merge_requests/1386): Fixed ARM32 floating-point operations in Eigen's NEON intrinsics by improving float division and reciprocal calculations, addressing denormal value handling and increasing computational accuracy.",
"[!1380](https://gitlab.com/libeigen/eigen/-/merge_requests/1380): Fixes unaligned scalar binding in MapBase by modifying alignment checks in MapBase.h, preventing potential undefined behavior in memory mapping operations.",
"[!1379](https://gitlab.com/libeigen/eigen/-/merge_requests/1379): Fixed a potential nullptr dereference in the SVD (Singular Value Decomposition) implementation by adding a safety check when the upper-diagonal is empty, preventing runtime errors in edge cases.",
"[!1377](https://gitlab.com/libeigen/eigen/-/merge_requests/1377): Fixed undefined behavior in triangular matrix solves by adding safety checks to prevent out-of-bounds access when the matrix system is empty or singular.",
"[!1376](https://gitlab.com/libeigen/eigen/-/merge_requests/1376): Fixed a nullptr dereference issue in the triangular matrix product implementation, preventing undefined behavior when matrices have zero size and improving robustness in edge cases.",
"[!1371](https://gitlab.com/libeigen/eigen/-/merge_requests/1371): Fixed SVD implementation warnings in Eigen by modifying header files for BDCSVD, JacobiSVD, and SVDBase to eliminate GCC 10 maybe-uninitialized compiler warnings and optimize memory usage in fixed-size SVD cases.",
"[!1370](https://gitlab.com/libeigen/eigen/-/merge_requests/1370): Fixed a warning in GeneralMatrixVector.h related to loop optimizations by explicitly defining loop bounds to silence false positive compiler warnings for specific matrix sizes.",
"[!1367](https://gitlab.com/libeigen/eigen/-/merge_requests/1367): Fixed compiler warnings and initialization issues in Eigen's Block and VectorBlock components, addressing zero-size block handling and improving matrix initialization safety.",
"[!1363](https://gitlab.com/libeigen/eigen/-/merge_requests/1363): Fixes CUDA support in Eigen's MathFunctions by replacing deprecated `::arg` with `std::arg`, resolving compatibility issues with MSVC+C++20.",
"[!1362](https://gitlab.com/libeigen/eigen/-/merge_requests/1362): Fixed an AVX intrinsic parameter issue in PacketMath.h by correcting the imm argument for _mm256_cvtps_ph, resolving potential compiler warnings and ensuring correct behavior.",
"[!1360](https://gitlab.com/libeigen/eigen/-/merge_requests/1360): Fixed the return type of `ivcSize` in `IndexedViewMethods.h` to improve type safety and ensure compatibility with Eigen's internal implementation.",
"[!1359](https://gitlab.com/libeigen/eigen/-/merge_requests/1359): Fixed AVX512 triangular solver matrix (trsm) kernels to handle no-malloc scenarios by adjusting kernel behavior and allocation strategies in the AVX512 implementation.",
"[!1358](https://gitlab.com/libeigen/eigen/-/merge_requests/1358): Fixed compiler warnings across multiple Eigen core modules, addressing integer comparison issues and removing unused typedefs to improve code robustness and reduce potential build errors.",
"[!1350](https://gitlab.com/libeigen/eigen/-/merge_requests/1350): Fixed the `safe_abs` function in the integer power implementation to prevent undefined behavior on clang by improving the absolute value handling in the default generic packet math functions.",
"[!1349](https://gitlab.com/libeigen/eigen/-/merge_requests/1349): Fixed AVX `pstore` function in Eigen's PacketMath.h to correctly use aligned store intrinsics for integer types, improving performance and correctness of AVX-based operations.",
"[!1343](https://gitlab.com/libeigen/eigen/-/merge_requests/1343): Fixed error handling in the `pow()` function for floating-point and integer operations, addressing edge cases and underflow issues in Eigen's mathematical functions.",
"[!1339](https://gitlab.com/libeigen/eigen/-/merge_requests/1339): Fixes a potential compilation issue in Macros.h by preventing the setting of EIGEN_HAS_ARM64_FP16_SCALAR_ARITHMETIC during CUDA compilation, which resolves miscompilation problems when mixing ARM and CUDA code.",
"[!1337](https://gitlab.com/libeigen/eigen/-/merge_requests/1337): Fixed vectorization logic in Redux library by cleaning up Redux.h and PartialReduxEvaluator.h to address traversal order-related issues and improve compatibility.",
"[!1333](https://gitlab.com/libeigen/eigen/-/merge_requests/1333): Fixed compiler warnings and potential compilation failures in Eigen's SVD (Singular Value Decomposition) implementations by ensuring proper initialization of small fixed-size matrix members in the SVDBase.h file.",
"[!1327](https://gitlab.com/libeigen/eigen/-/merge_requests/1327): Fixed CUDA compilation issues in Eigen Core by modifying Meta.h and adding MoreMeta.h to resolve include and compilation problems related to STL array handling.",
"[!1323](https://gitlab.com/libeigen/eigen/-/merge_requests/1323): Fixed a compiler warning in the Eigen Visitor implementation related to potential modulo by zero, improving code safety in the Core module.",
"[!1322](https://gitlab.com/libeigen/eigen/-/merge_requests/1322): Fixed AltiVec matrix-vector product (GEMV) implementation for BF16 data types by correcting the `loadColData` function and resolving LLVM compilation issues in the matrix product headers.",
"[!1319](https://gitlab.com/libeigen/eigen/-/merge_requests/1319): Fixed AltiVec BF16 GEMV implementation to correctly handle ColMajor matrix and RowMajor vector scenarios, improving compatibility in matrix-vector operations for specific data layouts.",
"[!1318](https://gitlab.com/libeigen/eigen/-/merge_requests/1318): Fixes JacobiSVD implementation in Eigen by adding input validation to prevent crashes when encountering invalid inputs, ensuring `m_nonzeroSingularValues` is set to zero in such cases.",
"[!1312](https://gitlab.com/libeigen/eigen/-/merge_requests/1312): Fixed a warning in the test/product_small.cpp file related to boolean bitwise operations, reducing warning noise in the test suite without impacting functionality.",
"[!1308](https://gitlab.com/libeigen/eigen/-/merge_requests/1308): Fixed vectorization support for uint32_t in Eigen's PacketMath by adding specialization and disabling problematic pmul operations to prevent compilation errors.",
"[!1302](https://gitlab.com/libeigen/eigen/-/merge_requests/1302): Fixed a typo in the SSE packet math implementation within the Eigen/src/Core/arch/SSE/PacketMath.h header file, ensuring consistency and correctness in the SSE packet math code.",
"[!1291](https://gitlab.com/libeigen/eigen/-/merge_requests/1291): Fixed .gitignore configuration to prevent Eigen/Core and Eigen/src/Core directories from being incorrectly ignored by the core ignore rule.",
"[!1283](https://gitlab.com/libeigen/eigen/-/merge_requests/1283): Fixed type casting intrinsics in Eigen's AVX, AVX512, and SSE architecture implementations to ensure consistent and correct truncation behavior when converting double to integer types.",
"[!1282](https://gitlab.com/libeigen/eigen/-/merge_requests/1282): Fixed AVX512 GEMM/TRSM kernels by addressing buffer overrun issues and adding masked loads to prevent out-of-bound data access in Eigen's AVX512 implementation.",
"[!1277](https://gitlab.com/libeigen/eigen/-/merge_requests/1277): Fixed casting issues in the AVX512DQ path within Eigen's PacketMath.h, addressing potential type conversion errors in vectorized operations.",
"[!1271](https://gitlab.com/libeigen/eigen/-/merge_requests/1271): Fixed potential StorageIndex overflow in SparseMatrix by modifying the `setFromTriplets` method and updating the `Map` typedef to use correct options in the Eigen sparse matrix implementation.",
"[!1270](https://gitlab.com/libeigen/eigen/-/merge_requests/1270): Fixed ARM-specific build compatibility in Eigen's core architecture files by addressing casting and macro definition issues for MSVC and 32-bit ARM platforms.",
"[!1269](https://gitlab.com/libeigen/eigen/-/merge_requests/1269): Fixed CMake and CI configuration by reverting recent cmake pools changes to resolve build errors in the Eigen library's build infrastructure.",
"[!1268](https://gitlab.com/libeigen/eigen/-/merge_requests/1268): Fixed CMake configuration parsing to better handle command-line arguments specified as lists, improving compatibility with different CI configuration styles.",
"[!1263](https://gitlab.com/libeigen/eigen/-/merge_requests/1263): Fixed PowerPC and clang warnings in Eigen's AltiVec and warning suppression headers, addressing architecture-specific compiler messages to improve code compilation stability.",
"[!1258](https://gitlab.com/libeigen/eigen/-/merge_requests/1258): Fixed BF16 GEMM implementation for LLVM (Power) architecture by reverting changes that caused register spillage, reducing performance overhead by 20%.",
"[!1256](https://gitlab.com/libeigen/eigen/-/merge_requests/1256): Fixed a bug in the `minmax_coeff_visitor` within Eigen's Core module to correctly handle matrices containing only NaN values, improving the robustness of coefficient traversal logic.",
"[!1252](https://gitlab.com/libeigen/eigen/-/merge_requests/1252): Fixed a compiler bug in the Tridiagonalization.h header by implementing a workaround to address a specific compiler-related issue in the Eigen library's eigenvalues module.",
"[!1249](https://gitlab.com/libeigen/eigen/-/merge_requests/1249): Fixed AVX and AVX512 packet math implementations by modifying intrinsic usage in PacketMath.h to resolve MSVC test failures, with additional test support through a new packet_ostream.h file.",
"[!1248](https://gitlab.com/libeigen/eigen/-/merge_requests/1248): Fixed LinAlgSVD example code in documentation to ensure correct compilation and least-squares solution demonstration. Corrected a typo in the TutorialLinAlgSVDSolve.cpp example file.",
"[!1245](https://gitlab.com/libeigen/eigen/-/merge_requests/1245): Fixed a test in array_cwise.cpp by modifying matrix squaring logic to use `.abs()` and prevent integer overflow during random matrix operations.",
"[!1239](https://gitlab.com/libeigen/eigen/-/merge_requests/1239): Fixed NEON integer shift operation tests in Eigen's test suite to correctly handle zero as a valid input for shift operations, resolving test failures in the array comparison tests.",
"[!1235](https://gitlab.com/libeigen/eigen/-/merge_requests/1235): Fixed ODR issues in Eigen's AVX512 TRSM kernels by removing static qualifiers from free functions in TrsmKernel.h and TrsmUnrolls.inc, resolving potential linkage compatibility problems.",
"[!1229](https://gitlab.com/libeigen/eigen/-/merge_requests/1229): Fixed MSAN (Memory Sanitizer) errors in Eigen's SVD (Singular Value Decomposition) test suite by addressing uninitialized matrix handling in test files, improving test safety and reliability.",
"[!1228](https://gitlab.com/libeigen/eigen/-/merge_requests/1228): Fixed compiler compatibility issues in the AltiVec PacketMath header for Power architecture, addressing specific problems with GCC 10.4 and the vec_div command for integer types.",
"[!1222](https://gitlab.com/libeigen/eigen/-/merge_requests/1222): Fixed epsilon values in NumTraits and related files for long double types, addressing convergence issues on PowerPC platforms by adjusting numerical precision thresholds.",
"[!1220](https://gitlab.com/libeigen/eigen/-/merge_requests/1220): Fixed NEON packetmath implementation in Eigen by addressing compilation issues in PacketMath.h and resolving a stack overflow problem in TypeCasting.h, improving stability for NEON-based computations.",
"[!1218](https://gitlab.com/libeigen/eigen/-/merge_requests/1218): Fixed an MSVC-specific test in the array component, addressing an edge case with `std::atan2` underflow behavior to ensure compliance with POSIX standards.",
"[!1216](https://gitlab.com/libeigen/eigen/-/merge_requests/1216): Fixed a typo in the NEON `make_packet2f` function within the Eigen Core architecture, addressing a minor implementation detail in the PacketMath header.",
"[!1202](https://gitlab.com/libeigen/eigen/-/merge_requests/1202): Fixed MSVC ARM build compatibility in Eigen's NEON architecture files by modifying intrinsic functions and vector type handling in Complex.h, PacketMath.h, and TypeCasting.h.",
"[!1201](https://gitlab.com/libeigen/eigen/-/merge_requests/1201): Fixed an ODR violation in Eigen's AltiVec matrix product implementation by renaming a function in MatrixProductMMA.h to prevent naming conflicts when dynamic dispatch is enabled.",
"[!1186](https://gitlab.com/libeigen/eigen/-/merge_requests/1186): Updated the ForwardDeclarations.h utility file in Eigen's core module with minor modifications to forward declarations.",
"[!1185](https://gitlab.com/libeigen/eigen/-/merge_requests/1185): Fixes a special case handling issue in the Eigen library's generic packet math `atan2` function, resolving a test failure in TensorFlow with Clang.",
"[!1184](https://gitlab.com/libeigen/eigen/-/merge_requests/1184): Fixed AltiVec vector math functions by addressing pre-POWER8_VECTOR compatibility issues in `pcmp_lt` and `pnegate`, and reactivating the `psqrt` function.",
"[!1183](https://gitlab.com/libeigen/eigen/-/merge_requests/1183): Fixed undefined behavior in Eigen's Block and StlIterators components by addressing potential pointer arithmetic issues that could trigger UBSan errors.",
"[!1180](https://gitlab.com/libeigen/eigen/-/merge_requests/1180): Fixed sparse matrix memory allocation bug in Eigen's SparseCore module, preventing segfaults when handling empty sparse matrices by ensuring proper memory allocation for `m_outerIndex`.",
"[!1179](https://gitlab.com/libeigen/eigen/-/merge_requests/1179): Fixed the AltiVec implementation of rsqrt by disabling its vectorized version to ensure compatibility with the generic version of the function.",
"[!1178](https://gitlab.com/libeigen/eigen/-/merge_requests/1178): Fixed warnings in the SparseMatrix header file to improve sparse matrix operation handling in the Eigen library, reducing potential compilation warnings.",
"[!1173](https://gitlab.com/libeigen/eigen/-/merge_requests/1173): Fixed QR test compatibility by reverting permutation index type changes in qr_colpivoting.cpp and qr_fullpivoting.cpp to restore default indexing and ensure test functionality.",
"[!1162](https://gitlab.com/libeigen/eigen/-/merge_requests/1162): Fixed QR decomposition module build conflicts by resolving conflicting definitions of StorageIndex across multiple Eigen library files, ensuring compatibility and stability of the QR-related components.",
"[!1161](https://gitlab.com/libeigen/eigen/-/merge_requests/1161): Fixed an unused parameter warning in the NEON implementation of Eigen's GeneralBlockPanelKernel for 32-bit ARM builds, improving compiler compatibility with clang.",
"[!1156](https://gitlab.com/libeigen/eigen/-/merge_requests/1156): Fixed build and test issues across multiple Eigen modules, including correcting header paths, removing unnecessary headers, and improving GPU and floating-point support configurations.",
"[!1155](https://gitlab.com/libeigen/eigen/-/merge_requests/1155): Fixed the overalign check in Macros.h to correctly handle the EIGEN_COMP_ICC macro, ensuring compatibility with compilers that do not use ICC.",
"[!1153](https://gitlab.com/libeigen/eigen/-/merge_requests/1153): Fixed macro guard conditions in Eigen's Half.h header for GPU-related FP16 operations, resolving potential compilation compatibility issues with CUDA environments.",
"[!1151](https://gitlab.com/libeigen/eigen/-/merge_requests/1151): Fixed compiler configuration in Macros.h for Intel Compiler (ICC), addressing an overalignment issue with C++17 features in the Eigen core utilities.",
"[!1150](https://gitlab.com/libeigen/eigen/-/merge_requests/1150): Fixed Altivec compatibility for macOS PPC architectures by modifying macro checks in several Eigen core vectorization headers to prevent using unsupported VSX instructions.",
"[!1149](https://gitlab.com/libeigen/eigen/-/merge_requests/1149): Fixes .gitignore configuration to include the `scripts/buildtests.in` file, resolving an issue with git tracking of build test scripts.",
"[!1144](https://gitlab.com/libeigen/eigen/-/merge_requests/1144): Fixed C++ version detection macros in Eigen's core utilities to improve compatibility and reduce CI failures by updating detection logic in Macros.h, CMakeLists.txt, and constexpr.cpp.",
"[!1143](https://gitlab.com/libeigen/eigen/-/merge_requests/1143): Reverted a change in CompressedStorage.h to restore previous type mixing behavior, addressing a compatibility issue in the Eigen sparse core module.",
"[!1142](https://gitlab.com/libeigen/eigen/-/merge_requests/1142): Fixed a bug in the NEON GEBP kernel for native `__fp16` multiplication on ARM hardware, addressing incorrect packet handling and improving robustness of tensor contractions.",
"[!1130](https://gitlab.com/libeigen/eigen/-/merge_requests/1130): Fixed a type issue in sparse index sorting, correcting the index type in the sparse vector implementation to improve sorting accuracy.",
"[!1127](https://gitlab.com/libeigen/eigen/-/merge_requests/1127): Fixed serialization for non-compressed sparse matrices by adding an explicit move constructor in SparseMatrix.h and modifying serializer.cpp to correctly handle buffer size and move construction.",
"[!1124](https://gitlab.com/libeigen/eigen/-/merge_requests/1124): Fixed SparseLU solver to handle destinations with non-unit strides by modifying block expression handling in SparseLU.h and SparseLU_SupernodalMatrix.h, improving robustness for matrix views.",
"[!1123](https://gitlab.com/libeigen/eigen/-/merge_requests/1123): Fixes stride calculation in Eigen's reshape operations to correctly handle matrices with non-zero inner strides, improving the robustness of matrix reshaping functionality.",
"[!1120](https://gitlab.com/libeigen/eigen/-/merge_requests/1120): Fixed memory management in `handmade_aligned_realloc` by adding alignment constraints and correcting byte copying during reallocation to improve safety in Eigen's memory handling.",
"[!1118](https://gitlab.com/libeigen/eigen/-/merge_requests/1118): Fixed PPC architecture's AltiVec PacketMath header to resolve compiler ambiguity between `uint64_t` and `unsigned long` in vec_splats intrinsic function definitions.",
"[!1116](https://gitlab.com/libeigen/eigen/-/merge_requests/1116): Fixes floating-point zero sign handling in AVX and AVX512 packet math operations by correcting the `pnegate` implementation to properly manage sign bit flipping for zero values.",
"[!1115](https://gitlab.com/libeigen/eigen/-/merge_requests/1115): Fixed AVX2 implementation of psignbit in Eigen's Core library, ensuring correct handling of vector operations for improved performance and correctness.",
"[!1113](https://gitlab.com/libeigen/eigen/-/merge_requests/1113): Fixed Altivec implementation in PacketMath.h for Power 8 architecture, resolving duplicate execution code in the pstore_partial function to improve reliability.",
"[!1112](https://gitlab.com/libeigen/eigen/-/merge_requests/1112): Fixed a typo in the CholmodSupport module, addressing a minor textual error without impacting functionality.",
"[!1111](https://gitlab.com/libeigen/eigen/-/merge_requests/1111): Fixes Neon-specific packet math implementation in Eigen's architecture-specific header, addressing unspecified Neon-related issues in the core library.",
"[!1107](https://gitlab.com/libeigen/eigen/-/merge_requests/1107): Fixed AltiVec PacketMath header to disable `patan` for double on PPC architectures, resolving build compatibility issues for PowerPC platforms.",
"[!1106](https://gitlab.com/libeigen/eigen/-/merge_requests/1106): Fixed memory allocation offset computation in Eigen's Memory.h utility to resolve a potential out-of-bounds memory access issue reported by oss-fuzz.",
"[!1105](https://gitlab.com/libeigen/eigen/-/merge_requests/1105): Fixed a pragma check in the LU decomposition code to improve compatibility and stability when fastmath is disabled in the Eigen library.",
"[!1104](https://gitlab.com/libeigen/eigen/-/merge_requests/1104): Fixed a bug in Neon assembly for half-precision data type in GeneralBlockPanelKernel.h, addressing performance issues with GCC compiler intrinsics.",
"[!1102](https://gitlab.com/libeigen/eigen/-/merge_requests/1102): Fixed a bug in SparseMapBase by adding an assert to validate the outerIndexPtr array size, preventing potential out-of-bounds indexing errors in sparse map operations.",
"[!1096](https://gitlab.com/libeigen/eigen/-/merge_requests/1096): Fixed a bug in the `pselect` predicate within Eigen's `BinaryFunctors.h`, addressing a platform-specific issue with single-bit packet handling in the linear algebra implementation.",
"[!1094](https://gitlab.com/libeigen/eigen/-/merge_requests/1094): Fixed warnings in Eigen's sparse linear algebra modules by addressing unused variable warnings in SparseLU and TriangularSolver header files.",
"[!1085](https://gitlab.com/libeigen/eigen/-/merge_requests/1085): Fixed 4x4 matrix inverse computation in Eigen's LU module to prevent sign flips when compiling with `-Ofast` compiler optimization flag, ensuring consistent numerical behavior.",
"[!1070](https://gitlab.com/libeigen/eigen/-/merge_requests/1070): Fixed a test case for the `pow` function in Eigen's array operations, addressing type conversion behavior for integer exponents to improve test coverage and performance.",
"[!1065](https://gitlab.com/libeigen/eigen/-/merge_requests/1065): Fixed sparse matrix compilation issues for ROCm architecture by modifying the MatrixBase.h header, resolving platform-specific build errors in Eigen's sparse matrix operations.",
"[!1064](https://gitlab.com/libeigen/eigen/-/merge_requests/1064): Fixed constexpr build errors in Eigen's core modules for g++-6 and C++20, addressing compatibility issues with compile-time evaluation functions in AssignEvaluator and Redux components.",
"[!1063](https://gitlab.com/libeigen/eigen/-/merge_requests/1063): Fixed unary pow() function in Eigen's Core module by explicitly casting std::pow() result and adjusting type comparisons to improve type handling robustness.",
"[!1061](https://gitlab.com/libeigen/eigen/-/merge_requests/1061): Fixes the `pow` function bound calculation in the array component to better handle floating-point types, resolving a specific test case failure in `array_cwise_3`.",
"[!1060](https://gitlab.com/libeigen/eigen/-/merge_requests/1060): Fixed memory reallocation handling in Eigen's core memory utilities for non-trivial types, addressing issues with pointer management and initialization during memory reallocation.",
"[!1054](https://gitlab.com/libeigen/eigen/-/merge_requests/1054): Fixed a typo in the Sparse tutorial documentation file, improving documentation clarity and accuracy.",
"[!1053](https://gitlab.com/libeigen/eigen/-/merge_requests/1053): Fixed a compilation error in GeneralizedEigenSolver.h for MSVC by adding a missing semicolon, resolving a build compatibility issue for Microsoft Visual C++ compilers.",
"[!1051](https://gitlab.com/libeigen/eigen/-/merge_requests/1051): Fixed mixingtypes test suite by modifying the unary pow operation to resolve test failures related to binary operation plugin interactions in the test/mixingtypes.cpp file.",
"[!1045](https://gitlab.com/libeigen/eigen/-/merge_requests/1045): Fixed GeneralizedEigenSolver's info() method to improve error reporting and decomposition state tracking. Updated logic in the eigenvalue solver to ensure more accurate and robust initialization checks.",
"[!1044](https://gitlab.com/libeigen/eigen/-/merge_requests/1044): Fixed a memory management issue in the SparseMatrix reallocation process by adding a missing pointer in the realloc call, ensuring proper memory handling during sparse matrix operations.",
"[!1042](https://gitlab.com/libeigen/eigen/-/merge_requests/1042): Fixed potential signed integer overflow in array_cwise test by modifying GenericPacketMathFunctions.h and array_cwise.cpp to prevent undefined behavior during integer arithmetic operations.",
"[!1039](https://gitlab.com/libeigen/eigen/-/merge_requests/1039): Fixed the `psign` function in Eigen's core packet math functions to correctly handle unsigned integer types like bool, addressing a bug that incorrectly returned `bool(-1)`.",
"[!1030](https://gitlab.com/libeigen/eigen/-/merge_requests/1030): Fixes Half functions compilation issue in Eigen's Default architecture header for aarch64, preventing double-definition errors during GPU compilation.",
"[!1028](https://gitlab.com/libeigen/eigen/-/merge_requests/1028): Fixed PowerPC build configuration in Eigen's AltiVec architecture-specific files, addressing non-VSX build compatibility issues for PowerPC platforms.",
"[!1027](https://gitlab.com/libeigen/eigen/-/merge_requests/1027): Fixed vectorized pow() implementation in Eigen's core module to correctly handle corner cases involving negative numbers and odd exponents, improving numerical computation accuracy.",
"[!1025](https://gitlab.com/libeigen/eigen/-/merge_requests/1025): Fixed the use of Packet2d type in the AltiVec Complex.h header to prevent compilation errors on non-VSX platforms by adding appropriate conditional compilation checks.",
"[!1023](https://gitlab.com/libeigen/eigen/-/merge_requests/1023): Fixed a flaky test in packetmath.cpp by adjusting input values to prevent cancellation issues in pmsub and pnmadd test cases, improving test reliability.",
"[!1014](https://gitlab.com/libeigen/eigen/-/merge_requests/1014): Fixed memory allocation behavior in Eigen's Memory.h by modifying aligned_realloc to properly handle null pointer cases and comply with runtime malloc assertions.",
"[!1010](https://gitlab.com/libeigen/eigen/-/merge_requests/1010): Fixes sparse block iterator in SparseCompressedBase to correctly handle outer index, resolving index mismatches in sparse block operations and improving iterator reliability.",
"[!988](https://gitlab.com/libeigen/eigen/-/merge_requests/988): Fixed AVX512 kernel build issues in MSVC by modifying GemmKernel.h and TrsmKernel.h to disable recent optimizations that were causing compilation errors and high memory usage.",
"[!987](https://gitlab.com/libeigen/eigen/-/merge_requests/987): Fixed integer shortening warnings in the visitor test suite by modifying test/visitor.cpp to address potential warning issues during compilation.",
"[!980](https://gitlab.com/libeigen/eigen/-/merge_requests/980): Fixes a potential signed integer overflow issue in the Eigen adjoint test suite by modifying the test/adjoint.cpp file to prevent undefined behavior.",
"[!977](https://gitlab.com/libeigen/eigen/-/merge_requests/977): Fixed numerical stability issues in the BDCSVD (Blocked Divide and Conquer Singular Value Decomposition) implementation within the Eigen SVD module, improving the solver's robustness for handling edge case computations.",
"[!974](https://gitlab.com/libeigen/eigen/-/merge_requests/974): Fixed a potential crash in the BDCSVD implementation by adding boundary checks and preventing out-of-bounds memory access when processing matrices filled with ones.",
"[!964](https://gitlab.com/libeigen/eigen/-/merge_requests/964): Fixed HouseholderSequence implementation by ensuring the InnerPanel template parameter is always false, resolving potential compilation issues in the Householder module.",
"[!963](https://gitlab.com/libeigen/eigen/-/merge_requests/963): Fixed NaN propagation in cwise operations for scalar inputs by correcting a missing template parameter in the Eigen source code, addressing issue #2474.",
"[!958](https://gitlab.com/libeigen/eigen/-/merge_requests/958): Fixed Power GEMM inline assembly implementation to resolve compiler compatibility issues with GCC 10 and 11 on Power architecture, specifically modifying the MatrixProductMMA.h header file.",
"[!949](https://gitlab.com/libeigen/eigen/-/merge_requests/949): Fixed ODR (One Definition Rule) issues in the `lapacke_helpers.h` file to ensure proper linkage and compliance with C++ standard requirements for the Eigen library's LAPACKE helpers module.",
"[!948](https://gitlab.com/libeigen/eigen/-/merge_requests/948): Fixed MSVC+CUDA compatibility issues across multiple Eigen core and tensor modules by modifying type definitions and resolving compiler warnings related to macro arguments and friend declarations.",
"[!945](https://gitlab.com/libeigen/eigen/-/merge_requests/945): Fixed max size expressions in core Eigen library files, addressing potential size limit issues across DenseBase, SolverBase, and TriangularMatrix header files.",
"[!941](https://gitlab.com/libeigen/eigen/-/merge_requests/941): Fixed scalar comparison logic in test/main.h to correctly handle infinite and NaN values, ensuring accurate behavior in the `test_isApprox` function for edge case comparisons.",
"[!934](https://gitlab.com/libeigen/eigen/-/merge_requests/934): Fixed the order of template arguments in the BLAS `syrk` function within `level3_impl.h` to resolve potential compiler errors related to argument type compatibility.",
"[!930](https://gitlab.com/libeigen/eigen/-/merge_requests/930): Fixed compilation issues in HouseholderQR and NNLS components by adding a missing typename and removing an unused typedef, resolving warnings on GCC 9.",
"[!925](https://gitlab.com/libeigen/eigen/-/merge_requests/925): Fixed an ODR violation in the trsm function within the AVX512 architecture implementation by marking the function as inline to resolve potential compiler warnings.",
"[!924](https://gitlab.com/libeigen/eigen/-/merge_requests/924): Fixed f16c scalar conversions in Half.h for MSVC by disabling unsupported conversions, ensuring better compatibility with Microsoft's compiler and preventing potential compilation issues.",
"[!923](https://gitlab.com/libeigen/eigen/-/merge_requests/923): Fixed AVX512 build compatibility for MSVC in Eigen's core architecture, adding explicit casting and enabling AVX512 testing in the CI configuration.",
"[!922](https://gitlab.com/libeigen/eigen/-/merge_requests/922): Fixed a compiler compatibility issue in Eigen's Diagonal and Transpose headers by adding an extra `const` qualifier to work around an MSVC compiler bug that was incorrectly dropping const qualifiers.",
"[!918](https://gitlab.com/libeigen/eigen/-/merge_requests/918): Fixed AVX512 architecture implementation in Eigen by adding explicit reinterprets to resolve g++ compilation errors in TypeCasting.h and unrolls_impl.hpp.",
"[!917](https://gitlab.com/libeigen/eigen/-/merge_requests/917): Fixed test/geo_orthomethods.cpp to work around a compiler bug in g++-10 Docker image, resolving a test failure in the geo_orthomethods_4 test case for Ubuntu 20.04 compatibility.",
"[!915](https://gitlab.com/libeigen/eigen/-/merge_requests/915): Fixed a minor issue in the AltiVec matrix product header file by adding a missing pound character, with no functional impact.",
"[!914](https://gitlab.com/libeigen/eigen/-/merge_requests/914): Fixes a flaky test in the Schur decomposition test suite by disabling the non-convergence check, addressing issue #2458 and reducing test instability.",
"[!911](https://gitlab.com/libeigen/eigen/-/merge_requests/911): Fixed a mixup in the SVD implementation's row major bit handling, correcting the logic for `RowMajorBit` and `RowMajor` to ensure proper matrix layout behavior.",
"[!910](https://gitlab.com/libeigen/eigen/-/merge_requests/910): Reverted changes to PowerPC MMA flags in AltiVec matrix product header files to restore compatibility and resolve previous build configuration issues.",
"[!908](https://gitlab.com/libeigen/eigen/-/merge_requests/908): Fixes a reference code issue in the STL interface header file (STL_interface.hh) within the Eigen benchmarking library's STL module.",
"[!902](https://gitlab.com/libeigen/eigen/-/merge_requests/902): Disabled aarch64 CI builds in GitLab configuration files to address temporary machine downtime, reducing unnecessary build attempts.",
"[!901](https://gitlab.com/libeigen/eigen/-/merge_requests/901): Fixed compilation issues with `construct_at` and `destroy_at` functions in Eigen's Core utility header to improve compatibility with ROCm hardware platforms.",
"[!900](https://gitlab.com/libeigen/eigen/-/merge_requests/900): Fixed the swap test for size 1 matrices in Eigen's test suite, resolving sporadic assertion failures by ensuring proper handling of single-element matrix swapping.",
"[!882](https://gitlab.com/libeigen/eigen/-/merge_requests/882): Fixed SVD implementation compatibility issues for MSVC+CUDA by modifying Memory.h, BDCSVD.h, and JacobiSVD.h to resolve type definition and function return warnings.",
"[!880](https://gitlab.com/libeigen/eigen/-/merge_requests/880): Fixed SVD computations for Microsoft Visual C++ (MSVC) by correcting the handling of the Options template parameter, resolving issues with SVD calculations and warnings.",
"[!878](https://gitlab.com/libeigen/eigen/-/merge_requests/878): Fixed packetmath test cases for MSVC by addressing incorrect `frexp` behavior with non-finite inputs, ensuring proper compiler compatibility in the Eigen library's test suite.",
"[!876](https://gitlab.com/libeigen/eigen/-/merge_requests/876): Fixed AVX512 Complex.h implementation to resolve performance and data corruption issues with g++-11 compiler, specifically removing the problematic `_mm512_broadcast_f64x2` instruction.",
"[!875](https://gitlab.com/libeigen/eigen/-/merge_requests/875): Fixed a compilation error in the packetmath module by adding a wrapper struct to allow passing overloaded functions as functors in the test/packetmath.cpp file.",
"[!874](https://gitlab.com/libeigen/eigen/-/merge_requests/874): Fixed a gcc-5 specific bug in the packetmath test suite by modifying memory initialization to resolve incorrect value generation during optimization levels -O2 and higher.",
"[!870](https://gitlab.com/libeigen/eigen/-/merge_requests/870): Fixes test macro conflicts in the test/main.h header to resolve potential compilation issues with C++20 standard headers.",
"[!866](https://gitlab.com/libeigen/eigen/-/merge_requests/866): Fixes a potential crash bug in SuiteSparseQRSupport by initializing pointers to nullptr, preventing invalid memory free operations in the destructor.",
"[!859](https://gitlab.com/libeigen/eigen/-/merge_requests/859): Fixes a compiler compatibility issue in the Eigen core utility header `DisableStupidWarnings.h` by adding support for Microsoft-specific `__pragma` to resolve MSVC+NVCC 9.2 pragma errors.",
"[!858](https://gitlab.com/libeigen/eigen/-/merge_requests/858): Fixed NEON sqrt and rsqrt implementations in Eigen's PacketMath to improve edge case handling for zero and infinity inputs, ensuring correct behavior on NEON architectures.",
"[!851](https://gitlab.com/libeigen/eigen/-/merge_requests/851): Fixed JacobiSVD_LAPACKE header to align with the latest SVD module implementation, ensuring compatibility and correct runtime options in the Eigen library's SVD functionality.",
"[!843](https://gitlab.com/libeigen/eigen/-/merge_requests/843): Fixed a namespace collision in the TriangularMatrixMatrix header by renaming local variables to avoid conflicts with resolve.h, improving library compatibility.",
"[!833](https://gitlab.com/libeigen/eigen/-/merge_requests/833): Fixed packet math functions in Eigen's core architecture for 32-bit ARM platforms by correcting type assumptions in GenericPacketMathFunctions.h and PacketMath.h to resolve int type discrepancies.",
"[!822](https://gitlab.com/libeigen/eigen/-/merge_requests/822): Fixed random number generation code in test/rand.cpp to address a potential overflow issue by adjusting the maximum value for the short offset.",
"[!815](https://gitlab.com/libeigen/eigen/-/merge_requests/815): Fixed a warning in the GEBP kernel's packing code by addressing an implicit conversion from `int` to `Index` in the `GeneralBlockPanelKernel.h` file.",
"[!812](https://gitlab.com/libeigen/eigen/-/merge_requests/812): Fixes an implicit conversion warning in the Eigen::Reverse class by explicitly casting an index from Eigen::Index to int in the vectorwise_reverse_inplace function within the Core module.",
"[!811](https://gitlab.com/libeigen/eigen/-/merge_requests/811): Fixes compilation compatibility in Eigen's Meta.h header for older GCC versions by addressing issues with std::ssize in C++2a standard, ensuring broader compiler support.",
"[!810](https://gitlab.com/libeigen/eigen/-/merge_requests/810): Fixed logistic sigmoid implementation in Eigen's core functors to handle corner cases, ensuring correct behavior for extreme input values like +Inf and inputs >= 1.",
"[!809](https://gitlab.com/libeigen/eigen/-/merge_requests/809): Fixed asserts in the IncompleteCholesky header by improving variable name checking, reducing potential runtime errors in the iterative linear solvers module.",
"[!806](https://gitlab.com/libeigen/eigen/-/merge_requests/806): Fixed an assertion message in IterativeSolverBase to correctly reference the class name, improving error message clarity and consistency in the Eigen linear solvers module.",
"[!805](https://gitlab.com/libeigen/eigen/-/merge_requests/805): Fixed an inconsistency in the `array.exp()` method, ensuring consistent behavior between scalar and vectorized implementation paths in Eigen's core array operations.",
"[!802](https://gitlab.com/libeigen/eigen/-/merge_requests/802): Fixes a truncation bug in Eigen's CoreEvaluators.h, ensuring unsigned integers are properly truncated instead of being incorrectly converted to boolean values.",
"[!801](https://gitlab.com/libeigen/eigen/-/merge_requests/801): Fixed numeric limits and AVX `psqrt` implementation for BFloat16 and Half types, correcting signaling NaN, denormal handling, and improving floating-point type consistency.",
"[!800](https://gitlab.com/libeigen/eigen/-/merge_requests/800): Fixed GPU unit tests in test/gpu_test_helper.h by addressing serialization API changes, resolving issues specifically for HIP testing infrastructure.",
"[!794](https://gitlab.com/libeigen/eigen/-/merge_requests/794): Fixed header guard conflicts in ZVector/Complex.h and MathFunctions.h by replacing duplicated AltiVec header guards, resolving potential inclusion issues between architectures.",
"[!789](https://gitlab.com/libeigen/eigen/-/merge_requests/789): Fixed ConfigureVectorization.h to conditionally include immintrin.h when F16C is available, preventing header inclusion failures when vectorization is disabled.",
"[!785](https://gitlab.com/libeigen/eigen/-/merge_requests/785): Fixed Clang warnings in Eigen's SSE and generic packet math functions by addressing alignment and floating-point precision issues in compiler-specific code.",
"[!782](https://gitlab.com/libeigen/eigen/-/merge_requests/782): Fixed a bug in the EIGEN_IMPLIES macro within the BLAS module, addressing side-effect short-circuiting issues in matrix vector operations by modifying the PackedTriangularMatrixVector.h header.",
"[!769](https://gitlab.com/libeigen/eigen/-/merge_requests/769): Fixed an error in the SPQRSupport module by correctly including Eigen headers, resolving a build-time header inclusion issue.",
"[!767](https://gitlab.com/libeigen/eigen/-/merge_requests/767): Fixed exp() function behavior in vectorized expressions to ensure `-Inf` inputs return zero, modifying the default packet math functions in Eigen's core library.",
"[!749](https://gitlab.com/libeigen/eigen/-/merge_requests/749): Reverted changes in the SVD module to restore compatibility with third-party libraries, rolling back previous modifications to SVD-related headers and implementation files.",
"[!746](https://gitlab.com/libeigen/eigen/-/merge_requests/746): Fixed Cholesky decomposition (LLT) implementation to handle zero-sized matrices correctly, ensuring proper behavior when using Lapacke with edge case matrix sizes.",
"[!745](https://gitlab.com/libeigen/eigen/-/merge_requests/745): Fixed HIP compilation issues in Eigen's SelfAdjointView and TriangularMatrix classes, addressing backend compatibility problems in the HIP support.",
"[!741](https://gitlab.com/libeigen/eigen/-/merge_requests/741): Fixed HIP compilation issues in DenseBase by adding `EIGEN_DEVICE_FUNC` modifiers to ensure compatibility with device code in the Eigen Core module.",
"[!730](https://gitlab.com/libeigen/eigen/-/merge_requests/730): Fixed indexed views for non-Eigen types by addressing stride computation issues in the IndexedView traits class. Improved robustness of stride calculation to prevent potential signed integer overflow.",
"[!720](https://gitlab.com/libeigen/eigen/-/merge_requests/720): Fixed a typo in the Eigen/src/Core/util/Memory.h utility header, addressing a minor textual error in the source code.",
"[!719](https://gitlab.com/libeigen/eigen/-/merge_requests/719): Fixed sparse-sparse product calculation in Eigen's sparse matrix operations by correcting storage index handling in ConservativeSparseSparseProduct, ensuring proper index compatibility across different storage index types.",
"[!714](https://gitlab.com/libeigen/eigen/-/merge_requests/714): Fixed an issue in the nestbyvalue test by initializing matrices to Random() to prevent uninitialized matrix and NaN value problems across different systems.",
"[!711](https://gitlab.com/libeigen/eigen/-/merge_requests/711): Fixed a configuration issue in Eigen's ConfigureVectorization.h by correcting the condition for defining EIGEN_HAS_FP16_C, ensuring proper compiler support across different compiler types.",
"[!707](https://gitlab.com/libeigen/eigen/-/merge_requests/707): Fixed a total deflation issue in the BDCSVD implementation when the input matrix is already diagonal, improving the robustness of the SVD decomposition algorithm.",
"[!703](https://gitlab.com/libeigen/eigen/-/merge_requests/703): Fixes min/max scalar operations in ArrayCwiseBinaryOps.h to improve NaN propagation behavior, addressing inconsistent handling of NaN values in scalar operations.",
"[!696](https://gitlab.com/libeigen/eigen/-/merge_requests/696): Fixed visitor return type in Eigen/src/Core/Visitor.h by removing `const` to resolve compatibility issues with `pload`/`ploadu` on arm/ppc architectures.",
"[!695](https://gitlab.com/libeigen/eigen/-/merge_requests/695): Fixed the boostmultiprec test compilation compatibility by modifying the test file to resolve symbol redefinition issues with older Boost versions.",
"[!694](https://gitlab.com/libeigen/eigen/-/merge_requests/694): Fixed ZVector architecture build configuration in Complex.h and PacketMath.h to resolve cross-compilation issues for s390x-linux-gnu-g++ environments, enabling packetmath tests to pass.",
"[!666](https://gitlab.com/libeigen/eigen/-/merge_requests/666): Fixed the `EIGEN_INHERIT_ASSIGNMENT_EQUAL_OPERATOR` macro in Eigen's core utilities to resolve compilation issues with MSVC and NVCC, specifically addressing build problems for Visual Studio 2017.",
"[!665](https://gitlab.com/libeigen/eigen/-/merge_requests/665): Fixed compilation issues in Eigen's GPU tuple implementation for Visual Studio 2017 by modifying type deduction in the Tuple.h header and corresponding test file.",
"[!660](https://gitlab.com/libeigen/eigen/-/merge_requests/660): Fixed typos across multiple Eigen library components, including core modules, architecture-specific headers, and documentation files, to improve code readability and consistency.",
"[!659](https://gitlab.com/libeigen/eigen/-/merge_requests/659): Fixes BFloat16 type handling in Eigen's default architecture by correcting alias violation issues in the BFloat16 conversion mechanism, improving stability and preventing undefined behavior.",
"[!656](https://gitlab.com/libeigen/eigen/-/merge_requests/656): Fixed strict aliasing issues in Eigen's Complex.h implementations for AVX, AVX512, and SSE architectures, resolving matrix multiplication test failures caused by packet loading problems.",
"[!643](https://gitlab.com/libeigen/eigen/-/merge_requests/643): Fixes a compilation error in the GPU test helper header for HIP, addressing a specific build issue in the test infrastructure.",
"[!639](https://gitlab.com/libeigen/eigen/-/merge_requests/639): Fixed AVX2 packet math implementation in PacketMath.h by correcting typos and addressing unaligned load issues, improving code quality and correctness for AVX2 operations.",
"[!630](https://gitlab.com/libeigen/eigen/-/merge_requests/630): Fixed AVX and AVX512 integer packet operations in Eigen's PacketMath headers by adding missing protection macros and correcting implementation details for improved compatibility and correctness.",
"[!629](https://gitlab.com/libeigen/eigen/-/merge_requests/629): Fixes Macros.h in Eigen's Core utility to add \"g\" constraint support for arm-clang, resolving compilation issues with optimization barriers.",
"[!621](https://gitlab.com/libeigen/eigen/-/merge_requests/621): Fixed GCC 4.8 compilation issue in Eigen's NEON architecture code by replacing the `g` register constraint with the `r` constraint, improving compatibility with armv7 architectures.",
"[!615](https://gitlab.com/libeigen/eigen/-/merge_requests/615): Fixed Windows ARM compilation in Eigen/Core by including the `intrin.h` header to support `BitScanReverse` and `BitScanReverse64` functions, resolving build errors on ARM Windows systems.",
"[!613](https://gitlab.com/libeigen/eigen/-/merge_requests/613): Fixes `Eigen::fix<N>` and symbolic index functionality by correcting variable template support detection and handling in IntegralConstant.h and symbolic_index.cpp.",
"[!604](https://gitlab.com/libeigen/eigen/-/merge_requests/604): Fixes a bug in Eigen's MathFunctions.h for Visual Studio 2017, addressing an incorrect `std::arg` implementation for negative real numbers to ensure proper mathematical function behavior.",
"[!591](https://gitlab.com/libeigen/eigen/-/merge_requests/591): Fixed AltiVec architecture support in Eigen's Complex and PacketMath headers to resolve compiler compatibility issues with older Power compilers (GCC 7.5 and Clang 10), addressing missing vector functions and pointer handling.",
"[!569](https://gitlab.com/libeigen/eigen/-/merge_requests/569): Fixed compatibility issues in Eigen's Macros.h header for MSVC+NVCC configurations, addressing assignment operator definition problems that were causing compilation errors.",
"[!565](https://gitlab.com/libeigen/eigen/-/merge_requests/565): Fixed a compilation error in JacobiSVD with HouseholderQRPreconditioner for row-vector inputs, ensuring consistent behavior across preconditioners in the SVD module.",
"[!562](https://gitlab.com/libeigen/eigen/-/merge_requests/562): Fixed scalar operations in Eigen's core packet math utilities to prevent undefined behavior by modifying initialization and bitwise operations for different scalar types.",
"[!554](https://gitlab.com/libeigen/eigen/-/merge_requests/554): Fixes an in-place matrix inversion issue in the LU module by creating a copy of the input matrix before performing the inverse operation, preventing unintended matrix modifications.",
"[!551](https://gitlab.com/libeigen/eigen/-/merge_requests/551): Fixed a bug in the ConjHelper.h component that resolved issues with complex conjugation when using custom types, addressing compatibility problems introduced by a previous commit.",
"[!549](https://gitlab.com/libeigen/eigen/-/merge_requests/549): Fixed memory access issues in PartialPivLU inverse computation by adding checks to prevent nullptr dereferencing and out-of-bounds memory access for empty or single-column matrices.",
"[!539](https://gitlab.com/libeigen/eigen/-/merge_requests/539): Fixed conjhelper functionality in Eigen's Default architecture header for AMD HIP GPUs, addressing a specific compatibility issue in the ConjHelper implementation.",
"[!530](https://gitlab.com/libeigen/eigen/-/merge_requests/530): Fixed an issue in Eigen's IntegralConstant.h header for GCC 4.9.3, addressing a compiler compatibility problem with template template parameters in C++14.",
"[!517](https://gitlab.com/libeigen/eigen/-/merge_requests/517): Fixes version parsing in CMake configuration for nvhpc compiler by stripping leading whitespace from version strings in EigenTesting.cmake, ensuring correct compiler version detection.",
"[!506](https://gitlab.com/libeigen/eigen/-/merge_requests/506): Fixes a potential memory issue in the `conservativeResize()` method by modifying the implementation in `PlainObjectBase.h` to correctly handle object type and alignment during resizing operations.",
"[!503](https://gitlab.com/libeigen/eigen/-/merge_requests/503): Fixed the `inverse_4x4` test matrices in `test/prec_inverse_4x4.cpp` to ensure all generated matrices are invertible, addressing an issue with matrix inversion tests.",
"[!499](https://gitlab.com/libeigen/eigen/-/merge_requests/499): Fixed a compiler compatibility issue in Eigen's Core module by correcting the inclusion order of `numext::imag` and `numext::real` functions in `MathFunctions.h`, resolving build errors with HIP compiler.",
"[!498](https://gitlab.com/libeigen/eigen/-/merge_requests/498): Fixes SSE complex packet storage in Eigen's Complex.h header by aligning the storage format with wrapper variables, resolving a potential mixing issue in the SSE architecture implementation.",
"[!495](https://gitlab.com/libeigen/eigen/-/merge_requests/495): Fixed the return type of `numext::arg` function in Eigen's Core module to correctly return the real type instead of complex, resolving compile-time type issues.",
"[!494](https://gitlab.com/libeigen/eigen/-/merge_requests/494): Fixed the `conj` function in Eigen's Core module to restore ABI compatibility with Boost, by reintroducing the second template parameter and addressing potential conflicts with custom complex scalar types.",
"[!483](https://gitlab.com/libeigen/eigen/-/merge_requests/483): Fixed AVX512 implementation of `pcmp_lt_or_nan` in Eigen's PacketMath header, addressing a bug and adding corresponding test cases to improve function reliability.",
"[!474](https://gitlab.com/libeigen/eigen/-/merge_requests/474): Fixed compiler warnings in AltiVec matrix product headers by addressing rvalue template address-taking issues, specifically in MatrixProduct.h and MatrixProductMMA.h to improve TensorFlow compatibility.",
"[!472](https://gitlab.com/libeigen/eigen/-/merge_requests/472): Fixed compilation errors in basicbenchmark module by replacing deprecated functions with their current equivalents, ensuring code compatibility and resolving build issues.",
"[!469](https://gitlab.com/libeigen/eigen/-/merge_requests/469): Fixed the AVX512 implementation of the ldexp function by correcting shuffle operations in PacketMath.h, improving the performance and correctness of interleaving the low and high data halves."
],
"other_removed": [
"[!1915](https://gitlab.com/libeigen/eigen/-/merge_requests/1915): Removes the AArch64 Ampere runner from Eigen's CI configuration by modifying Linux build and test GitLab CI configuration files to switch to alternative GitLab runners for ARM architecture.",
"[!1642](https://gitlab.com/libeigen/eigen/-/merge_requests/1642): Reverted a change in the GenericPacketMath.h file, specifically removing a previous fix related to scalar pselect functionality.",
"[!1561](https://gitlab.com/libeigen/eigen/-/merge_requests/1561): Removed the `extern \"C\"` linkage specification from the CholmodSupport module to improve code consistency and reduce unnecessary code.",
"[!1498](https://gitlab.com/libeigen/eigen/-/merge_requests/1498): Removed duplicate complex conjugate functions (`r_cnjg` and `d_cnjg`) from the f2c BLAS implementation to eliminate symbol conflicts and reduce external dependencies.",
"[!1477](https://gitlab.com/libeigen/eigen/-/merge_requests/1477): Removed the relicense script from the Eigen library's scripts directory, eliminating a redundant file from the project's codebase.",
"[!1353](https://gitlab.com/libeigen/eigen/-/merge_requests/1353): Removed a deprecated function call in the SVD test suite, cleaning up unnecessary code in the test/svd_common.h file.",
"[!1306](https://gitlab.com/libeigen/eigen/-/merge_requests/1306): Removed unused `HasHalfPacket` enum from several Eigen core architecture header files, reducing code complexity and improving clarity in AVX512, SYCL, and generic packet math implementations.",
"[!1266](https://gitlab.com/libeigen/eigen/-/merge_requests/1266): Removes pool creation logic in CMakeLists.txt when CMake version is less than 3.11, simplifying build configuration and improving compatibility with older project environments.",
"[!1230](https://gitlab.com/libeigen/eigen/-/merge_requests/1230): Removed obsolete AVX512 workarounds and redundant intrinsics across multiple AVX512-related header files in Eigen's core architecture, simplifying code complexity and maintaining compiler compatibility.",
"[!1212](https://gitlab.com/libeigen/eigen/-/merge_requests/1212): Removes BF16 to F32 conversion support in the Power architecture's AltiVec matrix product implementation, specifically modifying the MatrixProductMMAbfloat16.h header file.",
"[!1200](https://gitlab.com/libeigen/eigen/-/merge_requests/1200): Removed custom `equal_to` and `not_equal_to` functors in Eigen's Core module, replacing them with standard C++14 comparison operators to simplify and modernize the codebase.",
"[!1197](https://gitlab.com/libeigen/eigen/-/merge_requests/1197): Removes LGPL-licensed code and references from Eigen's codebase, simplifying the library's licensing by deleting non-MPL2 components and modifying related files.",
"[!1188](https://gitlab.com/libeigen/eigen/-/merge_requests/1188): Reverted changes to the StlIterators.h file, undoing a previous modification that was discussed in a prior merge request, potentially impacting code that relied on the previous implementation.",
"[!1109](https://gitlab.com/libeigen/eigen/-/merge_requests/1109): Removed an assert in SparseMapBase within the SparseMap.h file to improve map construction flexibility and reduce unnecessary validity checks for sparse matrices.",
"[!1092](https://gitlab.com/libeigen/eigen/-/merge_requests/1092): Removed deprecated mathematical constants M_PI_2 and M_PI_4 from Eigen's core header files, improving code clarity and reducing reliance on deprecated definitions.",
"[!1074](https://gitlab.com/libeigen/eigen/-/merge_requests/1074): Reverted constexpr support modifications across multiple Eigen core headers, removing previously added C++14 constexpr test and related implementation changes in the Core module.",
"[!1069](https://gitlab.com/libeigen/eigen/-/merge_requests/1069): Removed a problematic test case in the skew_symmetric_matrix3 test suite that was causing memory sanitizer (MSAN) errors due to uninitialized matrices.",
"[!946](https://gitlab.com/libeigen/eigen/-/merge_requests/946): Removed deprecated empty struct workaround in multiple Eigen core functors and utility files, modernizing the library's implementation and improving compatibility with current C++ standards.",
"[!909](https://gitlab.com/libeigen/eigen/-/merge_requests/909): Removed deprecated GCC-4 warning workarounds from Eigen's core utility files, specifically targeting Meta.h, Dot.h, and SparseBlock.h to improve code clarity and maintainability.",
"[!897](https://gitlab.com/libeigen/eigen/-/merge_requests/897): Removed an outdated gcc 4.3 compiler workaround in Eigen's Macros.h and main.h files, improving compatibility with modern compilers by eliminating unnecessary legacy code.",
"[!896](https://gitlab.com/libeigen/eigen/-/merge_requests/896): Removed ComputeCpp-specific code from the SYCL Vptr implementation in the Eigen Core module, replacing it with standard SYCL buffer reinterpret functionality to improve portability.",
"[!855](https://gitlab.com/libeigen/eigen/-/merge_requests/855): Removed unused macros from the AVX implementation in Eigen's PacketMath.h, cleaning up unnecessary code without affecting functionality.",
"[!830](https://gitlab.com/libeigen/eigen/-/merge_requests/830): Removed outdated C++98 documentation references across multiple Eigen core utility and documentation files, modernizing the library's documentation to reflect current C++ standards.",
"[!793](https://gitlab.com/libeigen/eigen/-/merge_requests/793): Removed the unused `EIGEN_HAS_STATIC_ARRAY_TEMPLATE` macro from the `indexed_view.cpp` test file, cleaning up unnecessary code.",
"[!772](https://gitlab.com/libeigen/eigen/-/merge_requests/772): Removed several deprecated Eigen macros related to constexpr, index list, and result_of from core utility and tensor-related files, simplifying the codebase and reducing potential macro-related complexity.",
"[!768](https://gitlab.com/libeigen/eigen/-/merge_requests/768): Removed redundant CMake find scripts for BLAS, GLEW, GSL, and LAPACK from the Eigen library's cmake directory, leveraging built-in CMake equivalents to simplify configuration.",
"[!761](https://gitlab.com/libeigen/eigen/-/merge_requests/761): Removed obsolete compiler version checks and deprecated flags across multiple Eigen core utility and configuration files, simplifying the codebase's preprocessor conditionals and compiler compatibility logic.",
"[!740](https://gitlab.com/libeigen/eigen/-/merge_requests/740): Removed the redundant `nonZeros()` method from `DenseBase` in the Eigen core module, simplifying the codebase by eliminating a method that merely called `size()`.",
"[!739](https://gitlab.com/libeigen/eigen/-/merge_requests/739): Removes GCC-4.8 test configurations from Eigen's GitLab CI files, reducing dependency on the older compiler version and potentially lowering the minimum C++ standard requirement.",
"[!735](https://gitlab.com/libeigen/eigen/-/merge_requests/735): Removed legacy C++11 feature preprocessor checks across multiple Eigen core and utility files, simplifying conditional compilation and reducing code complexity for compiler compatibility.",
"[!732](https://gitlab.com/libeigen/eigen/-/merge_requests/732): Removes the `EIGEN_HAS_CXX11` macro and associated conditional code across multiple Eigen core utility and test files, simplifying the codebase by eliminating version-specific checks for older C++ standards.",
"[!725](https://gitlab.com/libeigen/eigen/-/merge_requests/725): Removed deprecated MappedSparseMatrix type from Eigen's SparseCore module, eliminating internal references and deleting the associated header file to clean up the codebase.",
"[!636](https://gitlab.com/libeigen/eigen/-/merge_requests/636): Removed deprecated references to DynamicSparseMatrix across Eigen's sparse core files, cleaning up the codebase without changing functionality.",
"[!632](https://gitlab.com/libeigen/eigen/-/merge_requests/632): Removed unused `EIGEN_DEFINITIONS` from CMakeLists.txt, simplifying the build configuration by eliminating unnecessary interface definitions.",
"[!608](https://gitlab.com/libeigen/eigen/-/merge_requests/608): Removed CI jobs that built with C++11 support disabled from the Eigen library's build configuration, simplifying the continuous integration pipeline and transitioning towards modern C++ standards.",
"[!601](https://gitlab.com/libeigen/eigen/-/merge_requests/601): Removed unaligned assert tests from Eigen's geometry-related test files, cleaning up unnecessary memory access assertions and reducing test file complexity.",
"[!538](https://gitlab.com/libeigen/eigen/-/merge_requests/538): Removed unused macros `EIGEN_HAS_SINGLE_INSTRUCTION_CJMADD` and `CJMADD` from multiple architecture-specific PacketMath header files, cleaning up code across different processor architecture implementations.",
"[!492](https://gitlab.com/libeigen/eigen/-/merge_requests/492): Removed unused `paddsub<Packet2cf>` function from Eigen's NEON and SSE complex packet implementations, simplifying the codebase and addressing issue #2242."
],
"other_added": [
"[!1914](https://gitlab.com/libeigen/eigen/-/merge_requests/1914): Adds a macro `EIGEN_DISABLE_ALLOCA` in the Memory.h utility header to provide users with an option to explicitly disable the use of `alloca` in Eigen code, improving library configuration flexibility.",
"[!1905](https://gitlab.com/libeigen/eigen/-/merge_requests/1905): Added a CHANGELOG.md file to the Eigen project, improving project documentation by translating wiki content to markdown and fixing formatting.",
"[!1896](https://gitlab.com/libeigen/eigen/-/merge_requests/1896): Adds new factory functions and accessor methods to the Quaternion class, providing explicit support for both scalar-first and scalar-last quaternion coefficient orderings to improve code clarity and interoperability.",
"[!1805](https://gitlab.com/libeigen/eigen/-/merge_requests/1805): Added new methods `matrixL()` and `matrixU()` to the IncompleteLUT sparse matrix component, enabling extraction of lower and upper triangular factors for improved matrix manipulation and testing capabilities.",
"[!1791](https://gitlab.com/libeigen/eigen/-/merge_requests/1791): Adds a ForkJoin-based ParallelFor algorithm to the Eigen ThreadPool module, enabling parallel execution of functions with improved task assignment logic and supporting unary and binary function parallelization.",
"[!1777](https://gitlab.com/libeigen/eigen/-/merge_requests/1777): Added LoongArch64 LSX architecture support to Eigen by introducing new architecture-specific header files and modifying core configuration files to enable vectorization and build compatibility.",
"[!1758](https://gitlab.com/libeigen/eigen/-/merge_requests/1758): Added a test case for the `pcast` function in `packetmath.cpp` to verify scalar input handling and ensure correct implementation of the packet casting functionality.",
"[!1715](https://gitlab.com/libeigen/eigen/-/merge_requests/1715): Added exp2() function to Eigen's core math operations, implementing a high-accuracy packet and array method across multiple architectures using the TwoProd algorithm.",
"[!1714](https://gitlab.com/libeigen/eigen/-/merge_requests/1714): Added `nextafter` implementation for bfloat16 type in Eigen's architecture-specific header, extending floating-point operation support for this data type.",
"[!1704](https://gitlab.com/libeigen/eigen/-/merge_requests/1704): Added free-function `swap` to Eigen's dense and sparse matrix classes, enabling easier matrix swapping in C++ algorithms. This implementation resolves issue #2853 and improves matrix manipulation flexibility.",
"[!1682](https://gitlab.com/libeigen/eigen/-/merge_requests/1682): Adds support for the nvc++ compiler by configuring compiler macros and fixing ARM NEON intrinsics in Eigen's core architecture and CMake testing scripts.",
"[!1669](https://gitlab.com/libeigen/eigen/-/merge_requests/1669): Adds NEON complex intrinsics support to Eigen's ARM architecture-specific files, enhancing complex arithmetic operations for NEON-enabled platforms by modifying Core architecture headers.",
"[!1612](https://gitlab.com/libeigen/eigen/-/merge_requests/1612): Added bit shifting functions to Eigen's numext namespace, introducing scalar bit shift operators for logical and arithmetic shifts across integer types.",
"[!1560](https://gitlab.com/libeigen/eigen/-/merge_requests/1560): Added cwiseSquare operation to Eigen's matrix operations, implementing a new element-wise squaring function and accompanying test cases to validate the implementation.",
"[!1554](https://gitlab.com/libeigen/eigen/-/merge_requests/1554): Added support for complex symmetric matrices in Eigen's sparse matrix solvers by introducing SimplicialNonHermitianLLT and SimplicialNonHermitianLDLT, enabling non-conjugate transpose views for sparse matrix computations.",
"[!1501](https://gitlab.com/libeigen/eigen/-/merge_requests/1501): Adds vectorized complex exponential function (`pexp_complex`) for float in multiple SIMD architectures, extending Eigen's packet math capabilities across AVX, AVX512, Altivec, NEON, SSE, and ZVector platforms.",
"[!1471](https://gitlab.com/libeigen/eigen/-/merge_requests/1471): Added new LAPACK CPU time measurement files (dsecnd_INT_CPU_TIME.cpp and second_INT_CPU_TIME.cpp) to support time-related functions in the LAPACK module, following LAPACK naming conventions.",
"[!1455](https://gitlab.com/libeigen/eigen/-/merge_requests/1455): Added test support for ROCm MI300 hardware variants in the Eigen testing infrastructure by modifying the CMake testing configuration.",
"[!1445](https://gitlab.com/libeigen/eigen/-/merge_requests/1445): Added getter methods for L, L\u1d40, and D factors in Cholmod LLT/LDLT solvers, exposing these matrices to improve solver usability and provide more flexibility for users working with matrix decompositions.",
"[!1436](https://gitlab.com/libeigen/eigen/-/merge_requests/1436): Added internal count trailing/leading zero (ctz/clz) implementations to Eigen's MathFunctions.h, enhancing functionality for random number generation and pointer alignment detection.",
"[!1430](https://gitlab.com/libeigen/eigen/-/merge_requests/1430): Added a .git-blame-ignore-revs configuration file to the Eigen repository, which helps control git blame behavior for revision tracking.",
"[!1403](https://gitlab.com/libeigen/eigen/-/merge_requests/1403): Adds component-wise cube root (`cbrt`) functionality to Eigen's array and matrix operations, implementing scalar and MKL-supported operations with comprehensive documentation and test coverage.",
"[!1383](https://gitlab.com/libeigen/eigen/-/merge_requests/1383): Added a temporary macro in MapBase.h to allow unaligned scalar usage in Eigen, addressing TFLite-related compatibility issues and enabling continued development.",
"[!1375](https://gitlab.com/libeigen/eigen/-/merge_requests/1375): Added architecture definition files for Qualcomm Hexagon Vector Extension (HVX) in Eigen's core vectorization support, introducing new HVX-specific header files and build flag definitions for Hexagon DSP targets.",
"[!1352](https://gitlab.com/libeigen/eigen/-/merge_requests/1352): Added rounding functions (rint, round, floor, ceil) to Eigen's unary functors in Core module, enhancing numerical computation capabilities by providing standard mathematical rounding operations.",
"[!1345](https://gitlab.com/libeigen/eigen/-/merge_requests/1345): Adds a new quaternion constructor in Eigen's Geometry module that allows direct creation from a real scalar and 3D imaginary vector, simplifying quaternion initialization for mathematical transformations.",
"[!1335](https://gitlab.com/libeigen/eigen/-/merge_requests/1335): Adds new methods `removeOuterVectors()` and `insertEmptyOuterVectors()` to Eigen's SparseMatrix class, enabling more flexible manipulation of sparse matrix columns and rows while handling edge cases related to outer vector indexing.",
"[!1331](https://gitlab.com/libeigen/eigen/-/merge_requests/1331): Added SYCL testing support to Eigen core by introducing a new test configuration file and a basic SYCL functionality test, enhancing the library's testing infrastructure for SYCL compatibility.",
"[!1309](https://gitlab.com/libeigen/eigen/-/merge_requests/1309): Added `Abs2` method to the `Packet4ul` class in the AVX implementation, enabling efficient computation of squared absolute values for 4-element unsigned long vectors.",
"[!1299](https://gitlab.com/libeigen/eigen/-/merge_requests/1299): Added BF16 packet casting functions to AltiVec's PacketMath by introducing a new TypeCasting.h header and enhancing type casting support for BF16 types.",
"[!1297](https://gitlab.com/libeigen/eigen/-/merge_requests/1297): Added support for unsigned integer packet types (`Packet4ui`, `Packet8ui`, and `Packet4ul`) in Eigen's SSE and AVX vectorization headers, extending the library's vectorization capabilities for unsigned integer operations.",
"[!1250](https://gitlab.com/libeigen/eigen/-/merge_requests/1250): Adds support for the Less operation in Eigen's cwise binary operations by modifying the MatrixCwiseBinaryOps.h plugin and expanding test coverage in array_cwise.cpp.",
"[!1211](https://gitlab.com/libeigen/eigen/-/merge_requests/1211): Adds a new `CArg` function to Eigen's core complex number operations, introducing vectorized complex argument (phase-angle) calculations across multiple Eigen core files.",
"[!1159](https://gitlab.com/libeigen/eigen/-/merge_requests/1159): Added missing header file `test/gpu_test_helper.h` to resolve GPU test case compilation issues, ensuring proper header inclusion for GPU testing infrastructure.",
"[!1133](https://gitlab.com/libeigen/eigen/-/merge_requests/1133): Adds a new `setEqualSpaced` function to Eigen's core linear spacing functionality, providing a more intuitive and efficient method for generating equally spaced vectors across different numeric types.",
"[!1129](https://gitlab.com/libeigen/eigen/-/merge_requests/1129): Added BDCSVD LAPACKE binding to Eigen's SVD module, extending SVD functionality by implementing LAPACKE-based computations with support for various SVD variant combinations.",
"[!1121](https://gitlab.com/libeigen/eigen/-/merge_requests/1121): Adds serialization support for sparse matrices and vectors in Eigen, enabling easier data persistence and reproduction of sparse solver issues by modifying core sparse data structure headers.",
"[!1103](https://gitlab.com/libeigen/eigen/-/merge_requests/1103): Adds a new CompressedStorageIterator to enable sorting of inner vectors in sparse matrices, improving performance and flexibility for sparse matrix operations.",
"[!1098](https://gitlab.com/libeigen/eigen/-/merge_requests/1098): Adds cross product functionality for 2D vectors in Eigen's geometry module, implementing a new method to compute cross products for 2-dimensional vector types and improving related documentation.",
"[!1097](https://gitlab.com/libeigen/eigen/-/merge_requests/1097): Adds signbit function support across multiple Eigen architecture-specific packet math implementations, enhancing floating-point operation performance with efficient sign bit calculation for various SIMD instruction sets.",
"[!1082](https://gitlab.com/libeigen/eigen/-/merge_requests/1082): Adds vectorized support for atan2 operations in Eigen's core library, implementing global functions and array syntax with significant performance improvements for large arrays.",
"[!1076](https://gitlab.com/libeigen/eigen/-/merge_requests/1076): Adds vectorized integer division support for int32 data types across AVX512, AVX, and SSE architectures, enhancing performance and error handling for integer division operations in Eigen's packet math implementations.",
"[!1073](https://gitlab.com/libeigen/eigen/-/merge_requests/1073): Adds AVX implementation for int32_t pdiv function in PacketMath.h, enabling vectorized integer division with improved performance for AVX-enabled environments.",
"[!1047](https://gitlab.com/libeigen/eigen/-/merge_requests/1047): Adds a new SkewSymmetricMatrix3 class to the Eigen Core module, implementing skew symmetric matrix functionality for Vector3 with support for Rodrigues' rotation formula and accompanying test cases.",
"[!1029](https://gitlab.com/libeigen/eigen/-/merge_requests/1029): Adds a new unary power operation for Eigen arrays, implementing efficient integer exponent calculations and leveraging existing vectorized power routines for non-integer exponents.",
"[!1008](https://gitlab.com/libeigen/eigen/-/merge_requests/1008): Adds Power10 (AltiVec) MMA instruction support for bfloat16 matrix operations in Eigen's AltiVec architecture, implementing rank-2 update operations and improving performance for bfloat16 computations on Power10 hardware.",
"[!1004](https://gitlab.com/libeigen/eigen/-/merge_requests/1004): Added determinant calculation methods to various QR decomposition classes in Eigen, enabling true determinant computation for HouseholderQR, ColPivHouseholderQR, FullPivHouseholderQR, and CompleteOrthogonalDecomposition implementations.",
"[!990](https://gitlab.com/libeigen/eigen/-/merge_requests/990): Adds support for diagonal matrix multiplication and static initializers in DiagonalMatrix, enhancing matrix operation capabilities and usability for Eigen users.",
"[!965](https://gitlab.com/libeigen/eigen/-/merge_requests/965): Added PowerPC-specific fused multiply functions (`pmsub`, `pnmadd`, and `pnmsub`) to the AltiVec architecture implementation in Eigen's PacketMath header, enhancing performance for PowerPC-based systems.",
"[!940](https://gitlab.com/libeigen/eigen/-/merge_requests/940): Restored std::remove* aliases in Eigen's Meta.h header to maintain compatibility with third-party libraries that depend on these standard library type traits.",
"[!893](https://gitlab.com/libeigen/eigen/-/merge_requests/893): Adds new CMake build configuration options for controlling BLAS, LAPACK, and CMake package components, enabling more granular build customization for Eigen library users.",
"[!817](https://gitlab.com/libeigen/eigen/-/merge_requests/817): Adds support for 64-bit integer packet operations on x86 architectures by modifying AVX and AVX512 PacketMath header files, enhancing integer computation capabilities for x86 platforms.",
"[!791](https://gitlab.com/libeigen/eigen/-/merge_requests/791): Added compiler support for Cray, Fujitsu, and Intel ICX compilers in Eigen's core utility files, extending compiler detection macros and compatibility.",
"[!652](https://gitlab.com/libeigen/eigen/-/merge_requests/652): Added a CMake macro `EIGEN_CTEST_ARGS` to enhance parallel test execution configuration in the Eigen testing infrastructure, allowing more flexible test argument passing via CMake.",
"[!625](https://gitlab.com/libeigen/eigen/-/merge_requests/625): Added GPU test utilities to Eigen's testing infrastructure, introducing new functions like `run_on_cpu`, `run_on_gpu`, and `run` to enhance GPU kernel execution and testing capabilities.",
"[!624](https://gitlab.com/libeigen/eigen/-/merge_requests/624): Adds a new serialization mechanism to Eigen's Core module, introducing a `Serializer<T>` class to support binary serialization and improve GPU testing capabilities.",
"[!623](https://gitlab.com/libeigen/eigen/-/merge_requests/623): Added a device-compatible Tuple implementation in the GPU architecture, enabling simplified GPU testing by introducing a new Tuple.h header and modifying Meta.h to support device-compatible tuple functionality.",
"[!505](https://gitlab.com/libeigen/eigen/-/merge_requests/505): Added test coverage for matrix transpose operation in packetmath.cpp, specifically implementing a test case for non-square kernel transpose functionality.",
"[!500](https://gitlab.com/libeigen/eigen/-/merge_requests/500): Adds a new macro in Eigen's utility headers to check for C++14 variable templates support, enhancing compile-time feature detection capabilities for developers using the library.",
"[!484](https://gitlab.com/libeigen/eigen/-/merge_requests/484): Added unit tests for complex matrix support in the SelfAdjointEigenSolver, expanding test coverage for complex matrix operations in the Eigen library's eigenvalue solver.",
"[!480](https://gitlab.com/libeigen/eigen/-/merge_requests/480): Added unit tests for packet math comparison operations (`pcmp_lt` and `pcmp_le`) in the `test/packetmath.cpp` file to improve test coverage for comparison operations.",
"[!473](https://gitlab.com/libeigen/eigen/-/merge_requests/473): Added AVX512 support for double packet exponential operations in Eigen's vectorized math functions, improving performance and compatibility for AVX512 architecture."
],
"major_changes": [
"[!1865](https://gitlab.com/libeigen/eigen/-/merge_requests/1865): Enhances Eigen's vectorization framework by implementing a masked load/store mechanism with packet segment support, improving performance for odd-sized arrays and expanding compatibility across different architectures.",
"[!1852](https://gitlab.com/libeigen/eigen/-/merge_requests/1852): Enhances AVX512 vectorization support by introducing native _Float16 operations, updating multiple AVX and core header files to improve performance for half-precision floating-point computations.",
"[!1830](https://gitlab.com/libeigen/eigen/-/merge_requests/1830): Enhances Eigen's core assignment functionality by adding constexpr support to assignment functors and related classes, improving compile-time evaluation capabilities across core Eigen modules.",
"[!1827](https://gitlab.com/libeigen/eigen/-/merge_requests/1827): Improves complex scalar type handling in Eigen's core and eigenvalue modules by removing assumptions about std::complex and adding support for custom complex types.",
"[!1666](https://gitlab.com/libeigen/eigen/-/merge_requests/1666): Introduces a new strongly typed matrix multiplication function in Eigen's core matrix multiplication module, optimizing performance and providing enhanced algebraic operation capabilities.",
"[!1655](https://gitlab.com/libeigen/eigen/-/merge_requests/1655): Adds a new strongly typed matrix multiplication function to Eigen's core library, implementing performance optimizations and improving matrix operation capabilities with enhanced documentation and error handling.",
"[!1654](https://gitlab.com/libeigen/eigen/-/merge_requests/1654): Introduces a new strongly typed matrix multiplication function in Eigen's core library, optimizing performance and providing enhanced algebraic operations for matrix computations.",
"[!1565](https://gitlab.com/libeigen/eigen/-/merge_requests/1565): Enhances Eigen's symbolic indexing capabilities by refactoring SymbolicIndex to support compile-time expression evaluation and simplifying indexed view handling, enabling more flexible compile-time constants in indexed expressions.",
"[!1429](https://gitlab.com/libeigen/eigen/-/merge_requests/1429): Enhances Eigen's matrix multiplication performance by adding a new strongly typed algebraic matrix multiplication function and implementing optimizations across multiple architecture-specific headers.",
"[!1414](https://gitlab.com/libeigen/eigen/-/merge_requests/1414): Introduces vectorized complex logarithm support (`plog_complex`) across multiple Eigen architecture-specific headers, enhancing complex number computational capabilities with architecture-optimized implementations.",
"[!1408](https://gitlab.com/libeigen/eigen/-/merge_requests/1408): Enables parallel GEMM implementation in Eigen's Core module to support ThreadPool, expanding parallel computation capabilities beyond OpenMP and improving portability for matrix multiplication operations.",
"[!1384](https://gitlab.com/libeigen/eigen/-/merge_requests/1384): Introduces a new strongly typed algebraic matrix multiplication function to optimize matrix operations across multiple Eigen architecture-specific implementations, enhancing performance and providing more flexible matrix computation capabilities.",
"[!1329](https://gitlab.com/libeigen/eigen/-/merge_requests/1329): Enhances the Eigen ThreadPool by introducing macros that allow users to override default synchronization primitives, providing greater flexibility for performance tuning and custom hardware implementations.",
"[!1328](https://gitlab.com/libeigen/eigen/-/merge_requests/1328): Enhances Eigen's type casting performance by introducing specialized vectorized casting evaluators across multiple architecture-specific header files, improving performance for complex casting expressions.",
"[!1314](https://gitlab.com/libeigen/eigen/-/merge_requests/1314): Introduces a new `canonicalEulerAngles` method in the Geometry module, replacing the deprecated `eulerAngles` with improved angle representation and quadrant handling for more consistent Euler angle calculations.",
"[!1296](https://gitlab.com/libeigen/eigen/-/merge_requests/1296): Enhances Power/VSX architecture support by adding dynamic dispatch for BF16 GEMM, implementing a new VSX version with significant performance improvements in vector conversions and matrix operations.",
"[!1289](https://gitlab.com/libeigen/eigen/-/merge_requests/1289): Integrates thread pool implementation from unsupported to core Eigen modules by relocating thread pool files and preparing infrastructure for broader library-wide threading support.",
"[!1281](https://gitlab.com/libeigen/eigen/-/merge_requests/1281): Enhances sparse matrix functionality by adding `insertFromTriplets` methods and optimizing `setFromTriplets` performance in Eigen's SparseCore module, improving batch insertion and memory efficiency for sparse matrix operations.",
"[!1240](https://gitlab.com/libeigen/eigen/-/merge_requests/1240): Enhances Eigen's comparison operations by introducing typed vectorized comparison methods, modifying core comparison functors to provide more flexible and performant boolean array comparisons.",
"[!1152](https://gitlab.com/libeigen/eigen/-/merge_requests/1152): Enhances QR decomposition functionality by adding template support for permutation index types in ColPivHouseholderQR, FullPivHouseholderQR, and CompleteOrthogonalDecomposition, while improving Lapacke bindings and fixing determinant sign calculation.",
"[!1147](https://gitlab.com/libeigen/eigen/-/merge_requests/1147): Enhances Eigen's sparse matrix core functionality by overhauling performance-critical methods like `setFromTriplets`, `conservativeResize`, and `insert` with optimized memory management and more efficient algorithms.",
"[!1126](https://gitlab.com/libeigen/eigen/-/merge_requests/1126): Enables SYCL-2020 support in Eigen by adding Intel DPCPP compiler compatibility, modifying core SYCL backend files to support advanced GPU acceleration and C++17 features across tensor and core library components.",
"[!1017](https://gitlab.com/libeigen/eigen/-/merge_requests/1017): Adds AVX512-FP16 support to Eigen's packet math operations, implementing vectorized half precision math with improved performance through new packet instructions and optimized type casting.",
"[!971](https://gitlab.com/libeigen/eigen/-/merge_requests/971): Adds a new strongly typed matrix multiplication function to Eigen's core matrix operations, implementing performance optimizations and improving computational efficiency for matrix calculations.",
"[!947](https://gitlab.com/libeigen/eigen/-/merge_requests/947): Enhances Eigen's packet operations by adding partial load, store, gather, and scatter functions for improved memory access and performance across multiple architectures.",
"[!860](https://gitlab.com/libeigen/eigen/-/merge_requests/860): Adds AVX512 optimizations to Eigen's matrix multiplication kernels, implementing performance improvements for single and double precision matrix operations with reduced register pressure and enhanced packet math support.",
"[!856](https://gitlab.com/libeigen/eigen/-/merge_requests/856): Adds support for Apple's Accelerate sparse matrix solvers in Eigen, implementing wrappers for LLT, LDLT, and QR solvers to improve performance for large sparse linear systems on Apple platforms.",
"[!829](https://gitlab.com/libeigen/eigen/-/merge_requests/829): Enhances Eigen's matrix multiplication performance by introducing a new strongly typed algebraic matrix multiplication function and implementing internal optimizations across multiple core library components.",
"[!631](https://gitlab.com/libeigen/eigen/-/merge_requests/631): Enhances matrix multiplication performance by adding a new strongly typed algebraic matrix multiplication function and implementing optimizations across multiple Eigen architecture-specific headers.",
"[!546](https://gitlab.com/libeigen/eigen/-/merge_requests/546): Implements a vectorized Smith's algorithm for complex division across multiple instruction sets, significantly improving performance and numerical stability in Eigen's complex arithmetic operations."
],
"breaking_changes": [
"[!1795](https://gitlab.com/libeigen/eigen/-/merge_requests/1795): Modifies the `Eigen::aligned_allocator` to remove inheritance from `std::allocator`, fixing a bug in the `allocate_at_least` method and improving memory allocation behavior in the core Eigen library.",
"[!1730](https://gitlab.com/libeigen/eigen/-/merge_requests/1730): Reverts a previous change to fixed-size objects in Eigen's core modules, addressing a compilation issue with move assignment in C++14 when using GCC and Clang with `-O` optimization.",
"[!1280](https://gitlab.com/libeigen/eigen/-/merge_requests/1280): Disables raw array indexed view access for 1d arrays in Eigen's indexed view implementation, removing potential undefined behavior and improving code safety for array operations.",
"[!1203](https://gitlab.com/libeigen/eigen/-/merge_requests/1203): Modifies Eigen's logical operators to support typed logical operations across scalar types, introducing breaking changes to how boolean and bitwise operations work for non-standard types like complex numbers and floating point values.",
"[!1196](https://gitlab.com/libeigen/eigen/-/merge_requests/1196): Introduces vectorized comparison operations in Eigen's core library with potential breaking changes, enabling typed comparisons and optimizing boolean selection through a new scalar comparison operator and macro infrastructure.",
"[!826](https://gitlab.com/libeigen/eigen/-/merge_requests/826): Breaks the SVD module's API by introducing an `Options` template parameter to `JacobiSVD`, enabling more flexible computation options while maintaining backwards compatibility through deprecated constructors.",
"[!771](https://gitlab.com/libeigen/eigen/-/merge_requests/771): Renames the internal `size` function to `ssize` in Eigen's core utilities to prevent ADL conflicts and improve compatibility with C++20's standard `ssize` function, potentially breaking existing code that relies on the previous implementation.",
    "[!744](https://gitlab.com/libeigen/eigen/-/merge_requests/744): Removes outdated compiler feature test macros across Eigen's core and utility files, updating minimum compiler versions for GCC, MSVC, and ICC to require more modern C++ support.",
"[!658](https://gitlab.com/libeigen/eigen/-/merge_requests/658): Breaks the existing SVD module API by introducing an `Options` template parameter to `JacobiSVD` and `BDCSVD`, enabling more flexible computation options for singular value decomposition while modifying the current interface.",
"[!649](https://gitlab.com/libeigen/eigen/-/merge_requests/649): Breaks the existing indexing namespace by moving `Eigen::all`, `Eigen::last`, and `Eigen::lastp1` back to `Eigen::placeholders::` namespace, resolving compiler warnings and improving compatibility with external projects.",
    "[!602](https://gitlab.com/libeigen/eigen/-/merge_requests/602): Renamed the shift_left/shift_right array operations to shiftLeft/shiftRight, moving them into the ArrayCwiseUnaryOps header for naming consistency, which may require code updates for existing users."
]
},
"unsupported": {
"other_improved": [
"[!1929](https://gitlab.com/libeigen/eigen/-/merge_requests/1929): Fixed documentation build issues in the Tensor module's README.md by addressing Doxygen compatibility with markdown links, improving documentation rendering.",
"[!1916](https://gitlab.com/libeigen/eigen/-/merge_requests/1916): Improved documentation for the Eigen Tensor module by updating the README.md file with additional information about tensor functionality.",
"[!1859](https://gitlab.com/libeigen/eigen/-/merge_requests/1859): Improved tensor trace functionality in Eigen's unsupported CXX11 Tensor module by modifying TensorTrace and TensorRef to handle tensor trace operations more consistently and correctly.",
"[!1849](https://gitlab.com/libeigen/eigen/-/merge_requests/1849): Improved the TensorDeviceThreadPool header by reformatting the code and adopting C++20's `if constexpr` for better readability and modern compiler compatibility.",
"[!1848](https://gitlab.com/libeigen/eigen/-/merge_requests/1848): Improved Eigen's tensor thread pool implementation by cleaning up unused methods, reducing type erasure, and enhancing C++20 parameter pack support in TensorDeviceThreadPool and related tensor operation files.",
"[!1844](https://gitlab.com/libeigen/eigen/-/merge_requests/1844): Optimized division operations in TensorVolumePatch.h for cases with PacketSize=1, reducing the number of CPU cycles consumed during tensor volume patch calculations.",
"[!1809](https://gitlab.com/libeigen/eigen/-/merge_requests/1809): Improved Eigen's tensor documentation by removing explicit \\class statements and consolidating documentation references across multiple tensor implementation files in the unsupported module.",
"[!1747](https://gitlab.com/libeigen/eigen/-/merge_requests/1747): Optimized the erf function implementation in the Eigen SpecialFunctions module by removing redundant computations for large argument values, reducing computational overhead.",
"[!1732](https://gitlab.com/libeigen/eigen/-/merge_requests/1732): Improved Eigen's special functions implementation by vectorizing and enhancing accuracy of erfc() for double and float, with performance gains up to 86% across multiple vector architectures.",
"[!1710](https://gitlab.com/libeigen/eigen/-/merge_requests/1710): Improved the erfc() special function implementation in Eigen's unsupported module by adding vectorized float operations using SSE and AVX2 instructions, significantly enhancing performance with maintained accuracy.",
"[!1706](https://gitlab.com/libeigen/eigen/-/merge_requests/1706): Improved the erf() special function implementation in Eigen's unsupported module, reducing maximum error from 4 to 3 ulps and achieving significant speedups in AVX2+FMA and SSE 4.2 implementations.",
"[!1702](https://gitlab.com/libeigen/eigen/-/merge_requests/1702): Improved MPReal support in Eigen by adding `max_digits10` member to `NumTraits` for `mpreal` types, enhancing numeric property handling for this type.",
"[!1680](https://gitlab.com/libeigen/eigen/-/merge_requests/1680): Improved TensorChipping optimization by detecting \"effectively inner/outer\" chipping cases where dimension products are 1, enhancing performance for tensor operations.",
"[!1678](https://gitlab.com/libeigen/eigen/-/merge_requests/1678): Suppressed a compiler warning in the TensorVolumePatch operation within the Eigen library's unsupported Tensor module by modifying an unreachable switch case statement.",
"[!1645](https://gitlab.com/libeigen/eigen/-/merge_requests/1645): Improved lambda capture syntax in Eigen's unsupported Tensor thread pool headers by explicitly capturing 'this' to eliminate potential warnings and ensure consistent behavior.",
"[!1644](https://gitlab.com/libeigen/eigen/-/merge_requests/1644): Improved async support for tensor operations by adding multi-threaded capabilities to `chip` and `extract_volume_patches` functions in the Eigen unsupported Tensor module.",
"[!1629](https://gitlab.com/libeigen/eigen/-/merge_requests/1629): Improved Eigen's tensor and numerical operations by vectorizing `isfinite` and `isinf` functions, enhancing performance for finite and infinite value comparisons across tensor and core computational components.",
"[!1613](https://gitlab.com/libeigen/eigen/-/merge_requests/1613): Improved Eigen's CXX11 tensor implementation by adding support for 128-bit integer operations in MSVC, specifically implementing the `muluh` function for scalar division.",
"[!1607](https://gitlab.com/libeigen/eigen/-/merge_requests/1607): Improved nonlinear optimization test bounds in Eigen's unsupported module by relaxing error thresholds to enhance test compatibility across different platforms.",
"[!1602](https://gitlab.com/libeigen/eigen/-/merge_requests/1602): Improved nonlinear optimization tests in the unsupported module by adjusting error bounds to account for AVX and non-FMA conditions, enhancing test accuracy.",
"[!1571](https://gitlab.com/libeigen/eigen/-/merge_requests/1571): Improved Eigen's CXX11 Tensor module by replacing custom `Eigen::array` with standard `std::array`, enhancing compatibility with C++17 and preparing for better GPU support.",
"[!1563](https://gitlab.com/libeigen/eigen/-/merge_requests/1563): Improved complex number formatting in Eigen's CXX11 Tensor module, adding custom display styles for NumPy and native code that enhance readability and compatibility.",
"[!1558](https://gitlab.com/libeigen/eigen/-/merge_requests/1558): Optimized Tensor resize performance by removing slow index checks in release mode and updating pre-C++11 code to use modern C++ constructs like static constexpr.",
"[!1542](https://gitlab.com/libeigen/eigen/-/merge_requests/1542): Improved the `cxx11_tensor_gpu` test suite by splitting the test code to reduce timeout issues on Windows systems.",
"[!1479](https://gitlab.com/libeigen/eigen/-/merge_requests/1479): Improved markdown formatting in the Eigen::Tensor README.md file to restore proper documentation structure and readability.",
"[!1470](https://gitlab.com/libeigen/eigen/-/merge_requests/1470): Formatted the cxx11_tensor_executor.cpp test file in the unsupported Tensor module, likely addressing code style or whitespace consistency.",
"[!1469](https://gitlab.com/libeigen/eigen/-/merge_requests/1469): Improved compatibility in the unsupported Tensor executor by removing explicit member function specializations, addressing potential compiler-specific issues with clang.",
"[!1466](https://gitlab.com/libeigen/eigen/-/merge_requests/1466): Improved Eigen's tensor chipping operations by adding dimension index assertions in TensorBase.h and TensorChipping.h to enhance error handling and prevent invalid index accesses.",
"[!1462](https://gitlab.com/libeigen/eigen/-/merge_requests/1462): Improved fileio testing in the unsupported sparse module by adding support for specifying custom temporary directories, enabling better testability across different system configurations.",
"[!1457](https://gitlab.com/libeigen/eigen/-/merge_requests/1457): Improved Eigen's tensor chipping operations by adding static and dynamic asserts in TensorBase.h and TensorChipping.h to validate chipping dimensions and offsets.",
"[!1423](https://gitlab.com/libeigen/eigen/-/merge_requests/1423): Improves tensor constructors in the Eigen unsupported module by adding static asserts to check for matching NumDimensions, preventing potential runtime errors due to dimensional mismatches.",
"[!1406](https://gitlab.com/libeigen/eigen/-/merge_requests/1406): Improved TensorReduction implementation by replacing `divup` with `div_ceil` function, reducing deprecation warnings and aligning with C++ standard integer division requirements.",
"[!1397](https://gitlab.com/libeigen/eigen/-/merge_requests/1397): Improved tensor-related functions by consolidating multiple implementations of division utility functions (divup/div_up/div_ceil) across tensor files, reducing code duplication and enhancing maintainability.",
"[!1391](https://gitlab.com/libeigen/eigen/-/merge_requests/1391): Improved the ThreadPool header in the unsupported module by modifying symbol exports to reduce clang warnings related to include-cleaner tool.",
"[!1378](https://gitlab.com/libeigen/eigen/-/merge_requests/1378): Improved the TensorDeviceThreadPool header by addressing a clang-tidy warning related to forwarding references, enhancing code clarity in the Eigen Tensor module.",
"[!1324](https://gitlab.com/libeigen/eigen/-/merge_requests/1324): Improved the ndtri special function in Eigen's unsupported module by adding input range validation, ensuring the function returns NaN for out-of-range values and maintaining compatibility with scipy and MATLAB.",
"[!1320](https://gitlab.com/libeigen/eigen/-/merge_requests/1320): Improved memory management in Eigen's unsupported FFT backends by replacing raw pointers with std::shared_ptr for FFTW and IMKL FFT plan objects, reducing undefined behavior risks.",
    "[!1303](https://gitlab.com/libeigen/eigen/-/merge_requests/1303): Improved the error function (erf()) implementation in the Eigen library's unsupported special functions module by adjusting the return value to include +/-1 above the clamping point, with a minor performance gain on AVX2 Skylake.",
"[!1298](https://gitlab.com/libeigen/eigen/-/merge_requests/1298): Optimized the tensor select evaluator by implementing ternary operations and scalar boolean selection, reducing execution time by 13% in performance-critical tensor operations.",
"[!1294](https://gitlab.com/libeigen/eigen/-/merge_requests/1294): Improved the error function (erf()) implementation in the unsupported SpecialFunctions module by enhancing the rational approximation and clamping technique, resulting in more accurate calculations for subnormal and normalized floats.",
"[!1287](https://gitlab.com/libeigen/eigen/-/merge_requests/1287): Improves tensor contraction handling in Eigen's unsupported tensor module by preventing crashes when performing contractions on empty tensors, returning nullptr instead of triggering an assert.",
"[!1265](https://gitlab.com/libeigen/eigen/-/merge_requests/1265): Improved tensor performance by vectorizing `isnan()` operations using typed predicates, optimizing AVX512 and other CPU architecture backends for large tensor computations.",
"[!1192](https://gitlab.com/libeigen/eigen/-/merge_requests/1192): Improved CUDA support in Eigen's Tensor module by fixing compilation issues and warnings for CUDA 10/11/12 versions across multiple tensor-related files.",
"[!1164](https://gitlab.com/libeigen/eigen/-/merge_requests/1164): Improved sparse permutation implementation in Eigen's SparseCore module by reducing memory allocations and optimizing performance through more efficient matrix copying and move semantics.",
"[!1125](https://gitlab.com/libeigen/eigen/-/merge_requests/1125): Improved device synchronization by adding a `synchronize` method to all device classes in the Eigen Tensor module, ensuring consistent behavior across different device types.",
"[!1117](https://gitlab.com/libeigen/eigen/-/merge_requests/1117): Cleaned up the IDRS.h header file in the unsupported iterative solvers module by removing an unused variable and fixing comment line breaks to improve code readability.",
"[!1031](https://gitlab.com/libeigen/eigen/-/merge_requests/1031): Improved the Tensor module's header file by eliminating bool bitwise warnings through the use of Eigen::boolean instead of standard bool types.",
"[!1006](https://gitlab.com/libeigen/eigen/-/merge_requests/1006): Improved the AutoDiff unsupported module by adding the necessary `Eigen/Core` header to resolve potential dependency issues during compilation.",
"[!1002](https://gitlab.com/libeigen/eigen/-/merge_requests/1002): Improved the unsupported FFT test shared header by addressing clang-tidy warnings related to function definitions in headers.",
"[!983](https://gitlab.com/libeigen/eigen/-/merge_requests/983): Improves the SYCL backend's QueueInterface in Eigen's unsupported Tensor module by extending queue handling to accept existing SYCL queues, reducing context creation and memory movement overhead.",
"[!982](https://gitlab.com/libeigen/eigen/-/merge_requests/982): Improved Tensor comparison operators in the unsupported CXX11 Tensor module to resolve ambiguity and enhance C++20 compatibility.",
"[!975](https://gitlab.com/libeigen/eigen/-/merge_requests/975): Improved TensorContractionMapper by adding subMappers and linear mappers to simplify address calculations for Power GEMM packing, resulting in a 10% performance boost.",
"[!937](https://gitlab.com/libeigen/eigen/-/merge_requests/937): Improved the TensorTrace.h file in the Eigen Tensor module by eliminating an unused warning related to the trace function.",
"[!932](https://gitlab.com/libeigen/eigen/-/merge_requests/932): Improved the AutoDiff unsupported module by removing `make_coherent` and introducing `CoherentPadOp`, which reduces complexity and enhances performance in derivative calculations by approximately 20%.",
"[!894](https://gitlab.com/libeigen/eigen/-/merge_requests/894): Improved tensor operations in the Eigen unsupported module by adding support for tensor packets of size 1, enabling better compatibility with platforms where full vectorization is not possible.",
"[!884](https://gitlab.com/libeigen/eigen/-/merge_requests/884): Simplified non-convergence checks in Eigen's NonLinearOptimization test cases to improve numerical stability and test compatibility across different optimization levels and hardware architectures.",
"[!869](https://gitlab.com/libeigen/eigen/-/merge_requests/869): Improved SYCL support in Eigen's unsupported modules by simplifying CMake configuration, removing unnecessary workarounds, and addressing compatibility issues in SYCL tensor math tests.",
"[!863](https://gitlab.com/libeigen/eigen/-/merge_requests/863): Improved the unsupported tensor block evaluation test by modifying the test expression to mitigate numerical differences under aggressive optimization modes.",
"[!765](https://gitlab.com/libeigen/eigen/-/merge_requests/765): Improved the TensorMeta.h header by disambiguating overloads for empty index lists, resolving a Clang warning about ambiguous function resolution.",
"[!757](https://gitlab.com/libeigen/eigen/-/merge_requests/757): Improved the IDRS (Induced Dimension Reduction Subspace) solver in the unsupported module by reformatting code and replacing norm calculations with more stable methods.",
"[!733](https://gitlab.com/libeigen/eigen/-/merge_requests/733): Improved the TensorIO header in the unsupported Tensor module by addressing compiler warnings related to shadowing definitions, enhancing code clarity.",
"[!726](https://gitlab.com/libeigen/eigen/-/merge_requests/726): Improved iterator support for Eigen::array in the unsupported CXX11 module, adding basic iterator methods to facilitate easier transition from std::array in C++11 code.",
"[!724](https://gitlab.com/libeigen/eigen/-/merge_requests/724): Improved TensorIO implementation to support TensorMap with const elements, modifying the tensor I/O handling in the Eigen library's unsupported tensor module.",
"[!715](https://gitlab.com/libeigen/eigen/-/merge_requests/715): Improved tensor reduction test in the unsupported module by adding forward error bound checking to validate the correctness and stability of summation operations.",
"[!676](https://gitlab.com/libeigen/eigen/-/merge_requests/676): Improved tensor reduction accuracy for half and bfloat16 types by implementing a tree summation algorithm with bounded relative error in the Eigen Tensor module.",
"[!669](https://gitlab.com/libeigen/eigen/-/merge_requests/669): Optimized the GPU tensor contraction test in the unsupported module by reducing the number of contractions from 3600 to 27, improving test execution time on Windows.",
"[!619](https://gitlab.com/libeigen/eigen/-/merge_requests/619): Improved documentation for unsupported sparse iterative solvers by fixing headers and removing a commented-out include, enhancing clarity of solver documentation.",
"[!616](https://gitlab.com/libeigen/eigen/-/merge_requests/616): Improved CUDA half-precision support by disabling vectorization for `__half` types on host architectures, preventing build errors and ensuring compatibility with older CUDA versions.",
"[!612](https://gitlab.com/libeigen/eigen/-/merge_requests/612): Enhances Eigen's Tensor classes by adding support for `EIGEN_TENSOR_PLUGIN`, `EIGEN_TENSORBASE_PLUGIN`, and `EIGEN_READONLY_TENSORBASE_PLUGIN` to enable more flexible tensor functionality in the unsupported module.",
"[!611](https://gitlab.com/libeigen/eigen/-/merge_requests/611): Improved sparse extra matrix operations in Eigen by adding the unordered_map header to SparseExtra module and its test file, ensuring better compatibility for sparse matrix handling.",
"[!605](https://gitlab.com/libeigen/eigen/-/merge_requests/605): Improved the RandomSetter in SparseExtra by replacing std::map with std::unordered_map to enhance performance and reduce complexity of lookup operations.",
"[!571](https://gitlab.com/libeigen/eigen/-/merge_requests/571): Improved the AutoDiffScalar component in the unsupported module by renaming a template parameter from `_derType` to `DerivativeType` to avoid using a reserved identifier.",
"[!564](https://gitlab.com/libeigen/eigen/-/merge_requests/564): Improved MPReal support in Eigen's unsupported module by adding a CMake detection script and removing an outdated internal header file to enhance compatibility with the latest MPFR version.",
"[!552](https://gitlab.com/libeigen/eigen/-/merge_requests/552): Improved Tensor documentation by removing unnecessary [TOC] tag and fixing code block formatting in the README.md file to enhance readability and resolve Doxygen rendering issues.",
"[!540](https://gitlab.com/libeigen/eigen/-/merge_requests/540): Improved the tensor argmin/argmax functionality in TensorFunctors.h to consistently return the first occurrence of minimum/maximum values, enhancing stability across multithreading and GPU implementations.",
"[!526](https://gitlab.com/libeigen/eigen/-/merge_requests/526): Improved the Tensor module documentation in README.md by fixing a compilation issue in the example code, ensuring the documentation remains accurate and functional.",
"[!521](https://gitlab.com/libeigen/eigen/-/merge_requests/521): Improved Tensor contraction dispatch mechanism in TensorContraction.h by adding macro guards, enabling more flexible custom dispatch logic for TensorFlow Lite integration.",
"[!520](https://gitlab.com/libeigen/eigen/-/merge_requests/520): Improves GPU defines in Eigen's unsupported Tensor module by adding the ability to permanently enable HIP/CUDA GPU-related defines, enhancing flexibility for users working with GPU features.",
"[!493](https://gitlab.com/libeigen/eigen/-/merge_requests/493): Improved GPU device properties management in Eigen's Tensor module by adding a class and singleton to encapsulate initialization and retrieval of device properties, enhancing code clarity and maintainability.",
"[!488](https://gitlab.com/libeigen/eigen/-/merge_requests/488): Improved TensorRandom functionality in Eigen's unsupported Tensor module by removing time-dependence and simplifying random number generation logic to enhance test reproducibility and cross-platform compatibility.",
"[!476](https://gitlab.com/libeigen/eigen/-/merge_requests/476): Improved TensorRandom header by adding a compatibility check for BSD random() function, providing a fallback to rand() for better cross-platform support, particularly for MinGW via msys2."
],
"other_fixed": [
"[!1901](https://gitlab.com/libeigen/eigen/-/merge_requests/1901): Fixed a potential type overflow error in the scalar parity check within the Eigen SpecialFunctions module by removing a long cast that could cause runtime issues.",
"[!1887](https://gitlab.com/libeigen/eigen/-/merge_requests/1887): Fixed an unused local typedef warning in the MatrixExponential.h file within the unsupported matrix functions module by removing the unused 'Scalar' typedef.",
"[!1860](https://gitlab.com/libeigen/eigen/-/merge_requests/1860): Fixed a test for the trace operation on TensorRef in the unsupported Tensor module, improving test coverage and reliability for tensor reference operations.",
"[!1851](https://gitlab.com/libeigen/eigen/-/merge_requests/1851): Fixed a bug in the Givens rotation implementation within the unsupported NonLinearOptimization module, addressing potential numerical stability issues in linear algebra operations.",
"[!1840](https://gitlab.com/libeigen/eigen/-/merge_requests/1840): Fixed boolean scatter and random generation issues in Eigen's tensor module, specifically addressing stability problems in SSE packet math and CXX11 tensor random generation.",
"[!1836](https://gitlab.com/libeigen/eigen/-/merge_requests/1836): Fixed a compiler warning in the TensorRef class by adding an explicit copy constructor in the unsupported Tensor module, resolving potential code compilation issues.",
"[!1828](https://gitlab.com/libeigen/eigen/-/merge_requests/1828): Fixed TensorRef implementation to support assigning expressions with different index types and enforce immutability, improving consistency with Eigen::Ref in the tensor module.",
"[!1793](https://gitlab.com/libeigen/eigen/-/merge_requests/1793): Fixed uninitialized read errors in the special_packetmath.cpp test file by zero-initializing test arrays, improving test stability in the unsupported module.",
"[!1769](https://gitlab.com/libeigen/eigen/-/merge_requests/1769): Fixed a subnormal flushing issue in the special packetmath erfc function for ARM32 architecture, ensuring correct handling of edge case numeric values.",
"[!1724](https://gitlab.com/libeigen/eigen/-/merge_requests/1724): Fixed macro redefinition warnings in the FFTW test suite by removing unnecessary macro definitions in the unsupported/test/CMakeLists.txt file.",
"[!1707](https://gitlab.com/libeigen/eigen/-/merge_requests/1707): Fixed a numerical stability issue in the unsupported special functions module, specifically addressing the `erf(x)` function to prevent NaN generation for large input values and improve performance.",
"[!1699](https://gitlab.com/libeigen/eigen/-/merge_requests/1699): Fixed compiler warnings in EigenSolver and TensorChipping by addressing matrix size assignment issues, improving code clarity and reducing potential compilation errors.",
"[!1698](https://gitlab.com/libeigen/eigen/-/merge_requests/1698): Fixed an implicit conversion issue in the TensorChipping class within the Eigen Tensor unsupported module, improving type conversion compatibility and correctness.",
    "[!1658](https://gitlab.com/libeigen/eigen/-/merge_requests/1658): Fixed a static constant issue in the kissfft implementation within the Eigen unsupported FFT module by correctly defining the pi constant as a double, ensuring accurate FFT computations.",
"[!1614](https://gitlab.com/libeigen/eigen/-/merge_requests/1614): Fixed FFT implementation in Eigen's unsupported module to handle destinations with non-unit stride by adding a temporary buffer for evaluation and copying the final result.",
"[!1597](https://gitlab.com/libeigen/eigen/-/merge_requests/1597): Fixed enum comparison warnings in the AutoDiffScalar.h file within the Eigen unsupported autodiff module, addressing potential compiler warning issues.",
"[!1596](https://gitlab.com/libeigen/eigen/-/merge_requests/1596): Fixed unused variable warnings in the TensorIO module of the Eigen library, addressing potential warning issues without changing functionality.",
"[!1568](https://gitlab.com/libeigen/eigen/-/merge_requests/1568): Fixed a compiler-specific redefinition issue with ScalarPrinter in the Eigen Tensor module for GCC, ensuring proper compilation and compatibility.",
"[!1537](https://gitlab.com/libeigen/eigen/-/merge_requests/1537): Fixed static_assert compatibility in the AutoDiff unsupported module for C++14, addressing compilation issues in the CoherentPadOp header file.",
"[!1517](https://gitlab.com/libeigen/eigen/-/merge_requests/1517): Fixed memory handling in the Kronecker product test within the unsupported module to prevent uninitialized memory usage.",
"[!1476](https://gitlab.com/libeigen/eigen/-/merge_requests/1476): Fixed ODR (One Definition Rule) violations in Eigen's Tensor module by resolving namespace conflicts and adjusting function definitions across multiple header and source files.",
"[!1467](https://gitlab.com/libeigen/eigen/-/merge_requests/1467): Fixed a compile-time error in the tensor executor test file by enabling static assertions for chip dimensions, preventing potential runtime errors.",
"[!1463](https://gitlab.com/libeigen/eigen/-/merge_requests/1463): Reverted asserts in the Tensor chipping functionality within the unsupported Eigen Tensor module, addressing test-related issues with the `.chip` method.",
"[!1453](https://gitlab.com/libeigen/eigen/-/merge_requests/1453): Fixed a memory management issue in the TensorForcedEval component of the Eigen Tensor module by improving handling of temporary buffers during evaluator copying, preventing potential memory access problems.",
"[!1448](https://gitlab.com/libeigen/eigen/-/merge_requests/1448): Fixed memory-related issues in Eigen's test files for product threading and tensor concatenation, addressing potential uninitialized memory problems that could trigger Memory Sanitizer (MSAN) failures.",
"[!1447](https://gitlab.com/libeigen/eigen/-/merge_requests/1447): Fixed ASAN/UBSAN errors in Eigen's thread pool, tensor evaluation, and complex eigenvalue computation components, addressing index-out-of-bounds and use-after-scope issues to improve library stability.",
"[!1410](https://gitlab.com/libeigen/eigen/-/merge_requests/1410): Fixed an int overflow issue in the Eigen CXX11 Tensor GPU executor by modifying type conversions in TensorExecutor.h to prevent potential crashes during tensor operations.",
"[!1407](https://gitlab.com/libeigen/eigen/-/merge_requests/1407): Fixed warnings in Eigen's tensor thread pool implementation by addressing integer conversion issues in the `div_ceil` function across TensorContractionThreadPool and TensorDeviceThreadPool header files.",
"[!1382](https://gitlab.com/libeigen/eigen/-/merge_requests/1382): Fixed a tensor strided linear buffer copy issue in TensorBlock.h by preventing negative indices and ensuring unsigned integer wrapping behavior.",
"[!1372](https://gitlab.com/libeigen/eigen/-/merge_requests/1372): Fixed AltiVec and tensor operation support in Eigen, addressing issues with partial packets, bfloat16 data types, and improving Tensorflow integration for Power architecture.",
"[!1369](https://gitlab.com/libeigen/eigen/-/merge_requests/1369): Fixed warnings in Eigen's tensor-related headers by addressing type casting and integer comparison issues in TensorContraction.h and TensorDimensions.h.",
"[!1243](https://gitlab.com/libeigen/eigen/-/merge_requests/1243): Fixed a test in the Eigen tensor comparison functionality by reverting an incorrectly modified test case in the unsupported tensor module, ensuring the test suite now passes correctly.",
"[!1237](https://gitlab.com/libeigen/eigen/-/merge_requests/1237): Fixed GPU convolution resource management in the Eigen Tensor module by adjusting internal variable sizes to reduce out-of-resources errors during 3D convolution operations.",
"[!1227](https://gitlab.com/libeigen/eigen/-/merge_requests/1227): Fixed a null placeholder accessor issue in the SYCL backend's Tensor Reduction implementation, resolving segmentation faults and ensuring compatibility with SYCL 2020 compiler rules.",
"[!1181](https://gitlab.com/libeigen/eigen/-/merge_requests/1181): Fixed GPU-related bugs in Eigen's tensor convolution operations by modifying utility and test files to improve compatibility and correctness with GPU assertions.",
"[!1077](https://gitlab.com/libeigen/eigen/-/merge_requests/1077): Fixed warning in ROCm GPU device detection by adding a status check for `gpuGetDevice` in the TensorDeviceGpu header, resolving unused-result warnings in Tensorflow builds.",
"[!1033](https://gitlab.com/libeigen/eigen/-/merge_requests/1033): Fixed SYCL tensor tests by addressing specializations in PacketMath, updating binary logic operators, and adjusting test cases to improve compatibility and reduce test failures.",
"[!1007](https://gitlab.com/libeigen/eigen/-/merge_requests/1007): Fixed ODR violations in SparseLU and Tensor header files by replacing unnamed enums with named types to prevent potential build failures and ensure consistent type declarations.",
"[!1001](https://gitlab.com/libeigen/eigen/-/merge_requests/1001): Fixed AVX512 bessel function specializations by adding conditional compilation to prevent build errors on specific compiler versions and architectures.",
"[!991](https://gitlab.com/libeigen/eigen/-/merge_requests/991): Fixed comparison operators in TensorBase to resolve ambiguous warnings in C++20, improving compatibility and symmetry of tensor comparisons.",
"[!989](https://gitlab.com/libeigen/eigen/-/merge_requests/989): Fixed comparison operator ambiguity in Eigen's tensor implementation for C++20 compatibility, specifically modifying the TensorBase.h file to resolve operator comparison issues.",
"[!986](https://gitlab.com/libeigen/eigen/-/merge_requests/986): Fixed SYCL range constructor in TensorConvolutionSycl.h to ensure at least one thread is created during parallel execution, addressing a potential issue with default constructor behavior.",
"[!976](https://gitlab.com/libeigen/eigen/-/merge_requests/976): Fixes an issue in the LDLT solver with AutoDiffScalar by modifying zero value handling in TriangularSolverVector, Meta, and AutoDiffScalar files to ensure correct derivative updates and solver behavior.",
"[!926](https://gitlab.com/libeigen/eigen/-/merge_requests/926): Fixed namespace usage in SYCL tensor contraction and STL support headers to resolve compilation errors and improve SYCL framework compatibility.",
"[!898](https://gitlab.com/libeigen/eigen/-/merge_requests/898): Fixed an edge-case in the zeta function within Eigen's special functions implementation, addressing overflow issues for large input values to prevent NaN generation and align with scipy behavior.",
"[!883](https://gitlab.com/libeigen/eigen/-/merge_requests/883): Fixed matrix_power test tolerance in unsupported module to resolve test failures on MSVC 19.16, ensuring consistent test behavior across different compiler versions.",
"[!853](https://gitlab.com/libeigen/eigen/-/merge_requests/853): Fixed ODR (One Definition Rule) violations in the TensorRandom implementation within the Eigen library's unsupported Tensor module to resolve potential compilation issues.",
"[!835](https://gitlab.com/libeigen/eigen/-/merge_requests/835): Fixed ODR violations in Eigen's Tensor module headers by removing unnamed namespaces to prevent undefined behavior and ensure C++ standard compliance.",
"[!803](https://gitlab.com/libeigen/eigen/-/merge_requests/803): Fixed compiler warnings in Eigen's Tensor module by explicitly initializing base classes in Tensor.h, TensorFixedSize.h, and TensorRef.h to address GCC 8.5 compatibility issues.",
"[!770](https://gitlab.com/libeigen/eigen/-/merge_requests/770): Fixed a bug in the `customIndices2Array` function within the Tensor module, ensuring the first index is correctly included in the resulting array.",
"[!759](https://gitlab.com/libeigen/eigen/-/merge_requests/759): Fixed a typo in the unsupported IDRS.h file, correcting the spelling of `stableNorm` to ensure consistent function naming in the Eigen library's iterative solvers implementation.",
"[!755](https://gitlab.com/libeigen/eigen/-/merge_requests/755): Fixed an unnecessary else branch in the TensorDimensions.h header of the unsupported Tensor module, ensuring correct header inclusion in empty files.",
"[!728](https://gitlab.com/libeigen/eigen/-/merge_requests/728): Fixed Windows build errors in the Eigen CXX11 Tensor module's TensorIO header file, addressing platform-specific compatibility issues.",
"[!723](https://gitlab.com/libeigen/eigen/-/merge_requests/723): Fixed a bug in tensor broadcasting implementation within the Eigen Tensor module, resolving off-by-one errors and improving robustness when broadcasting across different dimensions.",
"[!713](https://gitlab.com/libeigen/eigen/-/merge_requests/713): Fixed integer overflow issues in Eigen's tensor indexing calculations for CUDA kernels, preventing potential memory access errors and enhancing robustness for large tensor sizes.",
"[!705](https://gitlab.com/libeigen/eigen/-/merge_requests/705): Fixed tensor reduction test warnings and error bound calculation in Eigen's unsupported CXX11 tensor module, addressing MSVC compilation warnings and improving sum accuracy test precision.",
"[!691](https://gitlab.com/libeigen/eigen/-/merge_requests/691): Fixed a clang warning in the TensorUInt128.h file by modifying bitwise operations to resolve a compiler warning in the Eigen Tensor module.",
"[!689](https://gitlab.com/libeigen/eigen/-/merge_requests/689): Fixed a broadcasting index-out-of-bounds error in Eigen's tensor operations, specifically addressing computation issues with 1D vectors and complex types in the non-blocking broadcasting path.",
"[!681](https://gitlab.com/libeigen/eigen/-/merge_requests/681): Fixes potential integer overflow issues in Eigen's CXX11 Tensor GPU indexing calculations by modifying TensorExecutor and TensorMeta headers to prevent CUDA_ERROR_ILLEGAL_ADDRESS and ensure correct behavior for large tensor sizes.",
"[!679](https://gitlab.com/libeigen/eigen/-/merge_requests/679): Fixed GPU tensor reduction to prevent memory errors by disabling tree reduction in the CXX11 Tensor module's GPU implementation.",
"[!671](https://gitlab.com/libeigen/eigen/-/merge_requests/671): Fixed GPU special function tests by correcting values in test/main.h and updating the VERIFY_IS_CWISE_APPROX macro to handle scalar comparisons more accurately in GPU test environments.",
"[!628](https://gitlab.com/libeigen/eigen/-/merge_requests/628): Fixed a symbol naming conflict in the cxx11_tensor_expr test by renaming 'vec_all_nan' to resolve build failures on PPC64LE platforms.",
"[!560](https://gitlab.com/libeigen/eigen/-/merge_requests/560): Fixed TriSycl CMake configuration files to improve compatibility with the latest TriSycl version, updating build settings in unsupported documentation and test directories to require C++17.",
"[!547](https://gitlab.com/libeigen/eigen/-/merge_requests/547): Fixes a runtime crash in Eigen's tensor shuffling functionality when attempting to shuffle an empty tensor by modifying the TensorIntDivisor constructor in TensorShuffling.h.",
"[!531](https://gitlab.com/libeigen/eigen/-/merge_requests/531): Fixed the balancer in the Companion.h file to prevent overflow issues when handling large matrix norms, improving numerical stability in the unsupported polynomials module.",
"[!481](https://gitlab.com/libeigen/eigen/-/merge_requests/481): Fixed static global variable initialization in TensorDeviceGpu.h by introducing inline functions to safely manage device properties across translation units.",
"[!477](https://gitlab.com/libeigen/eigen/-/merge_requests/477): Fixed CUDA compatibility issues in Eigen's tensor operations by modifying multiple tensor-related files to resolve undefined behavior when calling host functions from device code, ensuring proper support for CUDA 11.3."
],
"other_added": [
"[!1884](https://gitlab.com/libeigen/eigen/-/merge_requests/1884): Added DUCC FFT support to Eigen's unsupported module by introducing a new implementation file `duccfft_impl.h`, renaming existing FFT implementation files, and adding corresponding test coverage.",
"[!1627](https://gitlab.com/libeigen/eigen/-/merge_requests/1627): Adds tensor roll functionality to the Eigen Tensor module, implementing a new `.roll()` method for circular shifting/rotating tensors with accompanying test cases.",
"[!981](https://gitlab.com/libeigen/eigen/-/merge_requests/981): Added MKL adapter to Eigen's unsupported FFT module, introducing support for oneAPI MKL FFT library and expanding FFT library compatibility with KFR and FFTS.",
"[!973](https://gitlab.com/libeigen/eigen/-/merge_requests/973): Added `.arg()` method to Tensor class in the unsupported CXX11 Tensor module, enabling argument calculation for complex tensors and improving tensor operation capabilities.",
"[!852](https://gitlab.com/libeigen/eigen/-/merge_requests/852): Added a `size()` method to `Eigen::IndexList` in the Tensor unsupported module, providing a convenient way to retrieve the size of an index list with a `constexpr` implementation.",
"[!798](https://gitlab.com/libeigen/eigen/-/merge_requests/798): Adds a Non-Negative Least Squares (NNLS) solver to the Eigen unsupported module, implementing a standard active-set algorithm with comprehensive test coverage and a refactored API resembling other Eigen iterative solvers.",
"[!729](https://gitlab.com/libeigen/eigen/-/merge_requests/729): Added support for `reverse_iterator` to Eigen::array in the unsupported Tensor module, enabling backward-compatible iterator functionality for tensor operations.",
"[!617](https://gitlab.com/libeigen/eigen/-/merge_requests/617): Added dense matrix support to the Matrixmarket reader/writer in the SparseExtra unsupported module, extending its functionality to handle dense matrix I/O operations.",
"[!607](https://gitlab.com/libeigen/eigen/-/merge_requests/607): Added a flowchart to the unsupported sparse iterative solvers documentation, providing a visual guide to help users select appropriate solvers for their sparse matrix problems.",
"[!578](https://gitlab.com/libeigen/eigen/-/merge_requests/578): Added test coverage for std::unordered_map in the Eigen sparse_extra.cpp test file, enabling C++11 support testing for this container type."
],
"other_removed": [
"[!1475](https://gitlab.com/libeigen/eigen/-/merge_requests/1475): Removed the MoreVectorization directory from Eigen's unsupported modules, eliminating redundant code and resolving potential compatibility issues with existing vectorization implementations.",
"[!1474](https://gitlab.com/libeigen/eigen/-/merge_requests/1474): Removed the Skyline library from the Eigen unsupported module, deleting all related header files and cleaning up the CMakeLists.txt to eliminate deprecated and unused code.",
"[!1080](https://gitlab.com/libeigen/eigen/-/merge_requests/1080): Removed an unused typedef from the sparse_extra.cpp test file in the unsupported module, cleaning up unnecessary code with minimal impact.",
"[!752](https://gitlab.com/libeigen/eigen/-/merge_requests/752): Removes deprecated macro `EIGEN_GPU_TEST_C99_MATH` from the unsupported GPU tensor testing infrastructure, as it was always true and only used in a single file.",
"[!704](https://gitlab.com/libeigen/eigen/-/merge_requests/704): Removed problematic implementation of `take<n, numeric_list<T>>` in CXX11Meta.h to address a g++-11 crash, improving compiler compatibility in the Eigen unsupported utilities.",
"[!641](https://gitlab.com/libeigen/eigen/-/merge_requests/641): Removed an unnecessary `std::tuple` reference in the Eigen Tensor module's `TensorIndexList.h`, simplifying the code and reducing potential complexity.",
"[!637](https://gitlab.com/libeigen/eigen/-/merge_requests/637): Removed references to DynamicSparseMatrix in the SparseExtra unsupported module, cleaning up unnecessary code paths in MarketIO.h and sparse_extra.cpp test files.",
"[!606](https://gitlab.com/libeigen/eigen/-/merge_requests/606): Removed sparse dynamic matrix support from the Eigen unsupported module by deleting related implementation files and cleaning up deprecated API in the SparseExtra component.",
"[!513](https://gitlab.com/libeigen/eigen/-/merge_requests/513): Removed dead code from GPU float16 unit tests in the unsupported Tensor module, reducing code bloat in the test suite."
],
"major_changes": [
"[!1330](https://gitlab.com/libeigen/eigen/-/merge_requests/1330): Enables half-precision support for SYCL in Eigen's unsupported Tensor module by adding conversions between `Eigen::half` and `cl::sycl::half` and updating related test cases.",
"[!1305](https://gitlab.com/libeigen/eigen/-/merge_requests/1305): Adds a new strongly typed matrix multiplication function to Eigen's Tensor module, implementing performance optimizations and improving matrix operation capabilities.",
"[!1285](https://gitlab.com/libeigen/eigen/-/merge_requests/1285): Enables USM (Unified Shared Memory) support for the SYCL backend in Eigen's tensor operations, modifying multiple tensor-related files to improve device memory handling and compatibility with SYCL-2020 standards.",
"[!978](https://gitlab.com/libeigen/eigen/-/merge_requests/978): Adds sparse matrix inverse subset computation to the SparseExtra module using the Takahashi method, improving performance and numerical stability for sparse matrix inversions.",
"[!667](https://gitlab.com/libeigen/eigen/-/merge_requests/667): Adds a new strongly typed matrix multiplication function to Eigen's Tensor module, implementing performance optimizations and improving matrix operation capabilities.",
"[!622](https://gitlab.com/libeigen/eigen/-/merge_requests/622): Prepares Eigen's Tensor module for a new GPU-compatible Tuple implementation by renaming existing `Tuple` references to `Pair` across multiple tensor-related source files."
]
}
}

Changelog

5.0

Supported modules

Other fixed

  • #1938: Fixed duplicated 'for' words in documentation across Core MathFunctionsImpl.h and QuickReference.dox files to improve code clarity.
  • #1937: Fixes compiler warnings and edge case in packet operations by suppressing Warray-bounds warning in ploaduSegment and addressing vectorized cast issues in GenericPacketMath.
  • #1936: Fixed -Wshadow compiler warning in GenericPacketMath.h by renaming variables to avoid shadowing issues.
  • #1935: Fixed self-adjoint matrix-vector products to correctly handle compile-time vectors by modifying SelfadjointMatrixVector.h and ProductEvaluators.h, resolving issues with the selfadjoint_eigensolver tests.
  • #1934: Fixed API incompatibility issues in SuperLU support by introducing GlobalLU_t pointer to address ILU interface requirements for SuperLUv7.0.1.
  • #1931: Fixed a bug in the 1x1 selfadjoint matrix-vector product functionality within the Core products module.
  • #1927: Fixed CI build configuration by replacing g++-10 with g++-14 for PPC architecture to resolve compiler issues with incorrect static_cast behavior.
  • #1921: Fixed VSX packetmath type mismatches in AltiVec TypeCasting by correcting vec_cst return type for double and adding element-by-element casts to resolve compilation and execution issues in clang and QEMU environments.
  • #1920: Fixed various test failures and compatibility issues when building with Bazel, including disabled exception handling, GPU operation support, and ODR violations in SimplicialCholesky.
  • #1915: Fixed CI infrastructure by switching ARM builds from decommissioned AArch64 ampere runner to GitLab runners in the Linux build and test configuration files.
  • #1912: Fixed unprotected SIZE argument in the ei_declare_aligned_stack_constructed_variable macro in Memory.h. Prevents potential buffer overflow issues by properly protecting the SIZE parameter during macro expansion.
  • #1911: Fixed MSVC compiler warning about type truncation from unsigned int to const bool in FindCoeff.h, reducing compiler warnings for MSVC users.
  • #1906: Fixed a compilation bug in the NEON implementation of Eigen's PacketMath.h file. Resolved a compilation error that prevented the NEON code from building correctly.
  • #1904: Fixed NEON packet math operations by adding missing native pnmadd implementations and correcting pnmsub to use proper intrinsic functions, resolving test failures on aarch64.
  • #1903: Fixed a compile warning about multiplication operator with bool in the test file packetmath.cpp.
  • #1900: Fixed Eigen::Map<const Vector>::operator[] return type in DenseCoeffsBase.h to return const Scalar& instead of Scalar for proper type correctness.
  • #1891: Fixed missing scalar argument support in multiplication and division operators for vectorwise operations, ensuring consistency in operator overloading for scalar operations in vectorwise contexts.
  • #1890: Fixed LAPACKE bindings for BDCSVD and JacobiSVD modules to align with updated API changes. Resolved compilation issues and incompatibilities between the bindings and the new SVD interface.
  • #1889: Fixed MSAN errors in vectorized casting evaluator by zeroing out unused packets in CoreEvaluators.h to prevent undefined behavior from uninitialized data.
  • #1888: Fixed infinite recursion issue in SolverBase derived classes by ensuring all derived types (FullPivLU, PartialPivLU, FullPivHouseholderQR, HouseholderQR) properly implement the info() method.
  • #1885: Fixes CMake build system to conditionally create uninstall target only when Eigen is the top-level project, preventing duplicate target conflicts when using FetchContent.
  • #1883: Fixed undefined behavior in the ploaduSegment function by adding safeguards to prevent out-of-bounds access in GenericPacketMath.h.
  • #1882: Fixed the noexcept specifier in CommaInitializer.h to restore functionality of comma initializer tests that were failing due to its previous removal.
  • #1880: Fixed non-finite input detection in the cbrt function by implementing a more conservative detection method in GenericPacketMathFunctions.h to prevent overzealous compiler substitutions.
  • #1877: Fixed packet segment implementation in XprHelper.h by adding DiagonalWrapper compatibility check to ensure robust handling of diagonal matrix wrappers.
  • #1876: Fixed constexpr usage in CoreEvaluators.h to address compiler-related issues. Improved compatibility and correctness of constexpr implementation in the CoreEvaluators header.
  • #1874: Fixed packetSegment functionality in ArrayWrapper and MatrixWrapper classes to resolve build issues with partial redux expressions using .array().
  • #1872: Fixed a potential deadlock in Eigen thread pool by ensuring tasks are properly notified and stolen in the NonBlockingThreadPool implementation.
  • #1870: Fixed type errors in ThreadPool ForkJoin.h when using custom thread environments by ensuring ParallelFor generates environment-specific tasks.
  • #1869: Fixed a conversion warning in Parallelizer.h by addressing a long int to int conversion issue that occurred when using GCC 11.4.0 with OpenMP enabled.
  • #1868: Fixed a CMake warning and improved the CTest configuration by defaulting to the j0 thread setting for better performance and stability.
  • #1867: Fixed missing pmadd function for Packet16bf in AVX512 PacketMath.h to resolve flaky packetmath tests and improve test stability.
  • #1862: Fixed packet math operations by replacing NaN with Scalar(1) for true values in pselect and various masks to ensure compatibility with fast-math enabled modes.
  • #1858: Fixed flakiness in Eigen::half tests by adding FMA support in PacketMath.h comparisons and adjusting test parameter ranges in packetmath.cpp to reduce numerical instability.
  • #1856: Fixed issue #2828 in the Core module's packet math and mathematical functions implementations by modifying GenericPacketMath.h, MathFunctions.h, and MathFunctionsImpl.h.
  • #1854: Fixed DenseBase::allFinite to correctly return true for integer arrays instead of false, and added handling for the FINITE_MATH_ONLY macro to address std::isfinite optimization issues on certain platforms.
  • #1850: Fixed x86 complex vectorized FMA operations by modifying AVX and SSE complex number implementations to resolve performance issues in vectorized fused multiply-add operations.
  • #1847: Fixed an extra semicolon in DeviceWrapper.h to resolve compiler error with -Werror,-Wextra-semi flag.
  • #1843: Fixes STL feature detection in Core utility modules to resolve C++20 compilation breakages with libstdc++ v9 in JAX/TensorFlow builds.
  • #1842: Fixes CMake warning related to Boost library in test/CMakeLists.txt to ensure compliance with current CMake practices.
  • #1841: Fixed the documentation job configuration for nightlies by correcting an accidental overwrite of the nightly rule in the GitLab CI files.
  • #1840: Fixed boolean scatter and random generation issues in SSE PacketMath and tensor operations. Resolved potential random failures in cxx11_tensor_block_io tests by improving stability of tensor random generation.
  • #1839: Fixed compiler warning in test/constexpr.cpp by adding a deduction guide for the ConstexprTest struct to suppress class template argument deduction warnings.
  • #1838: Fixed ForkJoin and NonBlockingThreadPool code by enforcing binary functor requirements for ParallelFor operations and correcting an out-of-bounds bug in midpoint computation.
  • #1835: Fixed bitwise operation error in Eigen/src/Geometry/OrthoMethods.h when compiling with C++26. Resolves compilation issues for the geometry module under the new C++ standard.
  • #1834: Fixed matrix initialization in bicgstab test by ensuring all matrix elements are properly initialized, preventing potential issues with uninitialized values.
  • #1833: Fixed array bounds warning in the inner product implementation by addressing Warray-bounds compiler warning in Core/InnerProduct.h.
  • #1832: Fixed CMake build configuration by disabling the unused fno-check-new compiler flag for Clang, reducing warning noise during compilation.
  • #1831: Fixed build errors in AltiVec PacketMath for POWER configurations without VSX and POWER8 support by modifying compatibility checks.
  • #1826: Fixes missing MathJax and LaTeX configuration in the Doxygen documentation system by adding proper configuration settings to doc/Doxyfile.in.
  • #1825: Fixes undefined behavior in Eigen::half by replacing type-punning with proper bit-cast approach in Half.h.
  • #1823: Fixed missing graphviz dependency in the documentation build configuration to resolve broken graph visualization in the generated documentation.
  • #1821: Fixes numerical stability issues in BiCGSTAB iterative solver by adjusting initialization and restart conditions to better align with established literature and handle edge cases.
  • #1820: Fixed Warray-bounds compiler warnings in fixed-size assignments by modifying traversal logic in Meta.h and AssignEvaluator.h to consider MaxSizeAtCompileTime of entire expressions rather than individual operands.
  • #1818: Fixed Doxygen documentation generation by enabling nightly pages job, configuring failure reporting on warnings, and removing external page dependencies from the documentation setup.
  • #1816: Fixed Android NDK compatibility by removing __cpp_lib_hardware_interference_size macro usage from Core utilities and ThreadPool, resolving issues where the macro was defined but the corresponding function was unavailable in NDK r25 and lower.
  • #1815: Fixes std::hardware_destructive_interference_size detection in ConfigureVectorization.h to ensure compatibility with GCC versions that require explicit checking for this feature.
  • #1814: Fixed missing return statements in the Complex class for PPC architecture. Ensured consistent return behavior across different architectures in AltiVec Complex.h.
  • #1811: Fixes LoongArch64 emulated tests by configuring CMake to explicitly use QEMU emulator and adjusting the testing framework to support custom test commands without requiring privileged access.
  • #1810: Fixed midpoint calculation in Eigen::ForkJoinScheduler to ensure it stays within the [start, end] range when granularity is greater than one. This prevents index out-of-bounds errors in ParallelFor operations.
  • #1808: Fixed minor typos in the ThreadPool module's ForkJoin.h comments and documentation.
  • #1807: Fixed all Doxygen warnings across Eigen library by removing non-existent examples, updating configuration files, correcting documentation references, and resolving ODR issues in packetmath files.
  • #1806: Fixed UTF-8 encoding errors in SimplicialCholesky_impl.h comments that were causing build failures with MSVC and Apple Clang compilers.
  • #1804: Fixed potential data race in NonBlockingThreadPool by making the spin_count_ member variable const and initializing it in the constructor to improve thread safety.
  • #1803: Fixed ThreadPool compatibility with C++14 by replacing C++17 if statement initializers with comma operators to resolve compiler errors in g++-6 and MSVC.
  • #1802: Fixed initialization order and removed unused variables in NonBlockingThreadPool.h to prevent potential bugs and improve code maintainability.
  • #1799: Fixed a typo in NonBlockingThreadPool that was causing incorrect task stealing behavior, where the spin loop was using the current thread's deque instead of properly stealing from other threads' deques.
  • #1797: Fixes LoongArch architecture support in the Linux CI configuration by modifying the GitLab CI test pipeline.
  • #1795: Fixed Eigen::aligned_allocator by removing inheritance from std::allocator to resolve a bug in the allocate_at_least method and added equality operators for test support.
  • #1792: Fixed std::fill_n reference issue in Core/Fill.h and SparseCore/SparseMatrix.h to improve compatibility with device code and resolve standard namespace usage problems.
  • #1790: Fixed uninitialized memory read in SparseQR by removing unnecessary access to m_threshold in the factorize() method, preventing potential undefined behavior.
  • #1787: Fixed missing CUDA device qualifiers in DiagonalMatrix and PlainObjectBase by adding EIGEN_DEVICE_FUNC macros and replacing std::copy with manual iteration to ensure proper CUDA device function compatibility.
  • #1786: Fixes parallelization bug in Parallelizer.h by using omp_get_max_threads when setNbThreads is not explicitly set, restoring the documented behavior for automatic thread detection.
  • #1785: Fixed missing #include <new> in ConfigureVectorization.h to resolve build errors with the latest LLVM commit.
  • #1768: Updates the ROCm Docker configuration in the GitLab CI build pipeline for Linux environments.
  • #1767: Fixed CI build infrastructure by updating rocm docker configuration to use Ubuntu 22.04 image instead of the corrupted 20.04 image.
  • #1764: Fixes the checkformat CI stage by modifying the GitLab CI configuration to address a Docker Hub issue with Ubuntu image versions.
  • #1762: Fixed IOFormat alignment computation in Core/IO.h by using matPrefix instead of matSuffix for rowSpacer calculation.
  • #1760: Fixed undefined behavior in setZero function by adding null pointer check in memset specialization to prevent UB when handling zero-sized blocks.
  • #1753: Restored vectorized erf(x) implementation for SSE and AVX architectures that was accidentally removed in a previous merge request, improving performance for error function computations.
  • #1751: Reverted a commit in EigenBase.h that caused numerous builds to fail in debug mode.
  • #1749: Fixes performance issues in AssignEvaluator by disabling fill_n optimization for MSVC to address problems with std::_Is_all_bits_zero.
  • #1745: Fixed C++20 constexpr test compilation failures by modifying EigenBase.h to improve compatibility with C++20 standards.
  • #1742: Fixed enum comparison issues in Assign_MKL.h by casting LinearTraversal enum to int for C++26 compatibility.
  • #1741: Fixed MatrixBase destructor symbol visibility by adding an explicit empty destructor to EigenBase, ensuring lldb can properly resolve the symbol when debugging expressions that return Eigen::MatrixXd.
  • #1736: Fixed missing EIGEN_DEVICE_FUNCTION decorations in GenericPacketMathFunctions.h to ensure proper device-side execution compatibility for GPU and hardware acceleration contexts.
  • #1730: Reverted changes to fixed-size objects' move assignment behavior in Core modules to fix a compilation bug that caused missing v2.setZero() calls with GCC/Clang optimization flags.
  • #1726: Fixes GPU builds by adding proper initializers for constexpr globals in IndexedViewHelper.h required for CUDA compatibility.
  • #1725: Fixes clang6 compilation failures in Geometry SIMD code by avoiding SSE instructions on ARM architectures and setting the last scalar component to zero.
  • #1724: Fixed macro redefinition warning in FFTW test by removing redundant FFT default macros from CMake test declarations.
  • #1723: Fixed Clang 6 compiler optimization bugs in SSE SIMD implementations and geometry modules by resolving issues with vector rearrangement and floating-point mask operations.
  • #1722: Fixed matrix passing by value in test/reshape.cpp to resolve GCC ARM test failures caused by data alignment issues.
  • #1721: Fixed compilation error in Memory.h by switching from __builtin_alloca_with_align to eigen_aligned_alloca_helper when using nvc++ compiler. This resolves compatibility issues with the NVIDIA HPC compiler for aligned memory allocation.
  • #1720: Fixed NVCC builds for CUDA 10+ by resolving build warnings and assignment operator issues across multiple Core utility files including DisableStupidWarnings.h, IndexedViewHelper.h, Macros.h, and RandomImpl.h.
  • #1718: Fixed out-of-bounds access in triangular matrix multiplication code in GeneralMatrixMatrixTriangular.h to prevent memory safety issues and improve operation stability.
  • #1716: Fixed stack allocation assert in DenseStorage by moving the static assert back into the constructor when EIGEN_NO_DEBUG is defined, resolving potential stack allocation limit issues in the Evaluator class.
  • #1712: Fixed ARM compiler warnings in Reverse.h by adding compile time information to suppress array out of bounds warnings for reverseInPlace on fixed-size matrices.
  • #1711: Fixed DenseBase::tail method to properly handle Dynamic template arguments, resolving compilation issues when using tail(tailSize) expressions.
  • #1708: Fixed array_cwise test for 32-bit ARM architectures by adjusting atan function inputs to prevent flushed zero results in polynomial expansion due to FTZ behavior.
  • #1703: Fixed inverse evaluator in Eigen/src/Core/Inverse.h by marking it as a host+device function to enable execution on CUDA devices, resolving issue #2859.
  • #1701: Fixed CUDA compilation issues by adding missing EIGEN_DEVICE_FUNC annotations to CoreEvaluators.h and InnerProduct.h header files.
  • #1700: Fixed compiler warning and improved debugging info in array_cwise tests, adding extra diagnostic information to float_pow_test_impl and cleaning up test code.
  • #1699: Fixed compiler warning in EigenSolver::pseudoEigenvalueMatrix() related to assigning a 2x2 block to a 1x1 matrix, improving code robustness for odd-sized matrices.
  • #1693: Fixed the generic pceil implementation in SSE2 to correctly handle negative numbers rounded to zero, ensuring consistency with std::ceil.
  • #1690: Fixes a bug in the atanh function implementation within the generic packet math functions. The change addresses incorrect behavior in certain cases for the inverse hyperbolic tangent calculation.
  • #1689: Fixed SVE intrinsics bug by correcting svnot_b_x to svnot_b_z and added sqrt support with svsqrt_f32_x for ARM SVE architectures in PacketMath.
  • #1688: Fixed bug in atanh function to correctly handle the edge case when input is -1, improving robustness of the mathematical function in GenericPacketMathFunctions.h.
  • #1685: Fixed out-of-range arguments bug in _mm_permute_pd function within the SSE Complex module to prevent runtime errors from invalid indices in vectorized permutation operations.
  • #1679: Fixes compiler warnings in BDCSVD and JacobiSVD implementations by suppressing Wmaybe-uninitialized warnings to address potential uninitialized memory issues.
  • #1676: Fixed missing documentation for the eigenvectors() method in GeneralizedEigenSolver class by adding missing double quotes to make the method appear in generated documentation.
  • #1668: Fixed missing header inclusion in Eigen/Core by adding <thread> include for std::this_thread::yield() to ensure proper C++11 standards compliance and compilation.
  • #1660: Updates the documentation navigation tree JavaScript file (eigen_navtree_hacks.js) with unspecified modifications to the documentation infrastructure.
  • #1656: Fixed typos across multiple Eigen components including core modules, documentation, CI scripts, and unsupported modules to improve code quality and maintainability.
  • #1653: Fixed numerous typos across the Eigen codebase, including architecture-specific headers (AVX, NEON, SSE), core matrix operations, linear solvers, and sparse matrix components.
  • #1651: Fixed conversion of Eigen::half to _Float16 in AVX512 code by adding an as_float16 conversion function and inlining bit_cast to resolve compilation issues and warnings with AVX512FP16 intrinsics.
  • #1650: Fixed deprecation warning suppression in BFloat16 and Half headers by removing C++23 check, resolving MSVC compatibility issues.
  • #1649: Fixed compiler warnings in BDCSVD and JacobiSVD by using placement new to properly construct small SVD objects, eliminating -Wmaybe-uninitialized warnings.
  • #1648: Fixed overflow warnings in AVX512 PacketMathFP16 by adding explicit cast to short for _mm512_mask_set1_epi16 intrinsic when compiling with -march=sapphirerapids.
  • #1640: Fixed markdown formatting issues in ci/README.md to improve rendering and readability in the GitLab web interface.
  • #1639: Fixed an AVX512FP16 build failure by adding vectorized cast specializations for Packet16h and Packet16f types in the AVX512 TypeCasting module.
  • #1637: Fixed scalar pselect in GenericPacketMath.h to resolve MSVC fast-math behavior where nan == 0.0 was not properly handling NaN value propagation.
  • #1635: Fixed compiler warning C5054 in ProductEvaluators.h by addressing deprecated '==' operator usage between enumerations of different types.
  • #1633: Fixed warnings in Meta.h that were introduced by previous warning fixes, resolving conflicts in the Core utility module.
  • #1631: Fixed hundreds of enum comparison warnings in Core utilities and AutoDiff modules by adding appropriate compiler suppressions.
  • #1630: Fixed macro definition warnings in ThreadPool device and test files by addressing repeated macro definitions and ensuring consistent macro usage across the codebase.
  • #1628: Fixed threading tests by reordering header inclusions in CoreThreadPoolDevice.h and addressing C++20 extension warnings in Eigen/Core.
  • #1624: Fixed Clang tidy warnings in Memory.h by addressing pointer-to-integer casting issues in the aligned_alloca function.
  • #1622: Fixed UBSAN failure in array_for_matrix test by addressing undefined behavior related to integer types in the test suite.
  • #1620: Fixed compilation failures in DenseBase by adding a trivial default constructor to resolve constexpr matrix issues with GCC 14.
  • #1619: Fixed deprecation warnings in BFloat16.h and Half.h by suppressing C++23 deprecation warnings for std::has_denorm and std::has_denorm_loss to ensure compatibility with newer C++ standards.
  • #1616: Fixed a GCC 6 compilation error in test/array_cwise.cpp by resolving a namespace-prefixing issue in struct specializations.
  • #1615: Fixed the predux operation for Packet4i on PowerPC AltiVec to avoid saturating the sum of elements, aligning behavior with other architectures.
  • #1611: Fixed CMake configuration to correctly set include path for the eigen target, ensuring packages depending on eigen properly inherit the expected /include/eigen3 path.
  • #1610: Fixed generic nearest integer operations on GPU by adding proper support and enhancing compatibility across GPU PacketMath and generic packet math functions.
  • #1609: Fixed eigensolver selfadjoint tests by adjusting error tolerance for unitary-ness checks to account for scaling effects, reducing test flakiness.
  • #1606: Fixed undefined behavior in packetmath.cpp test by resolving a signed integer overflow cast for packet size 1 in the predux_mul test.
  • #1605: Fixed unnecessary semicolons in SymbolicIndex.h and RandomImpl.h that were causing potential build errors in downstream projects.
  • #1604: Fixed the AVX512 predux_mul implementation on MSVC that was incorrectly producing negative results. Modified the PacketMath.h implementation and added corresponding tests to ensure correct behavior.
  • #1601: Fixed sine and cosine functions in PowerPC AltiVec implementation by addressing missing comparison functions that caused incorrect function selection.
  • #1599: Fixed CI configuration by adding "cross-compiler" tag to arm/intel runners to prevent the PPC runner from attempting cross-compilation, resolving "unknown emulation: elf64lppc" build errors.
  • #1598: Fixed transposed matrix product bug in core matrix operations by specializing transpose() handling in XprHelper.h, Product.h, and Transpose.h to prevent unnecessary memory allocations when using noalias.
  • #1594: Fixed the tridiagonalization_inplace_selector::run() method by adding the EIGEN_DEVICE_FUNC attribute to enable CUDA compatibility.
  • #1592: Fixed vectorized psincos support by adding double precision support for PPC architecture and resolving a test failure on 32-bit ARM due to missing integer_packet scalar support.
  • #1591: Fixed compilation problems with PacketI on PowerPC architecture by modifying AltiVec and generic packet math headers to resolve build errors and improve platform compatibility.
  • #1588: Fixed build issues for pblend, psin_double, and pcos_double functions in AVX architecture when AVX2 is not supported. Modified AVX-specific implementations in MathFunctions.h and PacketMath.h to ensure proper compatibility with AVX-only environments.
  • #1585: Fixed missing AVX512 intrinsic in PacketMath.h to resolve a GCC bug affecting the pfirst<Packet16i> functionality in Eigen's AVX512 implementation.
  • #1582: Fixed indexed view template definitions in MSVC 14.16 by moving template class definitions out of IndexedViewMethods.inc to resolve compiler warnings and build issues.
  • #1578: Fixed the Geometry_SIMD.h file with a minor modification to the SIMD geometry operations module.
  • #1577: Fixed preverse implementation in AltiVec PacketMath for PowerPC architecture to address compatibility and correctness issues.
  • #1576: Fixed preprocessor condition in UnaryFunctors.h to correctly enable fast float logistic implementation by correcting macro name mismatches and conditional compilation logic.
  • #1575: Fixed long double random number generation by correcting mantissa bit calculation and removing redundant static asserts in Core/MathFunctions.h and Core/RandomImpl.h.
  • #1574: Fixed Packet4l definition in AVX PacketMath by adding guards to prevent undefined behavior or compilation errors in AVX-based packet operations.
  • #1573: Fixed compiler warnings in Core modules and tests by addressing unary minus operator applied to unsigned types on MSVC and other compilers. Modified arithmetic operations and packet math implementations to use explicit cast mechanisms for improved code safety.
  • #1570: Fixed casting bug in SSE TypeCasting.h by changing Packet2d to Packet2l conversion from rounding to truncation. This corrects numerical operation behavior in SSE implementations.
  • #1567: Fixed SSE 32-bit support for double-to-int64 conversions by implementing a two-step approach and added Windows build smoketests for 32-bit/64-bit compatibility.
  • #1566: Fixes an issue with Packet2l implementation in the SSE PacketMath module on Windows 32-bit systems. Modifies the SSE PacketMath.h file to ensure proper compatibility and correctness for Windows platforms.
  • #1562: Fixes potential issues with alloca usage in TriangularMatrixVector by adding protection against its use on 32-bit ARM systems to ensure compatibility and stability.
  • #1559: Fixed Packet*l for 32-bit builds by adding work-arounds for _mm_cvtsi128_si64 and _mm_extract_epi64 instructions in AVX/SSE PacketMath headers. Enabled proper compilation of 32-bit targets on Linux (GCC/clang) and Windows (MSVC).
  • #1552: Fixed CwiseUnaryView compatibility issue with MSVC compiler by rearranging code to properly handle default parameters in CwiseUnaryViewImpl.
  • #1550: Fixed compilation error in EmulateArray.h by removing unnecessary GPU guarding for rbegin/rend methods, ensuring these reverse iterators work correctly on GPU devices.
  • #1549: Fixed CwiseUnaryView const access by modifying the class to test for matrix mutability and enable mutable access functions only when applicable, preventing build failures related to const access violations.
  • #1547: Fixed const-correctness and C++20 compatibility issues in CwiseUnaryView by preserving scalar const-ness and replacing the deprecated std::result_of with Eigen's internal version.
  • #1545: Fixed CwiseUnaryView to resolve direct-access issues with const objects and eliminate the need for const_cast in non-const access scenarios.
  • #1543: Fixed incomplete Cholesky decomposition by adding proper diagonal element insertion handling in SparseMatrix and allowing verification of shift parameters.
  • #1541: Fixed packetmath plog test on Windows by replacing std::log with numext::log in test/packetmath.cpp to ensure MSVC compatibility and handle edge cases correctly.
  • #1540: Fixed pexp test in packetmath.cpp to handle 32-bit ARM subnormal flushes, resolving test failures on ARM architecture.
  • #1538: Fixed AlignedBox volume calculation to return 0 for empty boxes instead of negative values, ensuring consistent behavior across different scenarios.
  • #1536: Fixed unaligned memory access issue in the triangular matrix vector multiplication (trmv) function within Eigen's Core library, resolving test failure nomalloc_3 and improving stability on certain hardware architectures.
  • #1535: Fixed deprecated anonymous enum-enum conversion warnings in Core and SparseCore modules by modifying XprHelper.h, Matrix.h, TriangularMatrix.h, and SparseSelfAdjointView.h to eliminate compiler warnings while maintaining backward compatibility.
  • #1533: Fixed edge-cases in complex number test cases for the pexp function by modifying the packet math implementation and test suite to address test failures.
  • #1532: Fixed a C++14 requirement warning in the Core utilities macros. Removed warning messages related to C++14 compatibility requirements in Eigen/src/Core/util/Macros.h.
  • #1531: Fixed BLAS routine calls by adding degenerate checks to prevent crashes when operating on zero-sized matrices or vectors across multiple matrix product files.
  • #1530: Fixed a CMake warning related to FindCUDA in the CMakeLists.txt file. Removed deprecated or unnecessary CUDA detection warning to reduce build warnings without affecting functionality.
  • #1529: Fixed compiler warning in triangular matrix-vector multiplication by removing const_cast abuse and avoiding potential uninitialized memory usage in TriangularMatrixVector.h.
  • #1528: Fixed QR column pivoting test by replacing abs with numext::abs for floating-point types to resolve compilation warnings and test failures.
  • #1527: Fixed shadowed typedefs in Core utilities and QR decomposition modules to resolve type conflicts and improve code clarity.
  • #1526: Fixed MSVC GPU build compatibility issues by resolving out-of-line definition problems in MathFunctions.h and addressing MSVC/NVCC compatibility issues with Index types in JacobiSVD.h.
  • #1524: Fixed signed integer undefined behavior in random number generation by modifying MathFunctions.h to address overflow issues and updating test cases in rand.cpp to ensure correct behavior under overflow conditions.
  • #1521: Fixed crash in IncompleteCholesky when input matrix has zeros on diagonal by explicitly inserting zero values into the sparse data structure before factorization.
  • #1519: Fixes array_size implementation in EmulateArray.h and Meta.h by changing from enum to constexpr, resolving comparison issues and compiler warnings while improving type safety.
  • #1518: Fixed header guard inconsistency in GeneralMatrixMatrix.h and Parallelizer.h to resolve build conflicts and improve codebase stability.
  • #1516: Fixed GPU build for ptanh_float function by adjusting declarations in GenericPacketMathFunctions.h and MathFunctions.h to ensure consistent GPU compatibility.
  • #1514: Fixed the complex exponential test in packetmath.cpp by replacing the index type with int.
  • #1513: Fixed pexp_complex_test by modifying the packetmath.cpp test file to resolve issues with complex exponential packet math testing.
  • #1510: Fixed real Schur decomposition by adjusting shift frequency from every 10/30 iterations to every 16 iterations, and added validation assert in polynomial solver to ensure successful decomposition.
  • #1507: Fixed deflation issues in BDCSVD by aligning indices with the paper and replacing sqrt with hypot for better numeric stability. Corrected diagonal element assignment to ensure strictly increasing diagonal and improved convergence for large constant matrices.
  • #1504: Fixed undefined behavior in pabsdiff function on ARM architectures by adding overflow prevention checks in packet math tests.
  • #1503: Fixed the digits() function in Eigen::Core::MathFunctions.h to be constexpr for custom scalar types. Resolved compilation issues where custom scalars without constexpr implementations couldn't properly precompute mantissa bits for random number generation.
  • #1500: Fixes scalar conversion bug in TriangularSolverMatrix by adding explicit scalar conversion in ternary expressions to ensure compatibility between different scalar types.
  • #1499: Fixed compiler warning in test/packetmath.cpp by using void* cast when writing bytes to non-trivial types.
  • #1498: Fixed f2c library conflicts by removing r_cnjg and d_cnjg functions and inlining them directly into BLAS implementation files. This eliminates duplicate symbol errors when linking with external libf2c.
  • #1496: Fixed division by zero undefined behavior in packet size logic within GeneralBlockPanelKernel, preventing invalid results from being used in calculations.
  • #1494: Fixed segfault in CholmodBase::factorize() when handling zero matrices by adding proper checks to prevent unnecessary computations and ensure robust sparse matrix factorization.
  • #1492: Fixed C++20 compiler error in GeneralBlockPanelKernel.h by resolving arithmetic operations between different enumeration types, ensuring compatibility with modern C++ standards.
  • #1490: Fixed undefined behavior in bool packetmath test by using valid boolean values in SSE PacketMath and GenericPacketMath modules. Resolved bug in pselect function that was caused by invalid boolean values affecting Intel blend intrinsic compatibility.
  • #1489: Fixed undefined behavior in getRandomBits function in Core/MathFunctions.h by adding a check for when numRandomBits is 0 and optimizing the mask calculation to avoid unsafe bit shifts.
  • #1488: Fixed test failures in array_for_matrix, product, redux, and stl_iterators tests when using bfloat16 and half scalar types by addressing unsupported constexpr behavior.
  • #1487: Fixed skew-symmetric matrix test by avoiding the trivial case where k == 1 to prevent catastrophic cancellation and reduce test failures.
  • #1486: Fixed GCC-6 compiler optimization bug in the rand test by adding noinline attribute to prevent the compiler from eliding the return value and causing test failures.
  • #1485: Fixed random integer overflow issues and packet math test failures on PPC architecture by modifying the AltiVec PacketMath implementation and adding ploadquad for Packet16(u)c.
  • #1482: Fixed the preshear transformation function in Eigen/src/Geometry/Transform.h by correcting invalid constructor usage. Added test coverage to verify the transformation works correctly.
  • #1481: Fixed CI configuration for clang-6 cross-compilation by ensuring consistent GLIBC versions across build and test environments in GitLab CI files.
  • #1478: Fixed a comparison mistake in the subnormal checking code within the array_cwise test module.
  • #1476: Fixed ODR (One Definition Rule) violations across multiple components by resolving namespace conflicts and addressing issues with IndexPair, TensorMeta, and function specializations in test files and tensor modules.
  • #1468: Fixed ARM32 architecture issues by replacing fpclassify with explicit subnormal checks and modifying mlaq usage to preserve accuracy during range reduction.
  • #1461: Fixed unused warnings in the failtest suite by removing unnecessary warning-generating code from const-qualified method return value test files.
  • #1460: Fixes performance regression in stableNorm function by reverting a previous implementation change that caused slowdowns on large vectors.
  • #1458: Fixed the stableNorm function to properly handle zero-sized input, preventing edge-case breakage and ensuring consistent behavior across all input sizes.
  • #1456: Fixed memory safety in Core/util/Memory.h by adding null pointer checks before freeing memory to prevent invalid memory access.
  • #1452: Fixed minor issues in basic slicing examples documentation by correcting errors and improving clarity in the TutorialSlicingIndexing.dox file.
  • #1451: Fixed SPQR module build error by resolving Index/StorageIndex type mismatch in SuiteSparseQRSupport.h. Addressed compiler error when using SparseMatrix with SuiteSparseQR() on Apple clang.
  • #1450: Fixed compiler warning in stableNorm implementation by cleaning up potentially uninitialized memory handling in StableNorm.h and associated test file.
  • #1449: Fixed GPU memory access issues in GenericPacketMath.h by addressing problems with function pointers that caused illegal memory accesses when using clang and ASAN.
  • #1448: Fixed MSAN (Memory Sanitizer) failures in test files by addressing uninitialized memory usage in product_threaded.cpp and cxx11_tensor_concatenation.cpp test cases.
  • #1447: Fixed ASAN/UBSAN errors across multiple Eigen components including ComplexSchur index bounds, thread pool destruction order, tensor evaluation cleanup, and unused variable warnings.
  • #1444: Fixed overflow issue in CompressedStorage by using smaller index type for determining maximum size during resize operations.
  • #1441: Fixed the clang-format CI pipeline by enabling non-interactive mode and requiring clang-format installation in the GitLab CI configuration.
  • #1439: Fixed MSVC _BitScanReverse implementation in MathFunctions.h to return the correct index of the first set bit for leading zero calculations. Aligned behavior with _BitScanForward/ctz to ensure consistency in bit manipulation functions.
  • #1435: Fixed kernel launch syntax in test/gpu_common.h to prevent clang-format errors across versions 13-18.
  • #1434: Fixed CUDA syntax error in test/gpu_common.h that was introduced by clang-format.
  • #1431: Fixed overflow issues in scalar_logistic_function for complex inputs by correcting comparison logic to compare real parts instead of complex values directly.
  • #1425: Fixed typecasting issues in the NEON implementation for ARM32 architecture by correcting type casting operations in the Core arch module.
  • #1422: Fixed 64-bit integer to 32-bit float conversion in the ARM NEON TypeCasting module by modifying the conversion logic to prevent truncation of large values.
  • #1419: Fixed a bug in GeneralMatrixMatrixTriangular.h by enforcing mc >= Traits::nr before checking for multiples to prevent invalid access or out-of-bounds errors in matrix operations.
  • #1417: Fixed a bug in the Parallelizer module's getNbThreads() function to correctly return 1 when not parallelizing. This ensures consistent behavior of thread count retrieval in both parallel and non-parallel scenarios.
  • #1416: Fixed Wshorten-64-to-32 compiler warning in the gemm parallelizer by modifying integer type handling in Parallelizer.h.
  • #1415: Fixed a pthread linking issue for the product_threaded test by modifying the test CMakeLists.txt configuration.
  • #1413: Fixed the traits<Ref>::match implementation to use correct strides when constructing Ref<T, Options, Stride<0, 0>> from contiguous memory layout objects. This resolves compilation issues for mutable types and enables creating Ref objects without unnecessary copying.
  • #1411: Fixed typo in AVX512 TrsmKernel configuration macro, correcting EIGEN_NO_RUNTIME_MALLOC to EIGEN_RUNTIME_NO_MALLOC to allow nomalloc tests to pass on AVX512 architectures.
  • #1402: Fixed MSVC compiler issue in Block.h by removing dependent XprType typedef that caused attribute confusion with RowMajor handling.
  • #1401: Fixed a typo in a comment within the Block.h file of Eigen's Core module.
  • #1400: Fixed the div_ceil function in MathFunctions.h by passing arguments by value, preventing ODR-use errors caused by implicit conversion to const references.
  • #1399: Fixed deprecation warnings in MSVC C++23 by modifying DisableStupidWarnings.h and ReenableStupidWarnings.h to suppress denormal number deprecation warnings.
  • #1398: Fixed macro conflict in matrix product implementations by eliminating use of _res variable name that conflicted with a leaked macro from resolv.h.
  • #1396: Fixed the row() and col() functions in the sparse triangular view iterator that were accidentally commented out, resolving a critical bug that caused incorrect results and potential segfaults.
  • #1394: Fixed an extra semicolon in the XprHelper class that was causing compilation errors with the -Wextra-semi compiler flag.
  • #1393: Fixed ROCm build configuration by replacing HIP_PATH with ROCM_PATH in CMakeLists.txt files to ensure compatibility with ROCm 6.0's updated directory structure.
  • #1392: Fixed CUDA device function compatibility in Transform.h by adding EIGEN_DEVICE_FUNC attribute to static run methods, resolving issues with operator* not working correctly in device contexts.
  • #1388: Fixed stage validation logic in PardisoSupport to only consider stages valid when Pardiso returns success, preventing invalid operations due to factorization errors.
  • #1387: Fixed block expression handling by adding explicit method to convert block of block expressions to simple blocks and removing problematic implicit conversion operator.
  • #1386: Fixed ARM32 NEON float division and reciprocal operations by introducing a new formula to avoid denormal values and increasing refinement iterations for better accuracy.
  • #1383: Fixed unaligned scalar usage issues in MapBase by adding a temporary macro workaround to address TFLite-related failures and enable continued Eigen development.
  • #1381: Fixed boost multiprec test to reference new SVD tests, updating test file alignment with the new SVD test suite.
  • #1380: Fixes undefined behavior in MapBase by disabling unaligned scalar binding test and adding explicit alignment assertions to ensure proper memory alignment constraints.
  • #1379: Fixed a potential nullptr dereference in the SVD UpperBidiagonalization implementation by adding a check to return nullptr when the upper-diagonal is empty, preventing runtime errors in edge cases.
  • #1377: Fixed undefined behavior access in triangular matrix operations by adding safety checks to prevent out-of-bounds access when the system is empty or the matrix is singular.
  • #1376: Fixed a nullptr dereference issue in the triangular product implementation that occurred when matrix sizes are zero. This prevents undefined behavior and UBSAN issues in edge cases involving zero-sized matrices.
  • #1373: Fixed NumTraits by adding a max_digits10 function to resolve a discrepancy with std::numeric_limits in serialization contexts. Enhanced precision handling for double types across Core IO and NumTraits modules.
  • #1372: Fixes critical issues in AltiVec architecture support for Power systems, addressing partial packet problems and missing stride() method for bfloat16 data types to improve Tensorflow integration.
  • #1371: Fixed -Wmaybe-uninitialized compiler warnings in SVD implementations by modifying BDCSVD, JacobiSVD, and SVDBase headers to use compile-time dimensions for workspace variables in fixed-size cases.
  • #1370: Fixed -Waggressive-loop-optimizations warning in GeneralMatrixVector.h by explicitly defining loop bounds to silence false positive warnings during matrix-vector product optimizations.
  • #1369: Fixed ARM build warnings in tensor modules by correcting cast-align issues in TensorContraction.h and signed/unsigned comparisons in TensorDimensions.h.
  • #1367: Fixed compiler warnings in Core module by handling zero-size blocks in comma initializers, adding proper copy constructors, and ensuring initialization in triangular solvers and tests.
  • #1363: Fixed CUDA compatibility in MathFunctions.h by replacing the deprecated ::arg function with std::arg to resolve MSVC + C++20 compilation issues.
  • #1362: Fixed the imm parameter argument for the _mm256_cvtps_ph intrinsic in AVX PacketMath.h to resolve compiler warnings and prevent potential out-of-bounds issues.
  • #1361: Fixed Altivec compilation issues with C++20 and C++23 standards by removing unnecessary simple-template-id constructor name from MatrixVectorProduct.h.
  • #1360: Fixed the return type of ivcSize function in IndexedViewMethods.h to ensure type safety and consistency with Eigen's internal implementation.
  • #1359: Fixed AVX512 TRSM kernels to disable when malloc is unavailable, ensuring compatibility with nomalloc build configurations by modifying kernel activation logic and triangular solver matrix handling.
  • #1358: Fixed compiler warnings across multiple Eigen modules including Core utilities, Dot, GenericPacketMath, and unsupported Tensor components. Addressed unsigned/signed integer comparison warnings, removed unused typedef warnings, and improved test suite compatibility with CI environments.
  • #1357: Fixed supportsMMA function in AltiVec MatrixProduct to properly respect the EIGEN_ALTIVEC_MMA_DYNAMIC_DISPATCH compilation flag and ensure compiler compatibility.
  • #1356: Fixed compilation warning on ARM architectures by defining the EIGEN_HAS_ARM64_FP16_VECTOR_ARITHMETIC macro in Macros.h. This ensures the required macro is always defined when compiling on ARM systems with clang.
  • #1355: Fixed FP16 arithmetic support by disabling it for arm32 architecture, restricting FP16 intrinsics to arm64 only to align with Arm developer guide restrictions.
  • #1351: Fixed SVD tests by removing deprecated behavior testing and reducing resource consumption to address CI failures.
  • #1350: Fixed the safe_abs function in int_pow implementation to handle inputs outside the result type range, preventing undefined behavior on clang compiler.
  • #1349: Fixed AVX pstore function in PacketMath.h to correctly use aligned store intrinsics for integer types, avoiding unaligned store calls and improving performance/correctness of AVX operations.
  • #1344: Fixed underflow issues in the prsqrt function by modifying MathFunctionsImpl.h to improve numerical stability for small input values.
  • #1343: Fixed error handling and underflow issues in the pow function within GenericPacketMathFunctions, adding safe_abs to handle signed integer edge cases and improving test coverage for various exponent scenarios.
  • #1339: Fixes EIGEN_HAS_ARM64_FP16_SCALAR_ARITHMETIC macro in Eigen/src/Core/util/Macros.h to prevent it from being set during CUDA compilation, resolving miscompilation issues in shared ARM/CUDA code.
  • #1337: Fixed vectorization logic in Redux and PartialReduxEvaluator components after traversal order changes. Resolved compatibility issues and updated test cases to ensure proper vectorization behavior.
  • #1334: Fixed unrolled assignment evaluator in Core module to use correct linear access functions and avoid template parameter naming conflicts.
  • #1333: Fixed compiler warnings and failures in SVD implementations by properly initializing small fixed-size matrix members in SVDBase.h.
  • #1327: Fixed CUDA compilation issues in Core utilities by reordering include files and adding vector header. Resolved compilation problems related to EIGEN_AVOID_STL_ARRAY in CUDA builds.
  • #1325: Fixed the array_cwise test to suppress compiler warnings and renamed it to avoid naming conflicts with the tensor array() function.
  • #1323: Fixed compiler warning related to modulo by zero operations in the Visitor implementation. The change addresses potential division by zero warnings in Eigen/src/Core/Visitor.h.
  • #1322: Fixed SFINAE solution in AltiVec MatrixProduct components to resolve LLVM compilation issues with BF16 GEMV operations by correctly specializing the loadColData function.
  • #1321: Fixed compiler warnings and code clarity issues in array_cwise test by suppressing MSVC warnings for unsigned integer negation and addressing operator precedence ambiguities.
  • #1319: Fixed AltiVec BF16 GEMV operations to properly handle RowMajor vector layouts in ColMajor matrix operations, ensuring correct data type handling and improved robustness across different memory layout scenarios.
  • #1318: Fixed JacobiSVD to set m_nonzeroSingularValues to zero when invalid input is detected, preventing crashes and ensuring rank() returns 0 appropriately.
  • #1316: Fixed SSE Packet4ui by adding missing pcmp, pmin, and pmax functions to resolve compilation issues for SSE4.1 vector types.
  • #1312: Fixed a boolean bitwise operation warning in the test file test/product_small.cpp to reduce warning noise in the test suite.
  • #1311: Fixed sparse matrix iterator compatibility by making StorageRef move-able for std::sort usage and replaced deprecated std::random_shuffle in sparse tests to eliminate warnings.
  • #1308: Fixed vectorized power operations for uint32_t by adding a proper specialization in PacketMath.h and disabling the unimplemented pmul to prevent compilation errors.
  • #1302: Fixed a typo in the SSE packet math implementation header file. Corrected inconsistency in Eigen/src/Core/arch/SSE/PacketMath.h to ensure proper SSE packet math functionality.
  • #1291: Fixed .gitignore configuration to prevent Eigen/Core and Eigen/src/Core directories from being incorrectly ignored by the core ignore rule.
  • #1286: Fixed non-const symbolic indexed views by adding explicit l-value qualifier check and resolving type mismatch in Map expression to enhance type safety.
  • #1283: Fixed double-to-int casting operations by using the correct truncating intrinsic across AVX, AVX512, and SSE architectures to ensure consistent truncating behavior and improve casting reliability.
  • #1282: Fixed buffer overrun issues in AVX512 GEMM/TRSM kernels detected by AddressSanitizer. Added masked loads to prevent out-of-bounds data access while maintaining performance.
  • #1280: Fixed raw array indexed view access for 1d arrays by disabling this functionality in IndexedViewMethods.h to prevent undefined behavior and improve code safety.
  • #1277: Fixed incorrect casting in AVX512DQ path by modifying PacketMath.h to ensure correct type conversions in vectorized operations.
  • #1271: Fixed SparseMatrix typedef and overflow handling by changing Map typedef to use Options_ instead of Flags and adding StorageIndex overflow check in setFromTriplets.
  • #1270: Fixed ARM build compatibility issues by adding missing casts, resolving MSVC packet conversion problems, and defining required macros for 32-bit ARM architectures in Core modules.
  • #1269: Fixed build issues by reverting the CMake pools changes in the CI configuration and CMakeLists.txt files.
  • #1268: Fixed CMakeLists.txt to properly parse command-line arguments when specified as CMake lists. Resolved incompatibility with CI configurations by allowing both space-separated and semicolon-separated list formats.
  • #1264: Fixed MathFunctions.h by using EIGEN_NOT_A_MACRO macro to resolve build conflicts with TensorFlow environments.
  • #1263: Fixed PowerPC and clang warnings in AltiVec matrix operations and warning suppression utilities by disabling unnecessary compiler warnings in the Eigen codebase.
  • #1262: Fixed PowerPC build configuration by limiting build jobs to 8 and link jobs to 4 to prevent out-of-memory issues during CI builds.
  • #1259: Fixed deadcode checks in MatrixProductMMAbfloat16.h by restoring previously removed checks to prevent unused code from being optimized away.
  • #1258: Fixes BF16 GEMM register spillage issues in AltiVec/Power architecture by reverting problematic changes that caused 20% performance slowdown in LLVM compiler.
  • #1257: Fixed minmax visitor in Core/Visitor.h to handle PropagateFast consistently with PropagateNaN, preventing out-of-bounds indices for matrices containing all NaN values.
  • #1256: Fixed bug in minmax_coeff_visitor for matrices containing only NaN values. Improved robustness by ensuring proper handling of edge cases to prevent undefined behavior.
  • #1254: Fixed backwards compatibility in DenseBase::select implementation by swapping template argument order to ensure legacy code mixing arrays and matrices continues to work.
  • #1252: Fixed a compiler bug workaround in the Tridiagonalization module by modifying the implementation in Tridiagonalization.h.
  • #1249: Fixed MSVC compiler test failures in AVX/AVX512 PacketMath by replacing problematic set1 intrinsics with set intrinsics and adding test support files.
  • #1248: Fixed a typo in the LinAlgSVD example code in TutorialLinAlgSVDSolve.cpp to ensure it compiles and runs correctly, allowing users to see the proper least-squares solution output.
  • #1245: Fixed a failing cwise test by replacing direct multiplication with .abs() to avoid signed integer overflow issues in random matrix squaring operations.
  • #1239: Fixed NEON integer shift operation tests to properly handle zero as a valid input argument, resolving test failures in the array_cwise test suite.
  • #1235: Fixed ODR (One Definition Rule) issues in Intel's AVX512 TRSM kernels by removing static qualifiers from free functions in TrsmKernel.h and TrsmUnrolls.inc to resolve linkage problems.
  • #1232: Fixed long double usage on GPU devices by adding guards in Core utilities to prevent warnings and duplicate symbols in CUDA/HIP environments.
  • #1229: Fixed MSAN failures in SVD tests by addressing uninitialized matrix handling issues in bdcsvd.cpp, jacobisvd.cpp, and svd_common.h to resolve undefined behavior.
  • #1228: Fixed compiler version compatibility issues in AltiVec PacketMath for the Power architecture by addressing problems with the vec_div command for the int type in GCC 10.4.
  • #1222: Fixed epsilon value in NumTraits for long double types to prevent convergence issues in algorithms on PowerPC systems, with corresponding adjustments in SparseMatrix and MatrixPower modules.
  • #1221: Fixed complex sqrt functionality in AVX512 by adding conditional compilation guards to prevent failures on older MSVC compilers.
  • #1220: Fixed NEON packetmath implementation by resolving GCC compilation issues in PacketMath.h and addressing a preinterpret stack overflow in TypeCasting.h.
  • #1218: Fixed the MSVC atan2 test by adding a correction for std::atan2 returning denorm_min on underflow to ensure POSIX compliance.
  • #1216: Fixed a typo in the return value of the make_packet2f function in the NEON PacketMath implementation.
  • #1215: Fixed compiler warnings in test files by modifying array_cwise, fastmath, sparse_basic, and NNLS test modules to address compilation warning issues.
  • #1214: Fixed array conversions from BF16 to F32 in Power architecture by optimizing the conversion process in AltiVec MatrixProductMMAbfloat16 implementation. Reduced the number of vector instructions used, improving performance and reducing computational overhead.
  • #1213: Fixed compiler warnings in multiple Eigen core modules including BinaryFunctors, TriangularMatrixVector, PlainObjectBase, and Jacobi components. Addressed code structure and formatting issues to improve code quality and compiler compatibility.
  • #1202: Fixed MSVC ARM build issues by modifying intrinsic functions and vector type aliases in NEON architecture files (Complex.h, PacketMath.h, TypeCasting.h).
  • #1201: Fixed ODR violation in AltiVec MatrixProduct by renaming gemm_extra_cols to gemmMMA_cols to avoid duplicate binary definitions when EIGEN_ALTIVEC_MMA_DYNAMIC_DISPATCH is enabled.
  • #1192: Fixed EIGEN_DEVICE_FUNC compilation issues for CUDA 10/11/12 across GPU PacketMath, tensor modules, and warning suppression utilities to improve NVCC compatibility.
  • #1191: Fixed LAPACKE configuration by defining complex types as std::complex and adding LAPACK_ILP64 support for int64_t lapack_int definition. Updated QR bindings to use standard std::complex types instead of scomplex for improved compatibility with external LAPACK libraries like MKL.
  • #1189: Fixed SkewSymmetricMatrix3 by adding EIGEN_DEVICE_FUNC qualifiers to the Assignment struct. This enables CUDA compatibility for SkewSymmetric operations in device kernels.
  • #1188: Reverted changes to StlIterators.h that were made in a previous "Fix undefined behavior" merge request, restoring the original implementation of the StlIterators functionality.
  • #1185: Fixed special-case handling in the atan2 function within Eigen's generic packet math to resolve a test failure in TensorFlow with Clang.
  • #1184: Fixed bugs in pcmp_lt and pnegate functions for pre-POWER8_VECTOR architectures in AltiVec backend and reactivated psqrt function to improve vector operations compatibility.
  • #1183: Fixed undefined behavior in Block access by modifying pointer arithmetic handling in Core/Block.h and Core/StlIterators.h to prevent UBSan errors from null pointer operations.
  • #1181: Fixed bugs in GPU convolution operations exposed by enabling GPU asserts. Modified BlasUtil.h, VectorwiseOp.h, TensorConvolution.h and related test files to address assertion failures and improve GPU compatibility.
  • #1180: Fixed segfault in sparse matrix operations when outerSize == 0 by ensuring proper allocation of m_outerIndex array for empty sparse matrices.
  • #1179: Fixed the AltiVec rsqrt function by disabling its vectorized implementation to resolve compatibility issues with the generic version.
  • #1178: Fixed sparse matrix warnings in SparseCore module by addressing potential warning issues in SparseMatrix.h.
  • #1173: Fixed QR test compatibility by reverting permutation index type changes in qr_colpivoting.cpp and qr_fullpivoting.cpp to restore default types from FwdDeclarations.h.
  • #1169: Fixed deprecated CMake generator expression by replacing $<CONFIGURATION> with $<CONFIG> in the testing configuration. This ensures compatibility with CMake 3.0+ and avoids potential build issues.
  • #1167: Fixed move assignment issues in ColPivHouseholderQR by adding a workaround that resizes the m_colsPermutation variable to avoid compiler errors with fussy compilers.
  • #1165: Fixed missing EIGEN_DEVICE_FUNC annotations in Core utility files and removed an old GCC 4.7 workaround that was causing undefined behavior in assert contexts.
  • #1162: Fixed build error in QR decomposition module by resolving conflicting definitions of StorageIndex across multiple QR-related files and LAPACKE helpers.
  • #1161: Fixed unused parameter warning in NEON implementation of GeneralBlockPanelKernel. Resolved compiler warning about unused parameter 'tmp' when building with clang on 32-bit ARM architecture.
  • #1159: Fixed missing header file for GPU tests by adding the necessary test/gpu_test_helper.h file to resolve compilation issues.
  • #1156: Fixed build and test issues across multiple Eigen modules by correcting header paths in SPQRSupport, updating minmax visitor vectorization, and removing superfluous headers and test code.
  • #1155: Fixed the overalign check in Macros.h to properly handle the EIGEN_COMP_ICC macro when set to 0, ensuring compatibility with non-ICC compilers.
  • #1153: Fixed guard macros in Half.h to resolve compilation errors when using CUDA with EIGEN_CUDACC not defined for emulated FP16 GPU operations.
  • #1151: Fixed EIGEN_HAS_CXX17_OVERALIGN macro in Macros.h to resolve Intel Compiler (ICC) compatibility issues with C++17 overalignment features.
  • #1150: Fixed AltiVec vectorization on macOS by correcting VSX instruction checks and PowerPC architecture detection macros to avoid using unsupported instructions on Darwin PPC platforms.
  • #1149: Fixed .gitignore file to properly include scripts/buildtests.in when using git add.
  • #1145: Fixed bfloat16 product test failures by adjusting VERIFY_IS_APPROX thresholds in test/product.h to account for the reduced precision of bfloat16 arithmetic.
  • #1144: Fixed C++ version detection macros in Macros.h and disabled problematic constexpr tests in CMake to prevent CI failures caused by incorrect __cplusplus reporting.
  • #1143: Reverted changes in CompressedStorage.h that avoided mixing types, restoring compatibility with the existing codebase.
  • #1142: Fixed incorrect NEON native fp16 multiplication by adding fallback mechanisms in the GEBP kernel and assertion to the NEON __fp16 specialization, resolving TensorFlow tensor contraction failures on ARM hardware.
  • #1140: Fixed bugs in SparseLU implementation by removing deprecated code and resolving a subtle bug in SparseLUTransposeView that caused test failures.
  • #1137: Fixed test/array_cwise.cpp by replacing std::signbit with numext::signbit to resolve compatibility issues with bfloat16 types that lack std::signbit support.
  • #1135: Fixed divide-by-zero handling in BinaryFunctors by replacing std::raise() with an integer division by zero for better compatibility with embedded systems where `<csignal>` is not available.
  • #1130: Fixed a type typo in the sparse index sorting code within the sparse vector implementation, correcting the index type to improve sorting functionality.
  • #1127: Fixed serialization for non-compressed sparse matrices by adding an explicit move constructor and modifying SparseMatrix.h to handle serialization buffer size calculation correctly.
  • #1124: Fixed the SparseLU solver to properly handle destinations with non-unit stride by replacing direct memory access with block expressions in SparseLU.h and SparseLU_SupernodalMatrix.h.
  • #1123: Fixed stride calculation in Reshaped.h to properly handle non-zero inner strides during reshape operations. This resolves a bug where outer stride calculations were incorrect when input matrices had non-zero inner strides.
  • #1122: Fixed compiler warnings in test files by addressing narrowing conversions and deprecated pow function usage in adjoint, array_cwise, and visitor tests.
  • #1120: Fixed bugs in handmade_aligned_realloc function in memory management utilities. Corrected memory copying behavior and added alignment bounds checking to prevent overflows and undefined behavior.
  • #1118: Fixed ambiguity in PPC architecture's AltiVec PacketMath by resolving compiler confusion between uint64_t and unsigned long types in vec_splats call.
  • #1116: Fixed pnegate function in AVX and AVX512 PacketMath to correctly handle floating-point zero sign bits, ensuring proper +/-0 value behavior in vectorized operations.
  • #1115: Fixed AVX2 psignbit implementation in Core's AVX PacketMath module to ensure correct handling of AVX2 instructions and improve vector operation performance.
  • #1113: Fixed duplicated execution code for Power8 AltiVec in the pstore_partial function within the AltiVec PacketMath implementation. Eliminated redundant code paths to improve the reliability and correctness of AltiVec instruction handling.
  • #1112: Fixed a typo in the CholmodSupport module to improve code quality without affecting functionality.
  • #1111: Fixed NEON-related issues in the NEON PacketMath implementation by modifying the architecture-specific code in the Core module.
  • #1109: Fixed SparseMapBase by removing a recently added sparse assert in SparseMap.h that was causing issues when the map may not be valid during construction.
  • #1107: Fixed build failures on PPC architectures by disabling the use of patan for double precision in the AltiVec PacketMath implementation where it is not supported.
  • #1106: Fixed offset computation in handmade_aligned_malloc to resolve compiler warning and potential out-of-bounds memory access issue in Core memory utilities.
  • #1105: Fixed pragma check for disabling fastmath in LU decomposition code. Addressed a bug related to fastmath pragmas in the InverseSize4.h implementation to improve compatibility and stability.
  • #1104: Fixed bug in NEON assembly for half-precision data type in GeneralBlockPanelKernel.h. Restricted use of intrinsics to prevent performance degradation and costly duplicate operations in GCC compiler.
  • #1102: Fixed a bug in SparseMapBase by adding an assert to validate that the outerIndexPtr array has the correct size, preventing invalid indexing and out-of-bounds errors.
  • #1101: Fixes memory alignment handling in handmade_aligned_malloc/realloc/free functions by storing a 1-byte offset instead of absolute address. This resolves potential alignment issues in malloc-based allocations while ensuring compatibility with alignment values less than 256.
  • #1100: Fixed empty matrix resizing in DenseStorage by adding support for properly resizing dynamic empty matrices and correcting dimension reporting.
  • #1096: Fixed a bug in BinaryFunctors.h related to atan2 function where pselect predicate was incorrectly handling single-bit packets, resolving a platform-specific issue in Eigen's linear algebra implementation.
  • #1094: Fixed compiler warnings for unused variables in Eigen's sparse linear algebra modules. Addressed -Wunused-but-set-variable warnings in SparseLU and TriangularSolver components to ensure clang 16.0.0git compliance.
  • #1093: Fixed NaN input handling in the atan2 function by adding proper NaN value processing to prevent undefined behavior and ensure mathematical consistency.
  • #1085: Fixed 4x4 matrix inverse computation in InverseSize4.h to prevent sign flips when compiling with the -Ofast optimization flag.
  • #1070: Fixed pow function test case for mixed integer types to prevent unnecessary type conversion of integer exponents, improving both performance and correctness.
  • #1069: Fixed a test case in the skew_symmetric_matrix3 test suite that was causing MSAN errors due to uninitialized matrices, removing the problematic test to improve stability and memory safety.
  • #1065: Fixed compilation failure in sparse matrix operations on ROCm architecture by addressing issues in MatrixBase.h.
  • #1064: Fixed build errors in Core modules (AssignEvaluator.h and Redux.h) for g++-6 and C++20 by adding required constexpr labels to functions.
  • #1063: Fixes unary pow() function in UnaryFunctors.h by explicitly casting std::pow() results to result_type and discarding const qualifiers in type comparisons.
  • #1061: Fixes the pow function bound in array_cwise tests to properly handle floating-point types, resolving array_cwise_3 test failures.
  • #1060: Fixed realloc usage in Core memory utilities for non-trivial types by adding proper allocation and copy/move construction logic when realloc is not applicable.
  • #1058: Fixed missing comparison operators in GPU PacketMath to resolve build issues with vectorized psign implementation on CUDA.
  • #1057: Fixed overflow threshold bounds in pow tests within test/array_cwise.cpp to prevent integer overflow issues and avoid CI pipeline failures.
  • #1056: Fixed compiler warnings in Core packet math functions and array component-wise tests by modifying GenericPacketMathFunctions.h and array_cwise.cpp test file.
  • #1055: Fixed aligned_realloc() in Memory.h by adding check_that_malloc_is_allowed() to prevent reallocs when EIGEN_RUNTIME_NO_MALLOC is defined, ensuring compatibility with malloc-restricted environments.
  • #1054: Fixed typo in doc/TutorialSparse.dox documentation file.
  • #1053: Fixed a missing semi-colon compilation error in GeneralizedEigenSolver.h that was preventing MSVC builds from succeeding.
  • #1052: Fixed CMake build system issues by removing default benchmark building and resolving test dependency problems in sparse library environments.
  • #1051: Fixed mixingtypes test failures by modifying the unary pow operation to avoid incorrectly relying on the binary op plugin.
  • #1049: Fixed two typos in the slicing tutorial documentation by correcting "vector v" to "matrix A" in the 3rd table for improved clarity and consistency.
  • #1048: Fixed test build errors in unary power operations by correcting return types to match ScalarBinaryOpTraits and removing unnecessary const qualifiers from UnaryFunctors.h.
  • #1046: Fixed pow() function support for complex types by re-enabling the functionality across Core modules including functors, global functions, and array/matrix operations.
  • #1045: Fixed GeneralizedEigenSolver::info() method to always return valid values when decomposition is initialized and replaced m_valuesOkay with m_isInitialized for better error state management.
  • #1044: Fixed missing pointer in realloc call for SparseMatrix. Corrected memory management in sparse matrix reallocations by adding the missing ptr parameter.
  • #1042: Fixed signed integer overflow in array_cwise test by modifying GenericPacketMathFunctions.h and test/array_cwise.cpp to avoid undefined behavior and ensure safer integer arithmetic.
  • #1039: Fixed the psign function in packet math implementations to correctly handle unsigned integer types, particularly addressing a bug where psign<bool> incorrectly returned bool(-1).
  • #1037: Fixed pblend implementation in AVX PacketMath by adding EIGEN_VECTORIZE_AVX2 protection to ensure compatibility with AVX2 architectures and resolve issues in AVX builds without AVX2.
  • #1033: Fixed SYCL tests by addressing sigmoid test specializations in PacketMath, updating binary logic operators to use bitwise casting, and adjusting tensor builtin and random test outputs to match CUDA behavior.
  • #1032: Fixed a deprecated-warning edge case in BDCSVD by disabling compiler warnings when BDCSVD dispatches to JacobiSVD's deprecated constructor with no computation options.
  • #1030: Fixed compilation error in Half.h by disabling aarch64 Half functions during GPU compilation to prevent double-definition conflicts.
  • #1028: Fixed PowerPC AltiVec build compatibility by resolving compilation issues on non-VSX PowerPC architectures across the AltiVec SIMD implementation files.
  • #1027: Fixed vectorized pow() implementation in GenericPacketMathFunctions to handle corner cases with negative numbers and odd exponents. Updated unit tests to verify correct behavior for these edge cases.
  • #1025: Fixed Packet2d type usage in AltiVec Complex.h to conditionally compile only when VSX support is available, preventing compilation errors on non-VSX systems.
  • #1024: Fixed compilation warnings and errors in PowerPC AltiVec GEMM operations while adding partial packet support for real-only matrix multiplication, resulting in up to 40% binary size reduction.
  • #1023: Fixed flaky packetmath_1 test by adjusting input values in test/packetmath.cpp to prevent cancellation issues in pmsub and pnmadd test cases.
  • #1016: Fixed vectorization support in ConfigureVectorization.h by adding immintrin.h header for emscripten builds.
  • #1015: Fixed potential segfaults in AVX512 GEMM kernels by disabling them by default, modifying GemmKernel.h, TrsmKernel.h, and TrsmUnrolls.inc to prevent crashes in applications unless explicitly enabled.
  • #1014: Fixed aligned_realloc in Memory.h to properly call check_that_malloc_is_allowed() when ptr == 0, ensuring compliance with EIGEN_RUNTIME_NO_MALLOC assertions when using conservativeResize().
  • #1012: Fixed vectorized Jacobi rotation by correcting the vectorization check logic in Jacobi.h and its corresponding test, enabling the vectorized code path to be properly used in modern compilers.
  • #1010: Fixed the inner iterator in sparse block implementation to correctly handle the outer index when forwarding to the InnerIterator constructor.
  • #1009: Fixed wrong doxygen group usage in PlainObjectBase.h to ensure proper documentation generation for Eigen's core classes.
  • #1007: Fixed ODR violations in SparseLU_Structs.h and TensorTraits.h by replacing unnamed enums with named types to prevent build failures and ensure consistent type declarations across header files.
  • #1005: Fixed GPU unit tests by enabling device side malloc functionality that was previously disabled, restoring compatibility with ROCm 5.2.
  • #1003: Fixed undef warnings in TriangularSolverMatrix.h by adding conditional macro guards to prevent warnings when AVX512 is not enabled.
  • #996: Fixed kernel names in Constants.h to comply with SYCL-2020 specification by ensuring they are C++ types and forward declarable, avoiding scoped enums for less intrusive changes.
  • #994: Fixed GPU compatibility in Reshaped.h by marking the index_remap function with EIGEN_DEVICE_FUNC attribute. This enables the reshape expression to be used in GPU/CUDA code.
  • #993: Fixed a typo in the Matrix class tutorial documentation where the roles of row and column vectors were incorrectly swapped, improving clarity for new users.
  • #988: Fixed build issues with MSVC for AVX512 kernels by disabling recent optimizations in GemmKernel.h and TrsmKernel.h to resolve compilation errors and reduce memory usage.
  • #987: Fixed integer shortening warnings in visitor tests by modifying test/visitor.cpp to address potential warning issues in the visitor test suite.
  • #984: Fixed file permissions by removing executable flags from header files across Core modules, architecture-specific code, and unsupported components.
  • #980: Fixed a signed integer overflow issue in the adjoint test by modifying test/adjoint.cpp to prevent undefined behavior during testing.
  • #977: Fixed numerical stability issues in the BDCSVD algorithm to improve robustness when handling edge cases and numerical challenges.
  • #976: Fixed AutoDiffScalar zero-value handling in the LDLT solver by modifying TriangularSolverVector to properly check for zero values and update derivative computations.
  • #974: Fixed BDCSVD crash by adding bounds checking to prevent accessing perm(-1) and forcing zhat to zero when handling large matrices of ones.
  • #964: Fixed the InnerPanel template parameter in HouseholderSequence.h to always be false, resolving potential compilation issues and ensuring consistent behavior.
  • #963: Fixed NaN propagation in cwise operations for scalar inputs by adding a missing template parameter in MatrixCwiseBinaryOps.h. Resolves issue #2474 where scalar cwise operations were not properly handling NaN values.
  • #958: Fixed compiler bugs for GCC 10 & 11 in Power GEMM implementation by addressing inline assembly issues related to vector pair handling in AltiVec MatrixProductMMA.
  • #953: Fixed ambiguous DiagonalMatrix constructors by adding a single initializer-list constructor to resolve overloading conflicts between scalar and vector list initialization.
  • #952: Fixed test failures when running with EIGEN_TEST_NO_EXPLICIT_VECTORIZATION by addressing alignment assumptions in five test files. Modified dense_storage, dynalloc, evaluators, mapped_matrix, and mapstride tests to work around vectorization-related issues.
  • #949: Fixed ODR (One Definition Rule) issues in lapacke_helpers.h to ensure proper linkage and C++ standard compliance.
  • #948: Fixed MSVC+CUDA compatibility issues in Core modules and Tensor components by replacing internal typedefs with true types and addressing compiler warnings related to macro arguments and friend declarations.
  • #945: Fixed max size expressions in Core module components including DenseBase, SolverBase, and TriangularMatrix to ensure consistency in size limits and prevent compilation errors due to size constraints.
  • #942: Fixed navbar scrolling issues in documentation by updating the resize logic in eigen_navtree_hacks.js and adjusting CSS positioning for the table of contents to resolve scrollbar interference.
  • #941: Fixed scalar comparison logic in the test suite by adding proper handling of inf/nan values in the test_isApprox function in test/main.h.
  • #940: Restored std::remove* aliases in Meta.h to fix compatibility issues with third-party libraries that depend on these previously available type traits.
  • #934: Fixed the order of template arguments in the BLAS syrk function in blas/level3_impl.h to resolve compiler errors and ensure compatibility with the C++ standard when Row/ColMajor modes are no longer implicitly convertible to bool.
  • #930: Fixed compilation issues in HouseholderQR by adding a missing typename for GCC 9 compatibility and removed an unused typedef in NNLS test to eliminate compiler warnings.
  • #929: Fixed GEMV compilation issues in TensorFlow by splitting the general_matrix_vector_product interface in AltiVec MatrixVectorProduct into separate ColMajor and RowMajor macros.
  • #926: Fixed namespace usage in StlSupport and TensorContractionSycl modules to resolve compilation errors and ensure SYCL framework compatibility.
  • #925: Fixed ODR violation in the trsm function by marking it as inline in AVX512 unrolls implementation to resolve compiler warnings and ensure proper linkage.
  • #924: Fixed f16c scalar conversions in Half.h by disabling them for MSVC to address missing compiler support and prevent compilation issues.
  • #923: Fixed AVX512 builds with MSVC by adding explicit cast intrinsics in PacketMath.h and enabling AVX512 testing in CMakeLists.txt.
  • #922: Fixed MSVC compiler bug in Diagonal.h and Transpose.h by adding extra const qualifiers to prevent the compiler from incorrectly dropping const from Const**ReturnType definitions.
  • #919: Fixed a missing parenthesis in the tutorial documentation. The syntax error was corrected in the TutorialSlicingIndexing.dox file to improve code readability and correctness.
  • #918: Fixed compilation issues in AVX512 architecture by adding explicit reinterprets to _mm512_shuffle_f32x4 calls that were causing build errors with g++.
  • #917: Fixed geo_orthomethods test by adding workaround for g++-10 compiler bug in Docker environment to resolve test failure in geo_orthomethods_4.
  • #915: Fixed missing pound character in AltiVec MatrixProduct implementation. Corrected syntax error in Eigen/src/Core/arch/AltiVec/MatrixProduct.h.
  • #914: Fixed flaky test cases by disabling the non-convergence check in the Schur decomposition test suite. This change removes unreliable test assertions that were causing intermittent failures.
  • #913: Fixed PowerPC MMA build issues by adding a dynamic dispatch option and modifying AltiVec MatrixProduct files to handle MMA flags appropriately, preventing LTO-related build failures by default.
  • #911: Fixed a mixup between RowMajorBit and RowMajor logic in the SVD UpperBidiagonalization implementation. This ensures proper handling of different matrix layouts to prevent incorrect behavior in SVD operations.
  • #910: Fixed PowerPC MMA build issues by reverting previous changes to AltiVec MatrixProduct modules. Restored compatibility with PowerPC architectures by properly handling MMA flags without requiring additional build configuration.
  • #908: Fixed incorrect reference code in STL_interface.hh for the ata_product functionality in the benchmark testing library.
  • #907: Fixed PowerPC MMA build configuration by adding a dynamic dispatch option and defaulting MMA usage for Power10 architectures. Resolved LTO compilation issues and improved PowerPC compatibility.
  • #901: Fixed construct_at and destroy_at functions in Core memory utilities to resolve compilation breakage on ROCm platform.
  • #900: Fixed the swap test to properly handle matrices of size 1, preventing sporadic assertion failures in the test suite.
  • #890: Fixed duplicate IsRowMajor declaration in BooleanRedux.h to eliminate compiler warnings.
  • #887: Fixed vectorization_logic tests by adding support for ignoring non-beneficial unrolling/traversal and adjusting matrix sizes for platforms where unaligned vectorization is disabled, improving test pass rates across all vectorization platforms.
  • #885: Fixed enum conversion warnings in BooleanRedux module by modifying the BooleanRedux.h implementation to address potential compiler warnings and improve code clarity.
  • #882: Fixed SVD implementation compatibility issues with MSVC+CUDA by resolving Index type definition conflicts and addressing non-void function return warnings in BDCSVD and JacobiSVD modules.
  • #880: Fixed SVD computations in MSVC by properly storing and handling the Options template parameter in BDCSVD and JacobiSVD classes, resolving a critical bug that caused incorrect results for fixed-sized SVDs.
  • #879: Fixed reduction operations in BooleanRedux module to handle row-major layout efficiently. Optimized any/all reduction performance for row-major matrices.
  • #878: Fixed frexp packetmath tests to handle MSVC's incorrect behavior with non-finite inputs by ensuring the exponent is properly reset in test cases.
  • #877: Fixed deprecated warnings in SVD tests by disabling them specifically for MSVC compiler in bdcsvd.cpp and jacobisvd.cpp test files.
  • #876: Fixed AVX512 complex number operations by removing problematic _mm512_broadcast_f64x2 instruction that caused data corruption and test failures with g++-11 compiler optimizations.
  • #875: Fixed compilation error in packetmath test by adding wrapper structs to allow passing overloaded functions as functors.
  • #874: Fixed a gcc-5 compiler bug in packetmath test where data1 was filled with incorrect values (-0, 0) and (-0, NaN) under optimization levels -O2 or higher. Resolved by initializing memory to zeroes in the packetmath_minus_zero_add() function.
  • #873: Fixed deprecated warnings in SVD test files by disabling warning flags in test/bdcsvd.cpp, test/jacobisvd.cpp, and test/svd_common.h to reduce build noise and improve test clarity.
  • #870: Fixed test macro conflicts with STL headers in C++20 by modifying test/main.h to resolve compatibility issues.
  • #869: Fixed CMake configuration for SYCL support by simplifying the build setup and removing unnecessary workarounds. Also temporarily disabled problematic sigmoid tests due to SYCL function limitations.
  • #866: Fixed an SPQRSupport crash by initializing pointers to nullptr to prevent invalid free() calls in the destructor.
  • #864: Fixed EIGEN_UNUSED decorations in architecture-specific math functions by removing unnecessary annotations from functions that are actually used across AVX, SSE, NEON, and other platform implementations.
  • #862: Fixed SVD U/V matrix sizing for fixed-sized inputs by restoring original dimensions in SVDBase.h, eliminating unexpected dynamic sizing behavior.
  • #861: Fixed FixedInt in IntegralConstant.h by making it constexpr and removing the static qualifier to resolve ODR violations for `fix<N>`.
  • #859: Fixed an MSVC+NVCC 9.2 pragma error in DisableStupidWarnings.h by adding support for the Microsoft-specific __pragma extension to bypass _Pragma compatibility issues.
  • #858: Fixed sqrt/rsqrt functions for NEON architecture by replacing hand-written implementations with generic versions that correctly handle edge cases like 0 and infinity.
  • #857: Restored the deprecated svd::compute(Matrix, options) method in SVD classes to maintain backward compatibility and prevent breaking external projects that still rely on this interface.
  • #851: Fixed JacobiSVD_LAPACKE bindings to reflect the current SVD module's runtime options and ensure compatibility with the latest SVD module implementation.
  • #847: Fixed compiler warnings and errors in PowerPC-specific GEMM and GEMV implementations by cleaning up unnecessary code in AltiVec architecture files.
  • #845: Fixed numeric_limits static data members for BFloat16 and Half types by moving implementation into a class template to avoid ODR violations.
  • #844: Fixed COPYING.MPL2 file by updating the license reference to use HTTPS protocol instead of HTTP for secure connections.
  • #843: Fixed namespace collision in TriangularMatrixMatrix.h by renaming local variables from _* to * to avoid conflicts with resolve.h.
  • #842: Fixed a typo in the CompleteOrthogonalDecomposition class documentation, correcting "matrixR()" to "matrixT()" to match the actual method name.
  • #836: Fixed GCC<6.3 maxpd workaround in SSE PacketMath to only apply to GCC compilers, preventing the workaround from being incorrectly applied to non-GCC compilers like Clang.
  • #835: Fixed ODR (One Definition Rule) violations in header files by removing unnamed namespaces from GPU PacketMath and Tensor modules to ensure C++ standards compliance.
  • #833: Fixed 32-bit ARM integer type handling in packet math functions by correcting type discrepancies between int32_t and long int in GenericPacketMathFunctions.h and NEON PacketMath.h.
  • #832: Fixed AVX512 math function consistency issues and enabled AVX512 support for Intel C++ Compiler (ICC) in Eigen's architecture-specific implementations.
  • #828: Fixed cache overflow issue in PowerPC AltiVec GEMV implementation by adjusting block column count to prevent abnormal cache behavior.
  • #825: Fixed floating point warnings throughout Eigen library by replacing direct equality comparisons with strict utility functions and adding explicit casts for implicit conversions in both core modules and test suite.
  • #822: Fixed a potential overflow issue in the random number generation code by making casts explicit and adjusting the maximum value for the short offset in test/rand.cpp.
  • #818: Fixes MSVC compiler warnings C4701 and C4702 in the Memory.h utility by silencing warnings related to uninitialized variables and unreachable code in construct_elements_of_array().
  • #816: Fixed EIGEN_OPTIMIZATION_BARRIER macro to support soft float ARM architecture by removing "w" inline assembly constraint for ARM targets, enabling Eigen to build with ARMv6j+nofp flags.
  • #815: Fixed implicit conversion warning in GEBP kernel's packing code by modifying GeneralBlockPanelKernel.h to remove int to Index conversion warning.
  • #814: Fixed comment in Umeyama.h by updating reference from removed macro EIGEN_SIZE_MIN_PREFER_DYNAMIC to the new constexpr function.
  • #812: Fixed implicit conversion warning in vectorwise_reverse_inplace by explicitly casting the variable half from Eigen::Index to int in the Reverse.h implementation.
  • #811: Fixed compilation issue in Meta.h for GCC versions below 10 when using C++2a standard by addressing std::ssize availability problems.
  • #810: Fixes corner cases in logistic sigmoid implementation by ensuring it returns 1 for inputs >= 1 and handles positive infinity correctly.
  • #809: Fixed broken assert statements in IncompleteCholesky iterative linear solver by correcting variable name checking logic.
  • #808: Fixed type casting issues in LU Determinant module by adding explicit type casting for pmadd function to ensure compatibility with custom scalar types.
  • #806: Fixed misleading assertion messages in IterativeSolverBase that incorrectly referred to the class as "ConjugateGradient" instead of "IterativeSolverBase". This ensures consistency and clarity in error messages across the iterative linear solvers module.
  • #805: Fixed inconsistency between scalar and vectorized paths in array.exp() function to ensure consistent return values across different implementation paths.
  • #802: Fixed a truncation bug in CoreEvaluators.h where unsigned int values were incorrectly truncated to bool, ensuring proper integer truncation behavior.
  • #801: Fixed numeric_limits implementations for BFloat16 and Half types, correcting signaling_NaN, denorm handling, epsilon calculation, and various flags. Also added AVX psqrt workaround to prevent negative denormal values from being flushed to zero.
  • #800: Fixed GPU unit tests in test/gpu_test_helper.h that were broken due to serialization API changes, resolving issues for HIP compatibility.
  • #794: Fixed header guard duplication in ZVector architecture by replacing ALTIVEC_H guards with ZVECTOR_H in Complex.h and MathFunctions.h to resolve potential conflicts when including both AltiVec and ZVector headers.
  • #790: Fixed missing internal namespace qualifiers in vectorization logic tests to properly handle namespace-related issues and improve test coverage.
  • #789: Fixed ConfigureVectorization.h to conditionally include immintrin.h only when F16C is available and vectorization is enabled, preventing failures when vectorization is explicitly disabled.
  • #788: Fixed documentation formatting and compiler warnings across multiple Eigen modules by correcting \tparam tags, removing unnecessary semicolons, and updating float literal types.
  • #785: Fixed Clang compiler warnings in Core architecture modules by aligning variables properly and removing incorrect floating point suffixes from double constants.
  • #782: Fixed a bug in the EIGEN_IMPLIES macro in PackedTriangularMatrixVector.h that was causing side-effects to be conditionally short-circuited, introduced in merge request !751.
  • #774: Fixes CMake compatibility issues in EigenTesting.cmake to enable HIP unit tests with the latest CMake version.
  • #771: Fixed ADL conflicts in Eigen::internal by renaming the size function to ssize and updating it to be compatible with C++20's ssize standard function for improved type safety.
  • #769: Fixed build error in SPQRSupport module by including the correct Eigen headers instead of headers from the source directory.
  • #767: Fixed the exp() function in vectorized expressions to correctly return 0.0 for -Inf inputs instead of non-zero values. Modified GenericPacketMathFunctions.h to ensure consistent behavior across vectorized operations.
  • #762: Fixed documentation snippets for slicing operations by updating examples in array expressions, raw arrays, and standard vectors to reflect correct usage and behavior.
  • #749: Reverted SVD module changes to restore compatibility with third-party libraries that were disrupted by previous computation options modifications.
  • #746: Fixed Cholesky LLT decomposition to properly handle zero-sized matrices by making the Lapacke-based implementation a no-op for empty matrices, ensuring test suite compatibility and preventing errors.
  • #745: Fixed HIP compilation issues in SelfAdjointView and TriangularMatrix core classes. Resolved compilation errors to ensure compatibility and stability for the HIP backend.
  • #741: Fixed DenseBase compilation failures in HIP environments by adding EIGEN_DEVICE_FUNC modifiers to ensure compatibility with device code.
  • #730: Fixed indexed views for non-Eigen types by addressing stride computation issues that could cause signed integer overflow. Added explicit checks for undefined increment values and set stride to Dynamic when detected.
  • #720: Fixes a typo in Eigen/src/Core/util/Memory.h.
  • #719: Fixed sparse-sparse matrix product handling when using mixed StorageIndex types by correcting the storage index usage in ConservativeSparseSparseProduct.h and added corresponding test cases.
  • #714: Fixed uninitialized matrix issue in nestbyvalue test by initializing matrices to Random() to prevent NaN values and ensure consistent behavior across different systems.
  • #711: Fixed a bug in ConfigureVectorization.h where EIGEN_HAS_FP16_C was not being defined for non-clang compilers, correcting the conditional logic to ensure proper definition across different compiler types.
  • #707: Fixed total deflation issue in BDCSVD when the matrix M is already diagonal, and added unit tests to improve test coverage for this edge case.
  • #703: Fixed nan propagation in ArrayCwiseBinaryOps for scalar "other" in min/max operations by copying input type from EIGEN_MAKE_CWISE_BINARY_OP to ensure consistent behavior across different input types.
  • #701: Fixed alignment qualifier placement in ZVector architecture by moving alignas to come first in Complex.h and PacketMath.h to eliminate compiler warnings.
  • #698: Fixed CommaInitializer to properly reuse fixed dimensions by modifying the block size handling in Eigen/src/Core/CommaInitializer.h, preventing potential issues with variable-sized blocks.
  • #696: Fixed visitor return type compatibility in Core/Visitor.h by removing const qualifier to resolve build issues with pload/ploadu on ARM/PPC architectures.
  • #695: Fixed boostmultiprec test compilation issue with older Boost versions by avoiding symbol redefinition that caused build failures for Boost versions prior to 1.77.
  • #694: Fixed ZVector build issues for cross-compiled environments by modifying Complex.h and PacketMath.h to address configuration problems with s390x-linux-gnu-g++ and qemu setups.
  • #692: Fixed Qt6 compatibility in Transform.h by extending EIGEN_QT_SUPPORT and excluding incompatible functions when building with Qt6.
  • #686: Fixed bit_cast implementation in NumTraits.h by reverting to memcpy for CUDA code to avoid undefined behavior from reinterpret_cast.
  • #680: Fixed PowerPC AltiVec packing implementation by inverting rows and depth in the non-vectorized portion, resolving incorrect row-end detection that caused bad test results and making extra-row processing 5X faster.
  • #671: Fixed GPU special function tests by correcting incorrect values in test/main.h and updating VERIFY_IS_CWISE_APPROX to handle scalars properly, addressing copy-paste errors from PR #398.
  • #666: Fixed EIGEN_INHERIT_ASSIGNMENT_EQUAL_OPERATOR macro in Core/util/Macros.h to resolve compilation errors when using MSVC with NVCC, particularly for Visual Studio 2017.
  • #665: Fixed tuple compilation errors in VS2017 by modifying Eigen/src/Core/arch/GPU/Tuple.h to use TupleImpl instead of tuple to resolve alias type deduction issues.
  • #664: Fixed complex compound assignment operators for CUDA by disabling their testing on MSVC to prevent duplicate definition errors and improve compiler compatibility.
  • #663: Fixed CUDA compiler warnings by disabling additional warning types in DisableStupidWarnings.h for CUDA versions 9.2 and 11.4.
  • #661: Fixed typos in English text across multiple Eigen library files including core modules, tensor components, and test files, correcting spelling errors like "assignement" to "assignment" to improve code readability and consistency.
  • #660: Fixed typos across multiple Eigen modules including Core, Eigenvalues, SVD, SparseCore, and unsupported components. Corrected spelling errors in header files, documentation, CMake files, and test utilities throughout the codebase.
  • #659: Fixed alias violation in BFloat16 type handling by correcting undefined behavior from improper reinterpret_cast usage, enhancing stability and performance for BFloat16 data conversion.
  • #657: Fixed implicit conversion warnings in tuple_test.cpp by addressing compiler warnings related to implicit conversions in the tuple test suite.
  • #656: Fixed strict aliasing bugs in AVX, AVX512, and SSE Complex.h implementations that were causing packet loading skips and product_small test failures in matrix multiplication operations.
  • #654: Fixes GCC string overflow warning in the initializer_list_construction test by silencing the compiler warning that was causing noise in the test suite.
  • #653: Fixes GPU tests by disabling specific subtests in test/gpu_example.cu that fail on HIP due to missing device-side malloc/free functionality.
  • #651: Fixed AVX512 build configuration by removing the unnecessary -fabi-version=6 flag from CMakeLists.txt to resolve compilation issues.
  • #643: Fixes compilation error in GPU test helper for HIP platform compatibility.
  • #639: Fixed typos and unaligned load issues in AVX2 PacketMath.h, correcting ps -> epi32 references and ensuring proper memory alignment for packet math operations.
  • #638: Fixed missing integer packet types in the pset1 call within the AVX PacketMath.h implementation, improving packet type support for vectorized operations.
  • #635: Fixes tridiagonalization_inplace_selector by introducing CoeffVectorType template parameter to resolve build errors from hCoeffs and MatrixType option mismatches.
  • #630: Fixed AVX integer packet operations by adding proper AVX2 vectorization guards and correcting missing semicolon in AVX512 implementation.
  • #629: Fixes EIGEN_OPTIMIZATION_BARRIER macro in Eigen/src/Core/util/Macros.h by adding support for the "g" constraint in arm-clang compiler to resolve GitLab CI build issues.
  • #621: Fixed GCC 4.8 compilation issue in NEON PacketMath by changing from 'g' to 'r' register constraint and addressed minor macro warnings on armv7 architectures.
  • #618: Fixed CUDA 9 compilation issues by adding missing EIGEN_DEVICE_FUNC macros to Macros.h and Block.h, enabling the gpu_basic test to pass with CUDA 9.1.
  • #616: Fixed CUDA half-precision vectorization by disabling __half vectorization on host architectures to prevent build errors and ensure compatibility with CUDA 10 and earlier versions.
  • #615: Fixed Windows on ARM compilation by including the intrin.h header in Eigen/Core to support BitScanReverse and BitScanReverse64 functions.
  • #614: Fixed LAPACK CMake configuration to allow compilation of legacy Fortran test code with GNU Fortran 10 by addressing argument type mismatch errors.
  • #613: Fixed Eigen::fix functionality by correcting typos that incorrectly checked EIGEN_HAS_CXX14 instead of EIGEN_HAS_CXX14_VARIABLE_TEMPLATES. Updated IntegralConstant.h and symbolic_index test to properly handle cases where variable templates are not supported.
  • #604: Fixes a Visual Studio 2017 compiler bug in MathFunctions.h where std::arg incorrectly returned 0 for negative real numbers instead of PI. Implements a workaround to ensure correct behavior across all supported compilers.
  • #600: Fixed missing PPC packet comparisons in AltiVec PacketMath implementation. Resolved packetmath test failures on the PPC architecture pipeline.
  • #598: Fixes minor documentation issue in Map.h to correct documentation formatting or content.
  • #596: Fixes F32ToBf16 function in AltiVec PacketMath by reversing compare logic to work around missing vec_cmpne instruction on Power8 architecture, enabling clang10 compilation support.
  • #595: Fixed compiler warnings in AltiVec MatrixProduct modules by initializing pointers to NULL to eliminate uninitialized-variable warnings in GCC 11+.
  • #591: Fixed compatibility issues in AltiVec architecture code for older Power compilers by addressing missing vec_neg function in GCC 7.5 and const pointer issues with vec_xl in Clang 10.
  • #588: Fixed undefined behavior in test files by conditionally setting AnnoyingScalar::dont_throw only when EIGEN_TEST_ANNOYING_SCALAR_DONT_THROW macro is defined.
  • #575: Fixed template identifier naming conventions across the Eigen library by removing leading underscores followed by capital letters from template identifiers. This change affects multiple core modules including matrix operations, linear algebra solvers, and geometry components to improve code compliance and reduce potential compiler warnings.
  • #569: Fixed assignment operator compatibility issues in Core/util/Macros.h for MSVC+NVCC configurations. Resolved duplicate definition and implicit deletion errors affecting the latest MSVC compiler versions.
  • #565: Fixed compilation error in JacobiSVD with HouseholderQRPreconditioner when using row-vector input by applying the same fix used for FullPiv and ColPiv preconditioners.
  • #564: Fixed MPReal detection and support by adding FindMPREAL.cmake module and removing outdated internal mpreal.h file. Resolved compatibility issues with latest mpfr version and fixed min/max function bugs.
  • #563: Fixed CMake configuration warnings by replacing find_package with find_dependency in package files and corrected case mismatches in FindPASTIX/FindSCOTCH modules. Resolved package name mismatches and file corruption issues to improve build system reliability.
  • #562: Fixed scalar packet operations in XprHelper.h and GenericPacketMath.h to avoid undefined behavior by using appropriate scalar operations instead of bitwise operations for non-POD types.
  • #560: Fixed TriSycl CMake configuration files to ensure compatibility with the latest TriSycl version and enable proper compilation with c++17 support.
  • #557: Fixed HIP GPU backend compilation by enabling extract operations and resolving build issues in BlasUtil.h.
  • #554: Fixed in-place matrix inversion in LU module by making a copy of the input matrix before performing the inverse operation to prevent undefined behavior and unintentional modifications.
  • #551: Fixed breakage in ConjHelper.h that caused issues when using conj_helper with custom types, addressing problems introduced by merge request !537 and improving robustness of complex conjugation functionality.
  • #549: Fixed PartialPivLU inverse computation by adding null pointer and bounds checks for empty or single-column matrices. Prevents undefined behavior and memory access errors in edge cases.
  • #539: Fixed conjhelper functionality in ConjHelper.h for AMD HIP GPUs that was broken by commit 52a5f982.
  • #535: Fixes CMake build system to conditionally build shared libraries only when supported by the platform, preventing build errors on systems without shared library support.
  • #532: Fixed NEON PacketMath declarations for aarch64-pc-windows-msvc by adding EIGEN_COMP_CLANG == 0 condition to prevent MSVC declaration conflicts with Clang compiler.
  • #530: Fixed missing macro replacement in IntegralConstant.h to resolve compiler issues with GCC 4.9.3 and ensure proper C++14 compatibility.
  • #525: Fixed missing pcmp_lt_or_nan<Packet8bf> implementation in AltiVec PacketMath for PPC architecture. Added the necessary definition and test coverage to ensure proper functionality.
  • #524: Fixed enum arithmetic across multiple Eigen modules by removing deprecated enum arithmetic operations to ensure C++20 compatibility.
  • #522: Fixed MinGW compiler version detection by modifying Meta.h to properly parse MinGW version strings and adding make_unsigned support for long long types.
  • #518: Fixed C++20 compiler warnings about using enums in arithmetic expressions across multiple Eigen core modules including functors, evaluators, and numeric traits by updating code to comply with modern C++ standards.
  • #517: Fixed version parsing bug in cmake/EigenTesting.cmake by stripping whitespace from nvhpc compiler version strings to ensure correct extraction of version information.
  • #511: Fixed missing NEON ptranspose implementations by adding PacketBlock<Packet,4> support and unifying the code to use vzip operations for improved consistency and performance.
  • #509: Fixed CUDA compiler warning in CwiseBinaryOp by removing EIGEN_DEVICE_FUNC annotation from the default copy constructor.
  • #506: Fixed conservative_resize_like_impl in PlainObjectBase to use derived object type, preventing assertion errors and memory corruption when resizing with DontAlign flag.
  • #503: Fixed test/prec_inverse_4x4.cpp to ensure all generated matrices for inverse_4x4 tests are invertible, resolving issue #2248 where non-invertible matrices caused test failures.
  • #499: Fixed the order of inclusion for numext::imag and numext::real functions in Eigen/src/Core/MathFunctions.h to ensure they are defined before use. Resolved a compiler error in HIP compiler environment by correcting the function definition ordering.
  • #498: Fixed SSE complex packet storage format in Core/arch/SSE/Complex.h to match wrapper variables, resolving issue #2242 with potential mixing of Packet2cf and Packet1cd values.
  • #495: Fixed the return type of numext::arg function in Core MathFunctions to return the correct real type instead of complex type, resolving compile errors and ensuring consistent behavior.
  • #494: Fixed ABI compatibility issue in the conj function by restoring the second template parameter to resolve conflicts with Boost library specializations.
  • #492: Fixed issue #2242 by removing unused paddsub<Packet2cf> function from NEON and SSE complex number implementations. This reduces code complexity and eliminates potential build issues related to the unused function.
  • #486: Fixes complex division in CUDA implementation by implementing Smith's algorithm for better numerical stability and avoiding NaN results for edge cases with subnormal numbers.
  • #483: Fixed AVX512 implementation bug in pcmp_lt_or_nan function and added corresponding test cases to improve test coverage.
  • #478: Fixed DenseStorage copy/swap operations to safely handle only initialized elements in dynamic matrices with fixed-sized storage, preventing undefined behavior and improving efficiency.
  • #474: Fixed compiler warnings in AltiVec MatrixProduct modules by addressing rvalue template address-taking issues that occurred during TensorFlow compilation.
  • #472: Fixed compilation errors in basicbenchmark module by replacing the deprecated calls lazy() and evalOMP() with their current equivalents.
  • #469: Fixed ldexp implementation in AVX512 architecture by correcting shuffle operations and introducing proper permute instructions to interleave data halves.

Other improvements

  • #1932: Updates CMake configuration by setting POLICY CMP0177 to NEW in the main CMakeLists.txt file.
  • #1930: Improves sparse x dense dot product accuracy and performance by replacing standard multiply-add operations with numext::fma in SparseDot.h, reducing computational error and providing speedup for SparseQR operations.
  • #1928: Improves CI/CD infrastructure by moving default builds and tests from a single Linux machine to GitLab runners, enhancing workflow efficiency and reducing single-point dependencies.
  • #1924: Improved ARM NEON packet alignment by addressing strict alignment requirements for SIMD vectors and enabling aligned load/store operations for ARM32 architecture.
  • #1923: Improved code organization by moving HIP/CUDA defines from unsupported Tensor module to Core utilities. Relocated GPU-related define files to reduce dependency on unsupported code and enhance maintainability.
  • #1917: Improved CI testing infrastructure by transitioning ARM and PPC tests from native hardware to QEMU emulation. This change enhances test environment reliability and reduces dependency on specific hardware for cross-architecture testing.
  • #1910: Improves half-precision floating point comparison performance by adding faster scalar and intrinsic implementations for emulated half comparisons in PacketMath.h and Half.h.
  • #1902: Optimizes maxCoeff and related functions by implementing a vectorized approach with delayed index determination and linear-access evaluators for improved performance on large matrices.
  • #1899: Improved packet math operations by enabling default behavior for pmin and predux_min functions, removing redundant specializations across Intel architectures and supporting integral types with simplified forwarding logic.
  • #1898: Improved Intel packet reductions by organizing reduction operations into separate Reductions.h files for AVX, AVX512, and SSE architectures, and added missing predux operations with NaN propagation support.
  • #1894: Improves scalar construction in Tridiagonalization to explicitly construct scalars for non-implicitly convertible types like ceres::Jet, enhancing compatibility with custom scalar types in SelfAdjointEigenSolver.
  • #1886: Improves BDCSVD and JacobiSVD performance by adding MatrixBase overloads to avoid unnecessary matrix copies when computing SVD decompositions.
  • #1881: Optimizes the slerp() method in Eigen's Quaternion class by replacing expensive trigonometric functions with a more efficient direct formula implementation.
  • #1878: Improved partial redux operations by enabling packet segment support, optimizing performance for rowwise sum operations and enhancing efficiency for large matrix operations.
  • #1875: Improves memset optimization in Core/Fill.h by adding support for std::complex types, enabling better performance for complex-type memory operations.
  • #1866: Improved packet output streaming by adding a more reliable postream function to GenericPacketMath.h and removing the conflicting test/packet_ostream.h file.
  • #1864: Improves cmake/EigenConfigureTesting.cmake by enabling parallel ctest execution across all available cores by default for faster testing.
  • #1863: Improved floating-point arithmetic for Half and BFloat16 types by implementing fallback to fma when native fma operations are not available on the platform, preventing overflow issues in corner cases.
  • #1855: Improved the Eigen ForkJoin scheduler by generalizing the ThreadPool argument to accept any ThreadPool interface. This enhances flexibility and portability by allowing users to customize threading behavior with different thread pool implementations.
  • #1853: Optimizes matrix operations by adding more .noalias() calls across multiple Eigen modules including Eigenvalues, Geometry, LU, and SVD components to reduce unnecessary memory accesses and improve performance.
  • #1852: Improved AVX512FP16 vectorization by adding native _Float16 support and updating associated math functions and type casting operations for better performance.
  • #1846: Improved AssignmentFunctors.h by unifying assignment functors with existing scalar operations, consolidating redundant code and ensuring consistent behavior between compound and simple assignments.
  • #1837: Improves CI documentation pipeline by building docs on push and preventing automatic expiration to ensure documentation is preserved even if pipeline fails.
  • #1830: Improved Eigen's assignment functors and evaluators by adding constexpr support, enabling compile-time evaluation of assignments in C++17 constexpr contexts.
  • #1829: Improved AssignmentEvaluator by refactoring AssignEvaluator.h to remove enums and legacy patterns. Enhanced code clarity and maintainability without functional changes.
  • #1827: Improves complex scalar type handling in Eigen's core and eigenvalue modules by removing the assumption of std::complex and adding support for custom complex types.
  • #1824: Improves FullPivLU condition number computation by returning zero for non-invertible matrices instead of potentially undefined behavior.
  • #1817: Improves CI test configuration by adding EIGEN_CI_CTEST_ARGS for custom timeout and renaming existing ctest arguments for better consistency across GitLab CI scripts.
  • #1813: Improved alignment handling in Memory.h by raising the maximum supported alignment to 256 bytes, enabling better compatibility with modern ARM architectures and cache-line alignment requirements.
  • #1801: Improves SimplicialCholesky analyzePattern routine performance by implementing advanced algorithms, reducing runtime from 7.5 minutes to less than 0.5 seconds on large benchmark problems.
  • #1800: Improves documentation in ForkJoin.h by fixing typos and enhancing clarity and formatting of comments.
  • #1796: Improved block operations documentation by clarifying that block objects can have non-square dimensions in the tutorial example.
  • #1794: Improved cross product documentation in OrthoMethods.h to clarify behavior for complex numbers.
  • #1788: Improved CI efficiency by removing the unused Ubuntu ToolChain PPA repository from the CI script, eliminating unnecessary dependencies in the build environment.
  • #1779: Improved matrix construction and assignment performance by adding fill_n and memset optimizations for ConstantReturnType and ZeroReturnType expressions in Core assignment evaluators.
  • #1776: Improved CI deployment pipeline by switching to Alpine Linux image, removing unnecessary dependencies to speed up nightly tag deployment.
  • #1775: Improves CI/CD pipeline by removing branch name from nightly tag job configuration in deploy.gitlab-ci.yml to simplify and enhance maintainability.
  • #1774: Improved matrix equality operator in MatrixBase to support comparison between matrices of different sizes, fixing issue #1061 and enhancing flexibility for dimensional comparisons.
  • #1773: Improves CI deployment pipeline by modifying ci/deploy.gitlab-ci.yml to use specific commit tags instead of branches for more consistent and reliable builds.
  • #1772: Improves CI/CD pipeline reliability by modifying the git clone strategy in deploy.gitlab-ci.yml to ensure proper branch management and force tag updates to latest head.
  • #1771: Updates the CI deployment configuration by modifying the deploy job settings in the GitLab CI pipeline.
  • #1770: Improves CI configuration by experimenting with Alpine Linux for the formatting pipeline in checkformat.gitlab-ci.yml.
  • #1766: Updates the ROCm Docker image configuration in the Linux CI pipeline to use a newer version.
  • #1763: Improved documentation for move constructor and move assignment operators in core Eigen classes (Array, Matrix, PlainObjectBase) by adding proper doc strings.
  • #1761: Improved map fill operations in Core/Fill.h by adding support for custom stride logic, including 0/0 stride and optimized contiguous memory access handling.
  • #1759: Improved pow(x,y) function for <float,int> types by refactoring special case handling and reverting to repeated squaring algorithm for better precision and performance.
  • #1756: Improved pow(x,y) function performance by optimizing the log2() operator and refining integer exponent handling, achieving 25% speedup while maintaining accuracy below 3 ulps.
  • #1755: Optimizes setConstant and setZero operations in Core assignment logic by implementing fill_n and memset support, achieving up to 57% performance improvements for matrix initialization.
  • #1754: Optimized the pow() function in GenericPacketMathFunctions by replacing the complex implementation with pldexp_fast, achieving 5-6% performance improvements for power operations in AVX2+FMA mode.
  • #1752: Improved the exp function in GenericPacketMathFunctions.h to prevent premature overflow to infinity while providing a 3-4% performance speedup.
  • #1750: Optimized the exponential function (pexp) in packet math implementations for SSE and AVX architectures, achieving 30-35% speedup by using faster code paths for non-subnormal results and adjusting clamping logic.
  • #1748: Improved NullaryFunctors.h by removing unnecessary HasBlend trait check, simplifying code and potentially reducing performance overhead.
  • #1746: Simplified exception handling across the Eigen library by replacing legacy EIGEN_NOEXCEPT macros with modern C++ noexcept specifiers, improving code consistency and maintainability.
  • #1744: Improved code consistency across the Eigen library by replacing EIGEN_CONSTEXPR macro instances with standard constexpr keyword. Removed redundant inline keywords from constexpr functions to follow modern C++ practices.
  • #1739: Improves portability in Memory.h by replacing deprecated C99 size_t macro with standard numeric limits for overflow checking.
  • #1737: Improves fixed-size matrices to conform to std::is_standard_layout by modifying DenseStorage and Memory handling, enabling safe type punning and better compiler compatibility.
  • #1735: Enhanced element accessor operators (operator() and operator[]) in Core modules by making them constexpr, enabling their use in compile-time contexts and improving template code performance.
  • #1734: Improved AVX and AVX512 PacketMath implementations by adding more predux_any operations. Enhanced vectorized performance for linear algebra operations using advanced AVX instruction sets.
  • #1732: Improved the erfc function implementation by adding vectorized support for double precision and enhancing float precision accuracy to 5 ulps with up to 86% speedup across multiple vector architectures (AVX2, AVX512, SVE).
  • #1731: Improves compiler compatibility in StlIterators.h by replacing direct __cplusplus checks with the standardized EIGEN_CPLUSPLUS macro to avoid penalizing MSVC users.
  • #1727: Improved fixed-size Eigen objects by adding support for trivial move assignment. Modified core classes including Array, Matrix, and PlainObjectBase to enable trivial move semantics for better performance.
  • #1719: Improved test coverage for sizeof() function by adding test cases for dynamic dimensions in the sizeof.cpp test file.
  • #1709: Improved polynomial evaluation in GenericPacketMathFunctions and SpecialFunctionsImpl by replacing manual evaluation with the ppolevl helper function for better performance and code clarity.
  • #1697: Improved SSE PacketMath implementation by removing an unnecessary call to _mm_setzero_si128, reducing redundant computations and potentially enhancing performance.
  • #1696: Improves fixed-size matrices and arrays by making them trivially default constructible, removing unnecessary constructor variants and requiring compilation with debug assertions disabled.
  • #1694: Improves fixed size matrices and arrays by making copy and move constructors trivial, enabling better compiler optimizations and C++ standard compliance.
  • #1692: Optimizes the dot product implementation in InnerProduct.h by improving bounds calculations, adding branch handling to avoid unnecessary vector code, and simplifying the scalar loop for better performance on small matrix sizes.
  • #1691: Improved NonBlockingThreadPool.h compatibility by replacing plain asserts with eigen_plain_assert to support projects not using C++20.
  • #1684: Improved the atanh function by adding vectorized implementations for SSE 4.2, AVX2 + FMA, and AVX512 architectures and fixed standard compliance for arguments |x| >= 1.
  • #1683: Improved SSE and AVX complex number operations by adding fused-multiply-add implementations that reduce instruction count and enhance performance compared to traditional pmadd approaches.
  • #1681: Improved complex number traits in NumTraits by implementing HasSign for std::complex types and enhanced test coverage by adding pnmsub tests to packetmath.
  • #1677: Improved the patan() function implementation by consolidating float and double versions, enhancing accuracy to within 3 ULPs of std::atan() and providing performance speedups for specific CPU architectures.
  • #1675: Improves tanh performance by adding vectorized implementations across multiple architectures (SSE, AVX, AVX512, NEON, AltiVec), achieving speedups of up to 22x for AVX512.
  • #1673: Improved SVE intrinsics performance by replacing "_z" suffix with "_x" suffix to eliminate compiler-generated extra SEL instructions and reduce register overhead.
  • #1672: Improved squaredNorm() function for complex types by adding vectorization support in UnaryFunctors.h and Dot.h. This enhancement reduces computational overhead and increases efficiency for complex number operations in Eigen's linear algebra computations.
  • #1671: Improved dot product performance by adding a new inner product evaluator with explicit unrolling for small vectors and enabling fused multiply-add instructions for AVX2+FMA operations.
  • #1670: Improves the tanh function implementation in GenericPacketMathFunctions by using a new rational approximation that reduces error from 2.9 to 2.5 ulps and increases performance by 20-50% on vectorized architectures.
  • #1667: Improved StableNorm performance for non-trivial sizes and enhanced consistency between aligned and unaligned input handling.
  • #1665: Improved threaded product code by cleaning up the Parallelizer implementation and enhancing the associated test code for better clarity and maintainability.
  • #1663: Improved complex multiplication performance in SSE/AVX architectures by replacing separate multiply and add-subtract operations with fused multiply-add-subtract instructions (vfmaddsub213ps).
  • #1662: Improves complex matrix multiplication performance by optimizing block panel size in GeneralBlockPanelKernel.h based on available registers, achieving 8-33% speedup depending on backend (SSE/AVX2).
  • #1661: Improves symbol lookup flexibility for the hlog function in Half.h by removing restriction to global namespace, aligning it with other half-precision functions like hsqrt and hexp.
  • #1659: Updates the .clang-format configuration file to modify code formatting rules for the project.
  • #1641: Improved AVX512F type casting by adding support for double to int64_t conversion using AVX512F instructions, enhancing performance of integer conversion operations.
  • #1632: Improves the allFinite() function performance by adding AVX vectorization support, achieving up to 2.7x speedup for large arrays.
  • #1626: Improved data() functions across Eigen core and sparse modules by adding constexpr specifiers for compile-time computation and reduced runtime overhead.
  • #1625: Improves memory allocation in Core utilities by using __builtin_alloca_with_align when available for better performance and alignment.
  • #1623: Improved code formatting consistency by converting EIGEN_STATIC_ASSERT() calls to statement macro format across multiple Eigen core modules and added a formatting script to maintain this standard.
  • #1621: Improves SparseMatrix::insert method by adding index validation checks to prevent out-of-bounds access and enhance runtime safety.
  • #1618: Improved documentation clarity in Matrix class by correcting a grammatical error to "are not known at compile-time."
  • #1600: Optimized Product class transpose and adjoint operations by specializing them to use algebraically equivalent expressions that avoid unnecessary memory allocations.
  • #1595: Improved CI scripts by fixing Windows cache and folder issues, adding AVX tests, and disabling problematic MSVC+CUDA 9.2 configuration due to compiler bugs.
  • #1593: Improved ternary evaluator in CoreEvaluators.h by specializing it to handle typed, vectorized comparisons for scalar boolean select operations, reducing dependency on output scalar type and enhancing performance for expressions like (a < b).select(c, d).
  • #1590: Optimizes the pblend function in AVX and SSE PacketMath implementations by adding a blend_mask_helper utility and improving loop unrolling and vectorization for better performance with GCC and Clang compilers.
  • #1584: Optimized packet math operations by replacing floating point comparisons with efficient integer arithmetic for bit masking across AVX, AVX512, and SSE architectures.
  • #1583: Optimizes the pldexp_generic function in GenericPacketMathFunctions to improve performance by up to 6% across SSE4.2, AVX2, and AVX512 instruction sets.
  • #1581: Improved compile-time performance by adding constexpr to accessors in DenseBase, Quaternions, and Translations classes, enabling compile-time computations and reducing runtime overhead.
  • #1572: Improves AVX2 double-to-int64_t casting performance by fully vectorizing the operation using intrinsics, achieving 70% better throughput.
  • #1569: Improved SparseMatrix and SparseVector move operations by optimizing constructors and assignments to use direct swap operations, reducing copy costs and enabling better optimization opportunities.
  • #1565: Improved SymbolicIndex and IndexedViewHelper to support compile-time evaluation of symbolic expressions. Refactored the indexed view system to enable compile-time constants in expressions and simplified handling of first/size/incr parameters.
  • #1564: Improves cross3_product function in Eigen/src/Geometry/OrthoMethods.h by adding vectorization support and fixes MSVC compilation issues in AVX TypeCasting module.
  • #1561: Improved CholmodSupport module by removing deprecated "extern C" preprocessor directive to align with cholmod.h behavior and enhance code consistency.
  • #1558: Optimized Tensor::resize performance by removing slow index checks in release mode and modernized code by replacing static const with static constexpr declarations.
  • #1557: Improved Jacobi module documentation by fixing the tag on applyOnTheRight method to ensure it appears in the correct location in the generated docs.
  • #1556: Improves CMake configuration by reorganizing build logic and conditionalizing settings, reducing configuration time from 1 minute to 5 seconds for non-top-level builds.
  • #1555: Improved Matrix class by making default constructor and assignment operators constexpr, enabling compile-time evaluation and reducing runtime overhead for Matrix operations.
  • #1539: Improved TRMV (Triangular Matrix Multiply) operation by adding support for aligned assignment to ensure static vector allocations are properly aligned, enhancing stability and correctness for fixed-sized vectors.
  • #1525: Improved sparse x dense dot product performance in SparseCore by adding inline keywords to methods in SparseDot.h, reducing SparseQR computation time from ~200s to ~165s.
  • #1523: Optimizes SparseQR algorithm performance by modifying the SparseQR.h implementation, reducing execution time from 256s to 200s.
  • #1520: Improves namespace management in BLAS/LAPACK modules by removing "using namespace Eigen" from common.h to prevent symbol collisions and reduce namespace pollution.
  • #1511: Improved IndexedView by adding direct memory access capabilities through new .data() method, strides, and coeffRef() methods for enhanced performance and flexibility.
  • #1509: Improved code organization by renaming generic_fast_tanh_float to ptanh_float and moving it to the appropriate header files across Core architecture modules for better consistency and reduced confusion.
  • #1506: Improves code consistency across eigenvalue solvers, decomposition modules, and other core components by replacing direct Matrix::Options usage with the more flexible traits::Options pattern to support objects like Ref that lack Options attributes.
  • #1505: Improves AVX512 float16 packet casting by adding conditional logic to disable unnecessary casting when native AVX512 f16 hardware support is available, potentially enhancing performance and avoiding undefined behavior.
  • #1491: Improves code formatting consistency by applying clang-format to BLAS and LAPACK C source files and include files to enhance readability and maintainability.
  • #1483: Improves ComplexEigenSolver by replacing standard norm calculations with stableNorm() for better numerical stability in eigenvalue computations.
  • #1473: Improved documentation for LAPACK timing routines second and dsecnd by updating their respective implementation files.
  • #1459: Improved PlainObjectBase class by adding missing constexpr qualifier to ensure consistency and correctness in constexpr contexts.
  • #1443: Updated CI infrastructure by replacing old GitLab CI configuration with new cross-platform testing framework that includes separate Linux and Windows scripts and configuration files.
  • #1438: Improved SparseLU documentation by clarifying the relationship between compute, analyzePattern, and factorize methods, and fixed factorize documentation to reflect that internal info is not user-accessible.
  • #1437: Improves random number generation for 64-bit scalars by implementing a more robust mechanism using std::rand to ensure sufficient entropy on platforms where RAND_MAX is not equal to INT_MAX.
  • #1433: Improves the .git-blame-ignore-revs file by adding formatting-change commits so that git blame skips them.
  • #1432: Improved benchmarking infrastructure by adding comprehensive copyright headers and license information across all benchmark, test, and example files throughout the Eigen library.
  • #1424: Optimizes GeneralMatrixVector.h matrix-vector multiplication performance for power-of-two packet sizes while maintaining optimal behavior for all other cases.
  • #1421: Optimizes GeneralMatrixVector by explicitly defining loop bounds for unrolled stages and using bitwise operations for rounding, reducing compiler warnings and improving performance.
  • #1404: Improves CMake build system by avoiding documentation builds during cross-compilation or when not at the top-level directory, reducing unnecessary build steps.
  • #1397: Improved code organization by consolidating multiple implementations of divup/div_up/div_ceil functions across Core utilities and tensor modules, reducing code duplication and ensuring consistency throughout the library.
  • #1389: Improved GEMM MMA operations in AltiVec architecture by adding new panel modes for real and complex numbers, achieving up to 2.84x performance gains for small matrices and significant speedups for large matrices.
  • #1385: Improved plugin header handling by renaming plugin header files from .h to .inc extensions to prevent build tools from treating them as standard headers.
  • #1364: Optimizes the check_rows_cols_for_overflow function in XprHelper.h and PlainObjectBase.h by adding partial template specialization for compile-time optimization. Improves performance for matrices with known dimensions at compile time and reduces compiler optimization overhead for specific matrix types.
  • #1354: Enhanced AltiVec packet math operations by adding optional offset parameters to ploadu_partial and pstoreu_partial functions for improved memory access flexibility.
  • #1347: Improves Ref construction by adding compile-time assertions to catch potential issues earlier and prevent runtime assertions.
  • #1346: Improved Ref<const...> class by adding a move constructor to avoid unnecessary copying of dynamic member variables, enhancing performance for dynamically allocated data.
  • #1342: Improved the accuracy of the prsqrt function in MathFunctionsImpl by reducing maximum relative error from 3 to 2 ulps using a different Newton-Raphson formulation.
  • #1338: Improved error handling in scalar_unary_pow_op by fixing integer base and integer exponent operations, removing unnecessary error handling for floating point cases, and reducing code complexity.
  • #1328: Improves casting performance by adding vectorized support for scalar_cast_op across multiple architectures including AVX, AVX512, NEON, and SSE, with specialized evaluators and runtime safety checks.
  • #1317: Improves F32 to BF16 conversion performance in AltiVec architecture by unrolling conversion loops, achieving 1.8X faster conversions for LLVM and using vector pairs for GCC compatibility.
  • #1307: Improves BF16 GEMV performance on VSX architectures by updating AltiVec matrix product implementations, achieving up to 6.7X faster RowMajor and 5.9X faster ColMajor operations.
  • #1304: Improved vectorization of cast operations by specializing the scalar_cast_op evaluator for different packet types and optimizing AVX instructions for better performance in complex expressions.
  • #1301: Improves EulerAngles geometry module by implementing canonical angle range enforcement with configurable behavior via a new default parameter, ensuring consistency with expected ranges for common applications like yaw-pitch-roll while maintaining backward compatibility.
  • #1296: Improved BF16 GEMM performance on Power architecture by adding dynamic dispatch and new VSX implementation, achieving 13.4X speedup over generic code and enhanced F32-to-BF16 vector conversion efficiency.
  • #1295: Improved IndexedView implementation by refactoring to reduce SFINAE verbosity in the public API and re-enabling raw, fixed-size array access for better maintainability.
  • #1293: Improved AVX512 GEMM kernel performance by enabling the new AVX512 GEMM kernel as the default option, enhancing matrix operation efficiency for modern CPU architectures.
  • #1288: Updates documentation files including examples and tutorials to reflect changes in the 3.4.x branch codebase.
  • #1284: Improved packet math implementation by removing unused HasHalfPacket trait and adding missing pselect/pblend specializations across all SIMD architectures.
  • #1279: Improved IndexedViewMethods by refactoring to eliminate code duplication and enabling non-const reference access for indexed views with symbolic indices like placeholders::last.
  • #1276: Optimized the generic_rsqrt_newton_step function in Core/MathFunctionsImpl.h by reordering operations to improve accuracy and performance, reducing worst-case ULP errors and execution time in the AVX path by ~4.4%.
  • #1275: Improves vectorized casting performance by adding missing int type casts for x86 architecture (AVX, AVX512, SSE) and removes redundant unit tests causing undefined behavior.
  • #1274: Optimized float-to-bool type casting in AVX2 architecture by improving the pcast<Packet8f,Packet16b> implementation, achieving significant performance improvements for large data sizes.
  • #1273: Improved Core utilities by replacing internal pointer type definitions with standard library equivalents and removing ICC compiler workarounds for better CHERI/Morello architecture compatibility.
  • #1272: Optimized pcast operators for x86_64 architectures (AVX, SSE, AVX512) with significant performance improvements for bool casting and tensor operations.
  • #1267: Improved code quality across multiple files by fixing typos in Constants.h, documentation files, and examples.
  • #1265: Improves tensor.isnan() performance by implementing vectorization using typed predicates, achieving significant speedups especially on AVX512 architectures.
  • #1260: Improved Inf and NaN detection in Core MathFunctions by utilizing C++11 standard features like std::numeric_limits traits for better compiler compatibility and reliability.
  • #1255: Improves BF16 GEMV operations in AltiVec architecture by implementing MMA instructions instead of F32 conversion, achieving 5.0-6.3X performance gains on Power systems.
  • #1253: Improved packetmath specializations across multiple architecture backends (AVX, AVX512, AltiVec, NEON, SSE) by introducing a unifying macro to reduce code duplication and enhance maintainability.
  • #1251: Improved formatting of CommonCwiseBinaryOps.h plugin by adding a missing newline character at the end of the file.
  • #1242: Optimized memory allocation in SelfAdjointEigenSolver and Tridiagonalization by pre-allocating workspace and checking options parameter. Improved performance for in-place tridiagonalization and eigenvalue computation.
  • #1241: Improves CMake configuration by restricting CMAKE_* cache variable setting to only when Eigen is built as a top-level project, preventing unintended modifications to external projects' build settings.
  • #1236: Optimized bfloat16 GEMM MMA performance for Power architecture by adding partial linear access for LHS and Output, achieving 30% faster performance and reducing memory loads by 33%.
  • #1234: Improved BLAS/LAPACK header organization by removing unused declarations and reorganizing files into dedicated blas/ and lapack/ directories for better maintainability.
  • #1233: Improves boolean reduction operations (any/all) by vectorizing them and adding short-circuit evaluation support, replacing the specialized BooleanRedux implementation with enhanced visitor-based vectorization.
  • #1230: Simplified AVX512 architecture code by removing obsolete EIGEN_HAS_AVX512_MATH workaround and redundant intrinsics across multiple AVX512 files.
  • #1226: Improves performance of the pow() function by using pmsub in twoprod within GenericPacketMathFunctions.h, achieving ~1% speedup on Skylake architecture.
  • #1223: Improves mathematical function performance by adding vectorized implementation of atanh with up to 39.8x speedup on AVX512, plus missing definition for atan and comprehensive unit tests.
  • #1219: Optimizes pasin_float function performance by ~11% on AVX and fixes special case handling in psqrt_complex function in GenericPacketMathFunctions.
  • #1210: Improves bfloat16 MMA GEMM performance in AltiVec architecture by folding extra column calculations into an additional MMA accumulator and increasing the number of accumulators from 7 to 8, achieving up to 10% speed improvement.
  • #1208: Improves AltiVec matrix multiplication performance by reverting ODR changes and making gemm_extra_cols and gemm_complex_extra_cols functions inline to avoid external function calls.
  • #1207: Optimizes the psign function in GenericPacketMathFunctions.h by reducing logical operations from 3 to 2 and improving AVX2 instruction usage for better performance.
  • #1206: Improved ColPivHouseholderQR LAPACKE interface by replacing std::complex with LAPACKE complex types using translate_type_imp. Enhanced compatibility and portability with LAPACKE-based implementations.
  • #1200: Improved Core functors by removing custom equal_to and not_equal_to implementations in favor of standard C++14 operators, simplifying the codebase and aligning with modern C++ practices.
  • #1199: Improved header management across all Eigen modules by adding IWYU (Include What You Use) export pragmas to top-level headers, enabling better tooling support for clang-tidy and clangd.
  • #1198: Improved Power/AltiVec architecture performance by replacing eigen_assert with eigen_internal_assert to reduce unnecessary error-checking overhead in builds compiled without NDEBUG.
  • #1196: Improved vectorization of comparison operations by introducing scalar_boolean_select_op and enabling typed comparisons via EIGEN_USE_TYPED_COMPARATORS macro. Enhanced performance of boolean and bitmask comparisons across Core functors and selection operations.
  • #1190: Improved test code in test/array_for_matrix.cpp by replacing zero comparisons with VERIFY_IS_EQUAL macro for better clarity and consistency with Eigen's standard testing practices.
  • #1186: Updates the ForwardDeclarations.h file in the Core utilities module with unspecified modifications.
  • #1176: Improves mathematical packet operations by fixing edge cases for atan(-0) and binary pow with -0, optimizing atan2 and acos implementations, and adding signbit tests for better correctness.
  • #1175: Improved atan2 functionality in Core math functions by leveraging existing atan implementation for better corner case handling and added numext/patan2 support.
  • #1174: Optimized bfloat16 MMA performance in AltiVec architecture by changing packing from RowMajor to ColMajor, fixing slowdowns when matrix dimensions are not aligned to multiples of 8 rows or 4 columns.
  • #1172: Refactored SparseMatrix.h to use direct class member access instead of encapsulation wrappers, improving code consistency and readability in the sparse matrix implementation.
  • #1170: Improved sparse matrix insertion operations by fixing issues with unused capacity and inactive nonzeros, enhancing performance of insert/coeffRef operations and reducing memory allocation overhead.
  • #1168: Improved Eigen's Memory.h to support per-thread is_malloc_allowed() state for better thread safety in multi-threaded applications, with an optional compiler flag to disable thread-local behavior for single-threaded use cases.
  • #1164: Improved sparse permutation implementation by reducing memory allocations and avoiding unnecessary matrix transpositions, optimizing performance for both outer and inner permutations through better use of move semantics and contiguous data copying.
  • #1160: Improves SparseMatrix insertion strategy by always uncompressing the matrix before inserting and adding capacity heuristics to reduce reallocations and optimize performance for large compressed sparse matrices.
  • #1158: Improved help message in spbenchsolver.cpp to clarify SPD matrix naming conventions, preventing misinterpretation of required _SPD suffix for matrices and corresponding rhs files.
  • #1154: Optimizes Power10 MMA bfloat16 GEMM operations by adding rank-2 friendly data packing, improving indexing, eliminating MMA masking, and using LinearMappers, achieving 61X performance improvement over generic code and 2.3-12X speedup over the previous version.
  • #1152: Improves QR decomposition classes by adding template support for custom permutation index types and fixes Lapacke bindings for ColPivHouseholderQR to ensure correct behavior with non-const references.
  • #1148: Improves memory management in Core utilities by adding runtime guards to all malloc, realloc, and free functions to detect and prevent unwanted dynamic allocations.
  • #1147: Improved Sparse Core performance by rewriting setFromTriplets, adding setFromSortedTriplets for sorted data, and optimizing memory usage with better search algorithms and aligned memory operations.
  • #1141: Improved NEON packet operations by enabling pabs for unsigned integer types (uint16_t, uint32_t, uint64_t), ensuring consistency between packet_traits and actual implementation.
  • #1138: Improves test coverage for numext::signbit by adding a new test case in test/numext.cpp to validate the function's behavior across different input values.
  • #1136: Improved compiler version checks across Eigen's vectorization architectures (AVX, AVX512, NEON, SSE, AltiVec) and core utilities to enhance compatibility and robustness with different compiler environments.
  • #1134: Optimizes the equalspace packet operation in NullaryFunctors.h for better performance within Eigen's core functionality.
  • #1131: Improved cache utilization for Power10 architecture by increasing L2 and L3 cache sizes in the general block panel kernel. This optimization enhances performance for matrix operations involving sub-matrix splitting such as triangular matrix solve and GEMM operations.
  • #1128: Improved NestByValue by adding direct access support, conditionally enabling appropriate accessors to reduce overhead and enhance performance for expressions with direct access capability.
  • #1119: Improved code formatting in AltiVec PacketMath by adding brackets around unsigned type names for better readability and consistency.
  • #1114: Improved BiCGSTAB iterative solver to support custom parameter types by modifying the initialization in BiCGSTAB.h, enhancing flexibility for users implementing custom parameter types.
  • #1110: Improved code clarity in DenseStorage.h by removing unused parameter names to avoid confusion and improve readability.
  • #1099: Improves SparseMap class by explicitly requiring that indices must be sorted, enhancing clarity and ensuring consistent behavior in sparse matrix operations.
  • #1095: Improved test coverage for mathematical functions by refactoring special values tests for pow in BinaryFunctors.h and adding similar tests for atan2 in array_cwise.cpp.
  • #1091: Improves clang-format configuration by adding macros to AttributeMacros, reducing odd formatting in automated code formatting.
  • #1089: Improved C++11 math support by unconditionally enabling CXX11 math features for C++14 and newer compilers across Core modules including MathFunctions, GlobalFunctions, and related headers.
  • #1088: Improved assertion consistency across Eigen codebase by replacing standard assert calls with eigen_assert in Core, SVD, and unsupported modules to enable uniform assertion control.
  • #1087: Improved atan() performance by simplifying the range reduction strategy, eliminating a division and some pselect logic to achieve 20-40% performance gains on x86.
  • #1086: Improves Altivec vectorization by making atan conditional on VSX availability, avoiding unnecessary vectorization when VSX is not present.
  • #1084: Improved atan() function performance by adding vectorization support for double-precision arguments across multiple CPU architectures (AVX, AVX512, AltiVec, NEON, SSE).
  • #1083: Optimizes the GEBP kernel in GeneralBlockPanelKernel.h by reducing its size for non-ARM targets. This change addresses MSVC heap memory issues when building TensorFlow.
  • #1079: Improved GEBP kernel compilation performance by using EIGEN_IF_CONSTEXPR to reduce compilation time and memory usage in the General Block Panel kernel and related components.
  • #1078: Improved the NEON GEBP kernel by adding a macro to set the nr trait, enabling more flexible matrix operations on NEON architecture.
  • #1075: Improved complex number sign function handling by avoiding generic sign function for complex types unless vectorization is supported, enhancing performance on non-vectorizable architectures.
  • #1050: Improves IndexedView class by adding assertions to detect and prevent index-out-of-bounds errors during array access operations.
  • #1040: Improved psign function performance by adding AVX2 specialization for Packet8i and removing unnecessary vectorization for psign<bool>.
  • #1038: Improved performance of inverse trigonometric functions (acos, asin, atan) for float type by adding vectorized implementations across all SIMD architectures, achieving 9-30x speedups.
  • #1036: Improves sparse matrix memory management by replacing malloc/free with conditional_aligned allocator for better alignment and heap allocation tracking consistency.
  • #1035: Improved AVX512 packetmath intrinsics by removing unnecessary FP16C flag checks, allowing better performance when using -mavx512f without -mfp16c by avoiding slow scalar typecasts.
  • #1034: Improved pow performance by implementing proper double word division algorithm (Algorithm 15) in GenericPacketMathFunctions, achieving 11-15% speedup while maintaining faithful rounding accuracy.
  • #1026: Improves performance of the sign operator in Eigen by adding vectorization support for the scalar_sign_op function across SSE, AVX, and AVX512 architectures for both real and complex types.
  • #1021: Updated AccelerateSupport module documentation to reflect changes from PR 966, ensuring documentation aligns with the latest code modifications.
  • #1020: Improves ConjugateGradient solver by adding support for numext::sqrt, enabling custom sqrt functions for specialized floating-point types like __float128.
  • #1019: Improves Eigen/Dense header compatibility for embedded environments by avoiding inclusion of I/O stream headers when EIGEN_NO_IO is defined, enabling use with libc++ configured without localization support.
  • #1018: Improves GEMM performance on arm64-neon architecture by implementing larger gebp_kernel sizes (3px8/2px8/1px8/1x8) for better register utilization and data reuse.
  • #1013: Improves AVX512 GEBP kernel configuration by adding compiler flags to enable/disable kernels and fixes undef warnings in trsm kernels when not using clang.
  • #1011: Improves pblend AVX implementation by removing vcvtdq2ps instruction and optimizing for integer operations, achieving 24.84% performance improvement for blend operations.
  • #1000: Improves GEMV performance on Power10 architecture by optimizing vector pair usage in load and store operations within the AltiVec MatrixVectorProduct implementation.
  • #999: Improves Householder module by replacing custom sqrt function with numext::sqrt to simplify usage with custom types and avoid header include order issues.
  • #998: Optimizes tanh and erf functions in VSX architecture by enabling vectorized implementations when EIGEN_FAST_MATH is used, improving performance on AltiVec/VSX systems.
  • #997: Improved AVX512 TRSM kernels to conditionally use alloca instead of malloc when EIGEN_NO_MALLOC is enabled, optimizing memory allocation while maintaining performance.
  • #995: Improves DiagonalBase documentation by documenting the previously undocumented class and applying code formatting in DiagonalMatrix.h.
  • #992: Improves AVX512 TRSM kernels to respect EIGEN_NO_MALLOC by switching from malloc to Eigen's handmade memory allocation and disabling problematic kernel variants when malloc is not allowed.
  • #985: Improves plogical_shift_* function implementations in SVE PacketMath and fixes a typo in the header file, enhancing SVE-based vector shift operations.
  • #975: Improved Power GEMM packing by adding subMappers to TensorContractionMapper, simplifying address calculations and achieving a 10% performance boost.
  • #972: Improves AVX512 matrix multiplication performance by adding optimized GEMM kernels and resolving build issues from the previous implementation attempt.
  • #969: Improved CMakeLists.txt by adding a check to only define the uninstall target if it doesn't already exist, preventing conflicts when using Eigen with FetchContent.
  • #968: Improves DiagonalMatrix by adding EIGEN_CONSTEXPR to cols() and rows() methods, enabling constexpr usage and aligning with Eigen::Matrix's existing constexpr support.
  • #967: Improved AltiVec matrix operations by adding vector pair loading for GEMM MMA RHS and optimizing GEMV predux to use vectors instead of scalars for better performance.
  • #966: Simplified Accelerate LLT and LDLT interfaces by automatically applying the Symmetric flag to UpLo arguments, reducing boilerplate code for symmetric matrix operations.
  • #962: Optimizes HouseholderSequence memory usage by avoiding heap allocations when applying Householder sequences to vectors, using dynamic or fixed-size blocks based on input configuration.
  • #960: Improves AVX512 TRSM implementation by removing AVX512VL dependency and switching to _mm512_mask* intrinsics with zmm-ymm reinterpretation for broader AVX512F compatibility.
  • #959: Improves AVX512 TRSM implementation by restricting it to AVX512VL and renaming associated files to follow Eigen's standard naming conventions.
  • #951: Improved Power GEMV order of operations in predux for MMA by reducing instructions from 20 to 7 and fixing inline assembly issues for GCC compatibility.
  • #944: Improved array reshaping operations by replacing template-based approach with a constexpr helper function in ReshapedHelper.h and ReshapedMethods.h, simplifying the codebase and enhancing compile-time performance.
  • #943: Improved XprHelper.h and related files by converting several helper functions from template metaprogramming to constexpr functions, reducing code complexity and improving compile-time performance.
  • #939: Improved the LAPACK module organization by renaming .cpp files to .inc files to avoid including C++ source files directly and clarify module boundaries.
  • #936: Improved GEMM performance on Power architecture by adding vector_pair loads, extra accumulators, and single-pass optimizations, achieving up to 2400% speedup in specific matrix operations.
  • #931: Improved CI pipeline configuration by enabling Aarch64 support in GitLab CI build and test files to ensure ARM architecture compatibility.
  • #921: Optimizes visitor traversal in Core/Visitor.h by checking for RowMajor layout and traversing in row-major order instead of forcing column-major traversal for RowMajor matrices.
  • #916: Improved Altivec MMA flag configuration by changing EIGEN_ALTIVEC_ENABLE_MMA_DYNAMIC_DISPATCH and EIGEN_ALTIVEC_DISABLE_MMA flags to use explicit 0/1 values like TensorFlow's approach. Updated MatrixProduct and MatrixVectorProduct modules along with documentation to provide better control over Altivec MMA flag configurations.
  • #909: Improved code maintainability in Core modules by removing obsolete GCC-4 warning workarounds from Meta.h, Dot.h, and SparseBlock.h files.
  • #904: Improved compile-time performance by converting static const class members to static constexpr across Core modules and Tensor library components, reducing runtime memory usage and enhancing code clarity.
  • #903: Improves bit calculation implementation in GenericPacketMathFunctions by converting to constexpr and replacing enums with static constexpr int to eliminate casts and enhance performance.
  • #896: Improved SYCL Vptr implementation by removing ComputeCpp-specific code and replacing it with standard SYCL buffer class reinterpret functionality for better portability.
  • #892: Improved constant evaluation handling in Core utilities by adding a wrapper for std::is_constant_evaluated and disabling alignment check assertions during constant evaluation.
  • #891: Improved SVD test suite by splitting large tests and reducing matrix sizes to lower memory consumption and avoid heap space issues on MSVC and GCC.
  • #889: Improves memory management throughout Eigen by adding construct_at and destroy_at wrappers for std::construct_at and std::destroy_at, replacing direct placement new and destructor calls across multiple core modules.
  • #888: Optimizes the least squares conjugate gradient solver by adding .noalias() to avoid unnecessary memory copies and improve execution speed.
  • #886: Improves test/packetmath.cpp by adding conditional logic to skip denormal tests when the packet operation does not exist, reducing unnecessary test execution.
  • #872: Improved sqrt and rsqrt implementations in AVX/AVX512 math functions to handle denormal numbers correctly, avoiding NaN results and reducing flush-to-zero/infinity behavior while optimizing performance for AVX512.
  • #868: Optimized SQRT/RSQRT implementations for x86 Skylake/Zen2 processors by removing specialized internal::psqrt functions and fixing generic implementations to handle IEEE special values. Enhanced test coverage by adding support for testing packet math functions on IEEE special values.
  • #865: Improves SVD edge case handling by adding assertions for thin U runtime requests across BDCSVD, JacobiSVD, and SVDBase modules. Enhances testability and provides better failure messages without changing existing behavior.
  • #860: Improves matrix multiplication performance by adding AVX512 optimizations with new 48x8 and 24x8 kernel unrolls for single and double precision operations.
  • #855: Improved AVX PacketMath implementation by removing unused macros, eliminating unnecessary code to enhance code clarity.
  • #850: Improved Matrix typedef documentation by adding explicit descriptions to ensure they appear correctly in doxygen-generated documentation.
  • #849: Improved documentation by adding missing MatrixXNt and MatrixNXt matrix patterns to TutorialMatrixClass.dox and fixed namespace issues in TutorialLinAlgSVDSolve.cpp to enable proper doc compilation.
  • #846: Improved GeneralizedEigenSolver by modifying alphas() and betas() methods to return const references instead of copies, reducing memory allocations and aligning with documentation.
  • #841: Improved psqrt and prsqrt implementations across SSE, AVX, and AVX512 architectures by consolidating them into generic implementations with correct handling of edge cases (0, +Inf, NaN, negative arguments).
  • #834: Improved triangular solve performance by adding AVX512 optimized kernels with template specializations for fp32/fp64 and unroll implementations, enhancing performance for smaller problem sizes.
  • #830: Improved Eigen documentation by removing outdated references to C++98/03 behavior across Core utility files and documentation.
  • #827: Improved reciprocal operations in Core math functions to be IEEE compliant for edge cases like 1/0 and 1/inf, while optimizing Newton-Raphson implementation with NaN-based detection for significant performance gains up to 77%.
  • #824: Improved AVX/FMA packet operations by removing inline assembly workarounds and adding pmsub, pnmadd, and pnmsub extensions for better performance and readability.
  • #821: Improves DiagonalMatrix performance by setting NestByRefBit in traits to prevent heap allocation during diagonal product operations.
  • #819: Improves clang warning suppression in DisableStupidWarnings.h by adding logic to check if a warning is supported before attempting to suppress it.
  • #813: Improved documentation for LeastSquaresConjugateGradient solver by correcting misleading descriptions and clarifying the distinction between the least squares problem and normal equations.
  • #799: Improves plog performance by replacing degree 10 polynomial with (3,3) rational approximation for 20% speedup on float with AVX2, and fixes denormalized float argument handling to prevent incorrect saturation.
  • #797: Improves Eigen's serializer implementation by adding bounds checking to prevent out-of-bounds access and enhance safety during serialization/deserialization operations.
  • #796: Improved Matrix and Array classes to be trivially copyable in C++20, enabling safer use of std::memcpy() for fixed-size containers by adding type traits and leveraging C++20 concepts instead of SFINAE.
  • #795: Improved naming conventions across Eigen library by replacing reserved names (e.g., leading underscores) with compliant C++ identifiers to prevent potential conflicts with implementation details.
  • #792: Improved CWiseUnaryView to support specifying custom inner and outer strides, adding the stride customization functionality along with corresponding tests.
  • #786: Improves GDB pretty printer code by renaming variables to avoid Python conflicts, removing unused imports, and enhancing code formatting for better readability.
  • #783: Improved the logical_xor() function in Core utilities by simplifying the boolean logic from (a || b) && !(a && b) to a != b for better readability and reduced complexity.
  • #780: Improved logistic sigmoid function accuracy in UnaryFunctors.h by implementing a hybrid range reduction method that reduces relative error to 4.5 ulps and better handles large negative values.
  • #779: Improved exp() function by reducing polynomial degree from 7 to 6, achieving 4% speedup on AVX2 while enhancing accuracy for denormalized values in large negative arguments.
  • #776: Improved CMake build system by converting EIGEN_TEST_CUSTOM_CXX_FLAGS from space-separated to semicolon-separated format to properly handle it as a CMake list with proper escaping and quote support.
  • #773: Improved SparseDenseProduct.h by implementing sparse-dense matrix multiplication with two accumulation variables instead of one, enabling better CPU instruction-level parallelism for RowMajor matrices.
  • #764: Improved GEMV performance for PowerPC architecture by adding VSX and MMA acceleration support, achieving up to 2.5X and 4X speedups respectively.
  • #763: Improved CMake configuration by removing deprecated COMPILE_FLAGS macro and replacing it with modern target_compile_options and target_compile_definitions.
  • #761: Improves codebase maintainability by removing obsolete compiler version checks and deprecated feature detection flags (EIGEN_HAS_CXX14_VARIABLES_TEMPLATES, EIGEN_HAS_TYPE_TRAITS, EIGEN_HAS_SFINAE) across core utilities and architecture-specific modules.
  • #760: Improves code quality in documentation examples by removing using namespace Eigen statements from 52 sample files to prevent namespace pollution and promote better C++ coding practices.
  • #756: Improves Parallelizer.h to include the <atomic> header only when compiler support is available, enabling Eigen Core compilation on embedded toolchains without atomic support.
  • #753: Improved type safety in Eigen's core computational macros by converting them to constexpr functions, adding stricter type checking and preventing narrowing conversions across matrix operations and linear algebra modules.
  • #748: Improved LAPACKE bindings for HouseholderQR and PartialPivLU by replacing binding macros with C++ code and factoring common functionality into a new helper file. This reduces memory usage and improves code organization in the linear algebra library.
  • #742: Updated CMake configuration by raising minimum version to 3.10 and removing C++11 test disable option, improving compatibility with older Linux distributions and simplifying CI setup.
  • #737: Improves the LLT Cholesky decomposition implementation by splitting the large Lapacke LLT macro into smaller, more maintainable parts to enhance code readability and reduce complexity.
  • #736: Improved const correctness in SelfAdjointView and TriangularMatrix by using SFINAE to remove non-const transpose overloads when views don't refer to lvalues, preventing incorrect overload selection and addressing clang-tidy warnings.
  • #735: Simplified preprocessor directives by removing EIGEN_HAS_CXX11_* macros and redundant compiler version checks across core modules. Consolidated outdated compiler handling into a unified error path while reducing conditional compilation complexity.
  • #734: Improves AVX2 vectorization in XprHelper by allowing usage even when data size is not a multiple of 8, enhancing vectorization performance for non-aligned data sizes.
  • #727: Improved numeric_limits members in BFloat16 and Half types by making them constexpr to comply with modern C++ standards.
  • #722: Improved Umeyama function by adding conditional logic to avoid unnecessary computation of src_var when with_scaling == false, reducing computational overhead and clarifying code behavior.
  • #718: Improves SparseMatrix by standardizing StorageIndex usage across Map and TransposedSparseMatrix to match the underlying SparseMatrix object, ensuring consistent index handling throughout sparse matrix operations.
  • #717: Improved SparseVector modularity by moving the pruning function from CompressedStorage.h to SparseVector.h, separating pruning logic from storage implementation.
  • #716: Improves NVIDIA GPU compatibility by converting diagnostic pragmas from diag to nv_diag format in DisableStupidWarnings.h and ReenableStupidWarnings.h header files.
  • #712: Improved Quaternion constructor documentation by clarifying the expected order of matrix elements for the MatrixBase constructor.
  • #702: Optimizes float-to-half and half-to-float conversion functions by adding AVX vector paths to the PacketMath implementations, improving matrix multiplication performance by up to 65%.
  • #700: Improves fp16 tanh and logistic functions on Neon architecture by adding vectorized implementations. Separates vectorized code into new UnaryFunctors.h header and optimizes performance for fp16 computations on Neon devices.
  • #697: Improves CMake build system configuration by optimizing scripts for subproject use, removing unnecessary default test building, and replacing CMAKE_CXX_FLAGS with a dedicated test_interface target.
  • #693: Improved documentation for the Stride class by adding a note clarifying that compile-time vectors always use the inner stride.
  • #678: Reorganized GPU/CUDA architecture components by moving Complex.h from CUDA to GPU directory and removing deprecated TensorReductionCuda.h file for better code organization.
  • #677: Optimizes GPU bit_cast operations in NumTraits by replacing memcpy with reinterpret_cast for better CUDA performance.
  • #673: Improved Visitor.h by adding vectorized codepaths for coeffMax and similar functions, achieving ~5x performance gains on AVX2-enabled machines and up to 38% improvement in matrix decompositions.
  • #668: Improved Windows CMake configuration by replacing deprecated Visual Studio detection macros with standard CMAKE_CXX_COMPILER_VERSION and removing obsolete detection files.
  • #662: Reorganized test infrastructure by moving random matrix generators from main.h to separate random_matrix_helper.h file and added ICC compiler compatibility protections.
  • #655: Improved CI infrastructure by enabling parallel test execution on all available CPU cores in GitLab CI configuration files.
  • #647: Improves EIGEN_STATIC_ASSERT macro by migrating to standard C++11 static_assert, moving assertions out of constructors and splitting large assertions into more readable individual checks.
  • #645: Improved eigen_packet_wrapper by adding a default constructor to enable trivial memory copying operations with memcpy.
  • #634: Improves CMake configuration by defaulting to populate the package registry for CMake 3.15+, reducing configuration complexity while maintaining backward compatibility.
  • #633: Improved CMake versioning by adding support for ARCH_INDEPENDENT option in package configuration, simplifying architecture-independent versioning and reducing complexity.
  • #610: Improved CMake build configuration by updating all CMakeLists.txt files to use C++11 as the minimum standard and centralizing C++ standard settings through CMAKE_CXX_STANDARD command-line parameter.
  • #609: Optimized predux reduction operations for aarch64 architecture by replacing vp(add|min|max) with more efficient v(add|min|max)v NEON intrinsics in PacketMath.h.
  • #603: Improved documentation for squaredNorm() method in Core/Dot.h to clarify it returns the squared Frobenius norm for matrices, not the regular Frobenius norm.
  • #602: Improved naming consistency in ArrayCwiseUnaryOps by renaming shift_left/shift_right functions to shiftLeft/shiftRight and reorganizing them into a dedicated namespace.
  • #597: Improved documentation for matrix decompositions and least squares solvers by updating three documentation files to enhance clarity, structure, and consistency for key linear algebra features.
  • #584: Optimizes tridiagonalization process in SelfAdjointEigenSolver and Tridiagonalization modules by eliminating unnecessary memory allocations in the in-place selector implementation.
  • #582: Optimizes 3x3 matrix inverse computation in InverseImpl.h, reducing execution time by 88.67% for static matrices and avoiding a GCC uninitialized memory warning.
  • #581: Improved documentation for middleCol and middleRow methods by adding tutorial entries for these block operations.
  • #580: Improved BFloat16 conversion performance by removing denormal flushing in FP32ToBF16 operations for AVX and AVX512 architectures, reducing overhead in BF16 computations.
  • #573: Improved documentation grammar in Constants.h by fixing a typo from "has" to "have" in the dox comments.
  • #566: Improved documentation formatting by converting code snippets to monospace (typewriter) style across multiple Doxygen documentation files including QuickStartGuide, SparseLinearSystems, and tutorial documents.
  • #556: Improved dense matrix filling performance by deferring to std::fill_n for constant value assignments in AssignEvaluator, reducing execution time for large matrix operations.
  • #546: Improved complex division operations by implementing Smith's algorithm with vectorized support across multiple architectures (SSE, AVX, AVX512, NEON, etc.), providing ~38% performance speedup and better numerical stability.
  • #542: Improved documentation for functions in test/main.h by adding Doxygen-style comments.
  • #541: Improves DenseStorage by adding trivially_copyable trait, enabling safe memcpy usage and better compatibility with systems requiring trivially copyable types.
  • #537: Improved conj_helper implementation by removing code duplication across multiple architecture files (AVX, SSE, NEON, etc.) and simplifying specializations for scalar operations.
  • #533: Improved reference management in Core/Solve.h by using internal::ref_selector to avoid holding references to RHS expressions, enhancing code efficiency and reducing potential reference-related issues in the solver implementation.
  • #529: Improved packet math operations by replacing pset with ploadu for safer unaligned data loading across Core architecture modules. This change reduces segfault risks and enhances robustness when alignment is not guaranteed.
  • #527: Improved AltiVec matrix multiplication performance by replacing EIGEN_STRONG_INLINE with EIGEN_ALWAYS_INLINE in critical areas. This change ensures proper inlining in TensorFlow integration, reducing slowdowns in matrix operations.
  • #519: Improves floating point handling in SIMD PacketMath implementations by using bit_cast to create -0.0 values, ensuring consistent zero value creation across AltiVec, NEON, ZVector, and other architectures.
  • #510: Improved Unary/Binary/TernaryOp evaluators in CoreEvaluators.h to support non-class types like raw function pointers while maintaining compatibility and assembly size.
  • #505: Improved test coverage in packetmath by adding test cases for transpose operations on non-square kernels.
  • #484: Improved test coverage for SelfAdjointEigenSolver by adding unit tests for complex matrix operations with std::complex matrices.
  • #480: Improves test coverage for packet math comparison operations by adding unit tests for pcmp_lt and pcmp_le functions in the packetmath test suite.

Other added

  • #1914: Adds EIGEN_DISABLE_ALLOCA macro to Core/util/Memory.h allowing users to explicitly disable alloca usage for improved portability and stack safety.
  • #1909: Adds OpenBLAS sbgemm support to GeneralMatrixMatrix_BLAS.h for efficient bfloat16 matrix multiplication. Introduces EIGEN_USE_OPENBLAS_BFLOAT16 macro to enable the feature in compatible OpenBLAS builds.
  • #1905: Added CHANGELOG.md file to the project with proper markdown formatting and fixed links from the original wiki documentation.
  • #1896: Adds factory functions and accessor methods to the Quaternion class for explicit coefficient ordering, providing both scalar-first and scalar-last variants to improve API clarity and reduce ambiguity in interoperability contexts.
  • #1879: Adds vectorized implementation of cbrt function for float and double types across multiple SIMD architectures (AVX, SSE, NEON, Altivec, AVX512) with 1-2 ULP accuracy and significant performance improvements.
  • #1865: Added masked load/store framework to Eigen's vectorized assignment loops, implementing packet segment support for improved vectorization of odd-sized arrays with significant performance gains.
  • #1861: Adds support for triggering full CI builds when merge request labels contain all-tests. This enables manual control over comprehensive test execution in GitLab CI pipelines.
  • #1857: Adds numext::fma support and missing pmadd implementations for packet math operations across multiple architectures (AVX, AVX512, NEON, AltiVec) with enhanced float16 and bfloat16 type support.
  • #1812: Adds automated Doxygen documentation building and deployment to the CI pipeline by creating a new build script and integrating it into GitLab CI configuration files.
  • #1805: Added matrixL() and matrixU() functions to IncompleteLUT class for extracting lower and upper triangular factors from sparse matrices, along with corresponding test cases.
  • #1791: Adds ForkJoin-based ParallelFor algorithm to the ThreadPool module, implementing parallel execution support for unary and binary functions with corresponding test coverage.
  • #1778: Adds an install-doc target to CMake configuration that enables documentation installation to the standard CMAKE_INSTALL_DOCDIR location.
  • #1777: Adds LoongArch64 LSX vectorization support to Eigen's Core architecture module with new LSX-specific implementation files and updated build configurations.
  • #1765: Adds a deploy phase to the GitLab CI pipeline that automatically tags successful nightly builds, implemented through a new ci/deploy.gitlab-ci.yml configuration file.
  • #1758: Adds test case for pcast function when applied to scalar values in the packetmath test suite.
  • #1743: Added vectorized implementation of erf(x) function for double-precision values across multiple architectures (SSE, AVX, AVX512, AltiVec, NEON). Implemented efficient vectorized operations in architecture-specific PacketMath.h files to achieve significant performance speedups for error function computations.
  • #1733: Added missing AVX predux_any function implementations to Eigen's Core PacketMath module. This enhances AVX vector operation support and improves performance for vectorized computations.
  • #1715: Adds exp2() function as a packet operation and array method across multiple architectures (AVX, NEON, GPU, etc.) with improved accuracy using the TwoProd algorithm, reducing worst-case error from 35 ulps to 4 ulps for float operations.
  • #1714: Adds nextafter function implementation for bfloat16 type in Eigen's Default architecture header, providing standard C++ nextafter equivalent behavior for bfloat16 floating-point operations.
  • #1704: Adds free-function swap for Eigen matrices and sparse structures. Implements swap functionality in DenseBase, SparseMatrix, and SparseVector to enable compatibility with C++ algorithms requiring swapping.
  • #1682: Adds nvc++ compiler support by configuring compiler macros, fixing ARM NEON intrinsics compatibility, and updating CMake scripts to handle nvc++ compiler flags correctly.
  • #1669: Adds NEON complex intrinsics support to Eigen's ARM NEON architecture files, implementing pmul and pmadd operations for enhanced complex arithmetic performance.
  • #1666: Adds new EIGEN_STRONGLY_TYPED_ALGEBRAIC_MATRIX_MULTIPLICATION function to GeneralMatrixMatrix.h with performance optimizations and improved documentation for faster matrix operations.
  • #1655: Adds new strongly typed algebraic matrix multiplication function with performance optimizations and improved error handling to the ThreadPool module.
  • #1654: Adds new EIGEN_STRONGLY_TYPED_ALGEBRAIC_MATRIX_MULTIPLICATION function to Core utilities with performance optimizations and improved documentation for faster matrix operations.
  • #1636: Adds C++20 contiguous_iterator concept conformance to pointer_based_stl_iterator in StlIterators.h, enabling std::span compatibility with range operations like std::views::drop and std::views::take.
  • #1629: Adds vectorized implementations of isfinite and isinf functions to Eigen's core mathematical operations, enhancing performance through SIMD optimization in UnaryFunctors and GenericPacketMath modules.
  • #1627: Adds tensor roll/circular shift functionality to the unsupported Tensor module with new TensorRoll.h implementation and corresponding test cases.
  • #1612: Added bit shifting functions to Eigen's numext module, implementing scalar bit shift operators (logical_shift_left, logical_shift_right, arithmetic_shift_right) and generalizing existing packet functions for scalar use.
  • #1580: Adds Packet8l support to AVX512 architecture by implementing packet operations in AVX and AVX512 PacketMath modules and type casting functionality.
  • #1560: Adds cwiseSquare operation to MatrixCwiseUnaryOps plugin and includes comprehensive test coverage for component-wise matrix operations.
  • #1554: Adds SimplicialNonHermitianLLT and SimplicialNonHermitianLDLT sparse matrix solvers to the SparseCholesky module. These new solvers handle complex symmetric matrices instead of complex hermitian matrices.
  • #1546: Adds support for casting between double and int64_t data types in SSE and AVX2 vectorized operations. Enhances performance of tensor cast expressions by implementing optimized SIMD instructions for these specific type conversions.
  • #1544: Adds Packet2l support for SSE vectorization, enabling 64-bit integer operations in the SSE PacketMath implementation to improve performance of int64_t operations.
  • #1501: Adds vectorized pexp_complex function for float type across all SIMD architectures (AVX, AVX512, Altivec, NEON, SSE, ZVector), providing SIMD support for complex exponential operations.
  • #1471: Added LAPACK CPU time functions by introducing dsecnd_INT_CPU_TIME.cpp and second_INT_CPU_TIME.cpp files following LAPACK naming conventions for configurable CPU time measurement.
  • #1455: Adds MI300 hardware support to Eigen testing infrastructure by enabling test execution on MI300 series GPU variants (gfx940, gfx941, gfx942) through CMake configuration updates.
  • #1454: Adds half and quarter vector support to HVX architecture PacketMath, enabling vectorization for smaller matrix sizes (8-31 elements) on Snapdragon XR2 Gen 2.
  • #1445: Adds factor getter methods to Cholmod LLT and LDLT solvers, exposing access to the L, Lᵀ, and D factor matrices for improved usability.
  • #1436: Adds internal ctz (count trailing zeros) and clz (count leading zeros) implementations to Eigen's MathFunctions.h for improved random number generation and pointer alignment detection.
  • #1430: Adds .git-blame-ignore-revs file to configure git blame behavior for revision tracking in the repository.
  • #1429: Adds new EIGEN_STRONGLY_TYPED_ALGEBRAIC_MATRIX_MULTIPLICATION function to Core with performance optimizations and comprehensive architecture support across all SIMD instruction sets.
  • #1428: Adds clang-format CI enforcement by creating a new formatting check stage in the GitLab CI pipeline to ensure consistent code formatting across commits.
  • #1414: Adds plog_complex function to Eigen's vectorized packet math operations, implementing complex logarithm support across multiple SIMD architectures (AVX, AVX512, NEON, SSE, AltiVec, ZVector).
  • #1408: Adds ThreadPool support to parallel GEMM implementation in Core module, enabling matrix multiplication parallelization on platforms without OpenMP.
  • #1403: Adds cbrt (cube root) function to Eigen's component-wise operations for arrays and matrices, implementing the scalar_cbrt_op functor with numext::cbrt and MKL vml support along with comprehensive tests and documentation.
  • #1384: Added new EIGEN_STRONGLY_TYPED_ALGEBRAIC_MATRIX_MULTIPLICATION function with performance optimizations across Core modules and architecture-specific implementations. Implemented comprehensive vectorization support for AVX, AVX512, NEON, SSE and other architectures with improved matrix multiplication kernels and documentation.
  • #1375: Adds support for Qualcomm Hexagon Vector Extension (HVX) architecture by introducing new architecture definition files and build flags. Updates Core vectorization configuration to enable HVX-specific optimizations for Hexagon DSP targets.
  • #1365: Adds missing pcasts for float->bool, int->double, and float->double conversions in x86 SIMD architectures (SSE, AVX, AVX512). Introduces simplified method for enabling pcasts with vectorized_type_casting_traits and cleans up array_cwise implementation warnings.
  • #1352: Adds rint, round, floor, and ceil functions to Eigen's unary functors, providing standard rounding operations in the Core module's MathFunctions and UnaryFunctors components.
  • #1345: Added a new constructor to Quaternion class that accepts a real scalar and imaginary 3-vector, simplifying quaternion construction for common mathematical expressions like angular velocity formulas.
  • #1336: Adds linear access evaluators to the Redux library for unrolled scalar, unrolled vectorized, and rolled scalar traversals to improve performance and simplify traversal logic.
  • #1335: Added removeOuterVectors() and insertEmptyOuterVectors() methods to SparseMatrix class for removing and inserting contiguous outer vectors. Enhanced robustness by explicitly handling edge cases where the first outer vector is not empty.
  • #1331: Adds SYCL testing infrastructure to Eigen core, including a new CMake configuration file and basic validation test to enable SYCL functionality testing.
  • #1330: Added half-precision support for SYCL by implementing conversions between Eigen::half and cl::sycl::half in SYCL architecture files and tensor operations, along with comprehensive test coverage for half-precision data types.
  • #1329: Adds macro-based customization support to ThreadPool synchronization primitives, allowing users to override default synchronization mechanisms for performance and instrumentation purposes.
  • #1314: Adds canonicalEulerAngles method to MatrixBase in the Geometry module, providing a new API for converting Euler angles to canonical ranges while maintaining backward compatibility with the deprecated eulerAngles method.
  • #1313: Added pmul and abs2 operations to Packet4ul in AVX2 implementation, enabling vectorized multiplication and absolute square operations for 4 unsigned 64-bit integers.
  • #1309: Adds the Abs2 method to the Packet4ul class in the AVX implementation to enable computation of squared absolute values for 4-element unsigned long vectors.
  • #1299: Adds new BF16 packet casting functions to AltiVec PacketMath and creates dedicated TypeCasting.h header to organize type casting functionality.
  • #1297: Added Packet4ui, Packet8ui, and Packet4ul packet types to SSE/AVX PacketMath headers to extend vectorization support for unsigned integer types (uint32_t and uint64_t).
  • #1289: Adds thread pool functionality to Core by moving thread pool implementation from unsupported/Tensor to Eigen/src/Core and creating a new Eigen/ThreadPool directory structure.
  • #1285: Adds USM (Unified Shared Memory) support to the SYCL backend, enabling SYCL-2020 compatibility by introducing virtual pointer mechanisms and removing the old memory model implementation.
  • #1281: Adds insertFromTriplets/insertFromSortedTriplets methods to sparse matrices for batch insertion of triplets and optimizes setFromTriplets performance through improved memory allocation and out-of-place sorting.
  • #1250: Adds the Less operation to Eigen's MatrixCwiseBinaryOps plugin, enabling efficient element-wise less-than comparisons for arrays and matrices.
  • #1244: Adds support for specifying permutation index type in PartialPivLU and FullPivLU classes to enhance compatibility with Lapacke ILP64 interfaces.
  • #1240: Adds typed comparison API to matrix operations by introducing cwiseTypedLesser() method while reverting comparison overloads to return bool arrays for improved vectorized expression performance.
  • #1224: Adds Packet division operations support for Power10 architecture in AltiVec PacketMath, enabling integer packet division operations for improved performance.
  • #1211: Adds a CArg function to Eigen's core unary operations, providing a vectorized complex-argument (arg) calculation for real numbers, with the result returned as a complex number.
  • #1209: Adds support for printing diagonal matrix expressions directly without requiring allocation to dense objects, enhancing debugging capabilities for diagonal matrices.
  • #1203: Adds typed logical operators to Core functors, enabling vectorization of logical operations without casting to bool and supporting complex types and non-standard scalars as boolean values.
  • #1166: Adds custom ODR-safe assert functionality to Eigen Core utilities, implementing compiler-specific assert mechanisms that resolve ODR violations when using C++20 modules.
  • #1146: Adds NEON support for pcmp, plset, and complex psqrt operations in the Core architecture module, enabling improved performance for complex number operations on ARM NEON processors.
  • #1139: Adds comparison and arithmetic operators to CompressedStorageIterator in SparseCompressedBase, implementing RandomAccessIterator requirements and making the iterator DefaultConstructible for improved usability.
  • #1133: Adds setEqualSpaced function to DenseBase for creating linearly spaced sequences with equal intervals, providing a more intuitive and efficient alternative to setLinSpaced for cases where size-1 is a multiple of the range.
  • #1129: Adds LAPACKE binding support for BDCSVD, enabling LAPACK-based SVD computations with full matrix variant support (fullU/fullV, thinU/thinV, none) when EIGEN_USE_LAPACKE is configured.
  • #1126: Adds SYCL-2020 support to Eigen by integrating Intel DPCPP compiler compatibility. Enables compilation and execution of Eigen SYCL code on various GPU accelerators through new CMake configuration and updated Core/Tensor modules.
  • #1121: Adds serialization support for SparseMatrix and SparseVector classes to enable easier reproduction of sparse solver issues and improve flexibility for users working with sparse data structures.
  • #1103: Adds sparse matrix sorting functionality by implementing CompressedStorageIterator and sort methods for inner vectors in SparseCore, enabling efficient single-pass sorting of sparse matrix indices and values with custom comparison support.
  • #1098: Adds cross product functionality for 2D vectors in the geometry module, implementing the missing feature and updating documentation to fix issue #1037.
  • #1097: Adds signbit function to Eigen's core math functions with optimized SIMD implementations across AVX, AVX512, SSE, NEON, and AltiVec architectures for efficient floating-point sign bit detection.
  • #1090: Adds constexpr support for std::initializer_list constructors in Eigen::Matrix and Eigen::Array classes. Enables compile-time initialization of matrices using initializer lists in C++14/17/20 constexpr contexts.
  • #1082: Added vectorized implementation of atan2 function to Eigen core with support for global functions and array syntax, providing approximately 12.4x speedup for AVX512 on large arrays.
  • #1076: Adds vectorized integer division support for int32 data types across AVX512, AVX, and SSE architectures in Eigen's PacketMath modules. Enables HasDiv=1 for Packet4i, Packet8i, and Packet16i while adding SIGFPE error handling for division by zero cases.
  • #1073: Adds AVX implementation for int32_t pdiv function in PacketMath.h, providing vectorized division with truncation that casts to double internally for improved performance on AVX-enabled systems.
  • #1066: Adds mixed-type support to pow() operations in UnaryFunctors, allowing exponents of different types as long as they are exactly representable in the base type.
  • #1047: Added SkewSymmetricMatrix3 class to the Core module, implementing skew-symmetric matrices for Vector3 with Rodrigues' rotation formula for the matrix exponential.
  • #1043: Adds vectorized pow implementation for integer base and exponent types across multiple SIMD architectures, with improved handling of negative exponents and integer overflow protection.
  • #1029: Adds fixed power unary operation for arrays with optimized repeated squaring for integer exponents and fallback to Eigen's vectorized pow routine for non-integer cases.
  • #1017: Adds AVX512-FP16 instruction set support to Eigen's packet math operations, implementing Packet32h and vectorized half precision routines for significantly improved performance in linear algebra computations.
  • #1008: Adds Power10 MMA instruction support for bfloat16 matrix operations in AltiVec architecture, implementing gemmMMAbfloat16 function with rank-2 updates to improve performance on Power10 hardware.
  • #1004: Adds determinant() method to QR decomposition classes (HouseholderQR, ColPivHouseholderQR, FullPivHouseholderQR, and CompleteOrthogonalDecomposition) enabling true determinant calculation for QR decompositions.
  • #990: Adds diagonal matrix multiplication support and static initializers (zero, identity) to DiagonalMatrix class for enhanced usability and convenience.
  • #971: Added EIGEN_STRONGLY_TYPED_ALGEBRAIC_MATRIX_MULTIPLICATION function to Core utilities and SVD modules with performance optimizations and comprehensive test coverage.
  • #965: Adds fused multiply functions (pmsub, pnmadd, pnmsub) to PowerPC AltiVec PacketMath for improved performance on PowerPC processors.
  • #947: Adds partial packet operations to Eigen's vectorization system, introducing pload_partial, pstore_partial, pgather_partial, pscatter_partial and related functions for safer memory access that prevents reading/writing past data boundaries.
  • #899: Adds constexpr support to Map objects and core Eigen components, enabling compile-time evaluation of basic operations like addition and subtraction with C++14 constexpr compatibility.
  • #895: Adds move constructor support to SparseSolverBase and IterativeSolverBase classes, enabling move semantics for iterative solvers to improve performance and usability.
  • #893: Adds new CMake options (EIGEN_BUILD_BLAS, EIGEN_BUILD_LAPACK, EIGEN_BUILD_CMAKE_PACKAGE) to allow users to control which build components are enabled or disabled during configuration.
  • #856: Added support for Apple's Accelerate framework sparse matrix solvers to Eigen, implementing wrappers for LLT, LDLT, and QR solvers with improved performance for large sparse systems.
  • #854: Added Scaling function overload to accept rvalue reference vectors, enabling creation of diagonal matrices from temporary vectors.
  • #838: Added EIGEN_HAS_AVX512_MATH macro definition in PacketMath to enable AVX512 support. Fixed order of operations in AVX512 math functions to ensure proper compatibility and performance optimizations.
  • #829: Added new EIGEN_STRONGLY_TYPED_ALGEBRAIC_MATRIX_MULTIPLICATION function to Core module with performance optimizations for matrix operations. Updated extensive Core components including product evaluators, matrix multiplication kernels, and various linear algebra modules to support the new functionality.
  • #820: Adds reciprocal packet operations with fast specializations for float types across SSE, AVX, and AVX512 architectures. Implements Newton-Raphson refinement for improved accuracy and leverages built-in reciprocal instructions for up to 62% performance improvements in large-scale inverse operations.
  • #817: Adds support for int64 packet operations on x86 architectures by modifying PacketMath.h files in AVX and AVX512 directories. This enhancement enables 64-bit integer vectorized operations for improved performance on x86 platforms.
  • #798: Adds a Non-Negative Least Squares (NNLS) solver to the unsupported module, implementing the standard active-set algorithm with compute, solve, and info methods. Includes comprehensive test suite and API design consistent with other Eigen iterative solvers.
  • #791: Adds compiler support for Cray, Fujitsu, and Intel ICX compilers by extending preprocessor macros and compiler detection in Core utilities. Also extends IBM XL compiler detection to include newer versions V13.1 and V16.1.
  • #758: Adds HIP GPU unit testing support to the Eigen library build system by modifying the CMake testing configuration to enable C++14 compilation for HIP-based tests.
  • #687: Adds nan-propagation options to array and matrix plugins, enabling control over NaN value handling in elementwise min/max operations.
  • #652: Added a macro EIGEN_CTEST_ARGS to CMake testing configuration to enable passing arguments to ctest for parallel test execution.
  • #646: Added new GPU testing targets buildtests_gpu and check_gpu to the CMake configuration. Modified testing configuration files to enable isolated GPU test builds and runs in the CI pipeline.
  • #631: Added EIGEN_STRONGLY_TYPED_ALGEBRAIC_MATRIX_MULTIPLICATION function to Core module with performance optimizations and comprehensive internal header checking across all Eigen modules.
  • #625: Adds new GPU test utilities and infrastructure including run_on_cpu, run_on_gpu, and run functions along with example GPU test files to enhance GPU testing capabilities in the Eigen library.
  • #624: Added a simple serialization mechanism to Eigen Core with a new Serializer<T> class for binary serialization. Includes support for GPU data transfers and enhanced testing capabilities through new serializer utilities and test infrastructure.
  • #623: Adds device-compatible tuple implementation for GPU support by creating Tuple.h in the GPU architecture module and adding corresponding test infrastructure.
  • #568: Adds vectorization support for comparison functors in NEON PacketMath and BinaryFunctors to enable SIMD-optimized Select operations and improve performance for large array comparisons.
  • #567: Adds GPU support for equality comparisons by modifying StlFunctors.h and MatrixCwiseBinaryOps.h with EIGEN_DEVICE_FUNC annotations and enabling portable comparison operations across different device types.
  • #545: Adds support to disable specialized gemm_pack_rhs in Eigen's AltiVec MatrixProduct for PPC architecture, specifically enabling TensorFlow to optimize packing operations with direct virtual tensor access.
  • #501: Added device implementation of log function for std::complex types in MathFunctions modules to enable complex logarithm operations on GPU/device platforms.
  • #500: Adds C++14 variable template support detection macro to Core utilities, enhancing compiler feature checking capabilities in IntegralConstant.h and Macros.h.
  • #473: Adds HasExp support for AVX512 Packet8d, enabling vectorized exponential function operations for double-precision packets in the AVX512 architecture.

Other removed

  • #1642: Reverted the "fix scalar pselect" change in GenericPacketMath.h, removing a previous modification to the scalar packet select functionality.
  • #1477: Removed the redundant relicense.py script from the scripts directory to clean up the codebase.
  • #1475: Removes the MoreVectorization module from the unsupported directory, eliminating redundant code and resolving ODR violations with Eigen's generalized pasin implementation.
  • #1353: Removed deprecated function call from SVD test suite to clean up test/svd_common.h and eliminate unnecessary code.
  • #1306: Removed unused HasHalfPacket enum from packet math headers including AVX512/PacketMathFP16.h, SYCL/InteropHeaders.h, and GenericPacketMath.h to reduce code size and improve clarity.
  • #1266: Removes pool creation code from CMakeLists.txt when CMake version is less than 3.11 to improve compatibility with older CMake versions.
  • #1212: Removes BF16 to F32 array conversions in Power architecture by disabling the functionality in the AltiVec MatrixProduct module.
  • #1197: Removed LGPL-licensed code and references from Eigen core library, including NonMPL2.h, COPYING.LGPL file, and unsupported directory components to simplify licensing and eliminate potential MPL2 conflicts.
  • #1092: Removed deprecated M_PI_2 and M_PI_4 constant references from Core math functions and binary functors to improve code maintainability.
  • #1074: Reverts the addition of constexpr support and C++14 constexpr testing across core Eigen modules, removing the compile-time evaluation test file and rolling back constexpr-related changes to functors, matrix operations, and utility classes.
  • #946: Removed the deprecated EIGEN_EMPTY_STRUCT_CTOR workaround from multiple Core modules and functors. This eliminates an old GCC compatibility hack to modernize the codebase.
  • #902: Temporarily disables aarch64 CI builds by modifying the GitLab CI configuration files due to machine downtime.
  • #897: Removed gcc 4.3 copy_bool workaround from Core utilities and test infrastructure to improve compatibility with modern compilers.
  • #793: Removed the unused EIGEN_HAS_STATIC_ARRAY_TEMPLATE macro from the codebase to reduce build noise and improve code clarity.
  • #772: Removed several Eigen macros (EIGEN_HAS_CONSTEXPR, EIGEN_HAS_INDEX_LIST, EIGEN_HAS_STD_RESULT_OF) and their related implementations from core utilities and tensor modules. This cleanup eliminates redundant code and reduces macro-related complexity across the codebase.
  • #768: Removed redundant CMake Find scripts (FindBLAS.cmake, FindGLEW.cmake, FindGSL.cmake, FindLAPACK.cmake) that are already provided by CMake itself, simplifying the build configuration.
  • #744: Removed obsolete feature test macros and compiler version checks from Eigen's core infrastructure, including EIGEN_HAS_CXX14, EIGEN_HAS_VARIADIC_TEMPLATES, and support for GCC < 5.1 and MSVC < 1900. Updated minimum compiler requirements and modernized preprocessor directives across core modules and tensor components.
  • #740: Removed the deprecated nonZeros() method from DenseBase class since it was redundant and merely called size(). This simplifies the codebase by eliminating unnecessary functionality.
  • #739: Removes GCC-4.8 tests from CI configuration by disabling them in both build.gitlab-ci.yml and test.gitlab-ci.yml files. This reduces dependency on GCC-4.8 and allows lowering the minimum C++ standard to C++14.
  • #732: Removed EIGEN_HAS_CXX11 macro and all associated conditional compilation code from core Eigen modules. Simplified the codebase by eliminating C++11 version checks and legacy GCC compatibility code.
  • #725: Removed deprecated MappedSparseMatrix type from Eigen's sparse core module, deleting the header file and eliminating all internal references to ensure C++14 compatibility.
  • #636: Removed obsolete DynamicSparseMatrix references from SparseCore module files and test headers to clean up deprecated code remnants.
  • #632: Removed unused EIGEN_DEFINITIONS from CMakeLists.txt to clean up the CMake build configuration.
  • #608: Removed C++11-off CI jobs from the build configuration to simplify the CI pipeline and improve maintainability by transitioning away from C++03 standards.
  • #601: Removed unaligned assert tests from the test suite, deleting the test/unalignedassert.cpp file and updating related geometry test files and CMake configuration.
  • #538: Removed unused macros EIGEN_HAS_SINGLE_INSTRUCTION_CJMADD and CJMADD from architecture-specific PacketMath headers and GeneralBlockPanelKernel. This cleanup eliminates dead code that was only used on X86 where it produced identical performance to existing implementations.

Major changes

  • #826: Updates SVD module with Options template parameter for flexible computation options while deprecating old constructor for backward compatibility.

Breaking changes

  • #658: Breaks backward compatibility by adding an Options template parameter to JacobiSVD and BDCSVD classes in the SVD module. Changes the API to allow users to specify computation options like thin unitaries for fixed-size matrices.
  • #649: Breaks API compatibility by moving Eigen::all, Eigen::last, and Eigen::lastp1 back to the Eigen::placeholders namespace. Removes deprecated aliases and updates indexing imports to resolve compiler warnings and improve compatibility with external projects.

Unsupported modules

Other fixed

  • #1929: Fixed Doxygen documentation build issues in the Tensor module by addressing markdown link compatibility problems in the README file.
  • #1901: Fixed type overflow error in SpecialFunctionsImpl by removing long cast from scalar parity check. This addresses a runtime error discovered by TensorFlow fuzzer related to type conversion issues.
  • #1887: Fixed unused local typedef warning in MatrixExponential.h by removing the unused 'Scalar' typedef declaration.
  • #1860: Fixed test for TensorRef trace operation in the unsupported tensor module. Corrected test coverage to ensure proper testing of the trace functionality on TensorRef objects.
  • #1851: Fixed Givens rotation implementation in the NonLinearOptimization module. Addresses a bug that could affect numerical stability in linear algebra operations.
  • #1836: Fixed implicit copy-constructor warning in TensorRef by adding an explicit copy constructor to the TensorRef class.
  • #1828: Fixed TensorRef to support assigning expressions with different index types and to enforce read-only access when the referenced expression is not writable, resolving issue #2884.
  • #1809: Fixed tensor documentation by removing explicit \class ... statements and moving documentation to main classes to eliminate dangling references across multiple tensor implementation files.
  • #1793: Fixes uninitialized memory reads in special_packetmath test by zero-initializing test arrays to prevent memory sanitizer errors.
  • #1769: Fixed subnormal flushing bug in the special packetmath erfc function for ARM32 architecture by modifying the reference function to properly handle subnormals.
  • #1707: Fixed the erf function in SpecialFunctionsImpl to avoid producing NaN values for large input magnitudes. Improved numerical stability while achieving up to 28% performance speedup with AVX+FMA.
  • #1698: Fixed implicit conversion issue in TensorChipping class to improve compatibility and correctness of type conversions.
  • #1678: Fixed compiler warning in TensorVolumePatch.h by suppressing Wmaybe-uninitialized warning caused by unreachable switch case code.
  • #1658: Fixed the value of pi assigned to a static double in the kissfft FFT implementation to ensure accurate FFT computations.
  • #1645: Fixed implicit this capture warnings in tensor thread pool lambdas by adding explicit this captures to TensorContractionThreadPool and TensorDeviceThreadPool classes.
  • #1614: Fixed FFT module to correctly handle destinations with non-unit stride by adding a temporary buffer for evaluation and copying the result to the true destination.
  • #1607: Fixed hard-coded magic bounds in unsupported nonlinear optimization tests by relaxing error bounds in NonLinearOptimization.cpp and levenberg_marquardt.cpp to improve test reliability across different platforms.
  • #1602: Fixes error bounds in nonlinear optimization tests to properly account for AVX without FMA conditions, adjusting tolerance values in NonLinearOptimization and Levenberg-Marquardt test files.
  • #1597: Fixed enum comparison warnings in the AutoDiff module by addressing compiler warnings in AutoDiffScalar.h.
  • #1596: Fixed unused variable warnings in TensorIO module by addressing compiler warning issues in the TensorIO.h implementation.
  • #1568: Fixed ScalarPrinter redefinition issue in TensorIO.h to resolve GCC compiler conflicts and ensure proper compilation compatibility.
  • #1537: Fixed static_assert compatibility issues in CoherentPadOp.h for C++14 standard compliance, ensuring proper compilation in C++14 environments.
  • #1517: Fixed uninitialized memory usage in the kronecker_product test to prevent potential memory errors in the test case.
  • #1479: Fixed markdown formatting in Eigen::Tensor README.md to restore proper documentation structure and readability.
  • #1469: Fixed explicit specialization of member functions in tensor executor test to improve compiler compatibility with clang and ensure C++ standard compliance.
  • #1467: Fixed compile-time error in tensor executor test caused by static assertions for chip dimensions, enabling proper compile-time checks to avoid runtime errors.
  • #1466: Fixed tensor chipping operations by adding dimension index assertions and removing dimension checks that were incompatible with expressions.
  • #1463: Reverted previously added asserts for the .chip functionality in Tensor modules to fix broken tests.
  • #1453: Fixes memory management issue in TensorForcedEval by properly handling temporary buffers when the evaluator is copied, preventing double-free or memory access errors.
  • #1410: Fixed integer overflow in TensorExecutor by replacing int types with DenseIndex and adding explicit casts to prevent crashes in cxx11_tensor_gpu_1 test.
  • #1407: Fixed Wshorten-64-to-32 warnings in the div_ceil function within tensor contraction and device thread pool modules by addressing implicit widening conversions.
  • #1406: Fixed TensorReduction implementation by replacing deprecated divup function with div_ceil to reduce deprecation warnings and align with C++ standard requirements.
  • #1391: Fixed ThreadPool symbol export warnings by modifying the legacy header to silence clang include-cleaner warnings for ThreadPool symbols.
  • #1382: Fixed tensor strided linear buffer copy in TensorBlock.h by avoiding negative indices in loop bounds using division and multiplication on compile-time powers of two, ensuring proper unsigned integer wrapping behavior.
  • #1378: Fixed clang-tidy warning in TensorDeviceThreadPool.h by addressing forwarding reference issue to improve code clarity and safety.
  • #1341: Fixed GPU tensor benchmarks by replacing CudaStreamDevice with GpuStreamDevice to ensure proper usage of stream devices.
  • #1320: Fixed undefined behavior in FFTW/IMKL FFT backends by replacing raw pointers with std::shared_ptr for FFT plan objects, improving memory management and preventing copying issues.
  • #1303: Fixes the erf() function in SpecialFunctionsImpl to correctly return +/-1 above the clamping point, improving performance in certain cases.
  • #1287: Fixed TensorContraction to handle empty tensor contractions by returning nullptr for size 0 allocations instead of triggering an assert, preventing potential crashes when contracting empty tensors.
  • #1243: Fixed tensor comparison test by reverting code that was incorrectly left in a failing state due to a forgotten revert.
  • #1237: Fixed GPU resource exhaustion in 3D tensor convolution operations by adjusting internal variable sizes in TensorConvolution.h to prevent out-of-resources failures.
  • #1227: Fixed null placeholder accessor issue in SYCL backend by modifying TensorDeviceSycl.h and TensorReduction.h to prevent creation of null accessors. Resolved segmentation fault in Reduction test and ensured compatibility with DPCPP compiler's SYCL 2020 enforcement.
  • #1077: Fixed unused-result warning for gpuGetDevice in TensorDeviceGpu by adding proper warning status check that prints failure message when GPU device retrieval fails.
  • #1031: Fixed bitwise warnings in TensorIndexList by replacing bool with Eigen::boolean to eliminate compiler warnings.
  • #1006: Fixed dependency resolution in AutoDiff module by adding missing include for Eigen/Core header to prevent potential build errors.
  • #1002: Fixes clang-tidy warnings about function definitions in headers in the FFT test utilities by modifying unsupported/test/fft_test_shared.h.
  • #1001: Fixed conditional compilation in AVX512 BesselFunctions to skip f16/bf16 specializations when unavailable, preventing build errors on MSVC < 1923 and GCC < 5.3.
  • #991: Fixes ambiguous comparison operators in TensorBase for C++20 compatibility by making comparisons symmetric.
  • #989: Fixed C++20 comparison operator ambiguity in TensorBase by resolving conflicts caused by operator reversal in the C++20 standard.
  • #986: Fixed SYCL tensor convolution by removing default constructor for range class and ensuring at least one thread runs in parallel_for by setting range size to 1.
  • #982: Fixed Tensor comparison operators in TensorBase.h to resolve ambiguity issues with C++20 standards. This ensures compatibility with newer C++ versions by eliminating ambiguous operator definitions.
  • #937: Fixed an unused-variable warning in the trace function in TensorTrace.h in the unsupported Tensor module.
  • #898: Fixed edge-case in zeta special function for large inputs by addressing overflow issue in the tail sum correction term parameter, preventing NaNs and aligning behavior with scipy.
  • #894: Fixed broken tensor executor test and enabled support for tensor packets of size 1 across multiple tensor operations. Modified tensor broadcasting, chipping, reduction, and other operations to handle PacketSize == 1 cases, improving compatibility with platforms where vectorization is not available.
  • #884: Fixed NonLinearOptimization by removing poor non-convergence checks that caused numerical instability issues. Simplified test cases to improve compatibility across different architectures and optimization settings.
  • #883: Fixed matrix_power test tolerance for MSVC 19.16 compatibility. Adjusted tolerance settings in the unsupported matrix_power test to resolve test failures on this compiler version.
  • #863: Fixed numerical differences in tensor block evaluation tests by modifying test expressions to avoid issues caused by operation fusing during aggressive optimization.
  • #853: Fixes ODR (One Definition Rule) violations in TensorRandom implementation to resolve compilation issues and ensure standard compliance.
  • #803: Fixes GCC 8.5 compiler warnings in Tensor classes by explicitly initializing base classes in Tensor.h, TensorFixedSize.h, and TensorRef.h.
  • #770: Fixed customIndices2Array function in TensorMeta.h to properly include the first index in the resulting array, resolving potential confusion in tensor module index handling.
  • #765: Fixed ambiguous overload resolution in TensorMeta.h by disambiguating overloads for customIndices2Array when the index list is empty, resolving a Clang compiler warning.
  • #759: Fixed typo in IDRS.h by correcting StableNorm to stableNorm to ensure proper function name consistency in the iterative solvers module.
  • #755: Fixed a leftover else branch inside an #ifdef directive in TensorDimensions.h of the unsupported Tensor module, removing the unnecessary branch to ensure correct header inclusion.
  • #733: Fixed shadowing definition warnings in TensorIO.h to reduce compiler warnings and improve code clarity in the unsupported Tensor module.
  • #728: Fixed Windows build errors in TensorIO.h by addressing platform-specific compatibility issues in the unsupported Tensor module.
  • #724: Fixed TensorIO implementation to support TensorMap with const elements. The change ensures compatibility between the new TensorIO functionality and const-qualified tensor maps.
  • #723: Fixed off-by-one error in tensor broadcasting implementation that was causing inconsistencies with JAX unit tests and improved robustness when broadcast size is smaller than packet size.
  • #715: Fixed tensor reduction test by modifying the test to compare summation results against forward error bound instead of exact values, improving test reliability and stability.
  • #713: Fixes integer overflow issues in TensorExecutor and TensorMeta by modifying indexing calculations to use saturated addition when overflow is possible, preventing CUDA_ERROR_ILLEGAL_ADDRESS errors for large tensor sizes.
  • #705: Fixed TensorReduction warnings and error bounds by adjusting sum accuracy test calculations in Tensor module and resolving MSVC compilation warnings.
  • #704: Fixed a problematic take function implementation in CXX11Meta.h that was causing g++-11 compiler crashes by removing the problematic code.
  • #691: Fixed clang warning about bitwise-instead-of-logical operations in TensorUInt128.h by modifying the code to use proper logical operations instead of bitwise ones.
  • #689: Fixed index-out-of-bounds error in tensor broadcasting for 1D vectors that bypass the special blocking path. Corrected broadcast size computation for complex types to improve stability of the broadcasting logic.
  • #681: Fixed integer overflow issues in Tensor GPU execution by modifying EigenMetaKernel indexing calculations to prevent CUDA_ERROR_ILLEGAL_ADDRESS errors with large tensor sizes.
  • #679: Fixes memory errors in GPU tensor reductions by disabling tree reduction for GPU operations in TensorReduction.h.
  • #628: Fixed symbol name conflict in cxx11_tensor_expr test by renaming 'vec_all_nan' to avoid collision with altivec.h, resolving PPC64LE build failures.
  • #611: Fixed missing header inclusion by adding unordered_map header to SparseExtra module and its test file to ensure proper compilation of sparse extra matrix operations.
  • #571: Fixed reserved identifier issue in AutoDiffScalar by renaming template parameter from _derType to DerivativeType to comply with C++ naming standards.
  • #552: Fixed Tensor documentation formatting by removing duplicate table of contents tag and replacing markdown code blocks with HTML to resolve Doxygen rendering issues.
  • #547: Fixed TensorShuffling to handle empty tensors by preventing runtime crashes when TensorIntDivisor constructor receives 0, and added test coverage for this edge case.
  • #540: Fixed TensorFunctors argmin/argmax to always return the first occurrence of min/max values. This ensures consistent behavior across multithreaded and GPU implementations, resolving unpredictable results in TensorFlow tests.
  • #531: Fixed overflow issues in the balancer implementation within the Companion matrix module by rewriting the logic to handle large row/column norms without overflowing.
  • #526: Fixed compilation issue in Tensor documentation by updating the example code in the README.md file.
  • #488: Fixed TensorRandom by removing time-dependence to resolve test reproducibility issues. Simplified random number generation logic using consistent approaches for CPU and GPU platforms.
  • #481: Fixes static global variable issues in TensorDeviceGpu.h by replacing them with inline functions that manage static local variables safely, preventing multiple copies across translation units.
  • #477: Fixed calls to device functions from host code in Tensor module by restricting host-only functions and creating device implementations where needed. Resolves CUDA 11.3 build issues and undefined behavior when using nvcc compiler.
  • #476: Fixes TensorRandom.h to check for BSD random() function availability before use. Adds fallback to rand() when random() is not available, ensuring MinGW compatibility.

Other improved

  • #1916: Improved tensor module documentation by adding content to the tensor README file.
  • #1859: Improved TensorTrace compatibility with TensorRef by marking tensor trace as non-lvalue and ensuring consistent pseudo data pointer behavior with TensorBroadcastingOp.
  • #1849: Improved TensorDeviceThreadPool.h formatting and enhanced C++20 compatibility by using if constexpr statements.
  • #1848: Improved TensorDeviceThreadPool by removing unused methods, reducing type erasure in parallelFor, and enabling perfect forwarding in enqueue for C++20 compatibility.
  • #1844: Optimized TensorVolumePatch.h by reducing divisions from 2 to 1 when PacketSize=1, improving performance and reducing CPU cycles for tensor volume patch operations.
  • #1747: Optimized the erf function in SpecialFunctionsImpl by removing redundant computation for large arguments, reducing computational overhead.
  • #1706: Improved the erf() function implementation in SpecialFunctionsImpl.h, reducing maximum error from 4 to 3 ulps and achieving significant performance speedups across various floating-point scales on AVX2+FMA and SSE 4.2 architectures.
  • #1680: Improved TensorChipping by adding detection for "effectively inner/outer" chipping cases where dimensions have a product of 1 and don't affect strides, extending existing optimizations to handle these edge cases.
  • #1571: Improved Tensor module compatibility by replacing Eigen::array with std::array usage to leverage C++17 standard library features and enable better GPU support.
  • #1563: Improved complex number formatting in Tensor IO by adding custom Numpy-compatible (1+2j) and Native-compatible ({1, 2}) output formats, enhancing usability and copy-paste functionality.
  • #1542: Improved cxx11_tensor_gpu test by splitting it into smaller parts to reduce timeout issues on Windows systems.
  • #1470: Improved code formatting in the tensor executor test file for better readability and consistency.
  • #1462: Improved sparse_extra test by adding support for specifying a temporary directory for fileio outputs, enabling testing on systems that cannot write to the current directory.
  • #1457: Improved tensor chipping operations by adding static and dynamic asserts to validate chipping dimensions and offsets in TensorBase.h and TensorChipping.h.
  • #1423: Improved Tensor constructors by adding static asserts to check for matching NumDimensions between the constructor and provided OtherDerived, preventing runtime errors from dimensional mismatches.
  • #1324: Improved the ndtri special function to return NaN for out-of-range input values, ensuring consistency with scipy and MATLAB behavior.
  • #1298: Optimized tensor select evaluator by implementing ternary operation usage with scalar_boolean_select_op, reducing execution time by 13% in benchmarks.
  • #1294: Improved accuracy of the erf() function in SpecialFunctionsImpl by implementing a better rational approximation with more careful clamping, reducing maximum relative errors for both subnormal and normalized floats.
  • #1117: Improved code quality in IDRS.h by removing unused variables and fixing comment formatting with proper line breaks.
  • #983: Improved SYCL queue interface to accept existing SYCL queues, enabling better integration with high-level frameworks by reusing contexts and reducing memory movement overhead.
  • #932: Improved AutoDiff module performance by replacing the make_coherent function with a more efficient CoherentPadOp for handling derivative size mismatches, achieving ~20% performance gains in benchmarks.
  • #757: Improved IDRS iterative solver by replacing norm() calls with StableNorm() for better numerical stability and reformatting the code structure.
  • #726: Improved Eigen::array by adding basic iterator support to facilitate transition from std::array in C++11 code and address syntax inconsistencies when EIGEN_AVOID_STL_ARRAY is used.
  • #676: Improved accuracy of tensor reduction operations for half and bfloat16 types by implementing a tree summation algorithm with bounded relative error.
  • #669: Optimized GPU tensor contraction test by reducing the number of contractions from 3600 to 27 for subtests 8 and 9. This improves test execution time on Windows while maintaining test coverage.
  • #641: Improved the TensorIndexList implementation by removing an unnecessary std::tuple reference, simplifying the code and reducing complexity.
  • #637: Improved the SparseExtra module by removing obsolete DynamicSparseMatrix references and fixing typos reported by clang-tidy.
  • #622: Improved Eigen's Tensor module by renaming Tuple to Pair across tensor-related source files to prepare for a new std::tuple-compatible Tuple class that supports GPU usage and aligned Eigen types.
  • #619: Improved documentation for unsupported sparse iterative solvers by fixing the header and removing outdated commented code.
  • #605: Improved RandomSetter in SparseExtra module by replacing std::map with std::unordered_map for better performance and reduced complexity.
  • #521: Improved TensorContraction.h by adding #ifndef guards around the TENSOR_CONTRACTION_DISPATCH macro to allow custom contraction dispatch logic for better TensorFlow Lite integration.
  • #493: Improved GPU device property management in TensorDeviceGpu by adding a singleton class to encapsulate initialization and retrieval, reducing low-level property exposure and addressing static linkage issues.

Other added

  • #1884: Added DUCC FFT implementation support to Eigen's unsupported FFT module, including new implementation header, test coverage, and renaming of existing FFT implementation files for consistency.
  • #1710: Added vectorized implementation of erfc() function for float type in SpecialFunctions module. Optimized performance using SSE and AVX2 instructions while maintaining accuracy across all input ranges.
  • #1702: Added max_digits10 member to NumTraits for mpreal types in the MPRealSupport module, enhancing numeric property handling for arbitrary precision real numbers.
  • #1644: Added asynchronous execution support to the chip and extract_volume_patches tensor operations. Extended these operations with multi-threaded capabilities following existing async patterns in the tensor module.
  • #1613: Added muluh function for 128-bit integer operations in TensorIntDiv, implementing scalar division support for MSVC in Eigen's tensor codebase.
  • #1305: Added new matrix multiplication function with performance optimizations to TensorBlock. Implemented EIGEN_STRONGLY_TYPED_ALGEBRAIC_MATRIX_MULTIPLICATION function with improved error handling and documentation.
  • #1125: Added synchronize method to all tensor device classes, including dummy implementations for threadpool and default devices that are synchronous by default.
  • #981: Added MKL adapter support to the FFT module, implementing oneAPI MKL FFT library integration with new adapter files and test cases.
  • #978: Added sparse subset matrix inverse functionality to SparseExtra module, implementing the Takahashi method for efficient computation of specific inverse matrix elements with improved numerical stability through Kahan summation.
  • #973: Added arg() method to Tensor class for computing complex number arguments, including test coverage for complex tensor operations.
  • #852: Added a constexpr std::size_t size() const convenience method to Eigen::IndexList to provide direct access to the size without iteration.
  • #729: Added reverse_iterator support to Eigen::array when std::reverse_iterator exists, enabling backward-compatible iteration for tensor operations in the unsupported CXX11 modules.
  • #667: Added new matrix multiplication function with performance optimizations to TensorReduction module, implementing strongly typed algebraic operations with improved error handling and documentation.
  • #617: Added support for dense matrices to the MatrixMarket reader/writer in the SparseExtra module, extending the existing sparse-only functionality with new dense matrix capabilities and documentation.
  • #612: Added EIGEN_TENSOR_PLUGIN support to Tensor classes, enabling plugin functionality for TensorBase and related tensor components in the unsupported module.
  • #607: Added a flowchart to the unsupported/Eigen/IterativeSolvers directory to help users visually select the most appropriate iterative solver for their sparse matrix problems.
  • #578: Added test coverage for std::unordered_map in the sparse_extra.cpp test file to enable C++11 standard library testing.
  • #520: Added permanent enablement capability for HIP/CUDA GPU defines in TensorGpuHipCudaDefines.h, improving portability and flexibility for users wanting to enable GPU features in their code.

Other removed

  • #1474: Removed the entire Skyline library from the unsupported modules, deleting all related header files and removing it from the CMake build configuration to clean up deprecated and unused code.
  • #1080: Removed unused typedef from unsupported/test/sparse_extra.cpp test file to clean up code.
  • #752: Deprecated the EIGEN_GPU_TEST_C99_MATH macro in the tensor GPU test file as it was only used in one location and always evaluated to true.
  • #606: Removed sparse dynamic matrix functionality from the SparseExtra module, deleting deprecated API files and unsupported codebase including DynamicSparseMatrix.h and BlockOfDynamicSparseMatrix.h.
  • #513: Removed dead code from GPU float16 unit tests in the unsupported tensor module to reduce code bloat.
{"iid":1938,"title":"Fix typo: duplicated 'for' in docs","web_url":"https://gitlab.com/libeigen/eigen/-/merge_requests/1938","description":"### Reference issue\n\n\n### What does this implement/fix?\n\nFixed 4 instances of duplicated 'for' in the code.\n\n### Additional information","created_at":"2025-07-14T06:46:03.413Z","merged_at":"2025-07-16T01:23:05.434Z","author":{"name":"Kuan-Ting Lee","username":"leekt0124"},"changes":[{"diff":"@@ -28,7 +28,7 @@ namespace internal {\n 2. If a is zero, approx_a_recip must be infinite with the same sign as a.\n 3. If a is infinite, approx_a_recip must be zero with the same sign as a.\n \n- If the preconditions are satisfied, which they are for for the _*_rcp_ps\n+ If the preconditions are satisfied, which they are for the _*_rcp_ps\n instructions on x86, the result has a maximum relative error of 2 ulps,\n and correctly handles reciprocals of zero, infinity, and NaN.\n */\n@@ -66,7 +66,7 @@ struct generic_reciprocal_newton_step<Packet, 0> {\n 2. If a is zero, approx_a_recip must be infinite with the same sign as a.\n 3. If a is infinite, approx_a_recip must be zero with the same sign as a.\n \n- If the preconditions are satisfied, which they are for for the _*_rcp_ps\n+ If the preconditions are satisfied, which they are for the _*_rcp_ps\n instructions on x86, the result has a maximum relative error of 2 ulps,\n and correctly handles zero, infinity, and NaN. Positive denormals are\n treated as zero.\n@@ -116,7 +116,7 @@ struct generic_rsqrt_newton_step<Packet, 0> {\n 2. If a is zero, approx_rsqrt must be infinite.\n 3. If a is infinite, approx_rsqrt must be zero.\n \n- If the preconditions are satisfied, which they are for for the _*_rsqrt_ps\n+ If the preconditions are satisfied, which they are for the _*_rsqrt_ps\n instructions on x86, the result has a maximum relative error of 2 ulps,\n and correctly handles zero and infinity, and NaN. Positive denormal inputs\n are treated as zero.\n","new_path":"Eigen/src/Core/MathFunctionsImpl.h","old_path":"Eigen/src/Core/MathFunctionsImpl.h","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false},{"diff":"@@ -449,7 +449,7 @@ conj(array1)\n </td></tr>\n </table>\n \n-Some coefficient-wise operators are readily available for for matrices and vectors through the following cwise* methods:\n+Some coefficient-wise operators are readily available for matrices and vectors through the following cwise* methods:\n <table class=\"manual\">\n <tr><th>Matrix API \\matrixworld</th><th>Via Array conversions</th></tr>\n <tr><td>\\code\n","new_path":"doc/QuickReference.dox","old_path":"doc/QuickReference.dox","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false}],"changes_count":"2","labels":["5.0"],"state":"merged","summary":"## Title:\nFix typo: duplicated 'for' in docs\n\n## Author:\nKuan-Ting Lee (leekt0124)\n\n## Summary\n### Key Changes:\n- Fixed duplicated 'for' in code (4 instances)\n### Improvements:\n- Corrected typo in documentation\n### Impact:\n- Improved code clarity in documentation"}
{"iid":1937,"title":"Suppress Warray-bounds warning in generic ploaduSegment, fix edge case for vectorized cast","web_url":"https://gitlab.com/libeigen/eigen/-/merge_requests/1937","description":"### Reference issue\n\n\n### What does this implement/fix?\n\n\ng++ will produce the following warning when compiling the half_float test with -O2 -Warray-bounds -mavx2 (possibly other simd flags). \n```\nwarning: ‘void* __builtin___memcpy_chk(void*, const void*, long unsigned int, long unsigned int)’ forming offset [16, 17] is out of the bounds [0, 16] of object ‘aux’ with type ‘Scalar [8]’ {aka ‘Eigen::half [8]’} [-Warray-bounds=]\n```\nThis warning can be suppressed with various asserts in `smart_copy`, or just replacing `smart_copy` with a loop, which I have done here.\n\nThis also fixes an (apparently unrelated) edge case when vectorizing casts involving partial packets. \n\n### Additional information","created_at":"2025-07-14T03:43:10.014Z","merged_at":"2025-07-23T22:26:42.053Z","author":{"name":"Charles Schlosser","username":"chuckyschluz"},"changes":[{"diff":"@@ -707,7 +707,7 @@ struct unary_evaluator<CwiseUnaryOp<core_cast_op<SrcType, DstType>, ArgType>, In\n Index packetOffset = offset * PacketSize;\n Index actualRow = IsRowMajor ? row : row + packetOffset;\n Index actualCol = IsRowMajor ? col + packetOffset : col;\n- eigen_assert(check_array_bounds(actualRow, actualCol, 0, count) && \"Array index out of bounds\");\n+ eigen_assert(check_array_bounds(actualRow, actualCol, begin, count) && \"Array index out of bounds\");\n return m_argImpl.template packetSegment<LoadMode, PacketType>(actualRow, actualCol, begin, count);\n }\n template <int LoadMode, typename PacketType = SrcPacketType>\n@@ -715,8 +715,8 @@ struct unary_evaluator<CwiseUnaryOp<core_cast_op<SrcType, DstType>, ArgType>, In\n Index offset) const {\n constexpr int PacketSize = unpacket_traits<PacketType>::size;\n Index packetOffset = offset * PacketSize;\n- Index actualIndex = index + packetOffset + begin;\n- eigen_assert(check_array_bounds(actualIndex, 0, count) && \"Array index out of bounds\");\n+ Index actualIndex = index + packetOffset;\n+ eigen_assert(check_array_bounds(actualIndex, begin, count) && \"Array index out of bounds\");\n return m_argImpl.template packetSegment<LoadMode, PacketType>(actualIndex, begin, count);\n }\n \n","new_path":"Eigen/src/Core/CoreEvaluators.h","old_path":"Eigen/src/Core/CoreEvaluators.h","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false},{"diff":"@@ -1596,9 +1596,10 @@ EIGEN_DEVICE_FUNC inline Packet ploaduSegment(const typename unpacket_traits<Pac\n using Scalar = typename unpacket_traits<Packet>::type;\n constexpr Index PacketSize = unpacket_traits<Packet>::size;\n eigen_assert((begin >= 0 && count >= 0 && begin + count <= PacketSize) && \"invalid range\");\n- Scalar aux[PacketSize];\n- memset(static_cast<void*>(aux), 0x00, sizeof(Scalar) * PacketSize);\n- smart_copy(from + begin, from + begin + count, aux + begin);\n+ Scalar aux[PacketSize] = {};\n+ for (Index k = begin; k < begin + count; k++) {\n+ aux[k] = from[k];\n+ }\n return ploadu<Packet>(aux);\n }\n \n@@ -1619,7 +1620,9 @@ EIGEN_DEVICE_FUNC inline void pstoreuSegment(Scalar* to, const Packet& from, Ind\n eigen_assert((begin >= 0 && count >= 0 && begin + count <= PacketSize) && \"invalid range\");\n Scalar aux[PacketSize];\n pstoreu<Scalar, Packet>(aux, from);\n- smart_copy(aux + begin, aux + begin + count, to + begin);\n+ for (Index k = begin; k < begin + count; k++) {\n+ to[k] = aux[k];\n+ }\n }\n \n /** \\internal copy the packet \\a from in the range [begin, begin + count) to \\a *to.\n","new_path":"Eigen/src/Core/GenericPacketMath.h","old_path":"Eigen/src/Core/GenericPacketMath.h","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false},{"diff":"@@ -142,6 +142,21 @@ struct packet_segment_test_driver<Scalar, 1> {\n static void run() {}\n };\n \n+template <bool Enable = internal::packet_traits<half>::Vectorizable>\n+void testReverseEdgeCase() {\n+ // this reversed cast uses a non-zero offset for ploadSegment\n+ Index size = 16 * internal::packet_traits<half>::size + 1;\n+ VectorX<half> v1(size);\n+ VectorX<float> v2(size), v3(size);\n+ v1.setRandom();\n+ v2 = v1.reverse().cast<float>();\n+ v3 = v1.cast<float>().reverse();\n+ VERIFY_IS_EQUAL(v2, v3);\n+}\n+\n+template <>\n+void testReverseEdgeCase<false>() {}\n+\n template <typename Scalar>\n void test_packet_segment() {\n packet_segment_test_driver<Scalar, internal::packet_traits<Scalar>::size>::run();\n@@ -164,5 +179,6 @@ EIGEN_DECLARE_TEST(packet_segment) {\n test_packet_segment<double>();\n test_packet_segment<std::complex<float>>();\n test_packet_segment<std::complex<double>>();\n+ testReverseEdgeCase();\n }\n }\n","new_path":"test/packet_segment.cpp","old_path":"test/packet_segment.cpp","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false}],"changes_count":"3","labels":["5.0"],"state":"merged","summary":"## Title:\nSuppress Warray-bounds warning in generic ploaduSegment, fix edge case for vectorized cast\n\n## Author:\nCharles Schlosser (chuckyschluz)\n\n## Summary\n### Key Changes:\n- Modified `Eigen/src/Core/CoreEvaluators.h`\n- Modified `Eigen/src/Core/GenericPacketMath.h`\n- Modified `test/packet_segment.cpp`\n\n### Improvements:\n- Suppressed a warning related to array bounds in the `ploaduSegment` function\n- Fixed an edge case in vectorized casts involving partial packets\n\n### Impact:\n- Reduced compiler warnings in the half_float test\n- Addressed a potential issue with vectorized casts in the packet segment implementation"}
{"iid":1936,"title":"Fixed -Wshadow warning by renaming variables","web_url":"https://gitlab.com/libeigen/eigen/-/merge_requests/1936","description":"","created_at":"2025-07-11T15:30:53.192Z","merged_at":"2025-07-13T14:11:19.413Z","author":{"name":"Sean McBride","username":"seanmcb"},"changes":[{"diff":"@@ -638,7 +638,7 @@ struct pminmax_impl<PropagateNumbers, false> {\n }\n };\n \n-#define EIGEN_BINARY_OP_NAN_PROPAGATION(Type, Func) [](const Type& a, const Type& b) { return Func(a, b); }\n+#define EIGEN_BINARY_OP_NAN_PROPAGATION(Type, Func) [](const Type& aa, const Type& bb) { return Func(aa, bb); }\n \n /** \\internal \\returns the min of \\a a and \\a b (coeff-wise).\n If \\a a or \\b b is NaN, the return value is implementation defined. */\n","new_path":"Eigen/src/Core/GenericPacketMath.h","old_path":"Eigen/src/Core/GenericPacketMath.h","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false}],"changes_count":"1","labels":["5.0"],"state":"merged","summary":"## Title:\nFixed -Wshadow warning by renaming variables\n\n## Author:\nSean McBride (seanmcb)\n\n## Summary\n### Key Changes:\n- Modified `Eigen/src/Core/GenericPacketMath.h` to address the -Wshadow warning by renaming variables.\n\n### Improvements:\n- Addressed compiler warning related to variable shadowing in the Eigen library.\n\n### Impact:\n- Reduced compiler warnings in the Eigen library by renaming variables to avoid shadowing issues."}
{"iid":1935,"title":"Fix self-adjoint products when multiplying by a compile-time vector.","web_url":"https://gitlab.com/libeigen/eigen/-/merge_requests/1935","description":"The implicit assumption for self-adjoint Matrix-Vector products\nis that the vector is a column vector (i.e. same size as the matrix).\n \nIn the case of a row vector, revert to the generic matrix-matrix\nproduct.\n\nThis replaces !1931, which broke the `selfadjoint_eigensolver` tests.\n\nFixes #2943.","created_at":"2025-07-08T17:36:05.061Z","merged_at":"2025-07-08T21:49:00.603Z","author":{"name":"Antonio Sánchez","username":"cantonios"},"changes":[{"diff":"@@ -164,6 +164,11 @@ struct selfadjoint_product_impl<Lhs, LhsMode, false, Rhs, 0, true> {\n \n enum { LhsUpLo = LhsMode & (Upper | Lower) };\n \n+ // Verify that the Rhs is a vector in the correct orientation.\n+ // Otherwise, we break the assumption that we are multiplying\n+ // MxN * Nx1.\n+ static_assert(Rhs::ColsAtCompileTime == 1, \"The RHS must be a column vector.\");\n+\n template <typename Dest>\n static EIGEN_DEVICE_FUNC void run(Dest& dest, const Lhs& a_lhs, const Rhs& a_rhs, const Scalar& alpha) {\n typedef typename Dest::Scalar ResScalar;\n@@ -173,11 +178,6 @@ struct selfadjoint_product_impl<Lhs, LhsMode, false, Rhs, 0, true> {\n \n eigen_assert(dest.rows() == a_lhs.rows() && dest.cols() == a_rhs.cols());\n \n- if (a_lhs.rows() == 1) {\n- dest = (alpha * a_lhs.coeff(0, 0)) * a_rhs;\n- return;\n- }\n-\n add_const_on_value_type_t<ActualLhsType> lhs = LhsBlasTraits::extract(a_lhs);\n add_const_on_value_type_t<ActualRhsType> rhs = RhsBlasTraits::extract(a_rhs);\n \n","new_path":"Eigen/src/Core/products/SelfadjointMatrixVector.h","old_path":"Eigen/src/Core/products/SelfadjointMatrixVector.h","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false},{"diff":"@@ -846,7 +846,7 @@ struct generic_product_impl<Lhs, Rhs, SelfAdjointShape, DenseShape, ProductTag>\n \n template <typename Dest>\n static EIGEN_DEVICE_FUNC void scaleAndAddTo(Dest& dst, const Lhs& lhs, const Rhs& rhs, const Scalar& alpha) {\n- selfadjoint_product_impl<typename Lhs::MatrixType, Lhs::Mode, false, Rhs, 0, Rhs::IsVectorAtCompileTime>::run(\n+ selfadjoint_product_impl<typename Lhs::MatrixType, Lhs::Mode, false, Rhs, 0, Rhs::ColsAtCompileTime == 1>::run(\n dst, lhs.nestedExpression(), rhs, alpha);\n }\n };\n@@ -858,7 +858,7 @@ struct generic_product_impl<Lhs, Rhs, DenseShape, SelfAdjointShape, ProductTag>\n \n template <typename Dest>\n static void scaleAndAddTo(Dest& dst, const Lhs& lhs, const Rhs& rhs, const Scalar& alpha) {\n- selfadjoint_product_impl<Lhs, 0, Lhs::IsVectorAtCompileTime, typename Rhs::MatrixType, Rhs::Mode, false>::run(\n+ selfadjoint_product_impl<Lhs, 0, Lhs::RowsAtCompileTime == 1, typename Rhs::MatrixType, Rhs::Mode, false>::run(\n dst, lhs, rhs.nestedExpression(), alpha);\n }\n };\n","new_path":"Eigen/src/Core/ProductEvaluators.h","old_path":"Eigen/src/Core/ProductEvaluators.h","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false}],"changes_count":"2","labels":["5.0"],"state":"merged","summary":"## Title:\nFix self-adjoint products when multiplying by a compile-time vector.\n\n## Author:\nAntonio Sánchez (cantonios)\n\n## Summary\n### Key Changes:\n- Modified `Eigen/src/Core/products/SelfadjointMatrixVector.h` to handle compile-time vectors correctly.\n- Modified `Eigen/src/Core/ProductEvaluators.h` to adjust behavior for row vectors.\n\n### Improvements:\n- Fixed the behavior of self-adjoint matrix-vector products for compile-time vectors.\n- Replaced an old implementation (`!1931`) that caused issues with the `selfadjoint_eigensolver` tests.\n\n### Impact:\n- Resolved a bug that affected the `selfadjoint_eigensolver` tests.\n- Improved handling of row vectors in self-adjoint products."}
{"iid":1934,"title":"Fix API incompatibility for ILU in superLU support","web_url":"https://gitlab.com/libeigen/eigen/-/merge_requests/1934","description":"### Reference issue\nFix issue #2949 \n\n### What does this implement/fix?\nIntroduce `GlobalLU_t` pointer in the encapsulated interface to superlu as for LU interface. \nDon't know though if a version check is needed. SuperLUv7.0.1 requires this interface. Maybe older version are not.","created_at":"2025-07-08T14:25:13.583Z","merged_at":"2025-07-17T15:27:26.920Z","author":{"name":"jacques FRANC","username":"jacquesn7"},"changes":[{"diff":"@@ -65,6 +65,24 @@ DECL_GSSVX(z, double, std::complex<double>)\n #ifdef EIGEN_SUPERLU_HAS_ILU\n \n // similarly for the incomplete factorization using gsisx\n+#if defined(SUPERLU_MAJOR_VERSION) && (SUPERLU_MAJOR_VERSION >= 5)\n+#define DECL_GSISX(PREFIX, FLOATTYPE, KEYTYPE) \\\n+ extern \"C\" { \\\n+ extern void PREFIX##gsisx(superlu_options_t *, SuperMatrix *, int *, int *, int *, char *, FLOATTYPE *, FLOATTYPE *, \\\n+ SuperMatrix *, SuperMatrix *, void *, int, SuperMatrix *, SuperMatrix *, FLOATTYPE *, \\\n+ FLOATTYPE *, GlobalLU_t *, mem_usage_t *, SuperLUStat_t *, int *); \\\n+ } \\\n+ inline float SuperLU_gsisx(superlu_options_t *options, SuperMatrix *A, int *perm_c, int *perm_r, int *etree, \\\n+ char *equed, FLOATTYPE *R, FLOATTYPE *C, SuperMatrix *L, SuperMatrix *U, void *work, \\\n+ int lwork, SuperMatrix *B, SuperMatrix *X, FLOATTYPE *recip_pivot_growth, \\\n+ FLOATTYPE *rcond, SuperLUStat_t *stats, int *info, KEYTYPE) { \\\n+ mem_usage_t mem_usage; \\\n+ GlobalLU_t gLU; \\\n+ PREFIX##gsisx(options, A, perm_c, perm_r, etree, equed, R, C, L, U, work, lwork, B, X, recip_pivot_growth, rcond, \\\n+ &gLU, &mem_usage, stats, info); \\\n+ return mem_usage.for_lu; /* bytes used by the factor storage */ \\\n+ }\n+#else // version < 5.0\n #define DECL_GSISX(PREFIX, FLOATTYPE, KEYTYPE) \\\n extern \"C\" { \\\n extern void PREFIX##gsisx(superlu_options_t *, SuperMatrix *, int *, int *, int *, char *, FLOATTYPE *, FLOATTYPE *, \\\n@@ -80,6 +98,7 @@ DECL_GSSVX(z, double, std::complex<double>)\n &mem_usage, stats, info); \\\n return mem_usage.for_lu; /* bytes used by the factor storage */ \\\n }\n+#endif\n \n DECL_GSISX(s, float, float)\n DECL_GSISX(c, float, std::complex<float>)\n","new_path":"Eigen/src/SuperLUSupport/SuperLUSupport.h","old_path":"Eigen/src/SuperLUSupport/SuperLUSupport.h","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false}],"changes_count":"1","labels":["5.0"],"state":"merged","summary":"## Title:\nFix API incompatibility for ILU in superLU support\n\n## Author:\njacques FRANC (jacquesn7)\n\n## Summary\n### Key Changes:\n- Introduced `GlobalLU_t` pointer in the encapsulated interface for SuperLU support.\n- Addressed API incompatibility issues related to ILU in SuperLU.\n\n### Improvements:\n- Added support for SuperLUv7.0.1 interface requirements.\n\n### Impact:\n- Resolved compatibility issues between Eigen and SuperLU versions.\n- Improved interface consistency for users transitioning between versions."}
{"iid":1932,"title":"Set CMake POLICY CMP0177 to NEW","web_url":"https://gitlab.com/libeigen/eigen/-/merge_requests/1932","description":"","created_at":"2025-07-04T13:40:57.584Z","merged_at":"2025-07-07T16:47:05.997Z","author":{"name":"Sean McBride","username":"seanmcb"},"changes":[{"diff":"@@ -29,6 +29,11 @@ if (POLICY CMP0146)\n cmake_policy(SET CMP0146 OLD)\n endif ()\n \n+# Normalize DESTINATION paths\n+if (POLICY CMP0177)\n+ cmake_policy(SET CMP0177 NEW)\n+endif ()\n+\n #==============================================================================\n # CMake Project.\n #==============================================================================\n","new_path":"CMakeLists.txt","old_path":"CMakeLists.txt","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false}],"changes_count":"1","labels":["5.0"],"state":"merged","summary":"## Title:\nSet CMake POLICY CMP0177 to NEW\n\n## Author:\nSean McBride (seanmcb)\n\n## Summary\n### Key Changes:\nMODIFIED FILE CMakeLists.txt\n\n### Improvements:\nNA\n\n### Impact:\nNA"}
{"iid":1931,"title":"Fix 1x1 selfadjoint matrix-vector product bug","web_url":"https://gitlab.com/libeigen/eigen/-/merge_requests/1931","description":"### Reference issue\n\nFixes #2943\n\n### What does this implement/fix?\n\n\n### Additional information","created_at":"2025-07-03T02:03:54.440Z","merged_at":"2025-07-07T17:32:55.607Z","author":{"name":"Charles Schlosser","username":"chuckyschluz"},"changes":[{"diff":"@@ -173,6 +173,11 @@ struct selfadjoint_product_impl<Lhs, LhsMode, false, Rhs, 0, true> {\n \n eigen_assert(dest.rows() == a_lhs.rows() && dest.cols() == a_rhs.cols());\n \n+ if (a_lhs.rows() == 1) {\n+ dest = (alpha * a_lhs.coeff(0, 0)) * a_rhs;\n+ return;\n+ }\n+\n add_const_on_value_type_t<ActualLhsType> lhs = LhsBlasTraits::extract(a_lhs);\n add_const_on_value_type_t<ActualRhsType> rhs = RhsBlasTraits::extract(a_rhs);\n \n","new_path":"Eigen/src/Core/products/SelfadjointMatrixVector.h","old_path":"Eigen/src/Core/products/SelfadjointMatrixVector.h","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false},{"diff":"@@ -57,6 +57,10 @@ void product_selfadjoint(const MatrixType& m) {\n v1.tail(rows - 1) * v2.head(cols - 1).adjoint() + v2.head(cols - 1) * v1.tail(rows - 1).adjoint();\n VERIFY_IS_APPROX(m2, m3.template triangularView<Lower>().toDenseMatrix());\n }\n+\n+ // matrix-vector\n+ m2 = m1.template triangularView<Lower>();\n+ VERIFY_IS_APPROX(m1 * m4, m2.template selfadjointView<Lower>() * m4);\n }\n \n EIGEN_DECLARE_TEST(product_selfadjoint) {\n","new_path":"test/product_selfadjoint.cpp","old_path":"test/product_selfadjoint.cpp","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false}],"changes_count":"2","labels":["5.0"],"state":"merged","summary":"## Title:\nFix 1x1 selfadjoint matrix-vector product bug\n\n## Author:\nCharles Schlosser (chuckyschluz)\n\n## Summary\n### Key Changes:\n- Fixed a bug in the 1x1 selfadjoint matrix-vector product in Eigen.\n\n### Improvements:\n- No improvements listed.\n\n### Impact:\n- Impact on the codebase is minimal, as the fix is specific to the 1x1 case."}
{"iid":1930,"title":"Use numext::fma for sparse x dense dot product.","web_url":"https://gitlab.com/libeigen/eigen/-/merge_requests/1930","description":"This change improves accuracy of SparseQR and gives a small speedup on the example in #2583 from 170s to 163s (Skylake-X, AVX2).\n\nFlame graph before:\n\n![image.png](/uploads/a062230ba72bbd6ca0c338fde2faf553/image.png)\n\nFlame graph after:\n\n![image.png](/uploads/9a76814bfc9b9e6a314ebd5a4fed699e/image.png)\n\nFixes #2583","created_at":"2025-07-02T22:56:05.605Z","merged_at":"2025-07-02T23:46:39.828Z","author":{"name":"Rasmus Munk Larsen","username":"rmlarsen1"},"changes":[{"diff":"@@ -36,10 +36,10 @@ inline typename internal::traits<Derived>::Scalar SparseMatrixBase<Derived>::dot\n Scalar res1(0);\n Scalar res2(0);\n for (; i; ++i) {\n- res1 += numext::conj(i.value()) * other.coeff(i.index());\n+ res1 = numext::fma(numext::conj(i.value()), other.coeff(i.index()), res1);\n ++i;\n if (i) {\n- res2 += numext::conj(i.value()) * other.coeff(i.index());\n+ res2 = numext::fma(numext::conj(i.value()), other.coeff(i.index()), res2);\n }\n }\n return res1 + res2;\n","new_path":"Eigen/src/SparseCore/SparseDot.h","old_path":"Eigen/src/SparseCore/SparseDot.h","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false}],"changes_count":"1","labels":["5.0"],"state":"merged","summary":"## Title:\nUse numext::fma for sparse x dense dot product.\n\n## Author:\nRasmus Munk Larsen (rmlarsen1)\n\n## Summary\n### Key Changes:\n- Modified `SparseDot.h` to use `numext::fma` for sparse x dense dot product.\n\n### Improvements:\n- Improved accuracy of SparseQR.\n- Small speedup on the example in #2583 (from 170s to 163s).\n\n### Impact:\n- Reduced computational error in sparse matrix-vector products.\n- Improved performance on specific workloads."}
{"iid":1929,"title":"Fix docs build.","web_url":"https://gitlab.com/libeigen/eigen/-/merge_requests/1929","description":"Doxygen doesn't like markdown links to sections.","created_at":"2025-07-02T22:10:23.165Z","merged_at":"2025-07-02T22:10:36.450Z","author":{"name":"Antonio Sánchez","username":"cantonios"},"changes":[{"diff":"@@ -92,7 +92,7 @@ TensorMap<Tensor<float, 1>> t_12(t_4x3.data(), 12);\n \n #### Class TensorRef\n \n-See [Assigning to a TensorRef.](#assigning-to-a-tensorref)\n+See **Assigning to a `TensorRef`**.\n \n ## Accessing Tensor Elements\n \n@@ -602,8 +602,8 @@ std::cout << \"Size: \" << a.size();\n A few operations provide `dimensions()` directly,\n e.g. `TensorReslicingOp`. Most operations defer calculating dimensions\n until the operation is being evaluated. If you need access to the dimensions\n-of a deferred operation, you can wrap it in a TensorRef (see \n-[Assigning to a TensorRef.](#assigning-to-a-tensorref)), which provides \n+of a deferred operation, you can wrap it in a `TensorRef` (see\n+**Assigning to a TensorRef** above), which provides\n `dimensions()` and `dimension()` as above.\n \n `TensorRef` can also wrap the plain `Tensor` types, so this is a useful idiom in\n@@ -635,7 +635,7 @@ is not initialized.\n Eigen::TensorFixedSize<float, Sizes<3, 4>> a;\n std::cout << \"Rank: \" << a.rank() << endl;\n // Rank: 2\n-std::cout << \"NumRows: \" << a.dimension(0) \n+std::cout << \"NumRows: \" << a.dimension(0)\n << \" NumCols: \" << a.dimension(1) << endl;\n // NumRows: 3 NumCols: 4\n ```\n@@ -848,7 +848,7 @@ These can be chained: you can apply another `Tensor` Operation to the value\n returned by the method.\n \n The chain of Operation is evaluated lazily, typically when it is assigned to a\n-tensor. See [Controlling When Expression are Evaluated](#controlling-when-expression-are-evaluated) for more details about\n+tensor. See **Controlling When Expression are Evaluated** for more details about\n their evaluation.\n \n ### (Operation) constant(const Scalar& val)\n@@ -858,7 +858,7 @@ where all elements have the value `val`.\n \n This is useful, for example, when you want to add or subtract a constant from a\n tensor, or multiply every element of a tensor by a scalar.\n-However, such operations can also be performed using operator overloads (see [operator+](#operation-operator-scalar-s)).\n+However, such operations can also be performed using operator overloads (see `operator+`).\n \n \n ```cpp\n@@ -927,7 +927,7 @@ std::cout << \"b\\n\" << b << \"\\n\\n\";\n // a\n // 1 1 1\n // 1 1 1\n-// \n+//\n // b\n // -1 -1 -1\n // -1 -1 -1\n@@ -1010,7 +1010,7 @@ std::cout << \"b\" << endl << b << endl << endl;\n // a\n // 0 1 8\n // 27 64 125\n-// \n+//\n // b\n // 0 1 2\n // 3 4 5\n@@ -1031,7 +1031,7 @@ std::cout << \"scaled_a\\n\" << scaled_a << \"\\n\";\n // a\n // 1 2 3\n // 4 5 6\n-// \n+//\n // scaled_a\n // 2 4 6\n // 8 10 12\n@@ -1048,8 +1048,8 @@ Divides every element in the tensor by `s`.\n ### (Operation) operator% (Scalar s)\n Computes the element-wise modulus (remainder) of each tensor element divided by `s`\n \n-**Only integer types are supported.** \n-For floating-point tensors, implement a [unaryExpr](#operation-unaryexprcustomunaryop-func) using `std::fmod`.\n+**Only integer types are supported.**\n+For floating-point tensors, implement a `unaryExpr` using `std::fmod`.\n \n ### (Operation) cwiseMax(Scalar threshold)\n Returns the coefficient-wise maximum between two tensors.\n@@ -1203,10 +1203,10 @@ The following boolean operators are supported:\n * `operator>=(const OtherDerived& other)`\n * `operator==(const OtherDerived& other)`\n * `operator!=(const OtherDerived& other)`\n- \n+\n as well as bitwise operators:\n \n- * `operator&(const OtherDerived& other)` \n+ * `operator&(const OtherDerived& other)`\n * `operator|(const OtherDerived& other)`\n * `operator^(const OtherDerived& other)`\n \n@@ -1448,7 +1448,7 @@ std::cout << \"Flat argmax index: \" << argmax_flat();\n \n ### (Operation) argmin(const Dimensions& reduction_dim)\n ### (Operation) argmin()\n-See [argmax](#operation-argmaxconst-dimensions-reduction_dim)\n+See `argmax`.\n \n ### (Operation) reduce(const Dimensions& reduction_dims, const Reducer& reducer)\n \n@@ -1953,7 +1953,7 @@ std::cout << \"b\\n\" << b << \"\\n\";\n \n ### (Operation) roll(const Rolls& shifts)\n \n-Returns a tensor with the elements **circularly shifted** (like bit rotation) along one or more dimensions. \n+Returns a tensor with the elements **circularly shifted** (like bit rotation) along one or more dimensions.\n \n For each dimension `i`, the content is shifted by `shifts[i]` positions:\n \n@@ -2277,7 +2277,7 @@ std::cout << \"b\\n\" << b << \"\\n\";\n ```\n \n ### (Operation) eval()\n-See [Calling eval()](#calling-eval)\n+See **Calling eval()**.\n \n \n \n@@ -2340,7 +2340,7 @@ For example `Tensor<T, N>::maximum()` returns a `Tensor<T, 0>`.\n \n Similarly, the inner product of 2 1d tensors (through contractions) returns a 0d tensor.\n \n-The scalar value can be extracted as explained in [Reduction along all dimensions](#reduction-along-all-dimensions).\n+The scalar value can be extracted as explained in **Reduction along all dimensions**.\n \n \n ## Limitations\n@@ -2349,4 +2349,4 @@ The scalar value can be extracted as explained in [Reduction along all dimension\n compiler that supports cxx11. It is limited to only 5 for older compilers.\n * The `IndexList` class requires a cxx11 compliant compiler. 
You can use an\n array of indices instead if you don't have access to a modern compiler.\n-* On GPUs only floating point values are properly tested and optimized for.\n\\ No newline at end of file\n+* On GPUs only floating point values are properly tested and optimized for.\n","new_path":"unsupported/Eigen/CXX11/src/Tensor/README.md","old_path":"unsupported/Eigen/CXX11/src/Tensor/README.md","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false}],"changes_count":"1","labels":["5.0"],"state":"merged","summary":"## Title:\nFix docs build.\n\n## Author:\nAntonio Sánchez (cantonios)\n\n## Summary\n### Key Changes:\n- Modified `unsupported/Eigen/CXX11/src/Tensor/README.md` to address Doxygen issues with markdown links to sections.\n\n### Improvements:\n- Addressed Doxygen compatibility issues with markdown links in documentation.\n\n### Impact:\n- Resolved build issues related to Doxygen processing of markdown links in the documentation."}
{"iid":1928,"title":"Move default builds/tests to GitLab runners.","web_url":"https://gitlab.com/libeigen/eigen/-/merge_requests/1928","description":"Let's see how many minutes we end up burning through.\n\nOur dedicated self-hosted GitLab runner is swamped now that we only have one remaining linux machine. Let's take advantage of the open-source program.","created_at":"2025-07-02T17:20:56.511Z","merged_at":"2025-07-05T04:37:08.949Z","author":{"name":"Antonio Sánchez","username":"cantonios"},"changes":[{"diff":"@@ -7,9 +7,7 @@\n script:\n - . ci/scripts/build.linux.script.sh\n tags:\n- - linux\n- - eigen-runner\n- - cross-compiler\n+ - saas-linux-2xlarge-amd64\n rules:\n - if: $CI_PIPELINE_SOURCE == \"schedule\" && $CI_PROJECT_NAMESPACE == \"libeigen\"\n - if: $CI_PIPELINE_SOURCE == \"web\" && $CI_PROJECT_NAMESPACE == \"libeigen\"\n@@ -309,12 +307,12 @@ build:linux:cross:ppc64le:gcc-14:default:\n EIGEN_CI_CROSS_C_COMPILER: powerpc64le-linux-gnu-gcc-14\n EIGEN_CI_CROSS_CXX_COMPILER: powerpc64le-linux-gnu-g++-14\n \n-build:linux:cross:ppc64le:clang-12:default:\n+build:linux:cross:ppc64le:clang-16:default:\n extends: .build:linux:cross:ppc64le\n variables:\n- EIGEN_CI_C_COMPILER: clang-12\n- EIGEN_CI_CXX_COMPILER: clang++-12\n- EIGEN_CI_CROSS_INSTALL: g++-14-powerpc64le-linux-gnu clang-12 qemu-user-static\n+ EIGEN_CI_C_COMPILER: clang-16\n+ EIGEN_CI_CXX_COMPILER: clang++-16\n+ EIGEN_CI_CROSS_INSTALL: g++-14-powerpc64le-linux-gnu clang-16 qemu-user-static\n \n ######## loongarch64 #################################################\n \n","new_path":"ci/build.linux.gitlab-ci.yml","old_path":"ci/build.linux.gitlab-ci.yml","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false},{"diff":"@@ -10,9 +10,7 @@\n - if: $CI_PIPELINE_SOURCE == \"web\" && $CI_PROJECT_NAMESPACE == \"libeigen\"\n - if: $CI_PIPELINE_SOURCE == \"merge_request_event\" && $CI_PROJECT_NAMESPACE == \"libeigen\" && $CI_MERGE_REQUEST_LABELS =~ 
\"/all-tests/\"\n tags:\n- - eigen-runner\n- - linux\n- - x86-64\n+ - saas-linux-2xlarge-amd64\n \n ##### x86-64 ###################################################################\n .test:linux:x86-64:\n@@ -373,13 +371,13 @@ test:linux:aarch64:clang-12:default:unsupported:\n variables:\n EIGEN_CI_TARGET_ARCH: ppc64le\n EIGEN_CI_CROSS_TARGET_TRIPLE: powerpc64le-linux-gnu\n+ EIGEN_CI_CTEST_ARGS: --timeout 2000\n \n .test:linux:ppc64le:gcc-14:default:\n extends: .test:linux:ppc64le\n needs: [ build:linux:cross:ppc64le:gcc-14:default ]\n variables:\n EIGEN_CI_CROSS_INSTALL: g++-14-powerpc64le-linux-gnu qemu-user-static\n- EIGEN_CI_CTEST_ARGS: --timeout 2000\n \n test:linux:ppc64le:gcc-14:default:official:\n extends: .test:linux:ppc64le:gcc-14:default\n@@ -391,19 +389,19 @@ test:linux:ppc64le:gcc-14:default:unsupported:\n variables:\n EIGEN_CI_CTEST_LABEL: Unsupported\n \n-.test:linux:ppc64le:clang-12:default:\n+.test:linux:ppc64le:clang-16:default:\n extends: .test:linux:ppc64le\n- needs: [ build:linux:cross:ppc64le:clang-12:default ]\n+ needs: [ build:linux:cross:ppc64le:clang-16:default ]\n variables:\n- EIGEN_CI_CROSS_INSTALL: g++-14-powerpc64le-linux-gnu clang-12 qemu-user-static\n+ EIGEN_CI_CROSS_INSTALL: g++-14-powerpc64le-linux-gnu clang-16 qemu-user-static\n \n-test:linux:ppc64le:clang-12:default:official:\n- extends: .test:linux:ppc64le:clang-12:default\n+test:linux:ppc64le:clang-16:default:official:\n+ extends: .test:linux:ppc64le:clang-16:default\n variables:\n EIGEN_CI_CTEST_LABEL: Official\n \n-test:linux:ppc64le:clang-12:default:unsupported:\n- extends: .test:linux:ppc64le:clang-12:default\n+test:linux:ppc64le:clang-16:default:unsupported:\n+ extends: .test:linux:ppc64le:clang-16:default\n variables:\n EIGEN_CI_CTEST_LABEL: Unsupported\n 
\n","new_path":"ci/test.linux.gitlab-ci.yml","old_path":"ci/test.linux.gitlab-ci.yml","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false}],"changes_count":"2","labels":["5.0","all-tests"],"state":"merged","summary":"## Title:\nMove default builds/tests to GitLab runners.\n\n## Author:\nAntonio Sánchez (cantonios)\n\n## Summary\n### Key Changes:\n- Modified `ci/build.linux.gitlab-ci.yml` and `ci/test.linux.gitlab-ci.yml` to move default builds/tests to GitLab runners.\n\n### Improvements:\n- Improved CI/CD workflow by utilizing GitLab runners for builds and tests.\n\n### Impact:\n- Reduced reliance on a single Linux machine for builds and tests.\n- Improved efficiency by leveraging open-source GitLab runners."}
{"iid":1927,"title":"Replace PPC g++-10 with g++14.","web_url":"https://gitlab.com/libeigen/eigen/-/merge_requests/1927","description":"It looks like g++-10 has a bunch of compiler issues.\n\nThrough some debugging/print statements, it looks like sometimes a `static_cast<float>(double)`\nproduces 0 for inputs like 0.224. Adding print statements fixes\nthe issue, adding other explicit casts fixes the issue, re-arranging statements\nfixes the issue. There are no such issues with clang, or with newer versions of\ng++-powerpc64le-linux-gnu.\n\nOur CI had a bunch of failures ever since removing the PPC runner.","created_at":"2025-07-01T23:10:25.105Z","merged_at":"2025-07-02T17:07:44.994Z","author":{"name":"Antonio Sánchez","username":"cantonios"},"changes":[{"diff":"@@ -294,6 +294,7 @@ build:linux:cross:aarch64:clang-12:default:\n \n .build:linux:cross:ppc64le:\n extends: .build:linux:cross\n+ image: ubuntu:24.04\n variables:\n EIGEN_CI_TARGET_ARCH: ppc64le\n EIGEN_CI_CROSS_TARGET_TRIPLE: powerpc64le-linux-gnu\n@@ -301,19 +302,19 @@ build:linux:cross:aarch64:clang-12:default:\n -DCMAKE_SYSTEM_NAME=Linux\n -DCMAKE_CROSSCOMPILING_EMULATOR=qemu-ppc64le-static;-L;/usr/powerpc64le-linux-gnu\n \n-build:linux:cross:ppc64le:gcc-10:default:\n+build:linux:cross:ppc64le:gcc-14:default:\n extends: .build:linux:cross:ppc64le\n variables:\n- EIGEN_CI_CROSS_INSTALL: g++-10-powerpc64le-linux-gnu qemu-user-static\n- EIGEN_CI_CROSS_C_COMPILER: powerpc64le-linux-gnu-gcc-10\n- EIGEN_CI_CROSS_CXX_COMPILER: powerpc64le-linux-gnu-g++-10\n+ EIGEN_CI_CROSS_INSTALL: g++-14-powerpc64le-linux-gnu qemu-user-static\n+ EIGEN_CI_CROSS_C_COMPILER: powerpc64le-linux-gnu-gcc-14\n+ EIGEN_CI_CROSS_CXX_COMPILER: powerpc64le-linux-gnu-g++-14\n \n build:linux:cross:ppc64le:clang-12:default:\n extends: .build:linux:cross:ppc64le\n variables:\n EIGEN_CI_C_COMPILER: clang-12\n EIGEN_CI_CXX_COMPILER: clang++-12\n- EIGEN_CI_CROSS_INSTALL: g++-10-powerpc64le-linux-gnu clang-12 qemu-user-static\n+ 
EIGEN_CI_CROSS_INSTALL: g++-14-powerpc64le-linux-gnu clang-12 qemu-user-static\n \n ######## loongarch64 #################################################\n \n","new_path":"ci/build.linux.gitlab-ci.yml","old_path":"ci/build.linux.gitlab-ci.yml","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false},{"diff":"@@ -369,24 +369,25 @@ test:linux:aarch64:clang-12:default:unsupported:\n \n .test:linux:ppc64le:\n extends: .test:linux\n+ image: ubuntu:24.04\n variables:\n EIGEN_CI_TARGET_ARCH: ppc64le\n EIGEN_CI_CROSS_TARGET_TRIPLE: powerpc64le-linux-gnu\n \n-.test:linux:ppc64le:gcc-10:default:\n+.test:linux:ppc64le:gcc-14:default:\n extends: .test:linux:ppc64le\n- needs: [ build:linux:cross:ppc64le:gcc-10:default ]\n+ needs: [ build:linux:cross:ppc64le:gcc-14:default ]\n variables:\n- EIGEN_CI_CROSS_INSTALL: g++-10-powerpc64le-linux-gnu qemu-user-static\n+ EIGEN_CI_CROSS_INSTALL: g++-14-powerpc64le-linux-gnu qemu-user-static\n EIGEN_CI_CTEST_ARGS: --timeout 2000\n \n-test:linux:ppc64le:gcc-10:default:official:\n- extends: .test:linux:ppc64le:gcc-10:default\n+test:linux:ppc64le:gcc-14:default:official:\n+ extends: .test:linux:ppc64le:gcc-14:default\n variables:\n EIGEN_CI_CTEST_LABEL: Official\n \n-test:linux:ppc64le:gcc-10:default:unsupported:\n- extends: .test:linux:ppc64le:gcc-10:default\n+test:linux:ppc64le:gcc-14:default:unsupported:\n+ extends: .test:linux:ppc64le:gcc-14:default\n variables:\n EIGEN_CI_CTEST_LABEL: Unsupported\n \n@@ -394,7 +395,7 @@ test:linux:ppc64le:gcc-10:default:unsupported:\n extends: .test:linux:ppc64le\n needs: [ build:linux:cross:ppc64le:clang-12:default ]\n variables:\n- EIGEN_CI_CROSS_INSTALL: g++-10-powerpc64le-linux-gnu clang-12 qemu-user-static\n+ EIGEN_CI_CROSS_INSTALL: g++-14-powerpc64le-linux-gnu clang-12 qemu-user-static\n \n test:linux:ppc64le:clang-12:default:official:\n extends: 
.test:linux:ppc64le:clang-12:default\n","new_path":"ci/test.linux.gitlab-ci.yml","old_path":"ci/test.linux.gitlab-ci.yml","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false}],"changes_count":"2","labels":["5.0"],"state":"merged","summary":"## Title:\nReplace PPC g++-10 with g++14.\n\n## Author:\nAntonio Sánchez (cantonios)\n\n## Summary\n### Key Changes:\n- Modified `ci/build.linux.gitlab-ci.yml` and `ci/test.linux.gitlab-ci.yml` to use g++14 instead of g++-10.\n\n### Improvements:\n- Addressed compiler issues with g++-10 on PPC architecture.\n- Fixed issues with `static_cast<float>(double)` producing incorrect values.\n\n### Impact:\n- Resolved CI failures caused by removing PPC support.\n- Ensured compatibility with PPC architecture using g++14."}
{"iid":1924,"title":"arm packet alignment requirements and aligned loads/stores","web_url":"https://gitlab.com/libeigen/eigen/-/merge_requests/1924","description":"### Reference issue\n\n\n### What does this implement/fix?\n\n1) The alignment requirements for some arm simd vectors are too strict. \n2) Arm does not provide intrinsics for aligned loads and stores. For arm32, we can provide an alignment hint which generates the aligned instructions. Arm64 appears to ignore these hints.\n\nhttps://godbolt.org/z/6dd33M4Wq\n\nCan anyone benchmark this?\n\n### Additional information","created_at":"2025-06-27T03:12:51.603Z","merged_at":"2025-07-15T23:49:05.738Z","author":{"name":"Charles Schlosser","username":"chuckyschluz"},"changes":[{"diff":"@@ -73,30 +73,13 @@ struct packet_traits<std::complex<float> > : default_packet_traits {\n };\n \n template <>\n-struct unpacket_traits<Packet1cf> {\n- typedef std::complex<float> type;\n- typedef Packet1cf half;\n- typedef Packet2f as_real;\n- enum {\n- size = 1,\n- alignment = Aligned16,\n- vectorizable = true,\n- masked_load_available = false,\n- masked_store_available = false\n- };\n+struct unpacket_traits<Packet1cf> : neon_unpacket_default<Packet1cf, std::complex<float>> {\n+ using as_real = Packet2f;\n };\n template <>\n-struct unpacket_traits<Packet2cf> {\n- typedef std::complex<float> type;\n- typedef Packet1cf half;\n- typedef Packet4f as_real;\n- enum {\n- size = 2,\n- alignment = Aligned16,\n- vectorizable = true,\n- masked_load_available = false,\n- masked_store_available = false\n- };\n+struct unpacket_traits<Packet2cf> : neon_unpacket_default<Packet2cf, std::complex<float>> {\n+ using half = Packet1cf;\n+ using as_real = Packet4f;\n };\n \n template <>\n@@ -297,10 +280,12 @@ EIGEN_STRONG_INLINE Packet2cf pandnot<Packet2cf>(const Packet2cf& a, const Packe\n \n template <>\n EIGEN_STRONG_INLINE Packet1cf pload<Packet1cf>(const std::complex<float>* from) {\n+ EIGEN_ASSUME_ALIGNED(from, 
unpacket_traits<Packet1cf>::alignment);\n EIGEN_DEBUG_ALIGNED_LOAD return Packet1cf(pload<Packet2f>((const float*)from));\n }\n template <>\n EIGEN_STRONG_INLINE Packet2cf pload<Packet2cf>(const std::complex<float>* from) {\n+ EIGEN_ASSUME_ALIGNED(from, unpacket_traits<Packet2cf>::alignment);\n EIGEN_DEBUG_ALIGNED_LOAD return Packet2cf(pload<Packet4f>(reinterpret_cast<const float*>(from)));\n }\n \n@@ -324,10 +309,12 @@ EIGEN_STRONG_INLINE Packet2cf ploaddup<Packet2cf>(const std::complex<float>* fro\n \n template <>\n EIGEN_STRONG_INLINE void pstore<std::complex<float> >(std::complex<float>* to, const Packet1cf& from) {\n+ EIGEN_ASSUME_ALIGNED(to, unpacket_traits<Packet1cf>::alignment);\n EIGEN_DEBUG_ALIGNED_STORE pstore((float*)to, from.v);\n }\n template <>\n EIGEN_STRONG_INLINE void pstore<std::complex<float> >(std::complex<float>* to, const Packet2cf& from) {\n+ EIGEN_ASSUME_ALIGNED(to, unpacket_traits<Packet2cf>::alignment);\n EIGEN_DEBUG_ALIGNED_STORE pstore(reinterpret_cast<float*>(to), from.v);\n }\n \n@@ -538,21 +525,13 @@ struct packet_traits<std::complex<double> > : default_packet_traits {\n };\n \n template <>\n-struct unpacket_traits<Packet1cd> {\n- typedef std::complex<double> type;\n- typedef Packet1cd half;\n- typedef Packet2d as_real;\n- enum {\n- size = 1,\n- alignment = Aligned16,\n- vectorizable = true,\n- masked_load_available = false,\n- masked_store_available = false\n- };\n+struct unpacket_traits<Packet1cd> : neon_unpacket_default<Packet1cd, std::complex<double>> {\n+ using as_real = Packet2d;\n };\n \n template <>\n EIGEN_STRONG_INLINE Packet1cd pload<Packet1cd>(const std::complex<double>* from) {\n+ EIGEN_ASSUME_ALIGNED(from, unpacket_traits<Packet1cd>::alignment);\n EIGEN_DEBUG_ALIGNED_LOAD return Packet1cd(pload<Packet2d>(reinterpret_cast<const double*>(from)));\n }\n \n@@ -666,6 +645,7 @@ EIGEN_STRONG_INLINE Packet1cd ploaddup<Packet1cd>(const std::complex<double>* fr\n \n template <>\n EIGEN_STRONG_INLINE void 
pstore<std::complex<double> >(std::complex<double>* to, const Packet1cd& from) {\n+ EIGEN_ASSUME_ALIGNED(to, unpacket_traits<Packet1cd>::alignment);\n EIGEN_DEBUG_ALIGNED_STORE pstore(reinterpret_cast<double*>(to), from.v);\n }\n \n","new_path":"Eigen/src/Core/arch/NEON/Complex.h","old_path":"Eigen/src/Core/arch/NEON/Complex.h","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false},{"diff":"@@ -437,224 +437,74 @@ struct packet_traits<uint64_t> : default_packet_traits {\n };\n };\n \n-template <>\n-struct unpacket_traits<Packet2f> {\n- typedef float type;\n- typedef Packet2f half;\n- typedef Packet2i integer_packet;\n- enum {\n- size = 2,\n- alignment = Aligned16,\n- vectorizable = true,\n- masked_load_available = false,\n- masked_store_available = false\n- };\n+template <typename Packet, typename Scalar>\n+struct neon_unpacket_default {\n+ using type = Scalar;\n+ using half = Packet;\n+ static constexpr int size = sizeof(Packet) / sizeof(Scalar);\n+ static constexpr int alignment = sizeof(Packet);\n+ static constexpr bool vectorizable = true;\n+ static constexpr bool masked_load_available = false;\n+ static constexpr bool masked_store_available = false;\n };\n+\n template <>\n-struct unpacket_traits<Packet4f> {\n- typedef float type;\n- typedef Packet2f half;\n- typedef Packet4i integer_packet;\n- enum {\n- size = 4,\n- alignment = Aligned16,\n- vectorizable = true,\n- masked_load_available = false,\n- masked_store_available = false\n- };\n+struct unpacket_traits<Packet2f> : neon_unpacket_default<Packet2f, float> {\n+ using integer_packet = Packet2i;\n };\n template <>\n-struct unpacket_traits<Packet4c> {\n- typedef int8_t type;\n- typedef Packet4c half;\n- enum {\n- size = 4,\n- alignment = Unaligned,\n- vectorizable = true,\n- masked_load_available = false,\n- masked_store_available = false\n- };\n+struct unpacket_traits<Packet4f> : neon_unpacket_default<Packet4f, float> {\n+ using half = 
Packet2f;\n+ using integer_packet = Packet4i;\n };\n template <>\n-struct unpacket_traits<Packet8c> {\n- typedef int8_t type;\n- typedef Packet4c half;\n- enum {\n- size = 8,\n- alignment = Aligned16,\n- vectorizable = true,\n- masked_load_available = false,\n- masked_store_available = false\n- };\n-};\n+struct unpacket_traits<Packet4c> : neon_unpacket_default<Packet4c, int8_t> {};\n template <>\n-struct unpacket_traits<Packet16c> {\n- typedef int8_t type;\n- typedef Packet8c half;\n- enum {\n- size = 16,\n- alignment = Aligned16,\n- vectorizable = true,\n- masked_load_available = false,\n- masked_store_available = false\n- };\n+struct unpacket_traits<Packet8c> : neon_unpacket_default<Packet8c, int8_t> {\n+ using half = Packet4c;\n };\n template <>\n-struct unpacket_traits<Packet4uc> {\n- typedef uint8_t type;\n- typedef Packet4uc half;\n- enum {\n- size = 4,\n- alignment = Unaligned,\n- vectorizable = true,\n- masked_load_available = false,\n- masked_store_available = false\n- };\n+struct unpacket_traits<Packet16c> : neon_unpacket_default<Packet16c, int8_t> {\n+ using half = Packet8c;\n };\n template <>\n-struct unpacket_traits<Packet8uc> {\n- typedef uint8_t type;\n- typedef Packet4uc half;\n- enum {\n- size = 8,\n- alignment = Aligned16,\n- vectorizable = true,\n- masked_load_available = false,\n- masked_store_available = false\n- };\n-};\n+struct unpacket_traits<Packet4uc> : neon_unpacket_default<Packet4uc, uint8_t> {};\n template <>\n-struct unpacket_traits<Packet16uc> {\n- typedef uint8_t type;\n- typedef Packet8uc half;\n- enum {\n- size = 16,\n- alignment = Aligned16,\n- vectorizable = true,\n- masked_load_available = false,\n- masked_store_available = false\n- };\n+struct unpacket_traits<Packet8uc> : neon_unpacket_default<Packet8uc, uint8_t> {\n+ using half = Packet4uc;\n };\n template <>\n-struct unpacket_traits<Packet4s> {\n- typedef int16_t type;\n- typedef Packet4s half;\n- enum {\n- size = 4,\n- alignment = Aligned16,\n- vectorizable = true,\n- 
masked_load_available = false,\n- masked_store_available = false\n- };\n+struct unpacket_traits<Packet16uc> : neon_unpacket_default<Packet16uc, uint8_t> {\n+ using half = Packet8uc;\n };\n template <>\n-struct unpacket_traits<Packet8s> {\n- typedef int16_t type;\n- typedef Packet4s half;\n- enum {\n- size = 8,\n- alignment = Aligned16,\n- vectorizable = true,\n- masked_load_available = false,\n- masked_store_available = false\n- };\n-};\n+struct unpacket_traits<Packet4s> : neon_unpacket_default<Packet4s, int16_t> {};\n template <>\n-struct unpacket_traits<Packet4us> {\n- typedef uint16_t type;\n- typedef Packet4us half;\n- enum {\n- size = 4,\n- alignment = Aligned16,\n- vectorizable = true,\n- masked_load_available = false,\n- masked_store_available = false\n- };\n+struct unpacket_traits<Packet8s> : neon_unpacket_default<Packet8s, int16_t> {\n+ using half = Packet4s;\n };\n template <>\n-struct unpacket_traits<Packet8us> {\n- typedef uint16_t type;\n- typedef Packet4us half;\n- enum {\n- size = 8,\n- alignment = Aligned16,\n- vectorizable = true,\n- masked_load_available = false,\n- masked_store_available = false\n- };\n-};\n+struct unpacket_traits<Packet4us> : neon_unpacket_default<Packet4us, uint16_t> {};\n template <>\n-struct unpacket_traits<Packet2i> {\n- typedef int32_t type;\n- typedef Packet2i half;\n- enum {\n- size = 2,\n- alignment = Aligned16,\n- vectorizable = true,\n- masked_load_available = false,\n- masked_store_available = false\n- };\n+struct unpacket_traits<Packet8us> : neon_unpacket_default<Packet8us, uint16_t> {\n+ using half = Packet4us;\n };\n template <>\n-struct unpacket_traits<Packet4i> {\n- typedef int32_t type;\n- typedef Packet2i half;\n- enum {\n- size = 4,\n- alignment = Aligned16,\n- vectorizable = true,\n- masked_load_available = false,\n- masked_store_available = false\n- };\n-};\n+struct unpacket_traits<Packet2i> : neon_unpacket_default<Packet2i, int32_t> {};\n template <>\n-struct unpacket_traits<Packet2ui> {\n- typedef uint32_t 
type;\n- typedef Packet2ui half;\n- enum {\n- size = 2,\n- alignment = Aligned16,\n- vectorizable = true,\n- masked_load_available = false,\n- masked_store_available = false\n- };\n+struct unpacket_traits<Packet4i> : neon_unpacket_default<Packet4i, int32_t> {\n+ using half = Packet2i;\n };\n template <>\n-struct unpacket_traits<Packet4ui> {\n- typedef uint32_t type;\n- typedef Packet2ui half;\n- enum {\n- size = 4,\n- alignment = Aligned16,\n- vectorizable = true,\n- masked_load_available = false,\n- masked_store_available = false\n- };\n-};\n+struct unpacket_traits<Packet2ui> : neon_unpacket_default<Packet2ui, uint32_t> {};\n template <>\n-struct unpacket_traits<Packet2l> {\n- typedef int64_t type;\n- typedef Packet2l half;\n- enum {\n- size = 2,\n- alignment = Aligned16,\n- vectorizable = true,\n- masked_load_available = false,\n- masked_store_available = false\n- };\n+struct unpacket_traits<Packet4ui> : neon_unpacket_default<Packet4ui, uint32_t> {\n+ using half = Packet2ui;\n };\n template <>\n-struct unpacket_traits<Packet2ul> {\n- typedef uint64_t type;\n- typedef Packet2ul half;\n- enum {\n- size = 2,\n- alignment = Aligned16,\n- vectorizable = true,\n- masked_load_available = false,\n- masked_store_available = false\n- };\n-};\n+struct unpacket_traits<Packet2l> : neon_unpacket_default<Packet2l, int64_t> {};\n+template <>\n+struct unpacket_traits<Packet2ul> : neon_unpacket_default<Packet2ul, uint64_t> {};\n \n template <>\n EIGEN_STRONG_INLINE Packet2f pzero(const Packet2f& /*a*/) {\n@@ -2417,10 +2267,12 @@ EIGEN_STRONG_INLINE Packet2ul plogical_shift_left(Packet2ul a) {\n \n template <>\n EIGEN_STRONG_INLINE Packet2f pload<Packet2f>(const float* from) {\n+ EIGEN_ASSUME_ALIGNED(from, unpacket_traits<Packet2f>::alignment);\n EIGEN_DEBUG_ALIGNED_LOAD return vld1_f32(from);\n }\n template <>\n EIGEN_STRONG_INLINE Packet4f pload<Packet4f>(const float* from) {\n+ EIGEN_ASSUME_ALIGNED(from, unpacket_traits<Packet4f>::alignment);\n EIGEN_DEBUG_ALIGNED_LOAD return 
vld1q_f32(from);\n }\n template <>\n@@ -2431,10 +2283,12 @@ EIGEN_STRONG_INLINE Packet4c pload<Packet4c>(const int8_t* from) {\n }\n template <>\n EIGEN_STRONG_INLINE Packet8c pload<Packet8c>(const int8_t* from) {\n+ EIGEN_ASSUME_ALIGNED(from, unpacket_traits<Packet8c>::alignment);\n EIGEN_DEBUG_ALIGNED_LOAD return vld1_s8(from);\n }\n template <>\n EIGEN_STRONG_INLINE Packet16c pload<Packet16c>(const int8_t* from) {\n+ EIGEN_ASSUME_ALIGNED(from, unpacket_traits<Packet16c>::alignment);\n EIGEN_DEBUG_ALIGNED_LOAD return vld1q_s8(from);\n }\n template <>\n@@ -2445,50 +2299,62 @@ EIGEN_STRONG_INLINE Packet4uc pload<Packet4uc>(const uint8_t* from) {\n }\n template <>\n EIGEN_STRONG_INLINE Packet8uc pload<Packet8uc>(const uint8_t* from) {\n+ EIGEN_ASSUME_ALIGNED(from, unpacket_traits<Packet8uc>::alignment);\n EIGEN_DEBUG_ALIGNED_LOAD return vld1_u8(from);\n }\n template <>\n EIGEN_STRONG_INLINE Packet16uc pload<Packet16uc>(const uint8_t* from) {\n+ EIGEN_ASSUME_ALIGNED(from, unpacket_traits<Packet16uc>::alignment);\n EIGEN_DEBUG_ALIGNED_LOAD return vld1q_u8(from);\n }\n template <>\n EIGEN_STRONG_INLINE Packet4s pload<Packet4s>(const int16_t* from) {\n+ EIGEN_ASSUME_ALIGNED(from, unpacket_traits<Packet4s>::alignment);\n EIGEN_DEBUG_ALIGNED_LOAD return vld1_s16(from);\n }\n template <>\n EIGEN_STRONG_INLINE Packet8s pload<Packet8s>(const int16_t* from) {\n+ EIGEN_ASSUME_ALIGNED(from, unpacket_traits<Packet8s>::alignment);\n EIGEN_DEBUG_ALIGNED_LOAD return vld1q_s16(from);\n }\n template <>\n EIGEN_STRONG_INLINE Packet4us pload<Packet4us>(const uint16_t* from) {\n+ EIGEN_ASSUME_ALIGNED(from, unpacket_traits<Packet4us>::alignment);\n EIGEN_DEBUG_ALIGNED_LOAD return vld1_u16(from);\n }\n template <>\n EIGEN_STRONG_INLINE Packet8us pload<Packet8us>(const uint16_t* from) {\n+ EIGEN_ASSUME_ALIGNED(from, unpacket_traits<Packet8us>::alignment);\n EIGEN_DEBUG_ALIGNED_LOAD return vld1q_u16(from);\n }\n template <>\n EIGEN_STRONG_INLINE Packet2i pload<Packet2i>(const int32_t* from) 
{\n+ EIGEN_ASSUME_ALIGNED(from, unpacket_traits<Packet2i>::alignment);\n EIGEN_DEBUG_ALIGNED_LOAD return vld1_s32(from);\n }\n template <>\n EIGEN_STRONG_INLINE Packet4i pload<Packet4i>(const int32_t* from) {\n+ EIGEN_ASSUME_ALIGNED(from, unpacket_traits<Packet4i>::alignment);\n EIGEN_DEBUG_ALIGNED_LOAD return vld1q_s32(from);\n }\n template <>\n EIGEN_STRONG_INLINE Packet2ui pload<Packet2ui>(const uint32_t* from) {\n+ EIGEN_ASSUME_ALIGNED(from, unpacket_traits<Packet2ui>::alignment);\n EIGEN_DEBUG_ALIGNED_LOAD return vld1_u32(from);\n }\n template <>\n EIGEN_STRONG_INLINE Packet4ui pload<Packet4ui>(const uint32_t* from) {\n+ EIGEN_ASSUME_ALIGNED(from, unpacket_traits<Packet4ui>::alignment);\n EIGEN_DEBUG_ALIGNED_LOAD return vld1q_u32(from);\n }\n template <>\n EIGEN_STRONG_INLINE Packet2l pload<Packet2l>(const int64_t* from) {\n+ EIGEN_ASSUME_ALIGNED(from, unpacket_traits<Packet2l>::alignment);\n EIGEN_DEBUG_ALIGNED_LOAD return vld1q_s64(from);\n }\n template <>\n EIGEN_STRONG_INLINE Packet2ul pload<Packet2ul>(const uint64_t* from) {\n+ EIGEN_ASSUME_ALIGNED(from, unpacket_traits<Packet2ul>::alignment);\n EIGEN_DEBUG_ALIGNED_LOAD return vld1q_u64(from);\n }\n \n@@ -2713,10 +2579,12 @@ EIGEN_STRONG_INLINE Packet4ui ploadquad<Packet4ui>(const uint32_t* from) {\n \n template <>\n EIGEN_STRONG_INLINE void pstore<float>(float* to, const Packet2f& from) {\n+ EIGEN_ASSUME_ALIGNED(to, unpacket_traits<Packet2f>::alignment);\n EIGEN_DEBUG_ALIGNED_STORE vst1_f32(to, from);\n }\n template <>\n EIGEN_STRONG_INLINE void pstore<float>(float* to, const Packet4f& from) {\n+ EIGEN_ASSUME_ALIGNED(to, unpacket_traits<Packet4f>::alignment);\n EIGEN_DEBUG_ALIGNED_STORE vst1q_f32(to, from);\n }\n template <>\n@@ -2725,10 +2593,12 @@ EIGEN_STRONG_INLINE void pstore<int8_t>(int8_t* to, const Packet4c& from) {\n }\n template <>\n EIGEN_STRONG_INLINE void pstore<int8_t>(int8_t* to, const Packet8c& from) {\n+ EIGEN_ASSUME_ALIGNED(to, unpacket_traits<Packet8c>::alignment);\n 
EIGEN_DEBUG_ALIGNED_STORE vst1_s8(to, from);\n }\n template <>\n EIGEN_STRONG_INLINE void pstore<int8_t>(int8_t* to, const Packet16c& from) {\n+ EIGEN_ASSUME_ALIGNED(to, unpacket_traits<Packet16c>::alignment);\n EIGEN_DEBUG_ALIGNED_STORE vst1q_s8(to, from);\n }\n template <>\n@@ -2737,50 +2607,62 @@ EIGEN_STRONG_INLINE void pstore<uint8_t>(uint8_t* to, const Packet4uc& from) {\n }\n template <>\n EIGEN_STRONG_INLINE void pstore<uint8_t>(uint8_t* to, const Packet8uc& from) {\n+ EIGEN_ASSUME_ALIGNED(to, unpacket_traits<Packet8uc>::alignment);\n EIGEN_DEBUG_ALIGNED_STORE vst1_u8(to, from);\n }\n template <>\n EIGEN_STRONG_INLINE void pstore<uint8_t>(uint8_t* to, const Packet16uc& from) {\n+ EIGEN_ASSUME_ALIGNED(to, unpacket_traits<Packet16uc>::alignment);\n EIGEN_DEBUG_ALIGNED_STORE vst1q_u8(to, from);\n }\n template <>\n EIGEN_STRONG_INLINE void pstore<int16_t>(int16_t* to, const Packet4s& from) {\n+ EIGEN_ASSUME_ALIGNED(to, unpacket_traits<Packet4s>::alignment);\n EIGEN_DEBUG_ALIGNED_STORE vst1_s16(to, from);\n }\n template <>\n EIGEN_STRONG_INLINE void pstore<int16_t>(int16_t* to, const Packet8s& from) {\n+ EIGEN_ASSUME_ALIGNED(to, unpacket_traits<Packet8s>::alignment);\n EIGEN_DEBUG_ALIGNED_STORE vst1q_s16(to, from);\n }\n template <>\n EIGEN_STRONG_INLINE void pstore<uint16_t>(uint16_t* to, const Packet4us& from) {\n+ EIGEN_ASSUME_ALIGNED(to, unpacket_traits<Packet4us>::alignment);\n EIGEN_DEBUG_ALIGNED_STORE vst1_u16(to, from);\n }\n template <>\n EIGEN_STRONG_INLINE void pstore<uint16_t>(uint16_t* to, const Packet8us& from) {\n+ EIGEN_ASSUME_ALIGNED(to, unpacket_traits<Packet8us>::alignment);\n EIGEN_DEBUG_ALIGNED_STORE vst1q_u16(to, from);\n }\n template <>\n EIGEN_STRONG_INLINE void pstore<int32_t>(int32_t* to, const Packet2i& from) {\n+ EIGEN_ASSUME_ALIGNED(to, unpacket_traits<Packet2i>::alignment);\n EIGEN_DEBUG_ALIGNED_STORE vst1_s32(to, from);\n }\n template <>\n EIGEN_STRONG_INLINE void pstore<int32_t>(int32_t* to, const Packet4i& from) {\n+ 
EIGEN_ASSUME_ALIGNED(to, unpacket_traits<Packet4i>::alignment);\n EIGEN_DEBUG_ALIGNED_STORE vst1q_s32(to, from);\n }\n template <>\n EIGEN_STRONG_INLINE void pstore<uint32_t>(uint32_t* to, const Packet2ui& from) {\n+ EIGEN_ASSUME_ALIGNED(to, unpacket_traits<Packet2ui>::alignment);\n EIGEN_DEBUG_ALIGNED_STORE vst1_u32(to, from);\n }\n template <>\n EIGEN_STRONG_INLINE void pstore<uint32_t>(uint32_t* to, const Packet4ui& from) {\n+ EIGEN_ASSUME_ALIGNED(to, unpacket_traits<Packet4ui>::alignment);\n EIGEN_DEBUG_ALIGNED_STORE vst1q_u32(to, from);\n }\n template <>\n EIGEN_STRONG_INLINE void pstore<int64_t>(int64_t* to, const Packet2l& from) {\n+ EIGEN_ASSUME_ALIGNED(to, unpacket_traits<Packet2l>::alignment);\n EIGEN_DEBUG_ALIGNED_STORE vst1q_s64(to, from);\n }\n template <>\n EIGEN_STRONG_INLINE void pstore<uint64_t>(uint64_t* to, const Packet2ul& from) {\n+ EIGEN_ASSUME_ALIGNED(to, unpacket_traits<Packet2ul>::alignment);\n EIGEN_DEBUG_ALIGNED_STORE vst1q_u64(to, from);\n }\n \n@@ -4801,17 +4683,7 @@ struct packet_traits<bfloat16> : default_packet_traits {\n };\n \n template <>\n-struct unpacket_traits<Packet4bf> {\n- typedef bfloat16 type;\n- typedef Packet4bf half;\n- enum {\n- size = 4,\n- alignment = Aligned16,\n- vectorizable = true,\n- masked_load_available = false,\n- masked_store_available = false\n- };\n-};\n+struct unpacket_traits<Packet4bf> : neon_unpacket_default<Packet4bf, bfloat16> {};\n \n namespace detail {\n template <>\n@@ -4866,6 +4738,7 @@ EIGEN_STRONG_INLINE bfloat16 pfirst<Packet4bf>(const Packet4bf& from) {\n \n template <>\n EIGEN_STRONG_INLINE Packet4bf pload<Packet4bf>(const bfloat16* from) {\n+ EIGEN_ASSUME_ALIGNED(from, unpacket_traits<Packet4bf>::alignment);\n return Packet4bf(pload<Packet4us>(reinterpret_cast<const uint16_t*>(from)));\n }\n \n@@ -4876,6 +4749,7 @@ EIGEN_STRONG_INLINE Packet4bf ploadu<Packet4bf>(const bfloat16* from) {\n \n template <>\n EIGEN_STRONG_INLINE void pstore<bfloat16>(bfloat16* to, const Packet4bf& from) {\n+ 
EIGEN_ASSUME_ALIGNED(to, unpacket_traits<Packet4bf>::alignment);\n EIGEN_DEBUG_ALIGNED_STORE vst1_u16(reinterpret_cast<uint16_t*>(to), from);\n }\n \n@@ -5201,17 +5075,8 @@ struct packet_traits<double> : default_packet_traits {\n };\n \n template <>\n-struct unpacket_traits<Packet2d> {\n- typedef double type;\n- typedef Packet2d half;\n- typedef Packet2l integer_packet;\n- enum {\n- size = 2,\n- alignment = Aligned16,\n- vectorizable = true,\n- masked_load_available = false,\n- masked_store_available = false\n- };\n+struct unpacket_traits<Packet2d> : neon_unpacket_default<Packet2d, double> {\n+ using integer_packet = Packet2l;\n };\n \n template <>\n@@ -5373,6 +5238,7 @@ EIGEN_STRONG_INLINE Packet2d pcmp_eq(const Packet2d& a, const Packet2d& b) {\n \n template <>\n EIGEN_STRONG_INLINE Packet2d pload<Packet2d>(const double* from) {\n+ EIGEN_ASSUME_ALIGNED(from, unpacket_traits<Packet2d>::alignment);\n EIGEN_DEBUG_ALIGNED_LOAD return vld1q_f64(from);\n }\n \n@@ -5387,6 +5253,7 @@ EIGEN_STRONG_INLINE Packet2d ploaddup<Packet2d>(const double* from) {\n }\n template <>\n EIGEN_STRONG_INLINE void pstore<double>(double* to, const Packet2d& from) {\n+ EIGEN_ASSUME_ALIGNED(to, unpacket_traits<Packet2d>::alignment);\n EIGEN_DEBUG_ALIGNED_STORE vst1q_f64(to, from);\n }\n \n@@ -5579,29 +5446,10 @@ struct packet_traits<Eigen::half> : default_packet_traits {\n };\n \n template <>\n-struct unpacket_traits<Packet4hf> {\n- typedef Eigen::half type;\n- typedef Packet4hf half;\n- enum {\n- size = 4,\n- alignment = Aligned16,\n- vectorizable = true,\n- masked_load_available = false,\n- masked_store_available = false\n- };\n-};\n-\n+struct unpacket_traits<Packet4hf> : neon_unpacket_default<Packet4hf, half> {};\n template <>\n-struct unpacket_traits<Packet8hf> {\n- typedef Eigen::half type;\n- typedef Packet4hf half;\n- enum {\n- size = 8,\n- alignment = Aligned16,\n- vectorizable = true,\n- masked_load_available = false,\n- masked_store_available = false\n- };\n+struct 
unpacket_traits<Packet8hf> : neon_unpacket_default<Packet8hf, half> {\n+ using half = Packet4hf;\n };\n \n template <>\n@@ -5934,11 +5782,13 @@ EIGEN_STRONG_INLINE Packet4hf pandnot<Packet4hf>(const Packet4hf& a, const Packe\n \n template <>\n EIGEN_STRONG_INLINE Packet8hf pload<Packet8hf>(const Eigen::half* from) {\n+ EIGEN_ASSUME_ALIGNED(from, unpacket_traits<Packet8hf>::alignment);\n EIGEN_DEBUG_ALIGNED_LOAD return vld1q_f16(reinterpret_cast<const float16_t*>(from));\n }\n \n template <>\n EIGEN_STRONG_INLINE Packet4hf pload<Packet4hf>(const Eigen::half* from) {\n+ EIGEN_ASSUME_ALIGNED(from, unpacket_traits<Packet4hf>::alignment);\n EIGEN_DEBUG_ALIGNED_LOAD return vld1_f16(reinterpret_cast<const float16_t*>(from));\n }\n \n@@ -6014,11 +5864,13 @@ EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE Packet4hf pinsertlast(const Packet4hf& a,\n \n template <>\n EIGEN_STRONG_INLINE void pstore<Eigen::half>(Eigen::half* to, const Packet8hf& from) {\n+ EIGEN_ASSUME_ALIGNED(to, unpacket_traits<Packet8hf>::alignment);\n EIGEN_DEBUG_ALIGNED_STORE vst1q_f16(reinterpret_cast<float16_t*>(to), from);\n }\n \n template <>\n EIGEN_STRONG_INLINE void pstore<Eigen::half>(Eigen::half* to, const Packet4hf& from) {\n+ EIGEN_ASSUME_ALIGNED(to, unpacket_traits<Packet4hf>::alignment);\n EIGEN_DEBUG_ALIGNED_STORE vst1_f16(reinterpret_cast<float16_t*>(to), from);\n }\n \n","new_path":"Eigen/src/Core/arch/NEON/PacketMath.h","old_path":"Eigen/src/Core/arch/NEON/PacketMath.h","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false},{"diff":"@@ -1339,6 +1339,21 @@ EIGEN_DEVICE_FUNC void destroy_at(T* p) {\n }\n #endif\n \n+/** \\internal\n+ * This informs the implementation that PTR is aligned to at least ALIGN_BYTES\n+ */\n+#ifndef EIGEN_ASSUME_ALIGNED\n+#if defined(__cpp_lib_assume_aligned) && (__cpp_lib_assume_aligned >= 201811L)\n+#define EIGEN_ASSUME_ALIGNED(PTR, ALIGN_BYTES) \\\n+ { PTR = std::assume_aligned<8 * (ALIGN_BYTES)>(PTR); 
}\n+#elif EIGEN_HAS_BUILTIN(__builtin_assume_aligned)\n+#define EIGEN_ASSUME_ALIGNED(PTR, ALIGN_BYTES) \\\n+ { PTR = static_cast<decltype(PTR)>(__builtin_assume_aligned(PTR, (ALIGN_BYTES))); }\n+#else\n+#define EIGEN_ASSUME_ALIGNED(PTR, ALIGN_BYTES) /* do nothing */\n+#endif\n+#endif\n+\n } // end namespace internal\n \n } // end namespace Eigen\n","new_path":"Eigen/src/Core/util/Memory.h","old_path":"Eigen/src/Core/util/Memory.h","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false}],"changes_count":"3","labels":["5.0"],"state":"merged","summary":"## Title:\narm packet alignment requirements and aligned loads/stores\n\n## Author:\nCharles Schlosser (chuckyschluz)\n\n## Summary\n### Key Changes:\n- Modified `Eigen/src/Core/arch/NEON/Complex.h` \n- Modified `Eigen/src/Core/arch/NEON/PacketMath.h` \n- Modified `Eigen/src/Core/util/Memory.h`\n\n### Improvements:\n- Addressed alignment requirements for ARM SIMD vectors \n- Added alignment hint for ARM32 to generate aligned instructions \n- Improved handling of aligned loads/stores for ARM architecture\n\n### Impact:\n- Reduced strict alignment requirements for ARM SIMD vectors \n- Enabled aligned load/store operations for ARM32 \n- Improved compatibility with ARM architecture intrinsics"}
{"iid":1923,"title":"Move HIP/CUDA defines to Core.","web_url":"https://gitlab.com/libeigen/eigen/-/merge_requests/1923","description":"They are used in the basic GPU test in Core - so we shouldn't rely on something in unsupported/*/Tensor.","created_at":"2025-06-26T21:55:41.079Z","merged_at":"2025-06-27T16:48:08.044Z","author":{"name":"Antonio Sánchez","username":"cantonios"},"changes":[{"diff":"@@ -8,7 +8,7 @@\n // Public License v. 2.0. If a copy of the MPL was not distributed\n // with this file, You can obtain one at http://mozilla.org/MPL/2.0/.\n \n-#if defined(EIGEN_USE_GPU) && !defined(EIGEN_CXX11_TENSOR_GPU_HIP_CUDA_DEFINES_H)\n+#if defined(EIGEN_USE_GPU) && !defined(EIGEN_CORE_GPU_HIP_CUDA_DEFINES_H)\n #define EIGEN_CXX11_TENSOR_GPU_HIP_CUDA_DEFINES_H\n \n // Note that we are using EIGEN_USE_HIP here instead of EIGEN_HIPCC...this is by design\n@@ -98,4 +98,4 @@\n \n #endif // gpu_assert\n \n-#endif // EIGEN_CXX11_TENSOR_GPU_HIP_CUDA_DEFINES_H\n+#endif // EIGEN_CORE_GPU_HIP_CUDA_DEFINES_H\n","new_path":"Eigen/src/Core/util/GpuHipCudaDefines.inc","old_path":"unsupported/Eigen/CXX11/src/Tensor/TensorGpuHipCudaDefines.h","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":true,"deleted_file":false,"generated_file":false},{"diff":"@@ -8,7 +8,7 @@\n // Public License v. 2.0. 
If a copy of the MPL was not distributed\n // with this file, You can obtain one at http://mozilla.org/MPL/2.0/.\n \n-#if defined(EIGEN_CXX11_TENSOR_GPU_HIP_CUDA_DEFINES_H)\n+#if defined(EIGEN_CORE_GPU_HIP_CUDA_DEFINES_H)\n \n #ifndef EIGEN_PERMANENTLY_ENABLE_GPU_HIP_CUDA_DEFINES\n \n@@ -40,6 +40,6 @@\n \n #endif // EIGEN_PERMANENTLY_ENABLE_GPU_HIP_CUDA_DEFINES\n \n-#undef EIGEN_CXX11_TENSOR_GPU_HIP_CUDA_DEFINES_H\n+#undef EIGEN_CORE_GPU_HIP_CUDA_DEFINES_H\n \n-#endif // EIGEN_CXX11_TENSOR_GPU_HIP_CUDA_DEFINES_H\n+#endif // EIGEN_CORE_GPU_HIP_CUDA_DEFINES_H\n","new_path":"Eigen/src/Core/util/GpuHipCudaUndefines.inc","old_path":"unsupported/Eigen/CXX11/src/Tensor/TensorGpuHipCudaUndefines.h","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":true,"deleted_file":false,"generated_file":false},{"diff":"@@ -4,7 +4,7 @@\n #include <Eigen/Core>\n \n // Allow gpu** macros for generic tests.\n-#include <unsupported/Eigen/CXX11/src/Tensor/TensorGpuHipCudaDefines.h>\n+#include <Eigen/src/Core/util/GpuHipCudaDefines.inc>\n \n // std::tuple cannot be used on device, and there is a bug in cuda < 9.2 that\n // doesn't allow std::tuple to compile for host code either. 
In these cases,\n","new_path":"test/gpu_test_helper.h","old_path":"test/gpu_test_helper.h","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false},{"diff":"@@ -10,14 +10,11 @@\n #if defined(EIGEN_USE_GPU) && !defined(EIGEN_CXX11_TENSOR_TENSOR_DEVICE_GPU_H)\n #define EIGEN_CXX11_TENSOR_TENSOR_DEVICE_GPU_H\n \n-// This header file container defines fo gpu* macros which will resolve to\n-// their equivalent hip* or cuda* versions depending on the compiler in use\n-// A separate header (included at the end of this file) will undefine all\n-#include \"TensorGpuHipCudaDefines.h\"\n-\n // IWYU pragma: private\n #include \"./InternalHeaderCheck.h\"\n \n+#include \"../../../../../Eigen/src/Core/util/GpuHipCudaDefines.inc\"\n+\n namespace Eigen {\n \n static const int kGpuScratchSize = 1024;\n@@ -390,6 +387,6 @@ static EIGEN_DEVICE_FUNC inline void setGpuSharedMemConfig(gpuSharedMemConfig co\n } // end namespace Eigen\n \n // undefine all the gpu* macros we defined at the beginning of the file\n-#include \"TensorGpuHipCudaUndefines.h\"\n+#include \"../../../../../Eigen/src/Core/util/GpuHipCudaUndefines.inc\"\n \n #endif // EIGEN_CXX11_TENSOR_TENSOR_DEVICE_GPU_H\n","new_path":"unsupported/Eigen/CXX11/src/Tensor/TensorDeviceGpu.h","old_path":"unsupported/Eigen/CXX11/src/Tensor/TensorDeviceGpu.h","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false},{"diff":"@@ -14,8 +14,6 @@\n #include \"main.h\"\n #include <unsupported/Eigen/CXX11/Tensor>\n \n-#include <unsupported/Eigen/CXX11/src/Tensor/TensorGpuHipCudaDefines.h>\n-\n using Eigen::Tensor;\n \n template <int Layout>\n","new_path":"unsupported/test/cxx11_tensor_argmax_gpu.cu","old_path":"unsupported/test/cxx11_tensor_argmax_gpu.cu","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false},{"diff":"@@ -17,8 +17,6 @@\n #include \"main.h\"\n 
#include <unsupported/Eigen/CXX11/Tensor>\n \n-#include <unsupported/Eigen/CXX11/src/Tensor/TensorGpuHipCudaDefines.h>\n-\n using Eigen::Tensor;\n typedef Tensor<float, 1>::DimensionPair DimPair;\n \n","new_path":"unsupported/test/cxx11_tensor_contract_gpu.cu","old_path":"unsupported/test/cxx11_tensor_contract_gpu.cu","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false},{"diff":"@@ -17,8 +17,6 @@\n #include \"OffByOneScalar.h\"\n #include <unsupported/Eigen/CXX11/Tensor>\n \n-#include <unsupported/Eigen/CXX11/src/Tensor/TensorGpuHipCudaDefines.h>\n-\n using Eigen::RowMajor;\n using Eigen::Tensor;\n \n","new_path":"unsupported/test/cxx11_tensor_device.cu","old_path":"unsupported/test/cxx11_tensor_device.cu","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false},{"diff":"@@ -15,8 +15,6 @@\n #include \"main.h\"\n #include <unsupported/Eigen/CXX11/Tensor>\n \n-#include <unsupported/Eigen/CXX11/src/Tensor/TensorGpuHipCudaDefines.h>\n-\n using Eigen::Tensor;\n \n void test_gpu_nullary() {\n","new_path":"unsupported/test/cxx11_tensor_gpu.cu","old_path":"unsupported/test/cxx11_tensor_gpu.cu","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false},{"diff":"@@ -16,8 +16,6 @@\n #include \"main.h\"\n #include <Eigen/CXX11/Tensor>\n \n-#include <Eigen/CXX11/src/Tensor/TensorGpuHipCudaDefines.h>\n-\n void test_gpu_random_uniform() {\n Tensor<float, 2> out(72, 97);\n out.setZero();\n","new_path":"unsupported/test/cxx11_tensor_random_gpu.cu","old_path":"unsupported/test/cxx11_tensor_random_gpu.cu","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false},{"diff":"@@ -16,8 +16,6 @@\n #include \"main.h\"\n #include <unsupported/Eigen/CXX11/Tensor>\n \n-#include <Eigen/CXX11/src/Tensor/TensorGpuHipCudaDefines.h>\n-\n using 
Eigen::Tensor;\n typedef Tensor<float, 1>::DimensionPair DimPair;\n \n","new_path":"unsupported/test/cxx11_tensor_scan_gpu.cu","old_path":"unsupported/test/cxx11_tensor_scan_gpu.cu","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false}],"changes_count":"10","labels":["5.0"],"state":"merged","summary":"## Title:\nMove HIP/CUDA defines to Core.\n\n## Author:\nAntonio Sánchez (cantonios)\n\n## Summary\n### Key Changes:\n- Moved GPU-related defines from `unsupported/Eigen/CXX11/src/Tensor` to `Eigen/src/Core/util/`.\n- Renamed files: `TensorGpuHipCudaDefines.h` to `GpuHipCudaDefines.inc`, and `TensorGpuHipCudaUndefines.h` to `GpuHipCudaUndefines.inc`.\n- Modified test files to reflect the new location of defines.\n\n### Improvements:\n- Organized code structure by moving defines to the core module, reducing dependency on unsupported GPU code.\n- Ensured compatibility with core Eigen codebase.\n\n### Impact:\n- Reduced reliance on unsupported GPU code in core tests.\n- Improved maintainability and consistency across the Eigen library."}
{"iid":1921,"title":"Fix VSX packetmath psin and pcast tests.","web_url":"https://gitlab.com/libeigen/eigen/-/merge_requests/1921","description":"For double, `vec_cts` returns a `signed int` packet, not a `signed long long`,\nleading to an undiagnosed type mismatch and garbage data. We need to always resort to element-by-element casts.\n\nThe memcpy was also causing issues with clang - explicitly loading/storing the vector seems to work.\n\nThis was failing on QEMU.","created_at":"2025-06-26T19:02:27.056Z","merged_at":"2025-06-27T04:08:21.966Z","author":{"name":"Antonio Sánchez","username":"cantonios"},"changes":[{"diff":"@@ -129,30 +129,20 @@ EIGEN_STRONG_INLINE Packet4f preinterpret<Packet4f, Packet4i>(const Packet4i& a)\n }\n \n #ifdef EIGEN_VECTORIZE_VSX\n-// VSX support varies between different compilers and even different\n-// versions of the same compiler. For gcc version >= 4.9.3, we can use\n-// vec_cts to efficiently convert Packet2d to Packet2l. Otherwise, use\n-// a slow version that works with older compilers.\n-// Update: apparently vec_cts/vec_ctf intrinsics for 64-bit doubles\n-// are buggy, https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70963\n template <>\n inline Packet2l pcast<Packet2d, Packet2l>(const Packet2d& x) {\n-#if EIGEN_GNUC_STRICT_AT_LEAST(7, 1, 0)\n- return vec_cts(x, 0); // TODO: check clang version.\n-#else\n- double tmp[2];\n- memcpy(tmp, &x, sizeof(tmp));\n- Packet2l l = {static_cast<long long>(tmp[0]), static_cast<long long>(tmp[1])};\n- return l;\n-#endif\n+ EIGEN_ALIGN_MAX double dtmp[2];\n+ pstore(dtmp, x);\n+ EIGEN_ALIGN_MAX long long itmp[2] = {static_cast<long long>(dtmp[0]), static_cast<long long>(dtmp[1])};\n+ return vec_xl(0, itmp);\n }\n \n template <>\n inline Packet2d pcast<Packet2l, Packet2d>(const Packet2l& x) {\n- unsigned long long tmp[2];\n- memcpy(tmp, &x, sizeof(tmp));\n- Packet2d d = {static_cast<double>(tmp[0]), static_cast<double>(tmp[1])};\n- return d;\n+ EIGEN_ALIGN_MAX long long itmp[2];\n+ vec_xst(x, 0, 
itmp);\n+ EIGEN_ALIGN_MAX double dtmp[2] = {static_cast<double>(itmp[0]), static_cast<double>(itmp[1])};\n+ return pload<Packet2d>(dtmp);\n }\n #endif\n \n","new_path":"Eigen/src/Core/arch/AltiVec/TypeCasting.h","old_path":"Eigen/src/Core/arch/AltiVec/TypeCasting.h","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false}],"changes_count":"1","labels":["5.0"],"state":"merged","summary":"## Title:\nFix VSX packetmath psin and pcast tests.\n\n## Author:\nAntonio Sánchez (cantonios)\n\n## Summary\n### Key Changes:\n- Replaced the buggy `vec_cts` conversion for double, which produced a `signed int` packet instead of `signed long long`, with element-by-element casts.\n- Replaced `memcpy` with explicit vector load/store (`vec_xl`/`vec_xst`), which also resolves issues seen with clang.\n\n### Improvements:\n- Resolved type mismatches in VSX packetmath operations.\n- Improved stability and correctness of vector operations under QEMU.\n\n### Impact:\n- Fixed compilation and execution issues in VSX-based environments.\n- Ensured compatibility with clang and QEMU."}
{"iid":1920,"title":"Fix a collection of random failures encountered when testing with Bazel.","web_url":"https://gitlab.com/libeigen/eigen/-/merge_requests/1920","description":"- Minor fixes to tests with EIGEN_EXCEPTIONS disabled.\n- Some bad test numbering resulting in empty tests.\n- Enabled GPU support for floor/cmp.\n- An ODR issue in SimplicialCholesky.","created_at":"2025-06-25T22:26:39.150Z","merged_at":"2025-06-26T16:58:25.544Z","author":{"name":"Antonio Sánchez","username":"cantonios"},"changes":[{"diff":"@@ -31,6 +31,15 @@ namespace internal {\n #define EIGEN_GPU_HAS_FP16_ARITHMETIC 1\n #endif\n \n+// We need to distinguish ‘clang as the CUDA compiler’ from ‘clang as the host compiler,\n+// invoked by NVCC’ (e.g. on MacOS). The former needs to see both host and device implementation\n+// of the functions, while the latter can only deal with one of them.\n+#if defined(EIGEN_CUDA_ARCH) || defined(EIGEN_HIPCC) || (defined(EIGEN_CUDACC) && EIGEN_COMP_CLANG && !EIGEN_COMP_NVCC)\n+#define EIGEN_HAS_GPU_DEVICE_FUNCTIONS 1\n+#else\n+#define EIGEN_HAS_GPU_DEVICE_FUNCTIONS 0\n+#endif\n+\n // Make sure this is only available when targeting a GPU: we don't want to\n // introduce conflicts between these packet_traits definitions and the ones\n // we'll use on the host side (SSE, AVX, ...)\n@@ -74,7 +83,10 @@ struct packet_traits<float> : default_packet_traits {\n HasGammaSampleDerAlpha = 1,\n HasIGammac = 1,\n HasBetaInc = 1,\n- HasBlend = 0\n+\n+ HasBlend = 0,\n+ HasFloor = 1,\n+ HasCmp = EIGEN_HAS_GPU_DEVICE_FUNCTIONS\n };\n };\n \n@@ -143,10 +155,7 @@ EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE double2 pset1<double2>(const double& from)\n return make_double2(from, from);\n }\n \n-// We need to distinguish ‘clang as the CUDA compiler’ from ‘clang as the host compiler,\n-// invoked by NVCC’ (e.g. on MacOS). 
The former needs to see both host and device implementation\n-// of the functions, while the latter can only deal with one of them.\n-#if defined(EIGEN_CUDA_ARCH) || defined(EIGEN_HIPCC) || (defined(EIGEN_CUDACC) && EIGEN_COMP_CLANG && !EIGEN_COMP_NVCC)\n+#if EIGEN_HAS_GPU_DEVICE_FUNCTIONS\n \n EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE float bitwise_and(const float& a, const float& b) {\n return __int_as_float(__float_as_int(a) & __float_as_int(b));\n@@ -259,8 +268,7 @@ template <>\n EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE double2 pcmp_le<double2>(const double2& a, const double2& b) {\n return make_double2(le_mask(a.x, b.x), le_mask(a.y, b.y));\n }\n-#endif // defined(EIGEN_CUDA_ARCH) || defined(EIGEN_HIPCC) || (defined(EIGEN_CUDACC) && EIGEN_COMP_CLANG &&\n- // !EIGEN_COMP_NVCC)\n+#endif // EIGEN_HAS_GPU_DEVICE_FUNCTIONS\n \n template <>\n EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE float4 plset<float4>(const float& a) {\n","new_path":"Eigen/src/Core/arch/GPU/PacketMath.h","old_path":"Eigen/src/Core/arch/GPU/PacketMath.h","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false},{"diff":"@@ -274,6 +274,10 @@ struct simpl_chol_helper {\n }\n };\n \n+// Symbol is ODR-used, so we need a definition.\n+template <typename Scalar, typename StorageIndex>\n+constexpr StorageIndex simpl_chol_helper<Scalar, StorageIndex>::kEmpty;\n+\n } // namespace internal\n \n template <typename Derived>\n","new_path":"Eigen/src/SparseCholesky/SimplicialCholesky_impl.h","old_path":"Eigen/src/SparseCholesky/SimplicialCholesky_impl.h","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false},{"diff":"@@ -16,7 +16,7 @@\n #pragma GCC diagnostic ignored \"-Wshadow\"\n #endif\n \n-#ifndef EIGEN_TEST_ANNOYING_SCALAR_DONT_THROW\n+#if defined(EIGEN_EXCEPTIONS) && !defined(EIGEN_TEST_ANNOYING_SCALAR_DONT_THROW)\n struct my_exception {\n my_exception() {}\n ~my_exception() {}\n@@ -76,7 +76,7 
@@ class AnnoyingScalar {\n }\n \n AnnoyingScalar operator+(const AnnoyingScalar& other) const {\n-#ifndef EIGEN_TEST_ANNOYING_SCALAR_DONT_THROW\n+#if defined(EIGEN_EXCEPTIONS) && !defined(EIGEN_TEST_ANNOYING_SCALAR_DONT_THROW)\n countdown--;\n if (countdown <= 0 && !dont_throw) throw my_exception();\n #endif\n","new_path":"test/AnnoyingScalar.h","old_path":"test/AnnoyingScalar.h","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false},{"diff":"@@ -1340,7 +1340,7 @@ EIGEN_DECLARE_TEST(array_cwise) {\n CALL_SUBTEST_3(array_generic(Array44d()));\n CALL_SUBTEST_4(array_generic(\n ArrayXXcf(internal::random<int>(1, EIGEN_TEST_MAX_SIZE), internal::random<int>(1, EIGEN_TEST_MAX_SIZE))));\n- CALL_SUBTEST_7(array_generic(\n+ CALL_SUBTEST_5(array_generic(\n ArrayXXf(internal::random<int>(1, EIGEN_TEST_MAX_SIZE), internal::random<int>(1, EIGEN_TEST_MAX_SIZE))));\n CALL_SUBTEST_8(array_generic(\n ArrayXXi(internal::random<int>(1, EIGEN_TEST_MAX_SIZE), internal::random<int>(1, EIGEN_TEST_MAX_SIZE))));\n","new_path":"test/array_cwise.cpp","old_path":"test/array_cwise.cpp","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false},{"diff":"@@ -8,7 +8,7 @@\n // with this file, You can obtain one at http://mozilla.org/MPL/2.0/.\n \n // Various sanity tests with exceptions and non trivially copyable scalar type.\n-// - no memory leak when a custom scalar type trow an exceptions\n+// - no memory leak when a custom scalar type throw an exceptions\n // - todo: complete the list of tests!\n \n #define EIGEN_STACK_ALLOCATION_LIMIT 100000000\n@@ -21,9 +21,8 @@\n AnnoyingScalar::countdown = 100; \\\n int before = AnnoyingScalar::instances; \\\n bool exception_thrown = false; \\\n- try { \\\n- OP; \\\n- } catch (my_exception) { \\\n+ EIGEN_TRY { OP; } \\\n+ EIGEN_CATCH(my_exception) { \\\n exception_thrown = true; \\\n VERIFY(AnnoyingScalar::instances == before && 
\"memory leak detected in \" && EIGEN_MAKESTRING(OP)); \\\n } \\\n@@ -35,7 +34,11 @@ EIGEN_DECLARE_TEST(exceptions) {\n typedef Eigen::Matrix<AnnoyingScalar, Dynamic, Dynamic> MatrixType;\n \n {\n+#if defined(EIGEN_EXCEPTIONS) && !defined(EIGEN_TEST_ANNOYING_SCALAR_DONT_THROW)\n AnnoyingScalar::dont_throw = false;\n+#else\n+ AnnoyingScalar::dont_throw = true;\n+#endif\n int n = 50;\n VectorType v0(n), v1(n);\n MatrixType m0(n, n), m1(n, n), m2(n, n);\n","new_path":"test/exceptions.cpp","old_path":"test/exceptions.cpp","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false},{"diff":"@@ -343,7 +343,7 @@ static std::vector<std::string> eigen_assert_list;\n #if !defined(EIGEN_TESTING_CONSTEXPR) && !defined(EIGEN_TESTING_PLAINOBJECT_CTOR)\n #define EIGEN_INTERNAL_DEBUGGING\n #endif\n-#include <Eigen/QR> // required for createRandomPIMatrixOfRank and generateRandomMatrixSvs\n+#include <Eigen/Core>\n \n inline void verify_impl(bool condition, const char* testname, const char* file, int line,\n const char* condition_as_string) {\n@@ -935,3 +935,7 @@ int main(int argc, char* argv[]) {\n #endif\n \n #include \"gpu_test_helper.h\"\n+\n+#ifndef EIGEN_TEST_MAX_SIZE\n+#define EIGEN_TEST_MAX_SIZE 320\n+#endif\n","new_path":"test/main.h","old_path":"test/main.h","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false},{"diff":"@@ -1,6 +1,8 @@\n #include \"main.h\"\n \n+#ifdef EIGEN_EXCEPTIONS\n #include <exception> // std::exception\n+#endif\n \n #include <Eigen/src/Core/util/MaxSizeVector.h>\n \n@@ -31,28 +33,27 @@ struct Foo {\n std::cout << '~';\n --Foo::object_count;\n }\n-\n+#ifdef EIGEN_EXCEPTIONS\n class Fail : public std::exception {};\n+#endif\n };\n \n Index Foo::object_count = 0;\n Index Foo::object_limit = 0;\n \n-EIGEN_DECLARE_TEST(cxx11_maxsizevector) {\n+EIGEN_DECLARE_TEST(maxsizevector) {\n typedef MaxSizeVector<Foo> VectorX;\n 
Foo::object_count = 0;\n for (int r = 0; r < g_repeat; r++) {\n Index rows = internal::random<Index>(3, 30);\n Foo::object_limit = internal::random<Index>(0, rows - 2);\n std::cout << \"object_limit = \" << Foo::object_limit << std::endl;\n- bool exception_raised = false;\n #ifdef EIGEN_EXCEPTIONS\n+ bool exception_raised = false;\n try {\n-#endif\n std::cout << \"\\nVectorX m(\" << rows << \");\\n\";\n VectorX vect(rows);\n for (int i = 0; i < rows; ++i) vect.push_back(Foo());\n-#ifdef EIGEN_EXCEPTIONS\n VERIFY(false); // not reached if exceptions are enabled\n } catch (const Foo::Fail&) {\n exception_raised = true;\n","new_path":"test/maxsizevector.cpp","old_path":"test/maxsizevector.cpp","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false},{"diff":"@@ -182,8 +182,8 @@ EIGEN_DECLARE_TEST(redux) {\n CALL_SUBTEST_5(matrixRedux(ArrayXX<int64_t>(rows, cols)));\n CALL_SUBTEST_6(matrixRedux(MatrixXcf(rows, cols)));\n CALL_SUBTEST_6(matrixRedux(ArrayXXcf(rows, cols)));\n- CALL_SUBTEST_6(matrixRedux(MatrixXcd(rows, cols)));\n- CALL_SUBTEST_6(matrixRedux(ArrayXXcd(rows, cols)));\n+ CALL_SUBTEST_7(matrixRedux(MatrixXcd(rows, cols)));\n+ CALL_SUBTEST_7(matrixRedux(ArrayXXcd(rows, cols)));\n }\n for (int i = 0; i < g_repeat; i++) {\n int size = internal::random<int>(1, maxsize);\n","new_path":"test/redux.cpp","old_path":"test/redux.cpp","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false},{"diff":"@@ -9,6 +9,7 @@\n \n #include \"main.h\"\n \n+#ifdef EIGEN_EXCEPTIONS\n #define VERIFY_THROWS_BADALLOC(a) \\\n { \\\n bool threw = false; \\\n@@ -19,6 +20,10 @@\n } \\\n VERIFY(threw && \"should have thrown bad_alloc: \" #a); \\\n }\n+#else\n+// No way to catch a bad alloc - program terminates.\n+#define VERIFY_THROWS_BADALLOC(a)\n+#endif\n \n template <typename MatrixType>\n void triggerMatrixBadAlloc(Index rows, Index cols) 
{\n","new_path":"test/sizeoverflow.cpp","old_path":"test/sizeoverflow.cpp","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false},{"diff":"@@ -381,6 +381,7 @@ void svd_verify_assert_full_only(const MatrixType& input = MatrixType()) {\n \n typedef Matrix<typename MatrixType::Scalar, RowsAtCompileTime, 1> RhsType;\n RhsType rhs = RhsType::Zero(input.rows());\n+ EIGEN_UNUSED_VARIABLE(rhs); // Only used if asserts are enabled.\n MatrixType m(input.rows(), input.cols());\n svd_fill_random(m);\n \n@@ -410,6 +411,7 @@ void svd_verify_assert(const MatrixType& input = MatrixType()) {\n enum { RowsAtCompileTime = MatrixType::RowsAtCompileTime };\n typedef Matrix<typename MatrixType::Scalar, RowsAtCompileTime, 1> RhsType;\n RhsType rhs = RhsType::Zero(input.rows());\n+ EIGEN_UNUSED_VARIABLE(rhs); // Only used if asserts are enabled.\n MatrixType m(input.rows(), input.cols());\n svd_fill_random(m);\n \n","new_path":"test/svd_common.h","old_path":"test/svd_common.h","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false},{"diff":"@@ -24,6 +24,8 @@ void zeroReduction(const MatrixType& m) {\n VERIFY_RAISES_ASSERT(m.minCoeff());\n VERIFY_RAISES_ASSERT(m.maxCoeff());\n Index i, j;\n+ EIGEN_UNUSED_VARIABLE(i); // Only used if exceptions are enabled.\n+ EIGEN_UNUSED_VARIABLE(j);\n VERIFY_RAISES_ASSERT(m.minCoeff(&i, &j));\n VERIFY_RAISES_ASSERT(m.maxCoeff(&i, &j));\n VERIFY_RAISES_ASSERT(m.reshaped().minCoeff(&i));\n","new_path":"test/zerosized.cpp","old_path":"test/zerosized.cpp","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false},{"diff":"@@ -45,7 +45,7 @@\n #include <thread>\n \n #if defined(EIGEN_USE_THREADS) || defined(EIGEN_USE_SYCL)\n-#include \"ThreadPool\"\n+#include \"../../../Eigen/ThreadPool\"\n #endif\n \n #ifdef 
EIGEN_USE_GPU\n","new_path":"unsupported/Eigen/CXX11/Tensor","old_path":"unsupported/Eigen/CXX11/Tensor","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false}],"changes_count":"12","labels":["5.0"],"state":"merged","summary":"## Title:\nFix a collection of random failures encountered when testing with Bazel.\n\n## Author:\nAntonio Sánchez (cantonios)\n\n## Summary\n### Key Changes:\n- Fixed tests with `EIGEN_EXCEPTIONS` disabled.\n- Enabled GPU support for floor/cmp operations.\n- Addressed ODR issues in SimplicialCholesky.\n\n### Improvements:\n- Enabled GPU support for specific operations.\n- Fixed test numbering issues leading to empty tests.\n\n### Impact:\n- Resolved test failures related to Bazel compilation.\n- Improved compatibility with GPU hardware."}
{"iid":1917,"title":"Use QEMU for arm and ppc tests.","web_url":"https://gitlab.com/libeigen/eigen/-/merge_requests/1917","description":"The public gitlab aarch64 machines don't seem to like running 32-bit arm tests. Transition those to QEMU.\n \nWe've also lost ownership of the PPC gitlab runner, and the runner itself is very out of date. Transition PPC tests to QEMU.","created_at":"2025-06-23T18:51:54.203Z","merged_at":"2025-06-25T15:22:47.652Z","author":{"name":"Antonio Sánchez","username":"cantonios"},"changes":[{"diff":"@@ -244,11 +244,13 @@ build:linux:rocm-latest:gcc-10:\n EIGEN_CI_CROSS_TARGET_TRIPLE: arm-linux-gnueabihf\n EIGEN_CI_ADDITIONAL_ARGS: >\n -DEIGEN_TEST_CUSTOM_CXX_FLAGS=-march=armv7-a;-mfpu=neon-vfpv4\n+ -DCMAKE_SYSTEM_NAME=Linux\n+ -DCMAKE_CROSSCOMPILING_EMULATOR=qemu-arm-static;-L;/usr/arm-linux-gnueabihf\n \n build:linux:cross:arm:gcc-10:default:\n extends: .build:linux:cross:arm\n variables:\n- EIGEN_CI_CROSS_INSTALL: g++-10-arm-linux-gnueabihf\n+ EIGEN_CI_CROSS_INSTALL: g++-10-arm-linux-gnueabihf qemu-user-static\n EIGEN_CI_CROSS_C_COMPILER: arm-linux-gnueabihf-gcc-10\n EIGEN_CI_CROSS_CXX_COMPILER: arm-linux-gnueabihf-g++-10\n \n@@ -258,7 +260,7 @@ build:linux:cross:arm:clang-12:default:\n variables:\n EIGEN_CI_INSTALL: clang-12\n EIGEN_CI_C_COMPILER: clang-12\n EIGEN_CI_CXX_COMPILER: clang++-12\n- EIGEN_CI_CROSS_INSTALL: g++-10-arm-linux-gnueabihf clang-12\n+ EIGEN_CI_CROSS_INSTALL: g++-10-arm-linux-gnueabihf clang-12 qemu-user-static\n \n ######## aarch64 ###############################################################\n \n@@ -295,25 +297,23 @@ build:linux:cross:aarch64:clang-12:default:\n variables:\n EIGEN_CI_TARGET_ARCH: ppc64le\n EIGEN_CI_CROSS_TARGET_TRIPLE: powerpc64le-linux-gnu\n+ EIGEN_CI_ADDITIONAL_ARGS: >-\n+ -DCMAKE_SYSTEM_NAME=Linux\n+ -DCMAKE_CROSSCOMPILING_EMULATOR=qemu-ppc64le-static;-L;/usr/powerpc64le-linux-gnu\n \n build:linux:cross:ppc64le:gcc-10:default:\n extends: .build:linux:cross:ppc64le\n variables:\n- EIGEN_CI_C_COMPILER: 
gcc-10\n- EIGEN_CI_CXX_COMPILER: g++-10\n- EIGEN_CI_CROSS_INSTALL: g++-10-powerpc64le-linux-gnu\n+ EIGEN_CI_CROSS_INSTALL: g++-10-powerpc64le-linux-gnu qemu-user-static\n EIGEN_CI_CROSS_C_COMPILER: powerpc64le-linux-gnu-gcc-10\n EIGEN_CI_CROSS_CXX_COMPILER: powerpc64le-linux-gnu-g++-10\n- # Temporarily disable MMA until #2457 is resolved.\n- EIGEN_CI_ADDITIONAL_ARGS: \"-DEIGEN_ALTIVEC_DISABLE_MMA=1\"\n \n build:linux:cross:ppc64le:clang-12:default:\n extends: .build:linux:cross:ppc64le\n variables:\n- EIGEN_CI_INSTALL: clang-12\n EIGEN_CI_C_COMPILER: clang-12\n EIGEN_CI_CXX_COMPILER: clang++-12\n- EIGEN_CI_CROSS_INSTALL: g++-10-powerpc64le-linux-gnu clang-12\n+ EIGEN_CI_CROSS_INSTALL: g++-10-powerpc64le-linux-gnu clang-12 qemu-user-static\n \n ######## loongarch64 #################################################\n \n@@ -328,7 +328,7 @@ build:linux:cross:loongarch64:gcc-14:default:\n extends: .build:linux:cross:loongarch64\n image: ubuntu:24.04\n variables:\n- EIGEN_CI_CROSS_INSTALL: g++-14-loongarch64-linux-gnu\n+ EIGEN_CI_CROSS_INSTALL: g++-14-loongarch64-linux-gnu qemu-user-static\n EIGEN_CI_CROSS_C_COMPILER: loongarch64-linux-gnu-gcc-14\n EIGEN_CI_CROSS_CXX_COMPILER: loongarch64-linux-gnu-g++-14\n EIGEN_CI_ADDITIONAL_ARGS: >-\n","new_path":"ci/build.linux.gitlab-ci.yml","old_path":"ci/build.linux.gitlab-ci.yml","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false},{"diff":"@@ -9,6 +9,10 @@\n - if: $CI_PIPELINE_SOURCE == \"schedule\" && $CI_PROJECT_NAMESPACE == \"libeigen\"\n - if: $CI_PIPELINE_SOURCE == \"web\" && $CI_PROJECT_NAMESPACE == \"libeigen\"\n - if: $CI_PIPELINE_SOURCE == \"merge_request_event\" && $CI_PROJECT_NAMESPACE == \"libeigen\" && $CI_MERGE_REQUEST_LABELS =~ \"/all-tests/\"\n+ tags:\n+ - eigen-runner\n+ - linux\n+ - x86-64\n \n ##### x86-64 ###################################################################\n .test:linux:x86-64:\n@@ -16,10 +20,6 @@\n variables:\n 
EIGEN_CI_TARGET_ARCH: x86_64\n EIGEN_CI_CROSS_TARGET_TRIPLE: x86_64-linux-gnu\n- tags:\n- - eigen-runner\n- - linux\n- - x86-64\n \n # GCC-6 (minimum on Ubuntu 18.04)\n .test:linux:x86-64:gcc-6:default:\n@@ -289,16 +289,13 @@ test:linux:cuda-12.2:clang-12:\n variables:\n EIGEN_CI_TARGET_ARCH: arm\n EIGEN_CI_CROSS_TARGET_TRIPLE: arm-linux-gnueabihf\n- # Enable cross-compiled arm binary to run on aarch64.\n- EIGEN_CI_BEFORE_SCRIPT: \"ln -s /usr/arm-linux-gnueabihf/lib/ld-linux-armhf.so.3 /lib/ && export LD_LIBRARY_PATH=/usr/arm-linux-gnueabihf/lib/\"\n- tags:\n- - saas-linux-large-arm64\n+ EIGEN_CI_CTEST_ARGS: --timeout 2000\n \n .test:linux:arm:gcc-10:default:\n extends: .test:linux:arm\n needs: [ build:linux:cross:arm:gcc-10:default ]\n variables:\n- EIGEN_CI_CROSS_INSTALL: g++-10-arm-linux-gnueabihf\n+ EIGEN_CI_CROSS_INSTALL: g++-10-arm-linux-gnueabihf qemu-user-static\n \n test:linux:arm:gcc-10:default:official:\n extends: .test:linux:arm:gcc-10:default\n@@ -314,7 +311,7 @@ test:linux:arm:gcc-10:default:unsupported:\n extends: .test:linux:arm\n needs: [ build:linux:cross:arm:clang-12:default ]\n variables:\n- EIGEN_CI_CROSS_INSTALL: g++-10-arm-linux-gnueabihf clang-12\n+ EIGEN_CI_CROSS_INSTALL: g++-10-arm-linux-gnueabihf clang-12 qemu-user-static\n \n test:linux:arm:clang-12:default:official:\n extends: .test:linux:arm:clang-12:default\n@@ -375,16 +372,13 @@ test:linux:aarch64:clang-12:default:unsupported:\n variables:\n EIGEN_CI_TARGET_ARCH: ppc64le\n EIGEN_CI_CROSS_TARGET_TRIPLE: powerpc64le-linux-gnu\n- tags:\n- - eigen-runner\n- - linux\n- - ppc64le\n \n .test:linux:ppc64le:gcc-10:default:\n extends: .test:linux:ppc64le\n needs: [ build:linux:cross:ppc64le:gcc-10:default ]\n variables:\n- EIGEN_CI_INSTALL: g++-10\n+ EIGEN_CI_CROSS_INSTALL: g++-10-powerpc64le-linux-gnu qemu-user-static\n+ EIGEN_CI_CTEST_ARGS: --timeout 2000\n \n test:linux:ppc64le:gcc-10:default:official:\n extends: .test:linux:ppc64le:gcc-10:default\n@@ -400,7 +394,7 @@ 
test:linux:ppc64le:gcc-10:default:unsupported:\n extends: .test:linux:ppc64le\n needs: [ build:linux:cross:ppc64le:clang-12:default ]\n variables:\n- EIGEN_CI_INSTALL: clang-12\n+ EIGEN_CI_CROSS_INSTALL: g++-10-powerpc64le-linux-gnu clang-12 qemu-user-static\n \n test:linux:ppc64le:clang-12:default:official:\n extends: .test:linux:ppc64le:clang-12:default\n@@ -412,20 +406,16 @@ test:linux:ppc64le:clang-12:default:unsupported:\n variables:\n EIGEN_CI_CTEST_LABEL: Unsupported\n \n-##### loongarch64 ###################################################################\n+##### loongarch64 ##############################################################\n+\n .test:linux:loongarch64:\n extends: .test:linux\n image: ubuntu:24.04\n variables:\n EIGEN_CI_TARGET_ARCH: loongarch64\n EIGEN_CI_CROSS_TARGET_TRIPLE: loongarch64-linux-gnu\n- # Install QEMU and set up the execution environment in the image\n EIGEN_CI_CROSS_INSTALL: g++-14-loongarch64-linux-gnu qemu-user-static\n EIGEN_CI_CTEST_ARGS: --timeout 2000\n- tags:\n- - eigen-runner\n- - linux\n- - cross-compiler\n \n # GCC-14 (Ubuntu 24)\n .test:linux:loongarch64:gcc-14:default:\n","new_path":"ci/test.linux.gitlab-ci.yml","old_path":"ci/test.linux.gitlab-ci.yml","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false}],"changes_count":"2","labels":["5.0"],"state":"merged","summary":"## Title:\nUse QEMU for arm and ppc tests.\n\n## Author:\nAntonio Sánchez (cantonios)\n\n## Summary\n### Key Changes:\n- Modified `ci/build.linux.gitlab-ci.yml` to use QEMU for ARM tests.\n- Modified `ci/test.linux.gitlab-ci.yml` to use QEMU for PPC tests.\n\n### Improvements:\n- Transitioned from using native hardware to QEMU for ARM and PPC testing.\n- Improved test environment reliability for ARM and PPC architectures.\n\n### Impact:\n- Reduced dependency on native hardware for ARM and PPC tests.\n- Improved test stability and portability across different architectures."}
{"iid":1916,"title":"tensor documentation","web_url":"https://gitlab.com/libeigen/eigen/-/merge_requests/1916","description":"### Reference issue\n\n\n### What does this implement/fix?\nAdded some documentation to the tensor readme.\nIt turned out to be a lot of changes so maybe it's easiest to read the markdown rendered one directly:\n\nhttps://gitlab.com/henric.ryden/eigen/-/blob/tensor_documentation/unsupported/Eigen/CXX11/src/Tensor/README.md\n\n\n### Additional information","created_at":"2025-06-23T13:56:21.025Z","merged_at":"2025-06-29T03:47:43.542Z","author":{"name":"Henric Ryden","username":"henric.ryden"},"changes":[{"diff":"@@ -24,12 +24,14 @@ Constructor for a Tensor. The constructor must be passed `rank` integers\n indicating the sizes of the instance along each of the the `rank`\n dimensions.\n \n- // Create a tensor of rank 3 of sizes 2, 3, 4. This tensor owns\n- // memory to hold 24 floating point values (24 = 2 x 3 x 4).\n- Tensor<float, 3> t_3d(2, 3, 4);\n+```cpp\n+// Create a tensor of rank 3 of sizes 2, 3, 4. This tensor owns\n+// memory to hold 24 floating point values (24 = 2 x 3 x 4).\n+Tensor<float, 3> t_3d(2, 3, 4);\n \n- // Resize t_3d by assigning a tensor of different sizes, but same rank.\n- t_3d = Tensor<float, 3>(3, 4, 3);\n+// Resize t_3d by assigning a tensor of different sizes, but same rank.\n+t_3d = Tensor<float, 3>(3, 4, 3);\n+```\n \n #### Constructor Tensor<data_type, rank>(size_array)\n \n@@ -38,8 +40,10 @@ values instead of an explicitly list of parameters. The array type to use is\n `Eigen::array<Eigen::Index>`. The array can be constructed automatically\n from an initializer list.\n \n- // Create a tensor of strings of rank 2 with sizes 5, 7.\n- Tensor<string, 2> t_2d({5, 7});\n+```cpp\n+// Create a tensor of strings of rank 2 with sizes 5, 7.\n+Tensor<string, 2> t_2d({5, 7});\n+```\n \n ### Class TensorFixedSize<data_type, Sizes<size0, size1, ...>>\n \n@@ -50,17 +54,19 @@ dimensions are known by the compiler. 
FixedSize tensors are not resizable.\n If the total number of elements in a fixed size tensor is small enough the\n tensor data is held onto the stack and does not cause heap allocation and free.\n \n- // Create a 4 x 3 tensor of floats.\n- TensorFixedSize<float, Sizes<4, 3>> t_4x3;\n+```cpp\n+// Create a 4 x 3 tensor of floats.\n+TensorFixedSize<float, Sizes<4, 3>> t_4x3;\n+```\n \n ### Class TensorMap<Tensor<data_type, rank>>\n \n This is the class to use to create a tensor on top of memory allocated and\n owned by another part of your code. It allows to view any piece of allocated\n-memory as a Tensor. Instances of this class do not own the memory where the\n+memory as a `Tensor`. Instances of this class do not own the memory where the\n data are stored.\n \n-A TensorMap is not resizable because it does not own the memory where its data\n+A `TensorMap` is not resizable because it does not own the memory where its data\n are stored.\n \n #### Constructor TensorMap<Tensor<data_type, rank>>(data, size0, size1, ...)\n@@ -69,23 +75,24 @@ Constructor for a Tensor. The constructor must be passed a pointer to the\n storage for the data, and \"rank\" size attributes. The storage has to be\n large enough to hold all the data.\n \n- // Map a tensor of ints on top of stack-allocated storage.\n- int storage[128]; // 2 x 4 x 2 x 8 = 128\n- TensorMap<Tensor<int, 4>> t_4d(storage, 2, 4, 2, 8);\n+```cpp\n+// Map a tensor of ints on top of stack-allocated storage.\n+int storage[128]; // 2 x 4 x 2 x 8 = 128\n+TensorMap<Tensor<int, 4>> t_4d(storage, 2, 4, 2, 8);\n \n- // The same storage can be viewed as a different tensor.\n- // You can also pass the sizes as an array.\n- TensorMap<Tensor<int, 2>> t_2d(storage, 16, 8);\n-\n- // You can also map fixed-size tensors. 
Here we get a 1d view of\n- // the 2d fixed-size tensor.\n- TensorFixedSize<float, Sizes<4, 3>> t_4x3;\n- TensorMap<Tensor<float, 1>> t_12(t_4x3.data(), 12);\n+// The same storage can be viewed as a different tensor.\n+// You can also pass the sizes as an array.\n+TensorMap<Tensor<int, 2>> t_2d(storage, 16, 8);\n \n+// You can also map fixed-size tensors. Here we get a 1d view of\n+// the 2d fixed-size tensor.\n+TensorFixedSize<float, Sizes<4, 3>> t_4x3;\n+TensorMap<Tensor<float, 1>> t_12(t_4x3.data(), 12);\n+```\n \n #### Class TensorRef\n \n-See Assigning to a `TensorRef` below.\n+See [Assigning to a TensorRef.](#assigning-to-a-tensorref)\n \n ## Accessing Tensor Elements\n \n@@ -96,24 +103,25 @@ Return the element at position `(index0, index1...)` in tensor\n The expression can be used as an l-value to set the value of the element at the\n specified position. The value returned is of the datatype of the tensor.\n \n- // Set the value of the element at position (0, 1, 0);\n- Tensor<float, 3> t_3d(2, 3, 4);\n- t_3d(0, 1, 0) = 12.0f;\n-\n- // Initialize all elements to random values.\n- for (int i = 0; i < 2; ++i) {\n- for (int j = 0; j < 3; ++j) {\n- for (int k = 0; k < 4; ++k) {\n- t_3d(i, j, k) = ...some random value...;\n- }\n- }\n- }\n+```cpp\n+// Set the value of the element at position (0, 1, 0);\n+Tensor<float, 3> t_3d(2, 3, 4);\n+t_3d(0, 1, 0) = 12.0f;\n \n- // Print elements of a tensor.\n- for (int i = 0; i < 2; ++i) {\n- LOG(INFO) << t_3d(i, 0, 0);\n+// Initialize all elements to random values.\n+for (int i = 0; i < 2; ++i) {\n+ for (int j = 0; j < 3; ++j) {\n+ for (int k = 0; k < 4; ++k) {\n+ t_3d(i, j, k) = ...some random value...;\n }\n+ }\n+}\n \n+// Print elements of a tensor.\n+for (int i = 0; i < 2; ++i) {\n+ std::cout << t_3d(i, 0, 0);\n+}\n+```\n \n ## TensorLayout\n \n@@ -123,8 +131,10 @@ The tensor library supports 2 layouts: `ColMajor` (the default) and\n The layout of a tensor is optionally specified as part of its type. 
If not\n specified explicitly column major is assumed.\n \n- Tensor<float, 3, ColMajor> col_major; // equivalent to Tensor<float, 3>\n- TensorMap<Tensor<float, 3, RowMajor> > row_major(data, ...);\n+```cpp\n+Tensor<float, 3, ColMajor> col_major; // equivalent to Tensor<float, 3>\n+TensorMap<Tensor<float, 3, RowMajor> > row_major(data, ...);\n+```\n \n All the arguments to an expression must use the same layout. Attempting to mix\n different layouts will result in a compilation error.\n@@ -133,47 +143,50 @@ It is possible to change the layout of a tensor or an expression using the\n `swap_layout()` method. Note that this will also reverse the order of the\n dimensions.\n \n- Tensor<float, 2, ColMajor> col_major(2, 4);\n- Tensor<float, 2, RowMajor> row_major(2, 4);\n+```cpp\n+Tensor<float, 2, ColMajor> col_major(2, 4);\n+Tensor<float, 2, RowMajor> row_major(2, 4);\n \n- Tensor<float, 2> col_major_result = col_major; // ok, layouts match\n- Tensor<float, 2> col_major_result = row_major; // will not compile\n+Tensor<float, 2> col_major_result = col_major; // ok, layouts match\n+Tensor<float, 2> col_major_result = row_major; // will not compile\n \n- // Simple layout swap\n- col_major_result = row_major.swap_layout();\n- eigen_assert(col_major_result.dimension(0) == 4);\n- eigen_assert(col_major_result.dimension(1) == 2);\n-\n- // Swap the layout and preserve the order of the dimensions\n- array<int, 2> shuffle(1, 0);\n- col_major_result = row_major.swap_layout().shuffle(shuffle);\n- eigen_assert(col_major_result.dimension(0) == 2);\n- eigen_assert(col_major_result.dimension(1) == 4);\n+// Simple layout swap\n+col_major_result = row_major.swap_layout();\n+eigen_assert(col_major_result.dimension(0) == 4);\n+eigen_assert(col_major_result.dimension(1) == 2);\n \n+// Swap the layout and preserve the order of the dimensions\n+array<int, 2> shuffle(1, 0);\n+col_major_result = row_major.swap_layout().shuffle(shuffle);\n+eigen_assert(col_major_result.dimension(0) == 
2);\n+eigen_assert(col_major_result.dimension(1) == 4);\n+```\n \n ## Tensor Operations\n \n The Eigen Tensor library provides a vast library of operations on Tensors:\n numerical operations such as addition and multiplication, geometry operations\n such as slicing and shuffling, etc. These operations are available as methods\n-of the Tensor classes, and in some cases as operator overloads. For example\n+of the `Tensor` classes, and in some cases as operator overloads. For example\n the following code computes the elementwise addition of two tensors:\n \n- Tensor<float, 3> t1(2, 3, 4);\n- ...set some values in t1...\n- Tensor<float, 3> t2(2, 3, 4);\n- ...set some values in t2...\n- // Set t3 to the element wise sum of t1 and t2\n- Tensor<float, 3> t3 = t1 + t2;\n+```cpp\n+Tensor<float, 3> t1(2, 3, 4);\n+t2.setRandom();\n+Tensor<float, 3> t2(2, 3, 4);\n+t2.setRandom();\n+// Set t3 to the element wise sum of t1 and t2\n+Tensor<float, 3> t3 = t1 + t2;\n+```\n \n While the code above looks easy enough, it is important to understand that the\n expression `t1 + t2` is not actually adding the values of the tensors. The\n expression instead constructs a \"tensor operator\" object of the class\n-TensorCwiseBinaryOp<scalar_sum>, which has references to the tensors\n+`TensorCwiseBinaryOp<scalar_sum>`, which has references to the tensors\n `t1` and `t2`. This is a small C++ object that knows how to add\n `t1` and `t2`. It is only when the value of the expression is assigned\n to the tensor `t3` that the addition is actually performed. 
Technically,\n-this happens through the overloading of `operator=()` in the Tensor class.\n+this happens through the overloading of `operator=` in the Tensor class.\n \n This mechanism for computing tensor expressions allows for lazy evaluation and\n optimizations which are what make the tensor library very fast.\n@@ -181,16 +194,19 @@ optimizations which are what make the tensor library very fast.\n Of course, the tensor operators do nest, and the expression `t1 + t2 * 0.3f`\n is actually represented with the (approximate) tree of operators:\n \n- TensorCwiseBinaryOp<scalar_sum>(t1, TensorCwiseUnaryOp<scalar_mul>(t2, 0.3f))\n-\n+```cpp\n+TensorCwiseBinaryOp<scalar_sum>(t1, TensorCwiseUnaryOp<scalar_mul>(t2, 0.3f))\n+```\n \n ### Tensor Operations and C++ \"auto\"\n \n-Because Tensor operations create tensor operators, the C++ `auto` keyword\n+Because `Tensor` operations create tensor operators, the C++ `auto` keyword\n does not have its intuitive meaning. Consider these 2 lines of code:\n \n- Tensor<float, 3> t3 = t1 + t2;\n- auto t4 = t1 + t2;\n+```cpp\n+Tensor<float, 3> t3 = t1 + t2;\n+auto t4 = t1 + t2;\n+```\n \n In the first line we allocate the tensor `t3` and it will contain the\n result of the addition of `t1` and `t2`. In the second line, `t4`\n@@ -198,191 +214,221 @@ is actually the tree of tensor operators that will compute the addition of\n `t1` and `t2`. In fact, `t4` is *not* a tensor and you cannot get\n the values of its elements:\n \n- Tensor<float, 3> t3 = t1 + t2;\n- cout << t3(0, 0, 0); // OK prints the value of t1(0, 0, 0) + t2(0, 0, 0)\n+```cpp\n+Tensor<float, 3> t3 = t1 + t2;\n+std::cout << t3(0, 0, 0); // OK prints the value of t1(0, 0, 0) + t2(0, 0, 0)\n \n- auto t4 = t1 + t2;\n- cout << t4(0, 0, 0); // Compilation error!\n+auto t4 = t1 + t2;\n+std::cout << t4(0, 0, 0); // Compilation error!\n+```\n \n-When you use `auto` you do not get a Tensor as a result but instead a\n-non-evaluated expression. 
So only use `auto` to delay evaluation.\n+When you use `auto` you do not get a `Tensor` as a result but instead a\n+non-evaluated expression.\n+So only use `auto` to delay evaluation.\n \n Unfortunately, there is no single underlying concrete type for holding\n-non-evaluated expressions, hence you have to use auto in the case when you do\n+non-evaluated expressions, hence you have to use `auto` in the case when you do\n want to hold non-evaluated expressions.\n \n When you need the results of set of tensor computations you have to assign the\n-result to a Tensor that will be capable of holding onto them. This can be\n-either a normal Tensor, a fixed size Tensor, or a TensorMap on an existing\n+result to a `Tensor` that will be capable of holding onto them. This can be\n+either a normal `Tensor`, a `TensorFixedSize`, or a `TensorMap` on an existing\n piece of memory. All the following will work:\n \n- auto t4 = t1 + t2;\n+```cpp\n+auto t4 = t1 + t2;\n \n- Tensor<float, 3> result = t4; // Could also be: result(t4);\n- cout << result(0, 0, 0);\n+Tensor<float, 3> result = t4; // Could also be: result(t4);\n+std::cout << result(0, 0, 0);\n \n- TensorMap<float, 4> result(<a float* with enough space>, <size0>, ...) = t4;\n- cout << result(0, 0, 0);\n+TensorMap<float, 4> result(<a float* with enough space>, <size0>, ...) = t4;\n+std::cout << result(0, 0, 0);\n \n- TensorFixedSize<float, Sizes<size0, ...>> result = t4;\n- cout << result(0, 0, 0);\n+TensorFixedSize<float, Sizes<size0, ...>> result = t4;\n+std::cout << result(0, 0, 0);\n+```\n \n Until you need the results, you can keep the operation around, and even reuse\n it for additional operations. 
As long as you keep the expression as an\n operation, no computation is performed.\n \n- // One way to compute exp((t1 + t2) * 0.2f);\n- auto t3 = t1 + t2;\n- auto t4 = t3 * 0.2f;\n- auto t5 = t4.exp();\n- Tensor<float, 3> result = t5;\n+```cpp\n+// One way to compute exp((t1 + t2) * 0.2f);\n+auto t3 = t1 + t2;\n+auto t4 = t3 * 0.2f;\n+auto t5 = t4.exp();\n+Tensor<float, 3> result = t5;\n \n- // Another way, exactly as efficient as the previous one:\n- Tensor<float, 3> result = ((t1 + t2) * 0.2f).exp();\n+// Another way, exactly as efficient as the previous one:\n+Tensor<float, 3> result = ((t1 + t2) * 0.2f).exp();\n+```\n \n ### Controlling When Expression are Evaluated\n \n There are several ways to control when expressions are evaluated:\n \n-* Assignment to a Tensor, TensorFixedSize, or TensorMap.\n-* Use of the eval() method.\n-* Assignment to a TensorRef.\n+* Assignment to a `Tensor`, `TensorFixedSize`, or `TensorMap`.\n+* Use of the `eval()` method.\n+* Assignment to a `TensorRef`.\n \n #### Assigning to a Tensor, TensorFixedSize, or TensorMap.\n \n-The most common way to evaluate an expression is to assign it to a Tensor. 
In\n-the example below, the `auto` declarations make the intermediate values\n+The most common way to evaluate an expression is to assign it to a `Tensor`.\n+In the example below, the `auto` declarations make the intermediate values\n \"Operations\", not Tensors, and do not cause the expressions to be evaluated.\n The assignment to the Tensor `result` causes the evaluation of all the\n operations.\n \n- auto t3 = t1 + t2; // t3 is an Operation.\n- auto t4 = t3 * 0.2f; // t4 is an Operation.\n- auto t5 = t4.exp(); // t5 is an Operation.\n- Tensor<float, 3> result = t5; // The operations are evaluated.\n+```cpp\n+auto t3 = t1 + t2; // t3 is an Operation.\n+auto t4 = t3 * 0.2f; // t4 is an Operation.\n+auto t5 = t4.exp(); // t5 is an Operation.\n+Tensor<float, 3> result = t5; // The operations are evaluated.\n+```\n \n If you know the ranks and sizes of the Operation value you can assign the\n-Operation to a TensorFixedSize instead of a Tensor, which is a bit more\n-efficient.\n+Operation to a `TensorFixedSize` instead of a `Tensor`, which is a bit more efficient.\n \n- // We know that the result is a 4x4x2 tensor!\n- TensorFixedSize<float, Sizes<4, 4, 2>> result = t5;\n+```cpp\n+// We know that the result is a 4x4x2 tensor!\n+TensorFixedSize<float, Sizes<4, 4, 2>> result = t5;\n+```\n \n-Simiarly, assigning an expression to a TensorMap causes its evaluation. Like\n-tensors of type TensorFixedSize, TensorMaps cannot be resized so they have to\n+Similarly, assigning an expression to a `TensorMap` causes its evaluation.\n+Like tensors of type `TensorFixedSize`, a `TensorMap` cannot be resized so they have to\n have the rank and sizes of the expression that are assigned to them.\n \n #### Calling eval().\n \n When you compute large composite expressions, you sometimes want to tell Eigen\n that an intermediate value in the expression tree is worth evaluating ahead of\n-time. 
This is done by inserting a call to the `eval()` method of the\n+time.\n+This is done by inserting a call to the `eval()` method of the\n expression Operation.\n \n- // The previous example could have been written:\n- Tensor<float, 3> result = ((t1 + t2) * 0.2f).exp();\n+```cpp\n+// The previous example could have been written:\n+Tensor<float, 3> result = ((t1 + t2) * 0.2f).exp();\n \n- // If you want to compute (t1 + t2) once ahead of time you can write:\n- Tensor<float, 3> result = ((t1 + t2).eval() * 0.2f).exp();\n+// If you want to compute (t1 + t2) once ahead of time you can write:\n+Tensor<float, 3> result = ((t1 + t2).eval() * 0.2f).exp();\n+```\n \n Semantically, calling `eval()` is equivalent to materializing the value of\n-the expression in a temporary Tensor of the right size. The code above in\n-effect does:\n+the expression in a temporary `Tensor` of the right size.\n+The code above in effect does:\n \n- // .eval() knows the size!\n- TensorFixedSize<float, Sizes<4, 4, 2>> tmp = t1 + t2;\n- Tensor<float, 3> result = (tmp * 0.2f).exp();\n+```cpp\n+// .eval() knows the size!\n+TensorFixedSize<float, Sizes<4, 4, 2>> tmp = t1 + t2;\n+Tensor<float, 3> result = (tmp * 0.2f).exp();\n+```\n \n Note that the return value of `eval()` is itself an Operation, so the\n following code does not do what you may think:\n \n- // Here t3 is an evaluation Operation. t3 has not been evaluated yet.\n- auto t3 = (t1 + t2).eval();\n+```cpp\n+// Here t3 is an evaluation Operation. t3 has not been evaluated yet.\n+auto t3 = (t1 + t2).eval();\n \n- // You can use t3 in another expression. Still no evaluation.\n- auto t4 = (t3 * 0.2f).exp();\n+// You can use t3 in another expression. 
Still no evaluation.\n+auto t4 = (t3 * 0.2f).exp();\n \n- // The value is evaluated when you assign the Operation to a Tensor, using\n- // an intermediate tensor to represent t3.x\n- Tensor<float, 3> result = t4;\n+// The value is evaluated when you assign the Operation to a Tensor, using\n+// an intermediate tensor to represent t3.x\n+Tensor<float, 3> result = t4;\n+```\n \n While in the examples above calling `eval()` does not make a difference in\n performance, in other cases it can make a huge difference. In the expression\n below the `broadcast()` expression causes the `X.maximum()` expression\n to be evaluated many times:\n \n- Tensor<...> X ...;\n- Tensor<...> Y = ((X - X.maximum(depth_dim).reshape(dims2d).broadcast(bcast))\n- * beta).exp();\n+```cpp\n+Tensor<...> X ...;\n+Tensor<...> Y = ((X - X.maximum(depth_dim).reshape(dims2d).broadcast(bcast))\n+ * beta).exp();\n+```\n \n Inserting a call to `eval()` between the `maximum()` and\n-`reshape()` calls guarantees that maximum() is only computed once and\n+`reshape()` calls guarantees that `maximum()` is only computed once and\n greatly speeds-up execution:\n \n- Tensor<...> Y =\n- ((X - X.maximum(depth_dim).eval().reshape(dims2d).broadcast(bcast))\n- * beta).exp();\n+```cpp\n+Tensor<...> Y =\n+ ((X - X.maximum(depth_dim).eval().reshape(dims2d).broadcast(bcast))\n+ * beta).exp();\n+```\n \n-In the other example below, the tensor `Y` is both used in the expression\n-and its assignment. 
This is an aliasing problem and if the evaluation is not\n-done in the right order Y will be updated incrementally during the evaluation\n+In the other example below, the tensor `Y` is both used in the expression and its assignment.\n+This is an aliasing problem and if the evaluation is not done in the right order\n+Y will be updated incrementally during the evaluation\n resulting in bogus results:\n \n- Tensor<...> Y ...;\n- Y = Y / (Y.sum(depth_dim).reshape(dims2d).broadcast(bcast));\n+```cpp\n+ Tensor<...> Y ...;\n+ Y = Y / (Y.sum(depth_dim).reshape(dims2d).broadcast(bcast));\n+```\n \n Inserting a call to `eval()` between the `sum()` and `reshape()`\n expressions ensures that the sum is computed before any updates to `Y` are\n done.\n \n- Y = Y / (Y.sum(depth_dim).eval().reshape(dims2d).broadcast(bcast));\n+```cpp\n+ Y = Y / (Y.sum(depth_dim).eval().reshape(dims2d).broadcast(bcast));\n+```\n \n Note that an eval around the full right hand side expression is not needed\n-because the generated has to compute the i-th value of the right hand side\n+because the generated has to compute the `i`-th value of the right hand side\n before assigning it to the left hand side.\n \n However, if you were assigning the expression value to a shuffle of `Y`\n then you would need to force an eval for correctness by adding an `eval()`\n call for the right hand side:\n \n- Y.shuffle(...) =\n- (Y / (Y.sum(depth_dim).eval().reshape(dims2d).broadcast(bcast))).eval();\n-\n+```cpp\n+ Y.shuffle(...) =\n+ (Y / (Y.sum(depth_dim).eval().reshape(dims2d).broadcast(bcast))).eval();\n+```\n \n #### Assigning to a TensorRef.\n \n If you need to access only a few elements from the value of an expression you\n-can avoid materializing the value in a full tensor by using a TensorRef.\n+can avoid materializing the value in a full tensor by using a `TensorRef`.\n \n-A TensorRef is a small wrapper class for any Eigen Operation. 
It provides\n+A `TensorRef` is a small wrapper class for any Eigen Operation. It provides\n overloads for the `()` operator that let you access individual values in\n-the expression. TensorRef is convenient, because the Operation themselves do\n+the expression.\n+`TensorRef` is convenient, because the Operation themselves do\n not provide a way to access individual elements.\n \n- // Create a TensorRef for the expression. The expression is not\n- // evaluated yet.\n- TensorRef<Tensor<float, 3> > ref = ((t1 + t2) * 0.2f).exp();\n+```cpp\n+// Create a TensorRef for the expression. The expression is not\n+// evaluated yet.\n+TensorRef<Tensor<float, 3> > ref = ((t1 + t2) * 0.2f).exp();\n \n- // Use \"ref\" to access individual elements. The expression is evaluated\n- // on the fly.\n- float at_0 = ref(0, 0, 0);\n- cout << ref(0, 1, 0);\n+// Use \"ref\" to access individual elements. The expression is evaluated\n+// on the fly.\n+float at_0 = ref(0, 0, 0);\n+std::cout << ref(0, 1, 0);\n+```\n \n-Only use TensorRef when you need a subset of the values of the expression.\n-TensorRef only computes the values you access. However note that if you are\n-going to access all the values it will be much faster to materialize the\n-results in a Tensor first.\n+Only use `TensorRef` when you need a subset of the values of the expression.\n+`TensorRef` only computes the values you access.\n+However note that if you are going to access all the values it will be much\n+ faster to materialize the results in a `Tensor` first.\n \n-In some cases, if the full Tensor result would be very large, you may save\n-memory by accessing it as a TensorRef. But not always. 
So don't count on it.\n+In some cases, if the full `Tensor` result would be very large, you may save\n+memory by accessing it as a `TensorRef`.\n+But not always.\n+So don't count on it.\n \n \n ### Controlling How Expressions Are Evaluated\n \n The tensor library provides several implementations of the various operations\n such as contractions and convolutions. The implementations are optimized for\n-different environments: single threaded on CPU, multi threaded on CPU, or on a\n-GPU using cuda. Additional implementations may be added later.\n+different environments: single threaded on CPU, multi threaded on CPU, or on a GPU using cuda.\n \n You can choose which implementation to use with the `device()` call. If\n you do not choose an implementation explicitly the default implementation that\n@@ -396,43 +442,51 @@ to enable the use of SSE, AVX, and other instructions.\n For example, the following code adds two tensors using the default\n single-threaded CPU implementation:\n \n- Tensor<float, 2> a(30, 40);\n- Tensor<float, 2> b(30, 40);\n- Tensor<float, 2> c = a + b;\n+```cpp\n+Tensor<float, 2> a(30, 40);\n+Tensor<float, 2> b(30, 40);\n+Tensor<float, 2> c = a + b;\n+```\n \n To choose a different implementation you have to insert a `device()` call\n before the assignment of the result. For technical C++ reasons this requires\n-that the Tensor for the result be declared on its own. This means that you\n-have to know the size of the result.\n+that the `Tensor` for the result be declared on its own.\n+This means that you have to know the size of the result.\n \n- Eigen::Tensor<float, 2> c(30, 40);\n- c.device(...) = a + b;\n+```cpp\n+Eigen::Tensor<float, 2> c(30, 40);\n+c.device(...) = a + b;\n+```\n \n The call to `device()` must be the last call on the left of the operator=.\n \n You must pass to the `device()` call an Eigen device object. 
There are\n-presently three devices you can use: DefaultDevice, ThreadPoolDevice and\n-GpuDevice.\n+presently three devices you can use: `DefaultDevice`, `ThreadPoolDevice` and\n+`GpuDevice`.\n \n \n #### Evaluating With the DefaultDevice\n \n This is exactly the same as not inserting a `device()` call.\n \n- DefaultDevice my_device;\n- c.device(my_device) = a + b;\n+```cpp\n+DefaultDevice my_device;\n+c.device(my_device) = a + b;\n+```\n \n #### Evaluating with a Thread Pool\n \n- // Create the Eigen ThreadPool\n- Eigen::ThreadPool pool(8 /* number of threads in pool */)\n+```cpp\n+// Create the Eigen ThreadPool\n+Eigen::ThreadPool pool(8 /* number of threads in pool */)\n \n- // Create the Eigen ThreadPoolDevice.\n- Eigen::ThreadPoolDevice my_device(&pool, 4 /* number of threads to use */);\n+// Create the Eigen ThreadPoolDevice.\n+Eigen::ThreadPoolDevice my_device(&pool, 4 /* number of threads to use */);\n \n- // Now just use the device when evaluating expressions.\n- Eigen::Tensor<float, 2> c(30, 50);\n- c.device(my_device) = a.contract(b, dot_product_dims);\n+// Now just use the device when evaluating expressions.\n+Eigen::Tensor<float, 2> c(30, 50);\n+c.device(my_device) = a.contract(b, dot_product_dims);\n+```\n \n \n #### Evaluating On GPU\n@@ -451,7 +505,7 @@ that are tensor-type specific:\n \n #### <Tensor-Type>::Dimensions\n \n-Acts like an array of ints. Has an `int size` attribute, and can be\n+Acts like an array of `int`. Has an `int size` attribute, and can be\n indexed like an array to access individual values. Used to represent the\n dimensions of a tensor. See `dimensions()`.\n \n@@ -463,8 +517,7 @@ Acts like an `int`. Used for indexing tensors along their dimensions. See\n #### <Tensor-Type>::Scalar\n \n Represents the datatype of individual tensor elements. For example, for a\n-`Tensor<float>`, `Scalar` is the type `float`. See\n-`setConstant()`.\n+`Tensor<float>`, `Scalar` is the type `float`. 
See `setConstant()`.\n \n #### (Operation)\n \n@@ -473,8 +526,8 @@ method. We indicate in the text the type and dimensions of the tensor that the\n Operation returns after evaluation.\n \n The Operation will have to be evaluated, for example by assigning it to a\n-tensor, before you can access the values of the resulting tensor. You can also\n-access the values through a TensorRef.\n+`Tensor`, before you can access the values of the resulting tensor. You can also\n+access the values through a `TensorRef`.\n \n \n ## Built-in Tensor Methods\n@@ -482,71 +535,80 @@ access the values through a TensorRef.\n These are usual C++ methods that act on tensors immediately. They are not\n Operations which provide delayed evaluation of their results. Unless specified\n otherwise, all the methods listed below are available on all tensor classes:\n-Tensor, TensorFixedSize, and TensorMap.\n+`Tensor`, `TensorFixedSize`, and `TensorMap`.\n \n ## Metadata\n \n ### int NumDimensions\n \n-Constant value indicating the number of dimensions of a Tensor. 
This is also\n-known as the tensor \"rank\".\n+Constant value indicating the number of dimensions of a `Tensor`.\n+This is also known as the tensor rank.\n \n- Eigen::Tensor<float, 2> a(3, 4);\n- cout << \"Dims \" << a.NumDimensions;\n- => Dims 2\n+```cpp\n+Eigen::Tensor<float, 2> a(3, 4);\n+std::cout << \"Dims \" << a.NumDimensions;\n+// Dims 2\n+```\n \n ### Dimensions dimensions()\n \n Returns an array-like object representing the dimensions of the tensor.\n-The actual type of the `dimensions()` result is `<Tensor-Type>::``Dimensions`.\n+The actual type of the `dimensions()` result is `<Tensor-Type>::Dimensions`.\n \n- Eigen::Tensor<float, 2> a(3, 4);\n- const Eigen::Tensor<float, 2>::Dimensions& d = a.dimensions();\n- cout << \"Dim size: \" << d.size << \", dim 0: \" << d[0]\n- << \", dim 1: \" << d[1];\n- => Dim size: 2, dim 0: 3, dim 1: 4\n+```cpp\n+Eigen::Tensor<float, 2> a(3, 4);\n+const Eigen::Tensor<float, 2>::Dimensions& d = a.dimensions();\n+std::cout << \"Dim size: \" << d.size << \", dim 0: \" << d[0]\n+ << \", dim 1: \" << d[1];\n+// Dim size: 2, dim 0: 3, dim 1: 4\n+```\n \n If you use a C++11 compiler, you can use `auto` to simplify the code:\n \n- const auto& d = a.dimensions();\n- cout << \"Dim size: \" << d.size << \", dim 0: \" << d[0]\n- << \", dim 1: \" << d[1];\n- => Dim size: 2, dim 0: 3, dim 1: 4\n+```cpp\n+const auto& d = a.dimensions();\n+std::cout << \"Dim size: \" << d.size << \", dim 0: \" << d[0]\n+ << \", dim 1: \" << d[1];\n+// Dim size: 2, dim 0: 3, dim 1: 4\n+```\n \n ### Index dimension(Index n)\n \n Returns the n-th dimension of the tensor. 
The actual type of the\n-`dimension()` result is `<Tensor-Type>::``Index`, but you can\n+`dimension()` result is `<Tensor-Type>::Index`, but you can\n always use it like an int.\n \n- Eigen::Tensor<float, 2> a(3, 4);\n- int dim1 = a.dimension(1);\n- cout << \"Dim 1: \" << dim1;\n- => Dim 1: 4\n+```cpp\n+Eigen::Tensor<float, 2> a(3, 4);\n+int dim1 = a.dimension(1);\n+std::cout << \"Dim 1: \" << dim1;\n+// Dim 1: 4\n+```\n \n ### Index size()\n \n Returns the total number of elements in the tensor. This is the product of all\n the tensor dimensions. The actual type of the `size()` result is\n-`<Tensor-Type>::``Index`, but you can always use it like an int.\n-\n- Eigen::Tensor<float, 2> a(3, 4);\n- cout << \"Size: \" << a.size();\n- => Size: 12\n+`<Tensor-Type>::Index`, but you can always use it like an int.\n \n+```cpp\n+Eigen::Tensor<float, 2> a(3, 4);\n+std::cout << \"Size: \" << a.size();\n+// Size: 12\n+```\n \n ### Getting Dimensions From An Operation\n \n A few operations provide `dimensions()` directly,\n e.g. `TensorReslicingOp`. Most operations defer calculating dimensions\n until the operation is being evaluated. If you need access to the dimensions\n-of a deferred operation, you can wrap it in a TensorRef (see Assigning to a\n-TensorRef above), which provides `dimensions()` and `dimension()` as\n-above.\n+of a deferred operation, you can wrap it in a `TensorRef` (see\n+[Assigning to a TensorRef](#assigning-to-a-tensorref)), which provides\n+`dimensions()` and `dimension()` as above.\n \n-TensorRef can also wrap the plain Tensor types, so this is a useful idiom in\n-templated contexts where the underlying object could be either a raw Tensor\n-or some deferred operation (e.g. a slice of a Tensor). In this case, the\n+`TensorRef` can also wrap the plain `Tensor` types, so this is a useful idiom in\n+templated contexts where the underlying object could be either a raw `Tensor`\n+or some deferred operation (e.g. a slice of a `Tensor`). 
In this case, the\n template code can wrap the object in a TensorRef and reason about its\n dimensionality while remaining agnostic to the underlying type.\n \n@@ -558,41 +620,46 @@ dimensionality while remaining agnostic to the underlying type.\n Creates a tensor of the specified size. The number of arguments must be equal\n to the rank of the tensor. The content of the tensor is not initialized.\n \n- Eigen::Tensor<float, 2> a(3, 4);\n- cout << \"NumRows: \" << a.dimension(0) << \" NumCols: \" << a.dimension(1) << endl;\n- => NumRows: 3 NumCols: 4\n-\n+```cpp\n+Eigen::Tensor<float, 2> a(3, 4);\n+std::cout << \"NumRows: \" << a.dimension(0) << \" NumCols: \" << a.dimension(1) << endl;\n+// NumRows: 3 NumCols: 4\n+```\n ### TensorFixedSize\n \n-Creates a tensor of the specified size. The number of arguments in the Sizes<>\n+Creates a tensor of the specified size. The number of arguments in the `Sizes<>`\n template parameter determines the rank of the tensor. The content of the tensor\n is not initialized.\n \n- Eigen::TensorFixedSize<float, Sizes<3, 4>> a;\n- cout << \"Rank: \" << a.rank() << endl;\n- => Rank: 2\n- cout << \"NumRows: \" << a.dimension(0) << \" NumCols: \" << a.dimension(1) << endl;\n- => NumRows: 3 NumCols: 4\n+```cpp\n+Eigen::TensorFixedSize<float, Sizes<3, 4>> a;\n+std::cout << \"Rank: \" << a.rank() << endl;\n+// Rank: 2\n+std::cout << \"NumRows: \" << a.dimension(0) \n+ << \" NumCols: \" << a.dimension(1) << endl;\n+// NumRows: 3 NumCols: 4\n+```\n \n ### TensorMap\n \n Creates a tensor mapping an existing array of data. 
The data must not be freed\n-until the TensorMap is discarded, and the size of the data must be large enough\n+until the `TensorMap` is discarded, and the size of the data must be large enough\n to accommodate the coefficients of the tensor.\n \n- float data[] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11};\n- Eigen::TensorMap<Tensor<float, 2>> a(data, 3, 4);\n- cout << \"NumRows: \" << a.dimension(0) << \" NumCols: \" << a.dimension(1) << endl;\n- => NumRows: 3 NumCols: 4\n- cout << \"a(1, 2): \" << a(1, 2) << endl;\n- => a(1, 2): 7\n-\n+```cpp\n+float data[] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11};\n+Eigen::TensorMap<Tensor<float, 2>> a(data, 3, 4);\n+std::cout << \"NumRows: \" << a.dimension(0) << \" NumCols: \" << a.dimension(1) << endl;\n+// NumRows: 3 NumCols: 4\n+std::cout << \"a(1, 2): \" << a(1, 2) << endl;\n+// a(1, 2): 7\n+```\n \n ## Contents Initialization\n \n-When a new Tensor or a new TensorFixedSize are created, memory is allocated to\n+When a new `Tensor` or a new `TensorFixedSize` is created, memory is allocated to\n hold all the tensor elements, but the memory is not initialized. Similarly,\n-when a new TensorMap is created on top of non-initialized memory the memory its\n+when a new `TensorMap` is created on top of non-initialized memory, its\n contents are not initialized.\n \n You can use one of the methods below to initialize the tensor memory. 
These\n@@ -607,39 +674,42 @@ convertible to that type.\n \n Returns the tensor itself in case you want to chain another call.\n \n- a.setConstant(12.3f);\n- cout << \"Constant: \" << endl << a << endl << endl;\n- =>\n- Constant:\n- 12.3 12.3 12.3 12.3\n- 12.3 12.3 12.3 12.3\n- 12.3 12.3 12.3 12.3\n+```cpp\n+a.setConstant(12.3f);\n+std::cout << \"Constant: \" << endl << a << endl << endl;\n \n+// Constant:\n+// 12.3 12.3 12.3 12.3\n+// 12.3 12.3 12.3 12.3\n+// 12.3 12.3 12.3 12.3\n+```\n Note that `setConstant()` can be used on any tensor where the element type\n has a copy constructor and an `operator=()`:\n \n- Eigen::Tensor<string, 2> a(2, 3);\n- a.setConstant(\"yolo\");\n- cout << \"String tensor: \" << endl << a << endl << endl;\n- =>\n- String tensor:\n- yolo yolo yolo\n- yolo yolo yolo\n+```cpp\n+Eigen::Tensor<string, 2> a(2, 3);\n+a.setConstant(\"yolo\");\n+std::cout << \"String tensor: \" << endl << a << endl << endl;\n \n+// String tensor:\n+// yolo yolo yolo\n+// yolo yolo yolo\n+```\n \n ### <Tensor-Type> setZero()\n \n Fills the tensor with zeros. Equivalent to `setConstant(Scalar(0))`.\n Returns the tensor itself in case you want to chain another call.\n \n- a.setZero();\n- cout << \"Zeros: \" << endl << a << endl << endl;\n- =>\n- Zeros:\n- 0 0 0 0\n- 0 0 0 0\n- 0 0 0 0\n+```cpp\n+a.setZero();\n+std::cout << \"Zeros: \" << endl << a << endl << endl;\n \n+// Zeros:\n+// 0 0 0 0\n+// 0 0 0 0\n+// 0 0 0 0\n+```\n \n ### <Tensor-Type> setValues({..initializer_list})\n \n@@ -647,7 +717,7 @@ Fills the tensor with explicit values specified in a std::initializer_list.\n The type of the initializer list depends on the type and rank of the tensor.\n \n If the tensor has rank N, the initializer list must be nested N times. 
The\n-most deeply nested lists must contains P scalars of the Tensor type where P is\n+most deeply nested lists must contain P scalars of the `Tensor` type where P is\n the size of the last dimension of the Tensor.\n \n For example, for a `TensorFixedSize<float, 2, 3>` the initializer list must\n@@ -656,120 +726,129 @@ contains 2 lists of 3 floats each.\n `setValues()` returns the tensor itself in case you want to chain another\n call.\n \n- Eigen::Tensor<float, 2> a(2, 3);\n- a.setValues({{0.0f, 1.0f, 2.0f}, {3.0f, 4.0f, 5.0f}});\n- cout << \"a\" << endl << a << endl << endl;\n- =>\n- a\n- 0 1 2\n- 3 4 5\n+```cpp\n+Eigen::Tensor<float, 2> a(2, 3);\n+a.setValues({{0.0f, 1.0f, 2.0f}, {3.0f, 4.0f, 5.0f}});\n+std::cout << \"a\" << endl << a << endl << endl;\n+\n+// a\n+// 0 1 2\n+// 3 4 5\n+```\n \n If a list is too short, the corresponding elements of the tensor will not be\n changed. This is valid at each level of nesting. For example the following\n code only sets the values of the first row of the tensor.\n \n- Eigen::Tensor<int, 2> a(2, 3);\n- a.setConstant(1000);\n- a.setValues({{10, 20, 30}});\n- cout << \"a\" << endl << a << endl << endl;\n- =>\n- a\n- 10 20 30\n- 1000 1000 1000\n+```cpp\n+Eigen::Tensor<int, 2> a(2, 3);\n+a.setConstant(1000);\n+a.setValues({{10, 20, 30}});\n+std::cout << \"a\" << endl << a << endl << endl;\n+// a\n+// 10 20 30\n+// 1000 1000 1000\n+```\n \n ### <Tensor-Type> setRandom()\n \n Fills the tensor with random values. 
Returns the tensor itself in case you\n want to chain another call.\n \n- a.setRandom();\n- cout << \"Random: \" << endl << a << endl << endl;\n- =>\n- Random:\n- 0.680375 0.59688 -0.329554 0.10794\n- -0.211234 0.823295 0.536459 -0.0452059\n- 0.566198 -0.604897 -0.444451 0.257742\n+```cpp\n+a.setRandom();\n+std::cout << \"Random: \" << endl << a << endl << endl;\n+// Random:\n+// 0.680375 0.59688 -0.329554 0.10794\n+// -0.211234 0.823295 0.536459 -0.0452059\n+// 0.566198 -0.604897 -0.444451 0.257742\n+```\n \n You can customize `setRandom()` by providing your own random number\n generator as a template argument:\n \n- a.setRandom<MyRandomGenerator>();\n+```cpp\n+a.setRandom<MyRandomGenerator>();\n+```\n \n Here, `MyRandomGenerator` must be a struct with the following member\n-functions, where Scalar and Index are the same as `<Tensor-Type>::``Scalar`\n-and `<Tensor-Type>::``Index`.\n+functions, where Scalar and Index are the same as `<Tensor-Type>::Scalar`\n+and `<Tensor-Type>::Index`.\n \n See `struct UniformRandomGenerator` in TensorFunctors.h for an example.\n \n- // Custom number generator for use with setRandom().\n- struct MyRandomGenerator {\n- // Default and copy constructors. Both are needed\n- MyRandomGenerator() { }\n- MyRandomGenerator(const MyRandomGenerator& ) { }\n-\n- // Return a random value to be used. 
\"element_location\" is the\n- // location of the entry to set in the tensor, it can typically\n- // be ignored.\n- Scalar operator()(Eigen::DenseIndex element_location,\n- Eigen::DenseIndex /*unused*/ = 0) const {\n- return <randomly generated value of type T>;\n- }\n-\n- // Same as above but generates several numbers at a time.\n- typename internal::packet_traits<Scalar>::type packetOp(\n- Eigen::DenseIndex packet_location, Eigen::DenseIndex /*unused*/ = 0) const {\n- return <a packet of randomly generated values>;\n- }\n- };\n+```cpp\n+// Custom number generator for use with setRandom().\n+struct MyRandomGenerator {\n+ // Default and copy constructors. Both are needed\n+ MyRandomGenerator() { }\n+ MyRandomGenerator(const MyRandomGenerator& ) { }\n+\n+ // Return a random value to be used. \"element_location\" is the\n+ // location of the entry to set in the tensor, it can typically\n+ // be ignored.\n+ Scalar operator()(Eigen::DenseIndex element_location,\n+ Eigen::DenseIndex /*unused*/ = 0) const {\n+ return <randomly generated value of type T>;\n+ }\n+\n+ // Same as above but generates several numbers at a time.\n+ typename internal::packet_traits<Scalar>::type packetOp(\n+ Eigen::DenseIndex packet_location, Eigen::DenseIndex /*unused*/ = 0) const {\n+ return <a packet of randomly generated values>;\n+ }\n+};\n+```\n \n You can also use one of the 2 random number generators that are part of the\n tensor library:\n * UniformRandomGenerator\n * NormalRandomGenerator\n \n-\n ## Data Access\n \n The Tensor, TensorFixedSize, and TensorRef classes provide the following\n accessors to access the tensor coefficients:\n \n- const Scalar& operator()(const array<Index, NumIndices>& indices)\n- const Scalar& operator()(Index firstIndex, IndexTypes... otherIndices)\n- Scalar& operator()(const array<Index, NumIndices>& indices)\n- Scalar& operator()(Index firstIndex, IndexTypes... 
otherIndices)\n+```cpp\n+const Scalar& operator()(const array<Index, NumIndices>& indices)\n+const Scalar& operator()(Index firstIndex, IndexTypes... otherIndices)\n+Scalar& operator()(const array<Index, NumIndices>& indices)\n+Scalar& operator()(Index firstIndex, IndexTypes... otherIndices)\n+```\n \n The number of indices must be equal to the rank of the tensor. Moreover, these\n accessors are not available on tensor expressions. In order to access the\n values of a tensor expression, the expression must either be evaluated or\n wrapped in a TensorRef.\n \n-\n ### Scalar* data() and const Scalar* data() const\n \n Returns a pointer to the storage for the tensor. The pointer is const if the\n tensor was const. This allows direct access to the data. The layout of the\n-data depends on the tensor layout: RowMajor or ColMajor.\n+data depends on the tensor layout: `RowMajor` or `ColMajor`.\n \n This access is usually only needed for special cases, for example when mixing\n Eigen Tensor code with other libraries.\n \n Scalar is the type of data stored in the tensor.\n \n- Eigen::Tensor<float, 2> a(3, 4);\n- float* a_data = a.data();\n- a_data[0] = 123.45f;\n- cout << \"a(0, 0): \" << a(0, 0);\n- => a(0, 0): 123.45\n-\n+```cpp\n+Eigen::Tensor<float, 2> a(3, 4);\n+float* a_data = a.data();\n+a_data[0] = 123.45f;\n+std::cout << \"a(0, 0): \" << a(0, 0);\n+// a(0, 0): 123.45\n+```\n \n ## Tensor Operations\n \n All the methods documented below return non evaluated tensor `Operations`.\n-These can be chained: you can apply another Tensor Operation to the value\n+These can be chained: you can apply another `Tensor` Operation to the value\n returned by the method.\n \n The chain of Operation is evaluated lazily, typically when it is assigned to a\n-tensor. See \"Controlling when Expression are Evaluated\" for more details about\n+tensor. 
See [Controlling When Expression are Evaluated](#controlling-when-expression-are-evaluated) for more details about\n their evaluation.\n \n ### (Operation) constant(const Scalar& val)\n@@ -779,26 +858,29 @@ where all elements have the value `val`.\n \n This is useful, for example, when you want to add or subtract a constant from a\n tensor, or multiply every element of a tensor by a scalar.\n-\n- Eigen::Tensor<float, 2> a(2, 3);\n- a.setConstant(1.0f);\n- Eigen::Tensor<float, 2> b = a + a.constant(2.0f);\n- Eigen::Tensor<float, 2> c = b * b.constant(0.2f);\n- cout << \"a\" << endl << a << endl << endl;\n- cout << \"b\" << endl << b << endl << endl;\n- cout << \"c\" << endl << c << endl << endl;\n- =>\n- a\n- 1 1 1\n- 1 1 1\n-\n- b\n- 3 3 3\n- 3 3 3\n-\n- c\n- 0.6 0.6 0.6\n- 0.6 0.6 0.6\n+However, such operations can also be performed using operator overloads (see [operator+](#operation-operator-scalar-s)).\n+\n+\n+```cpp\n+Eigen::Tensor<float, 2> a(2, 3);\n+a.setConstant(1.0f);\n+Eigen::Tensor<float, 2> b = a + a.constant(2.0f);\n+Eigen::Tensor<float, 2> c = b * b.constant(0.2f);\n+std::cout << \"a\" << endl << a << endl << endl;\n+std::cout << \"b\" << endl << b << endl << endl;\n+std::cout << \"c\" << endl << c << endl << endl;\n+// a\n+// 1 1 1\n+// 1 1 1\n+\n+// b\n+// 3 3 3\n+// 3 3 3\n+\n+// c\n+// 0.6 0.6 0.6\n+// 0.6 0.6 0.6\n+```\n \n ### (Operation) random()\n \n@@ -809,20 +891,20 @@ This is for example useful to add random values to an existing tensor.\n The generation of random values can be customized in the same manner\n as for `setRandom()`.\n \n- Eigen::Tensor<float, 2> a(2, 3);\n- a.setConstant(1.0f);\n- Eigen::Tensor<float, 2> b = a + a.random();\n- cout << \"a\" << endl << a << endl << endl;\n- cout << \"b\" << endl << b << endl << endl;\n- =>\n- a\n- 1 1 1\n- 1 1 1\n-\n- b\n- 1.68038 1.5662 1.82329\n- 0.788766 1.59688 0.395103\n-\n+```cpp\n+Eigen::Tensor<float, 2> a(2, 3);\n+a.setConstant(1.0f);\n+Eigen::Tensor<float, 2> b = a + 
a.random();\n+std::cout << \"a\\n\" << a << \"\\n\\n\";\n+std::cout << \"b\\n\" << b << \"\\n\\n\";\n+\n+// a\n+// 1 1 1\n+// 1 1 1\n+// b\n+// 1.68038 1.5662 1.82329\n+// 0.788766 1.59688 0.395103\n+```\n \n ## Unary Element Wise Operations\n \n@@ -835,19 +917,21 @@ requested operations are applied to each element independently.\n \n### (Operation) operator- (unary)\n \n Returns a tensor of the same type and dimensions as the original tensor\n containing the opposite values of the original tensor.\n \n- Eigen::Tensor<float, 2> a(2, 3);\n- a.setConstant(1.0f);\n- Eigen::Tensor<float, 2> b = -a;\n- cout << \"a\" << endl << a << endl << endl;\n- cout << \"b\" << endl << b << endl << endl;\n- =>\n- a\n- 1 1 1\n- 1 1 1\n-\n- b\n- -1 -1 -1\n- -1 -1 -1\n+```cpp\n+Eigen::Tensor<float, 2> a(2, 3);\n+a.setConstant(1.0f);\n+Eigen::Tensor<float, 2> b = -a;\n+std::cout << \"a\\n\" << a << \"\\n\\n\";\n+std::cout << \"b\\n\" << b << \"\\n\\n\";\n+\n+// a\n+// 1 1 1\n+// 1 1 1\n+// \n+// b\n+// -1 -1 -1\n+// -1 -1 -1\n+```\n \n ### (Operation) sqrt()\n \n@@ -894,12 +978,14 @@ original tensor.\n \n Returns a tensor with the same dimensions as the original tensor\n containing the real part of the complex values of the original tensor.\n+The result has a real-valued scalar type.\n \n ### (Operation) imag()\n \n Returns a tensor with the same dimensions as the original tensor\n containing the imaginary part of the complex values of the original\n tensor.\n+The result has a real-valued scalar type.\n \n ### (Operation) pow(Scalar exponent)\n \n@@ -911,35 +997,158 @@ The type of the exponent, Scalar, is always the same as the type of the\n tensor coefficients. For example, only integer exponents can be used in\n conjunction with tensors of integer values.\n \n-You can use cast() to lift this restriction. For example this computes\n+You can use `cast()` to lift this restriction. 
For example, this computes\n cubic roots of an int Tensor:\n \n- Eigen::Tensor<int, 2> a(2, 3);\n- a.setValues({{0, 1, 8}, {27, 64, 125}});\n- Eigen::Tensor<double, 2> b = a.cast<double>().pow(1.0 / 3.0);\n- cout << \"a\" << endl << a << endl << endl;\n- cout << \"b\" << endl << b << endl << endl;\n- =>\n- a\n- 0 1 8\n- 27 64 125\n+```cpp\n+Eigen::Tensor<int, 2> a(2, 3);\n+a.setValues({{0, 1, 8}, {27, 64, 125}});\n+Eigen::Tensor<double, 2> b = a.cast<double>().pow(1.0 / 3.0);\n+std::cout << \"a\" << endl << a << endl << endl;\n+std::cout << \"b\" << endl << b << endl << endl;\n+\n+// a\n+// 0 1 8\n+// 27 64 125\n+// \n+// b\n+// 0 1 2\n+// 3 4 5\n+```\n+\n+### (Operation) operator* (Scalar s)\n+\n+Multiplies every element of the input tensor by the scalar `s`:\n+```cpp\n+Eigen::Tensor<int, 2> a(2, 3);\n+a.setValues({{1, 2, 3},\n+ {4, 5, 6}});\n+Eigen::Tensor<int,2> scaled_a = a * 2;\n+\n+std::cout << \"a\\n\" << a << \"\\n\";\n+std::cout << \"scaled_a\\n\" << scaled_a << \"\\n\";\n+\n+// a\n+// 1 2 3\n+// 4 5 6\n+// \n+// scaled_a\n+// 2 4 6\n+// 8 10 12\n+```\n+### (Operation) operator+ (Scalar s)\n+Adds `s` to every element in the tensor.\n+\n+### (Operation) operator- (Scalar s)\n+Subtracts `s` from every element in the tensor.\n+\n+### (Operation) operator/ (Scalar s)\n+Divides every element in the tensor by `s`.\n+\n+### (Operation) operator% (Scalar s)\n+Computes the element-wise modulus (remainder) of each tensor element divided by `s`.\n+\n+**Only integer types are supported.**\n+For floating-point tensors, use [unaryExpr](#operation-unaryexprcustomunaryop-func) with `std::fmod`.\n+\n+### (Operation) cwiseMax(Scalar threshold)\n+Returns the coefficient-wise maximum of the tensor and its argument, which can be another tensor or a scalar `threshold`.\n+```cpp\n+Eigen::Tensor<int, 2> a(2, 3);\n+a.setValues({{0, 100, 200}, {300, 400, 500}});\n \n- b\n- 0 1 2\n- 3 4 5\n+Eigen::Tensor<int, 2> b(2, 3);\n+b.setValues({{-1, -2, 300}, {-4, 555, -6}});\n \n-### (Operation) operator * (Scalar scale)\n+Eigen::Tensor<int, 2> 
c = a.cwiseMax(b);\n \n-Multiplies all the coefficients of the input tensor by the provided scale.\n+std::cout << \"a\\n\" << a << \"\\n\"\n+ << \"b\\n\" << b << \"\\n\"\n+ << \"c\\n\" << c << \"\\n\";\n \n-### (Operation) cwiseMax(Scalar threshold)\n-TODO\n+// a\n+// 0 100 200\n+// 300 400 500\n+\n+// b\n+// -1 -2 300\n+// -4 555 -6\n \n+// c\n+// 0 100 300\n+// 300 555 500\n+```\n ### (Operation) cwiseMin(Scalar threshold)\n-TODO\n+Returns the coefficient-wise minimum of the tensor and its argument, which can be another tensor or a scalar `threshold`.\n+\n+```cpp\n+Eigen::Tensor<int, 2> a(2, 2);\n+a.setValues({{0, 100}, {300, -900}});\n+\n+Eigen::Tensor<int, 2> b(2, 2);\n+b.setValues({{-1, -2}, {400, 555}});\n+\n+Eigen::Tensor<int, 2> c = a.cwiseMin(b);\n+\n+std::cout << \"a\\n\" << a << \"\\n\"\n+ << \"b\\n\" << b << \"\\n\"\n+ << \"c\\n\" << c << \"\\n\";\n+\n+// a\n+// 0 100\n+// 300 -900\n+\n+// b\n+// -1 -2\n+// 400 555\n+\n+// c\n+// -1 -2\n+// 300 -900\n+```\n \n ### (Operation) unaryExpr(const CustomUnaryOp& func)\n-TODO\n+Applies a user-defined function to each element in the tensor.\n+Both lambdas and functor structs with an `operator()` are supported.\n+\n+Using a lambda:\n+```cpp\n+Eigen::Tensor<float, 2> a(2, 3);\n+a.setValues({{0, -.5, -1}, {.5, 1.5, 2.0}});\n+auto my_func = [](float el){ return std::abs(el + 0.5f); };\n+Eigen::Tensor<float, 2> b = a.unaryExpr(my_func);\n+std::cout << \"a\\n\" << a << \"\\n\"\n+ << \"b\\n\" << b << \"\\n\";\n+\n+// a\n+// 0 -0.5 -1\n+// 0.5 1.5 2\n+// b\n+// 0.5 0 0.5\n+// 1 2 2.5\n+```\n+\n+Using a functor to normalize values from the range `[-1.0, 1.0]` to `[0, 1]`, clamping values outside it:\n+\n+```cpp\n+template<typename Scalar>\n+struct NormalizedClamp {\n+ NormalizedClamp(Scalar lo, Scalar hi) : _lo(lo), _hi(hi) {}\n+ Scalar operator()(Scalar x) const {\n+ if (x < _lo) return Scalar(0);\n+ if (x > _hi) return Scalar(1);\n+ return (x - _lo) / (_hi - _lo);\n+ }\n+ Scalar _lo, _hi;\n+};\n+\n+Eigen::Tensor<float, 2> c = a.unaryExpr(NormalizedClamp<float>(-1.0f, 1.0f));\n+std::cout << \"c\\n\" << c << \"\\n\";\n+\n+// c\n+// 0.5 0.25 0\n+// 0.75 1 
1\n+```\n \n \n ## Binary Element Wise Operations\n@@ -984,7 +1193,7 @@ containing the coefficient wise mimimums of the inputs.\n \n ### (Operation) Logical operators\n \n-The following logical operators are supported as well:\n+The following boolean operators are supported:\n \n * `operator&&(const OtherDerived& other)`\n * `operator||(const OtherDerived& other)`\n@@ -994,19 +1203,26 @@ The following logical operators are supported as well:\n * `operator>=(const OtherDerived& other)`\n * `operator==(const OtherDerived& other)`\n * `operator!=(const OtherDerived& other)`\n+\n+as well as bitwise operators:\n \n-They all return a tensor of boolean values.\n+* `operator&(const OtherDerived& other)`\n+* `operator|(const OtherDerived& other)`\n+* `operator^(const OtherDerived& other)`\n \n+The comparison and boolean operators return a tensor of booleans; the bitwise operators retain the scalar type of their inputs.\n \n ## Selection (select(const ThenDerived& thenTensor, const ElseDerived& elseTensor)\n \n Selection is a coefficient-wise ternary operator that is the tensor equivalent\n to the if-then-else operation.\n \n+```cpp\n Tensor<bool, 3> if = ...;\n Tensor<float, 3> then = ...;\n Tensor<float, 3> else = ...;\n Tensor<float, 3> result = if.select(then, else);\n+```\n \n The 3 arguments must be of the same dimensions, which will also be the dimension\n of the result. 
The 'if' tensor must be of type boolean, the 'then' and the\n@@ -1023,27 +1239,29 @@ resulting coefficient will come from the 'else' tensor.\n Tensor *contractions* are a generalization of the matrix product to the\n multidimensional case.\n \n- // Create 2 matrices using tensors of rank 2\n- Eigen::Tensor<int, 2> a(2, 3);\n- a.setValues({{1, 2, 3}, {6, 5, 4}});\n- Eigen::Tensor<int, 2> b(3, 2);\n- b.setValues({{1, 2}, {4, 5}, {5, 6}});\n+```cpp\n+// Create 2 matrices using tensors of rank 2\n+Eigen::Tensor<int, 2> a(2, 3);\n+a.setValues({{1, 2, 3}, {6, 5, 4}});\n+Eigen::Tensor<int, 2> b(3, 2);\n+b.setValues({{1, 2}, {4, 5}, {5, 6}});\n \n- // Compute the traditional matrix product\n- Eigen::array<Eigen::IndexPair<int>, 1> product_dims = { Eigen::IndexPair<int>(1, 0) };\n- Eigen::Tensor<int, 2> AB = a.contract(b, product_dims);\n+// Compute the traditional matrix product\n+Eigen::array<Eigen::IndexPair<int>, 1> product_dims = { Eigen::IndexPair<int>(1, 0) };\n+Eigen::Tensor<int, 2> AB = a.contract(b, product_dims);\n \n- // Compute the product of the transpose of the matrices\n- Eigen::array<Eigen::IndexPair<int>, 1> transposed_product_dims = { Eigen::IndexPair<int>(0, 1) };\n- Eigen::Tensor<int, 2> AtBt = a.contract(b, transposed_product_dims);\n+// Compute the product of the transpose of the matrices\n+Eigen::array<Eigen::IndexPair<int>, 1> transposed_product_dims = { Eigen::IndexPair<int>(0, 1) };\n+Eigen::Tensor<int, 2> AtBt = a.contract(b, transposed_product_dims);\n \n- // Contraction to scalar value using a double contraction.\n- // First coordinate of both tensors are contracted as well as both second coordinates, i.e., this computes the sum of the squares of the elements.\n- Eigen::array<Eigen::IndexPair<int>, 2> double_contraction_product_dims = { Eigen::IndexPair<int>(0, 0), Eigen::IndexPair<int>(1, 1) };\n- Eigen::Tensor<int, 0> AdoubleContractedA = a.contract(a, double_contraction_product_dims);\n+// Contraction to scalar value using a double 
contraction.\n+// First coordinate of both tensors are contracted as well as both second coordinates, i.e., this computes the sum of the squares of the elements.\n+Eigen::array<Eigen::IndexPair<int>, 2> double_contraction_product_dims = { Eigen::IndexPair<int>(0, 0), Eigen::IndexPair<int>(1, 1) };\n+Eigen::Tensor<int, 0> AdoubleContractedA = a.contract(a, double_contraction_product_dims);\n \n- // Extracting the scalar value of the tensor contraction for further usage\n- int value = AdoubleContractedA(0);\n+// Extracting the scalar value of the tensor contraction for further usage\n+int value = AdoubleContractedA(0);\n+```\n \n ## Reduction Operations\n \n@@ -1074,116 +1292,165 @@ results, but the code may execute faster if you list the dimensions in\n increasing order.\n \n Example: Reduction along one dimension.\n-\n- // Create a tensor of 2 dimensions\n- Eigen::Tensor<int, 2> a(2, 3);\n- a.setValues({{1, 2, 3}, {6, 5, 4}});\n- // Reduce it along the second dimension (1)...\n- Eigen::array<int, 1> dims({1 /* dimension to reduce */});\n- // ...using the \"maximum\" operator.\n- // The result is a tensor with one dimension. The size of\n- // that dimension is the same as the first (non-reduced) dimension of a.\n- Eigen::Tensor<int, 1> b = a.maximum(dims);\n- cout << \"a\" << endl << a << endl << endl;\n- cout << \"b\" << endl << b << endl << endl;\n- =>\n- a\n- 1 2 3\n- 6 5 4\n-\n- b\n- 3\n- 6\n-\n+```cpp\n+// Create a tensor of 2 dimensions\n+Eigen::Tensor<int, 2> a(2, 3);\n+a.setValues({{1, 2, 3}, {6, 5, 4}});\n+// Reduce it along the second dimension (1)...\n+Eigen::array<int, 1> dims({1 /* dimension to reduce */});\n+// ...using the \"maximum\" operator.\n+// The result is a tensor with one dimension. 
The size of\n+// that dimension is the same as the first (non-reduced) dimension of a.\n+Eigen::Tensor<int, 1> b = a.maximum(dims);\n+std::cout << \"a\" << endl << a << endl << endl;\n+std::cout << \"b\" << endl << b << endl << endl;\n+\n+// a\n+// 1 2 3\n+// 6 5 4\n+\n+// b\n+// 3\n+// 6\n+```\n Example: Reduction along two dimensions.\n-\n- Eigen::Tensor<float, 3, Eigen::ColMajor> a(2, 3, 4);\n- a.setValues({{{0.0f, 1.0f, 2.0f, 3.0f},\n- {7.0f, 6.0f, 5.0f, 4.0f},\n- {8.0f, 9.0f, 10.0f, 11.0f}},\n- {{12.0f, 13.0f, 14.0f, 15.0f},\n- {19.0f, 18.0f, 17.0f, 16.0f},\n- {20.0f, 21.0f, 22.0f, 23.0f}}});\n- // The tensor a has 3 dimensions. We reduce along the\n- // first 2, resulting in a tensor with a single dimension\n- // of size 4 (the last dimension of a.)\n- // Note that we pass the array of reduction dimensions\n- // directly to the maximum() call.\n- Eigen::Tensor<float, 1, Eigen::ColMajor> b =\n- a.maximum(Eigen::array<int, 2>({0, 1}));\n- cout << \"b\" << endl << b << endl << endl;\n- =>\n- b\n- 20\n- 21\n- 22\n- 23\n-\n+```cpp\n+Eigen::Tensor<float, 3, Eigen::ColMajor> a(2, 3, 4);\n+a.setValues({{{0.0f, 1.0f, 2.0f, 3.0f},\n+ {7.0f, 6.0f, 5.0f, 4.0f},\n+ {8.0f, 9.0f, 10.0f, 11.0f}},\n+ {{12.0f, 13.0f, 14.0f, 15.0f},\n+ {19.0f, 18.0f, 17.0f, 16.0f},\n+ {20.0f, 21.0f, 22.0f, 23.0f}}});\n+// The tensor a has 3 dimensions. We reduce along the\n+// first 2, resulting in a tensor with a single dimension\n+// of size 4 (the last dimension of a.)\n+// Note that we pass the array of reduction dimensions\n+// directly to the maximum() call.\n+Eigen::Tensor<float, 1, Eigen::ColMajor> b =\n+ a.maximum(Eigen::array<int, 2>({0, 1}));\n+std::cout << \"b\" << endl << b << endl << endl;\n+\n+// b\n+// 20\n+// 21\n+// 22\n+// 23\n+```\n #### Reduction along all dimensions\n \n As a special case, if you pass no parameter to a reduction operation the\n original tensor is reduced along *all* its dimensions. 
The result is a\n scalar, represented as a zero-dimension tensor.\n \n- Eigen::Tensor<float, 3> a(2, 3, 4);\n- a.setValues({{{0.0f, 1.0f, 2.0f, 3.0f},\n- {7.0f, 6.0f, 5.0f, 4.0f},\n- {8.0f, 9.0f, 10.0f, 11.0f}},\n- {{12.0f, 13.0f, 14.0f, 15.0f},\n- {19.0f, 18.0f, 17.0f, 16.0f},\n- {20.0f, 21.0f, 22.0f, 23.0f}}});\n- // Reduce along all dimensions using the sum() operator.\n- Eigen::Tensor<float, 0> b = a.sum();\n- cout << \"b\" << endl << b << endl << endl;\n- =>\n- b\n- 276\n-\n-\n-### (Operation) sum(const Dimensions& new_dims)\n+```cpp\n+Eigen::Tensor<float, 3> a(2, 3, 4);\n+a.setValues({{{0.0f, 1.0f, 2.0f, 3.0f},\n+ {7.0f, 6.0f, 5.0f, 4.0f},\n+ {8.0f, 9.0f, 10.0f, 11.0f}},\n+ {{12.0f, 13.0f, 14.0f, 15.0f},\n+ {19.0f, 18.0f, 17.0f, 16.0f},\n+ {20.0f, 21.0f, 22.0f, 23.0f}}});\n+// Reduce along all dimensions using the sum() operator.\n+Eigen::Tensor<float, 0> b = a.sum();\n+std::cout << \"b\\n\" << b;\n+\n+// b\n+// 276\n+```\n+You can extract the scalar directly by casting the expression and extracting the first and only coefficient:\n+```cpp\n+float sum = static_cast<Eigen::Tensor<float, 0>>(a.sum())();\n+```\n+\n+### (Operation) sum(const Dimensions& reduction_dims)\n ### (Operation) sum()\n \n-Reduce a tensor using the sum() operator. The resulting values\n+Reduce a tensor using the `sum()` operator. The resulting values\n are the sum of the reduced values.\n \n-### (Operation) mean(const Dimensions& new_dims)\n+### (Operation) mean(const Dimensions& reduction_dims)\n ### (Operation) mean()\n \n-Reduce a tensor using the mean() operator. The resulting values\n+Reduce a tensor using the `mean()` operator. The resulting values\n are the mean of the reduced values.\n \n-### (Operation) maximum(const Dimensions& new_dims)\n+### (Operation) maximum(const Dimensions& reduction_dims)\n ### (Operation) maximum()\n \n-Reduce a tensor using the maximum() operator. The resulting values are the\n+Reduce a tensor using the `maximum()` operator. 
The resulting values are the\n largest of the reduced values.\n \n-### (Operation) minimum(const Dimensions& new_dims)\n+### (Operation) minimum(const Dimensions& reduction_dims)\n ### (Operation) minimum()\n \n-Reduce a tensor using the minimum() operator. The resulting values\n+Reduce a tensor using the `minimum()` operator. The resulting values\n are the smallest of the reduced values.\n \n-### (Operation) prod(const Dimensions& new_dims)\n+### (Operation) prod(const Dimensions& reduction_dims)\n ### (Operation) prod()\n \n-Reduce a tensor using the prod() operator. The resulting values\n+Reduce a tensor using the `prod()` operator. The resulting values\n are the product of the reduced values.\n \n-### (Operation) all(const Dimensions& new_dims)\n+### (Operation) all(const Dimensions& reduction_dims)\n ### (Operation) all()\n-Reduce a tensor using the all() operator. Casts tensor to bool and then checks\n+Reduce a tensor using the `all()` operator. Casts tensor to bool and then checks\n whether all elements are true. Runs through all elements rather than\n short-circuiting, so may be significantly inefficient.\n \n-### (Operation) any(const Dimensions& new_dims)\n+### (Operation) any(const Dimensions& reduction_dims)\n ### (Operation) any()\n-Reduce a tensor using the any() operator. Casts tensor to bool and then checks\n+Reduce a tensor using the `any()` operator. Casts tensor to bool and then checks\n whether any element is true. 
Runs through all elements rather than\n short-circuiting, so may be significantly inefficient.\n \n \n-### (Operation) reduce(const Dimensions& new_dims, const Reducer& reducer)\n+### (Operation) argmax(const Dimensions& reduction_dim)\n+### (Operation) argmax()\n+\n+Reduce a tensor using the `argmax()` operator.\n+\n+The resulting values are the indices of the largest elements along the specified dimension.\n+\n+Only a single `reduction_dim` is supported.\n+\n+If multiple elements share the maximum value, the one with the **lowest index** is returned.\n+\n+```cpp\n+Eigen::Tensor<float, 2> a(2, 3);\n+a.setValues({{1, 4, 8}, {3, 4, 2}});\n+\n+Eigen::Tensor<Eigen::Index, 1> argmax_dim0 = a.argmax(0);\n+\n+std::cout << \"a:\\n\" << a << \"\\n\";\n+for (int i = 0; i < argmax_dim0.size(); ++i) {\n+ std::cout << \"argmax along dim 0 at index \" << i << \" = \" << argmax_dim0(i) << \"\\n\";\n+}\n+\n+// a:\n+// 1 4 8\n+// 3 4 2\n+// argmax along dim 0 at index 0 = 1\n+// argmax along dim 0 at index 1 = 0\n+// argmax along dim 0 at index 2 = 0\n+```\n+\n+To compute the index of the global maximum, use the overload without arguments (which flattens the tensor).\n+\n+\n+```cpp\n+Eigen::Tensor<Eigen::Index, 0> argmax_flat = a.argmax();\n+std::cout << \"Flat argmax index: \" << argmax_flat();\n+\n+// Flat argmax index: 4\n+```\n+\n+### (Operation) argmin(const Dimensions& reduction_dim)\n+### (Operation) argmin()\n+See [argmax](#operation-argmaxconst-dimensions-reduction_dim).\n+\n+### (Operation) reduce(const Dimensions& reduction_dims, const Reducer& reducer)\n \n Reduce a tensor using a user-defined reduction operator. 
See `SumReducer`\n in TensorFunctors.h for information on how to implement a reduction operator.\n@@ -1201,23 +1468,24 @@ the trace dimensions must have the same size.\n \n Example: Trace along 2 dimensions.\n \n- // Create a tensor of 3 dimensions\n- Eigen::Tensor<int, 3> a(2, 2, 3);\n- a.setValues({{{1, 2, 3}, {4, 5, 6}}, {{7, 8, 9}, {10, 11, 12}}});\n- // Specify the dimensions along which the trace will be computed.\n- // In this example, the trace can only be computed along the dimensions\n- // with indices 0 and 1\n- Eigen::array<int, 2> dims({0, 1});\n- // The output tensor contains all but the trace dimensions.\n- Tensor<int, 1> a_trace = a.trace(dims);\n- cout << \"a_trace:\" << endl;\n- cout << a_trace << endl;\n- =>\n- a_trace:\n- 11\n- 13\n- 15\n-\n+```cpp\n+// Create a tensor of 3 dimensions\n+Eigen::Tensor<int, 3> a(2, 2, 3);\n+a.setValues({{{1, 2, 3}, {4, 5, 6}}, {{7, 8, 9}, {10, 11, 12}}});\n+// Specify the dimensions along which the trace will be computed.\n+// In this example, the trace can only be computed along the dimensions\n+// with indices 0 and 1\n+Eigen::array<int, 2> dims({0, 1});\n+// The output tensor contains all but the trace dimensions.\n+Tensor<int, 1> a_trace = a.trace(dims);\n+std::cout << \"a_trace:\" << endl;\n+std::cout << a_trace << endl;\n+\n+// a_trace:\n+// 11\n+// 13\n+// 15\n+```\n \n ### (Operation) trace(const Dimensions& new_dims)\n ### (Operation) trace()\n@@ -1227,19 +1495,20 @@ along *all* dimensions of the input tensor.\n \n Example: Trace along all dimensions.\n \n- // Create a tensor of 3 dimensions, with all dimensions having the same size.\n- Eigen::Tensor<int, 3> a(3, 3, 3);\n- a.setValues({{{1, 2, 3}, {4, 5, 6}, {7, 8, 9}},\n- {{10, 11, 12}, {13, 14, 15}, {16, 17, 18}},\n- {{19, 20, 21}, {22, 23, 24}, {25, 26, 27}}});\n- // Result is a zero dimension tensor\n- Tensor<int, 0> a_trace = a.trace();\n- cout<<\"a_trace:\"<<endl;\n- cout<<a_trace<<endl;\n- =>\n- a_trace:\n- 42\n-\n+```cpp\n+// Create a tensor of 3 
dimensions, with all dimensions having the same size.\n+Eigen::Tensor<int, 3> a(3, 3, 3);\n+a.setValues({{{1, 2, 3}, {4, 5, 6}, {7, 8, 9}},\n+ {{10, 11, 12}, {13, 14, 15}, {16, 17, 18}},\n+ {{19, 20, 21}, {22, 23, 24}, {25, 26, 27}}});\n+// Result is a zero dimension tensor\n+Tensor<int, 0> a_trace = a.trace();\n+std::cout<<\"a_trace:\"<<endl;\n+std::cout<<a_trace<<endl;\n+\n+// a_trace:\n+// 42\n+```\n \n ## Scan Operations\n \n@@ -1251,24 +1520,26 @@ If the reduction operation corresponds to summation, then this computes the\n prefix sum of the tensor along the given axis.\n \n Example:\n-dd a comment to this line\n-\n- // Create a tensor of 2 dimensions\n- Eigen::Tensor<int, 2> a(2, 3);\n- a.setValues({{1, 2, 3}, {4, 5, 6}});\n- // Scan it along the second dimension (1) using summation\n- Eigen::Tensor<int, 2> b = a.cumsum(1);\n- // The result is a tensor with the same size as the input\n- cout << \"a\" << endl << a << endl << endl;\n- cout << \"b\" << endl << b << endl << endl;\n- =>\n- a\n- 1 2 3\n- 4 5 6\n-\n- b\n- 1 3 6\n- 4 9 15\n+Cumulative sum along the second dimension\n+\n+```cpp\n+// Create a tensor of 2 dimensions\n+Eigen::Tensor<int, 2> a(2, 3);\n+a.setValues({{1, 2, 3}, {4, 5, 6}});\n+// Scan it along the second dimension (1) using summation\n+Eigen::Tensor<int, 2> b = a.cumsum(1);\n+// The result is a tensor with the same size as the input\n+std::cout << \"a\" << endl << a << endl << endl;\n+std::cout << \"b\" << endl << b << endl << endl;\n+\n+// a\n+// 1 2 3\n+// 4 5 6\n+\n+// b\n+// 1 3 6\n+// 4 9 15\n+```\n \n ### (Operation) cumsum(const Index& axis)\n \n@@ -1278,7 +1549,6 @@ Perform a scan by summing consecutive entries.\n \n Perform a scan by multiplying consecutive entries.\n \n-\n ## Convolutions\n \n ### (Operation) convolve(const Kernel& kernel, const Dimensions& dims)\n@@ -1286,265 +1556,367 @@ Perform a scan by multiplying consecutive entries.\n Returns a tensor that is the output of the convolution of the input tensor with the 
kernel,\n along the specified dimensions of the input tensor. The dimension size for dimensions of the output tensor\n which were part of the convolution will be reduced by the formula:\n-output_dim_size = input_dim_size - kernel_dim_size + 1 (requires: input_dim_size >= kernel_dim_size).\n+```cpp\n+output_dim_size = input_dim_size - kernel_dim_size + 1 // (requires: input_dim_size >= kernel_dim_size).\n+```\n The dimension sizes for dimensions that were not part of the convolution will remain the same.\n Performance of the convolution can depend on the length of the stride(s) of the input tensor dimension(s) along which the\n-convolution is computed (the first dimension has the shortest stride for ColMajor, whereas RowMajor's shortest stride is\n+convolution is computed (the first dimension has the shortest stride for `ColMajor`, whereas `RowMajor`'s shortest stride is\n for the last dimension).\n \n- // Compute convolution along the second and third dimension.\n- Tensor<float, 4, DataLayout> input(3, 3, 7, 11);\n- Tensor<float, 2, DataLayout> kernel(2, 2);\n- Tensor<float, 4, DataLayout> output(3, 2, 6, 11);\n- input.setRandom();\n- kernel.setRandom();\n-\n- Eigen::array<ptrdiff_t, 2> dims({1, 2}); // Specify second and third dimension for convolution.\n- output = input.convolve(kernel, dims);\n-\n- for (int i = 0; i < 3; ++i) {\n- for (int j = 0; j < 2; ++j) {\n- for (int k = 0; k < 6; ++k) {\n- for (int l = 0; l < 11; ++l) {\n- const float result = output(i,j,k,l);\n- const float expected = input(i,j+0,k+0,l) * kernel(0,0) +\n- input(i,j+1,k+0,l) * kernel(1,0) +\n- input(i,j+0,k+1,l) * kernel(0,1) +\n- input(i,j+1,k+1,l) * kernel(1,1);\n- VERIFY_IS_APPROX(result, expected);\n- }\n- }\n+```cpp\n+// Compute convolution along the second and third dimension.\n+Tensor<float, 4, DataLayout> input(3, 3, 7, 11);\n+Tensor<float, 2, DataLayout> kernel(2, 2);\n+Tensor<float, 4, DataLayout> output(3, 2, 6, 
11);\n+input.setRandom();\n+kernel.setRandom();\n+\n+Eigen::array<ptrdiff_t, 2> dims({1, 2}); // Specify second and third dimension for convolution.\n+output = input.convolve(kernel, dims);\n+\n+for (int i = 0; i < 3; ++i) {\n+ for (int j = 0; j < 2; ++j) {\n+ for (int k = 0; k < 6; ++k) {\n+ for (int l = 0; l < 11; ++l) {\n+ const float result = output(i,j,k,l);\n+ const float expected = input(i,j+0,k+0,l) * kernel(0,0) +\n+ input(i,j+1,k+0,l) * kernel(1,0) +\n+ input(i,j+0,k+1,l) * kernel(0,1) +\n+ input(i,j+1,k+1,l) * kernel(1,1);\n+ VERIFY_IS_APPROX(result, expected);\n }\n }\n-\n+ }\n+}\n+```\n \n ## Geometrical Operations\n \n-These operations return a Tensor with different dimensions than the original\n-Tensor. They can be used to access slices of tensors, see them with different\n+These operations return a `Tensor` with different dimensions than the original\n+`Tensor`. They can be used to access slices of tensors, see them with different\n dimensions, or pad tensors with additional data.\n \n ### (Operation) reshape(const Dimensions& new_dims)\n \n Returns a view of the input tensor that has been reshaped to the specified\n-new dimensions. The argument new_dims is an array of Index values. 
The\n-rank of the resulting tensor is equal to the number of elements in new_dims.\n+new dimensions.\n+\n+The argument `new_dims` is an array of Index values.\n+\n+The rank of the resulting tensor is equal to the number of elements in `new_dims`.\n \n The product of all the sizes in the new dimension array must be equal to\n the number of elements in the input tensor.\n \n- // Increase the rank of the input tensor by introducing a new dimension\n- // of size 1.\n- Tensor<float, 2> input(7, 11);\n- array<int, 3> three_dims{{7, 11, 1}};\n- Tensor<float, 3> result = input.reshape(three_dims);\n+```cpp\n+// Increase the rank of the input tensor by introducing a new dimension\n+// of size 1.\n+Tensor<float, 2> input(7, 11);\n+array<int, 3> three_dims{{7, 11, 1}};\n+Tensor<float, 3> result = input.reshape(three_dims);\n \n- // Decrease the rank of the input tensor by merging 2 dimensions;\n- array<int, 1> one_dim{{7 * 11}};\n- Tensor<float, 1> result = input.reshape(one_dim);\n+// Decrease the rank of the input tensor by merging 2 dimensions;\n+array<int, 1> one_dim{{7 * 11}};\n+Tensor<float, 1> result = input.reshape(one_dim);\n+```\n \n This operation does not move any data in the input tensor, so the resulting\n-contents of a reshaped Tensor depend on the data layout of the original Tensor.\n+contents of a reshaped `Tensor` depend on the data layout of the original `Tensor`.\n \n-For example this is what happens when you `reshape()` a 2D ColMajor tensor\n+For example this is what happens when you `reshape()` a 2D `ColMajor` tensor\n to one dimension:\n \n- Eigen::Tensor<float, 2, Eigen::ColMajor> a(2, 3);\n- a.setValues({{0.0f, 100.0f, 200.0f}, {300.0f, 400.0f, 500.0f}});\n- Eigen::array<Eigen::DenseIndex, 1> one_dim({3 * 2});\n- Eigen::Tensor<float, 1, Eigen::ColMajor> b = a.reshape(one_dim);\n- cout << \"b\" << endl << b << endl;\n- =>\n- b\n- 0\n- 300\n- 100\n- 400\n- 200\n- 500\n-\n-This is what happens when the 2D Tensor is RowMajor:\n-\n- Eigen::Tensor<float, 2, 
Eigen::RowMajor> a(2, 3);\n- a.setValues({{0.0f, 100.0f, 200.0f}, {300.0f, 400.0f, 500.0f}});\n- Eigen::array<Eigen::DenseIndex, 1> one_dim({3 * 2});\n- Eigen::Tensor<float, 1, Eigen::RowMajor> b = a.reshape(one_dim);\n- cout << \"b\" << endl << b << endl;\n- =>\n- b\n- 0\n- 100\n- 200\n- 300\n- 400\n- 500\n+```cpp\n+Eigen::Tensor<float, 2, Eigen::ColMajor> a(2, 3);\n+a.setValues({{0.0f, 100.0f, 200.0f}, {300.0f, 400.0f, 500.0f}});\n+Eigen::array<Eigen::DenseIndex, 1> one_dim({3 * 2});\n+Eigen::Tensor<float, 1, Eigen::ColMajor> b = a.reshape(one_dim);\n+std::cout << \"b\" << endl << b << endl;\n+\n+// b\n+// 0\n+// 300\n+// 100\n+// 400\n+// 200\n+// 500\n+```\n+\n+This is what happens when the 2D `Tensor` is `RowMajor`:\n+\n+```cpp\n+Eigen::Tensor<float, 2, Eigen::RowMajor> a(2, 3);\n+a.setValues({{0.0f, 100.0f, 200.0f}, {300.0f, 400.0f, 500.0f}});\n+Eigen::array<Eigen::DenseIndex, 1> one_dim({3 * 2});\n+Eigen::Tensor<float, 1, Eigen::RowMajor> b = a.reshape(one_dim);\n+std::cout << \"b\" << endl << b << endl;\n+\n+// b\n+// 0\n+// 100\n+// 200\n+// 300\n+// 400\n+// 500\n+```\n \n The reshape operation is a lvalue. 
In other words, it can be used on the left\n side of the assignment operator.\n \n The previous example can be rewritten as follow:\n \n- Eigen::Tensor<float, 2, Eigen::ColMajor> a(2, 3);\n- a.setValues({{0.0f, 100.0f, 200.0f}, {300.0f, 400.0f, 500.0f}});\n- Eigen::array<Eigen::DenseIndex, 2> two_dim({2, 3});\n- Eigen::Tensor<float, 1, Eigen::ColMajor> b(6);\n- b.reshape(two_dim) = a;\n- cout << \"b\" << endl << b << endl;\n- =>\n- b\n- 0\n- 300\n- 100\n- 400\n- 200\n- 500\n+```cpp\n+Eigen::Tensor<float, 2, Eigen::ColMajor> a(2, 3);\n+a.setValues({{0.0f, 100.0f, 200.0f}, {300.0f, 400.0f, 500.0f}});\n+Eigen::array<Eigen::DenseIndex, 2> two_dim({2, 3});\n+Eigen::Tensor<float, 1, Eigen::ColMajor> b(6);\n+b.reshape(two_dim) = a;\n+std::cout << \"b\" << endl << b << endl;\n+\n+// b\n+// 0\n+// 300\n+// 100\n+// 400\n+// 200\n+// 500\n+```\n \n Note that \"b\" itself was not reshaped but that instead the assignment is done to\n the reshape view of b.\n \n-\n ### (Operation) shuffle(const Shuffle& shuffle)\n \n-Returns a copy of the input tensor whose dimensions have been\n-reordered according to the specified permutation. The argument shuffle\n-is an array of Index values. Its size is the rank of the input\n-tensor. It must contain a permutation of 0, 1, ..., rank - 1. The i-th\n-dimension of the output tensor equals to the size of the shuffle[i]-th\n-dimension of the input tensor. For example:\n+Returns a view of the input tensor whose dimensions have been\n+reordered according to the specified permutation.\n \n- // Shuffle all dimensions to the left by 1.\n- Tensor<float, 3> input(20, 30, 50);\n- // ... 
set some values in input.\n- Tensor<float, 3> output = input.shuffle({1, 2, 0})\n+The argument `shuffle` is an array of `Index` values:\n+* Its size is the rank of the input tensor.\n+* It must contain a permutation of `[0, 1, ..., rank - 1]`.\n+* The `i`-th dimension of the output tensor corresponds to the size of the dimension at position `shuffle[i]` in the input tensor. For example:\n \n- eigen_assert(output.dimension(0) == 30);\n- eigen_assert(output.dimension(1) == 50);\n- eigen_assert(output.dimension(2) == 20);\n+```cpp\n+// Shuffle all dimensions to the left by 1.\n+Tensor<float, 3> input(20, 30, 50);\n+// ... set some values in input.\n+Tensor<float, 3> output = input.shuffle({1, 2, 0});\n \n-Indices into the output tensor are shuffled accordingly to formulate\n-indices into the input tensor. For example, one can assert in the above\n-code snippet that:\n+eigen_assert(output.dimension(0) == 30);\n+eigen_assert(output.dimension(1) == 50);\n+eigen_assert(output.dimension(2) == 20);\n \n- eigen_assert(output(3, 7, 11) == input(11, 3, 7));\n+// Indices into the output tensor are shuffled accordingly to formulate\n+// indices into the input tensor.\n+eigen_assert(output(3, 7, 11) == input(11, 3, 7));\n \n-In general, one can assert that\n-\n- eigen_assert(output(..., indices[shuffle[i]], ...) ==\n- input(..., indices[i], ...))\n+// In general:\n+eigen_assert(output(..., indices[shuffle[i]], ...) ==\n+ input(..., indices[i], ...));\n+```\n \n The shuffle operation results in a lvalue, which means that it can be assigned\n to. In other words, it can be used on the left side of the assignment operator.\n \n Let's rewrite the previous example to take advantage of this feature:\n \n- // Shuffle all dimensions to the left by 1.\n- Tensor<float, 3> input(20, 30, 50);\n- // ... 
set some values in input.\n- Tensor<float, 3> output(30, 50, 20);\n- output.shuffle({2, 0, 1}) = input;\n-\n+```cpp\n+// Shuffle all dimensions to the left by 1.\n+Tensor<float, 3> input(20, 30, 50);\n+input.setRandom();\n+Tensor<float, 3> output(30, 50, 20);\n+output.shuffle({2, 0, 1}) = input;\n+```\n \n ### (Operation) stride(const Strides& strides)\n \n Returns a view of the input tensor that strides (skips stride-1\n-elements) along each of the dimensions. The argument strides is an\n-array of Index values. The dimensions of the resulting tensor are\n-ceil(input_dimensions[i] / strides[i]).\n+elements) along each of the dimensions.\n+\n+The argument `strides` is an array of `Index` values:\n+* Its size is the rank of the input tensor.\n+* Each stride value must be >= 1.\n+\n+The dimensions of the resulting tensor are `ceil(input_dimensions[i] / strides[i])`.\n \n For example this is what happens when you `stride()` a 2D tensor:\n \n- Eigen::Tensor<int, 2> a(4, 3);\n- a.setValues({{0, 100, 200}, {300, 400, 500}, {600, 700, 800}, {900, 1000, 1100}});\n- Eigen::array<Eigen::DenseIndex, 2> strides({3, 2});\n- Eigen::Tensor<int, 2> b = a.stride(strides);\n- cout << \"b\" << endl << b << endl;\n- =>\n- b\n- 0 200\n- 900 1100\n+```cpp\n+Eigen::Tensor<int, 2> a(4, 3);\n+a.setValues({{0, 100, 200},\n+ {300, 400, 500},\n+ {600, 700, 800},\n+ {900, 1000, 1100}});\n+Eigen::array<Eigen::DenseIndex, 2> strides({3, 2});\n+Eigen::Tensor<int, 2> b = a.stride(strides);\n+std::cout << \"b\" << endl << b << endl;\n+// b\n+// 0 200\n+// 900 1100\n+```\n \n It is possible to assign a tensor to a stride:\n- Tensor<float, 3> input(20, 30, 50);\n- // ... 
set some values in input.\n- Tensor<float, 3> output(40, 90, 200);\n- output.stride({2, 3, 4}) = input;\n-\n+```cpp\n+Tensor<float, 3> input(20, 30, 50);\n+input.setRandom();\n+Tensor<float, 3> output(40, 90, 200);\n+output.stride({2, 3, 4}) = input;\n+```\n \n ### (Operation) slice(const StartIndices& offsets, const Sizes& extents)\n \n Returns a sub-tensor of the given tensor. For each dimension i, the slice is\n-made of the coefficients stored between offset[i] and offset[i] + extents[i] in\n+made of the coefficients stored between `offset[i]` and `offset[i] + extents[i]` in\n the input tensor.\n \n- Eigen::Tensor<int, 2> a(4, 3);\n- a.setValues({{0, 100, 200}, {300, 400, 500},\n- {600, 700, 800}, {900, 1000, 1100}});\n- Eigen::array<Eigen::Index, 2> offsets = {1, 0};\n- Eigen::array<Eigen::Index, 2> extents = {2, 2};\n- Eigen::Tensor<int, 2> slice = a.slice(offsets, extents);\n- cout << \"a\" << endl << a << endl;\n- =>\n- a\n- 0 100 200\n- 300 400 500\n- 600 700 800\n- 900 1000 1100\n- cout << \"slice\" << endl << slice << endl;\n- =>\n- slice\n- 300 400\n- 600 700\n+```cpp\n+Eigen::Tensor<int, 2> a(4, 3);\n+a.setValues({{0, 100, 200}, {300, 400, 500},\n+ {600, 700, 800}, {900, 1000, 1100}});\n+Eigen::array<Eigen::Index, 2> offsets = {1, 0};\n+Eigen::array<Eigen::Index, 2> extents = {2, 2};\n+Eigen::Tensor<int, 2> slice = a.slice(offsets, extents);\n+std::cout << \"a\" << endl << a << endl;\n+// a\n+// 0 100 200\n+// 300 400 500\n+// 600 700 800\n+// 900 1000 1100\n+\n+std::cout << \"slice\" << endl << slice << endl;\n+// slice\n+// 300 400\n+// 600 700\n+```\n+\n+### (Operation) stridedSlice(const StartIndices& start, const StopIndices& stop, const Strides& strides)\n+\n+Returns a sub-tensor by selecting elements using `start`, `stop` (exclusive), and `strides` for each dimension.\n+\n+This is similar to slicing in Python using [start:stop:step].\n+\n+``` cpp\n+Eigen::Tensor<int, 2> a(4, 6);\n+a.setValues({{ 0, 10, 20, 30, 40, 50},\n+ {100, 110, 120, 130, 
140, 150},\n+ {200, 210, 220, 230, 240, 250},\n+ {300, 310, 320, 330, 340, 350}});\n+\n+Eigen::array<Eigen::Index, 2> start = {1, 1};\n+Eigen::array<Eigen::Index, 2> stop = {4, 6}; // Stop is exclusive\n+Eigen::array<Eigen::Index, 2> strides = {2, 2};\n+\n+Eigen::Tensor<int, 2> sub = a.stridedSlice(start, stop, strides);\n+\n+std::cout << \"a\\n\" << a << \"\\n\";\n+std::cout << \"sub\\n\" << sub << \"\\n\";\n+\n+// a\n+// 0 10 20 30 40 50\n+// 100 110 120 130 140 150\n+// 200 210 220 230 240 250\n+// 300 310 320 330 340 350\n+\n+// sub\n+// 110 130 150\n+// 310 330 350\n+```\n+It is also possible to assign to a strided slice:\n+\n+``` cpp\n+Eigen::Tensor<int, 2> b(sub.dimensions());\n+b.setConstant(-1);\n+a.stridedSlice(start, stop, strides) = b;\n+std::cout << \"modified a\\n\" << a << \"\\n\";\n+\n+\n+// modified a\n+// 0 10 20 30 40 50\n+// 100 -1 120 -1 140 -1\n+// 200 210 220 230 240 250\n+// 300 -1 320 -1 340 -1\n+\n+```\n+### (Operation) chip(const Index offset, const Index dim)\n+\n+A chip is a special kind of slice.\n+It is the subtensor at the given offset in the dimension `dim`.\n \n+The returned tensor has one fewer dimension than the input tensor: the dimension dim is removed.\n \n-### (Operation) chip(const Index offset, const Index dim)\n+For example, a matrix chip would be either a row or a column of the input matrix:\n \n-A chip is a special kind of slice. It is the subtensor at the given offset in\n-the dimension dim. 
The returned tensor has one fewer dimension than the input\n-tensor: the dimension dim is removed.\n+```cpp\n+Eigen::Tensor<int, 2> a(4, 3);\n+a.setValues({{0, 100, 200}, {300, 400, 500},\n+ {600, 700, 800}, {900, 1000, 1100}});\n+Eigen::Tensor<int, 1> row_3 = a.chip(2, 0);\n+Eigen::Tensor<int, 1> col_2 = a.chip(1, 1);\n+std::cout << \"a\\n\" << a << \"\\n\";\n \n-For example, a matrix chip would be either a row or a column of the input\n-matrix.\n+// a\n+// 0 100 200\n+// 300 400 500\n+// 600 700 800\n+// 900 1000 1100\n \n- Eigen::Tensor<int, 2> a(4, 3);\n- a.setValues({{0, 100, 200}, {300, 400, 500},\n- {600, 700, 800}, {900, 1000, 1100}});\n- Eigen::Tensor<int, 1> row_3 = a.chip(2, 0);\n- Eigen::Tensor<int, 1> col_2 = a.chip(1, 1);\n- cout << \"a\" << endl << a << endl;\n- =>\n- a\n- 0 100 200\n- 300 400 500\n- 600 700 800\n- 900 1000 1100\n- cout << \"row_3\" << endl << row_3 << endl;\n- =>\n- row_3\n- 600 700 800\n- cout << \"col_2\" << endl << col_2 << endl;\n- =>\n- col_2\n- 100 400 700 1000\n+std::cout << \"row_3\\n\" << row_3 << \"\\n\";\n+// row_3\n+// 600 700 800\n+\n+std::cout << \"col_2\\n\" << col_2 << \"\\n\";\n+// col_2\n+// 100 400 700 1000\n+```\n \n It is possible to assign values to a tensor chip since the chip operation is a\n lvalue. 
For example:\n \n- Eigen::Tensor<int, 1> a(3);\n- a.setValues({{100, 200, 300}});\n- Eigen::Tensor<int, 2> b(2, 3);\n- b.setZero();\n- b.chip(0, 0) = a;\n- cout << \"a\" << endl << a << endl;\n- =>\n- a\n- 100\n- 200\n- 300\n- cout << \"b\" << endl << b << endl;\n- =>\n- b\n- 100 200 300\n- 0 0 0\n+```cpp\n+Eigen::Tensor<int, 1> a(3);\n+a.setValues({{100, 200, 300}});\n+Eigen::Tensor<int, 2> b(2, 3);\n+b.setZero();\n+b.chip(0, 0) = a;\n+std::cout << \"a\\n\" << a << \"\\n\";\n+std::cout << \"b\\n\" << b << \"\\n\";\n+\n+// a\n+// 100\n+// 200\n+// 300\n+\n+// b\n+// 100 200 300\n+// 0 0 0\n+```\n+\n+The dimension can also be passed as a template parameter:\n+\n+```cpp\n+b.chip<0>(1) = a; // Equivalent to b.chip(1,0) = a;\n+```\n+\n+Note that only one dimension can be chipped at a time.\n+To chip off multiple dimensions, you can chain calls:\n+\n+```cpp\n+Eigen::Tensor<int, 3> a(2, 3, 4);\n+Eigen::Tensor<int, 1> b = a.chip<2>(0) // Now has shape [2,3]\n+ .chip<1>(0); // Now has shape [2]\n+```\n+\n+Be careful in which order you chip, as each operation affects the shape of the intermediate result.\n+For example:\n+\n+```cpp\n+// AVOID THIS\n+Eigen::Tensor<int, 1> c = a.chip<1>(0) // Now has shape [2,4]\n+ .chip<1>(0); // Now has shape [2]\n+```\n+\n+In general, it's more intuitive to chip from the outermost dimension first.\n \n \n ### (Operation) reverse(const ReverseDimensions& reverse)\n@@ -1558,24 +1930,59 @@ of the input tensor.\n For example this is what happens when you `reverse()` the first dimension\n of a 2D tensor:\n \n- Eigen::Tensor<int, 2> a(4, 3);\n- a.setValues({{0, 100, 200}, {300, 400, 500},\n- {600, 700, 800}, {900, 1000, 1100}});\n- Eigen::array<bool, 2> reverse({true, false});\n- Eigen::Tensor<int, 2> b = a.reverse(reverse);\n- cout << \"a\" << endl << a << endl << \"b\" << endl << b << endl;\n- =>\n- a\n- 0 100 200\n- 300 400 500\n- 600 700 800\n- 900 1000 1100\n- b\n- 900 1000 1100\n- 600 700 800\n- 300 400 500\n- 0 100 
200\n-\n+```cpp\n+Eigen::Tensor<int, 2> a(4, 3);\n+a.setValues({{0, 100, 200}, {300, 400, 500},\n+ {600, 700, 800}, {900, 1000, 1100}});\n+Eigen::array<bool, 2> reverse({true, false});\n+Eigen::Tensor<int, 2> b = a.reverse(reverse);\n+std::cout << \"a\\n\" << a << \"\\n\";\n+std::cout << \"b\\n\" << b << \"\\n\";\n+\n+// a\n+// 0 100 200\n+// 300 400 500\n+// 600 700 800\n+// 900 1000 1100\n+// b\n+// 900 1000 1100\n+// 600 700 800\n+// 300 400 500\n+// 0 100 200\n+```\n+\n+### (Operation) roll(const Rolls& shifts)\n+\n+Returns a tensor with the elements **circularly shifted** (like bit rotation) along one or more dimensions. \n+\n+For each dimension `i`, the content is shifted by `shifts[i]` positions:\n+\n+- A **positive shift** of `+s` moves each value to a **lower index** by `s`.\n+- A **negative shift** of `-s` moves each value to a **higher index** by `s`.\n+\n+```cpp\n+Eigen::Tensor<int, 2> a(3, 4);\n+a.setValues({{ 1, 2, 3, 4},\n+ { 5, 6, 7, 8},\n+ { 9, 10, 11, 12}});\n+\n+Eigen::array<Eigen::Index, 2> shifts = {1, -2};\n+\n+Eigen::Tensor<int, 2> rolled = a.roll(shifts);\n+\n+std::cout << \"a\\n\" << a << \"\\n\";\n+std::cout << \"rolled\\n\" << rolled << \"\\n\";\n+\n+// a\n+// 1 2 3 4\n+// 5 6 7 8\n+// 9 10 11 12\n+//\n+// rolled\n+// 7 8 5 6\n+// 11 12 9 10\n+// 3 4 1 2\n+```\n \n ### (Operation) broadcast(const Broadcast& broadcast)\n \n@@ -1584,97 +1991,152 @@ times.\n The broadcast argument specifies how many copies of the input tensor need to be\n made in each of the dimensions.\n \n- Eigen::Tensor<int, 2> a(2, 3);\n- a.setValues({{0, 100, 200}, {300, 400, 500}});\n- Eigen::array<int, 2> bcast({3, 2});\n- Eigen::Tensor<int, 2> b = a.broadcast(bcast);\n- cout << \"a\" << endl << a << endl << \"b\" << endl << b << endl;\n- =>\n- a\n- 0 100 200\n- 300 400 500\n- b\n- 0 100 200 0 100 200\n- 300 400 500 300 400 500\n- 0 100 200 0 100 200\n- 300 400 500 300 400 500\n- 0 100 200 0 100 200\n- 300 400 500 300 400 500\n+```cpp\n+Eigen::Tensor<int, 2> a(2, 
3);\n+a.setValues({{0, 100, 200}, {300, 400, 500}});\n+Eigen::array<int, 2> bcast({3, 2});\n+Eigen::Tensor<int, 2> b = a.broadcast(bcast);\n+std::cout << \"a\" << endl << a << endl << \"b\" << endl << b << endl;\n+// a\n+// 0 100 200\n+// 300 400 500\n+// b\n+// 0 100 200 0 100 200\n+// 300 400 500 300 400 500\n+// 0 100 200 0 100 200\n+// 300 400 500 300 400 500\n+// 0 100 200 0 100 200\n+// 300 400 500 300 400 500\n+```\n+\n+Note: Broadcasting does not increase rank.\n+To broadcast into higher dimensions, you must first reshape the tensor with singleton (1) dimensions:\n+\n+```cpp\n+Eigen::Tensor<int, 2> a(2, 3);\n+a.setValues({{0, 100, 200}, {300, 400, 500}});\n+\n+Eigen::array<Eigen::Index, 3> new_shape = {1, 2, 3}; //Reshape to [1, 2, 3]\n+Eigen::array<int, 3> bcast = {4, 1, 1}; // Broadcast to [4, 2, 3]\n+Eigen::Tensor<int, 3> b = a.reshape(new_shape).broadcast(bcast);\n+\n+std::cout << \"b dimensions: \" << b.dimensions() << \"\\n\";\n+std::cout << b << \"\\n\";\n+```\n \n ### (Operation) concatenate(const OtherDerived& other, Axis axis)\n \n-TODO\n+Returns a view of two tensors joined along a specified axis.\n+The dimensions of the two tensors must match on all axes except the concatenation axis.\n+The resulting tensor has the same rank as the inputs.\n+\n+```cpp\n+Eigen::Tensor<int, 2> a(2, 3);\n+a.setValues({{0, 100, 200}, {300, 400, 500}});\n+\n+Eigen::Tensor<int, 2> b(2, 3);\n+b.setValues({{-1, -2, -3}, {-4, -5, -6}});\n+\n+// Concatenate along dimension 0: resulting shape is [4, 3]\n+Eigen::Tensor<int, 2> c = a.concatenate(b, 0);\n+\n+// Concatenate along dimension 1: resulting shape is [2, 6]\n+Eigen::Tensor<int, 2> d = a.concatenate(b, 1);\n+\n+std::cout << \"a\\n\" << a << \"\\n\"\n+ << \"b\\n\" << b << \"\\n\"\n+ << \"c (concatenated along dim 0)\\n\" << c << \"\\n\"\n+ << \"d (concatenated along dim 1)\\n\" << d << \"\\n\";\n+// a\n+// 0 100 200\n+// 300 400 500\n+// b\n+// -1 -2 -3\n+// -4 -5 -6\n+// c (concatenated along dim 0)\n+// 0 100 
200\n+// 300 400 500\n+// -1 -2 -3\n+// -4 -5 -6\n+// d (concatenated along dim 1)\n+// 0 100 200 -1 -2 -3\n+// 300 400 500 -4 -5 -6\n+```\n \n ### (Operation) pad(const PaddingDimensions& padding)\n \n Returns a view of the input tensor in which the input is padded with zeros.\n \n- Eigen::Tensor<int, 2> a(2, 3);\n- a.setValues({{0, 100, 200}, {300, 400, 500}});\n- Eigen::array<pair<int, int>, 2> paddings;\n- paddings[0] = make_pair(0, 1);\n- paddings[1] = make_pair(2, 3);\n- Eigen::Tensor<int, 2> b = a.pad(paddings);\n- cout << \"a\" << endl << a << endl << \"b\" << endl << b << endl;\n- =>\n- a\n- 0 100 200\n- 300 400 500\n- b\n- 0 0 0 0\n- 0 0 0 0\n- 0 100 200 0\n- 300 400 500 0\n- 0 0 0 0\n- 0 0 0 0\n- 0 0 0 0\n-\n+```cpp\n+Eigen::Tensor<int, 2> a(2, 3);\n+a.setValues({{0, 100, 200}, {300, 400, 500}});\n+Eigen::array<pair<int, int>, 2> paddings;\n+paddings[0] = make_pair(0, 1);\n+paddings[1] = make_pair(2, 3);\n+Eigen::Tensor<int, 2> b = a.pad(paddings);\n+std::cout << \"a\" << endl << a << endl << \"b\" << endl << b << endl;\n+// a\n+// 0 100 200\n+// 300 400 500\n+// b\n+// 0 0 0 0\n+// 0 0 0 0\n+// 0 100 200 0\n+// 300 400 500 0\n+// 0 0 0 0\n+// 0 0 0 0\n+// 0 0 0 0\n+```\n \n ### (Operation) extract_patches(const PatchDims& patch_dims)\n \n Returns a tensor of coefficient patches extracted from the input tensor, where\n-each patch is of dimension specified by 'patch_dims'. The returned tensor has\n+each patch is of dimension specified by `patch_dims`. 
The returned tensor has\n one greater dimension than the input tensor, which is used to index each patch.\n The patch index in the output tensor depends on the data layout of the input\n-tensor: the patch index is the last dimension ColMajor layout, and the first\n-dimension in RowMajor layout.\n+tensor: the patch index is the last dimension in `ColMajor` layout, and the first\n+dimension in `RowMajor` layout.\n \n For example, given the following input tensor:\n \n- Eigen::Tensor<float, 2, DataLayout> tensor(3,4);\n- tensor.setValues({{0.0f, 1.0f, 2.0f, 3.0f},\n- {4.0f, 5.0f, 6.0f, 7.0f},\n- {8.0f, 9.0f, 10.0f, 11.0f}});\n+```cpp\n+Eigen::Tensor<float, 2, DataLayout> tensor(3,4);\n+tensor.setValues({{0.0f, 1.0f, 2.0f, 3.0f},\n+ {4.0f, 5.0f, 6.0f, 7.0f},\n+ {8.0f, 9.0f, 10.0f, 11.0f}});\n+\n+std::cout << \"tensor: \" << endl << tensor << endl;\n \n- cout << \"tensor: \" << endl << tensor << endl;\n- =>\n- tensor:\n- 0 1 2 3\n- 4 5 6 7\n- 8 9 10 11\n+// tensor:\n+// 0 1 2 3\n+// 4 5 6 7\n+// 8 9 10 11\n+```\n \n Six 2x2 patches can be extracted and indexed using the following code:\n \n- Eigen::Tensor<float, 3, DataLayout> patch;\n- Eigen::array<ptrdiff_t, 2> patch_dims;\n- patch_dims[0] = 2;\n- patch_dims[1] = 2;\n- patch = tensor.extract_patches(patch_dims);\n- for (int k = 0; k < 6; ++k) {\n- cout << \"patch index: \" << k << endl;\n- for (int i = 0; i < 2; ++i) {\n- \tfor (int j = 0; j < 2; ++j) {\n- \t if (DataLayout == ColMajor) {\n- \t\tcout << patch(i, j, k) << \" \";\n- \t } else {\n- \t\tcout << patch(k, i, j) << \" \";\n- \t }\n- \t}\n- \tcout << endl;\n+```cpp\n+Eigen::Tensor<float, 3, DataLayout> patch;\n+Eigen::array<ptrdiff_t, 2> patch_dims;\n+patch_dims[0] = 2;\n+patch_dims[1] = 2;\n+patch = tensor.extract_patches(patch_dims);\n+for (int k = 0; k < 6; ++k) {\n+ std::cout << \"patch index: \" << k << endl;\n+ for (int i = 0; i < 2; ++i) {\n+ for (int j = 0; j < 2; ++j) {\n+ if (DataLayout == ColMajor) {\n+ std::cout << patch(i, j, k) << \" \";\n+ } else 
{\n+ std::cout << patch(k, i, j) << \" \";\n }\n }\n+ std::cout << endl;\n+ }\n+}\n+```\n \n-This code results in the following output when the data layout is ColMajor:\n+This code results in the following output when the data layout is `ColMajor`:\n \n patch index: 0\n 0 1\n@@ -1696,7 +2158,8 @@ This code results in the following output when the data layout is ColMajor:\n 10 11\n \n This code results in the following output when the data layout is RowMajor:\n-(NOTE: the set of patches is the same as in ColMajor, but are indexed differently).\n+\n+**NOTE**: the set of patches is the same as in `ColMajor`, but they are indexed differently.\n \n patch index: 0\n 0 1\n@@ -1723,119 +2186,141 @@ Returns a tensor of coefficient image patches extracted from the input tensor,\n which is expected to have dimensions ordered as follows (depending on the data\n layout of the input tensor, and the number of additional dimensions 'N'):\n \n-*) ColMajor\n-1st dimension: channels (of size d)\n-2nd dimension: rows (of size r)\n-3rd dimension: columns (of size c)\n-4th-Nth dimension: time (for video) or batch (for bulk processing).\n+- `ColMajor`\n+ - 1st dimension: channels (of size d)\n+ - 2nd dimension: rows (of size r)\n+ - 3rd dimension: columns (of size c)\n+ - 4th-Nth dimension: time (for video) or batch (for bulk processing).\n \n-*) RowMajor (reverse order of ColMajor)\n-1st-Nth dimension: time (for video) or batch (for bulk processing).\n-N+1'th dimension: columns (of size c)\n-N+2'th dimension: rows (of size r)\n-N+3'th dimension: channels (of size d)\n+- `RowMajor` (reverse order of `ColMajor`)\n+ - 1st-Nth dimension: time (for video) or batch (for bulk processing).\n+ - N+1'th dimension: columns (of size c)\n+ - N+2'th dimension: rows (of size r)\n+ - N+3'th dimension: channels (of size d)\n \n The returned tensor has one greater dimension than the input tensor, which is\n used to index each patch. 
The patch index in the output tensor depends on the\n data layout of the input tensor: the patch index is the 4'th dimension in\n-ColMajor layout, and the 4'th from the last dimension in RowMajor layout.\n+`ColMajor` layout, and the 4'th from the last dimension in `RowMajor` layout.\n \n For example, given the following input tensor with the following dimension\n sizes:\n- *) depth: 2\n- *) rows: 3\n- *) columns: 5\n- *) batch: 7\n+- depth: 2\n+- rows: 3\n+- columns: 5\n+- batch: 7\n \n- Tensor<float, 4> tensor(2,3,5,7);\n- Tensor<float, 4, RowMajor> tensor_row_major = tensor.swap_layout();\n+```cpp\n+Tensor<float, 4> tensor(2,3,5,7);\n+Tensor<float, 4, RowMajor> tensor_row_major = tensor.swap_layout();\n+```\n \n 2x2 image patches can be extracted and indexed using the following code:\n \n-*) 2D patch: ColMajor (patch indexed by second-to-last dimension)\n-\n- Tensor<float, 5> twod_patch;\n- twod_patch = tensor.extract_image_patches<2, 2>();\n- // twod_patch.dimension(0) == 2\n- // twod_patch.dimension(1) == 2\n- // twod_patch.dimension(2) == 2\n- // twod_patch.dimension(3) == 3*5\n- // twod_patch.dimension(4) == 7\n-\n-*) 2D patch: RowMajor (patch indexed by the second dimension)\n-\n- Tensor<float, 5, RowMajor> twod_patch_row_major;\n- twod_patch_row_major = tensor_row_major.extract_image_patches<2, 2>();\n- // twod_patch_row_major.dimension(0) == 7\n- // twod_patch_row_major.dimension(1) == 3*5\n- // twod_patch_row_major.dimension(2) == 2\n- // twod_patch_row_major.dimension(3) == 2\n- // twod_patch_row_major.dimension(4) == 2\n+#### 2D patch: `ColMajor` (patch indexed by second-to-last dimension)\n+\n+```cpp\n+Tensor<float, 5> twod_patch;\n+twod_patch = tensor.extract_image_patches<2, 2>();\n+// twod_patch.dimension(0) == 2\n+// twod_patch.dimension(1) == 2\n+// twod_patch.dimension(2) == 2\n+// twod_patch.dimension(3) == 3*5\n+// twod_patch.dimension(4) == 7\n+```\n+\n+#### 2D patch: `RowMajor` (patch indexed by the second 
dimension)\n+\n+```cpp\n+Tensor<float, 5, RowMajor> twod_patch_row_major;\n+twod_patch_row_major = tensor_row_major.extract_image_patches<2, 2>();\n+// twod_patch_row_major.dimension(0) == 7\n+// twod_patch_row_major.dimension(1) == 3*5\n+// twod_patch_row_major.dimension(2) == 2\n+// twod_patch_row_major.dimension(3) == 2\n+// twod_patch_row_major.dimension(4) == 2\n+```\n \n ## Special Operations\n \n ### (Operation) cast<T>()\n \n-Returns a tensor of type T with the same dimensions as the original tensor.\n+Returns a tensor of type `T` with the same dimensions as the original tensor.\n The returned tensor contains the values of the original tensor converted to\n-type T.\n+type `T`.\n \n- Eigen::Tensor<float, 2> a(2, 3);\n- Eigen::Tensor<int, 2> b = a.cast<int>();\n+```cpp\n+Eigen::Tensor<float, 2> a(2, 3);\n+Eigen::Tensor<int, 2> b = a.cast<int>();\n+```\n \n This can be useful for example if you need to do element-wise division of\n-Tensors of integers. This is not currently supported by the Tensor library\n+Tensors of integers.\n+This is not currently supported by the Tensor library\n but you can easily cast the tensors to floats to do the division:\n \n- Eigen::Tensor<int, 2> a(2, 3);\n- a.setValues({{0, 1, 2}, {3, 4, 5}});\n- Eigen::Tensor<int, 2> b =\n- (a.cast<float>() / a.constant(2).cast<float>()).cast<int>();\n- cout << \"a\" << endl << a << endl << endl;\n- cout << \"b\" << endl << b << endl << endl;\n- =>\n- a\n- 0 1 2\n- 3 4 5\n-\n- b\n- 0 0 1\n- 1 2 2\n-\n+```cpp\n+Eigen::Tensor<int, 2> a(2, 3);\n+a.setValues({{0, 1, 2}, {3, 4, 5}});\n+Eigen::Tensor<int, 2> b =\n+ (a.cast<float>() / a.constant(2).cast<float>()).cast<int>();\n+std::cout << \"a\\n\" << a << \"\\n\";\n+std::cout << \"b\\n\" << b << \"\\n\";\n+\n+// a\n+// 0 1 2\n+// 3 4 5\n+//\n+// b\n+// 0 0 1\n+// 1 2 2\n+```\n \n ### (Operation) eval()\n+See [Calling eval()](#calling-eval)\n+\n \n-TODO\n \n ## Tensor Printing\n Tensors can be printed into a stream object (e.g. 
`std::cout`) using different formatting options.\n \n-\tEigen::Tensor<float, 3> tensor3d = {4, 3, 2};\n-\ttensor3d.setValues( {{{1, 2}, {3, 4}, {5, 6}}, {{7, 8}, {9, 10}, {11, 12}}, {{13, 14}, {15, 16}, {17, 18}}, {{19, 20}, {21, 22}, {23, 24}}} );\n-\tstd::cout << tensor3d.format(Eigen::TensorIOFormat::Plain()) << std::endl;\n-\t==>\n-\t 1 2\n-\t 3 4\n-\t 5 6\n-\n-\t 7 8\n-\t 9 10\n-\t11 12\n-\n-\t13 14\n-\t15 16\n-\t17 18\n-\n-\t19 20\n-\t21 22\n-\t23 24\n-\n+```cpp\n+Eigen::Tensor<float, 3> tensor3d = {4, 3, 2};\n+tensor3d.setValues( {{{1, 2},\n+ {3, 4},\n+ {5, 6}},\n+ {{7, 8},\n+ {9, 10},\n+ {11, 12}},\n+ {{13, 14},\n+ {15, 16},\n+ {17, 18}},\n+ {{19, 20},\n+ {21, 22},\n+ {23, 24}}} );\n+std::cout << tensor3d.format(Eigen::TensorIOFormat::Plain()) << ;\n+// 1 2\n+// 3 4\n+// 5 6\n+//\n+// 7 8\n+// 9 10\n+// 11 12\n+//\n+// 13 14\n+// 15 16\n+// 17 18\n+//\n+// 19 20\n+// 21 22\n+// 23 24\n+```\n \n In the example, we used the predefined format `Eigen::TensorIOFormat::Plain`.\n Here is the list of all predefined formats from which you can choose:\n - `Eigen::TensorIOFormat::Plain()` for a plain output without braces. Different submatrices are separated by a blank line.\n - `Eigen::TensorIOFormat::Numpy()` for numpy-like output.\n-- `Eigen::TensorIOFormat::Native()` for a `c++` like output which can be directly copy-pasted to setValues().\n+- `Eigen::TensorIOFormat::Native()` for a `c++` like output which can be directly copy-pasted to `setValues()`.\n - `Eigen::TensorIOFormat::Legacy()` for a backwards compatible printing of tensors.\n \n If you send the tensor directly to the stream the default format is called which is `Eigen::IOFormats::Plain()`.\n@@ -1849,14 +2334,19 @@ You can define your own format by explicitly providing a `Eigen::TensorIOFormat`\n \n ## Representation of scalar values\n \n-Scalar values are often represented by tensors of size 1 and rank 0.For example\n-Tensor<T, N>::maximum() currently returns a Tensor<T, 0>. 
Similarly, the inner\n-product of 2 1d tensors (through contractions) returns a 0d tensor.\n+Scalar values are often represented by tensors of size 1 and rank 0.\n+\n+For example `Tensor<T, N>::maximum()` returns a `Tensor<T, 0>`.\n+\n+Similarly, the inner product of 2 1d tensors (through contractions) returns a 0d tensor.\n+\n+The scalar value can be extracted as explained in [Reduction along all dimensions](#reduction-along-all-dimensions).\n+\n \n ## Limitations\n \n * The number of tensor dimensions is currently limited to 250 when using a\n compiler that supports cxx11. It is limited to only 5 for older compilers.\n-* The IndexList class requires a cxx11 compliant compiler. You can use an\n+* The `IndexList` class requires a cxx11 compliant compiler. You can use an\n array of indices instead if you don't have access to a modern compiler.\n-* On GPUs only floating point values are properly tested and optimized for.\n+* On GPUs only floating point values are properly tested and optimized for.\n\\ No newline at end of file\n","new_path":"unsupported/Eigen/CXX11/src/Tensor/README.md","old_path":"unsupported/Eigen/CXX11/src/Tensor/README.md","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false}],"changes_count":"1","labels":["5.0"],"state":"merged","summary":"## Title:\ntensor documentation\n\n## Author:\nHenric Ryden (henric.ryden)\n\n## Summary\n### Key Changes:\n- Added documentation to the tensor readme file\n\n### Improvements:\n- Added documentation to the tensor readme file\n\n### Impact:\n- Improved documentation for the tensor feature in the Eigen library"}
{"iid":1915,"title":"Decommission aarch64 ampere runner.","web_url":"https://gitlab.com/libeigen/eigen/-/merge_requests/1915","description":"Unfortunately onARM is decommissioning their fleet, so switching\narm builds/tests to GitLab runners. We'll need to monitor our usage.","created_at":"2025-06-20T16:32:31.799Z","merged_at":"2025-06-20T20:33:53.060Z","author":{"name":"Antonio Sánchez","username":"cantonios"},"changes":[{"diff":"@@ -268,6 +268,8 @@ build:linux:cross:arm:clang-12:default:\n EIGEN_CI_TARGET_ARCH: aarch64\n EIGEN_CI_CROSS_TARGET_TRIPLE: aarch64-linux-gnu\n EIGEN_CI_ADDITIONAL_ARGS: -DEIGEN_TEST_CUSTOM_CXX_FLAGS=-march=armv8.2-a+fp16\n+ tags:\n+ - saas-linux-large-arm64\n \n build:linux:cross:aarch64:gcc-10:default:\n extends: .build:linux:cross:aarch64\n@@ -320,10 +322,6 @@ build:linux:cross:ppc64le:clang-12:default:\n variables:\n EIGEN_CI_TARGET_ARCH: loongarch64\n EIGEN_CI_CROSS_TARGET_TRIPLE: loongarch64-linux-gnu\n- tags:\n- - eigen-runner\n- - linux\n- - cross-compiler\n \n # GCC-14 (minimum on Ubuntu 24)\n build:linux:cross:loongarch64:gcc-14:default:\n","new_path":"ci/build.linux.gitlab-ci.yml","old_path":"ci/build.linux.gitlab-ci.yml","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false},{"diff":"@@ -292,9 +292,7 @@ test:linux:cuda-12.2:clang-12:\n # Enable cross-compiled arm binary to run on aarch64.\n EIGEN_CI_BEFORE_SCRIPT: \"ln -s /usr/arm-linux-gnueabihf/lib/ld-linux-armhf.so.3 /lib/ && export LD_LIBRARY_PATH=/usr/arm-linux-gnueabihf/lib/\"\n tags:\n- - eigen-runner\n- - linux\n- - aarch64\n+ - saas-linux-large-arm64\n \n .test:linux:arm:gcc-10:default:\n extends: .test:linux:arm\n@@ -336,9 +334,7 @@ test:linux:arm:clang-12:default:unsupported:\n EIGEN_CI_TARGET_ARCH: aarch64\n EIGEN_CI_CROSS_TARGET_TRIPLE: aarch64-linux-gnu\n tags:\n- - eigen-runner\n- - linux\n- - aarch64\n+ - saas-linux-large-arm64\n \n .test:linux:aarch64:gcc-10:default:\n extends: 
.test:linux:aarch64\n","new_path":"ci/test.linux.gitlab-ci.yml","old_path":"ci/test.linux.gitlab-ci.yml","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false}],"changes_count":"2","labels":["5.0"],"state":"merged","summary":"## Title:\nDecommission aarch64 ampere runner.\n\n## Author:\nAntonio Sánchez (cantonios)\n\n## Summary\n### Key Changes:\n- Modified `ci/build.linux.gitlab-ci.yml` and `ci/test.linux.gitlab-ci.yml` to switch to GitLab runners for ARM architecture.\n\n### Improvements:\n- No improvements reported.\n\n### Impact:\n- Switched to GitLab runners for ARM builds to accommodate decommissioning of AArch64 ampere runner."}
{"iid":1914,"title":"Provide macro to explicitly disable alloca","web_url":"https://gitlab.com/libeigen/eigen/-/merge_requests/1914","description":"### Reference issue\n\nFixes #2939 \n### What does this implement/fix?\n\n\n\n### Additional information","created_at":"2025-06-19T02:55:57.959Z","merged_at":"2025-06-19T04:23:36.509Z","author":{"name":"Charles Schlosser","username":"chuckyschluz"},"changes":[{"diff":"@@ -762,7 +762,7 @@ void swap(scoped_array<T>& a, scoped_array<T>& b) {\n * This is accomplished through alloca if this later is supported and if the required number of bytes\n * is below EIGEN_STACK_ALLOCATION_LIMIT.\n */\n-#ifdef EIGEN_ALLOCA\n+#if defined(EIGEN_ALLOCA) && !defined(EIGEN_NO_ALLOCA)\n \n #if EIGEN_DEFAULT_ALIGN_BYTES > 0\n // We always manually re-align the result of EIGEN_ALLOCA.\n","new_path":"Eigen/src/Core/util/Memory.h","old_path":"Eigen/src/Core/util/Memory.h","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false}],"changes_count":"1","labels":["5.0"],"state":"merged","summary":"## Title:\nProvide macro to explicitly disable alloca\n\n## Author:\nCharles Schlosser (chuckyschluz)\n\n## Summary\n### Key Changes:\n- Added a macro to explicitly disable `alloca` in the Eigen library.\n\n### Improvements:\n- Added a macro `EIGEN_DISABLE_ALLOCA` to control the use of `alloca` in Eigen code.\n\n### Impact:\n- The macro allows users to disable `alloca` if it is not desired, potentially improving portability and avoiding issues related to stack overflow or undefined behavior."}
{"iid":1912,"title":"Fix unprotected SIZE in macro.","web_url":"https://gitlab.com/libeigen/eigen/-/merge_requests/1912","description":"Gotta love macros. Because of the absense of brackets around\nan expression like `mat.cols() + 1` in a macro call,\n```\nei_declare_aligned_stack_constructed_variable(int, ptr, 10 + 1, nullptr)\n```\nends up calling\n```\nEigen::internal::aligned_alloc(sizeof(int) * 10 + 1); // 41\n```\ninstead of\n```\nEigen::internal::aligned_alloc(sizeof(int) * (10 + 1)); // 44\n```\nleading to a buffer that is too small. Protecting the `SIZE`\nargument with brackets in `ei_declare_aligned_stack_constructed_variable`\nfixes this.\n\nFixes #2941.","created_at":"2025-06-16T22:34:43.867Z","merged_at":"2025-06-16T22:54:25.818Z","author":{"name":"Antonio Sánchez","username":"cantonios"},"changes":[{"diff":"@@ -785,14 +785,14 @@ EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE void* eigen_aligned_alloca_helper(void* pt\n #define EIGEN_ALIGNED_ALLOCA(SIZE) EIGEN_ALLOCA(SIZE)\n #endif\n \n-#define ei_declare_aligned_stack_constructed_variable(TYPE, NAME, SIZE, BUFFER) \\\n- Eigen::internal::check_size_for_overflow<TYPE>(SIZE); \\\n- TYPE* NAME = (BUFFER) != 0 ? (BUFFER) \\\n- : reinterpret_cast<TYPE*>((sizeof(TYPE) * SIZE <= EIGEN_STACK_ALLOCATION_LIMIT) \\\n- ? EIGEN_ALIGNED_ALLOCA(sizeof(TYPE) * SIZE) \\\n- : Eigen::internal::aligned_malloc(sizeof(TYPE) * SIZE)); \\\n- Eigen::internal::aligned_stack_memory_handler<TYPE> EIGEN_CAT(NAME, _stack_memory_destructor)( \\\n- (BUFFER) == 0 ? NAME : 0, SIZE, sizeof(TYPE) * SIZE > EIGEN_STACK_ALLOCATION_LIMIT)\n+#define ei_declare_aligned_stack_constructed_variable(TYPE, NAME, SIZE, BUFFER) \\\n+ Eigen::internal::check_size_for_overflow<TYPE>(SIZE); \\\n+ TYPE* NAME = (BUFFER) != 0 ? (BUFFER) \\\n+ : reinterpret_cast<TYPE*>((sizeof(TYPE) * (SIZE) <= EIGEN_STACK_ALLOCATION_LIMIT) \\\n+ ? 
EIGEN_ALIGNED_ALLOCA(sizeof(TYPE) * (SIZE)) \\\n+ : Eigen::internal::aligned_malloc(sizeof(TYPE) * (SIZE))); \\\n+ Eigen::internal::aligned_stack_memory_handler<TYPE> EIGEN_CAT(NAME, _stack_memory_destructor)( \\\n+ (BUFFER) == 0 ? NAME : 0, SIZE, sizeof(TYPE) * (SIZE) > EIGEN_STACK_ALLOCATION_LIMIT)\n \n #define ei_declare_local_nested_eval(XPR_T, XPR, N, NAME) \\\n Eigen::internal::local_nested_eval_wrapper<XPR_T, N> EIGEN_CAT(NAME, _wrapper)( \\\n@@ -805,10 +805,11 @@ EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE void* eigen_aligned_alloca_helper(void* pt\n \n #else\n \n-#define ei_declare_aligned_stack_constructed_variable(TYPE, NAME, SIZE, BUFFER) \\\n- Eigen::internal::check_size_for_overflow<TYPE>(SIZE); \\\n- TYPE* NAME = (BUFFER) != 0 ? BUFFER : reinterpret_cast<TYPE*>(Eigen::internal::aligned_malloc(sizeof(TYPE) * SIZE)); \\\n- Eigen::internal::aligned_stack_memory_handler<TYPE> EIGEN_CAT(NAME, _stack_memory_destructor)( \\\n+#define ei_declare_aligned_stack_constructed_variable(TYPE, NAME, SIZE, BUFFER) \\\n+ Eigen::internal::check_size_for_overflow<TYPE>(SIZE); \\\n+ TYPE* NAME = \\\n+ (BUFFER) != 0 ? BUFFER : reinterpret_cast<TYPE*>(Eigen::internal::aligned_malloc(sizeof(TYPE) * (SIZE))); \\\n+ Eigen::internal::aligned_stack_memory_handler<TYPE> EIGEN_CAT(NAME, _stack_memory_destructor)( \\\n (BUFFER) == 0 ? 
NAME : 0, SIZE, true)\n \n #define ei_declare_local_nested_eval(XPR_T, XPR, N, NAME) \\\n","new_path":"Eigen/src/Core/util/Memory.h","old_path":"Eigen/src/Core/util/Memory.h","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false}],"changes_count":"1","labels":["5.0"],"state":"merged","summary":"## Title:\nFix unprotected SIZE in macro.\n\n## Author:\nAntonio Sánchez (cantonios)\n\n## Summary\n### Key Changes:\n- Modified `Eigen/src/Core/util/Memory.h` to protect the `SIZE` argument in the macro `ei_declare_aligned_stack_constructed_variable`.\n\n### Improvements:\n- Fixed a potential buffer overflow issue by ensuring the `SIZE` argument is properly protected in macro expansions.\n\n### Impact:\n- Prevents incorrect allocation of memory sizes in macro calls, ensuring safer and more reliable memory management."}
{"iid":1911,"title":"Remove MSVC warnings in FindCoeff.h","web_url":"https://gitlab.com/libeigen/eigen/-/merge_requests/1911","description":"Fix `warning C4305: 'initializing': truncation from 'unsigned int' to 'const bool'` in MSVC","created_at":"2025-06-16T16:14:30.766Z","merged_at":"2025-06-17T00:39:03.211Z","author":{"name":"Filippo Basso","username":"bassofil"},"changes":[{"diff":"@@ -277,7 +277,7 @@ struct find_coeff_evaluator : public evaluator<Derived> {\n using Scalar = typename Derived::Scalar;\n using Packet = typename packet_traits<Scalar>::type;\n static constexpr int Flags = Base::Flags;\n- static constexpr bool IsRowMajor = Flags & RowMajorBit;\n+ static constexpr bool IsRowMajor = bool(Flags & RowMajorBit);\n EIGEN_DEVICE_FUNC inline find_coeff_evaluator(const Derived& xpr) : Base(xpr), m_xpr(xpr) {}\n \n EIGEN_DEVICE_FUNC inline Scalar coeffByOuterInner(Index outer, Index inner) const {\n@@ -313,7 +313,7 @@ struct find_coeff_impl {\n using Packet = typename Evaluator::Packet;\n \n static constexpr int PacketSize = unpacket_traits<Packet>::size;\n- static constexpr bool Linearize = Flags & LinearAccessBit;\n+ static constexpr bool Linearize = bool(Flags & LinearAccessBit);\n static constexpr bool DontVectorize =\n enum_lt_not_dynamic(Linearize ? 
MaxSizeAtCompileTime : MaxInnerSizeAtCompileTime, PacketSize);\n static constexpr bool Vectorize =\n","new_path":"Eigen/src/Core/FindCoeff.h","old_path":"Eigen/src/Core/FindCoeff.h","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false}],"changes_count":"1","labels":["5.0"],"state":"merged","summary":"## Title:\nRemove MSVC warnings in FindCoeff.h\n\n## Author:\nFilippo Basso (bassofil)\n\n## Summary\n### Key Changes:\n- Fixed MSVC warning about truncation from `unsigned int` to `const bool` in `FindCoeff.h`\n\n### Improvements:\n- Addressed compiler warnings related to type truncation in MSVC\n\n### Impact:\n- Reduced compiler warnings in the Eigen library for MSVC users."}
{"iid":1910,"title":"Faster emulated half comparisons","web_url":"https://gitlab.com/libeigen/eigen/-/merge_requests/1910","description":"### Reference issue\n\n\n### What does this implement/fix?\n\nIEEE floating point comparisons are essentially signed integer comparisons. Instead of converting fp16 to fp32 for comparisons, we can convert the sign-magnitude representation to two's complement and compare the bit patterns as signed integers. This handles all edge cases automatically (even +0 vs -0) except NaN. For that, we need to check for an ordered comparison. Results look pretty good! The new scalar version is much faster than the old vectorized version!\n\nThis MR implements provides the scalar and intel intrinsic for this concept. I expect we could do the same for arm, power, and so on.\n\nWe could try it for bfloat16, though the results would be less impressive (perhaps much worse) since the conversion from bfloat16 to f32 is easier.\n\n### Additional information\n\n\nUbuntu clang version 18.1.3 -DNDEBUG -O3 -mavx2\n\n| Size | Scalar Old [ns] | Scalar New [ns] | Scalar Change [%] | Vector Old [ns] | Vector New [ns] | Vector Change [%] |\n|------|-----------------|-----------------|-------------------|-----------------|-----------------|-------------------|\n| 1 | 1.45 | 1.02 | -29.66% | 2.68 | 1.02 | -61.94% |\n| 2 | 2.68 | 1.85 | -30.97% | 3.8 | 1.86 | -51.05% |\n| 4 | 5.17 | 3.34 | -35.40% | 7.93 | 3.48 | -56.12% |\n| 8 | 10.2 | 1.49 | -85.39% | 4.54 | 0.84 | -81.52% |\n| 16 | 21.2 | 1.55 | -92.69% | 8.98 | 1.30 | -85.52% |\n| 32 | 48.9 | 2.92 | -94.03% | 17.7 | 2.23 | -87.40% |\n| 64 | 90.1 | 5.45 | -93.95% | 35.2 | 4.45 | -87.36% |\n| 128 | 178 | 11.00 | -93.82% | 70.1 | 8.90 | -87.30% |\n| 256 | 331 | 21.30 | -93.56% | 140 | 19.00 | -86.43% |\n| 512 | 671 | 41.40 | -93.83% | 280 | 36.90 | -86.82% |\n| 1024 | 1296 | 82.00 | -93.67% | 558 | 72.50 | -87.01% |\n\n```cpp\n#include <benchmark/benchmark.h>\n#include <Eigen/Core>\nusing namespace 
Eigen;\n\nusing T = half;\nusing Vec = VectorX<T>;\n\nstatic void half_scalar(benchmark::State& state) {\n Index n = state.range(0);\n Vec a(n), b(n);\n VectorX<bool> r(n);\n a.setRandom();\n b.setRandom();\n for (auto s : state) {\n for(Index i = 0; i < n; i++)\n {\n r.coeffRef(i) = a.coeff(i) < b.coeff(i);\n }\n benchmark::DoNotOptimize(r);\n }\n}\n\nstatic void half_vector(benchmark::State& state) {\n Index n = state.range(0);\n Vec a(n), b(n), r(n);\n a.setRandom();\n b.setRandom();\n for (auto s : state) {\n r = a.cwiseTypedLess(b);\n benchmark::DoNotOptimize(r);\n }\n}\n\nBENCHMARK(half_scalar)->RangeMultiplier(2)->Range(1<<0, 1<<10);\nBENCHMARK(half_vector)->RangeMultiplier(2)->Range(1<<0, 1<<10);\nBENCHMARK_MAIN();\n```","created_at":"2025-06-14T11:43:35.973Z","merged_at":"2025-06-17T17:06:00.101Z","author":{"name":"Charles Schlosser","username":"chuckyschluz"},"changes":[{"diff":"@@ -2249,24 +2249,64 @@ EIGEN_STRONG_INLINE Packet8h ptrunc<Packet8h>(const Packet8h& a) {\n return float2half(ptrunc<Packet8f>(half2float(a)));\n }\n \n+template <>\n+EIGEN_STRONG_INLINE Packet8h pisinf<Packet8h>(const Packet8h& a) {\n+ constexpr uint16_t kInf = ((1 << 5) - 1) << 10;\n+ constexpr uint16_t kAbsMask = (1 << 15) - 1;\n+ return _mm_cmpeq_epi16(_mm_and_si128(a.m_val, _mm_set1_epi16(kAbsMask)), _mm_set1_epi16(kInf));\n+}\n+\n+template <>\n+EIGEN_STRONG_INLINE Packet8h pisnan<Packet8h>(const Packet8h& a) {\n+ constexpr uint16_t kInf = ((1 << 5) - 1) << 10;\n+ constexpr uint16_t kAbsMask = (1 << 15) - 1;\n+ return _mm_cmpgt_epi16(_mm_and_si128(a.m_val, _mm_set1_epi16(kAbsMask)), _mm_set1_epi16(kInf));\n+}\n+\n+// convert the sign-magnitude representation to two's complement\n+EIGEN_STRONG_INLINE __m128i pmaptosigned(const __m128i& a) {\n+ constexpr uint16_t kAbsMask = (1 << 15) - 1;\n+ // if 'a' has the sign bit set, clear the sign bit and negate the result as if it were an integer\n+ return _mm_sign_epi16(_mm_and_si128(a, _mm_set1_epi16(kAbsMask)), a);\n+}\n+\n+// 
return true if both `a` and `b` are not NaN\n+EIGEN_STRONG_INLINE Packet8h pisordered(const Packet8h& a, const Packet8h& b) {\n+ constexpr uint16_t kInf = ((1 << 5) - 1) << 10;\n+ constexpr uint16_t kAbsMask = (1 << 15) - 1;\n+ __m128i abs_a = _mm_and_si128(a.m_val, _mm_set1_epi16(kAbsMask));\n+ __m128i abs_b = _mm_and_si128(b.m_val, _mm_set1_epi16(kAbsMask));\n+ // check if both `abs_a <= kInf` and `abs_b <= kInf` by checking if max(abs_a, abs_b) <= kInf\n+ // SSE has no `lesser or equal` instruction for integers, but comparing against kInf + 1 accomplishes the same goal\n+ return _mm_cmplt_epi16(_mm_max_epu16(abs_a, abs_b), _mm_set1_epi16(kInf + 1));\n+}\n+\n template <>\n EIGEN_STRONG_INLINE Packet8h pcmp_eq(const Packet8h& a, const Packet8h& b) {\n- return Pack16To8(pcmp_eq(half2float(a), half2float(b)));\n+ __m128i isOrdered = pisordered(a, b);\n+ __m128i isEqual = _mm_cmpeq_epi16(pmaptosigned(a.m_val), pmaptosigned(b.m_val));\n+ return _mm_and_si128(isOrdered, isEqual);\n }\n \n template <>\n EIGEN_STRONG_INLINE Packet8h pcmp_le(const Packet8h& a, const Packet8h& b) {\n- return Pack16To8(pcmp_le(half2float(a), half2float(b)));\n+ __m128i isOrdered = pisordered(a, b);\n+ __m128i isGreater = _mm_cmpgt_epi16(pmaptosigned(a.m_val), pmaptosigned(b.m_val));\n+ return _mm_andnot_si128(isGreater, isOrdered);\n }\n \n template <>\n EIGEN_STRONG_INLINE Packet8h pcmp_lt(const Packet8h& a, const Packet8h& b) {\n- return Pack16To8(pcmp_lt(half2float(a), half2float(b)));\n+ __m128i isOrdered = pisordered(a, b);\n+ __m128i isLess = _mm_cmplt_epi16(pmaptosigned(a.m_val), pmaptosigned(b.m_val));\n+ return _mm_and_si128(isOrdered, isLess);\n }\n \n template <>\n EIGEN_STRONG_INLINE Packet8h pcmp_lt_or_nan(const Packet8h& a, const Packet8h& b) {\n- return Pack16To8(pcmp_lt_or_nan(half2float(a), half2float(b)));\n+ __m128i isUnordered = por(pisnan(a), pisnan(b));\n+ __m128i isLess = _mm_cmplt_epi16(pmaptosigned(a.m_val), pmaptosigned(b.m_val));\n+ return 
_mm_or_si128(isUnordered, isLess);\n }\n \n template <>\n","new_path":"Eigen/src/Core/arch/AVX/PacketMath.h","old_path":"Eigen/src/Core/arch/AVX/PacketMath.h","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false},{"diff":"@@ -497,16 +497,56 @@ EIGEN_STRONG_INLINE EIGEN_DEVICE_FUNC half& operator/=(half& a, const half& b) {\n a = half(float(a) / float(b));\n return a;\n }\n+\n+// Non-negative floating point numbers have a monotonic mapping to non-negative integers.\n+// This property allows floating point numbers to be reinterpreted as integers for comparisons, which is useful if there\n+// is no native floating point comparison operator. Floating point signedness is handled by the sign-magnitude\n+// representation, whereas integers typically use two's complement. Converting the bit pattern from sign-magnitude to\n+// two's complement allows the transformed bit patterns be compared as signed integers. All edge cases (+/-0 and +/-\n+// infinity) are handled automatically, except NaN.\n+//\n+// fp16 uses 1 sign bit, 5 exponent bits, and 10 mantissa bits. The bit pattern conveys NaN when all the exponent\n+// bits (5) are set, and at least one mantissa bit is set. The sign bit is irrelevant for determining NaN. To check for\n+// NaN, clear the sign bit and check if the integral representation is greater than 01111100000000. To test\n+// for non-NaN, clear the sign bit and check if the integeral representation is less than or equal to 01111100000000.\n+\n+// convert sign-magnitude representation to two's complement\n+EIGEN_STRONG_INLINE EIGEN_DEVICE_FUNC int16_t mapToSigned(uint16_t a) {\n+ constexpr uint16_t kAbsMask = (1 << 15) - 1;\n+ // If the sign bit is set, clear the sign bit and return the (integer) negation. Otherwise, return the input.\n+ return (a >> 15) ? 
-(a & kAbsMask) : a;\n+}\n+EIGEN_STRONG_INLINE EIGEN_DEVICE_FUNC bool isOrdered(const half& a, const half& b) {\n+ constexpr uint16_t kInf = ((1 << 5) - 1) << 10;\n+ constexpr uint16_t kAbsMask = (1 << 15) - 1;\n+ return numext::maxi(a.x & kAbsMask, b.x & kAbsMask) <= kInf;\n+}\n EIGEN_STRONG_INLINE EIGEN_DEVICE_FUNC bool operator==(const half& a, const half& b) {\n- return numext::equal_strict(float(a), float(b));\n+ bool result = mapToSigned(a.x) == mapToSigned(b.x);\n+ result &= isOrdered(a, b);\n+ return result;\n+}\n+EIGEN_STRONG_INLINE EIGEN_DEVICE_FUNC bool operator!=(const half& a, const half& b) { return !(a == b); }\n+EIGEN_STRONG_INLINE EIGEN_DEVICE_FUNC bool operator<(const half& a, const half& b) {\n+ bool result = mapToSigned(a.x) < mapToSigned(b.x);\n+ result &= isOrdered(a, b);\n+ return result;\n }\n-EIGEN_STRONG_INLINE EIGEN_DEVICE_FUNC bool operator!=(const half& a, const half& b) {\n- return numext::not_equal_strict(float(a), float(b));\n+EIGEN_STRONG_INLINE EIGEN_DEVICE_FUNC bool operator<=(const half& a, const half& b) {\n+ bool result = mapToSigned(a.x) <= mapToSigned(b.x);\n+ result &= isOrdered(a, b);\n+ return result;\n+}\n+EIGEN_STRONG_INLINE EIGEN_DEVICE_FUNC bool operator>(const half& a, const half& b) {\n+ bool result = mapToSigned(a.x) > mapToSigned(b.x);\n+ result &= isOrdered(a, b);\n+ return result;\n+}\n+EIGEN_STRONG_INLINE EIGEN_DEVICE_FUNC bool operator>=(const half& a, const half& b) {\n+ bool result = mapToSigned(a.x) >= mapToSigned(b.x);\n+ result &= isOrdered(a, b);\n+ return result;\n }\n-EIGEN_STRONG_INLINE EIGEN_DEVICE_FUNC bool operator<(const half& a, const half& b) { return float(a) < float(b); }\n-EIGEN_STRONG_INLINE EIGEN_DEVICE_FUNC bool operator<=(const half& a, const half& b) { return float(a) <= float(b); }\n-EIGEN_STRONG_INLINE EIGEN_DEVICE_FUNC bool operator>(const half& a, const half& b) { return float(a) > float(b); }\n-EIGEN_STRONG_INLINE EIGEN_DEVICE_FUNC bool operator>=(const half& a, const half& b) { 
return float(a) >= float(b); }\n \n #if EIGEN_COMP_CLANG && defined(EIGEN_GPUCC)\n #pragma pop_macro(\"EIGEN_DEVICE_FUNC\")\n@@ -706,7 +746,11 @@ EIGEN_STRONG_INLINE EIGEN_DEVICE_FUNC bool(isnan)(const half& a) {\n #endif\n }\n EIGEN_STRONG_INLINE EIGEN_DEVICE_FUNC bool(isfinite)(const half& a) {\n- return !(isinf EIGEN_NOT_A_MACRO(a)) && !(isnan EIGEN_NOT_A_MACRO(a));\n+#if defined(EIGEN_HAS_ARM64_FP16_SCALAR_ARITHMETIC) || defined(EIGEN_HAS_BUILTIN_FLOAT16)\n+ return (numext::bit_cast<numext::uint16_t>(a.x) & 0x7fff) < 0x7c00;\n+#else\n+ return (a.x & 0x7fff) < 0x7c00;\n+#endif\n }\n \n EIGEN_STRONG_INLINE EIGEN_DEVICE_FUNC half abs(const half& a) {\n","new_path":"Eigen/src/Core/arch/Default/Half.h","old_path":"Eigen/src/Core/arch/Default/Half.h","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false},{"diff":"@@ -72,17 +72,16 @@ void test_conversion() {\n // NaNs and infinities.\n VERIFY(!(numext::isinf)(float(half(65504.0f)))); // Largest finite number.\n VERIFY(!(numext::isnan)(float(half(0.0f))));\n+ VERIFY((numext::isfinite)(float(half(65504.0f))));\n+ VERIFY((numext::isfinite)(float(half(0.0f))));\n VERIFY((numext::isinf)(float(half(__half_raw(0xfc00)))));\n VERIFY((numext::isnan)(float(half(__half_raw(0xfc01)))));\n VERIFY((numext::isinf)(float(half(__half_raw(0x7c00)))));\n VERIFY((numext::isnan)(float(half(__half_raw(0x7c01)))));\n \n-#if !EIGEN_COMP_MSVC\n- // Visual Studio errors out on divisions by 0\n- VERIFY((numext::isnan)(float(half(0.0 / 0.0))));\n- VERIFY((numext::isinf)(float(half(1.0 / 0.0))));\n- VERIFY((numext::isinf)(float(half(-1.0 / 0.0))));\n-#endif\n+ VERIFY((numext::isnan)(float(NumTraits<half>::quiet_NaN())));\n+ VERIFY((numext::isinf)(float(NumTraits<half>::infinity())));\n+ VERIFY((numext::isinf)(float(-NumTraits<half>::infinity())));\n \n // Exactly same checks as above, just directly on the half representation.\n VERIFY(!(numext::isinf)(half(__half_raw(0x7bff))));\n@@ 
-92,12 +91,9 @@ void test_conversion() {\n VERIFY((numext::isinf)(half(__half_raw(0x7c00))));\n VERIFY((numext::isnan)(half(__half_raw(0x7c01))));\n \n-#if !EIGEN_COMP_MSVC\n- // Visual Studio errors out on divisions by 0\n- VERIFY((numext::isnan)(half(0.0 / 0.0)));\n- VERIFY((numext::isinf)(half(1.0 / 0.0)));\n- VERIFY((numext::isinf)(half(-1.0 / 0.0)));\n-#endif\n+ VERIFY((numext::isnan)(NumTraits<half>::quiet_NaN()));\n+ VERIFY((numext::isinf)(NumTraits<half>::infinity()));\n+ VERIFY((numext::isinf)(-NumTraits<half>::infinity()));\n \n // Conversion to bool\n VERIFY(!static_cast<bool>(half(0.0)));\n@@ -204,19 +200,25 @@ void test_comparison() {\n VERIFY(half(1.0f) != half(2.0f));\n \n // Comparisons with NaNs and infinities.\n-#if !EIGEN_COMP_MSVC\n- // Visual Studio errors out on divisions by 0\n- VERIFY(!(half(0.0 / 0.0) == half(0.0 / 0.0)));\n- VERIFY(half(0.0 / 0.0) != half(0.0 / 0.0));\n-\n- VERIFY(!(half(1.0) == half(0.0 / 0.0)));\n- VERIFY(!(half(1.0) < half(0.0 / 0.0)));\n- VERIFY(!(half(1.0) > half(0.0 / 0.0)));\n- VERIFY(half(1.0) != half(0.0 / 0.0));\n-\n- VERIFY(half(1.0) < half(1.0 / 0.0));\n- VERIFY(half(1.0) > half(-1.0 / 0.0));\n-#endif\n+ VERIFY(!(NumTraits<half>::quiet_NaN() == NumTraits<half>::quiet_NaN()));\n+ VERIFY(NumTraits<half>::quiet_NaN() != NumTraits<half>::quiet_NaN());\n+\n+ VERIFY(!(internal::random<half>() == NumTraits<half>::quiet_NaN()));\n+ VERIFY(!(internal::random<half>() < NumTraits<half>::quiet_NaN()));\n+ VERIFY(!(internal::random<half>() > NumTraits<half>::quiet_NaN()));\n+ VERIFY(!(internal::random<half>() <= NumTraits<half>::quiet_NaN()));\n+ VERIFY(!(internal::random<half>() >= NumTraits<half>::quiet_NaN()));\n+ VERIFY(internal::random<half>() != NumTraits<half>::quiet_NaN());\n+\n+ VERIFY(!(NumTraits<half>::quiet_NaN() == internal::random<half>()));\n+ VERIFY(!(NumTraits<half>::quiet_NaN() < internal::random<half>()));\n+ VERIFY(!(NumTraits<half>::quiet_NaN() > internal::random<half>()));\n+ 
VERIFY(!(NumTraits<half>::quiet_NaN() <= internal::random<half>()));\n+ VERIFY(!(NumTraits<half>::quiet_NaN() >= internal::random<half>()));\n+ VERIFY(NumTraits<half>::quiet_NaN() != internal::random<half>());\n+\n+ VERIFY(internal::random<half>() < NumTraits<half>::infinity());\n+ VERIFY(internal::random<half>() > -NumTraits<half>::infinity());\n }\n \n void test_basic_functions() {\n","new_path":"test/half_float.cpp","old_path":"test/half_float.cpp","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false}],"changes_count":"3","labels":["5.0"],"state":"merged","summary":"## Title:\nFaster emulated half comparisons\n\n## Author:\nCharles Schlosser (chuckyschluz)\n\n## Summary\n### Key Changes:\n- Added scalar and intrinsic implementations for emulated half comparisons.\n- Modified `PacketMath.h` and `Half.h` to support the new comparison logic.\n- Updated `half_float.cpp` to include the new implementation.\n\n### Improvements:\n- Implemented scalar version of half comparisons with faster performance compared to vectorized version.\n- Improved handling of sign-magnitude to two's complement comparisons for IEEE floating point.\n- Added benchmarking to measure performance differences.\n\n### Impact:\n- Significant performance improvement in scalar comparison operations for half-precision data.\n- Better handling of edge cases in comparisons (including +0 vs -0).\n- Potential for portability across different architectures (Intel, ARM, etc.)."}
{"iid":1909,"title":"Add OpenBLAS sbgemm.","web_url":"https://gitlab.com/libeigen/eigen/-/merge_requests/1909","description":"If using OpenBLAS with the `BUILD_BFLOAT16=1` extension and\n`EIGEN_USE_BLAS=1`, will use OpenBLAS's `sbgemm` for bfloat16 * bfloat16.\n\nDecided to use a separate macro `EIGEN_USE_OPENBLAS_BFLOAT16` since\nthe default openblas build does not include the `sbgemm` symbols.\n\nTowards #2936","created_at":"2025-06-13T22:53:38.499Z","merged_at":"2025-06-16T18:23:05.644Z","author":{"name":"Antonio Sánchez","username":"cantonios"},"changes":[{"diff":"@@ -55,7 +55,7 @@ namespace internal {\n ConjugateRhs, ColMajor, 1> { \\\n typedef gebp_traits<EIGTYPE, EIGTYPE> Traits; \\\n \\\n- static void run(Index rows, Index cols, Index depth, const EIGTYPE* _lhs, Index lhsStride, const EIGTYPE* _rhs, \\\n+ static void run(Index rows, Index cols, Index depth, const EIGTYPE* lhs_, Index lhsStride, const EIGTYPE* rhs_, \\\n Index rhsStride, EIGTYPE* res, Index resIncr, Index resStride, EIGTYPE alpha, \\\n level3_blocking<EIGTYPE, EIGTYPE>& /*blocking*/, GemmParallelInfo<Index>* /*info = 0*/) { \\\n using std::conj; \\\n@@ -84,20 +84,20 @@ namespace internal {\n \\\n /* Set a, b, c */ \\\n if ((LhsStorageOrder == ColMajor) && (ConjugateLhs)) { \\\n- Map<const MatrixX##EIGPREFIX, 0, OuterStride<> > lhs(_lhs, m, k, OuterStride<>(lhsStride)); \\\n+ Map<const MatrixX##EIGPREFIX, 0, OuterStride<> > lhs(lhs_, m, k, OuterStride<>(lhsStride)); \\\n a_tmp = lhs.conjugate(); \\\n a = a_tmp.data(); \\\n lda = convert_index<BlasIndex>(a_tmp.outerStride()); \\\n } else \\\n- a = _lhs; \\\n+ a = lhs_; \\\n \\\n if ((RhsStorageOrder == ColMajor) && (ConjugateRhs)) { \\\n- Map<const MatrixX##EIGPREFIX, 0, OuterStride<> > rhs(_rhs, k, n, OuterStride<>(rhsStride)); \\\n+ Map<const MatrixX##EIGPREFIX, 0, OuterStride<> > rhs(rhs_, k, n, OuterStride<>(rhsStride)); \\\n b_tmp = rhs.conjugate(); \\\n b = b_tmp.data(); \\\n ldb = convert_index<BlasIndex>(b_tmp.outerStride()); \\\n } 
else \\\n- b = _rhs; \\\n+ b = rhs_; \\\n \\\n BLASFUNC(&transa, &transb, &m, &n, &k, (const BLASTYPE*)&numext::real_ref(alpha), (const BLASTYPE*)a, &lda, \\\n (const BLASTYPE*)b, &ldb, (const BLASTYPE*)&numext::real_ref(beta), (BLASTYPE*)res, &ldc); \\\n@@ -116,6 +116,88 @@ GEMM_SPECIALIZATION(dcomplex, cd, double, zgemm_)\n GEMM_SPECIALIZATION(scomplex, cf, float, cgemm_)\n #endif\n \n+// If OpenBLAS with BUILD_BFLOAT16=1 support is available,\n+// use sbgemm for bfloat16.\n+#if EIGEN_USE_OPENBLAS_BFLOAT16\n+\n+extern \"C\" {\n+// OpenBLAS prototype.\n+void sbgemm_(const char* trans_a, const char* trans_b, const int* M, const int* N, const int* K, const float* alpha,\n+ const Eigen::bfloat16* A, const int* lda, const Eigen::bfloat16* B, const int* ldb, const float* beta,\n+ float* C, const int* ldc);\n+} // extern \"C\"\n+\n+template <typename Index, int LhsStorageOrder, bool ConjugateLhs, int RhsStorageOrder, bool ConjugateRhs>\n+struct general_matrix_matrix_product<Index, Eigen::bfloat16, LhsStorageOrder, ConjugateLhs, Eigen::bfloat16,\n+ RhsStorageOrder, ConjugateRhs, ColMajor, 1> {\n+ typedef gebp_traits<Eigen::bfloat16, Eigen::bfloat16> Traits;\n+\n+ static void run(Index rows, Index cols, Index depth, const Eigen::bfloat16* lhs_, Index lhsStride,\n+ const Eigen::bfloat16* rhs_, Index rhsStride, Eigen::bfloat16* res, Index resIncr, Index resStride,\n+ Eigen::bfloat16 alpha, level3_blocking<Eigen::bfloat16, Eigen::bfloat16>& /*blocking*/,\n+ GemmParallelInfo<Index>* /*info = 0*/) {\n+ using std::conj;\n+ if (rows == 0 || cols == 0 || depth == 0) return;\n+ EIGEN_ONLY_USED_FOR_DEBUG(resIncr);\n+ eigen_assert(resIncr == 1);\n+ char transa, transb;\n+ BlasIndex m, n, k, lda, ldb, ldc;\n+ const Eigen::bfloat16 *a, *b;\n+\n+ float falpha = static_cast<float>(alpha);\n+ float fbeta = float(1.0);\n+\n+ using MatrixXbf16 = Matrix<Eigen::bfloat16, Dynamic, Dynamic>;\n+ MatrixXbf16 a_tmp, b_tmp;\n+ MatrixXf r_tmp;\n+\n+ /* Set transpose options */\n+ transa = 
(LhsStorageOrder == RowMajor) ? ((ConjugateLhs) ? 'C' : 'T') : 'N';\n+ transb = (RhsStorageOrder == RowMajor) ? ((ConjugateRhs) ? 'C' : 'T') : 'N';\n+\n+ /* Set m, n, k */\n+ m = convert_index<BlasIndex>(rows);\n+ n = convert_index<BlasIndex>(cols);\n+ k = convert_index<BlasIndex>(depth);\n+\n+ /* Set lda, ldb, ldc */\n+ lda = convert_index<BlasIndex>(lhsStride);\n+ ldb = convert_index<BlasIndex>(rhsStride);\n+ ldc = convert_index<BlasIndex>(m);\n+\n+ /* Set a, b, c */\n+ if ((LhsStorageOrder == ColMajor) && (ConjugateLhs)) {\n+ Map<const MatrixXbf16, 0, OuterStride<> > lhs(lhs_, m, k, OuterStride<>(lhsStride));\n+ a_tmp = lhs.conjugate();\n+ a = a_tmp.data();\n+ lda = convert_index<BlasIndex>(a_tmp.outerStride());\n+ } else {\n+ a = lhs_;\n+ }\n+\n+ if ((RhsStorageOrder == ColMajor) && (ConjugateRhs)) {\n+ Map<const MatrixXbf16, 0, OuterStride<> > rhs(rhs_, k, n, OuterStride<>(rhsStride));\n+ b_tmp = rhs.conjugate();\n+ b = b_tmp.data();\n+ ldb = convert_index<BlasIndex>(b_tmp.outerStride());\n+ } else {\n+ b = rhs_;\n+ }\n+\n+ // Evaluate to a temporary intermediate array.\n+ r_tmp.resize(m, n);\n+\n+ sbgemm_(&transa, &transb, &m, &n, &k, (const float*)&numext::real_ref(falpha), a, &lda, b, &ldb,\n+ (const float*)&numext::real_ref(fbeta), r_tmp.data(), &ldc);\n+\n+ // Cast to the output.\n+ Map<MatrixXbf16, 0, OuterStride<> > result(res, m, n, OuterStride<>(resStride));\n+ result = r_tmp.cast<Eigen::bfloat16>();\n+ }\n+};\n+\n+#endif // EIGEN_USE_OPENBLAS_SBGEMM\n+\n } // namespace internal\n \n } // end namespace Eigen\n","new_path":"Eigen/src/Core/products/GeneralMatrixMatrix_BLAS.h","old_path":"Eigen/src/Core/products/GeneralMatrixMatrix_BLAS.h","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false}],"changes_count":"1","labels":["5.0"],"state":"merged","summary":"## Title:\nAdd OpenBLAS sbgemm.\n\n## Author:\nAntonio Sánchez (cantonios)\n\n## Summary\n### Key Changes:\n- Added support for 
OpenBLAS's `sbgemm` function for bfloat16 matrix multiplication.\n- Introduced a separate macro `EIGEN_USE_OPENBLAS_BFLOAT16` to enable the use of `sbgemm` in OpenBLAS builds.\n\n### Improvements:\n- Enabled efficient bfloat16 matrix multiplication using OpenBLAS when appropriate.\n- Improved compatibility with OpenBLAS builds that do not include `sbgemm` by using a custom macro.\n\n### Impact:\n- Enhanced performance for bfloat16 matrix operations in OpenBLAS builds.\n- Increased flexibility in enabling OpenBLAS-specific optimizations."}
{"iid":1906,"title":"Fix neon compilation bug","web_url":"https://gitlab.com/libeigen/eigen/-/merge_requests/1906","description":"### Reference issue\n\n\n### What does this implement/fix?\n\n\n### Additional information","created_at":"2025-06-10T21:52:51.306Z","merged_at":"2025-06-11T01:10:05.164Z","author":{"name":"Charles Schlosser","username":"chuckyschluz"},"changes":[{"diff":"@@ -5704,23 +5704,23 @@ EIGEN_STRONG_INLINE Packet4hf pmadd(const Packet4hf& a, const Packet4hf& b, cons\n }\n \n template <>\n-EIGEN_STRONG_INLINE Packet8hf pmsub(const Packet8hf& a, const Packet8hf& b, const Packet8hf& c) {\n- return pnegate(pnmadd(a, b, c));\n+EIGEN_STRONG_INLINE Packet8hf pnmadd(const Packet8hf& a, const Packet8hf& b, const Packet8hf& c) {\n+ return vfmsq_f16(c, a, b);\n }\n \n template <>\n-EIGEN_STRONG_INLINE Packet4hf pmsub(const Packet4hf& a, const Packet4hf& b, const Packet4hf& c) {\n- return pnegate(pnmadd(a, b, c));\n+EIGEN_STRONG_INLINE Packet4hf pnmadd(const Packet4hf& a, const Packet4hf& b, const Packet4hf& c) {\n+ return vfms_f16(c, a, b);\n }\n \n template <>\n-EIGEN_STRONG_INLINE Packet8hf pnmadd(const Packet8hf& a, const Packet8hf& b, const Packet8hf& c) {\n- return vfmsq_f16(c, a, b);\n+EIGEN_STRONG_INLINE Packet8hf pmsub(const Packet8hf& a, const Packet8hf& b, const Packet8hf& c) {\n+ return pnegate(pnmadd(a, b, c));\n }\n \n template <>\n-EIGEN_STRONG_INLINE Packet4hf pnmadd(const Packet4hf& a, const Packet4hf& b, const Packet4hf& c) {\n- return vfms_f16(c, a, b);\n+EIGEN_STRONG_INLINE Packet4hf pmsub(const Packet4hf& a, const Packet4hf& b, const Packet4hf& c) {\n+ return pnegate(pnmadd(a, b, c));\n }\n \n template <>\n","new_path":"Eigen/src/Core/arch/NEON/PacketMath.h","old_path":"Eigen/src/Core/arch/NEON/PacketMath.h","a_mode":"100644","b_mode":"100644","new_file":false,"renamed_file":false,"deleted_file":false,"generated_file":false}],"changes_count":"1","labels":["5.0"],"state":"merged","summary":"## Title:\nFix neon compilation bug\n\n## 
Author:\nCharles Schlosser (chuckyschluz)\n\n## Summary\n### Key Changes:\n- Fixed a compilation bug in the NEON implementation of Eigen's PacketMath.h file.\n\n### Improvements:\n- Addressed a compilation issue related to NEON architecture.\n\n### Impact:\n- Resolved a compilation error that prevented the NEON code from building correctly."}