- #485: Removes deprecated CMake package config variables, potentially breaking existing CMake configurations
- #608: Removes CI jobs for C++03 compatibility, signaling transition to modern C++ standards
- #649: Move Eigen::all, last, and lastp1 back to Eigen::placeholders namespace to reduce name collision risks
- #658: Refactored SVD module with new template parameter for computation options, breaking existing API
- #725: Removed deprecated MappedSparseMatrix type from internal library code
- #742: Updates minimum CMake version to 3.10, removes C++11 test disable option, and sets minimum GCC version to 5
- #744: Updated compiler requirements by removing deprecated feature test macros and enforcing newer GCC and MSVC versions
- #749: Reverts disruptive SVD module update that caused compatibility issues with third-party libraries
- #771: Renamed internal `size` function to `ssize` to prevent ADL conflicts and improve C++ standard compatibility
- #808: Introduces explicit type casting requirements for `pmadd` function to improve type safety and compatibility with custom scalar types
- #826: Significant updates to SVD module with new Options template parameter, introducing API breaking changes for improved flexibility
- #840: Fixed CUDA feature flag handling to respect `EIGEN_NO_CUDA` compilation option
- #857: Reintroduced `svd::compute(Matrix, options)` method to prevent breaking external projects
- #862: Restores fixed sizes for U/V matrices in matrix decompositions for fixed-sized inputs
- #911: Fixed critical assumption about RowMajorBit and RowMajor, potentially impacting matrix storage order logic
- #932: Replaced `make_coherent` with `CoherentPadOp`, removing a function that modifies `const` inputs and introducing a more performant padding operator for derivative vector sizing
- #946: Removed legacy macro EIGEN_EMPTY_STRUCT_CTOR, potentially impacting older GCC compatibility
- #966: Simplified Accelerate LLT and LDLT solvers by removing explicit Symmetric flag requirement
- #1015: Disabled AVX512 GEMM kernels by default due to segmentation fault issues
- #1240: Changes default comparison overloads to return `bool` arrays and introduces `cwiseTypedLesser` for typed comparisons
- #1254: Backwards compatible implementation of DenseBase::select with swapped template argument order
- #1280: Disabled raw array indexed view access for 1D arrays to prevent potential bugs and improve library safety
- #1301: Introduces canonical range corrections for Euler angles with new default behavior, potentially breaking existing angle computations
- #1383: Introduces `EIGEN_TEMPORARY_UNALIGNED_SCALAR_UB` macro to handle unaligned scalar undefined behavior, primarily addressing TensorFlow Lite compatibility issues
- #1497: Removed non-standard `int` return types and unnecessary arguments from BLAS/LAPACK function interfaces to improve package compatibility
- #1520: Removes `using namespace Eigen` from blas/common.h to prevent symbol collisions
- #1550: Modify `rbegin/rend` handling for GPU, now explicitly marking unsupported with clearer compile-time errors
- #1553: Restored C++03 compatibility by modifying 2x2 matrix construction to support older C++ standards
- #1696: Makes fixed-size matrices and arrays `trivially_default_constructible`, requiring `EIGEN_NO_DEBUG` or `EIGEN_DISABLE_UNALIGNED_ARRAY_ASSERT`
- #1730: Reverts previous change to fixed-size object move assignability, restoring correct `setZero()` behavior
- #1751: Revert problematic commit that caused debug mode build failures
- #1795: Changes `Eigen::aligned_allocator` to no longer inherit from `std::allocator`, modifying allocator behavior and potentially breaking existing code
- #1827: Removes default assumption of `std::complex` for complex scalar types, allowing more flexibility with custom complex types
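
Entry #1795 means that code can no longer treat `Eigen::aligned_allocator` as a `std::allocator` (for example by binding it to a `std::allocator<T>&`). As a rough illustration of what an allocator that stands on its own looks like, here is a minimal sketch. It is not Eigen's implementation: `AlignedAllocator` and `storage_is_aligned` are hypothetical names, the 32-byte default alignment is an assumption, and the aligned `operator new` overload requires C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>
#include <type_traits>
#include <vector>

// Hypothetical stand-alone allocator (illustration only, not Eigen's code).
// It supplies the minimal allocator interface itself instead of deriving
// from std::allocator.
template <typename T, std::size_t Alignment = 32>
struct AlignedAllocator {
  using value_type = T;

  AlignedAllocator() noexcept = default;
  template <typename U>
  AlignedAllocator(const AlignedAllocator<U, Alignment>&) noexcept {}

  // Explicit rebind is required because of the non-type template parameter;
  // the default std::allocator_traits rebind only handles type parameters.
  template <typename U>
  struct rebind {
    using other = AlignedAllocator<U, Alignment>;
  };

  T* allocate(std::size_t n) {
    // C++17 aligned allocation function.
    return static_cast<T*>(
        ::operator new(n * sizeof(T), std::align_val_t(Alignment)));
  }

  void deallocate(T* p, std::size_t) noexcept {
    ::operator delete(p, std::align_val_t(Alignment));
  }
};

// Stateless allocators of the same alignment are interchangeable.
template <typename T, typename U, std::size_t A>
bool operator==(const AlignedAllocator<T, A>&,
                const AlignedAllocator<U, A>&) noexcept {
  return true;
}
template <typename T, typename U, std::size_t A>
bool operator!=(const AlignedAllocator<T, A>&,
                const AlignedAllocator<U, A>&) noexcept {
  return false;
}

// The property that breaks code relying on the old inheritance relationship.
static_assert(
    !std::is_base_of<std::allocator<float>, AlignedAllocator<float>>::value,
    "AlignedAllocator is not a std::allocator");

// Returns true when a vector using the allocator starts on a 32-byte boundary.
inline bool storage_is_aligned(std::size_t count) {
  std::vector<float, AlignedAllocator<float, 32>> v(count, 1.0f);
  return reinterpret_cast<std::uintptr_t>(v.data()) % 32 == 0;
}
```

Containers that only go through `std::allocator_traits` keep working with such a type; only code that depended on the base-class conversion needs to change.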
- #356: Introduced PocketFFT as a more performant and accurate replacement for KissFFT in Eigen's FFT module
- #489: Added AVX512 and AVX2 support for Packet16i and Packet8i, enhancing vectorization capabilities for integer types
- #515: Adds random matrix generation via SVD with two strategies for generating singular values
- #610: Updates CMake configuration to centralize C++11 standard setting, simplifying build process
- #667: Significantly speeds up tensor reduction performance through loop strip mining and unrolling techniques
- #673: Vectorized implementation of Visitor.h with up to 39% performance improvement using AVX2 instructions
- #698: Optimizes `CommaInitializer` to reuse fixed dimensions more efficiently during matrix block initialization
- #702: Added AVX vectorized implementation for float2half/half2float conversion functions with significant performance improvements
- #732: Removes EIGEN_HAS_CXX11 macro, simplifying Eigen's codebase and focusing on C++11+ support
- #736: SFINAE improvements for transpose methods in self-adjoint and triangular views
- #764: Performance improvements for VSX and MMA GEMV operations on PowerPC, with up to 4X speedup
- #796: Makes fixed-size Matrix and Array trivially copyable in C++20, improving memory management and compatibility
- #817: Added support for int64 packets on x86 architectures, enabling more efficient vectorized operations
- #820: Added reciprocal packet operation with optimized SSE, AVX, and AVX512 specializations for float, improving computational performance and accuracy
- #824: Removed inline assembly for FMA (AVX) and added new packet operations pmsub, pnmadd, and pnmsub with performance improvements
- #827: Optimized reciprocal packet operation with IEEE compliance for 1/0 and 1/inf cases, improving performance and handling of special values
- #829: Replace Eigen type metaprogramming with standard C++ types and alias templates
- #834: Introduces AVX512 optimized kernels for floating-point triangular solve operations, enhancing performance for smaller matrix sizes
- #856: Adds support for Apple's Accelerate sparse matrix solvers with significant performance improvements for various factorization methods
- #860: Adds AVX512 optimizations for matrix multiplication with significant performance improvements for single and double precision kernels
- #868: Optimized SQRT/RSQRT implementations for modern x86 processors with improved performance and special value handling
- #880: Fix critical SVD functionality bug for Microsoft Visual Studio (MSVC) compilation
- #892: Added support for constant evaluation and improved alignment check assertions
- #899: Adds C++14 constexpr support for Map initialization and basic operations with compile-time computation capabilities
- #936: Performance improvements for GEMM on Power architecture with vector_pair loads and optimized matrix multiplication
- #971: Introduces R-Bidiagonalization step to BDCSVD, optimizing SVD performance for tall and wide matrices using QR decomposition
- #972: AVX512 optimizations for s/dgemm compute kernel, resolving previous architectural and build compatibility issues
- #975: Introduced subMappers for Power GEMM packing, improving performance by approximately 10% through simplified address calculations
- #978: Added efficient sparse subset of matrix inverse computation using Takahashi algorithm with improved numerical stability
- #983: Extends SYCL backend's QueueInterface to accept existing SYCL queues for improved framework integration
- #986: SYCL-2020 range handling updated to ensure at least one thread execution by replacing default ranges with ranges of size 1
- #990: Adds product operations and static initializers for DiagonalMatrix, improving matrix algebra convenience
- #992: Enhanced AVX512 TRSM kernels to respect EIGEN_NO_MALLOC memory allocation configuration
- #996: Updates SYCL kernel naming to comply with SYCL-2020 specification, improving SYCL compatibility and integration
- #1008: Add Power10 (AltiVec) MMA instructions support for bfloat16 computations with enhanced performance
- #1017: Add support for AVX512-FP16 instruction set, introducing `Packet32h` and optimizing half-precision floating-point operations with up to 8-9x performance improvement
- #1018: Optimize gebp_kernel for arm64-neon with 3px8/2px8/1px8 configuration to improve matrix multiplication performance
- #1024: Partial Packet support for GEMM real-only operations on PowerPC, with compilation warning fixes and performance improvements
- #1034: Improved pow performance with more efficient division algorithm, achieving 11-15% speedup
- #1036: Replaced malloc/free with conditional_aligned memory allocation in sparse classes to improve memory management and potential performance
- #1038: Vectorized implementations of `acos`, `asin`, and `atan` for float with significant performance improvements
- #1073: Adds AVX vectorized implementation for int32_t division with improved performance
- #1074: Reverted previous `constexpr` implementation and tests
- #1076: Adds vectorized integer division for int32 using AVX512, AVX, and SSE instructions with performance optimizations
- #1082: Adds vectorized implementation of atan2 with array syntax, providing significant performance improvements for mathematical computations
- #1089: Unconditionally enables CXX11 math features for all compilers supporting C++14 and later
- #1090: Adds constexpr support for std::initializer_list constructors in Eigen matrices and arrays, enabling compile-time initialization in C++20 and partially in C++14/17
- #1103: Added new utility for sorting inner vectors of sparse matrices and vectors with custom comparison function
- #1126: Added Intel DPCPP compiler support for SYCL backend with SYCL-2020 compatibility
- #1147: Comprehensive overhaul of SparseMatrix core functionality, improving performance, efficiency, and maintainability of sparse matrix operations
- #1148: Introduced runtime memory allocation guards and modified assertion behavior to improve debugging and error handling
- #1152: Adds template for QR permutation index type and improves ColPivHouseholderQR LAPACKE bindings
- #1160: Improved insert strategy for compressed sparse matrices with enhanced performance and capacity management
- #1164: Improved sparse matrix permutation performance by reducing memory allocations and optimizing data handling strategies
- #1166: Introduces custom ODR-safe assertion mechanism for improved C++20 module compatibility
- #1168: Adds thread-local storage for `is_malloc_allowed()` state to improve multi-threaded safety
- #1170: Significant performance improvements for sparse matrix insertion, reducing insertion times by orders of magnitude and optimizing memory management
- #1196: Introduced vectorized comparison optimizations with typed comparisons and new selection operation, improving performance for comparison operations
- #1197: Removed all LGPL licensed code and references to simplify licensing and improve compatibility with MPL2
- #1203: Introduces typed logical operators for full vectorization and generalized boolean evaluations across scalar types
- #1210: Optimizations for bfloat16 Matrix-Matrix Multiplication (MMA) with performance improvements up to 10%
- #1211: Add CArg function for vectorized complex argument calculations
- #1233: Vectorized `any()` and `all()` methods, improved performance for matrix operations and custom visitors
- #1236: Added partial linear access for bfloat16 GEMM MMA, improving performance by 30% with reduced memory loads
- #1244: Introduces mechanism to specify permutation index types for PartialPivLU and FullPivLU, improving compatibility with Lapacke ILP64 interfaces
- #1255: Added Matrix Multiply Accumulate (MMA) for BF16 GEMV, achieving 5.0-6.3X performance improvement on Power architecture
- #1260: Upgrades NaN and Inf detection to use modern C++14 standard features for improved floating-point value handling
- #1273: Replaced internal pointer typedefs with standard `std::(u)intptr_t` types and removed ICC workaround
- #1279: Refactors indexed view expressions to enable non-const reference access with symbolic indices
- #1281: Introduces `insertFromTriplets` and `insertFromSortedTriplets` methods for efficient sparse matrix batch insertion and optimizes `setFromTriplets`
- #1285: Introduces Unified Shared Memory (USM) support for SYCL, simplifying device pointer management and improving expression construction efficiency
- #1289: Moves thread pool code from Tensor to Core module, enhancing multithreading infrastructure
- #1293: Enable new AVX512 GEMM kernel by default, improving performance for supported hardware
- #1295: Refactored IndexedView to simplify SFINAE usage, improve readability, and re-enable raw fixed-size array access
- #1296: Adds dynamic dispatch for BF16 GEMM on Power architecture with new VSX implementation, achieving up to 13.4X speedup and improved conversion performance
- #1304: Specialized vectorized casting evaluator for improved packet type conversion efficiency
- #1307: New VSX implementation of BF16 GEMV for Power architecture with up to 6.7X performance improvement
- #1314: Introduced `canonicalEulerAngles` method to replace deprecated `eulerAngles`, improving Euler angle calculation standardization and accuracy
- #1329: Added macros to customize ThreadPool synchronization primitives for enhanced performance and flexibility
- #1330: Added half precision type support for SYCL-2020 with efficient `Eigen::half` and `cl::sycl::half` conversions
- #1336: Introduces linear redux evaluators with efficient linear access methods for expressions, improving traversal and potential performance
- #1347: Adds compile-time and run-time assertions for `Ref<const>` construction to improve memory layout safety and error handling
- #1375: Add architecture definition files for Qualcomm Hexagon Vector Extension (HVX), introducing support for `EIGEN_VECTORIZE_HVX` and optimized vector operations
- #1387: Introduced a new method to improve handling of block expressions, offering a backwards compatible solution for converting blocks of block expressions
- #1389: New panel modes for GEMM MMA with real and complex number support, delivering performance improvements up to 2.84X for small matrices and 34-75% speed enhancements for large matrices
- #1395: Introduced ThreadPool in Eigen Core, enabling parallel vector and matrix computations with a new `CoreThreadPoolDevice`
- #1408: Generalized parallel GEMM implementation to support `Eigen::ThreadPool` in addition to OpenMP, enhancing library flexibility across different platforms
- #1454: Added half and quarter vector support for HVX architecture, enabling performance improvements for small matrix operations
- #1511: Added direct access methods and strides for IndexedView, enhancing matrix operation efficiency and usability
- #1522: Introduces SIMD vectorized sine and cosine functions for double precision using Veltkamp method and Padé approximant
- #1544: Added `Packet2l` for SSE to support vectorized `int64_t` operations
- #1545: Improved `CwiseUnaryView` with enhanced functionality for accessing and modifying complex array components
- #1546: Added optimized casting support between `double` and `int64_t` for SSE and AVX2 instruction sets
- #1554: Add `SimplicialNonHermitianLLT` and `SimplicialNonHermitianLDLT` solvers for complex symmetric matrices
- #1555: Enhance Matrix functions with `constexpr` default constructor and assignment operators for improved compile-time evaluation
- #1556: Reorganized CMake configuration to improve build efficiency and installation support, reducing configuration time and simplifying integration
- #1565: Enables symbols in compile-time expressions, enhancing `Eigen::indexing::last` usability and compile-time computation efficiency
- #1572: Implemented fully vectorized `double` to `int64_t` casting using AVX2, achieving 70% throughput improvement
- #1578: Updates to Geometry_SIMD.h to enhance SIMD performance and compatibility with modern architectures
- #1593: Introduced specialized vectorized evaluation for `(a < b).select(c, d)` ternary operations to improve performance
- #1600: Optimized transpose product calculations to reduce memory allocations and improve performance for matrix operations
- #1636: Enable `pointer_based_stl_iterator` to conform to `contiguous_iterator` concept in C++20, improving range and view compatibility
- #1654: Introduced `EIGEN_ALIGN_TO_AVOID_FALSE_SHARING` macro to reduce atomic false sharing in `RunQueue`, improving multithreaded performance
- #1655: Optimizes ThreadPool task submission with significant performance improvements, reducing execution time up to 49% in multi-threaded scenarios
- #1662: Speed up complex * complex matrix multiplication by dynamically adjusting block panel size for enhanced performance (8-33% improvement)
- #1670: Introduces new rational approximation for `tanh` with up to 50% performance gain and improved numerical accuracy
- #1671: Introduced a new inner product evaluator with direct reduction for improved dot product performance, supporting explicit unrolling for small vectors and enhancing SIMD operations
- #1673: Performance optimization for SVE intrinsics by replacing `_z` suffix with `_x` suffix to reduce instruction overhead
- #1675: Adds vectorized implementation of `tanh<double>` with significant performance speedups across different instruction set architectures
- #1694: Make fixed-size matrices and arrays trivially copy and move constructible, enabling better compiler optimizations
- #1735: Added `constexpr` support for element accessors like `operator()` and `operator[]` to enable compile-time computations
- #1737: Enhance fixed-size matrices to conform to `std::is_standard_layout`, improving type safety and memory handling
- #1777: Add support for LoongArch64 LSX architecture, expanding hardware compatibility
- #1801: Significantly improves Simplicial Cholesky `analyzePattern` performance using advanced sparse matrix algorithms, reducing computation time dramatically
- #1813: Increases maximum alignment to 256 bytes, enhancing support for `MaxSizeVector` and optimizing alignment for modern ARM architectures
- #1820: Improves fixed-size assignment handling by optimizing vectorized traversals and reducing compiler `-Warray-bounds` warnings
- #1830: Make assignment operations `constexpr`, enabling compile-time evaluation and performance optimizations
- #1838: Simplified parallel task API for `ParallelFor` and `ParallelForAsync`, optimizing task function definition and completion handling
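
Several entries above (#796, #1694, #1737) concern the type traits of fixed-size matrices and arrays. The sketch below shows what those traits allow in practice, using a hypothetical stand-in type rather than an Eigen class; `FixedMatrix`, `Mat3f`, and `round_trip_through_bytes` are illustrative names, and the column-major layout is an assumption.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for a fixed-size, column-major matrix.
// This is an illustration only, not Eigen's Matrix class.
template <typename T, int Rows, int Cols>
struct FixedMatrix {
  T data[Rows * Cols];

  T& operator()(int r, int c) { return data[c * Rows + r]; }
  const T& operator()(int r, int c) const { return data[c * Rows + r]; }
};

using Mat3f = FixedMatrix<float, 3, 3>;

// The traits named in the changelog entries, checked at compile time.
static_assert(std::is_trivially_copyable<Mat3f>::value,
              "byte-wise copies are well-defined");
static_assert(std::is_standard_layout<Mat3f>::value,
              "layout is compatible with plain C structs");
static_assert(std::is_trivially_default_constructible<Mat3f>::value,
              "default construction does no work");

// Copies a matrix through a raw byte buffer. This is only well-defined
// because the type is trivially copyable.
inline Mat3f round_trip_through_bytes(const Mat3f& src) {
  unsigned char bytes[sizeof(Mat3f)];
  std::memcpy(bytes, &src, sizeof(Mat3f));

  Mat3f dst;
  std::memcpy(&dst, bytes, sizeof(Mat3f));
  return dst;
}
```

Types with these properties can be passed through serialization buffers or device memory copies without custom copy logic, which is presumably part of the motivation behind the entries.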
- #611: Included `<unordered_map>` header to resolve header inclusion issue
- #613: Fix `fix<N>` implementation for environments without variable templates support
- #614: Fixed LAPACK test compilation issues with type mismatches in older Fortran code
- #621: Fixed GCC 4.8 ARM compilation issues by improving register constraints and resolving warnings
- #628: Renamed 'vec_all_nan' symbol in cxx11_tensor_expr test to resolve build conflicts with altivec.h on ppc64le platform
- #629: Fixed EIGEN_OPTIMIZATION_BARRIER compatibility for arm-clang compiler
- #630: Fixed AVX2 integer packet issues and corrected AVX512 implementation details
- #635: Fixed tridiagonalization selector issue by modifying `hCoeffs` vector handling to improve type compatibility
- #638: Fixed missing packet types in pset1 function call, improving packet data handling robustness
- #639: Fixed AVX2 PacketMath.h implementation with typo corrections and unaligned load resolution
- #643: Minor fix for compilation error on HIP
- #651: Remove `-fabi-version=6` flag from AVX512 builds to improve compatibility
- #654: Silenced GCC string overflow warning in initializer_list_construction test
- #656: Resolved strict aliasing bug causing product_small function failures in matrix multiplication
- #657: Fixes implicit conversion warnings in tuple_test, improving type safety
- #659: Fixed undefined behavior in BFloat16 float conversion by replacing `reinterpret_cast` with a safer alternative, improving reliability on PPC platforms
- #664: Fixed MSVC compilation issues with complex compound assignment operators by disabling related tests
- #665: Fix tuple compilation issues in Visual Studio 2017 by replacing tuple alias with TupleImpl
- #666: Fixed MSVC+NVCC compilation issue with EIGEN_INHERIT_ASSIGNMENT_EQUAL_OPERATOR macro
- #680: Fixed PowerPC packing issue, correcting row and depth inversion in non-vectorized code with 10% performance improvement
- #686: Reverted bit_cast implementation to use memcpy for CUDA to prevent undefined behavior
- #689: Fixed broadcasting index-out-of-bounds error for vectorized 1-dimensional inputs, particularly for std::complex types
- #691: Fixed Clang warnings by replacing bitwise operators with correct logical operators
- #694: Fixed ZVector build issues for s390x cross-compilation, enabling packetmath tests under QEMU
- #696: Fixed build compatibility issues with pload and ploadu functions on ARM and PPC architectures by removing const from visitor return type
- #703: Fix NaN propagation in min/max functions with scalar inputs
- #707: Fixed total deflation issue in BDCSVD for diagonal matrices
- #709: Fixed BDCSVD total deflation logic to correctly handle diagonal matrices
- #711: Bug fix for incorrect definition of EIGEN_HAS_FP16_C macro across different compilers
- #713: Prevent integer overflow in EigenMetaKernel indexing for improved reliability, especially on Windows builds
- #714: Fixed uninitialized matrix issue to prevent potential computation errors
- #719: Fixed Sparse-Sparse product implementation for mixed StorageIndex types
- #728: Fixed compilation errors for Windows build systems
- #733: Fixed warnings about shadowing definitions to improve code clarity and maintainability
- #741: Fixes HIP compilation failure in DenseBase by adding appropriate EIGEN_DEVICE_FUNC modifiers
- #745: Fixed HIP compilation issues in selfAdjoint and triangular view classes
- #746: Fixed handling of 0-sized matrices in LAPACKE-based Cholesky decomposition
- #759: Fixed typo of `StableNorm` to `stableNorm` in IDRS.h file
- #762: Fixed documentation code snippets to improve accuracy and readability
- #765: Resolved Clang compiler ambiguity in index list overloads to improve code stability
- #769: Fixed header inclusion issues in CholmodSupport to prevent direct access to internal files
- #782: Fix a bug with the EIGEN_IMPLIES macro's side-effects introduced in a previous merge request
- #785: Fixed Clang warnings related to alignment and floating-point precision
- #789: Fixed inclusion of immintrin.h for F16C intrinsics when vectorization is disabled
- #794: Fixed header guard conflicts between AltiVec and ZVector packages
- #800: Fixes serialization API issues disrupting HIP GPU unit tests
- #801: Fixes and cleanups for BFloat16 and Half numeric_limits, including AVX `psqrt` function workaround
- #802: Fixed improper truncation of unsigned int to bool, improving type conversion reliability
- #803: Fixed GCC 8.5 warning about missing base class initialization
- #805: Fixed inconsistency in scalar and vectorized paths for array.exp() function
- #806: Fix assertion messages in IterativeSolverBase to correctly reference its own class name
- #809: Fixed broken assertions to improve runtime error checking and library reliability
- #810: Fixed two corner cases in logistic sigmoid implementation for improved accuracy and robustness
- #811: Fixed compilation issue with GCC < 10 and -std=c++2a standard
- #812: Fix implicit conversion warning in vectorwise_reverse_inplace function by adding explicit casting
- #815: Fixed implicit conversion warning in GEBP kernel's packing by changing variable types from `int` to `Index`
- #818: Silenced specific MSVC compiler warnings in `construct_elements_of_array()` function
- #822: Fixed potential overflow issue in random test by making casts explicit and adjusting variable types
- #828: Fixed GEMV cache overflow issue for PowerPC architecture
- #833: Fixes type discrepancy in 32-bit ARM platforms by replacing `int` with `int32_t` for proper bit pattern extraction
- #835: Fixed ODR violations by removing unnamed namespaces and internal linkage from header files
- #842: Fixed documentation typo in Complete Orthogonal Decomposition (COD) method reference
- #843: Fixed naming collision with resolve.h by renaming local variables
- #847: Cleaned up compiler warnings for PowerPC GEMM and GEMV implementations
- #851: Fixed JacobiSVD_LAPACKE bindings to align with SVD module runtime options
- #858: Fixed sqrt/rsqrt implementations for NEON with improved accuracy and special case handling
- #859: Fixed MSVC+NVCC 9.2 pragma compatibility issue by replacing `_Pragma` with `__pragma`
- #863: Modified test expression to avoid numerical differences during optimization
- #865: Added assertion for edge case when requesting thin unitaries with incompatible matrix dimensions
- #866: Fix crash bug in SPQRSupport by initializing pointers to nullptr to prevent invalid memory access
- #870: Fixed test macro conflicts with STL headers in C++20 for GCC 9-11
- #873: Disabled deprecated warnings in SVD tests to clean up build logs
- #874: Fixed gcc-5 packetmath_12 bug with memory initialization in `packetmath_minus_zero_add()`
- #875: Fixed compilation error in packetmath by introducing a wrapper struct for `psqrt` function
- #876: Fixed AVX512 instruction handling and complex type computation issues for g++-11
- #877: Disabled deprecated warnings for SVD tests on MSVC to improve build log clarity
- #878: Fixed frexp packetmath tests for MSVC to handle non-finite input exponent behavior
- #882: Fixed SVD compatibility issues for MSVC and CUDA by resolving Index type and function return warnings
- #883: Adjusted matrix_power test tolerance for MSVC to reduce test failures
- #885: Fixed enum conversion warnings in BooleanRedux component
- #886: Fixed denormal test to skip when condition is false
- #900: Fix swap test for size 1 matrix inputs to prevent assertion failures
- #901: Fixed `construct_at` compilation issue on ROCm/HIP environments
- #908: Corrected reference code for `ata_product` function in STL_interface.hh
- #910: Reverted previous changes to PowerPC MMA flags to restore stability
- #914: Disabled Schur non-convergence test to reduce flaky results and improve reliability
- #915: Fixed missing pound directive to improve compilation and code robustness
- #917: Resolved g++-10 docker compiler optimization issue in geo_orthomethods_4 test
- #918: Added missing explicit reinterprets for `_mm512_shuffle_f32x4` to resolve g++ build errors
- #919: Fixed a missing parenthesis in the tutorial documentation
- #922: Work around MSVC compiler bug dropping `const` qualifier in method definitions
- #923: Fixed AVX512 build compatibility issues with MSVC compiler
- #924: Disabled f16c scalar conversions for MSVC to prevent compatibility issues
- #925: Fixed ODR violation in trsm module by marking specific functions as inline
- #926: Fixed compilation errors by correcting namespace usage in the codebase
- #930: Fixed compilation issue in GCC 9 by adding missing typename and removing unused typedef
- #934: Fixed order of arguments in BLAS SYRK implementation to resolve compilation errors
- #937: Eliminates warnings related to unused trace statements, improving code cleanliness
- #945: Restored correct max size expressions that were unintentionally modified in a previous merge request
- #948: Fix compatibility issues between MSVC and CUDA for diagonal and transpose functionality
- #949: Fixed ODR violations in lapacke_helpers module to improve library reliability
- #953: Fixed ambiguous constructors for DiagonalMatrix to prevent compile-time errors with initializer lists
- #958: Fixed compiler bugs for GCC 10 & 11 in Power GEMM inline assembly
- #963: Fixed NaN propagation for scalar input by adding missing template parameter
- #964: Fix compilation issue in HouseholderSequence.h related to InnerPanel template parameter
- #974: Fixed BDCSVD crash caused by index out of bounds in matrix processing
- #976: Fix LDLT decomposition with AutoDiffScalar when value is 0
- #977: Fixed numerical stability issue in BDCSVD algorithm
- #980: Fixed signed integer overflow in adjoint test to improve code safety
- #987: Fixed integer shortening warnings in visitor tests
- #988: Fixed MSVC build issues with AVX512 by temporarily disabling specific optimizations to reduce memory consumption and prevent compilation failures
- #991: Resolved ambiguous comparison warnings in clang for C++20 by adjusting TensorBase comparison operators
- #993: Corrected row vs column vector terminology typo in Matrix class tutorial documentation
- #1003: Eliminated undefined warnings for non-AVX512 compilation by adding appropriate macro guards
- #1007: Fixed One Definition Rule (ODR) violations by converting unnamed type declarations to named types
- #1010: Fixed inner iterator for sparse block to correctly handle outer index and improve sparse matrix operations
- #1012: Fixed vectorized Jacobi Rotation implementation by correcting logic for applying vectorized operations
- #1014: Fixed `aligned_realloc` to correctly check memory allocation constraints when pointer is null
- #1019: Prevent `<sstream>` inclusion when `EIGEN_NO_IO` is defined, improving embedded system compatibility
- #1023: Fixed flaky packetmath_1 test by adjusting inputs to prevent value cancellations
- #1025: Fixed Packet2d type implementation for non-VSX platforms to improve portability
- #1027: Fixed vectorized pow() function to handle edge cases with negative zero and negative infinity correctly
- #1028: Fixed build compatibility for non-VSX PowerPC architectures
- #1030: Resolves Half function definition conflict on aarch64 for GPU compilation
- #1032: Fixed invalid deprecation warnings in BDCSVD constructor handling
- #1033: Fixed SYCL tests by correcting sigmoid function, binary logic operators, and resolving test failures in tensor math operations
- #1037: Protected new pblend implementation with EIGEN_VECTORIZE_AVX2 to address build compatibility issues
- #1039: Fixed `psign` function for unsigned integer types, preventing incorrect behavior with bool types
- #1042: Fixed undefined behavior in array_cwise test related to signed integer overflow
- #1044: Fixed memory allocation issue by adding missing pointer in realloc call
- #1045: Fixed GeneralizedEigenSolver::info() method to improve initialization checks and error messaging
- #1048: Fixed test build errors in unary power operations with improved type handling for real and complex numbers
- #1049: Fixed typos in documentation table for slicing tutorial
- #1051: Fixed mixingtypes tests related to unary power operation
- #1053: Fixed MSVC compilation error in GeneralizedEigenSolver.h by adding a missing semi-colon
- #1055: Added safeguard in `aligned_realloc()` to prevent memory reallocation when `EIGEN_RUNTIME_NO_MALLOC` is defined
- #1057: Adjusted overflow threshold bounds in power function tests to prevent integer and floating-point overflow scenarios
- #1060: Fixed memory reallocation for non-trivial types to handle self-referencing pointers and improve stability
- #1061: Fixed bound for pow function to handle floating-point type limitations
- #1063: Fixed type safety and comparison issues in unary pow() function
- #1065: Fixes sparse matrix compilation issues on ROCm backend
- #1069: Removed faulty skew_symmetric_matrix3 test with uninitialized matrix comparison errors
- #1070: Fixed test for pow function handling of mixed integer types
- #1077: Fixed unused-result warning for ROCm gpuGetDevice function with better error reporting
- #1085: Fixed 4x4 matrix inverse computation when compiling with -Ofast optimization flag
- #1094: Fixed unused variable warnings in Eigen/Sparse module with clang 16.0.0git
- #1096: Fixed a bug in the `atan2` function related to `pselect` behavior with single-bit packets
- #1104: Fix NEON instruction fmla bug for half data type, preventing compiler errors and performance issues
- #1105: Fixed pragma check for disabling fastmath optimization
- #1106: Fixed handmade_aligned_malloc offset computation to prevent potential out-of-bounds memory writes and compiler warnings
- #1107: Disable patan for double precision on PowerPC to prevent build failures
- #1112: Corrected a typo in the CholmodSupport module
- #1113: Fixed duplicate execution code for Power 8 Altivec in pstore_partial function
- #1115: Fixed AVX2 psignbit implementation to resolve accuracy and reliability issues
- #1116: Corrected pnegate function to accurately handle floating-point zero by directly flipping the sign bit
- #1118: Fixed ambiguity in PowerPC vec_splats call for uint64_t type compatibility
- #1120: Fixed critical bugs in `handmade_aligned_realloc` to prevent memory management issues and potential undefined behavior
- #1124: Fixed sparseLU solver to handle destinations with non-unit stride
- #1127: Fixed serialization process for non-compressed matrices by correcting data buffer size calculation
- #1130: Fixed index type typo in sparse index sorting implementation
- #1142: Fixed incorrect NEON native fp16 multiplication kernel for ARM hardware
- #1143: Reverted type mixing restrictions in CompressedStorage.h to restore previous functionality
- #1149: Fixed `.gitignore` issue preventing `scripts/buildtests.in` from being added with `git add .`
- #1150: Fixes Altivec detection and VSX instruction handling for macOS PowerPC systems
- #1151: Fixed EIGEN_HAS_CXX17_OVERALIGN configuration for Intel C++ Compiler (icc)
- #1153: Fix macro guards for emulated FP16 operators on GPU, improving compatibility and reducing compilation errors
- #1155: Fixes overalign check preprocessor directive handling for improved compiler compatibility
- #1156: Fixed minor build and test issues including header paths, vectorization, GPU support, and removing unnecessary headers
- #1161: Fixes unused parameter warning on 32-bit ARM with Clang compiler
- #1162: Rolled back previous QR commit to resolve build error with `StorageIndex` definitions
- #1173: Reverted QR test changes to restore original functionality and compatibility
- #1178: Resolved compiler warnings related to sparse matrix operations
- #1179: Fixed consistency issue in reciprocal square root (rsqrt) vectorized implementation
- #1180: Fixed critical sparse matrix handling bugs for empty matrices to prevent segmentation faults
- #1181: Fixed bugs in GPU convolution operations by enabling GPU assertions
- #1183: Fixes undefined behavior in Block access to prevent pointer arithmetic on null pointers
- #1184: Fixes pre-POWER8_VECTOR bugs in pcmp_lt and pnegate, and reactivates psqrt function
- #1185: Improved special case handling in atan2 function to resolve test failure in TensorFlow with Clang
- #1188: Reverted StlIterators edit to address potential undefined behavior
- #1189: Added EIGEN_DEVICE_FUNC qualifiers to SkewSymmetricDense to fix CUDA compatibility
- #1201: Fixes ODR violation with `gemm_extra_cols` function on PowerPC to prevent potential crashes
- #1202: Fixed MSVC ARM build compatibility by resolving intrinsic function and vector type handling issues
- #1212: Disabled array BF16 to F32 conversions on Power architecture to improve performance and stability
- #1213: Resolved multiple compiler warnings to improve code quality and maintainability
- #1215: Fixed compiler warnings in test files to improve code quality and maintainability
- #1216: Fixed a typo in the NEON `make_packet2f` function to improve correctness
- #1218: Fix MSVC atan2 test to align with POSIX specification for underflow cases
- #1220: Fixed NEON packetmath compilation issues with GCC and resolved preinterpret stack overflow problem
- #1221: Guard complex sqrt function for compatibility with old MSVC compilers
- #1222: Fixed epsilon value for long double in double-doubles to improve algorithm convergence on PowerPC
- #1228: Fixed compiler compatibility for `vec_div` instructions on Power architecture
- #1229: Resolved MemorySanitizer (MSAN) failures in SVD tests by fixing uninitialized matrix entry issues
- #1235: Fixed ODR issues with Intel's AVX512 TRSM kernels by removing static qualifiers
- #1239: Fixed NEON integer shift operation test for signed shifts to prevent incorrect argument handling
- #1245: Fixed cwise test by resolving signed integer overflow issues using `.abs()` method
- #1248: Fixed typo in LinAlgSVD example code to enable compilation and correct least-squares solution
- #1249: Fixed MSVC test failures related to intrinsic operations by replacing set1 intrinsics with set intrinsics
- #1252: Implemented a workaround for a compiler bug in Tridiagonalization.h, improving stability across compiler environments
- #1256: Fixed bug in minmax_coeff_visitor when matrix contains only NaN values
- #1257: Fixed minmax visitor behavior for PropagateFast option to prevent out-of-bounds index issues with NaN matrices
- #1258: Reverted BF16 GEMM changes that caused register spillage and performance degradation on LLVM Power architecture
- #1263: Fixed PowerPC and Clang compiler warnings to improve code stability
- #1269: Reverted CMake pools changes to stabilize build process and eliminate configuration errors
- #1270: Fixed ARM build compatibility issues including casting, MSVC packet conversion, and 32-bit ARM macro definitions
- #1271: Fixed issues with SparseMatrix::Map typedef and setFromTriplets method robustness
- #1277: Fix incorrect casting in AVX512DQ vectorization path
- #1282: ASAN fixes for AVX512 GEMM/TRSM kernels, addressing memory safety issues with buffer overrun prevention
- #1283: Corrected intrinsic function for accurate truncation during double-to-int casting
- #1291: Fixed `.gitignore` to prevent accidentally ignoring Eigen's Core directory on Windows
- #1302: Fixed typo in SSE packetmath implementation
- #1308: Fix `pow` function for `uint32_t` and disable problematic packet multiplication operation
- #1311: Fixed sparse iterator compatibility and warnings on macOS with Clang by modifying `StorageRef` and replacing deprecated `std::random_shuffle`
- #1312: Fixed boolean bitwise and warning in test code
- #1318: Add safeguard in JacobiSVD to handle non-finite inputs by setting `m_nonzeroSingularValues` to zero
- #1319: Fixed ColMajor BF16 GEMV implementation for RowMajor input vectors
- #1321: Cleaned up array_cwise test by suppressing MSVC warnings, resolving operator precedence, and removing redundant shift tests
- #1322: Fixed specialized `loadColData` implementation for BF16 GEMV, improving LLVM compatibility
- #1323: Fixed compiler warning related to modulo by zero in visitor pattern
- #1327: Fixed CUDA compilation issues by adjusting header inclusion order and resolving `EIGEN_AVOID_STL_ARRAY` related problems
- #1333: Fixed SVD initialization issues and compiler warnings in JacobiSVD and BDCSVD routines
- #1339: Fixes CUDA compilation issues with `EIGEN_HAS_ARM64_FP16_SCALAR_ARITHMETIC` by preventing miscompilation in host/device functions
- #1343: Fixed unary `pow()` error handling with improved edge case management and test robustness
- #1344: Prevent underflow in `prsqrt` function by adding numerical stability safeguards
- #1349: Fixed AVX `pstore` function for integer types to ensure correct aligned store intrinsics
- #1350: Fixed `safe_abs` in `int_pow` to improve compatibility with Clang compiler
- #1351: Fixed SVD test stability by removing deprecated test behavior
- #1357: Fixed `supportsMMA` to correctly handle `EIGEN_ALTIVEC_MMA_DYNAMIC_DISPATCH` compilation flag and compiler support
- #1359: Fixed AVX512 trsm kernel memory allocation issues in nomalloc environments
- #1360: Fixed return type of `ivcSize` in `IndexedViewMethods.h` to improve type safety and consistency
- #1361: Fixes Altivec compilation compatibility with C++20 and higher standards
- #1362: Fixed `_mm256_cvtps_ph` intrinsic argument to eliminate MSVC compilation warning
- #1363: Fixed `arg` function compatibility in CUDA environments, resolving compilation issues with MSVC and C++20
- #1367: Addresses GCC compiler warnings by fixing zero-sized block handling, assignment operators, and uninitialized variable issues
- #1369: Fixed ARM build warnings by addressing type casting and variable shadowing issues in Eigen's Tensor module
- #1370: Fix compiler warning for matrix-vector multiplication loop optimizations on x86-64 gcc 10+
- #1371: Fixed `-Wmaybe-uninitialized` warning in SVD implementation by improving dimension initialization and type safety
- #1376: Fixed nullptr dereference issue in triangular product for zero-sized matrices
- #1377: Fix undefined behavior in triangular solves for empty systems
- #1379: Prevents nullptr dereference in SVD implementation for small matrices
- #1380: Fixes undefined behavior related to scalar memory alignment with improved memory alignment checks
- #1382: Fix tensor strided linear buffer copy to prevent negative index issues and improve integer arithmetic safety
- #1386: Fixed ARM32 floating-point division issues, improving accuracy and reliability of float computations
- #1388: Fixed stage success check in Pardiso solver to only report success when `m_info == Eigen::Success`
- #1392: Fixed CUDA device function calls by adding `EIGEN_DEVICE_FUNC` attribute to static run methods
- #1394: Fixed extra semicolon in `XprHelper` causing compilation issues with `-Wextra-semi` flag
- #1396: Fixed sparse triangular view iterator by restoring `row()` and `col()` function implementations to prevent segmentation faults
- #1399: Disable denorm deprecation warnings in MSVC C++23 to reduce compiler noise
- #1402: Work around MSVC compiler issue with Block XprType by removing dependent typedef
- #1407: Fixed `Wshorten-64-to-32` warnings in `div_ceil` function to improve code robustness
- #1409: Resolved compiler warnings and critical bug fixes in `Memory.h`
- #1411: Fixed typo in `EIGEN_RUNTIME_NO_MALLOC` macro to resolve nomalloc test failure on AVX512
- #1412: Backports fix for disambiguation of overloads with empty index lists, resolving compilation errors
- #1415: Link pthread library for `product_threaded` test to resolve test execution issues
- #1416: Fixed `Wshorten-64-to-32` warning in gemm parallelizer to improve code quality
- #1417: Fixed `getNbThreads()` to correctly return 1 when threading is not parallelized
- #1419: Ensures `mc` is not smaller than `Traits::nr` to prevent potential calculation errors
- #1422: Fix 64-bit integer to float conversion precision on ARM architectures
- #1425: Fixes typecasting issue for arm32 architecture, restoring proper functionality
- #1431: Fixed `scalar_logistic_function` overflow handling for complex inputs by improving comparison mechanism
- #1434: Fixed CUDA syntax error introduced by clang-format
- #1439: Fixed `_BitScanReverse` function implementation for MSVC to correctly count leading zeros
- #1444: Fixed index type handling in `StorageIndex` to prevent overflow during resize operations in Eigen::SPQR module
- #1447: Addressed multiple AddressSanitizer (asan) errors including out-of-bounds, use-after-scope, and memory leak issues across various Eigen components
- #1448: Fixed MSAN failures by resolving uninitialized memory use in matrices
- #1449: Fixed GPU computation issue with Clang and AddressSanitizer by replacing function pointers with lambdas
- #1451: Fixed build error in SPQR module due to StorageIndex and Index type mismatch
- #1456: Improved memory safety by adding pointer checks before freeing to prevent potential undefined behavior
- #1457: Add runtime assertions for .chip to improve error handling and parameter validation
- #1458: Fixed `stableNorm` function to handle zero-sized input correctly
- #1460: Reverted changes to `stableNorm` to restore performance for large vectors
- #1463: Reverted previous assertions for `.chip()` to resolve broken tests
- #1467: Fixed compile-time error related to chip static assertions for dimension checks
- #1468: Addressed ARM32 floating-point computation issues by improving `Eigen::half` handling and numerical precision
- #1476: Fixed multiple One Definition Rule (ODR) violations across several Eigen library components
- #1478: Fixed comparison bug in detection of subnormal floating-point numbers
- #1481: Fixed CI compatibility issues for clang-6 during cross-compilation
- #1482: Fixed `preshear` transformation function implementation with corrected constructor and added verification test
- #1485: Fixed PPC architecture test failures related to random integer value ranges and signed integer overflow
- #1486: Fixed gcc-6 compiler bug in rand test by adding `noinline` attribute to preserve variable value
- #1487: Fixed skew symmetric matrix test by excluding problematic dimension cases to improve test reliability
- #1488: Fixed test compatibility issues with bfloat16 and half scalar types
- #1489: Fixed undefined behavior in `getRandomBits` when generating random values with zero bits
- #1490: Fixed undefined behavior in boolean packetmath test by correcting select mask loading
- #1492: Fixes C++20 compilation error related to arithmetic between different enumeration types
- #1494: Prevent segmentation fault in `CholmodBase::factorize()` when handling zero matrices
- #1496: Fixed division by zero undefined behavior in packet size logic
- #1498: Removes `r_cnjg` function to resolve conflicts with libf2c and inlines related complex conjugate functions
- #1499: Eliminated warning about writing bytes directly to non-trivial type using `void*` casting
- #1500: Fixes explicit scalar conversion issue in ternary expressions, resolving bug #2780
- #1503: Fix random number generation for custom scalars without `constexpr` `digits()` method
- #1504: Fixed undefined behavior in `pabsdiff` function on ARM to prevent compiler overflow issues
- #1507: Fixed deflation process in BDCSVD to improve numerical stability and correctness when handling large constant matrices
- #1510: Fixed potential infinite loop in real Schur decomposition and improved polynomial solver reliability
- #1513: Fixed pexp_complex_test to improve C++ standard compliance
- #1514: Fix exp complex test by using `int` instead of index type
- #1517: Fixed uninitialized memory usage in kronecker_product test by properly initializing matrices
- #1518: Fixed header guard inconsistencies in `GeneralMatrixMatrix.h` and `Parallelizer.h` to resolve build errors
- #1521: Fix crash in Incomplete Cholesky algorithm when input matrix has zeros on diagonal
- #1524: Fixed signed integer undefined behavior in random number generation functionality
- #1526: Fix build issue with MSVC GPU compilation by resolving `allocate()` function definition conflict
- #1528: Fixed QR colpivoting warnings by replacing `abs` with `numext::abs` to handle floating-point types correctly
- #1529: Fix triangular matrix-vector multiplication uninitialized warning by removing `const_cast` and simplifying implementation
- #1531: Add degenerate checks before calling BLAS routines to prevent crashes with zero-sized matrices/vectors
- #1533: Fixed edge-cases and test failures for complex `pexp` function
- #1535: Resolved deprecated anonymous enum-enum conversion warnings to improve code quality and compiler compatibility
- #1536: Fixed unaligned memory access in `trmv` function, resolving `nomalloc_3` test failure
- #1537: Fixed static_assert compatibility for C++14, improving compilation error clarity
- #1538: Fixes volume calculation for empty `AlignedBox` to return 0 instead of a negative value
- #1540: Fixed pexp test for 32-bit ARM architectures to handle subnormal number flushing
- #1541: Fixed `packetmath` plog test compatibility on Windows by updating comparison method using `numext::log`
- #1549: Fix const access in `CwiseUnaryView` with improved matrix mutability checks
- #1551: Resolved VS2015 compilation issue by adding explicit `static_cast` to handle `bool(...)` casting
- #1552: Fixed `CwiseUnaryView` compatibility with MSVC compiler by resolving default parameter declaration issues
- #1559: Fix SIMD intrinsics compatibility for 32-bit builds by introducing workarounds for `_mm_cvtsi128_si64` and `_mm_extract_epi64`
- #1562: Protect use of `alloca` to prevent breakages on 32-bit ARM systems
- #1566: Fixed `Packet2l` handling on Windows 32-bit platforms
- #1567: Fixes double to int64 conversion on 32-bit SSE architecture and adds Windows build smoketests
- #1568: Fix redefinition of `ScalarPrinter` for gcc compilation compatibility
- #1570: Fixed casting from `Packet2d` to `Packet2l` to use truncation instead of rounding
- #1573: Fixed compiler warnings related to unsigned type negation and type casting on MSVC
- #1574: Guarded AVX `Packet4l` definition to prevent potential compilation conflicts and improve stability
- #1576: Fixed preprocessor condition for fast float logistic implementation, restoring optimal performance by correcting `EIGEN_CPUCC` handling
- #1577: Fixed `preverse` function implementation for PowerPC architecture
- #1585: Fixed GCC bug handling `pfirst<Packet16i>` AVX512 intrinsic
- #1588: Fixed build compatibility for pblend, psin_double, and pcos_double when AVX is supported but AVX2 is not
- #1591: Fixes compilation problems with `PacketI` on PowerPC architecture
- #1594: Fix `tridiagonalization_inplace_selector::run()` method compatibility with CUDA by adding `EIGEN_DEVICE_FUNC` macro
- #1598: Fixed transposed matrix product memory allocation bug when using `noalias()`
- #1601: Fixed sine and cosine function implementation for PowerPC platforms
- #1602: Adjusted error bound for nonlinear tests with AVX to maintain accurate algorithm convergence
- #1604: Fixed AVX512 `preduce_mul` implementation for MSVC to correctly handle negative outputs
- #1606: Fixed undefined behavior in predux_mul test input generation to prevent signed integer overflow
- #1607: Fixed hard-coded magic bounds in nonlinear tests for improved cross-platform reliability
- #1610: Fixed generic nearest integer operations for GPU compatibility and performance
- #1611: Fixed CMake package include path configuration to ensure correct include directory setup
- #1614: Fix FFT functionality when destination does not have unit stride
- #1616: Fixed GCC 6 compilation error by removing namespace prefixes from struct specializations
- #1620: Fix compilation failures for `constexpr` matrices with GCC 14
- #1622: Fixed undefined behavior sanitizer (ubsan) failures in `array_for_matrix` with integer type handling
- #1628: Fixed threading tests by adjusting header inclusion order and resolving C++20 capture warnings
- #1630: Resolved warnings about repeated macro definitions to improve code reliability
- #1631: Fixed multiple GCC warnings related to enum comparisons, improving code clarity and maintainability
- #1633: Resolved compiler warnings introduced by previous warning fixes
- #1635: Fixed deprecated enum comparison warnings by improving type-safe comparisons
- #1637: Fix scalar `pselect` NaN handling inconsistency in MSVC fast-math mode
- #1639: Resolve AVX512FP16 build failure by implementing vectorized cast specializations for `packet16h` and `packet16f`
- #1642: Reverted a previous change addressing scalar `pselect` functionality
- #1648: Fix overflow warnings in `PacketMathFP16` by adding explicit `short` casts
- #1649: Fix compiler `-Wmaybe-uninitialized` warnings in BDCSVD by using placement new for object initialization
- #1650: Removes incorrect C++23 check for suppressing `has_denorm` deprecation warnings in MSVC
- #1651: Fixes compilation issues with `Eigen::half` to `_Float16` conversion in AVX512 code using `bit_cast` and user-defined literals
- #1658: Fixed pi definition in kissfft module to improve computational precision
- #1679: Addressed `Wmaybe-uninitialized` warnings in BDCSVD module, improving memory safety and code reliability
- #1685: Fixed out-of-range argument handling for `_mm_permute_pd` function to prevent undefined behavior
- #1688: Fixed bug in `atanh` function for input value -1, improving numerical stability and accuracy
- #1690: Fixed bug in `atanh` function implementation to improve accuracy and reliability
- #1693: Fix generic SSE2 ceil implementation for negative numbers near zero
- #1697: Removed unneeded `_mm_setzero_si128` function call, addressing issue #2858 and potentially improving code efficiency
- #1699: Fixed compiler warning in `EigenSolver::pseudoEigenvalueMatrix()` for matrix dimension handling
- #1707: Fix `erf(x)` computation to avoid NaN for large input values
- #1708: Fixed `atan` test for 32-bit ARM architecture by adjusting handling of flush-to-zero behavior
- #1711: Fixed compilation bug in `DenseBase::tail` for dynamic template arguments, improving function flexibility
- #1718: Fixed out-of-bounds access in triangular matrix multiplication code, improving safety and reliability
- #1720: Fixed NVCC build issues for CUDA 10+ by resolving warnings and assignment operator problems
- #1722: Fixed matrix parameter passing issues affecting internal data alignment in GCC arm environment
- #1723: Fixes compiler-specific issues with clang6 optimization, addressing problems in `small_product_5`, `cross3`, and SSE `pabs` functions
- #1724: Removes default FFT macros from CMake test declarations to eliminate macro redefinition warnings in FFTW tests
- #1725: Fixed clang6 and ARM architecture compatibility by modifying SSE instruction handling
- #1726: Fixed GPU builds by adding initializers for `constexpr` global variables to ensure CUDA compatibility
- #1740: Fixed CMake compatibility by reverting `separate_arguments()` syntax for older CMake versions
- #1742: Fix compilation issue in `Assign_MKL.h` by casting enum to `int` for comparison
- #1760: Fixed undefined behavior in `setZero` for null destination arrays and zero-sized blocks
- #1761: Fixed map fill logic to support more flexible stride configurations, including 0/0 stride and outer stride equal to underlying inner size
- #1762: Fixed IOFormat alignment computation for more consistent matrix output
- #1764: Fixed CI checkformat stage by updating base Ubuntu image to latest LTS version
- #1769: Fixes special packetmath `erfc` flushing for ARM32 architecture, improving subnormal number handling
- #1785: Fixes build issue by adding missing `#include <new>` header
- #1790: Fix uninitialized threshold read in `SparseQR::factorize()` method to improve code safety
- #1792: Resolves `std::fill_n` reference issue by removing `EIGEN_USING_STD` to prevent namespace conflicts
- #1793: Zero-initialize test arrays to prevent uninitialized memory reads and improve test reliability
- #1799: Fixed task retrieval logic in `NonBlockingThreadPool` to correctly enable task stealing between threads
- #1802: Fixed initialization order and removed unused variables in `NonBlockingThreadPool.h`
- #1804: Fixes potential data race on `spin_count_` in `NonBlockingThreadPool` by making it `const` and properly initializing it
- #1806: Fixed UTF-8 encoding errors in `SimplicialCholesky_impl.h` causing compilation issues in MSVC and Apple Clang
- #1810: Fixed midpoint calculation in `Eigen::ForkJoinScheduler` to prevent out-of-bounds errors and improve parallel computation reliability
- #1814: Added missing return statements in PowerPC architecture implementation to improve code reliability
- #1816: Fix Android NDK compatibility issue with `__cpp_lib_hardware_interference_size` macro
- #1825: Eliminate type-punning undefined behavior in `Eigen::half` by using safer bit-cast approach
- #1831: Fixed Power architecture builds for configurations without VSX and POWER8 support
- #1833: Fixed `Warray-bounds` warning in inner product implementation, preventing potential array access out-of-bounds errors
- #1834: Fixed uninitialized matrix elements in bicgstab test to improve test reliability
- #1835: Resolved bitwise operation compilation error when compiling with C++26
- #1841: Fixed documentation job configuration for nightly builds
- #1842: Resolved CMake BOOST warning by updating configuration to eliminate deprecated behavior
- #1850: Fixed x86 complex vectorized FMA implementation to improve computational accuracy and performance
- #1851: Fixed implementation of Givens rotation algorithm to improve accuracy and reliability
- #544: Added GDB pretty printer support for Eigen::Block types to improve debugging experience
- #572: Removed unnecessary `const` qualifiers from AutodiffScalar return types to improve code quality and readability
- #605: Updated SparseExtra RandomSetter to use unordered_map for improved performance
- #609: Optimize predux, predux_min, and predux_max operations for AArch64 architecture using specialized intrinsics
- #615: Adds intrin header for Windows ARM to improve compatibility and intrinsic function support
- #617: Extended matrixmarket reader/writer to support handling of dense matrices
- #618: Added EIGEN_DEVICE_FUNC labels to improve CUDA 9 compatibility for gpu_basic tests
- #631: Introduced error handling to prevent direct inclusion of internal Eigen headers
- #632: Simplified CMake configuration by removing unused interface definitions
- #633: Simplified CMake versioning for architecture-independent package configurations using `ARCH_INDEPENDENT` option
- #634: Improved CMake package registry configuration for better dependency management
- #641: Removed unnecessary std::tuple reference to simplify codebase
- #647: Cleaned up EIGEN_STATIC_ASSERT to use standard C++11 static_assert, improving error messages and code organization
- #648: Corrected typographical errors in copyright dates across project files
- #652: Added a macro to pass arguments to ctest for running tests in parallel
- #655: Improved CI test execution by running tests in parallel across all available CPU cores
- #660: Corrected multiple typos in documentation and comments to improve code clarity and readability
- #661: Corrected typographical errors in documentation and code comments
- #662: Reorganized test main file for improved maintainability and code structure
- #663: Reduced CUDA compilation warnings for versions 9.2 and 11.4
- #668: Updated CMake Windows compiler and OS detection with more reliable and maintainable methods
- #677: Optimized type punning in CUDA code by replacing memcpy with reinterpret_cast for improved GPU performance
- #687: Adds nan-propagation options to elementwise min/max operations and reductions in matrix and array plugins
- #692: Extend Eigen's Qt support to Qt6 by modifying compatibility functions in Transform.h
- #693: Enhanced documentation for Stride class inner stride behavior in compile-time vectors
- #697: Optimize CMake scripts to improve Eigen subproject integration and reduce default test build overhead
- #700: Vectorized tanh and logistic functions for fp16 on Neon, improving computational performance
- #701: Move alignment qualifier to improve consistency and resolve compiler warnings
- #712: Improved documentation for Quaternion constructor from MatrixBase, clarifying element order and usage
- #716: Converted diagnostic pragmas to standardized nv_diag format, improving code consistency and maintainability
- #717: Moved pruning code from CompressedStorage to SparseVector.h to improve code organization
- #718: Update SparseMatrix::Map and TransposedSparseMatrix to use consistent StorageIndex across implementations
- #720: Fixed a documentation typo to improve clarity
- #722: Optimized Umeyama algorithm computation by conditionally skipping unnecessary scaling calculations
- #726: Added basic iterator support for Eigen::array to simplify array usage and transition from std::array
- #727: Made numeric_limits members constexpr for improved compile-time evaluation
- #734: Improved AVX2 optimization selection for non-multiple-of-8 data sizes
- #735: Simplified C++11 feature checks by removing redundant macros and compiler version checks
- #737: Refactored Lapacke LLT macro binding to improve code clarity and maintainability
- #748: Improved Lapacke bindings for HouseholderQR and PartialPivLU by replacing macros with C++ code and extracting common binding logic
- #753: Convert computational macros to type-safe constexpr functions for improved code quality
- #756: Conditional inclusion of header to improve compatibility with toolchains lacking atomic operations support
- #757: Refactored IDRS code, replacing `norm()` with `StableNorm()` to improve code stability and numerical performance
- #760: Removed `using namespace Eigen` from sample code to promote better coding practices
- #761: Cleanup of obsolete compiler checks and flags, streamlining the codebase and reducing maintenance overhead
- #763: Cleaned up CMake scripts by removing deprecated `COMPILE_FLAGS` and adopting modern `target_compile_options`
- #767: Improved `exp()` function behavior for `-Inf` arguments in vectorized expressions with performance optimizations
- #772: Cleanup of internal macros and sequence implementations to simplify codebase
- #773: Optimized row-major sparse-dense matrix product implementation with two accumulation variables to improve computational efficiency
- #774: Fixes for enabling HIP unit tests and updating CMake compatibility
- #776: Improved CMake handling of `EIGEN_TEST_CUSTOM_CXX_FLAGS` by converting spaces to semicolons
- #779: Optimize `exp<float>()` with reduced polynomial degree, expanded denormal range, and 4% speedup for AVX2
- #780: Improved accuracy and performance of logistic sigmoid function implementation, reducing maximum relative error and extending computational range
- #783: Simplified `logical_xor()` implementation for bool types, improving code clarity and efficiency
- #786: Small cleanup of GDB pretty printer code, improving code readability and maintenance
- #788: Small documentation and code quality improvements, including fixing warnings and documentation formatting
- #790: Added missing internal namespace qualifiers to vectorization logic tests
- #791: Added support for Cray, Fujitsu, and Intel ICX compilers with new preprocessor macros
- #792: Enables manual specification of inner and outer strides for CwiseUnaryView, enhancing stride control and flexibility
- #795: Refactored identifiers to reduce usage of reserved names in compliance with C++ standard guidelines
- #797: Adds bounds checking to Eigen serializer to improve data integrity and prevent out-of-bounds access
- #799: Performance improvement for logarithm function with 20% speedup for float and better denormal handling
- #813: Corrected and clarified documentation for Least Squares Conjugate Gradient (LSCG) solver, improving mathematical descriptions and user understanding
- #814: Updated comments to remove references to outdated macro and improve code clarity
- #816: Port EIGEN_OPTIMIZATION_BARRIER to support soft float ARM architectures
- #819: Enhance clang warning suppressions by checking for supported warnings before applying suppressions
- #821: Prevent unnecessary heap allocation in diagonal product by setting NestByRefBit for more efficient memory management
- #825: Introduced utility functions to reduce floating-point warnings and improve comparison precision
- #830: Removed documentation referencing obsolete C++98/C++03 standards
- #832: Improved AVX512 math function consistency and ICC compatibility for more reliable mathematical computations
- #836: Refined compiler-specific `maxpd` workaround to target only GCC<6.3
- #838: Corrected definition of EIGEN_HAS_AVX512_MATH in PacketMath to improve AVX512 math capabilities
- #841: Consolidated and improved generic implementations of psqrt and prsqrt functions with correct handling of special cases
- #844: Updated MPL2 license link to use HTTPS for improved security
- #845: Improved numeric_limits implementation to ensure One Definition Rule (ODR) compliance and enhance static data member definitions
- #846: Optimize performance by returning alphas() and betas() vectors as const references
- #849: Improved documentation for MatrixXNt and MatrixNXt matrix patterns and fixed documentation compilation issues
- #850: Added descriptive comments to Matrix typedefs to improve Doxygen documentation
- #854: Added scaling function overload to handle vector rvalue references, improving diagonal matrix creation from temporary vectors
- #861: Improved FixedInt constexpr support and resolved potential ODR violations
- #864: Cleaned up unnecessary EIGEN_UNUSED decorations to improve code clarity and maintainability
- #869: Improved SYCL support by simplifying CMake configuration and enhancing compatibility with C++ versions
- #872: Improved sqrt/rsqrt handling of denormal numbers and performance optimizations for AVX512
- #879: Improved efficiency of any/all reduction operations for row-major matrix layouts
- #884: Simplified non-convergence checks in NonLinearOptimization tests to improve test reliability across different architectures
- #887: Enhance vectorization logic tests for improved cross-platform compatibility and test reliability
- #888: Optimized least_square_conjugate_gradient() performance using .noalias() to reduce temporary allocations
- #889: Introduced `construct_at` and `destroy_at` wrappers, improving code clarity and modernizing memory management practices throughout Eigen
- #890: Removed duplicate IsRowMajor declaration to reduce compilation warnings and improve code clarity
- #891: Optimized SVD test memory consumption by splitting and reducing test matrix sizes
- #893: Adds new CMake configuration options for more flexible build control of Eigen library components
- #895: Added move constructors to SparseSolverBase and IterativeSolverBase for improved solver object management
- #903: Replaces enum with constexpr for floating point bit size calculations, reducing type casts and improving code readability
- #904: Converted static const class members to constexpr for improved compile-time efficiency
- #907: Enhances PowerPC MMA build options with dynamic dispatch and improved compatibility for Power10 processors
- #909: Removed outdated GCC-4 warning workarounds, simplifying and improving code maintainability
- #913: PowerPC MMA build configuration enhancement with dynamic dispatch option
- #916: Updated Altivec MMA dynamic dispatch flags to support binary values for improved TensorFlow compatibility
- #921: Optimized visitor traversal for RowMajor inputs, improving matrix operation performance
- #927: Update warning suppression techniques for improved compiler compatibility
- #929: Split general matrix-vector product interface for Power architectures to improve TensorFlow compatibility
- #931: Re-enabled Aarch64 CI pipelines to improve testing and validation for Aarch64 architecture
- #939: Improved LAPACK module code organization by removing `.cpp` file inclusions
- #940: Reintroduced std::remove* aliases to restore compatibility with third-party libraries
- #941: Improve scalar test_isApprox handling of inf/nan values
- #943: Enhanced `constexpr` helper functions in `XprHelper.h` to improve compile-time computations and code clarity
- #944: Converted metaprogramming utility to constexpr function for improved compile-time evaluation and code simplification
- #947: Added partial loading, storing, gathering, and scattering packet operations to improve memory access efficiency and performance
- #951: Optimized Power GEMV predux operations for MMA, reducing instruction count and improving compatibility with GCC
- #952: Introduced workarounds to allow all tests to pass with `EIGEN_TEST_NO_EXPLICIT_VECTORIZATION` setting
- #959: Improved AVX512 implementation with header file renaming and hardware capability restrictions
- #960: Removed AVX512VL dependency in trsm function, improving compatibility across different AVX configurations
- #962: Optimized Householder sequence block handling to eliminate unnecessary heap allocations and improve performance
- #967: Optimized GEMM MMA with vector_pairs loading and improved predux GEMV performance
- #968: Made diagonal matrix `cols()` and `rows()` methods constexpr to improve compile-time evaluation
- #969: Conditionally add `uninstall` target to prevent CMake installation conflicts
- #984: Removes executable flag from files to improve project file permission management
- #985: Improved logical shift operation implementations and fixed typo in SVE/PacketMath.h
- #989: Resolves C++20 comparison operator ambiguity in template comparisons
- #994: Marks `index_remap` as `EIGEN_DEVICE_FUNC` to enable GPU expression reshaping
- #997: Enhances AVX512 TRSM kernels memory management by using `alloca` when `EIGEN_NO_MALLOC` is requested
- #998: Improved tanh and erf vectorized implementation for EIGEN_FAST_MATH in VSX architecture
- #999: Update Householder.h to use numext::sqrt for improved custom type support
- #1000: Performance optimization for GEMV on Power10 architecture using more load and store vector pairs
- #1002: Addressed clang-tidy warnings by reformatting function definitions in headers and improving code clarity
- #1006: Improved AutoDiff module header management by including necessary Core dependencies
- #1009: Corrected Doxygen group usage to improve documentation clarity and structure
- #1011: Optimized pblend AVX implementation, reducing execution time by 24.84%
- #1013: Added compiler flag to enable/disable AVX512 GEBP kernels, improving configuration flexibility
- #1016: Resolved Emscripten header inclusion issue with `immintrin.h`
- #1020: Modify ConjugateGradient to use numext::sqrt for improved type compatibility
- #1021: Updated AccelerateSupport documentation for improved clarity and accuracy
- #1026: Vectorized sign operator for real types to enhance computational performance across different CPU architectures
- #1031: Eliminated bool bitwise warnings by refactoring code to use logical operations instead of bitwise operations
- #1035: Removed redundant FP16C checks for AVX512 intrinsics, improving performance for float-to-half and half-to-float conversions
- #1040: Specialized `psign<Packet8i>` for AVX2 with up to 79.45% performance improvement and removed vectorization of `psign<bool>`
- #1043: Vectorized implementation of pow for integer base and exponent types, improving performance and numerical robustness
- #1050: Added index-out-of-bounds assertions in IndexedView to improve error detection and library safety
- #1052: Improved CMake configuration by disabling default benchmark builds and fixing test dependencies with sparse libraries
- #1054: Fixed documentation typo in TutorialSparse.dox
- #1056: Reduced compiler warnings in test code to improve build output and code quality
- #1058: Added missing comparison operators for GPU packets, resolving CUDA build issues and improving GPU computation support
- #1064: Improved constexpr compatibility for g++-6 and C++20, addressing build errors and compiler-specific constraints
- #1066: Improved `pow()` function to allow mixed types with safe type promotions
- #1075: Optimized sign function for complex numbers by conditionally using generic vectorization
- #1078: Added macro to configure `nr` trait in GEBP kernel for NEON architecture, potentially improving matrix computation performance
- #1079: Optimize GEBP kernel compilation time and memory usage with EIGEN_IF_CONSTEXPR
- #1083: Reduced memory footprint of GEBP kernel for non-ARM targets to mitigate MSVC heap memory issues
- #1084: Vectorized implementation of atan() for double precision, improving computational efficiency
- #1086: Conditional vectorization of `atan<double>` for Altivec with VSX support
- #1087: Simplified range reduction strategy for `atan<float>()` with 20-40% speedup on x86 architectures
- #1088: Replaced standard `assert` with `eigen_assert` for improved consistency and assertion control
- #1091: Added macros to AttributeMacros to improve clang-format compatibility and code formatting
- #1093: Improved handling of NaN inputs in atan2 function to enhance mathematical computation reliability
- #1095: Refactored special values tests for pow and added new test for atan2, improving mathematical function testing
- #1099: Clarified documentation requirement that indices must be sorted to improve library usability
- #1100: Enhanced resizing capabilities for dynamic empty matrices, improving matrix dimension handling and flexibility
- #1101: Improved memory management by using 1-byte offset for address alignment in handmade allocation functions
- #1102: Add assertion to validate outer index array size in SparseMapBase, improving input validation and preventing potential runtime errors
- #1109: Removed unnecessary assert in SparseMapBase to improve flexibility in sparse matrix initialization
- #1110: Removed unused parameter name to improve code readability
- #1111: Fixed Neon vectorization issues to improve ARM architecture performance and compatibility
- #1114: Enhanced BiCGSTAB parameter initialization to support custom types
- #1117: Small cleanup of IDRS.h, removing unused variable and improving comment formatting
- #1122: Reduced compiler warnings in test files by addressing narrowing conversions and improving code quality
- #1128: Enables direct access for NestByValue construct, improving performance and usability
- #1129: Added BDCSVD LAPACKE binding for more flexible and efficient SVD computations
- #1131: Increased L2 and L3 cache sizes for Power10 architecture to improve matrix operation performance by 1.33X
- #1134: Optimized `equalspace` packet operation to improve performance and computational efficiency
- #1135: Improved divide by zero error handling for better cross-platform compatibility
- #1136: Reviewed and cleaned up compiler version checks to improve maintainability and compatibility
- #1137: Improved bfloat16 support by replacing std::signbit with numext::signbit
- #1138: Improved test coverage for numext::signbit function
- #1139: Adds comparison, `+=`, and `-=` operators to `CompressedStorageIterator` to improve iterator functionality
- #1140: Improved SparseLU implementation by updating dense GEMM kernel and fixing initialization bug in SparseLUTransposeView
- #1141: Enables NEON absolute value operations for unsigned integer types, improving performance for `.cwiseAbs()` operations
- #1144: Improved C++ version detection macros and CMake tests to enhance compatibility and reduce CI failures
- #1145: Improved bfloat16 product test thresholds to enhance comparison reliability
- #1146: Enabled additional NEON instructions including complex psqrt and plset operations
- #1154: Significantly improved Power10 MMA bfloat16 GEMM performance with up to 61X speedup
- #1158: Clarified help message for spbenchsolver to improve matrix file naming instructions
- #1165: Added missing `EIGEN_DEVICE_FUNC` in assertions, improved code compatibility and clarity
- #1167: Improved `ColPivHouseholderQR` move assignment to enhance compiler compatibility
- #1169: Replaced deprecated CMake generator expression `$<CONFIGURATION>` with `$<CONFIG>` to improve build system compatibility
- #1172: Refactored SparseMatrix.h to improve code consistency and readability by directly referencing class members
- #1174: Performance optimization for bfloat16 matrix-matrix multiplication in non-standard matrix dimensions
- #1175: Improved `atan2` implementation with better corner case handling and performance optimization
- #1176: Optimized mathematical packet operations including `atan`, `atan2`, `acos`, and binary/unary power computations
- #1186: Update to ForwardDeclarations.h for improved header organization and maintainability
- #1190: Standardized zero comparisons using VERIFY_IS_EQUAL macro for improved code consistency and reliability
- #1191: Improved LAPACKE configuration with better complex type handling and LAPACK library compatibility
- #1192: Improved EIGEN_DEVICE_FUNC compatibility across CUDA 10/11/12 versions and cleaned up warnings
- #1198: Replaced eigen_asserts with eigen_internal_asserts in Power module to reduce unnecessary error checking in release builds
- #1199: Added IWYU export pragmas to top-level headers to improve tooling compatibility and code maintainability
- #1206: Enhances type handling for complex numbers in ColPivHouseholderQR_LAPACKE.h using LAPACKe specializations
- #1207: Optimized psign implementation for floating point types with reduced computational complexity
- #1208: Reverted ODR changes and added EIGEN_ALWAYS_INLINE to gemm_extra_cols and gemm_complex_extra_cols functions to optimize performance
- #1214: Optimized BF16 to F32 array conversions on Power architectures by reducing vector instructions
- #1219: Optimized `pasin_float` function with bit manipulation for improved performance and fixed `psqrt_complex` error handling
- #1223: Vectorized implementation of atanh, added atan definition, and new unit tests for mathematical functions
- #1224: Added Packet int divide support for Power10 architecture, improving computational performance
- #1226: Improved performance of pow() on Skylake by using pmsub instruction in twoprod
- #1230: Eliminates EIGEN_HAS_AVX512_MATH workaround, simplifying AVX512 packet math implementation
- #1232: Introduced guard mechanism to manage `long double` usage on GPU devices, improving compilation compatibility
- #1234: Streamlined BLAS/LAPACK routine declarations by removing unused headers and improving file organization
- #1241: Improved CMake configuration to prevent unintended modifications when Eigen is a sub-project
- #1242: Optimized memory allocation during tridiagonalization for eigenvector computation
- #1250: Replaced instances of 'Lesser' with 'Less' to improve terminology consistency
- #1251: Minor code style improvement by adding newline to end of file
- #1253: Simplified packetmath specializations using a macro, improving code readability and maintainability across backends
- #1259: Reinstated and expanded deadcode checks to improve code quality and maintainability
- #1262: Limits build and link jobs on PowerPC to reduce out-of-memory issues
- #1264: Introduced EIGEN_NOT_A_MACRO macro to improve compatibility with TensorFlow build process
- #1266: Removed pools for CMake versions less than 3.11, streamlining build configuration
- #1267: Corrected various typographical errors to improve code readability and documentation quality
- #1268: Improved CMake argument parsing to support semi-colon separated lists for better build system compatibility
- #1272: Optimized casting performance for x86_64 architectures, with significant speedups in bool and float casting operations
- #1274: Optimize float->bool cast performance for AVX2 with significant speed improvements
- #1275: Added vectorized integer casts for x86 and removed redundant unit tests, improving performance by up to 66.77%
- #1276: Optimized `generic_rsqrt_newton_step` function, improving accuracy and performance of square root calculations
- #1284: Clean up packet math implementation by removing unused traits, adding missing specializations, and setting blend properties
- #1286: Improves type safety for non-const symbolic indexed view expressions by adding explicit l-value qualification
- #1288: Updated documentation for Eigen 3.4.x to improve build process and clarity
- #1294: Improved accuracy of `erf()` function with refined rational approximation and enhanced clamping methods
- #1298: Improved tensor select evaluator using typed ternary selection operator for better performance
- #1303: Improved `Erf()` function performance and accuracy, ensuring +/-1 return values at clamping points with computational speed enhancements
- #1305: Enhanced `StridedLinearBufferCopy` with half-`Packet` operations to improve computational efficiency
- #1313: Added pmul and abs2 operations for Packet4ul in AVX2 implementation
- #1316: Implemented `pcmp`, `pmin`, and `pmax` functions for `Packet4ui` type in SSE to improve vectorization compliance
- #1317: Optimized F32 to BF16 conversions with loop unrolling, achieving 1.8X faster performance for LLVM and vector pair improvements for GCC
- #1320: Improved memory management for FFTW/IMKL FFT backends using `std::shared_ptr`
- #1324: Update `ndtri` function to return NaN for out-of-range input values, improving consistency with SciPy and MATLAB
- #1325: Renamed `array_cwise` test to prevent naming conflicts and suppressed compiler warnings
- #1328: Implements specialized vectorization for `scalar_cast_op` evaluator, enhancing performance and safety in casting operations
- #1334: Improved unrolled assignment evaluator with more consistent linear access methods for small fixed-size arrays and matrices
- #1337: Clean-up of Redux.h and vectorization_logic test to improve code readability and test reliability
- #1338: Optimized error handling for `scalar_unary_pow_op` with improved performance and robustness for integer base and exponent scenarios
- #1342: Optimized Newton-Raphson step for reciprocal square root, reducing max relative error from 3 to 2 ulps in floating-point calculations
- #1346: Introduced move constructor for `Ref<const...>` to improve performance and reduce unnecessary copying
- #1352: Improved `rint`, `round`, `floor`, and `ceil` mathematical functions for enhanced precision and performance
- #1353: Removed deprecated function calls in SVD test suite to improve code maintainability
- #1354: Added optional offset parameter to `ploadu_partial` and `pstoreu_partial` to improve API consistency
- #1356: Fixed compilation warning by unconditionally defining `EIGEN_HAS_ARM64_FP16_VECTOR_ARITHMETIC` on ARM architectures
- #1358: Addressed multiple compiler warnings across various modules through strategic code refactoring and type handling improvements
- #1364: Optimized `check_rows_cols_for_overflow` with partial template specialization for more efficient matrix size checks
- #1365: Added missing x86 primary casts for float, int, and double type conversions across SIMD instruction sets
- #1372: Enhanced Power architecture support with partial packet resolution, CPU improvements, DataMapper updates, and `bfloat16` type compatibility
- #1373: Adds `max_digits10` function to `Eigen::NumTraits` for improved floating-point decimal digit representation
- #1378: Improved handling of reference forwarding by replacing `std::move()` with `std::forward()` to address clang-tidy warning
- #1381: Update boost MP test suite to reference new SVD test cases
- #1384: Added IWYU private pragmas to internal headers to enhance tooling capabilities and header management
- #1385: Renamed non-standard plugin headers to use `.inc` extension for improved header management
- #1391: Exported `ThreadPool` symbols from legacy header to silence Clang include-cleaner warnings
- #1393: Update ROCm configuration to use `ROCM_PATH` for improved compatibility with ROCm 6.0
- #1397: Consolidated and simplified multiple implementations of divup/div_up/div_ceil functions
- #1398: Eliminated use of `_res` identifier to resolve macro conflicts and improve code compilation
- #1400: Modifies `div_ceil` function to pass arguments by value, reducing potential ODR-usage errors
- #1401: Fixed a typo in code comments to improve documentation clarity
- #1404: Improve build system by avoiding documentation builds during cross-compilation or non-top-level builds
- #1413: Improved `traits<Ref>::match` to correctly handle strides for contiguous memory layouts, eliminating unnecessary copying and enhancing `Ref` class efficiency
- #1421: Gemv microoptimization improving loop performance and reducing compilation warnings
- #1424: Optimized matrix-vector operations in `GeneralMatrixVector.h` for improved performance when `PacketSize` is a power of two
- #1428: Introduced clang-format in CI to ensure consistent code formatting and improve code maintainability
- #1429: Applied clang-format to entire Eigen codebase for consistent code style and improved maintainability
- #1430: Introduced `.git-blame-ignore-revs` file to improve git blame functionality for contributors
- #1432: Comprehensive clang-format-17 update across Eigen library, improving code consistency and readability
- #1433: Improved formatting of `.git-blame-ignore-revs` file for better Git blame operations
- #1437: Improved random number generation functionality for scalar types, addressing entropy limitations and enhancing randomness across platforms
- #1438: Improved documentation for SparseLU module, clarifying function relationships and usage
- #1443: Updated continuous integration testing framework to enhance testing processes and reliability
- #1446: Removed C++11-specific code from count trailing/leading zeros implementations to improve portability
- #1450: Simplified and optimized `stableNorm` function, eliminating GCC uninitialized variable warnings and improving code efficiency
- #1452: Minor documentation improvements for basic slicing examples
- #1459: Added `constexpr` qualifiers to improve compile-time evaluation capabilities
- #1461: Eliminated unused warnings in failtest, improving code quality and developer experience
- #1469: Removed explicit member function specialization to improve compiler compatibility
- #1470: Improved codebase formatting for better readability and consistency
- #1471: Update LAPACK CPU time function naming conventions for improved consistency
- #1473: Improved documentation for LAPACK's `second` and `dsecnd` functions to enhance user understanding
- #1483: Integrated `stableNorm()` in ComplexEigenSolver to improve numerical stability
- #1491: Applied clang-format to improve code consistency in lapack and blas directories
- #1495: Optimized JacobiSVD implementation by removing unnecessary member variables `m_scaledMatrix` and `m_adjoint`, improving memory efficiency and compiler optimization
- #1505: Disable float16 packet casting for native AVX512 f16 support, enhancing stability of packet operations
- #1506: Replaced `Matrix::Options` with `traits<Matrix>::Options` to improve consistency across Eigen object types
- #1515: Enhanced random number generation for custom float types with improved accuracy and reduced rounding bias
- #1519: Updates `array_size` result type from enum to `constexpr` to improve type safety and reduce compiler warnings
- #1523: Optimized SparseQR module performance, reducing computation time from 256 to 200 seconds
- #1525: Speed up sparse x dense dot product with optimization techniques and inline methods, reducing SparseQR computation time
- #1527: Removed shadowed typedefs to improve code clarity and maintainability
- #1530: Eliminate FindCUDA CMake warning to improve build configuration process
- #1532: Improved error message clarity for C++14 requirement
- #1539: Improves static vector allocation alignment in TRMV module, ensuring consistent memory alignment for fixed-sized vectors
- #1542: Split cxx11_tensor_gpu test to reduce Windows test timeouts and improve test suite reliability
- #1543: Improved incomplete Cholesky decomposition with new `findOrInsertCoeff` method and enhanced verification of sparse matrix operations
- #1547: Improved const input handling and C++20 compatibility in unary views by preserving const-ness and updating type trait implementation
- #1557: Improved documentation consistency for the Jacobi module by adjusting documentation tag placement
- #1558: Performance optimization for `Tensor::resize` by removing slow index checks and modernizing code
- #1560: Added `cwiseSquare` function and improved tests for element-wise matrix operations
- #1561: Removed unnecessary `extern C` declarations in CholmodSupport module, simplifying code and maintaining library compatibility
- #1563: Introduced custom formatting for complex numbers improving Numpy and Native compatibility
- #1564: Vectorization and MSVC compatibility improvements for `cross3_product` function in Eigen's core operations
- #1569: Optimized move constructors and assignment operators for `SparseMatrix` to improve performance during object transfers
- #1571: Improved compatibility between `Eigen::array` and `std::array`, preparing for C++17 transition
- #1580: Added support for `Packet8l` in AVX512 architecture, optimizing performance for specific instruction set operations
- #1581: Add `constexpr` qualifiers to accessors in `DenseBase`, `Quaternions`, and `Translations` to improve compile-time computation capabilities
- #1582: Refactored indexed view template definitions to improve MSVC 14.16 compatibility and eliminate parameter redefinition warnings
- #1583: Optimized `pexp` function performance with speed improvements up to 6% across different SIMD architectures
- #1584: Implemented performance optimizations for Intel `pblend` functionality using more efficient integer operations and simplified mask creation
- #1590: Introduced optimizations for `pblend` functionality with improved bitmask generation and auto-vectorization techniques
- #1592: Fixed psincos implementation for PowerPC and 32-bit ARM, improving vectorized trigonometric computations
- #1595: Update CI scripts with Windows compatibility improvements, AVX tests, and local CI environment scripts
- #1605: Removed unnecessary semicolons to improve code readability and maintainability
- #1609: Improved test reliability for eigenvector orthonormality by adjusting error tolerance for scaled matrices
- #1613: Improved 128-bit integer operations for MSVC by replacing `__uint128_t` with MSVC-supported functions
- #1615: Updated `predux` for PowerPC `Packet4i` to align summation behavior with other architectures
- #1618: Fixed grammatical error in Matrix class documentation
- #1619: Suppress C++23 deprecation warnings for `std::has_denorm` and `std::has_denorm_loss`
- #1621: Adds validation checks for indices in `SparseMatrix::insert` to improve robustness and error handling
- #1623: Reformatted `EIGEN_STATIC_ASSERT()` macro as a statement macro to improve code consistency and maintainability
- #1624: Improved memory allocation and pointer arithmetic in `aligned_alloca` function to enhance performance and code quality
- #1625: Utilizes `__builtin_alloca_with_align` to optimize memory allocation efficiency and potentially improve performance
- #1626: Refactored `data()` functions to be `constexpr`, enabling compile-time evaluation and potential performance improvements
- #1629: Vectorized implementation of `isfinite` and `isinf` functions for improved performance
- #1632: Vectorized `allFinite()` function with approximately 2.7x performance speedup on AVX-compatible hardware
- #1640: Fixed markdown formatting in README.md for improved readability
- #1641: Introduced AVX512F-based casting optimization from `double` to `int64_t` for enhanced performance
- #1644: Add async support for `chip` and `extract_volume_patches` operations in Eigen's Tensor module
- #1656: Corrected multiple typographical errors across the Eigen codebase using codespell
- #1659: Updated `.clang-format` configuration to improve JavaScript file formatting compatibility
- #1660: Updated `eigen_navtree_hacks.js` file to improve code readability, performance, and maintenance
- #1661: Improved `hlog` symbol lookup to allow local namespace definitions, enhancing function flexibility
- #1663: Optimized SSE/AVX complex multiplication kernels using `vfmaddsub` instructions for improved performance
- #1665: Cleanups to threaded product code and test cases for improved maintainability and readability
- #1666: Add `std::this_thread::yield()` to spinloops in threaded matrix multiplication to optimize CPU resource usage and instruction efficiency
- #1667: Optimized `StableNorm` performance for non-trivial sizes with improved consistency between aligned and unaligned inputs
- #1668: Added `<thread>` header to enable `std::this_thread::yield()` for improved thread management
- #1669: Introduced ARM NEON complex intrinsics `pmul` and `pmadd` for improved complex number computation performance on ARM architectures
- #1672: Vectorized implementation of `squaredNorm()` for complex types, improving performance of norm-related operations
- #1676: Improved documentation for `GeneralizedEigenSolver::eigenvectors()` method to ensure proper rendering and clarity
- #1677: Consolidated and optimized `patan()` implementations for float and double types, achieving significant performance speedups across various instruction set architectures
- #1681: Improved complex number trait handling by modifying `NumTraits<std::complex<Real_>>::IsSigned` and adding `pnmsub` tests
- #1682: Added support for nvc++ compiler with configuration macro and improved compilation compatibility
- #1683: Introduced SSE and AVX implementations for complex FMA operations, improving performance and computational accuracy
- #1684: Vectorized `atanh<double>` implementation with standard-compliant handling for |x| >= 1, delivering significant performance speedups across different instruction set architectures
- #1689: Fixed ARM SVE intrinsics bug and added `svsqrt_f32_x` sqrt support
- #1691: Updated `NonBlockingThreadPool.h` to use `eigen_plain_assert` for better C++26 compatibility
- #1692: Optimized dot product implementation with performance improvements for smaller vector sizes
- #1700: Improved test debugging capabilities by adding extra information to `float_pow_test_impl` and cleaning up `array_cwise` tests
- #1701: Add missing `EIGEN_DEVICE_FUNC` annotations to improve CUDA build compatibility
- #1702: Added `max_digits10` support for `mpreal` types in `NumTraits`
- #1703: Enhance inverse evaluator compatibility for CUDA device execution by marking as host+device function
- #1706: Improved speed and accuracy of `erf()` function with reduced maximum error and performance benchmarks
- #1709: Standardizes polynomial evaluation using `ppolevl` helper function across the codebase
- #1710: Introduced vectorized implementation of `erfc()` for float with significant performance improvements (45-72% speedup)
- #1712: Suppressed ARM out-of-bounds warnings for `reverseInPlace` function on fixed-size matrices
- #1716: Improved stack allocation assert handling to reduce performance overhead and enhance evaluator class usability
- #1727: Enhances performance by making fixed-size objects trivially move assignable, improving resource management and move operations efficiency
- #1729: Add nvc++ compiler support to Eigen v3.4, improving compilation compatibility
- #1731: Replace standard `__cplusplus` macro with library-specific `EIGEN_CPLUSPLUS` to improve MSVC compatibility
- #1732: Vectorized and improved `erfc(x)` function performance for double and float with up to 83% speedup and enhanced accuracy
- #1734: Enhances AVX implementation of `predux_any` function for improved vector reduction performance
- #1736: Added missing `EIGEN_DEVICE_FUNCTION` decorations to improve device compatibility
- #1739: Update overflow check implementation using C++ numeric limits for improved type safety and compatibility
- #1741: Improved lldb debugging support by ensuring non-inlined destructors for `MatrixBase` symbols
- #1743: Vectorized implementation of `erf(x)` for double with significant SIMD performance improvements across SSE 4.2, AVX2+FMA, and AVX512 architectures
- #1745: Fixed C++20 constexpr test compilation issues to improve test suite compatibility
- #1747: Optimized error function (erf) computation for large input values by eliminating redundant calculations
- #1748: Removed unnecessary `HasBlend` trait check, improving code readability and efficiency
- #1749: Disabled `fill_n` optimization for MSVC to improve performance of zero-initialization across compilers
- #1750: Optimized `exp(x)` function with performance improvements of 30-35%, enhancing computational efficiency for exponential calculations
- #1752: Improved `exp(x)` function performance with 3-4% speedup and prevented premature overflow
- #1753: Reinstates vectorized `erf<double>(x)` implementation for SSE and AVX architectures
- #1754: Simplified and optimized `pow()` function, achieving 5-6% performance speedup for float and double data types
- #1755: Optimized `setConstant` and `setZero` functions using `std::fill_n` and `memset` for improved performance across different matrix and array types
- #1756: Optimize `pow<float>(x,y)` with 25% speedup and improved accuracy for integer exponents
- #1759: Refactored special case handling in `pow(x,y)` and reintroduced repeated squaring for float and integer types
- #1763: Improved documentation for move constructor and move assignment methods
- #1765: Added CI deploy phase to tag successful nightly pipelines
- #1766: Updated ROCm docker image to improve CI reliability and functionality
- #1773: Improved CI pipeline to fetch commits using tags for better commit traceability and workflow consistency
- #1774: Introduced equality comparison operator for matrices with dissimilar sizes
- #1775: Simplifies nightly tag job by removing branch name from CI/CD pipeline
- #1776: Switched to Alpine image for more efficient nightly tag deployment
- #1779: Optimizes matrix construction and assignment using `fill_n` and `memset` for improved performance in matrix initialization
- #1786: Reinstates default threading behavior by using `omp_get_max_threads` when `setNbThreads` is not set
- #1787: Improved CUDA device compatibility by adding `EIGEN_DEVICE_FUNC` qualifiers and revising function implementations for better CUDA support
- #1788: Simplified CI configuration by removing unnecessary Ubuntu ToolChain PPA
- #1791: Adds ForkJoin-based ParallelFor algorithm to ThreadPool module, enhancing parallel computation performance
- #1794: Updated documentation to clarify cross product behavior for complex numbers
- #1796: Updated documentation to clarify block objects can have non-square dimensions
- #1797: Improved support for loongarch architecture in Eigen
- #1800: Documentation cleanup for `ForkJoin.h` with typo fixes and formatting improvements
- #1803: Fix threadpool compatibility issues for C++14 compilers, resolving initialization and warning problems
- #1807: Comprehensive documentation cleanup resolving Doxygen warnings and improving documentation clarity
- #1808: Minor documentation typo fixes in `ForkJoin.h`
- #1809: Improved tensor documentation by correcting class name references and streamlining documentation
- #1811: Improved cmake configuration for loongarch64 emulated tests, enhancing testing framework compatibility
- #1815: Updated check for `std::hardware_destructive_interference_size` to improve compatibility on Android platforms
- #1817: Introduced `EIGEN_CI_CTEST_ARGS` for custom test timeout control and standardized ctest-related argument naming
- #1818: Improved documentation generation with nightly Doxygen builds and enhanced error handling
- #1821: Improved BiCGSTAB numerical convergence by refining initialization and restart conditions
- #1823: Added graphviz dependency to improve documentation build and graph rendering
- #1824: Improved rcond estimate algorithm to return zero condition number for non-invertible matrices
- #1826: Adds missing MathJax/LaTeX configuration to improve mathematical formula rendering
- #1829: Refactored `AssignEvaluator.h` to modernize code, remove legacy enums, and improve maintainability
- #1832: Remove `fno-check-new` compiler flag for Clang to reduce build warnings
- #1837: Modify documentation build process to prevent automatic deletion of nightly docs on pipeline failures
- #1839: Specify constructor template arguments for `ConstexprTest` struct to improve class template argument deduction
- #1843: Improved STL feature detection for C++20 compatibility, preventing compilation issues across different compiler and library versions
- #1844: Optimized division operations in `TensorVolumePatch.h` by reducing unnecessary divisions when packet size is 1
- #1846: Refactored `AssignmentFunctors.h` to reduce code redundancy and improve consistency in assignment operations
- #121: Added a `make format` command to enforce consistent code styling across the project
- #447: Introduces BiCGSTAB(L) algorithm for solving linear systems with potential improvements for non-symmetric systems
- #482: Adds LLDB synthetic child provider for structured display of Eigen matrices and vectors during debugging
- #646: Added new make targets `buildtests_gpu` and `check_gpu` to simplify GPU testing infrastructure
- #688: Added nan-propagation options to matrix and array plugins for enhanced NaN value handling
- #729: Implemented `reverse_iterator` for `Eigen::array<...>` to enhance iteration capabilities
- #758: Added GPU unit tests for HIP using C++14, improving testing for GPU functionalities
- #852: Adds convenience `constexpr std::size_t size() const` method to `Eigen::IndexList`
- #965: Added three new fused multiply functions (pmsub, pnmadd, pnmsub) for PowerPC architecture
- #981: Added MKL adapter and implementations for KFR and FFTS FFT libraries in Eigen's FFT module
- #995: Added comprehensive documentation for the DiagonalBase class to improve library usability
- #1004: Adds `determinant()` method for various QR decomposition classes including HouseholderQR, ColPivHouseholderQR, FullPivHouseholderQR, and CompleteOrthogonalDecomposition
- #1029: Added fixed power unary operation for coefficientwise real-valued power operations on arrays
- #1046: Re-enabled pow function for complex number types, expanding mathematical computation capabilities
- #1047: Added skew symmetric matrix class for 3D vectors to enhance vector transformations
- #1097: Adds a new `signbit` function for efficient floating-point sign checking with AVX2 packet operation support
- #1098: Implemented cross product for 2D vectors, computing a scalar representing the signed area of the spanned parallelogram
- #1121: Adds serialization capabilities for sparse matrices and sparse vectors
- #1133: Introduces new `setEqualSpaced` function for creating equally spaced vectors with vectorized implementation
- #1209: Added functionality to directly print diagonal matrix expressions without requiring dense object assignment
- #1297: Added `Packet4ui`, `Packet8ui`, and `Packet4ul` packet types for SSE/AVX to support unsigned integer SIMD operations
- #1299: Added BF16 pcast functions and centralized type casting in `TypeCasting.h`
- #1309: Added `Abs2` method for `Packet4ul` data type to enhance vectorized operations
- #1331: Added new test to validate SYCL functionalities in Eigen core library
- #1335: Added new methods `removeOuterVectors()` and `insertEmptyOuterVectors()` for flexible sparse matrix manipulation
- #1345: Adds new Quaternion constructor that accepts a real scalar and 3D vector for more intuitive quaternion creation
- #1403: Adds component-wise cubic root (`cbrt`) functionality for arrays and matrices
- #1414: Implemented `plog_complex` function for vectorized complex logarithm calculations
- #1436: Added internal implementations for count trailing zeros (`ctz`) and count leading zeros (`clz`) functions
- #1445: Added factor getter functions for Cholmod LLT and LDLT solvers to access L, Lᵀ, and D factors
- #1455: Added test support for ROCm MI300 series architectures (gfx940, gfx941, gfx942)
- #1462: Adds ability to specify a custom temporary directory for file I/O outputs
- #1493: Added `trunc` operation for truncating floating-point numbers towards zero
- #1501: Implemented SIMD complex function `pexp_complex` for `float` to enhance performance of complex number operations
- #1512: Added `signDeterminant()` method to QR and related decompositions to determine determinant sign
- #1612: Added scalar bit shifting functions `logical_shift_left`, `logical_shift_right`, and `arithmetic_shift_right` for integer types
- #1704: Added free-function `swap` for dense and sparse matrices and blocks to improve C++ algorithm compatibility
- #1714: Added `std::nextafter` implementation for bfloat16 data type
- #1715: Adds `exp2(x)` function with improved numerical accuracy using TwoProd algorithm
- #1719: Added new tests for `sizeof()` with one dynamic dimension
- #1733: Added missing AVX `predux_any` functions to enhance vectorized reduction operations
- #1758: Added test case for `pcast` function with scalar types
- #1778: Added `install-doc` CMake target for documentation installation
- #1805: Added `matrixL()` and `matrixU()` functions to fetch L and U factors from IncompleteLUT sparse matrix decomposition
- #1812: Added automated Doxygen documentation build and deployment to GitLab Pages
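
As a rough illustration of the quantity described in #1098 (the 2D cross product as the signed area of the spanned parallelogram), here is a minimal plain C++ sketch. It deliberately does not use Eigen; `cross2d` is a hypothetical helper name chosen for this example and is not the library's API.

```cpp
#include <array>

// Hypothetical helper (not part of Eigen): scalar "cross product" of two
// 2D vectors, equal to the signed area of the parallelogram they span.
// Positive when b is counter-clockwise from a, negative when clockwise,
// zero when the vectors are parallel.
inline double cross2d(const std::array<double, 2>& a,
                      const std::array<double, 2>& b) {
    return a[0] * b[1] - a[1] * b[0];
}
```

Swapping the arguments flips the sign, which is why the result is described as a signed area rather than a plain area.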
- #636: Removed stray references to deprecated DynamicSparseMatrix class
- #740: Removed redundant `nonZeros()` method from `DenseBase` class, which simply called `size()`
- #752: Deprecated unused macro EIGEN_GPU_TEST_C99_MATH to reduce code clutter
- #768: Removed custom Find*.cmake scripts for BLAS, LAPACK, GLEW, and GSL, now using CMake's built-in modules
- #793: Removed unused `EIGEN_HAS_STATIC_ARRAY_TEMPLATE` macro to clean up the codebase
- #855: Removed unused macros related to `prsqrt` implementation, improving code clarity and maintainability
- #897: Removed obsolete gcc 4.3 copy_bool workaround in testsuite
- #1080: Removed an unused typedef to improve code clarity and maintainability
- #1092: Removed references to M_PI_2 and M_PI_4 constants from Eigen codebase
- #1200: Remove custom implementations of `equal_to` and `not_equal_to` no longer needed in C++14
- #1306: Removed last remaining instances of unused `HasHalfPacket` enum
- #1474: Removes the Skyline module due to long-standing build issues and lack of tests
- #1475: Removed `MoreVectorization` feature, relocating `pasin` implementation to `GenericPacketMath` to reduce code complexity and potential ODR violations
- #1477: Removed obsolete relicense script to streamline codebase
- #739: Disabled tests for GCC-4.8 to facilitate transition to C++14
- #606: Removal of Sparse Dynamic Matrix from library API
- #704: Removed problematic `take<n, numeric_list<T>>` implementation to resolve g++-11 compiler crash
- #1423: Adds static assertions to Tensor constructors to validate tensor dimension compatibility at compile-time
- #327: Reimplemented Tensor stream output with new predefined formats and improved IO functionality
- #534: Introduces preliminary HIP bfloat16 GPU support for AMD GPUs
- #577: Introduces IDR(s)STAB(l) method, a new iterative solver for sparse matrix problems combining features of IDR(s) and BiCGSTAB(l)
- #612: Adds support for EIGEN_TENSOR_PLUGIN, EIGEN_TENSORBASE_PLUGIN, and EIGEN_READONLY_TENSORBASE_PLUGIN in tensor classes
- #622: Renamed existing Tuple class to Pair and introduced a new Tuple class for improved device compatibility
- #623: Introduces device-compatible Tuple implementation for GPU testing, addressing compatibility issues with std::tuple
- #625: Introduced new GPU test utilities with flexible kernel execution functions for CPU and GPU environments
- #676: Improved accuracy of full tensor reduction for half and bfloat16 types using tree summation algorithm
- #681: Prevents integer overflows in EigenMetaKernel indexing for CUDA tensor operations
- #1125: Adds synchronize method to all device types, improving device operation consistency and flexibility
- #1265: Vectorize tensor.isnan() using typed predicates with performance optimizations for AVX512
- #1287: Fixed potential crash in tensor contraction with empty tensors by removing restrictive assert
- #1627: Added `.roll()` function for circular shifts in Tensor module, enabling NumPy/TensorFlow-like tensor rotation capabilities
- #1828: Enhances TensorRef implementation with improved type handling and immutability enforcement
- #1848: Cleaned up and improved TensorDeviceThreadPool implementation with method removals, enhanced C++20 compatibility, and simplified type erasure
- #653: Disabled specific HIP subtests that fail due to non-functional device side malloc/free
- #671: Fixed GPU special function tests by correcting checks and updating verification methods
- #679: Disabled Tree reduction for GPU to resolve memory errors and improve GPU operation stability
- #695: Fix compilation compatibility issue with older Boost versions in boostmultiprec test
- #705: Fixes TensorReduction test warnings and improves sum accuracy error bound calculation
- #715: Fixed failing test for tensor reduction by improving error bound comparisons
- #723: Fixed off-by-one error in tensor broadcasting affecting packet size handling
- #730: Fixed stride computation for indexed views with non-Eigen index types to prevent potential signed integer overflow
- #755: Fixed leftover else branch in unsupported code
- #770: Fixed customIndices2Array function to correctly handle the first index in tensor module
- #853: Resolved ODR failures in TensorRandom component to improve code stability and reliability
- #894: Fixed tensor executor test and added support for tensor packets of size 1
- #898: Fixed zeta function edge case for large inputs, preventing NaN and overflow issues
- #902: Temporarily disabled aarch64 CI due to unavailable Windows on Arm machines
- #1001: Fixed build compatibility for f16/bf16 Bessel function specializations on AVX512 for older compilers
- #1123: Fix reshaping strides handling for inputs with non-zero inner stride in Eigen's Tensor module
- #1159: Re-added missing header to restore GPU test functionality
- #1227: Fixed null placeholder accessor issue in Reduction SYCL test to prevent segmentation faults
- #1237: Fixed GPU conv3d out-of-resources failure by adjusting 32-bit integer variable handling in kernel
- #1243: Fixed tensor comparison test in unsupported module
- #1355: Disable FP16 arithmetic for arm32 to prevent compatibility issues with Clang compiler
- #1410: Fix integer overflow in `div_ceil` function preventing `cxx11_tensor_gpu_1` test from passing
- #1435: Protect kernel launch syntax from unintended clang-format modifications that cause syntax errors
- #1453: Fixed memory management issues in `TensorForcedEval` by using `shared_ptr` to prevent double-free and invalid memory access errors
- #1516: Fixed GPU build compatibility for `ptanh_float` function
- #1575: Fix `long double` random number generation fallback mechanism
- #1596: Resolved unused variable warnings in TensorIO component
- #1597: Fix enum comparison warnings in Autodiff module
- #1599: Fixed PPC runner cross-compilation attempt by preventing non-PPC target compilations
- #1678: Fixed `Wmaybe-uninitialized` warning in TensorVolumePatchOp by introducing `unreachable()` function
- #1698: Fixed implicit conversion issues in TensorChipping module
- #1721: Fixes compilation issue with `EIGEN_ALIGNED_ALLOCA` for nvc++ compiler by replacing unsupported `__builtin_alloca_with_align`
- #1836: Fixed compiler warning by adding explicit copy constructor to TensorRef class
- #1840: Fixed boolean scatter and random generation issues in tensor operations, improving reliability and test coverage
- #1847: Removed extra semicolon in `DeviceWrapper.h` to fix compilation warnings
- #543: Improved PEP8 compliance and formatting in GDB pretty printer for better code readability
- #616: Disabled CUDA Eigen::half host-side vectorization for compatibility with pre-CUDA 10.0 versions
- #619: Improved documentation for unsupported sparse iterative solvers
- #645: Introduced default constructor for eigen_packet_wrapper to simplify memory operations
- #669: Optimized tensor_contract_gpu test by reducing contractions to improve test performance on Windows
- #678: Reorganized CUDA/Complex.h to GPU/Complex.h and removed deprecated TensorReductionCuda.h header
- #724: Improved TensorIO compatibility with TensorMap containing const elements
- #896: Removed ComputeCpp-specific code from SYCL Vptr, improving compatibility and performance
- #942: Fixed navbar scroll behavior with table of contents by overriding Doxygen JavaScript
- #982: Resolved ambiguities in Tensor comparison operators for C++20 compatibility
- #1005: Re-enabled unit tests for device side malloc in ROCm 5.2
- #1119: Added brackets around unsigned type names to improve code readability and consistency
- #1341: Replaced `CudaStreamDevice` with `GpuStreamDevice` in tensor GPU benchmarks for improved accuracy
- #1406: Replaced deprecated `divup` with `div_ceil` in TensorReduction to reduce warnings
- #1441: Improved clang-format CI configuration to operate in non-interactive mode and ensure proper installation
- #1466: Refined assertions for chipping operations in Tensor module, removing dimension checks and improving efficiency
- #1479: Corrected markdown formatting in Eigen::Tensor README.md for improved documentation readability
- #1509: Renamed `generic_fast_tanh_float` to `ptanh_float` for improved code clarity and maintainability
- #1645: Explicitly capture `this` in lambda expressions in Tensor module to prevent compiler warnings and improve code clarity
- #1653: Corrected numerous typographical errors across Eigen's documentation and codebase to improve readability
- #1680: Enhances TensorChipping by detecting "effectively inner/outer" chipping with stride optimization
- #1767: Update ROCm Docker image to Ubuntu 22.04 for improved stability and reliability
- #1768: Update ROCm Docker image to Ubuntu 24.04 to address Ninja crashing issue
- #1770: Experimental Alpine Docker base image for CI to potentially improve build efficiency
- #1771: Updated deployment job to enhance efficiency and workflow reliability
- #1772: Update git clone strategy to improve branch setup and repository management
- #1849: Formatted `TensorDeviceThreadPool.h` and improved code using `if constexpr` for C++20
- #607: Added flowchart to help users select sparse iterative solvers in unsupported module
- #624: Introduced `Serializer<T>` class for binary serialization, enhancing GPU testing data transfer capabilities
- #798: Adds a Non-Negative Least Squares (NNLS) solver to Eigen's unsupported modules using an active-set algorithm
- #973: Added `.arg()` method to Tensor class for retrieving indices of max/min values along specified dimensions
- #637: Removes obsolete DynamicSparseMatrix references and typographical errors in unsupported directory