Standalone Asynchronous RolmOCR Inference Script using vLLM and PyMuPDF.
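The script body is not reproduced here, but the sketch below shows one way such a pipeline can be wired together: PyMuPDF renders each PDF page to an image, and an async OpenAI-compatible client fans requests out to a vLLM server (started with something like `vllm serve reducto/RolmOCR`). The model id, prompt, rendering DPI, and server URL are illustrative assumptions, not taken from the original gist.

```python
"""Sketch of an asynchronous RolmOCR pipeline over a vLLM server.

Assumes a vLLM OpenAI-compatible endpoint is already running, e.g.:
    vllm serve reducto/RolmOCR --port 8000
"""
import asyncio
import base64

import fitz  # PyMuPDF
from openai import AsyncOpenAI

client = AsyncOpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")


async def ocr_page(page: "fitz.Page") -> str:
    # Render the page to PNG bytes and base64-encode them for the chat API.
    png = page.get_pixmap(dpi=150).tobytes("png")
    b64 = base64.b64encode(png).decode("ascii")
    resp = await client.chat.completions.create(
        model="reducto/RolmOCR",  # assumed model id
        messages=[{
            "role": "user",
            "content": [
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
                {"type": "text", "text": "Return the plain text of this page."},
            ],
        }],
    )
    return resp.choices[0].message.content


async def ocr_pdf(path: str) -> list[str]:
    doc = fitz.open(path)
    # One request per page, issued concurrently; vLLM batches them server-side.
    return await asyncio.gather(*(ocr_page(page) for page in doc))


if __name__ == "__main__":
    print("\n\n".join(asyncio.run(ocr_pdf("sample.pdf"))))
```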
Alternate Attention Mechanisms for Sequence Modeling (2023–2025)
Transformer-style self-attention has been central to recent advances in language modeling, but its $\mathcal{O}(L^2)$ complexity (for sequence length $L$) motivates research into more efficient alternate attention mechanisms. This report surveys state-of-the-art methods from 2023–2025 that replace or augment standard self-attention in language sequence models. We organize the methods into broad families, from linear approximations and sparsity-based variants to convolutional, state-space, and recurrent mechanisms, and for each we outline the motivation, technical formulation, empirical performance on language tasks, and efficiency characteristics.
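As a concrete instance of the linear-approximation family, the sketch below implements non-causal kernelized linear attention in the style of Katharopoulos et al. (2020): $\mathrm{softmax}(QK^\top)V$ is replaced by $\phi(Q)\,(\phi(K)^\top V)$, so the key/value summary is computed once and the cost is $\mathcal{O}(L)$ rather than $\mathcal{O}(L^2)$. The $\mathrm{elu}+1$ feature map is one common, illustrative choice, not the only option.

```python
import torch
import torch.nn.functional as F


def linear_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """Non-causal kernelized linear attention, O(L) in sequence length.

    q, k, v: (batch, length, dim). Uses phi(x) = elu(x) + 1 as a positive
    feature map so the normalizer stays strictly positive.
    """
    phi_q = F.elu(q) + 1
    phi_k = F.elu(k) + 1
    # Key/value summary (B, D, E): computed once, shared by every query.
    kv = torch.einsum("bld,ble->bde", phi_k, v)
    z = phi_k.sum(dim=1)  # (B, D) normalizer term
    num = torch.einsum("bld,bde->ble", phi_q, kv)
    den = torch.einsum("bld,bd->bl", phi_q, z).unsqueeze(-1)
    return num / (den + 1e-6)
```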
LayerNorm Scaling implementation to mitigate the Curse of Depth in LLMs.
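In the Curse-of-Depth line of work, LayerNorm Scaling multiplies the output of each Pre-LN layer normalization by the inverse square root of its (1-based) layer index, damping the variance growth that otherwise makes deep layers contribute little. A minimal PyTorch sketch, with the module name and wiring as assumptions rather than the gist's own code:

```python
import math

import torch
import torch.nn as nn


class ScaledLayerNorm(nn.Module):
    """Pre-LN layer norm whose output is scaled by 1/sqrt(layer_index).

    Drop-in replacement for the LayerNorm inside transformer block
    `layer_index` (1-based); deeper layers receive a smaller scale.
    """

    def __init__(self, dim: int, layer_index: int):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.scale = 1.0 / math.sqrt(layer_index)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.norm(x) * self.scale
```

In use, the LayerNorm preceding attention and MLP sub-blocks in block $\ell$ would be replaced with `ScaledLayerNorm(dim, layer_index=ℓ)`, leaving the rest of the architecture unchanged.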
CLI utility to quickly inspect the latest scalar values from TensorBoard logs.
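A minimal sketch of such a utility using TensorBoard's own `EventAccumulator`; the argument name and output format are illustrative assumptions, not the gist's:

```python
#!/usr/bin/env python
"""Print the latest value of every scalar tag in a TensorBoard log directory."""
import argparse

from tensorboard.backend.event_processing.event_accumulator import EventAccumulator


def main() -> None:
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument("logdir", help="directory containing TensorBoard event files")
    args = parser.parse_args()

    acc = EventAccumulator(args.logdir)
    acc.Reload()  # parse the event files on disk
    for tag in sorted(acc.Tags()["scalars"]):
        last = acc.Scalars(tag)[-1]  # events are stored in order; take the newest
        print(f"{tag:40s} step={last.step:<8d} value={last.value:.6g}")


if __name__ == "__main__":
    main()
```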