My publications are also listed on my Google Scholar page and my DBLP page.
* indicates equal contribution.
[Preprint 2025] "Attention with Trained Embeddings Provably Selects Important Tokens." Diyuan Wu*, Aleksandr Shevchenko*, Samet Oymak, Marco Mondelli. arxiv
[ICML 2025 spotlight] "Neural Collapse Beyond the Unconstrained Features Model: Landscape, Dynamics, and Generalization in the Mean-Field Regime." Diyuan Wu, Marco Mondelli. arxiv
[NeurIPS 2024] "The Iterative Optimal Brain Surgeon: Faster Sparse Recovery by Leveraging Second-Order Information." Diyuan Wu, Ionut-Vlad Modoranu, Mher Safaryan, Denis Kuznedelev, Dan Alistarh. arxiv
[TMLR 2023] "Mean-field analysis for heavy ball methods: Dropout-stability, connectivity, and global convergence." Diyuan Wu, Vyacheslav Kungurtsev, Marco Mondelli. openreview
[ISIT 2021] "On conditional Sibson's \(\alpha\)-Mutual Information." Amedeo Roberto Esposito, Diyuan Wu, Michael Gastpar. link