FAI-Seminar
FAI-Seminar (International Seminar on Foundational Artificial Intelligence) is an online, Chinese-language seminar series on the foundations of artificial intelligence. In each session, one speaker presents their recent work. Everyone is welcome to join!
Topic: foundations of artificial intelligence (mainly machine learning theory, plus some interesting applied work)
Time: every Friday, 10:00–11:00 AM (Beijing time)
How to join: follow the WeChat official account 【人工智能基础研究】 and send 【FAI】 to join the WeChat group
Official accounts: follow @FAI-Seminar on Bilibili for live streams and recordings, and the WeChat official account 【人工智能基础研究】
Language: Chinese
Official handbook: basic information; audience guidelines; speaker guidelines
最近新闻 / News!
Update: 2024.6.5
Total video views have surpassed 170,000. Thank you all for your support!
The fifth round of talks is in preparation!
All regular talks of the fourth round have been uploaded. Thank you all for your support!
日程安排 / Schedule
2024 R02
| Time | Speaker | Talk Title | Talk Info | Paper | Video |
| --- | --- | --- | --- | --- | --- |
| 07/19 | 张博航 (Peking University) | Beyond Weisfeiler-Lehman: A Quantitative Framework for GNN Expressiveness | Talk Info | [1], [2], [3] | Bilibili |
| 08/09 | 黎善达 (CMU) | Inference Scaling Law of Large Language Models and Second-Prize Winning Solution of AIMO | Talk Info | [1], [2] | Bilibili |
| 08/16 | 王天浩 (TTIC) | Tractable training dynamics of transformers for in-context learning | Talk Info | [1], [2] | Bilibili |
| 08/23 | 吴京风 (Berkeley) | Reimagining Gradient Descent: Large Stepsize, Oscillation, and Acceleration | Talk Info | [1] | |
| 08/30 | 马梓业 (City University of Hong Kong) | Navigating the non-convex landscape via amplifying escape directions of saddle points | Talk Info | [1], [2], [3] | |
2024 R01
| Time | Speaker | Talk Title | Talk Info | Paper | Video |
| --- | --- | --- | --- | --- | --- |
| Special talk 05/31 | 李建 (Tsinghua University) | Generalization Error and Implicit Bias of Gradient Methods in Deep Learning | Talk Info | | Bilibili |
| 03/08 | 翟润天 (CMU) | On the Generalization of Representation Learning and Big Foundation Models | Talk Info | [1, 2] | Bilibili |
| 03/15 | 罗胜杰 (Peking University) | Enabling Efficient Equivariant Operations in the Fourier Basis via Gaunt Tensor Products | Talk Info | [1] | Bilibili |
| 03/22 | 高天宇 (Princeton) | Long-Context Language Modeling with Parallel Context Encoding | Talk Info | | Bilibili |
| 03/29 | 邹荻凡 (University of Hong Kong) | Faster Sampling without Isoperimetry via Diffusion-based Monte Carlo | Talk Info | [1] | Bilibili |
| 04/05 | 陆一平 (NYU) | Simulation-Calibrated Scientific Machine Learning | Talk Info | [1] | Bilibili |
| 04/12 | 俞鼎力 (Princeton) | Tensor Programs VI: Feature Learning in Infinite-Depth Neural Networks | Talk Info | [1] | Bilibili |
| 04/19 | 吕凯风 (Princeton) | Understanding the Limitations of Neural Networks on Algorithmic Reasoning | Talk Info | [1, 2] | Bilibili |
| 04/26 | 李禹辰 (CMU) | Towards Mathematical Understanding of Modern Language Models | Talk Info | [1, 2, 3, 4] | Bilibili |
2023 R03
| Time | Speaker | Talk Title | Talk Info | Paper | Video |
| --- | --- | --- | --- | --- | --- |
| Special Talk 02/16 | 胡威 (UMich) | Hidden Structures in Neural Network Representations | Talk Info | [1, 2] | Bilibili |
| 11/10 | 陈乐偲 (Tsinghua University) | Near-Optimal Nonconvex-Strongly-Convex Bilevel Optimization with Fully First-Order Oracles | Talk Info | [1] | Bilibili |
| 11/17 | 张博航 (Peking University) | Towards Revealing the Mystery behind Chain of Thought: A Theoretical Perspective | Talk Info | [1] | Bilibili |
| 11/24 | 顾欣然 (Tsinghua University) | A Quadratic Synchronization Rule for Distributed Deep Learning | Talk Info | [1] | Bilibili |
| 12/01 | 石佳欣 (DeepMind) | MultiresConv: From Wavelet Theory to Long Context Modeling with Neural Networks | Talk Info | [1] | Bilibili |
| 12/08 | 范凤磊 (CUHK) | In Pursuit of Deciphering ReLU Networks and Beyond | Talk Info | [1] | Bilibili |
| 12/15 | NeurIPS break | | | | |
| 12/22 | 刘冰彬 (CMU) | Thinking Fast with Transformers: algorithmic reasoning with shortcuts | Talk Info | [1] (ICLR '23 oral), [2] (NeurIPS '23 spotlight) | Bilibili |
| 12/29 | 温凯越 (Tsinghua University) | Transformers are uninterpretable with myopic methods: a case study with bounded Dyck grammars | Talk Info | [1] | Bilibili |
| 01/12 | 游凯超 (Tsinghua University) | Understand, Learn, and Adopt the PyTorch compiler (torch.compile) | Talk Info | [1, 2, 3] | Bilibili |
2023 R02
| Time | Speaker | Talk Title | Paper | Video |
| --- | --- | --- | --- | --- |
| (Special) 09/15 | 李志远 (Stanford) | The Generalization Benefit of Flatness Regularization | [1], [2] | Bilibili |
| 06/23 | 张博航 (Peking University) | Understanding the Expressivity of Subgraph-based GNNs for Graph Learning | [1] | Bilibili |
| 06/30 | 罗胜杰 (Peking University) | One Transformer Can Understand Both 2D & 3D Molecular Data | [1] | Bilibili |
| 07/07 | 刘子鸣 (MIT) | Intelligence from hunger | [1], [2] | Bilibili |
| 07/14 | 马鉴昊 (UMich) | Robust Sparse Mean Estimation | [1] | Bilibili |
| 07/21 | 金及凯 (Peking University) | Minimax optimal operator learning | [1] | Bilibili |
| 07/28 | ICML break | | | |
| 08/04 | 王博涵 (USTC) | When and Why Momentum Accelerates SGD | [1] | Bilibili |
| 08/11 | 滕佳烨 (Tsinghua University) | Predictive inference with feature conformal prediction | [1] | Bilibili |
| 08/18 | 蔡天乐 (Princeton) | Large Language Models as Tool Makers | [1] | Bilibili |
2023 R01
| Time | Speaker | Talk Title | Paper | Video |
| --- | --- | --- | --- | --- |
| (Special) 05/26 | 张景昭 (Tsinghua University) | Two Phases of Scaling Laws for Nearest Neighbor Classifiers | [1] | Bilibili |
| 03/03 | 张鼎怀 (Mila) | GFlowNets: Exploration for Probabilistic Inference | [1], [2], [3], [4] | Bilibili |
| 03/10 | 顾欣然 (Tsinghua University) | Why (and When) does Local SGD Generalize Better than SGD | [1] | Bilibili |
| 03/17 | 王博涵 (USTC) | Provable Benefit of Adaptivity in ADAM | [1] | Bilibili |
| 03/24 | 温凯越 (Tsinghua University) | How Does Sharpness-Aware Minimization Minimize Sharpness? | [1] | Bilibili |
| 03/31 | 张博航 (Peking University) | Rethinking the Expressive Power of GNNs via Graph Biconnectivity | [1] (ICLR 2023 Outstanding Paper) | Bilibili |
| 04/07 | 马鉴昊 (UMich) | Escaping Saddle Points Or Not? | [1], [2] | Bilibili |
| 04/14 | 陈乐偲 (Fudan University) | On Bilevel Optimization without Lower-level Strong Convexity | [1] | Bilibili |
| 04/21 | 黄凯旋 (Princeton) | Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data | [1] | Bilibili |
| 04/28 | 戴言 (Tsinghua University) | Variance-Aware Sparse Linear Bandits | [1] | Bilibili |
组织者 / Organizers
贡献者 / Contributors
特邀嘉宾 / Invited Guests
协办单位 / Universities
合作组织 / Partners