ChiDotPhi
Mostly Theoretical Physics stuff

Jax tutorial 3: PMAP and VMAP

Jax tutorial 2: Grads

Jax tutorial 1: Arrays

Weekly AI paper - 4/7/25 - LLAMA4, Multi token attention

Weekly AI paper review - 3/28/25 - Large Memory module, LongRope2, MOBA block attention

Weekly AI paper review - 2/14/25 - S1 Test time scaling, SMOLLM2

Weekly AI paper review - 2/7/25 - RL generalizes, Titans: Learning to Memorize at test time

Weekly AI paper review - 1/23/25 - Transformers^2, Tensor Product Attention, Deepseek-R1

VSCode environment and extensions for Python Development

Weekly AI paper overview - 12/23/2024 - Deepseekv3, Memory layers

6. Final Words

5. Training the MLLM

Weekly AI paper overview - 12/23/2024 - ModernBERT, No More Adam

Weekly AI paper overview - 12/19/24 - Latent Space Reasoning, Byte Latent Transformer

5. Dataloader for MLLM

4. Writing MLLM from scratch

Torchtune: A new finetuning library for LLMs

3. Vision base - Clip-14

Weekly AI paper overview - 10/29/24 - Pixtral, Differential Transformers

Weekly AI paper overview - 10/1/24 - LLAMA3.2, Self correction using RL

Part 4: Trainer

Part 3: Training Script

Part 1: OLMO Paper

Part 2: The model definition

Weekly AI paper overview - 7/25/2024

Weekly AI paper overview - 7/7/24

Weekly AI paper overview - 6/18/24

Weekly AI paper overview - 6/9/24

2. LLM base - Phi3

Weekly AI paper overview - 5/9/24