How do LLMs get task-specific skills? | Day 5/30 Casual AI Talks

Author: Roham Koohestani

Uploaded: 2025-09-30

Description: LinkedIn repost / rohamkoohestani

——————

How do LLMs get task-specific skills? Fine-tuning & transfer learning.

Day 5/30 – Casual AI Talks

Pre-training gives models broad “general knowledge.” Fine-tuning is the last mile: we adapt that general model to your domain, task, and style, often with surprisingly little data.

The classic route (full fine-tuning)

Traditionally, you would start from a pre-trained model, add a small task head, and update all the weights. ULMFiT’s (see resources) still-useful recipe is (1) discriminative learning rates (lower for early layers, higher for later ones), (2) a slanted triangular schedule (warm-up, then cool-down), and (3) gradual unfreezing (unlock layers top-down to avoid forgetting).
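Roughly what that recipe looks like in practice, as a minimal PyTorch/transformers sketch — the layer attribute names and hyperparameters below are assumptions for a BERT-style classifier, not the ULMFiT authors’ code:

```python
from torch.optim import AdamW
from torch.optim.lr_scheduler import OneCycleLR
from transformers import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
layers = list(model.bert.encoder.layer)   # the 12 transformer blocks
num_steps = 1000                          # total optimizer steps (placeholder)

# (1) Discriminative learning rates: earlier (lower) layers get smaller LRs.
base_lr, decay = 2e-5, 0.9
groups, lr = [{"params": model.classifier.parameters(), "lr": base_lr}], base_lr
for layer in reversed(layers):            # top block keeps the base LR
    groups.append({"params": layer.parameters(), "lr": lr})
    lr *= decay
optimizer = AdamW(groups)

# (2) Slanted-triangular-like schedule: short warm-up, longer cool-down.
scheduler = OneCycleLR(optimizer, max_lr=[g["lr"] for g in groups],
                       total_steps=num_steps, pct_start=0.1)

# (3) Gradual unfreezing: train only the head first, then unlock blocks top-down.
for p in model.parameters():
    p.requires_grad = False
for p in model.classifier.parameters():
    p.requires_grad = True
for layer in reversed(layers):
    for p in layer.parameters():
        p.requires_grad = True
    # ... run one epoch here (optimizer.step() / scheduler.step() per batch) ...
```

The intuition: the randomly initialized head and the top layers adapt quickly, while the lower layers, which carry the most general knowledge, move slowly or stay frozen at first.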

BERT popularized this simple “pre-train once, adapt many times” pattern across benchmarks. When full fine-tuning is overkill, consider PEFT (Parameter-Efficient Fine-Tuning). Three very common flavors of PEFT are:
(1) Adapters: insert tiny bottleneck layers and train only those. Near full-FT quality with a few percent extra parameters, and you can swap adapters per task or client.
(2) LoRA: keep the base weights frozen and learn low-rank “nudges” inside the linear layers. Orders of magnitude fewer trainable parameters, with a strong quality-to-cost tradeoff.
(3) Prompt/prefix/soft-prompt tuning: learn a small set of embeddings that steer the frozen model. Shines with very large backbones and minimal storage.
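For example, here is roughly what LoRA looks like with the Hugging Face peft library; the backbone, target module names, and rank below are illustrative assumptions, not a recommendation:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")     # any causal LM backbone
config = LoraConfig(
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=16,              # scaling factor applied to the update
    target_modules=["c_attn"],  # GPT-2's fused attention projection (model-specific)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()   # typically well under 1% of all parameters

# Train as usual: only the small LoRA matrices receive gradients, and the
# resulting adapter can be saved and swapped per task with model.save_pretrained(...).
```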

There’s also an entire trajectory of domain-adaptive pre-training (DAPT/TAPT). Essentially, before supervised fine-tuning, you run an extra round of self-supervised pre-training on your in-domain text or code. The model “soaks in” your distribution, making the final supervised step easier; this is especially effective when labeled data are scarce.
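In transformers terms, DAPT/TAPT is just continued masked-LM (or causal-LM) training on unlabeled in-domain text before the supervised step. A hedged sketch, where the corpus file and hyperparameters are placeholders:

```python
from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from datasets import load_dataset

tok = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")

# Unlabeled in-domain corpus (e.g. internal docs, tickets, or code comments).
ds = load_dataset("text", data_files={"train": "in_domain_corpus.txt"})["train"]
ds = ds.map(lambda batch: tok(batch["text"], truncation=True, max_length=256),
            batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="dapt-roberta",
                           num_train_epochs=1,
                           per_device_train_batch_size=8),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tok, mlm_probability=0.15),
)
trainer.train()   # afterwards, fine-tune (or PEFT) this checkpoint on the labeled task
```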

How to choose (quick playbook)
Lots of labels + compute? Full fine-tuning is simple and strong.
Many variants per customer/domain? Adapters or LoRA → one backbone, many lightweight add-ons.
Very little labeled data but plenty of unlabeled text/code? Do DAPT/TAPT, then a light fine-tune or PEFT.
Tight on storage/ops? Prompt/prefix tuning keeps artifacts tiny (see the sketch after this list).
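For that last row, a minimal prompt-tuning sketch with peft — again, the backbone, token count, and initialization text are assumptions for illustration:

```python
from transformers import AutoModelForCausalLM
from peft import PromptTuningConfig, PromptTuningInit, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")
cfg = PromptTuningConfig(
    task_type="CAUSAL_LM",
    num_virtual_tokens=20,                     # only these soft-prompt embeddings are trained
    prompt_tuning_init=PromptTuningInit.TEXT,  # initialize from real tokens
    prompt_tuning_init_text="Classify the sentiment of this review:",
    tokenizer_name_or_path="gpt2",
)
model = get_peft_model(base, cfg)
model.print_trainable_parameters()   # a few thousand parameters vs. ~124M in the backbone

# The trained artifact is just the 20 virtual-token embeddings, so storing one
# per task or customer is nearly free.
```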

While fine-tuning, keep these risks in mind:
Forgetting & overfitting: use gradual unfreezing, validation across general skills, and early stopping.
Memorization & privacy: de-identify training data, audit outputs, and avoid feeding in sensitive content.
Evaluation drift: test on in-domain AND general benchmarks; track regressions over time (see the sketch after this list).
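A plain-Python sketch of the early-stopping and regression-tracking idea; train_one_epoch, evaluate, and the loaders are hypothetical stand-ins for whatever training loop you already have:

```python
def fit_with_early_stopping(model, train_loader, val_loader, general_loader,
                            max_epochs=10, patience=2):
    # Hypothetical helpers: train_one_epoch(...) and evaluate(...) wrap your own loop.
    best_val, bad_epochs, best_state = float("inf"), 0, None
    for epoch in range(max_epochs):
        train_one_epoch(model, train_loader)            # supervised fine-tuning step
        val_loss = evaluate(model, val_loader)          # in-domain validation
        general_loss = evaluate(model, general_loader)  # general benchmark: watch for regressions
        print(f"epoch {epoch}: val={val_loss:.4f} general={general_loss:.4f}")
        if val_loss < best_val:
            best_val, bad_epochs = val_loss, 0
            best_state = {k: v.detach().clone() for k, v in model.state_dict().items()}
        else:
            bad_epochs += 1
            if bad_epochs >= patience:                  # stop before overfitting sets in
                break
    model.load_state_dict(best_state)                   # roll back to the best checkpoint
    return model
```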

Why does this even matter for AI4SE?
Pre-training on code + docs gives you API idioms and project structure “for free.”
You could use LoRA or adapters per repo/team for completion style, test scaffolds, and commit messages.
You can use DAPT on internal code to enable repo-aware suggestions, refactoring assistance, and issue triage, all without maintaining a separate full model per project.

⸻

Resources
Papers:
Howard & Ruder (2018), ULMFiT — Universal Language Model Fine-tuning for Text Classification https://arxiv.org/abs/1801.06146
Houlsby et al. (2019), Parameter-Efficient Transfer Learning for NLP (Adapters) https://arxiv.org/abs/1902.00751
Hu et al. (2021), LoRA: Low-Rank Adaptation of Large Language Models https://arxiv.org/abs/2106.09685

Blogs:
Hugging Face, PEFT (Parameter-Efficient Fine-Tuning) Overview https://huggingface.co/docs/peft/index
AdapterHub, Adapters for Transformers (not really only a blog but also has a blog :) ) https://adapterhub.ml/

Video:
Fine-tuning LLMs (by Shaw Talebi)    • Fine-tuning Large Language Models (LLMs) |...  

#AI #LLMs #TransferLearning #FineTuning #PEFT #Adapters #LoRA #MachineLearning #ArtificialIntelligence #AI4SE

The YouTube playlist    • Casual AI talks  
