T-Retriever: Hierarchical Graph Retrieval via Semantic-Structural Entropy
Author: Brahmagupta
Uploaded: 2026-02-19
Views: 0
Description:
Paper: https://arxiv.org/pdf/2601.04945v1
Notes:
Formulates graph RAG as top-down tree retrieval, fixing rigid compression quotas and semantic-topological disconnects in standard community detection algorithms.
*Semantic-Structural Entropy (S2-Entropy):* Joint objective unifying graph topology and node semantics.
Evaluates structural entropy via inter-cluster edge volume ratios. Calculates semantic density entropy using Kernel Density Estimation on node embeddings. Balances semantic coherence against topological connectivity via a scaling hyperparameter lambda.
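The two terms can be sketched as follows. The exact formulas are assumptions reconstructed from the notes (the paper should be consulted for the real definitions), and embeddings are 1-D scalars purely for brevity:

```python
import math

def structural_entropy(edges, labels):
    """Structural term (assumed form): weight each cluster by its share of
    total edge volume (sum of member degrees), scaled by its inter-cluster
    edge ratio, so well-separated clusters contribute little."""
    vol, cut, total = {}, {}, 0
    for u, v in edges:
        for a, b in ((u, v), (v, u)):      # count each endpoint's degree
            c = labels[a]
            vol[c] = vol.get(c, 0) + 1
            if labels[a] != labels[b]:
                cut[c] = cut.get(c, 0) + 1
        total += 2
    h = 0.0
    for c, volume in vol.items():
        p = volume / total                 # cluster's share of edge volume
        g = cut.get(c, 0) / volume         # inter-cluster edge ratio
        h += -g * p * math.log(p)
    return h

def semantic_entropy(embs, labels, bw=0.5):
    """Semantic term (assumed form): per-cluster mean Gaussian KDE density,
    normalized across clusters and fed through Shannon entropy."""
    clusters = {}
    for n, c in labels.items():
        clusters.setdefault(c, []).append(embs[n])
    dens = []
    for pts in clusters.values():
        z = len(pts) * bw * math.sqrt(2 * math.pi)
        dens.append(sum(math.exp(-((x - p) / bw) ** 2 / 2)
                        for x in pts for p in pts) / (len(pts) * z))
    s = sum(dens)
    return -sum((d / s) * math.log(d / s) for d in dens)

def s2_entropy(edges, embs, labels, lam=1.0):
    """Joint objective: structural + lambda * semantic, with lambda trading
    topological connectivity against semantic coherence."""
    return structural_entropy(edges, labels) + lam * semantic_entropy(embs, labels)
```

With this form, a partition with no inter-cluster edges has zero structural cost, so the objective rewards topologically clean cuts unless lambda pushes it toward semantic coherence.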
*Adaptive Compression Encoding:* Executes learning-free, top-down recursive partitioning inspired by Shannon-Fano coding.
Replaces bottom-up heuristic merging with global S2-Entropy minimization. Preserves multi-resolution cross-layer dependencies. Iteratively splits node sets into child partitions minimizing joint entropy until reaching singleton leaves or max tree depth.
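The recursion skeleton looks like the sketch below. The paper scores candidate splits by joint S2-Entropy; here plain weight imbalance stands in for that score (an assumption), but the Shannon-Fano shape — scan split points, take the best, recurse until singletons or max depth — is the same:

```python
def build_tree(items, weight, depth=0, max_depth=4):
    """Learning-free top-down bipartition in the spirit of Shannon-Fano
    coding. `weight` maps item -> float; the imbalance score is a
    stand-in for the paper's joint-entropy objective."""
    if len(items) == 1 or depth >= max_depth:
        return {"leaf": list(items)}
    total = sum(weight[i] for i in items)
    acc, best, split = 0.0, float("inf"), 1
    for k in range(1, len(items)):          # scan every split point
        acc += weight[items[k - 1]]
        score = abs(2 * acc - total)        # imbalance of the two halves
        if score < best:
            best, split = score, k
    return {"children": [build_tree(items[:split], weight, depth + 1, max_depth),
                         build_tree(items[split:], weight, depth + 1, max_depth)]}
```

Four equal-weight items split cleanly 2/2 and then into singleton leaves, giving a balanced depth-2 tree.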
*PRUNE Operation:* Fixes height violations by selectively removing internal nodes that trigger the lowest entropy increase.
*REGULATE Operation:* Corrects structural imbalances by inserting buffer nodes when parent-child depth difference exceeds one. Mathematically preserves S2-Entropy.
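Both repair operations can be sketched on a `{node: [children]}` tree. In PRUNE, `cost(n)` is a placeholder for the entropy increase the paper computes per candidate; in REGULATE, the buffer nodes are pure pass-throughs, which is why the partition — and hence the S2-Entropy — is unchanged:

```python
def height(tree, n):
    kids = tree.get(n, [])
    return 0 if not kids else 1 + max(height(tree, k) for k in kids)

def prune(tree, parent, root, max_h, cost):
    """PRUNE sketch: while the tree exceeds max_h, splice out the cheapest
    internal non-root node, promoting its children to its parent."""
    while height(tree, root) > max_h:
        n = min((m for m in tree if m != root), key=cost)
        p, kids = parent[n], tree.pop(n)
        i = tree[p].index(n)
        tree[p][i:i + 1] = kids             # promote children in place
        for k in kids:
            parent[k] = p

def regulate(tree, parent, level):
    """REGULATE sketch: insert pass-through buffer nodes wherever a child
    sits more than one level below its parent, so every edge spans
    exactly one level."""
    fresh = max(level) + 1                  # next unused node id
    for p, c in [(p, c) for p in list(tree) for c in tree[p]]:
        while level[c] - level[p] > 1:
            b, fresh = fresh, fresh + 1
            level[b] = level[p] + 1
            tree[p][tree[p].index(c)] = b   # p -> b -> ... -> c
            tree[b] = [c]
            parent[b], parent[c] = p, b
            p = b
```

On a chain 0→1→2→3 with a height cap of 2, PRUNE splices one internal node; REGULATE on an edge spanning three levels inserts two buffers.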
Generates node representations. Leaf nodes retain raw textual attributes. Non-leaf nodes synthesize LLM-generated summaries from aggregated child node/edge attributes.
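The leaf/internal branch reduces to a one-liner; `summarize` below is a placeholder for the LLM summarization call, not an API from the paper:

```python
def node_text(raw_text, child_texts, summarize):
    """Leaf nodes keep their raw textual attributes; non-leaf nodes get an
    LLM-written summary of their aggregated child node/edge attributes.
    `summarize` is a hypothetical stand-in for the LLM call."""
    if not child_texts:                     # leaf: no children
        return raw_text
    return summarize("\n".join(child_texts))
```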
Maps summaries to d-dimensional vectors using shared Language Model. Loads vectors into Approximate Nearest Neighbor (ANN) index for log-time search.
*Inference Routing:* Embeds user query and executes flat Top-K similarity search across entire encoding tree. Captures multi-resolution context by treating all tree levels uniformly.
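Because every tree level lives in one index, leaves and summaries compete directly, which is what yields multi-resolution context. The sketch below uses a deterministic hash-based "embedding" purely for illustration (the real encoder is a shared language model) and an exact cosine scan where a production system would use an ANN index such as HNSW:

```python
import hashlib
import math

def embed(text, dim=8):
    """Toy stand-in for the shared language-model encoder: a deterministic
    hash-derived vector, unit-normalized. Purely illustrative."""
    h = hashlib.sha256(text.encode()).digest()
    v = [b / 255.0 + 1e-9 for b in h[:dim]]
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def top_k(query, index, k=3):
    """Flat Top-K cosine search over the whole encoding tree at once.
    `index` maps tree-node name -> vector; real systems back this with
    an ANN index for log-time search rather than an exact scan."""
    score = lambda v: sum(a * b for a, b in zip(query, v))
    return [n for n, _ in sorted(index.items(),
                                 key=lambda kv: -score(kv[1]))[:k]]
```

Querying with a vector identical to an indexed node returns that node first, since unit-vector self-similarity is exactly 1.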
Extracts local subgraphs for retrieved tree nodes. Merges extracted networks into unified context subgraph.
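The merge step is a set union over per-node local subgraphs. The notes do not specify the extraction radius, so the 1-hop neighborhood below is an assumption:

```python
def local_edges(adj, n):
    """1-hop local subgraph around node n, as sorted undirected edge pairs.
    The 1-hop radius is an assumption; the notes leave it unspecified."""
    return {tuple(sorted((n, m))) for m in adj.get(n, [])}

def merge_context(adj, retrieved):
    """Union the retrieved nodes' local subgraphs into one deduplicated
    context subgraph."""
    if not retrieved:
        return set()
    return set().union(*(local_edges(adj, n) for n in retrieved))
```

Using sets makes overlap between neighboring retrieved nodes free to deduplicate.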
Passes merged subgraph through GNN encoder. Textualizes topology and feeds pooled graph embeddings alongside text into final generator LLM.
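The dual inputs to the generator can be sketched as below. Mean-pooling stands in for the GNN encoder's pooled readout (a real pipeline would message-pass first), and the edge-list text format is an assumption:

```python
def textualize(edges):
    """Render the merged subgraph's topology as plain text for the
    generator LLM prompt. The `u -- v` format is an assumption."""
    return "\n".join(f"{u} -- {v}" for u, v in sorted(edges))

def mean_pool(node_vecs):
    """Graph-level readout: mean-pool node vectors into one graph embedding.
    Stands in for the GNN encoder's pooled output fed to the generator
    alongside the textualized topology."""
    vecs = list(node_vecs.values())
    d = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(d)]
```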
*Catalytic Effect Heuristic:* High lambda thresholds intentionally force semantically similar but topologically distant nodes into shared clusters. Organically pulls in structural bridging nodes, preventing retrieval fragmentation.
Disclaimer: This is an AI-powered production. The scripts, insights, and voices featured in this podcast are generated entirely by Artificial Intelligence models. While we strive for technical accuracy by grounding our episodes in original research papers, listeners are encouraged to consult the primary sources for critical applications.