SLoPe: Double-Pruned Sparse Plus Lazy Low-Rank Adapter Pretraining of LLMs • arXiv:2405.16325
VB-LoRA: Extreme Parameter Efficient Fine-Tuning with Vector Banks • arXiv:2405.15179
ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections • arXiv:2405.20271
SLTrain: A Sparse Plus Low-Rank Approach for Parameter and Memory Efficient Pretraining • arXiv:2406.02214
SVFT: Parameter-Efficient Fine-Tuning with Singular Vectors • arXiv:2405.19597
LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters • arXiv:2405.17604
Parameter-Efficient Fine-Tuning with Discrete Fourier Transform • arXiv:2405.03003 • Published May 5, 2024
NOLA: Networks as Linear Combination of Low Rank Random Basis • arXiv:2310.02556 • Published Oct 4, 2023
MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model • arXiv:2405.20222
CamViG: Camera Aware Image-to-Video Generation with Multimodal Transformers • arXiv:2405.13195
CAMELoT: Towards Large Language Models with Training-Free Consolidated Associative Memory • arXiv:2402.13449 • Published Feb 21, 2024
Self-Selected Attention Span for Accelerating Large Language Model Inference • arXiv:2404.09336 • Published Apr 14, 2024
Tackling the Unlimited Staleness in Federated Learning with Intertwined Data and Device Heterogeneities • arXiv:2309.13536 • Published Sep 24, 2023
Frustratingly Simple Memory Efficiency for Pre-trained Language Models via Dynamic Embedding Pruning • arXiv:2309.08708 • Published Sep 15, 2023
Decoding at the Speed of Thought: Harnessing Parallel Decoding of Lexical Units for LLMs • arXiv:2405.15208
Let's Fuse Step by Step: A Generative Fusion Decoding Algorithm with LLMs for Multi-modal Text Recognition • arXiv:2405.14259
Lossless Acceleration of Large Language Model via Adaptive N-gram Parallel Decoding • arXiv:2404.08698 • Published Apr 10, 2024
MiniCache: KV Cache Compression in Depth Dimension for Large Language Models • arXiv:2405.14366
SqueezeAttention: 2D Management of KV-Cache in LLM Inference via Layer-wise Optimal Budget • arXiv:2404.04793 • Published Apr 7, 2024
PyramidInfer: Pyramid KV Cache Compression for High-throughput LLM Inference • arXiv:2405.12532
Unlimiformer: Long-Range Transformers with Unlimited Length Input • arXiv:2305.01625 • Published May 2, 2023
Evaluating Large Language Models on Time Series Feature Understanding: A Comprehensive Taxonomy and Benchmark • arXiv:2404.16563 • Published Apr 25, 2024
4-bit Shampoo for Memory-Efficient Network Training • arXiv:2405.18144
LLaMA-NAS: Efficient Neural Architecture Search for Large Language Models • arXiv:2405.18377
Language Models Trained to do Arithmetic Predict Human Risky and Intertemporal Choice • arXiv:2405.19313
Non-parametric, Nearest-neighbor-assisted Fine-tuning for Neural Machine Translation • arXiv:2305.13648 • Published May 23, 2023
Jina CLIP: Your CLIP Model Is Also Your Text Retriever • arXiv:2405.20204