ChrisMcCormick/deepseek-tiny-mla-o-v0.1
Tags: Text Generation, Safetensors, wikitext, glue, English, deepseek_v3, transformer, attention, mla, research, output-subspace
License: mit
Branch: main
1 contributor
History: 3 commits
Latest commit: c02b2f3 (verified) by ChrisMcCormick, "Adding patching code", 10 days ago
File | Size | Notes | Last commit message | Updated
.gitattributes | 1.52 kB | Safe | initial commit | 10 days ago
README.md | 5.1 kB | | Adding patching code | 10 days ago
config.json | 1.32 kB | | Upload deepseek-tiny-mla-o-v0.1 model weights and documentation | 10 days ago
example_usage.py | 1.57 kB | | Upload deepseek-tiny-mla-o-v0.1 model weights and documentation | 10 days ago
generation_config.json | 176 Bytes | | Upload deepseek-tiny-mla-o-v0.1 model weights and documentation | 10 days ago
merges.txt | 456 kB | Safe | Upload deepseek-tiny-mla-o-v0.1 model weights and documentation | 10 days ago
model.safetensors | 67.8 MB | LFS | Upload deepseek-tiny-mla-o-v0.1 model weights and documentation | 10 days ago
special_tokens_map.json | 131 Bytes | Safe | Upload deepseek-tiny-mla-o-v0.1 model weights and documentation | 10 days ago
tokenizer.json | 3.56 MB | Safe | Upload deepseek-tiny-mla-o-v0.1 model weights and documentation | 10 days ago
tokenizer_config.json | 507 Bytes | Safe | Upload deepseek-tiny-mla-o-v0.1 model weights and documentation | 10 days ago
vocab.json | 798 kB | Safe | Upload deepseek-tiny-mla-o-v0.1 model weights and documentation | 10 days ago