Code Retrieval

#2
by kevinlu1248 - opened

Is this model optimized for code retrieval? What about text-to-code retrieval?

This model was pre-trained with the standard BERT objectives (MLM+NSP), so it needs to be fine-tuned before being used for retrieval.

However, in preliminary experiments, we've found it works okay on these tasks even without fine-tuning. This notebook may be useful if you want to try it yourself: https://github.com/bigcode-project/bigcode-encoder/blob/master/embedding_sandbox.ipynb
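
For anyone who wants a rough idea of what trying it without fine-tuning could look like, here is a minimal sketch of the scoring step only. It is not taken from the notebook above. It assumes you have already run the encoder and have per-token hidden states plus an attention mask as NumPy arrays; mean pooling and cosine similarity are assumed choices here, and `mean_pool` and `rank_by_cosine` are hypothetical helper names.

```python
import numpy as np


def mean_pool(token_embeddings, attention_mask):
    """Average token vectors per sequence, ignoring padding positions.

    token_embeddings: (batch, seq_len, hidden) array of encoder outputs.
    attention_mask:   (batch, seq_len) array, 1 for real tokens, 0 for padding.
    Mean pooling is an assumption, not necessarily what the notebook uses.
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    # Guard against division by zero for fully masked rows.
    counts = np.clip(mask.sum(axis=1), 1e-9, None)
    return summed / counts


def rank_by_cosine(query_vec, candidate_vecs):
    """Return candidate indices sorted by descending cosine similarity."""
    q = query_vec / np.linalg.norm(query_vec)
    c = candidate_vecs / np.linalg.norm(candidate_vecs, axis=1, keepdims=True)
    scores = c @ q
    order = np.argsort(-scores)
    return order, scores
```

For text-to-code retrieval you would pool the text query into one vector, pool each code snippet into a row of `candidate_vecs`, and read the best matches from the front of `order`.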
