radoslavralev committed
Commit b18f4c8 · verified · 1 Parent(s): 943a401

Add new SentenceTransformer model

Files changed (3)
  1. README.md +40 -26
  2. model.safetensors +1 -1
  3. tokenizer_config.json +1 -7
README.md CHANGED
@@ -13,8 +13,8 @@ tags:
  - reranking
  - generated_from_trainer
  - dataset_size:483820
- - loss:OnlineContrastiveLoss
- base_model: redis/langcache-embed-v3
+ - loss:MultipleNegativesSymmetricRankingLoss
+ base_model: Alibaba-NLP/gte-modernbert-base
  widget:
  - source_sentence: 'See Precambrian time scale # Proposed Geologic timeline for another
    set of periods 4600 -- 541 MYA .'
@@ -87,40 +87,40 @@ model-index:
  type: test
  metrics:
  - type: cosine_accuracy
- value: 0.7213201979974525
+ value: 0.7036165169861821
  name: Cosine Accuracy
  - type: cosine_accuracy_threshold
- value: 0.8022605776786804
+ value: 0.8524742126464844
  name: Cosine Accuracy Threshold
  - type: cosine_f1
- value: 0.7271285588186892
+ value: 0.7123780174627633
  name: Cosine F1
  - type: cosine_f1_threshold
- value: 0.7352645397186279
+ value: 0.8120777606964111
  name: Cosine F1 Threshold
  - type: cosine_precision
- value: 0.6076617238878748
+ value: 0.5992426552810817
  name: Cosine Precision
  - type: cosine_recall
- value: 0.9050651769087523
+ value: 0.8781750465549348
  name: Cosine Recall
  - type: cosine_ap
- value: 0.6862317912116369
+ value: 0.6475972261979004
  name: Cosine Ap
  - type: cosine_mcc
- value: 0.47517110412821156
+ value: 0.44221965028695587
  name: Cosine Mcc
  ---

  # Redis fine-tuned BiEncoder model for semantic caching on LangCache

- This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [redis/langcache-embed-v3](https://huggingface.co/redis/langcache-embed-v3) on the [LangCache Sentence Pairs (all)](https://huggingface.co/datasets/redis/langcache-sentencepairs-v1) dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for sentence pair similarity.
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [Alibaba-NLP/gte-modernbert-base](https://huggingface.co/Alibaba-NLP/gte-modernbert-base) on the [LangCache Sentence Pairs (all)](https://huggingface.co/datasets/redis/langcache-sentencepairs-v1) dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for sentence pair similarity.

  ## Model Details

  ### Model Description
  - **Model Type:** Sentence Transformer
- - **Base model:** [redis/langcache-embed-v3](https://huggingface.co/redis/langcache-embed-v3) <!-- at revision 61fe45c8c36be54130d284e132537524a1066dde -->
+ - **Base model:** [Alibaba-NLP/gte-modernbert-base](https://huggingface.co/Alibaba-NLP/gte-modernbert-base) <!-- at revision e7f32e3c00f91d699e8c43b53106206bcc72bb22 -->
  - **Maximum Sequence Length:** 100 tokens
  - **Output Dimensionality:** 768 dimensions
  - **Similarity Function:** Cosine Similarity
@@ -173,9 +173,9 @@ print(embeddings.shape)
  # Get the similarity scores for the embeddings
  similarities = model.similarity(embeddings, embeddings)
  print(similarities)
- # tensor([[1.0000, 0.9961, 0.1328],
- #         [0.9961, 1.0000, 0.1235],
- #         [0.1328, 0.1235, 0.9961]], dtype=torch.bfloat16)
+ # tensor([[0.9922, 0.9922, 0.5352],
+ #         [0.9922, 0.9961, 0.5391],
+ #         [0.5352, 0.5391, 1.0000]], dtype=torch.bfloat16)
  ```

  <!--
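For context, the hunk above edits the card's usage example (the diagonal values differ slightly from 1.0 only because the card prints bfloat16 outputs). A self-contained version of that snippet looks like the sketch below; the checkpoint id is a placeholder for this repository's actual model id, and the input sentences are illustrative:

```python
from sentence_transformers import SentenceTransformer

# Placeholder id -- substitute this repository's actual model id.
model = SentenceTransformer("redis/langcache-embed-vX")

# Illustrative inputs; any list of sentences works.
sentences = [
    "How do I reset my password?",
    "What are the steps to reset a password?",
    "What is the capital of France?",
]
embeddings = model.encode(sentences)
print(embeddings.shape)  # (3, 768) -- the card lists 768 output dimensions

# Pairwise cosine similarities, as printed in the diff above.
similarities = model.similarity(embeddings, embeddings)
print(similarities)
```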
@@ -213,14 +213,14 @@ You can finetune this model on your own dataset.

  | Metric                    | Value      |
  |:--------------------------|:-----------|
- | cosine_accuracy           | 0.7213     |
- | cosine_accuracy_threshold | 0.8023     |
- | cosine_f1                 | 0.7271     |
- | cosine_f1_threshold       | 0.7353     |
- | cosine_precision          | 0.6077     |
- | cosine_recall             | 0.9051     |
- | **cosine_ap**             | **0.6862** |
- | cosine_mcc                | 0.4752     |
+ | cosine_accuracy           | 0.7036     |
+ | cosine_accuracy_threshold | 0.8525     |
+ | cosine_f1                 | 0.7124     |
+ | cosine_f1_threshold       | 0.8121     |
+ | cosine_precision          | 0.5992     |
+ | cosine_recall             | 0.8782     |
+ | **cosine_ap**             | **0.6476** |
+ | cosine_mcc                | 0.4422     |

  <!--
  ## Bias, Risks and Limitations
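The table above is the standard output of sentence-transformers' BinaryClassificationEvaluator on labeled sentence pairs. A minimal sketch of how such numbers are typically produced; the pairs below are illustrative only, while the card's real figures come from the LangCache test split:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import BinaryClassificationEvaluator

model = SentenceTransformer("Alibaba-NLP/gte-modernbert-base")

# Illustrative labeled pairs: 1 = paraphrase/duplicate, 0 = not.
sentences1 = ["The 12F was officially homologated on August 21 , 1929 ."]
sentences2 = ["The 12F was officially homologated on 21 August 1929 ."]
labels = [1]

evaluator = BinaryClassificationEvaluator(sentences1, sentences2, labels, name="test")
results = evaluator(model)
# On recent sentence-transformers versions this returns a dict with keys like
# test_cosine_accuracy, test_cosine_f1, test_cosine_ap, test_cosine_mcc, ...
print(results)
```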
@@ -254,7 +254,14 @@ You can finetune this model on your own dataset.
  | <code>The newer Punts are still very much in existence today and race in the same fleets as the older boats .</code> | <code>The newer punts are still very much in existence today and run in the same fleets as the older boats .</code> | <code>1</code> |
  | <code>After losing his second election , he resigned as opposition leader and was replaced by Geoff Pearsall .</code> | <code>Max Bingham resigned as opposition leader after losing his second election , and was replaced by Geoff Pearsall .</code> | <code>1</code> |
  | <code>The 12F was officially homologated on August 21 , 1929 and exhibited at the Paris Salon in 1930 .</code> | <code>The 12F was officially homologated on 21 August 1929 and displayed at the 1930 Paris Salon .</code> | <code>1</code> |
- * Loss: [<code>OnlineContrastiveLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#onlinecontrastiveloss)
+ * Loss: [<code>MultipleNegativesSymmetricRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativessymmetricrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim",
+       "gather_across_devices": false
+   }
+   ```

  ### Evaluation Dataset

@@ -274,12 +281,19 @@ You can finetune this model on your own dataset.
  | <code>The newer Punts are still very much in existence today and race in the same fleets as the older boats .</code> | <code>The newer punts are still very much in existence today and run in the same fleets as the older boats .</code> | <code>1</code> |
  | <code>After losing his second election , he resigned as opposition leader and was replaced by Geoff Pearsall .</code> | <code>Max Bingham resigned as opposition leader after losing his second election , and was replaced by Geoff Pearsall .</code> | <code>1</code> |
  | <code>The 12F was officially homologated on August 21 , 1929 and exhibited at the Paris Salon in 1930 .</code> | <code>The 12F was officially homologated on 21 August 1929 and displayed at the 1930 Paris Salon .</code> | <code>1</code> |
- * Loss: [<code>OnlineContrastiveLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#onlinecontrastiveloss)
+ * Loss: [<code>MultipleNegativesSymmetricRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativessymmetricrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim",
+       "gather_across_devices": false
+   }
+   ```

  ### Training Logs
  | Epoch | Step | test_cosine_ap |
  |:-----:|:----:|:--------------:|
- | -1 | -1 | 0.6862 |
+ | -1 | -1 | 0.6476 |


  ### Framework Versions
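Both hunks above swap OnlineContrastiveLoss for MultipleNegativesSymmetricRankingLoss with the JSON parameters shown. Those parameters map onto the loss constructor in sentence-transformers; a minimal sketch, assuming a recent release (gather_across_devices is only exposed on newer versions, and false is the default the card reports):

```python
from sentence_transformers import SentenceTransformer, losses, util

model = SentenceTransformer("Alibaba-NLP/gte-modernbert-base")

# scale=20.0 and cosine similarity match the parameters listed in the card;
# gather_across_devices=False is the default reported there.
loss = losses.MultipleNegativesSymmetricRankingLoss(
    model, scale=20.0, similarity_fct=util.cos_sim
)
```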
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:627f9ffe07353c104220c7a28fb93065ada60272b4d79f87e265243e468bf837
+ oid sha256:95d02211c4cca89113f9f3e93ed91f5176bf50170faa2cb835f7bfea15bb9dd2
  size 298041696
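The model.safetensors change only rewrites the Git LFS pointer's content hash (the size is unchanged). A quick way to check a downloaded copy against the new oid:

```python
import hashlib

# Stream the file so large checkpoints need not fit in memory.
h = hashlib.sha256()
with open("model.safetensors", "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        h.update(chunk)

# Should match the new LFS oid above:
# 95d02211c4cca89113f9f3e93ed91f5176bf50170faa2cb835f7bfea15bb9dd2
print(h.hexdigest())
```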
tokenizer_config.json CHANGED
@@ -938,15 +938,9 @@
  "input_ids",
  "attention_mask"
  ],
- "model_max_length": 100,
- "pad_to_multiple_of": null,
+ "model_max_length": 1000000000000000019884624838656,
  "pad_token": "[PAD]",
- "pad_token_type_id": 0,
- "padding_side": "right",
  "sep_token": "[SEP]",
- "stride": 0,
  "tokenizer_class": "PreTrainedTokenizerFast",
- "truncation_side": "right",
- "truncation_strategy": "longest_first",
  "unk_token": "[UNK]"
  }
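The new model_max_length value, 1000000000000000019884624838656, is transformers' VERY_LARGE_INTEGER sentinel, int(1e30), meaning "no limit recorded in the tokenizer config"; the effective 100-token cap now comes from the SentenceTransformer's max_seq_length instead. A quick check (the repo id is a placeholder):

```python
from transformers import AutoTokenizer

# Placeholder id -- substitute this repository's actual model id.
tok = AutoTokenizer.from_pretrained("redis/langcache-embed-vX")

print(tok.model_max_length)               # 1000000000000000019884624838656
print(tok.model_max_length == int(1e30))  # True: the "unset" sentinel
```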