Spaces: gsarti/pecore

gsarti committed on Mar 20

Commit f75ca7e • 1 Parent(s): 449ac0a

Added FAQ

Files changed (2)

app.py +3 -0
contents.py +15 -0

app.py CHANGED

@@ -17,6 +17,7 @@ from contents import (
     subtitle,
     title,
     powered_by,
+    faq,
 )
 from gradio_highlightedtextbox import HighlightedTextbox
 from gradio_modal import Modal
@@ -493,6 +494,8 @@ with gr.Blocks(css=custom_css) as demo:
     with gr.Tab("🔧 Usage Guide"):
         gr.Markdown(how_to_use)
         gr.Markdown(example_explanation)
+    with gr.Tab("❓ FAQ"):
+        gr.Markdown(faq)
     with gr.Tab("📚 Citing PECoRe"):
         gr.Markdown("To refer to the PECoRe framework for context usage detection, cite:")
         gr.Code(pecore_citation, interactive=False, label="PECoRe (Sarti et al., 2024)")

contents.py CHANGED

@@ -45,6 +45,7 @@ example_explanation = """
 <p>Consider the following example, showing inputs and outputs of the <a href='https://huggingface.co/gsarti/cora_mgen' target='_blank'>CORA Multilingual QA</a> model provided as default in the interface, using default settings.</p>
 <img src="file/img/pecore_ui_output_example.png" width=100% />
 <p>The PECoRe CTI step identified two context-sensitive tokens in the generation (<code>287</code> and <code>,</code>), while the CCI step associated each of those with the most influential tokens in the context. It can be observed that in both cases the matching tokens stating the number of inhabitants are identified as salient (<code>,</code> and <code>287</code> for the generated <code>287</code>, while <code>235</code> is also found salient for the generated <code>,</code>). In this case, the influential context found by PECoRe is lexically equal to the generated output, but in principle better LMs might not use their inputs verbatim, hence the interest for using model internals with PECoRe.</p>
+<p>"Why wasn't <code>235</code> identified as context-sensitive, when it intuitively is?" you might ask. In this case, it's due to the generation being quite short, which makes its CTI score less salient than those of other tokens. The permissiveness of result selection is an adjustable parameter (see points below).</p>
 <h2>Usage tips</h2>
 <ol>
     <li>The <code>📂 Download output</code> button allows you to download the full JSON output produced by the Inseq CLI. It includes, among other things, the full set of CTI and CCI scores produced by PECoRe, tokenized versions of the input context and generated output and the full arguments used for the CLI call.</li>
@@ -62,6 +63,20 @@ show_code_modal = """
 <p>The snippets provided below are updated based on the current parameter configuration of the demo, and allow you to use Python and Shell code to call the Inseq CLI. <b>We recommend using the Python version for repeated evaluation, since it allows for model-preloading.</b></p>
 """
+faq = """
+<h2>❓ FAQ</h2>
+<p><b>Q: Why should I use PECoRe rather than <a href="https://docs.llamaindex.ai/en/stable/examples/query_engine/citation_query_engine.html" target="_blank">lexical/semantic matching</a>, <a href="https://arxiv.org/abs/2204.04991" target="_blank">NLI</a> or <a href="https://js.langchain.com/docs/use_cases/question_answering/citations" target="_blank">citation prompting</a> for attributing model generation?</b></p>
+<p>A: The main difference concerns <b>faithfulness</b>: all these techniques rely on different forms of surface-level matching to produce plausible citations, but do not guarantee that the model is actually using such information during generation. PECoRe instead guarantees a degree of faithfulness to the model's inner workings that varies with the CTI/CCI metrics used.</p>
+<p><b>Q: Can PECoRe be used for my task?</b></p>
+<p>A: PECoRe is designed to be task-agnostic, and can be used with any generative language model for tasks where a contextual component can clearly be identified in the input (e.g. retrieved paragraphs in RAG) or the output (e.g. reasoning steps in chain-of-thought prompting). The current Inseq implementation supports only text as a modality, but conceptually the PECoRe framework can easily be extended to attribute multimodal context components.</p>
+<p><b>Q: What are the main limitations of PECoRe?</b></p>
+<p>A: PECoRe is limited by the need for a context that can be present or absent (either in the input or in the output) for contrastive comparison, and by the choice of parameters (especially those for result selection), which can require specific tuning for different models and tasks.</p>
+<br>
+<h3>⚙️ Technical matters</h3>
+<p><b>Q: Why is it important to separate <code>{context}</code> and <code>{current}</code> tags from other tokens with whitespace in input/output templates?</b></p>
+<p>A: Taking the default CORA template <code>&lt;Q&gt;: {current} &lt;P&gt;: {context}</code> as an example, the whitespace after <code>:</code> for both tags ensures that the tag content is split into the same tokens when tokenized in isolation as when tokenized inside the full template. If it weren't present, you might end up having e.g. <code>Test</code> for the full tokenization (as no whitespace precedes it) and <code>▁Test</code> for the partial one (as initial tokens are always prefixed with <code>▁</code> in SentencePiece). This might succeed but produce unexpected results if both options are tokenized with the same number of tokens, or fail altogether if the number of tokens for the space-prefixed and the spaceless version differs. Note that this is not necessary if the template includes simply the tag itself (e.g. <code>{current}</code>).</p>
+"""
 pecore_citation = """@inproceedings{sarti-etal-2023-quantifying,
     title = "Quantifying the Plausibility of Context Reliance in Neural Machine Translation",
     author = "Sarti, Gabriele and
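
The whitespace issue described in the last FAQ answer can be reproduced with a toy tokenizer. The sketch below is neither SentencePiece nor the Inseq implementation: `toy_tokenize` and `current_tokens_match` are hypothetical helpers that only imitate the `▁` word-boundary convention, under the assumption that a piece is prefixed with `▁` exactly when it starts the text or follows whitespace.

```python
import re
from typing import List

WORD_BOUNDARY = "\u2581"  # the "▁" marker used by SentencePiece-style vocabularies


def toy_tokenize(text: str) -> List[str]:
    """Toy stand-in for a SentencePiece tokenizer (an assumption, not the real algorithm).

    Each whitespace-delimited chunk is split into alphanumeric runs and single
    punctuation characters; only the first piece of a chunk gets the "▁" prefix.
    """
    tokens: List[str] = []
    for chunk in text.split():
        pieces = re.findall(r"[A-Za-z0-9]+|[^A-Za-z0-9\s]", chunk)
        if not pieces:
            continue
        pieces[0] = WORD_BOUNDARY + pieces[0]
        tokens.extend(pieces)
    return tokens


def current_tokens_match(template: str, current: str, context: str = "ctx") -> bool:
    """Check that `current` tokenized alone appears as a contiguous run of
    tokens inside the tokenization of the filled-in template."""
    full = toy_tokenize(template.format(current=current, context=context))
    alone = toy_tokenize(current)
    n = len(alone)
    return any(full[i:i + n] == alone for i in range(len(full) - n + 1))


# Whitespace before the tag: "Test" becomes "▁Test" in both tokenizations.
spaced_ok = current_tokens_match("<Q>: {current} <P>: {context}", "Test")    # True
# No whitespace: the full text yields "Test", the isolated text yields "▁Test".
spaceless_ok = current_tokens_match("<Q>:{current} <P>:{context}", "Test")   # False
# A template consisting of the bare tag needs no extra whitespace.
bare_ok = current_tokens_match("{current}", "Test")                          # True
```

A real tokenizer may additionally merge or split pieces differently in the two cases, which is why the FAQ warns that the spaceless variant can either fail outright or silently misalign.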