from typing import Dict, Union, List
from .base_pipeline import transformers_pipeline
import gradio as gr
text1 = """
I recently purchased the Sony WH-1000XM4 Wireless Noise-Canceling Headphones from Amazon and I must say, I'm thoroughly impressed. The package arrived in New York within 2 days, thanks to Amazon Prime's expedited shipping.
The headphones themselves are remarkable. The noise-canceling feature works like a charm in the bustling city environment, and the 30-hour battery life means I don't have to charge them every day. Connecting them to my Samsung Galaxy S21 was a breeze, and the sound quality is second to none.
I also appreciated the customer service from Amazon when I had a question about the warranty. They responded within an hour and provided all the information I needed.
However, the headphones did not come with a hard case, which was listed in the product description. I contacted Amazon, and they offered a 10% discount on my next purchase as an apology.
Overall, I'd give these headphones a 4.5/5 rating and highly recommend them to anyone looking for top-notch quality in both product and service."""
text2 = """
Apple Inc. is an American multinational technology company headquartered in Cupertino, California. Apple is the world's largest technology company by revenue, with US$394.3 billion in 2022 revenue. As of March 2023, Apple is the world's biggest company by market capitalization. As of June 2022, Apple is the fourth-largest personal computer vendor by unit sales and the second-largest mobile phone manufacturer in the world. It is considered one of the Big Five American information technology companies, alongside Alphabet (parent company of Google), Amazon, Meta Platforms, and Microsoft.
Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975 to develop and sell BASIC interpreters for the Altair 8800. During his career at Microsoft, Gates held the positions of chairman, chief executive officer, president and chief software architect, while also being the largest individual shareholder until May 2014.
Apple was founded as Apple Computer Company on April 1, 1976, by Steve Wozniak, Steve Jobs (1955–2011) and Ronald Wayne to develop and sell Wozniak's Apple I personal computer. It was incorporated by Jobs and Wozniak as Apple Computer, Inc. in 1977. The company's second computer, the Apple II, became a best seller and one of the first mass-produced microcomputers. Apple went public in 1980 to instant financial success. The company developed computers featuring innovative graphical user interfaces, including the 1984 original Macintosh, announced that year in a critically acclaimed advertisement called "1984". By 1985, the high cost of its products, and power struggles between executives, caused problems. Wozniak stepped back from Apple and pursued other ventures, while Jobs resigned and founded NeXT, taking some Apple employees with him.
"""
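# Example rows for gr.Examples. Each row is [prompt, text, threshold],
# one value per input component of the interface below.
open_ie_examples = [
    [
        "Summarize the given text, highlighting the most important information:",
        text1,
        0.5,
    ],
    [
        "Summarize the given text, highlighting the most important information:",
        text2,
        0.5,
    ],
]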
def process(prompt, text, threshold=0.5, label="summary"):
    """
    Runs the pipeline on the prompt plus text and keeps spans that lie in the text.

    Args:
        prompt (str): The prompt to prepend to the text
        text (str): The text to process
        threshold (float): Minimum score a span needs to be kept
        label (str): Entity label assigned to every kept span

    Returns:
        tuple: (summary, output). summary is the kept spans joined by spaces;
        output is {"text": text, "entities": [...]} with indices relative to text
    """
    # Concatenate prompt and text for the full input
    input_ = f"{prompt}\n{text}"
    results = transformers_pipeline(input_)  # Run NLP on full input
    processed_results = []
    # Offset of the text inside input_: the prompt plus the "\n" separator
    prompt_length = len(prompt) + 1
    for result in results:
        # Skip spans whose score is below the threshold
        if result['score'] < threshold:
            continue
        # Adjust indices by subtracting the offset
        start = result['start'] - prompt_length
        # If the span starts inside the prompt, skip it
        if start < 0:
            continue
        end = result['end'] - prompt_length
        # Extract span from original text using adjusted indices
        span = text[start:end]
        # Create processed result dict
        processed_result = {
            'span': span,
            'start': start,
            'end': end,
            'score': result['score'],
            'entity': label
        }
        processed_results.append(processed_result)
    output = {"text": text, "entities": processed_results}
    summary = " ".join(entity["span"] for entity in output["entities"])
    return summary, output
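

# Illustrative sketch, not part of the original module: the offset arithmetic
# needed to map a span found in the combined input f"{prompt}\n{text}" back
# onto the text alone, isolated so it can be checked without loading a model.
# `shift_span` is a hypothetical helper name introduced only for this example.
def shift_span(start, end, prompt):
    """Return (start, end) relative to the text, or None if the span starts in the prompt."""
    offset = len(prompt) + 1  # assumes a single "\n" separates prompt and text
    if start < offset:
        return None  # span begins inside the prompt or on the separator
    return start - offset, end - offset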
with gr.Blocks(title="Summarization") as summarization_interface:
    prompt = gr.Textbox(label="Prompt", placeholder="Enter your prompt here")
    input_text = gr.Textbox(label="Text input", placeholder="Enter your text here")
    threshold = gr.Slider(0, 1, value=0.3, step=0.01, label="Threshold", info="Lower the threshold to increase how many entities get predicted.")
    output = [gr.Textbox(label="Summary"), gr.HighlightedText(label="Predicted Entities")]
    submit_btn = gr.Button("Submit")
    examples = gr.Examples(
        open_ie_examples,
        fn=process,
        inputs=[prompt, input_text, threshold],
        outputs=output,
        cache_examples=True
    )
    # This assignment has no effect on its own; pass theme= to gr.Blocks(...) to apply it
    theme = gr.themes.Base()
    # Re-run process on Enter in either textbox, on slider release, and on button click
    input_text.submit(fn=process, inputs=[prompt, input_text, threshold], outputs=output)
    prompt.submit(fn=process, inputs=[prompt, input_text, threshold], outputs=output)
    threshold.release(fn=process, inputs=[prompt, input_text, threshold], outputs=output)
    submit_btn.click(fn=process, inputs=[prompt, input_text, threshold], outputs=output)
if __name__ == "__main__":
    summarization_interface.launch()