Spaces:
Runtime error
Runtime error
#!/usr/bin/env python | |
# coding: utf-8 | |
import transformers | |
from transformers import pipeline, AutoModelForMaskedLM, AutoTokenizer | |
import gradio as gr | |
import torch | |
# List of xlmr(ish) models | |
name_list = [ | |
'AshtonIsNotHere/xlm-roberta-long-base-4096', | |
'markussagen/xlm-roberta-longformer-base-4096' | |
] | |
# List of interfaces to run in parallel | |
interfaces = [] | |
# Add models from list | |
for model_name in name_list: | |
model = AutoModelForMaskedLM.from_pretrained(model_name, max_length=4096) | |
tokenizer = AutoTokenizer.from_pretrained('xlm-roberta-base', max_length=4096, padding="max_length",truncation=True,) | |
p = pipeline("fill-mask", model=model, tokenizer=tokenizer) | |
interfaces.append(gr.Interface.from_pipeline(p, outputs=gr.outputs.Label(label=model_name))) | |
#Manually add xlmr base | |
xlmr_model = AutoModelForMaskedLM.from_pretrained('xlm-roberta-base', max_length=512) | |
xlmr_tokenizer = AutoTokenizer.from_pretrained('xlm-roberta-base', max_length=512, truncation=True,) | |
xlmr_p = pipeline("fill-mask", model=model, tokenizer=tokenizer) | |
def xlmr_base_fn(text): | |
# Find our masked token | |
tokens = xlmr_tokenizer.tokenize(text) | |
mask_token_idx = [i for i, x in enumerate(tokens) if xlmr_tokenizer.mask_token in x][0] | |
max_len = tokenizer.model_max_length | |
max_len = max_len-2 if max_len % 512 == 0 and max_len < 4096 else 510 | |
# Smart truncation for long sequences | |
if not len(tokens) < max_len: | |
# Find left and right bounds for truncated sequences | |
lbound = max(0, mask_token_idx-(max_len//2)) | |
rbound = min(len(tokens), mask_token_idx+(max_len//2)) | |
# If we hit an edge, expand sequence in the other direction | |
if lbound == 0 and rbound != len(tokens)-1: | |
rbound = min(len(tokens), max_len) | |
elif rbound == len(tokens) and lbound != 0: | |
lbound = max(0, len(tokens)-max_len) | |
# Apply truncation and rejoin tokens to form new text | |
truncated_text = ''.join(tokens[lbound:rbound]) | |
# Handle lowbar from xlmr tokenizer | |
truncated_text = ''.join([x if ord(x) != 9601 else ' ' for x in truncated_text]) | |
else: | |
truncated_text = text | |
preds = xlmr_p(truncated_text) | |
pred_dict = {} | |
for pred in preds: | |
pred_dict[pred['token_str']] = pred['score'] | |
return pred_dict | |
interfaces.append(gr.Interface(fn=xlmr_base_fn, inputs=gr.inputs.Textbox(lines=5, | |
placeholder="Choose an example below, or add your own text with a single masked word, using <mask>."), | |
outputs=gr.outputs.Label(label='xlm-roberta-base'))) | |
# Manually add longformer | |
p = pipeline("fill-mask", model='allenai/longformer-base-4096') | |
interfaces.append(gr.Interface.from_pipeline(p, outputs=gr.outputs.Label(label='allenai/longformer-base-4096'))) | |
gr.mix.Parallel(*interfaces, | |
title="Comparison of XLMR Longformer Models", | |
inputs=gr.inputs.Textbox(lines=5, placeholder="Choose an example below, or add your own text with a single masked word, using <mask>."), | |
description="Compares performance of four models: AshtonIsNotHere's xlm-r longformer, markussagen's xlm-r longformer-base, xlm-r base, and Longformer-base. \ | |
Notice that with the small sequences, Maskussagen XLM-R model and AshtonIsNotHere's XLM-R model perform identically. Note that, however with large \ | |
sequence length examples, Markussagen's model fails to return meaningful predictions. Disclaimer: xlm-r base truncates sequences longer than 512 tokens.", | |
examples=["They analyzed the <mask>, and Payne’s own, and found structure and repetition in the sounds, documenting a sonic hierarchy: units, phrases, and themes, which combined into what they called song.", | |
"In 1971, in the journal Science, two scientists, Roger S. Payne and Scott McVay, published a paper titled “Songs of Humpback Whales.” They began by noting how “during the quiet age of sail, under conditions of exceptional calm and proximity, whalers were occasionally able to hear the sounds of whales transmitted faintly through a wooden hull.” In the modern era, we could listen in new ways: Payne and McVay worked with underwater recordings of humpback-whale vocalizations from a naval researcher who, as the story goes, was listening for Soviet submarines off Bermuda. They analyzed the <mask>, and Payne’s own, and found structure and repetition in the sounds, documenting a sonic hierarchy: units, phrases, and themes, which combined into what they called song. They chose the term advisedly, drawing, they said, on a 1963 book titled “Acoustic Behavior of Animals,” which identified a song as “a series of notes, generally of more than one type, uttered in succession and so related as to form a recognizable sequence or pattern in time.” And there was an intuitive sense in which the whales’ vocalizations sounded songlike. The previous year, Payne had published an album of whale recordings called “Songs of the Humpback Whale”; it sold more than a hundred thousand copies, and became a soundtrack for the conservation movement. Artists, including Kate Bush, Judy Collins, and the cast of “The Partridge Family,” integrated whalesong into their work; in 1970, the composer Alan Hovhaness combined whale and orchestra for a piece called “And God Created Great Whales.” In 2014, a group of ambient composers and artists released a compilation album called “POD TUNE.” Whales’ otherworldly emissions are now literally otherworldly: in 1977, NASA included whalesong recordings on records it attached to its Voyager spacecraft. Sara Niksic, a biologist and musician from Croatia, is a recent participant in the genre. In 2019, she self-released an album of electronic music titled “Canticum Megapterae - Song of the Humpback Whale.” (Humpback whales belong to the genus Megaptera.) The album contains a track she produced, alongside songs by seven other artists, and combines psychedelic trance and ambient tones—the building blocks of a genre called psybient—with whalesong. Niksic’s record evokes nineteen-nineties classics such as “The Orb’s Adventures Beyond the Ultraworld”; its synthesized clicks, sweeps, and throbs would sound good in the chill-out room at a rave. But the whales add another dimension. Integrated into the tracks, the vocalizations sound at times soothing or playful, and occasionally experimental—sound for sound’s sake. Listening, you wonder about the minds behind them. Earlier this year, Niksic released “Canticum Megapterae II - The Evolution,” a remix album on which a new group of electronic musicians interprets the track she made for the first volume. The new album, she told me, connects to her own research, which focusses on how whale songs shift from year to year. “Basically, whales remix each other’s songs,” she said. “So I thought this concept of remixes in our music would be perfect to communicate this research about the evolution of whalesong.” Niksic was born in Split, Croatia, on the country’s coast, across the Adriatic Sea from Italy. She could see the water from her window, and learned to swim before she could walk. “I was always curious about the ocean and all the creatures living down there,” she told me. “The more I learned about animal behavior, the more I got interested in marine mammals, because there is social learning, vocal communication, and culture.” She earned a bachelor’s degree in biology and a master’s degree in marine biology at the University of Zagreb, and went on to work with groups that study whales and dolphins in Australia, New Zealand, and elsewhere; eventually she returned to Split to work at the Mediterranean Institute for Life Sciences, as part of a team called ARTScience, finding ways to creatively communicate the institute’s research. Humpback whales seem to produce sound largely with their vocal folds. Songs typically range in length from ten minutes to half an hour. All humpbacks make vocalizations, but only males sing; the songs are most commonly thought to act as mating displays, possibly like the bowers constructed by male bowerbirds or the dances performed by male peacock jumping spiders. Maybe, among humpbacks, “the best singer gets the ladies,” Niksic told me. Songs evolve over time, and differ across populations. This slow evolution can be occasionally interrupted by a kind of revolution, in which one population completely adopts the songs of another in a period of just a couple of years or less. “It’s like a new hit song,” Niksic said—a wide and rapid spread of creative content that’s “unparalleled in the animal kingdom, excepting humans.” She went on, “There’s so many similarities between their culture and ours.” Niksic started working at music festivals after graduate school. When she wasn’t in the field, she was bartending and building stages. She grew curious about producing her own electronic music. As a kid, she’d studied piano and music theory, but she didn’t know how to use software and synthesizers. After spending some time in 2016 helping to map the Great Pacific Garbage Patch, she took courses on electronic-music production. “Most of the time, I was dealing with sound, whether through bioacoustics or music festivals,” she recalled. “So then I thought, I want to try to combine these two things.” At first, Niksic planned to produce the entire album herself. This proved too ambitious a goal, so she enlisted musicians she’d met on the festival circuit, sending them a high-quality, twenty-minute whalesong recording that she’d analyzed for her master’s thesis. (Her adviser had gathered the recording in the Caribbean.) When Niksic put “Canticum Megapterae” online, under the stage name Inner Child, it quickly earned recognition from both music and science communities. Readers of the Web site psybient.org—a “daily source of chillout, psychill, psybient, ambient, psydub, dub, psystep, downtempo, world, ethnic, idm, meditative and other mind expanding music and events”—voted it compilation of the year. She won an Innovation Award from the University of St. Andrews, in Scotland, spoke at the World Marine Mammal Science Conference, in Barcelona, and appeared at the Boom Festival, in Portugal. Her own track, “Theme 7,” built a downtempo pattern around a long excerpt from the whale recording. Weaving around the snares, kicks, and low, grinding bass line, the whale sounds mournful, almost plaintive, and never strays far from the center of attention. I asked Niksic if she thinks about what a whale might be thinking when she listens to or composes with whalesong. “That’s a tricky one,” she said. “Who knows what the whale might be thinking? I’m focussing on sound. Their songs are really so musical. And the frequency range they use is crazy. And the richness of the sounds—it’s so intense. And it’s immersive—when I listen to it, I kind of transport into the ocean.” For the new remix album, Niksic sent “Theme 7” to different artists. One was particularly determined to accurately represent the whale songs. “He didn’t want any whales to think, What the hell is? What the hell did he do with our song?” Niksic said. Perhaps making an electronic whalesong album would be a kind of interspecies cultural appropriation. She was thrilled when Electrypnose, one of her favorite musicians, remixed her track; when she played the remix for the first time, it was “just the most magical night ever,” Niksic said. She was lying on her terrace by the sea, listening to the song, when dolphins swam near. “I’m not kidding you—I think they heard it,” she said. “They were hanging there for the entire night. I didn’t go to sleep. There was a full moon. I was staring at the sky, listening to dolphins breathing, and to this remix, and whales. So even, like, dolphins loved it, not only humans.” Making the albums has increased Niksic’s own curiosity about whalesong. “I started thinking of more and more questions,” she told me. “I probably wouldn’t think of all of them if I were only doing research.” Are there more innovative or creative whales, just as there are more innovative or creative humans? Are some whales eager to introduce changes into the songs they learn, whereas others happily stick with the originals? (“In our own culture, some artists are pioneers of new musical genres, and then others follow them,” she noted.) Do whales collaborate creatively? Does age play a role in innovation? Whale songs have become a familiar part of our own culture. But there’s still much that’s mysterious about them, including what drives change and imitation, and how various features influence potential mates and competitors. “There’s a whole other world below the waves that we don’t know anything about,” Niksic said. “There are other cultures that are much more ancient than our human culture. Whales were here long before humans, and they were singing long before we came. I think they are way more developed than us in some ways.” The music on her albums teaches us, among other things, just how much we have to learn.", | |
"La zona metropolitana de la <mask> está situada a unos 2.400 metros sobre el nivel del mar, en una cuenca rodeada de montañas y de un cinturón industrial altamente tóxico.", | |
"La contaminación acaba de forma prematura con la vida de 8.000 a 14.000 personas cada año en Ciudad de México. La capital del país vive sumergida en un aire que es nocivo para la salud incluso cuando los índices oficiales consideran que es aceptable. El altísimo nivel de concentración de ozono y de partículas finas expone a los citadinos a sufrir más enfermedades respiratorias y cardiovasculares, diabetes y cáncer. Hace solo una semana que la advertencia volvió a saltar en el Valle de México: era peligroso salir a la calle a respirar el aire del exterior. La zona metropolitana de la <mask> está situada a unos 2.400 metros sobre el nivel del mar, en una cuenca rodeada de montañas y de un cinturón industrial altamente tóxico. Se ha convertido en una caldera de contaminantes cada vez más difíciles de dispersar. En lo que va de 2022, se han declarado seis contingencias ambientales, la última a mitad de noviembre. Esta es una época menos usual para estos fenómenos que la llamada temporada seca caliente, antes de las lluvias de verano, pero no se consideran extraños. Según el registro histórico de contingencias, cada año sucede al menos una en estos meses fríos. “Se trata de un fenómeno de inversión térmica. Se da cuando empieza a hacer más frío, pero hay una capa superior de aire más caliente que crea una cápsula que impide que la contaminación se vaya al exterior”, explica la experta en calidad del aire Andrea Bizberg. Los sistemas de alta presión y las altas temperaturas completaron la envoltura del 12 de noviembre. La alarma de la contingencia suena cuando la concentración de ozono supera las 150 ppb (partes por billón), una cifra que sobrepasa con creces el máximo que permite la norma mexicana de 90 ppb y que triplica los 51 que recomienda la Organización Mundial de la Salud (OMS), es decir, la emergencia se despierta en la capital cuando la situación es extrema.El estallido da inicio al programa Hoy no circula —que prohíbe el paso de ciertos vehículos por la ciudad— como parte de la Fase I de la contingencia; en caso de que la concentración esté por encima de los 200 puntos se pasa a la Fase II, en la que también se suspenden las clases escolares y los eventos al aire libre.El ozono es un antioxidante muy potente que además de dolor de cabeza e irritación de ojos y garganta reduce la capacidad respiratoria, provoca inflamación y daña las paredes celulares de los pulmones. También impacta en la esperanza de vida. El máximo que se ha alcanzado este año en Ciudad de México es de 172 ppb y, hasta septiembre, 175 días de 2022 excedían el límite que marca la norma mexicana (NOM-020-SSA1-2021), actualizada en 2021 para acercarse un poquito más a los parámetros de la OMS. Bizberg, que es asesora técnica para Latinoamérica en Calidad del Aire en Cities For Climate, apunta que ante esa situación las medidas que se están aplicando son más paliativas que preventivas: “Impedimos circular a algunos coches cuando ya estamos inundados por la contaminación, pero necesitamos políticas que reduzcan las emisiones antes de que el aire se vuelva irrespirable”. La contingencia de noviembre acabó cómo suelen terminar este tipo de emergencias: los vientos y las lluvias se encargaron de disipar la contaminación. Por esa razón, Bizberg considera que ProAire, el plan anual de gestión atmosférica que engloba las políticas de la ciudad para reducir la contaminación, “no es suficientemente ambicioso”: “No hacemos lo suficiente y lo que nos salva son las condiciones meteorológicas favorables que tenemos de vez en cuando”. El ozono (O₃) se considera un contaminante criterio, es decir, que cuando está presente es porque también hay otros. Así, Ciudad de México tiene un fuerte problema de concentración de las llamadas partículas finas, que son las partículas en suspensión de menos de 10 micras de diámetro (PM₁₀) y de menos de 2,5 micras (PM₂,₅). La masa de estas últimas es minúscula, casi insignificante, su riesgo aparece cuando se acumulan debido a que entran por las vías respiratorias y se intercambian en el torrente sanguíneo. Una investigación de la Universidad de Montana (EE UU), en colaboración con la UNAM, encontró una asociación entre la concentración de partículas ultrafinas con la aparición del alzhéimer a temprana edad en Ciudad de México. Los resultados del estudio concluyeron que, en comparación con los niños que viven con aire limpio, los de la capital del país “exhiben inflamación sistémica, cerebral e intratecal, déficits de memoria de atención y corto plazo, y otras condiciones que indican que esta parte del cerebro es blanco de la contaminación”. Esta inflamación cerebral se vincula con deficiencias cognitivas como la memoria reciente y el desarrollo de marcadores del alzhéimer. El director de economía sectorial del Instituto Nacional de Ecología y Cambio Climático (INECC), Abraham Ortínez, reconoce que todo lo que no se hace en la parte preventiva para reducir las exposiciones de la población a los contaminantes se revierte en un costo mucho mayor para el sector salud. Ortínez apunta a que desde el Instituto —que pertenece al Gobierno de Ciudad de México— se están tratando de trabajar de forma más cercana a la Comisión Ambiental de la Megalópolis (CAMe) para armonizar los índices de calidad del aire y el protocolo de contingencia y ser más claros de “en qué momento hay riesgo”. “Hay que reducir emisiones. Esta ciudad está generando muchos gases de efecto invernadero, seguimos en la línea del auto particular, hay un uso excesivo de la motorización y, por otro lado, falta más transporte público, porque hay una saturación de las líneas. Debemos conjuntar esfuerzos”, apunta Ortínez. En la actualización de septiembre de 2021 de sus Guías de Calidad del Aire, 16 años después de la última revisión, la OMS redujo todavía más el límite de concentración de estas partículas. Sobre las PM₁₀ pasó de considerar aceptable un promedio al año de 20 microgramos por metro cúbico a solo 15. En México el umbral está hasta 36, es decir más del doble, pero la realidad es que la media en 2021 fue de 55 microgramos y en 2022, hasta septiembre, superaba ya los 42. El exceso se repite con las PM₂,₅, la OMS considera buena la calidad del aire por debajo de cinco microgramos por metro cúbico y México cuadruplicó ese nivel: 20 microgramos tanto en 2021 como en lo que llevamos de año. De hecho, ningún año desde 2004, la concentración de partículas ultrafinas ha estado por debajo de 20. Aunque la situación es alarmante en Ciudad de México, prácticamente solo el 1% de la ciudades consigue estar alineada con el nivel que marca la OMS y en América Latina y el Caribe, nueve de cada 10 personas viven en ciudades que no cumplen ni siquiera los niveles de 2005. “Esas directrices de calidad del aire se ajustaron para mandar una señal de que ningún nivel de contaminación atmosférica, sobre todo de partículas finas, es inofensiva para la salud, todo tiene un impacto y de ahí la necesidad de reducir al máximo ese riesgo”, contextualiza Bizberg. La OMS calcula que cada año la exposición a la contaminación del aire causa siete millones de muertes prematuras en el mundo, 320.000 en la región de Latinoamérica, 48.000 en México y entre 8.000 y 14.000 en la capital, según el índice Global Burden of Disease. Es el noveno factor de muerte prematura en México, además de la pérdida de otros tantos años de vida saludable. Para el organismo internacional la contaminación atmosférica se ha convertido en “la amenaza medioambiental más peligrosa para la salud humana”." | |
]).launch() | |