Spaces: HAERAE-HUB/Open-Research-Questions

amphora committed on Jul 19

Commit 8e298e3 • 1 Parent(s): f229c82

Update app.py

Files changed (1)

app.py +134 -60

@@ -2,31 +2,126 @@ import streamlit as st
 st.set_page_config(page_title="HAERAE Open Research Questions", layout="wide")
-st.title("HAERAE Open Research Questions")
-st.write("""
-HAERAE is a non-profit research lab focused on the interpretability and evaluation of Korean language models.
-Our mission is to advance the field with insightful benchmarks and tools. Below is an overview of our projects.
-We've been doing most of our projects internally, but for those that have been unsolvable,
-we are planning to open them to get help from the open-source community.
-""")
-st.header("HAERAE-Math Challenge")
-st.write("""
-Today we are introducing our first challenge: HAERAE-Math. We've created high-quality instructions on math
-but don't have an idea on how to generate high-quality answers for them. We are looking for solutions that
-use open-source models with openly available licenses.
-We have created a total of 20,000 instructions already and are generating more. We've opened up a preview
-of 50 of them in this link: [HAERAE-Math Samples](https://huggingface.co/datasets/HAERAE-HUB/HAERAE-Math-samples)
-For those who generate answers for the 50 and share the methodology/results with us, we'll share the
-remaining instructions and credit for the resulting dataset.
-""")
-st.subheader("Example Question")
 example_question = """
 한국의 보안 전문가가 고도화된 데이터 보호 시스템을 개발하고 있습니다. 이 시스템은 3차원 기하학적 잠금 메커니즘을 사용하는데, 잠금 장치는 원뿔 모양으로 되어 있고, 밑면의 반지름은 6cm, 높이는 8cm입니다. 이 원뿔 모양의 잠금 장치에는 원통 모양의 열쇠가 딱 맞게 들어가게 설계되어 있습니다.
@@ -50,38 +145,17 @@ example_question = """
 st.code(example_question, language="markdown")
-st.header("How to Participate")
-st.write("""
-1. Access the 50 sample questions from the provided Hugging Face dataset link.
-2. Generate high-quality answers for these questions using open-source models.
-3. Document your methodology and results.
-4. Share your findings with us through [contact information or submission form].
-5. If your approach is promising, we'll provide access to the full dataset of 20,000 instructions.
-6. Collaborate with us to refine and improve the answer generation process.
-7. Receive credit as a contributor to the final HAERAE-Math dataset.
-""")
-st.header("Why Participate?")
-st.write("""
-- Contribute to advancing Korean language model research
-- Gain access to a large, high-quality dataset of math instructions
-- Collaborate with HAERAE researchers
-- Receive recognition in the field of NLP and math education
-- Potential for co-authorship on related publications
-""")
-st.header("Contact Us")
-st.write("""
-For more information or to submit your results, please contact us at:
-[Your contact information or a link to a submission form]
-""")
-st.sidebar.title("About HAERAE")
-st.sidebar.info("""
-HAERAE is a non-profit research lab dedicated to advancing the field of
-Korean language model interpretability and evaluation. Our work focuses on
-creating insightful benchmarks and tools to push the boundaries of NLP research.
-""")

 st.set_page_config(page_title="HAERAE Open Research Questions", layout="wide")
+# Language selection
+lang = st.radio("Language / 언어", ["English", "한국어"])
+# Content in both languages
+content = {
+    "English": {
+        "title": "HAERAE Open Research Questions",
+        "intro": """
+        HAERAE is a non-profit research lab focused on the interpretability and evaluation of Korean language models.
+        Our mission is to advance the field with insightful benchmarks and tools. Below is an overview of our projects.
+        We've been doing most of our projects internally, but for those that have been unsolvable,
+        we are planning to open them to get help from the open-source community.
+        """,
+        "challenge_title": "HAERAE-Math Challenge",
+        "challenge_desc": """
+        Today we are introducing our first challenge: HAERAE-Math. We've created high-quality instructions on math
+        but don't have an idea on how to generate high-quality answers for them. We are looking for solutions that
+        use open-source models with openly available licenses.
+        We have created a total of 20,000 instructions already and are generating more. We've opened up a preview
+        of 50 of them in this link: [HAERAE-Math Samples](https://huggingface.co/datasets/HAERAE-HUB/HAERAE-Math-samples)
+        For those who generate answers for the 50 and share the methodology/results with us, we'll share the
+        remaining instructions and credit for the resulting dataset.
+        """,
+        "example_title": "Example Question",
+        "how_to_title": "How to Participate",
+        "how_to": """
+        1. Access the 50 sample questions from the provided Hugging Face dataset link.
+        2. Generate high-quality answers for these questions using open-source models.
+        3. Document your methodology and results.
+        4. Share your findings with us through [contact information or submission form].
+        5. If your approach is promising, we'll provide access to the full dataset of 20,000 instructions.
+        6. Collaborate with us to refine and improve the answer generation process.
+        7. Receive credit as a contributor to the final HAERAE-Math dataset.
+        """,
+        "why_title": "Why Participate?",
+        "why": """
+        - Contribute to advancing Korean language model research
+        - Gain access to a large, high-quality dataset of math instructions
+        - Collaborate with HAERAE researchers
+        - Receive recognition in the field of NLP and math education
+        - Potential for co-authorship on related publications
+        """,
+        "contact_title": "Contact Us",
+        "contact": """
+        For more information or to submit your results, please contact us at:
+        [[email protected]]([email protected])
+        """,
+        "sidebar_title": "About HAERAE",
+        "sidebar_content": """
+        HAERAE is a non-profit research lab dedicated to advancing the field of
+        Korean language model interpretability and evaluation. Our work focuses on
+        creating insightful benchmarks and tools to push the boundaries of NLP research.
+        """
+    },
+    "한국어": {
+        "title": "HAERAE 공개 연구 질문",
+        "intro": """
+        HAERAE는 한국어 언어 모델의 해석 가능성과 평가에 중점을 둔 비영리 연구소입니다.
+        우리의 미션은 통찰력 있는 벤치마크와 도구를 통해 이 분야를 발전시키는 것입니다. 다음은 우리 프로젝트의 개요입니다.
+        대부분의 프로젝트를 내부적으로 수행해 왔지만, 해결하기 어려운 문제들에 대해서는
+        오픈 소스 커뮤니티의 도움을 받고자 공개할 계획입니다.
+        """,
+        "challenge_title": "HAERAE-Math 챌린지",
+        "challenge_desc": """
+        오늘 우리는 첫 번째 챌린지인 HAERAE-Math를 소개합니다. 우리는 수학에 관한 고품질 지시문을 만들었지만
+        이에 대한 고품질 답변을 생성하는 방법에 대한 아이디어가 없습니다. 우리는 공개적으로 사용 가능한 라이선스를 가진
+        오픈 소스 모델을 사용하는 솔루션을 찾고 있습니다.
+        우리는 이미 총 20,000개의 지시문을 만들었고 더 많이 생성하고 있습니다. 우리는 이 중 50개의 미리보기를
+        다음 링크에서 공개했습니다: [HAERAE-Math 샘플](https://huggingface.co/datasets/HAERAE-HUB/HAERAE-Math-samples)
+        50개에 대한 답변을 생성하고 방법론/결과를 우리와 공유하는 분들에게는 나머지 지시문을 공유하고
+        최종 데이터셋에 대한 크레딧을 공유할 것입니다.
+        """,
+        "example_title": "예시 질문",
+        "how_to_title": "참여 방법",
+        "how_to": """
+        1. 제공된 Hugging Face 데이터셋 링크에서 50개의 샘플 질문에 접근합니다.
+        2. 오픈 소스 모델을 사용하여 이 질문들에 대한 고품질 답변을 생성합니다.
+        3. 방법론과 결과를 문서화합니다.
+        4. [연락처 정보 또는 제출 양식]을 통해 귀하의 결과를 우리와 공유합니다.
+        5. 귀하의 접근 방식이 유망하다면, 20,000개의 전체 지시문 데이터셋에 대한 접근 권한을 제공할 것입니다.
+        6. 답변 생성 과정을 개선하고 발전시키기 위해 우리와 협력합니다.
+        7. 최종 HAERAE-Math 데이터셋의 기여자로 인정받습니다.
+        """,
+        "why_title": "왜 참여해야 하나요?",
+        "why": """
+        - 한국어 언어 모델 연구 발전에 기여
+        - 대규모의 고품질 수학 지시문 데이터셋에 접근
+        - HAERAE 연구원들과 협력
+        - NLP 및 수학 교육 분야에서 인정받을 기회
+        - 관련 출판물의 공동 저자가 될 가능성
+        """,
+        "contact_title": "연락처",
+        "contact": """
+        더 많은 정보를 원하시거나 결과를 제출하려면 다음 연락처로 문의해 주세요:
+        [[email protected]]([email protected])
+        """,
+        "sidebar_title": "HAERAE 소개",
+        "sidebar_content": """
+        HAERAE는 한국어 언어 모델의 해석 가능성과 평가 분야를 발전시키는 데 전념하는 비영리 연구소입니다.
+        우리의 연구는 NLP 연구의 경계를 넓히기 위한 통찰력 있는 벤치마크와 도구를 만드는 데 중점을 둡니다.
+        """
+    }
+}
+# Main content
+st.title(content[lang]["title"])
+st.write(content[lang]["intro"])
+st.header(content[lang]["challenge_title"])
+st.write(content[lang]["challenge_desc"])
+st.subheader(content[lang]["example_title"])
 example_question = """
 한국의 보안 전문가가 고도화된 데이터 보호 시스템을 개발하고 있습니다. 이 시스템은 3차원 기하학적 잠금 메커니즘을 사용하는데, 잠금 장치는 원뿔 모양으로 되어 있고, 밑면의 반지름은 6cm, 높이는 8cm입니다. 이 원뿔 모양의 잠금 장치에는 원통 모양의 열쇠가 딱 맞게 들어가게 설계되어 있습니다.
@@ -50,38 +145,17 @@ example_question = """
 st.code(example_question, language="markdown")
+st.header(content[lang]["how_to_title"])
+st.write(content[lang]["how_to"])
+st.header(content[lang]["why_title"])
+st.write(content[lang]["why"])
+st.header(content[lang]["contact_title"])
+st.write(content[lang]["contact"])
+st.sidebar.title(content[lang]["sidebar_title"])
+st.sidebar.info(content[lang]["sidebar_content"])
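
Note on the change: the new code indexes `content[lang][key]` directly, so a key that exists for one language but is missing for the other would raise `KeyError` only when that language is selected in the radio button. Below is a minimal sketch of a consistency check and a fallback lookup, assuming the dict keeps the shape shown in the diff. The names `missing_keys` and `text`, and the trimmed `content` used here, are hypothetical illustrations and not part of the commit.

```python
from typing import Dict, List

# Hypothetical stand-in for the app's `content` dict, trimmed to two keys per language.
content: Dict[str, Dict[str, str]] = {
    "English": {"title": "HAERAE Open Research Questions", "contact_title": "Contact Us"},
    "한국어": {"title": "HAERAE 공개 연구 질문", "contact_title": "연락처"},
}


def missing_keys(content: Dict[str, Dict[str, str]], reference: str = "English") -> Dict[str, List[str]]:
    """Return, per language, the keys the reference language has but that language lacks."""
    expected = set(content[reference])
    report = {}
    for lang, strings in content.items():
        gaps = sorted(expected - set(strings))
        if gaps:
            report[lang] = gaps
    return report


def text(content: Dict[str, Dict[str, str]], lang: str, key: str, fallback: str = "English") -> str:
    """Look up a UI string, falling back to the reference language if the key is absent."""
    return content.get(lang, {}).get(key, content[fallback][key])


if __name__ == "__main__":
    # An empty report means every language defines the same keys as the reference.
    print(missing_keys(content))
```

In the app, a check like this could run once at startup, and calls such as `st.title(content[lang]["title"])` could go through the fallback helper instead. Whether a silent fallback to English is preferable to a visible error is a design choice the commit does not address.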