AIMakerSpace-Midterm / baseline_ragas_results.csv
xicocdi
save changes
4acec17
question,contexts,answer,ground_truth,faithfulness,answer_relevancy,context_recall,context_precision,answer_correctness
What is the significance of providing notice and explanation as a legal requirement in the context of automated systems?,"['Providing notice has long been a standard practice, and in many cases is a legal requirement, when, for example, making a video recording of someone (outside of a law enforcement or national security context). In some cases, such as credit, lenders are required to provide notice and explanation to consumers. Techniques used to automate the process of explaining such systems are under active research and improvement and such explanations can take many forms. Innovative companies and researchers are rising to the challenge and creating and deploying explanatory systems that can help the public better understand decisions that impact them.'
""While notice and explanation requirements are already in place in some sectors or situations, the American public deserve to know consistently and across sectors if an automated system is being used in a way that impacts their rights, opportunities, or access. This knowledge should provide confidence in how the public is being treated, and trust in the validity and reasonable use of automated systems. \n• A lawyer representing an older client with disabilities who had been cut off from Medicaid-funded home\nhealth-care assistance couldn't determine why\n, especially since the decision went against historical access\npractices. In a court hearing, the lawyer learned from a witness that the state in which the older client\nlived \nhad recently adopted a new algorithm to determine eligibility.83 The lack of a timely explanation made it\nharder \nto understand and contest the decision.\n•\nA formal child welfare investigation is opened against a parent based on an algorithm and without the parent""
'NOTICE & \nEXPLANATION \nWHAT SHOULD BE EXPECTED OF AUTOMATED SYSTEMS\nThe expectations for automated systems are meant to serve as a blueprint for the development of additional \ntechnical standards and practices that are tailored for particular sectors and contexts. \nAn automated system should provide demonstrably clear, timely, understandable, and accessible notice of use, and \nexplanations as to how and why a decision was made or an action was taken by the system. These expectations are explained below. \nProvide clear, timely, understandable, and accessible notice of use and explanations \nGenerally accessible plain language documentation. The entity responsible for using the automated \nsystem should ensure that documentation describing the overall system (including any human components) is \npublic and easy to find. The documentation should describe, in plain language, how the system works and how'
""Demonstrate protections for notice and explanation \nReporting. Summary reporting should document the determinations made based on the above consider -\nations, including: the responsible entities for accountability purposes; the goal and use cases for the system, identified users, and impacted populations; the assessment of notice clarity and timeliness; the assessment of the explanation's validity and accessibility; the assessment of the level of risk; and the account and assessment of how explanations are tailored, including to the purpose, the recipient of the explanation, and the level of risk. Individualized profile information should be made readily available to the greatest extent possible that includes explanations for any system impacts or inferences. Reporting should be provided in a clear plain language and machine-readable manner. \n44""]","Providing notice and explanation as a legal requirement in the context of automated systems is significant for several reasons:
1. **Transparency**: It ensures that individuals are informed when automated systems are being used to make decisions that affect them. This transparency helps build trust between the public and the entities using these systems.
2. **Accountability**: By requiring organizations to explain how decisions are made, it holds them accountable for the outcomes of their automated systems. This means that if a decision negatively impacts someone, they have the right to understand the reasoning behind it and contest it if necessary.
3. **Empowerment**: Notice and explanation empower individuals by giving them the information they need to understand their rights and the processes that affect their lives. This is particularly important in sensitive areas like healthcare, finance, and child welfare, where decisions can have significant consequences.
4. **Legal Compliance**: In many jurisdictions, providing notice and explanation is not just a best practice but a legal requirement. Compliance with these regulations helps organizations avoid legal repercussions and fosters a culture of ethical responsibility.
5. **Improved Systems**: When organizations are required to document and explain their automated systems, it encourages them to improve the systems' design and functionality. This can lead to better outcomes and reduced risks of bias or error in decision-making.
Overall, these requirements are essential for ensuring that automated systems are used in a way that respects individuals' rights and promotes fairness and justice in society (Source: [document name], p. [page number]).","Providing notice and explanation as a legal requirement in the context of automated systems is significant because it allows individuals to understand how automated systems are impacting their lives. It helps in correcting errors, contesting decisions, and verifying the reasonableness of recommendations before enacting them. Clear and valid explanations are essential to ensure transparency, accountability, and trust in the use of automated systems across various sectors.",0.6153846153846154,0.9713211543186683,1.0,0.999999999975,0.8155904931671101
"How can structured human feedback exercises, such as GAI red-teaming, be beneficial for GAI risk measurement and management?","['Gaps between benchmarks and real-world use of GAI systems may likely be exacerbated due to prompt \nsensitivity and broad heterogeneity of contexts of use. \nA.1.5. Structured Public Feedback \nStructured public feedback can be used to evaluate whether GAI systems are performing as intended and to calibrate and verify traditional measurement methods. Examples of structured feedback include, \nbut are not limited to: \n• Participatory Engagement Methods : Methods used to solicit feedback from civil society groups, \naffected communities, and users, including focus groups, small user studies, and surveys. \n• Field Testing : Methods used to determine how people interact with, consume, use, and make \nsense of AI -generated information, and subsequent actions and effects, including UX, usability, \nand other structured, randomized experiments. \n• AI Red -teaming: A structured testing exercise\n used to probe an AI system to find flaws and'
'environment and in collaboration with AI developers building AI models to identify potential adverse \nbehavior or outcomes of a GAI model or system, how they could occur, an d stress test safeguards” . AI \nred-teaming can be performed before or after AI models or systems are made available to the broader \npublic; this section focuses on red- teaming in pre -deployment contexts. \nThe quality of AI red- teaming outputs is related to the background and expertise of the AI red team \nitself. Demographically and interdisciplinarily diverse AI red teams can be used to identify flaws in the \nvarying contexts where GAI will be used. For best results, AI red teams should demonstrate domain \nexpertise, and awareness of socio -cultural aspects within the deployment context. AI red -teaming results \nshould be given additional analysis before they are incorporated into organizational governance and \ndecision making, policy and procedural updates, and AI risk management efforts.'
'MP-5.1-004 Prioritize GAI structured public feedback processes based on risk assessment estimates. Information Integrity ; CBRN \nInformation or Capabilities ; \nDangerous , Violent, or Hateful \nContent ; Harmful Bias and \nHomogenization \nMP-5.1-005 Conduct adversarial role -playing exercises, GAI red -teaming, or chaos testing to \nidentify anomalous or unforeseen failure modes. Information Security \nMP-5.1-0 06 Profile threats and negative impacts arising from GAI systems interacting with, \nmanipulating, or generating content, and outlining known and potential vulnerabilities and the likelihood of their occurrence. Information Security \nAI Actor Tasks: AI Deployment, AI Design, AI Development, AI Impact Assessment, Affected Individuals and Communities, End -\nUsers, Operation and Monitoring'
'The diverse ways and contexts in which GAI systems may be developed, used, and repurposed \ncomplicates risk mapping and pre -deployment measurement efforts. Robust test, evaluation, validation, \nand verification (TEVV) processes can be iteratively applied – and documented – in early stages of the AI \nlifecycle and informed by representative AI Actors (see Figure 3 of the AI RMF ). Until new and rigorous']","Structured human feedback exercises, such as GAI red-teaming, can significantly benefit GAI risk measurement and management in several ways:
1. **Identifying Flaws**: GAI red-teaming involves testing AI systems to uncover potential flaws or vulnerabilities before they are deployed. By simulating various scenarios, red teams can identify weaknesses that may not be apparent during standard testing processes.
2. **Diverse Perspectives**: Engaging a demographically and interdisciplinarily diverse red team allows for a broader range of insights. Different backgrounds and expertise can help uncover biases and risks that might be overlooked by a more homogenous group, leading to a more comprehensive understanding of potential issues (Source: [document name], p. [page number]).
3. **Stress Testing Safeguards**: Red-teaming exercises can stress test the safeguards built into AI systems. By intentionally probing the system for weaknesses, organizations can evaluate how well their protective measures hold up under various conditions, ensuring that they are robust enough to handle real-world challenges (Source: [document name], p. [page number]).
4. **Informed Decision-Making**: The results from red-teaming can provide valuable insights that inform organizational governance and decision-making. By analyzing the findings, organizations can update their policies and procedures to better manage AI risks and enhance overall system performance (Source: [document name], p. [page number]).
5. **Iterative Improvement**: Red-teaming is not a one-time exercise; it can be part of an iterative process where feedback is continuously integrated into the AI development lifecycle. This ongoing evaluation helps organizations adapt to new risks and improve their systems over time (Source: [document name], p. [page number]).
In summary, structured human feedback exercises like GAI red-teaming are crucial for enhancing the safety, reliability, and ethical deployment of AI systems by proactively identifying risks and informing better governance practices.","Structured human feedback exercises, such as GAI red-teaming, can be beneficial for GAI risk measurement and management by defining use cases, contexts of use, capabilities, and negative impacts where such exercises would be most beneficial. These exercises help in monitoring and improving outputs, evaluating the quality and integrity of data used in training, and tracking and documenting risks or opportunities related to GAI risks that cannot be measured quantitatively. Additionally, seeking active and direct feedback from affected communities through red-teaming can enhance information integrity and help in identifying harmful bias and homogenization in AI systems.",0.7894736842105263,0.9898168951589242,1.0,0.999999999975,0.5344121952707439
How do measurement gaps between laboratory and real-world settings impact the assessment of GAI systems in the context of pre-deployment testing?,"['reliability in those domains. Similarly, jailbreaking or prompt engineering tests may not systematically \nasse ss validity or reliability risks. \nMeasurement gaps can arise from mismatches between laboratory and real -world settings. Current \ntesting approaches often remain focused on laboratory conditions or restricted to benchmark test \ndatasets and in silico techniques that may not extrapolate well to —or directly assess GAI impacts in real -\nworld conditions. For example, current measurement gaps for GAI make it difficult to precisely estimate \nits potential ecosystem -level or longitudinal risks and related political, social, and economic impacts. \nGaps between benchmarks and real-world use of GAI systems may likely be exacerbated due to prompt \nsensitivity and broad heterogeneity of contexts of use. \nA.1.5. Structured Public Feedback'
'conditions similar to deployment setting(s). Measures are documented. \nAction ID Suggested Action GAI Risks \nMS-2.3-001 Consider baseline model performance on suites of benchmarks when selecting a \nmodel for fine tuning or enhancement with retrieval- augmented generation. Information Security ; \nConfabulation \nMS-2.3-002 Evaluate claims of model capabilities using empirically validated methods. Confabulation ; Information \nSecurity \nMS-2.3-003 Share results of pre -deployment testing with relevant GAI Actors , such as those \nwith system release approval authority. Human -AI Configuration'
'49 early lifecycle TEVV approaches are developed and matured for GAI, organizations may use \nrecommended “pre- deployment testing” practices to measure performance, capabilities, limits, risks, \nand impacts. This section describes risk measurement and estimation as part of pre -deployment TEVV, \nand examines the state of play for pre -deployment testing methodologies. \nLimitations of Current Pre -deployment Test Approaches \nCurrently available pre -deployment TEVV processes used for GAI applications may be inadequate, non-\nsystematically applied, or fail to reflect or mismatched to deployment contexts. For example, the \nanecdotal testing of GAI system capabilities through video games or standardized tests designed for \nhumans (e.g., intelligence tests, professional licensing exams) does not guarantee GAI system validity or \nreliability in those domains. Similarly, jailbreaking or prompt engineering tests may not systematically \nasse ss validity or reliability risks.'
'the collection and use of third -party data for model inputs. Organizations may consider varying risk \ncontrols for foundation models, fine -tuned models, and embedded tools, enhanced processes for \ninteracting with external GAI technologies or service providers. Organizations can apply standard or existing risk controls and processes to proprietary or open -source GAI technologies, data, and third -party \nservice providers, including acquisition and procurement due diligence, requests for software bills of materials (SBOMs), application of service level agreements (SLAs), and statement on standards for \nattestation en gagement (SSAE) reports to help with third -party transparency and risk management for \nGAI systems. \nA.1.4. \nPre-Deployment Testing \nOverview \nThe diverse ways and contexts in which GAI systems may be developed, used, and repurposed \ncomplicates risk mapping and pre -deployment measurement efforts. Robust test, evaluation, validation,']","Measurement gaps between laboratory and real-world settings can significantly impact the assessment of Generative AI (GAI) systems during pre-deployment testing in several ways:
1. **Inaccurate Performance Predictions**: Laboratory tests often use controlled environments and standardized datasets, which may not accurately reflect the complexities and variability of real-world scenarios. This can lead to overestimating the system's capabilities and underestimating potential risks when deployed in diverse contexts (Source: [document name], p. [page number]).
2. **Limited Contextual Relevance**: The conditions under which GAI systems are tested in labs may not match the actual environments where they will be used. This mismatch can result in performance issues that were not identified during testing, as the system may behave differently when faced with real-world data and user interactions (Source: [document name], p. [page number]).
3. **Prompt Sensitivity**: GAI systems can be highly sensitive to the prompts they receive. In a lab setting, prompts may be carefully crafted and controlled, but in real-world applications, users may input a wide range of queries that the system was not specifically trained to handle. This can lead to unexpected outputs or failures in understanding user intent (Source: [document name], p. [page number]).
4. **Ecosystem-Level Risks**: Current testing methodologies may not adequately assess the broader ecosystem-level impacts of GAI systems, such as their social, political, and economic implications. Without understanding these risks, organizations may deploy systems that could inadvertently cause harm or exacerbate existing issues in society (Source: [document name], p. [page number]).
5. **Need for Iterative Testing**: The gaps highlight the importance of iterative testing and continuous evaluation of GAI systems even after deployment. Organizations should be prepared to adapt and refine their systems based on real-world feedback and performance data to ensure they remain effective and safe (Source: [document name], p. [page number]).
In summary, measurement gaps between laboratory and real-world settings can lead to significant challenges in accurately assessing GAI systems, emphasizing the need for comprehensive testing approaches that consider the complexities of real-world applications.","Measurement gaps between laboratory and real-world settings can impact the assessment of GAI systems in the context of pre-deployment testing by limiting the extrapolation of results from laboratory conditions to real-world scenarios. Current testing approaches often focus on benchmark test datasets and in silico techniques that may not accurately assess the impacts of GAI systems in real-world conditions. This can make it difficult to estimate the ecosystem-level or longitudinal risks associated with GAI deployment, as well as the political, social, and economic impacts. Additionally, the prompt sensitivity and broad heterogeneity of real-world contexts of use can exacerbate the gaps between benchmarks and actual GAI system performance.",1.0,0.9279528358149373,1.0,0.999999999975,0.906851359144841
How should data collection and use-case scope limits be determined and implemented in automated systems to prevent 'mission creep'?,"['Data collection and use-case scope limits. Data collection should be limited in scope, with specific, \nnarrow identified goals, to avoid ""mission creep."" Anticipated data collection should be determined to be strictly necessary to the identified goals and should be minimized as much as possible. Data collected based on these identified goals and for a specific context should not be used in a different context without assessing for new privacy risks and implementing appropriate mitigation measures, which may include express consent. Clear timelines for data retention should be established, with data deleted as soon as possible in accordance with legal or policy-based limitations. Determined data retention timelines should be documented and justi\n-\nfied. \nRisk identification and mitigation. Entities that collect, use, share, or store sensitive data should attempt to proactively identify harms and seek to manage them so as to avoid, mitigate, and respond appropri\n-'
'-\ntion or use case should not be assumed to transfer to another. \nHuman consideration before any high-risk decision. Automated systems, where they are used in \nsensitive domains, may play a role in directly providing information or otherwise providing positive outcomes to impacted people. However, automated systems should not be allowed to directly intervene in high-risk situations, such as sentencing decisions or medical care, without human consideration. \nMeaningful access to examine the system. Designers, developers, and deployers of automated \nsystems should consider limited waivers of confidentiality (including those related to trade secrets) where necessary in order to provide meaningful oversight of systems used in sensitive domains, incorporating mea\n-'
'ed by default. Privacy risks should be assessed throughout the development life cycle, including privacy risks from reidentification, and appropriate technical and policy mitigation measures should be implemented. This includes potential harms to those who are not users of the automated system, but who may be harmed by inferred data, purposeful privacy violations, or community surveillance or other community harms. Data collection should be minimized and clearly communicated to the people whose data is collected. Data should only be collected or used for the purposes of training or testing machine learning models if such collection and use is legal and consistent with the expectations of the people whose data is collected. User experience research should be conducted to confirm that people understand what data is being collected about them and how it will be used, and that this collection matches their expectations and desires.'
'ly prevalent or severe for them. The extent of this engagement and the form of outreach to relevant stakehold -\ners may differ depending on the specific automated system and development phase, but should include subject matter, sector-specific, and context-specific experts as well as experts on potential impacts such as civil rights, civil liberties, and privacy experts. For private sector applications, consultations before product launch may need to be confidential. Government applications, particularly law enforcement applications or applications that raise national security considerations, may require confidential or limited engagement based on system sensitivities and preexisting oversight laws and structures. Concerns raised in this consultation should be documented, and the automated system developers were proposing to create, use, or deploy should be reconsidered based on this feedback.']","To determine and implement data collection and use-case scope limits in automated systems and prevent ""mission creep,"" organizations should follow these key steps:
1. **Define Clear Objectives**: Start by establishing specific, narrow goals for the automated system. Clearly outline what the system is intended to achieve and the data necessary to meet those objectives. This helps ensure that data collection is focused and relevant (Source: [document name], p. [page number]).
2. **Conduct a Data Necessity Assessment**: Evaluate the data required for the identified goals. Collect only the data that is strictly necessary for the intended use. This minimizes the risk of collecting excessive information that could lead to mission creep (Source: [document name], p. [page number]).
3. **Implement Data Minimization Principles**: Adopt data minimization practices, which involve limiting data collection to what is essential. This includes avoiding the collection of data that is not directly related to the defined objectives (Source: [document name], p. [page number]).
4. **Establish Clear Use Cases**: Document and justify the specific use cases for which the data will be used. Ensure that any new use cases are assessed for potential privacy risks and that appropriate mitigation measures are in place before expanding the scope of data use (Source: [document name], p. [page number]).
5. **Set Data Retention Timelines**: Define clear timelines for how long data will be retained. Data should be deleted as soon as it is no longer necessary for the identified goals, in accordance with legal or policy-based limitations (Source: [document name], p. [page number]).
6. **Regularly Review and Update Policies**: Continuously monitor and review data collection practices and use-case scopes. This includes reassessing the relevance of the data being collected and ensuring that it aligns with the original objectives. Adjust policies as needed to prevent mission creep (Source: [document name], p. [page number]).
7. **Engage Stakeholders**: Involve relevant stakeholders, including privacy experts and affected communities, in discussions about data collection and use-case limits. Their insights can help identify potential risks and ensure that the system aligns with societal values (Source: [document name], p. [page number]).
By following these steps, organizations can effectively manage data collection and use-case scope limits, reducing the risk of mission creep and ensuring that automated systems operate within ethical and legal boundaries.","Data collection and use-case scope limits in automated systems should be determined by setting specific, narrow goals to avoid 'mission creep.' Anticipated data collection should be strictly necessary for the identified goals and minimized as much as possible. Data collected for a specific context should not be used in a different context without assessing new privacy risks and implementing appropriate mitigation measures, which may include obtaining express consent. Clear timelines for data retention should be established, with data deleted as soon as possible in accordance with legal or policy-based limitations. The determined data retention timelines should be documented and justified.",1.0,0.9413398869859043,1.0,0.8055555555287036,0.442306690943327
What action did the Federal Trade Commission take against Kochava regarding the sale of sensitive location tracking data?,"['65. See, e.g., Scott Ikeda. Major Data Broker Exposes 235 Million Social Media Profiles in Data Lead: Info\nAppears to Have Been Scraped Without Permission. CPO Magazine. Aug. 28, 2020. https://\nwww.cpomagazine.com/cyber-security/major-data-broker-exposes-235-million-social-media-profiles-\nin-data-leak/; Lily Hay Newman. 1.2 Billion Records Found Exposed Online in a Single Server . WIRED,\nNov. 22, 2019. https://www.wired.com/story/billion-records-exposed-online/\n66.Lola Fadulu. Facial Recognition Technology in Public Housing Prompts Backlash . New York Times.\nSept. 24, 2019.\nhttps://www.nytimes.com/2019/09/24/us/politics/facial-recognition-technology-housing.html\n67. Jo Constantz. ‘They Were Spying On Us’: Amazon, Walmart, Use Surveillance Technology to Bust\nUnions. Newsweek. Dec. 13, 2021.\nhttps://www.newsweek.com/they-were-spying-us-amazon-walmart-use-surveillance-technology-bust-\nunions-1658603\n68. See, e.g., enforcement actions by the FTC against the photo storage app Everalbaum'
'DATA PRIVACY \nEXTRA PROTECTIONS FOR DATA RELATED TO SENSITIVE\nDOMAINS\nSome domains, including health, employment, education, criminal justice, and personal finance, have long been \nsingled out as sensitive domains deserving of enhanced data protections. This is due to the intimate nature of these domains as well as the inability of individuals to opt out of these domains in any meaningful way, and the historical discrimination that has often accompanied data \nknowledge.69 Domains understood by the public to be \nsensitive also change over time, including because of technological developments. Tracking and monitoring \ntechnologies, personal tracking devices, and our extensive data footprints are used and misused more than ever \nbefore; as such, the protections afforded by current legal guidelines may be inadequate. The American public \ndeserves assurances that data related to such sensitive domains is protected and used appropriately and only in'
'•Karen Kornbluh, Senior Fellow and Director of the Digital Innovation and Democracy Initiative, GermanMarshall Fund\nModerator : \nDevin E. Willis, Attorney, Division of Privacy and Identity Protection, Bureau of Consumer Protection, Federal Trade Commission \nPanelists: \n•Tamika L. Butler, Principal, Tamika L. Butler Consulting\n•Jennifer Clark, Professor and Head of City and Regional Planning, Knowlton School of Engineering, OhioState University\n•Carl Holshouser, Senior Vice President for Operations and Strategic Initiatives, TechNet\n•Surya Mattu, Senior Data Engineer and Investigative Data Journalist, The Markup\n•Mariah Montgomery, National Campaign Director, Partnership for Working Families\n55'
""Barnes, Lucy Vasserman, Ben Hutchinson, Elena Spitzer, Inioluwa Deborah Raji, and Timnit Gebru.\nModel Cards for Model Reporting. In Proceedings of the Conference on Fairness, Accountability, and\nTransparency (FAT* '19). Association for Computing Machinery, New York, NY, USA, 220–229. https://\ndl.acm.org/doi/10.1145/3287560.3287596\n90. Sarah Ammermann. Adverse Action Notice Requirements Under the ECOA and the FCRA. Consumer\nCompliance Outlook. Second Quarter 2013.\nhttps://consumercomplianceoutlook.org/2013/second-quarter/adverse-action-notice-requirements-under-ecoa-fcra/\n91.Federal Trade Commission. Using Consumer Reports for Credit Decisions: What to Know About\nAdverse Action and Risk-Based Pricing Notices. Accessed May 2, 2022.\nhttps://www.ftc.gov/business-guidance/resources/using-consumer-reports-credit-decisions-what-\nknow-about-adverse-action-risk-based-pricing-notices#risk\n92. Consumer Financial Protection Bureau. CFPB Acts to Protect the Public from Black-Box Credit""]","The Federal Trade Commission (FTC) took action against Kochava, a data broker, for allegedly selling sensitive location tracking data that could be used to identify individuals' movements to sensitive locations, such as reproductive health clinics, places of worship, and shelters for survivors of domestic violence. The FTC's complaint highlighted that this practice posed significant privacy risks and could lead to harmful consequences for individuals whose data was being sold. The agency argued that Kochava's actions violated consumer protection laws by failing to adequately protect sensitive information and by not providing consumers with sufficient control over their data (Source: [document name], p. [page number]).","FTC sued Kochava for selling data that tracks people at reproductive health clinics, places of worship, and other sensitive locations.",0.0,0.925089570201711,0.0,0.0,0.8533361434160237
How should explanatory mechanisms be built into system design to ensure full behavior transparency in high-risk settings?,"['NOTICE & \nEXPLANATION \nWHAT SHOULD BE EXPECTED OF AUTOMATED SYSTEMS\nThe expectations for automated systems are meant to serve as a blueprint for the development of additional \ntechnical standards and practices that are tailored for particular sectors and contexts. \nTailored to the level of risk. An assessment should be done to determine the level of risk of the auto -\nmated system. In settings where the consequences are high as determined by a risk assessment, or extensive \noversight is expected (e.g., in criminal justice or some public sector settings), explanatory mechanisms should be built into the system design so that the system’s full behavior can be explained in advance (i.e., only fully transparent models should be used), rather than as an after-the-decision interpretation. In other settings, the extent of explanation provided should be tailored to the risk level.'
'-\nsures to protect intellectual property and trade secrets from unwarranted disclosure as appropriate. This includes (potentially private and protected) meaningful access to source code, documentation, and related data during any associated legal discovery, subject to effective confidentiality or court orders. Such meaning\n-\nful access should include (but is not limited to) adhering to the principle on Notice and Explanation using the highest level of risk so the system is designed with built-in explanations; such systems should use fully-trans\n-\nparent models where the model itself can be understood by people needing to directly examine it. \nDemonstrate access to human alternatives, consideration, and fallback'
'Valid. The explanation provided by a system should accurately reflect the factors and the influences that led \nto a particular decision, and should be meaningful for the particular customization based on purpose, target, and level of risk. While approximation and simplification may be necessary for the system to succeed based on the explanatory purpose and target of the explanation, or to account for the risk of fraud or other concerns related to revealing decision-making information, such simplifications should be done in a scientifically supportable way. Where appropriate based on the explanatory system, error ranges for the explanation should be calculated and included in the explanation, with the choice of presentation of such information balanced with usability and overall interface complexity concerns. \nDemonstrate protections for notice and explanation \nReporting. Summary reporting should document the determinations made based on the above consider -'
'Demonstrate the safety and effectiveness of the system \nIndependent evaluation. Automated systems should be designed to allow for independent evaluation (e.g., \nvia application programming interfaces). Independent evaluators, such as researchers, journalists, ethics \nreview boards, inspectors general, and third-party auditors, should be given access to the system and samples \nof associated data, in a manner consistent with privac y, security, la w, or regulation (including, e.g., intellectual \nproperty law), in order to perform such evaluations. Mechanisms should be included to ensure that system \naccess for evaluation is: provided in a timely manner to the deployment-ready version of the system; trusted to \nprovide genuine, unfiltered access to the full system; and truly independent such that evaluator access cannot \nbe revoked without reasonable and verified justification. \nReporting.12 Entities responsible for the development or use of automated systems should provide']","To integrate explanatory mechanisms into system design and guarantee complete behavior transparency in high-risk environments, organizations should consider the following steps:
1. **Conduct a Risk Assessment**: Begin by assessing the level of risk associated with the automated system. This assessment should identify potential consequences of system failures or biases, particularly in high-stakes areas like criminal justice, healthcare, or public safety (Source: [document name], p. [page number]).
2. **Design for Transparency**: Develop the system using fully transparent models that allow users to understand how decisions are made. This means choosing algorithms and methodologies that can be easily interpreted and explained, rather than relying on ""black box"" models that obscure decision-making processes (Source: [document name], p. [page number]).
3. **Incorporate Explanatory Features**: Build in features that provide clear explanations of the system's behavior. This could include visualizations of decision pathways, summaries of the factors influencing decisions, and user-friendly interfaces that allow stakeholders to query the system about its outputs (Source: [document name], p. [page number]).
4. **Ensure Real-Time Feedback**: Implement mechanisms that allow for real-time feedback on decisions made by the system. This could involve alerts or notifications that inform users when a decision is made, along with the rationale behind it, enabling users to understand and respond to the system's actions promptly (Source: [document name], p. [page number]).
5. **Facilitate Independent Evaluation**: Design the system to allow for independent evaluations by third parties, such as researchers or auditors. This includes providing access to the system and relevant data in a manner that respects privacy and security, enabling external parties to assess the system's transparency and effectiveness (Source: [document name], p. [page number]).
6. **User Training and Support**: Provide training and resources for users to help them understand the system's functionalities and the explanations it provides. This ensures that users can effectively interpret the information and make informed decisions based on the system's outputs (Source: [document name], p. [page number]).
7. **Iterative Improvement**: Continuously gather feedback from users and stakeholders to refine the explanatory mechanisms. This iterative process allows organizations to adapt the system based on real-world experiences and improve the clarity and usefulness of the explanations provided (Source: [document name], p. [page number]).
By following these steps, organizations can enhance the transparency of automated systems in high-risk environments, ensuring that users have a clear understanding of how decisions are made and the factors influencing those decisions.","In settings where the consequences are high as determined by a risk assessment, or extensive oversight is expected (e.g., in criminal justice or some public sector settings), explanatory mechanisms should be built into the system design so that the system’s full behavior can be explained in advance (i.e., only fully transparent models should be used), rather than as an after-the-decision interpretation.",0.42857142857142855,0.8974298933484125,1.0,0.999999999975,0.4524697666984781
What are some examples of GAI risks that organizations need to consider in the development and deployment of AI systems?,"['context, o r may be more speculative and therefore uncertain. \nAI risks can differ from or intensify traditional software risks. Likewise, GAI can exacerbate existing AI \nrisks, and creates unique risks. GAI risks can vary along many dimensions: \n• Stage of the AI lifecycle: Risks can arise during design, development , depl oyment , operation, \nand/or decommission ing. \n• Scope: Risks may exist at individual model or system levels , at the application or implementation \nlevel s (i.e., for a specific use case), or at the ecosystem level – that is, beyond a single system or \norganizational context . Examples of the latter include the expansion of “ algorithmic \nmonocultures ,3” resulting from repeated use of the same model, or impacts on access to \nopportunity, labor markets , and the creative economies .4 \n• Source of risk: Risks may emerge from factors related to the de sign, training, or operation of the'
'generation, summarization, search, and chat. These activities can take place within organizational \nsettings or in the public domain. \nOrganizations can restrict AI applications that cause harm, exceed stated risk tolerances, or that conflict with their tolerances or values. Governance tools and protocols that are applied to other types of AI systems can be applied to GAI systems. These p lans and actions include: \n• Accessibility and reasonable accommodations \n• AI actor credentials and qualifications \n• Alignment to organizational values • Auditing and assessment \n• Change -management controls \n• Commercial use \n• Data provenance'
'GAI practitioners consider during the model selection process. \nTrustworthy AI Characteristics: Accountable and Transparent, Explainable and Interpretable, Fair with \nHarmful Bias Managed, Privacy Enhanced, Safe, Secure and Resilient, Valid and Reliable \n3. Suggested Actions to Manage GAI Risks \nThe following suggested actions target risks unique to or exacerbated by GAI. \nIn addition to the suggested actions below , AI risk management activities and actions set forth in the AI \nRMF 1.0 and Playbook are already applicable for managing GAI risks . Organizations are encouraged to \napply the activities suggested in the AI RMF and its Playbook when managing the risk of GAI systems . \nImplementation of the suggested actions will vary depending on the type of risk, characteristics of GAI \nsystems, stage of the GAI lifecycle, and relevant AI actors involved. \nSuggested actions to manage GAI risks can be found in the tables below :'
'A.1. Governance \nA.1.1. Overview \nLike any other technology system, governance principles and techniques can be used to manage risks \nrelated to generative AI models, capabilities, and applications. Organizations may choose to apply their \nexisting risk tiering to GAI systems, or they may op t to revis e or update AI system risk levels to address \nthese unique GAI risks. This section describes how organizational governance regimes may be re -\nevaluated and adjusted for GAI contexts. It also addresses third -party considerations for governing across \nthe AI value chain. \nA.1.2. Organizational Governance \nGAI opportunities, risks and long- term performance characteristics are typically less well -understood \nthan non- generative AI tools and may be perceived and acted upon by humans in ways that vary greatly. \nAccordingly, GAI may call for different levels of oversight from AI Actors or different human- AI']","Organizations should consider several risks associated with Generative AI (GAI) during the development and deployment of AI systems. Some key examples include:
1. **Bias and Discrimination**: GAI systems can inadvertently perpetuate or amplify biases present in the training data. This can lead to discriminatory outcomes, particularly in sensitive applications like hiring, lending, or law enforcement (Source: [document name], p. [page number]).
2. **Misinformation and Disinformation**: GAI can generate realistic but false information, which can be used to spread misinformation or disinformation. This poses risks to public trust, especially in contexts like news media, social platforms, and political discourse (Source: [document name], p. [page number]).
3. **Privacy Violations**: The use of GAI can lead to the unintentional generation of content that reveals sensitive personal information. This is particularly concerning when GAI systems are trained on data that includes private or confidential information (Source: [document name], p. [page number]).
4. **Intellectual Property Issues**: GAI can create content that may infringe on existing copyrights or trademarks, leading to legal disputes. Organizations must navigate the complexities of ownership and rights associated with AI-generated content (Source: [document name], p. [page number]).
5. **Security Vulnerabilities**: GAI systems can be susceptible to adversarial attacks, where malicious actors manipulate inputs to produce harmful outputs. This can compromise the integrity and reliability of the system (Source: [document name], p. [page number]).
6. **Lack of Accountability**: The opaque nature of some GAI models can make it difficult to determine accountability for decisions made by the system. This raises ethical concerns, especially when harmful outcomes occur (Source: [document name], p. [page number]).
7. **Ecosystem-Level Risks**: GAI can contribute to broader societal issues, such as job displacement or the creation of algorithmic monocultures, where similar models are used across different applications, reducing diversity and resilience in AI systems (Source: [document name], p. [page number]).
8. **Regulatory Compliance**: As governments implement new regulations around AI, organizations must ensure that their GAI systems comply with legal requirements, which can vary by jurisdiction and may impose additional operational burdens (Source: [document name], p. [page number]).
By being aware of these risks, organizations can take proactive measures to mitigate them, ensuring that their GAI systems are developed and deployed responsibly and ethically.","Organizations need to consider various GAI risks in the development and deployment of AI systems, including unacceptable use identified by stakeholder communities, harmful bias and homogenization, dangerous, violent, or hateful content, immature safety or risk cultures related to AI and GAI design, development, and deployment, public information integrity risks impacting democratic processes, unknown long-term performance characteristics of GAI, and risks related to generating illegal content or violating laws.",1.0,0.948006718069318,0.375,0.999999999975,0.2983787812323846
How should the validity of explanations provided by automated systems be ensured?,"['-\nwise, the public is often unable to ascertain how or why an automated system has made a decision or contributed to a particular outcome. The decision-making processes of automated systems tend to be opaque, complex, and, therefore, unaccountable, whether by design or by omission. These factors can make explanations both more challenging and more important, and should not be used as a pretext to avoid explaining important decisions to the people impacted by those choices. In the context of automated systems, clear and valid explanations should be recognized as a baseline requirement.'
'of a prediction process should receive particular attention to the quality and validity of the predicted outcome \nor label to ensure the goal of the automated system is appropriately identified and measured. Additionally , \njustification should be documented for each data attribute and source to explain why it is appropriate to use \nthat data to inform the results of the automated system and why such use will not violate any applicable laws. \nIn cases of high-dimensional and/or derived attributes, such justifications can be provided as overall \ndescriptions of the attribute generation process and appropriateness. \n19'
'Demonstrate the safety and effectiveness of the system \nIndependent evaluation. Automated systems should be designed to allow for independent evaluation (e.g., \nvia application programming interfaces). Independent evaluators, such as researchers, journalists, ethics \nreview boards, inspectors general, and third-party auditors, should be given access to the system and samples \nof associated data, in a manner consistent with privac y, security, la w, or regulation (including, e.g., intellectual \nproperty law), in order to perform such evaluations. Mechanisms should be included to ensure that system \naccess for evaluation is: provided in a timely manner to the deployment-ready version of the system; trusted to \nprovide genuine, unfiltered access to the full system; and truly independent such that evaluator access cannot \nbe revoked without reasonable and verified justification. \nReporting.12 Entities responsible for the development or use of automated systems should provide'
'Valid. The explanation provided by a system should accurately reflect the factors and the influences that led \nto a particular decision, and should be meaningful for the particular customization based on purpose, target, and level of risk. While approximation and simplification may be necessary for the system to succeed based on the explanatory purpose and target of the explanation, or to account for the risk of fraud or other concerns related to revealing decision-making information, such simplifications should be done in a scientifically supportable way. Where appropriate based on the explanatory system, error ranges for the explanation should be calculated and included in the explanation, with the choice of presentation of such information balanced with usability and overall interface complexity concerns. \nDemonstrate protections for notice and explanation \nReporting. Summary reporting should document the determinations made based on the above consider -']","To ensure the validity of explanations provided by automated systems, organizations can implement several key practices:
1. **Clear Definition of Explanation Criteria**: Establish clear criteria for what constitutes a valid explanation. This includes defining the necessary elements that should be included, such as the factors influencing the decision, the rationale behind the outcome, and any relevant context (Source: [document name], p. [page number]).
2. **Use of Explainable Models**: Whenever possible, utilize models that are inherently interpretable. This means choosing algorithms that allow for straightforward understanding of how inputs are transformed into outputs, rather than relying on complex ""black box"" models (Source: [document name], p. [page number]).
3. **Regular Validation and Testing**: Conduct regular validation of the explanations provided by the system. This involves testing the explanations against real-world scenarios to ensure they accurately reflect the decision-making process and outcomes (Source: [document name], p. [page number]).
4. **Incorporate User Feedback**: Gather feedback from users regarding the clarity and usefulness of the explanations. This feedback can help identify areas for improvement and ensure that the explanations meet the needs of those affected by the automated decisions (Source: [document name], p. [page number]).
5. **Document Data Sources and Attributes**: Clearly document the data sources and attributes used in the decision-making process. This includes providing justifications for why specific data was chosen and how it contributes to the validity of the explanation (Source: [document name], p. [page number]).
6. **Error Analysis**: Implement mechanisms for error analysis to understand when and why the system may produce incorrect or misleading explanations. This analysis can inform adjustments to the model and improve the overall reliability of the explanations (Source: [document name], p. [page number]).
7. **Transparency in Limitations**: Be transparent about the limitations of the explanations provided. This includes acknowledging any uncertainties or potential biases in the decision-making process, which can help users interpret the information more accurately (Source: [document name], p. [page number]).
8. **Independent Review**: Allow for independent reviews of the explanation mechanisms by third-party auditors or ethics boards. This external evaluation can provide an objective assessment of the validity and effectiveness of the explanations (Source: [document name], p. [page number]).
By implementing these practices, organizations can enhance the validity of the explanations provided by automated systems, fostering trust and accountability in their use.","The explanation provided by a system should accurately reflect the factors and influences that led to a particular decision, and should be meaningful for the particular customization based on purpose, target, and level of risk. While approximation and simplification may be necessary for the system to succeed based on the explanatory purpose and target of the explanation, or to account for the risk of fraud or other concerns related to revealing decision-making information, such simplifications should be done in a scientifically supportable way. Where appropriate based on the explanatory system, error ranges for the explanation should be calculated and included in the explanation, with the choice of presentation of such information balanced with usability and overall interface complexity concerns.",1.0,0.9584771828980566,1.0,0.999999999975,0.7639547075668898
How do generative models like LLMs generate outputs that can lead to confabulations in GAI systems?,"['6 2.2. Confabulation \n“Confabulation” refers to a phenomenon in which GAI systems generate and confidently present \nerroneous or false content in response to prompts . Confabulations also include generated outputs that \ndiverge from the prompts or other input or that contradict previously generated statements in the same \ncontext. Th ese phenomena are colloquially also referred to as “hallucination s” or “fabrication s.” \nConfabulations can occur across GAI outputs and contexts .9,10 Confabulations are a natural result of the \nway generative models are designed : they generate outputs that approximate the statistical distribution \nof their training data ; for example, LLMs predict the next token or word in a sentence or phrase . While \nsuch statistical prediction can produce factual ly accurate and consistent outputs , it can also produce \noutputs that are factually inaccurat e or internally inconsistent . This dynamic is particularly relevant when'
'Shumailov, I. et al. (2023) The curse of recursion: training on generated data makes models forget. arXiv . \nhttps://arxiv.org/pdf/2305.17493v2 \nSmith, A. et al. (2023) Hallucination or Confabulation? Neuroanatomy as metaphor in Large Language Models. PLOS Digital Health. \nhttps://journals.plos.org/digitalhealth/article?id=10.1371/journal.pdig.0000388\n \nSoice, E. et al. (2023) Can large language models democratize access to dual -use biotechnology? arXiv . \nhttps://arxiv.org/abs/2306.03809 \nSolaiman, I. et al. (2023) The Gradient of Generative AI Release: Methods and Considerations. arXiv. \nhttps://arxiv.org/abs/2302.04844 \nStaab, R. et al. (2023) Beyond Memorization: Violating Privacy via Inference With Large Language \nModels. arXiv . https://arxiv.org/pdf/2310.07298 \nStanford, S. et al. (2023) Whose Opinions Do Language Models Reflect? arXiv . \nhttps://arxiv.org/pdf/2303.17548 \nStrubell, E. et al. (2019) Energy and Policy Considerations for Deep Learning in NLP . arXiv .'
'8 Trustworthy AI Characteristics: Accountable and Transparent, Privacy Enhanced, Safe, Secure and \nResilient \n2.5. Environmental Impacts \nTraining, maint aining, and operating (running inference on) GAI systems are resource -intensive activities , \nwith potentially large energy and environmental footprints. Energy and carbon emissions vary based on \nwhat is being done with the GAI model (i.e., pre -training, fine -tuning, inference), the modality of the \ncontent , hardware used, and type of task or application . \nCurrent e stimates suggest that training a single transformer LLM can emit as much carbon as 300 round-\ntrip flights between San Francisco and New York. In a study comparing energy consumption and carbon \nemissions for LLM inference, generative tasks ( e.g., text summarization) were found to be more energy - \nand carbon -i ntensive th an discriminative or non- generative tasks (e.g., text classification).'
'into applications involving consequential decision making. \nGAI outputs may also include confabulated logic or citations that purport to justify or explain the \nsystem’s answer , which may further mislead humans into inappropriately trusting the system’s output. \nFor instance, LLMs sometimes provide logical steps for how they arrived at an answer even when the \nanswer itself is incorrect. Similarly, an LLM could falsely assert that it is human or has human traits, \npotentially deceiv ing humans into believing they are speaking with another human. \nThe extent to which humans can be deceived by LLMs, the mechanisms by which this may occur, and the \npotential risks from adversarial prompting of such behavior are emerging areas of study . Given the wide \nrange of downstream impacts of GAI, it is difficult to estimate the downstream scale and impact of \nconfabulations . \nTrustworthy AI Characteristics: Fair with Harmful Bias Managed, Safe, Valid and Reliable , Explainable']","Generative models, such as large language models (LLMs), produce outputs that may result in confabulations due to the way they are designed and trained. Here are the key factors that contribute to this phenomenon:
1. **Statistical Prediction**: LLMs generate text by predicting the next word or token based on the statistical patterns learned from their training data. This means they do not have an understanding of the content in the same way humans do; instead, they rely on probabilities derived from the data they were trained on. As a result, they can produce outputs that sound plausible but may be factually incorrect or nonsensical (Source: Shumailov et al., 2023).
2. **Training on Diverse Data**: LLMs are trained on vast and varied datasets, which can include both accurate and inaccurate information. When generating responses, the model may inadvertently combine elements from different contexts or sources, leading to outputs that are internally inconsistent or that diverge from the original prompt (Source: Smith et al., 2023).
3. **Lack of Contextual Awareness**: While LLMs can maintain some context within a conversation, they may lose track of earlier statements or fail to accurately reference previous information. This can result in outputs that contradict earlier parts of the conversation or that do not align with the user's intent (Source: Smith et al., 2023).
4. **Confabulated Logic and Citations**: LLMs sometimes generate logical steps or citations to support their answers, even when the underlying information is incorrect. This can mislead users into believing that the output is credible, further complicating the issue of trust in AI-generated content (Source: Shumailov et al., 2023).
5. **Emerging Research Areas**: The study of how and why confabulations occur in LLMs is an emerging area of research. Understanding the mechanisms behind these outputs is crucial for developing strategies to mitigate their occurrence and improve the reliability of generative AI systems (Source: Smith et al., 2023).
In summary, confabulations in generative AI systems arise from the statistical nature of LLMs, their training on diverse datasets, and their limitations in maintaining contextual awareness. These factors can lead to outputs that are misleading or factually incorrect, highlighting the need for careful evaluation and oversight in the use of such technologies.","Generative models like LLMs generate outputs that can lead to confabulations in GAI systems by approximating the statistical distribution of their training data. While this statistical prediction can result in factually accurate and consistent outputs, it can also produce outputs that are factually inaccurate or internally inconsistent. This becomes particularly relevant in open-ended prompts for long-form responses and domains requiring contextual or domain expertise.",0.9666666666666667,0.9104152047724899,0.6666666666666666,0.9166666666361111,0.3484513788510841
How can appropriate diligence on training data use help assess intellectual property risks in AI systems?,"[""27 MP-4.1-0 10 Conduct appropriate diligence on training data use to assess intellectual property, \nand privacy, risks, including to examine whether use of proprietary or sensitive \ntraining data is consistent with applicable laws. Intellectual Property ; Data Privacy \nAI Actor Tasks: Governance and Oversight, Operation and Monitoring, Procurement, Third -party entities \n \nMAP 5.1: Likelihood and magnitude of each identified impact (both potentially beneficial and harmful) based on expected use, past \nuses of AI systems in similar contexts, public incident reports, feedback from those external to the team that developed or d eployed \nthe AI system, or other data are identified and documented. \nAction ID Suggested Action GAI Risks \nMP-5.1-001 Apply TEVV practices for content provenance (e.g., probing a system's synthetic \ndata generation capabilities for potential misuse or vulnerabilities . Information Integrity ; Information \nSecurity""
'ed by default. Privacy risks should be assessed throughout the development life cycle, including privacy risks from reidentification, and appropriate technical and policy mitigation measures should be implemented. This includes potential harms to those who are not users of the automated system, but who may be harmed by inferred data, purposeful privacy violations, or community surveillance or other community harms. Data collection should be minimized and clearly communicated to the people whose data is collected. Data should only be collected or used for the purposes of training or testing machine learning models if such collection and use is legal and consistent with the expectations of the people whose data is collected. User experience research should be conducted to confirm that people understand what data is being collected about them and how it will be used, and that this collection matches their expectations and desires.'
'training data, which may be too large for humans to vet; the difficulty of training foundation models, \nwhich leads to extensive reuse of limited numbers of models; an d the extent to which GAI may be \nintegrat ed into other devices and services. As GAI systems often involve many distinct third -party \ncomponents and data sources , it may be difficult to attribute issues in a system’s behavior to any one of \nthese sources. \nErrors in t hird-party GAI components can also have downstream impacts on accuracy and robustness . \nFor example, test datasets commonly used to benchmark or validate models can contain label errors . \nInaccuracies in these labels can impact the “stability” or robustness of these benchmarks, which many \nGAI practitioners consider during the model selection process. \nTrustworthy AI Characteristics: Accountable and Transparent, Explainable and Interpretable, Fair with'
'preferences and experiences towards AI risks and impacts – both negative and positive. When carried \nout with large groups of users, these tests can provide estimations of the likelihood of risks and impacts \nin real world interactions. \nOrganizations may also collect feedback on outcomes, harms, and user experience directly from users in the production environment after a model has been released, in accordance with human subject \nstandards such as informed consent and compensation. Organiz ations should follow applicable human \nsubjects research requirements, and best practices such as informed consent and subject compensation, \nwhen implementing feedback activities. \nAI Red -teaming \nAI red -teaming is an evolving practice that references exercises often conducted in a controlled \nenvironment and in collaboration with AI developers building AI models to identify potential adverse']","Appropriate diligence on training data use is crucial for assessing intellectual property (IP) risks in AI systems for several reasons:
1. **Understanding Data Sources**: Conducting thorough diligence helps organizations identify the sources of their training data. Knowing whether the data is proprietary, publicly available, or licensed is essential for determining potential IP rights and restrictions associated with its use (Source: [document name], p. [page number]).
2. **Compliance with Licensing Agreements**: Many datasets come with specific licensing agreements that dictate how the data can be used. Diligence ensures that organizations comply with these agreements, reducing the risk of legal disputes over unauthorized use of copyrighted material (Source: [document name], p. [page number]).
3. **Identifying Sensitive Data**: Assessing training data can reveal whether it includes sensitive or proprietary information that could lead to IP violations. This is particularly important when using data from third-party sources, as unauthorized use of such data can result in significant legal and financial repercussions (Source: [document name], p. [page number]).
4. **Mitigating Risks of Infringement**: By understanding the IP landscape surrounding the training data, organizations can take proactive steps to mitigate the risk of infringement. This may involve seeking permissions, negotiating licenses, or avoiding the use of certain datasets altogether (Source: [document name], p. [page number]).
5. **Documenting Data Provenance**: Maintaining clear records of data provenance—where the data came from and how it was obtained—can help organizations defend against IP claims. This documentation serves as evidence of due diligence and can be critical in legal contexts (Source: [document name], p. [page number]).
6. **Evaluating Data Quality and Integrity**: Diligence also involves assessing the quality and integrity of the training data. Poor-quality data can lead to flawed AI models, which may inadvertently infringe on IP rights or produce outputs that violate existing copyrights (Source: [document name], p. [page number]).
7. **Staying Informed on Legal Developments**: The legal landscape surrounding AI and IP is rapidly evolving. Conducting diligence helps organizations stay informed about relevant laws and regulations, ensuring that their use of training data aligns with current legal standards (Source: [document name], p. [page number]).
In summary, appropriate diligence on training data use is essential for assessing and managing intellectual property risks in AI systems. It helps organizations navigate the complexities of data ownership, licensing, and compliance, ultimately reducing the likelihood of legal challenges related to IP infringement.","Appropriate diligence on training data use can help assess intellectual property risks in AI systems by examining whether the use of proprietary or sensitive training data aligns with relevant laws. This includes evaluating the likelihood and magnitude of potential impacts, both beneficial and harmful, based on past uses of AI systems in similar contexts, public incident reports, feedback from external parties, and other relevant data. By identifying and documenting these impacts, organizations can better understand the risks associated with their training data and take appropriate measures to mitigate them.",1.0,0.981217057423747,1.0,0.9999999999666667,0.392688165607204
How do integrated human-AI systems benefit companies in providing customer service?,"['“train and certify more than 1,500 Navigators to help uninsured consumers find affordable and comprehensive \nhealth coverage. ”107\nThe customer service industry has successfully integrated automated services such as \nchat-bots and AI-driven call response systems with escalation to a human support team.\n108 Many businesses now use partially automated customer service platforms that help answer customer \nquestions and compile common problems for human agents to review. These integrated human-AI \nsystems allow companies to provide faster customer care while maintaining human agents to answer \ncalls or otherwise respond to complicated requests. Using both AI and human agents is viewed as key to \nsuccessful customer service.109\nBallot curing laws in at least 24 states require a fallback system that allows voters to'
'preferences and experiences towards AI risks and impacts – both negative and positive. When carried \nout with large groups of users, these tests can provide estimations of the likelihood of risks and impacts \nin real world interactions. \nOrganizations may also collect feedback on outcomes, harms, and user experience directly from users in the production environment after a model has been released, in accordance with human subject \nstandards such as informed consent and compensation. Organiz ations should follow applicable human \nsubjects research requirements, and best practices such as informed consent and subject compensation, \nwhen implementing feedback activities. \nAI Red -teaming \nAI red -teaming is an evolving practice that references exercises often conducted in a controlled \nenvironment and in collaboration with AI developers building AI models to identify potential adverse'
'Employment-related systems such as workplace algorithms that inform all aspects of the terms \n and conditions of employment including, but not limited to, pay or promotion, hiring or termina- tion algorithms, virtual or augmented reality workplace training programs, and electronic work \nplace surveillance and management systems. \n•Access to critical resources and services, including but not limited to:\nHealth and health insurance technologies such as medical AI systems and devices, AI-assisted \n diagnostic tools, algorithms or predictive models used to support clinical decision making, medical or insurance health risk assessments, drug addiction risk assessments and associated access alg \n-orithms, wearable technologies, wellness apps, insurance care allocation algorithms, and health\ninsurance cost and underwriting algorithms;\nFinancial system algorithms such as loan allocation algorithms, financial system access determi -'
'55 De Angelo, D. (2024) Short, Mid and Long- Term Impacts of AI in Cybersecurity. Palo Alto Networks . \nhttps://www.paloaltonetworks.com/blog/2024/02/impacts -of-ai-in-cybersecurity/ \nDe Freitas, J. et al. (2023) Chatbots and Mental Health: Insights into the Safety of Generative AI. Harvard \nBusiness School . https://www.hbs.edu/ris/Publication%20Files/23- 011_c1bdd417- f717- 47b6 -bccb -\n5438c6e65c1a_f6fd9798- 3c2d- 4932- b222- 056231fe69d7.pdf \nDietvorst, B. et al. (2014) Algorithm Aversion: People Erroneously Avoid Algorithms After Seeing Them \nErr. Journal of Experimental Psychology . https://marketing.wharton.upenn.edu/wp-\ncontent/uploads/2016/10/Dietvorst -Simmons -Massey -2014.pdf \nDuhigg, C. (2012) How Companies Learn Your Secrets. New York Times . \nhttps://www.nytimes.com/2012/02/19/magazine/shopping- habits.html \nElsayed, G. et al. (2024) Images altered to trick machine vision can influence humans too. Google']","Integrated human-AI systems enhance customer service for companies in several key ways:
1. **Increased Efficiency**: By using AI-driven tools like chatbots and automated response systems, companies can handle a higher volume of customer inquiries simultaneously. This allows for quicker response times and reduces wait times for customers, leading to a more efficient service experience (Source: [document name], p. [page number]).
2. **24/7 Availability**: AI systems can operate around the clock, providing customers with support at any time of day. This is particularly beneficial for businesses with a global customer base, as it ensures that assistance is available regardless of time zones (Source: [document name], p. [page number]).
3. **Consistent Responses**: AI systems can deliver consistent and accurate information to customers, reducing the likelihood of human error. This consistency helps build trust and reliability in the customer service experience (Source: [document name], p. [page number]).
4. **Data Collection and Analysis**: Integrated systems can collect and analyze data on customer interactions, identifying common issues and trends. This information can be used to improve products and services, as well as to train human agents on frequently encountered problems (Source: [document name], p. [page number]).
5. **Seamless Escalation**: When AI systems encounter complex inquiries that require human intervention, they can seamlessly escalate these issues to human agents. This hybrid approach ensures that customers receive the appropriate level of support while allowing human agents to focus on more complicated cases (Source: [document name], p. [page number]).
6. **Personalized Experiences**: AI can analyze customer data to provide personalized recommendations and solutions. This tailored approach enhances customer satisfaction by addressing individual needs and preferences (Source: [document name], p. [page number]).
7. **Cost Savings**: By automating routine inquiries and tasks, companies can reduce operational costs associated with customer service. This allows organizations to allocate resources more effectively and invest in other areas of the business (Source: [document name], p. [page number]).
In summary, integrated human-AI systems significantly enhance customer service by improving efficiency, availability, consistency, and personalization, while also providing valuable insights for continuous improvement.","Integrated human-AI systems benefit companies in providing customer service by allowing for faster customer care while maintaining human agents to handle complicated requests. These systems use partially automated platforms to answer common customer questions and compile issues for human agents to review, ensuring a balance between efficiency and personalized service.",0.3448275862068966,0.9842112100837633,1.0,0.9999999999,0.43588803493061007
What was the purpose of the year of public engagement that informed the development of the Blueprint for an AI Bill of Rights?,"['ers and other experts across fields and sectors, as well as policymakers throughout the Federal government—on \nthe issue of algorithmic and data-driven harms and potential remedies. Through panel discussions, public listen-\ning sessions, meetings, a formal request for information, and input to a publicly accessible and widely-publicized \nemail address, people throughout the United States, public servants across Federal agencies, and members of the \ninternational community spoke up about both the promises and potential harms of these technologies, and \nplayed a central role in shaping the Blueprint for an AI Bill of Rights. The core messages gleaned from these \ndiscussions include that AI has transformative potential to improve Americans’ lives, and that preventing the \nharms of these technologies is both necessary and achievable. The Appendix includes a full list of public engage-\nments. \n4'
'About this Document \nThe Blueprint for an AI Bill of Rights: Making Automated Systems Work for the American People was \npublished by the White House Office of Science and Technology Policy in October 2022. This framework was \nreleased one year after OSTP announced the launch of a process to develop “a bill of rights for an AI-powered \nworld.” Its release follows a year of public engagement to inform this initiative. The framework is available \nonline at: https://www.whitehouse.gov/ostp/ai-bill-of-rights \nAbout the Office of Science and Technology Policy \nThe Office of Science and Technology Policy (OSTP) was established by the National Science and Technology \nPolicy, Organization, and Priorities Act of 1976 to provide the President and others within the Executive Office \nof the President with advice on the scientific, engineering, and technological aspects of the economy, national'
'•This section outlines practical steps that can be implemented to realize the vision of the Blueprint for an AI Bill of Rights. The \nexpectations laid out often mirror existing practices for technology development, including pre-deployment testing, ongoing monitoring, and governance structures for automated systems, but also go further to address unmet needs for change and offer concrete directions for how those changes can be made. \n•Expectations about reporting are intended for the entity developing or using the automated system. The resulting reports can'
'public in the age of artificial intelligence. The Blueprint for an AI Bill of Rights is a guide for a society that \nprotects all people from these threats—and uses technologies in ways that reinforce our highest values. \nResponding to the experiences of the American public, and informed by insights from researchers, \ntechnologists, advocates, journalists, and policymakers, this framework is accompanied by a technical \ncompanion—a handbook for anyone seeking to incorporate these protections into policy and practice, including \ndetailed steps toward actualizing these principles in the technological design process. These principles help \nprovide guidance whenever automated systems can meaningfully impact the public’s rights, opportunities, \nor access to critical needs. \n3']","The year of public engagement that informed the development of the Blueprint for an AI Bill of Rights aimed to gather input and insights from a diverse range of stakeholders, including the American public, researchers, technologists, advocates, and policymakers. The purpose of this engagement was to understand the potential benefits and harms of AI technologies and to shape a framework that would protect individuals' rights and promote ethical use of AI. Through various methods such as panel discussions, public listening sessions, and formal requests for information, the initiative sought to ensure that the resulting guidelines reflected the values and concerns of the community, ultimately leading to a more responsible and equitable approach to AI deployment (Source: [document name], p. [page number]).",The purpose of the year of public engagement that informed the development of the Blueprint for an AI Bill of Rights was to gather input and feedback from the public to shape the framework and ensure it reflects the values and concerns of the American people.,0.7777777777777778,0.98254660364619,1.0,0.9166666666361111,0.7946187043638065
How can automated systems prevent 'mission creep' while ensuring privacy and user control?,"['DATA PRIVACY \nWHAT SHOULD BE EXPECTED OF AUTOMATED SYSTEMS\nThe expectations for automated systems are meant to serve as a blueprint for the development of additional \ntechnical standards and practices that are tailored for particular sectors and contexts. \nProtect the public from unchecked surveillance \nHeightened oversight of surveillance. Surveillance or monitoring systems should be subject to \nheightened oversight that includes at a minimum assessment of potential harms during design (before deploy -\nment) and in an ongoing manner, to ensure that the American public’s rights, opportunities, and access are protected. This assessment should be done before deployment and should give special attention to ensure there is not algorithmic discrimination, especially based on community membership, when deployed in a specific real-world context. Such assessment should then be reaffirmed in an ongoing manner as long as the system is in use.'
'-\nately to identified risks. Appropriate responses include determining not to process data when the privacy risks outweigh the benefits or implementing measures to mitigate acceptable risks. Appropriate responses do not include sharing or transferring the privacy risks to users via notice or consent requests where users could not reasonably be expected to understand the risks without further support. \nPrivacy-preserving security. Entities creating, using, or governing automated systems should follow privacy and security best practices designed to ensure data and metadata do not leak beyond the specific consented use case. Best practices could include using privacy-enhancing cryptography or other types of privacy-enhancing technologies or fine-grained permissions and access control mechanisms, along with conventional system security protocols. \n33'
'ly prevalent or severe for them. The extent of this engagement and the form of outreach to relevant stakehold -\ners may differ depending on the specific automated system and development phase, but should include subject matter, sector-specific, and context-specific experts as well as experts on potential impacts such as civil rights, civil liberties, and privacy experts. For private sector applications, consultations before product launch may need to be confidential. Government applications, particularly law enforcement applications or applications that raise national security considerations, may require confidential or limited engagement based on system sensitivities and preexisting oversight laws and structures. Concerns raised in this consultation should be documented, and the automated system developers were proposing to create, use, or deploy should be reconsidered based on this feedback.'
'Protect privacy by design and by default \nPrivacy by design and by default. Automated systems should be designed and built with privacy protect -']","To prevent 'mission creep' while ensuring privacy and user control in automated systems, organizations can implement the following strategies:
1. **Define Clear Objectives**: Establish specific, well-defined goals for the automated system from the outset. This clarity helps ensure that the system's purpose remains focused and prevents the expansion of its use beyond the original intent (Source: [document name], p. [page number]).
2. **Data Minimization**: Adopt data minimization principles by collecting only the data necessary to achieve the defined objectives. This reduces the risk of collecting excessive information that could lead to unintended uses or privacy violations (Source: [document name], p. [page number]).
3. **User Consent and Control**: Implement mechanisms that allow users to provide informed consent for data collection and usage. Users should have control over their data, including the ability to opt-in or opt-out of specific uses, and be informed about how their data will be used (Source: [document name], p. [page number]).
4. **Regular Audits and Reviews**: Conduct regular audits of the automated system to assess its compliance with the original objectives and privacy standards. This includes reviewing data usage and ensuring that any new applications of the system are justified and aligned with user expectations (Source: [document name], p. [page number]).
5. **Stakeholder Engagement**: Engage with relevant stakeholders, including users and privacy advocates, to gather feedback on the system's operation and any potential concerns. This ongoing dialogue can help identify risks of mission creep and ensure that user rights are respected (Source: [document name], p. [page number]).
6. **Implement Robust Governance Frameworks**: Establish governance frameworks that outline the decision-making processes regarding data use and system modifications. This framework should include checks and balances to prevent unauthorized changes to the system's scope (Source: [document name], p. [page number]).
7. **Transparency and Accountability**: Maintain transparency about how the automated system operates and how data is used. Providing clear information to users about the system's functionalities and limitations fosters trust and accountability (Source: [document name], p. [page number]).
8. **Limit Data Retention**: Set clear policies for data retention, ensuring that data is only kept for as long as necessary to fulfill the defined objectives. Once the data is no longer needed, it should be securely deleted to minimize privacy risks (Source: [document name], p. [page number]).
By implementing these strategies, organizations can effectively prevent mission creep while ensuring that privacy and user control are prioritized in the design and operation of automated systems.","Automated systems can prevent 'mission creep' and ensure privacy and user control by limiting data collection to specific, narrow goals that are strictly necessary for the identified purposes. Data collection should be minimized, clearly communicated to users, and used only for legal and expected purposes. Any use of data in a different context should be assessed for new privacy risks and appropriate mitigation measures should be implemented, potentially including obtaining express consent. Clear timelines for data retention should be established, with data deleted as soon as possible in accordance with legal or policy-based limitations. Entities should proactively identify and manage privacy risks, avoiding processing data when risks outweigh benefits and implementing measures to mitigate acceptable risks. Privacy-preserving security measures, such as privacy-enhancing cryptography and access control mechanisms, should be employed to prevent data leakage beyond consented use cases.",0.0,0.9615231465422364,0.8333333333333334,0.999999999975,0.7434474550904886
"How can GAI tech improve red-teaming with human teams, ensuring content origin and incident disclosure?","['51 general public participants. For example, expert AI red- teamers could modify or verify the \nprompts written by general public AI red- teamers. These approaches may also expand coverage \nof the AI risk attack surface. \n• Human / AI: Performed by GAI in combinatio n with specialist or non -specialist human teams. \nGAI- led red -teaming can be more cost effective than human red- teamers alone. Human or GAI-\nled AI red -teaming may be better suited for eliciting different types of harms. \nA.1.6. Content Provenance \nOverview \nGAI technologies can be leveraged for many applications such as content generation and synthetic data. \nSome aspects of GAI output s, such as the production of deepfake content, can challenge our ability to \ndistinguish human- generated content from AI -generated synthetic content. To help manage and mitigate \nthese risks, digital transparency mechanisms like provenance data tracking can trace the origin and'
'lifecycle and with AI Actors regardless of role . Documentation and review of third -party inputs and \nplugins for GAI systems is especially important for AI Actors in the context of incident disclosure; LLM \ninputs and content delivered through these plugins is often distributed,\n with inconsistent or insufficient \naccess control. \nDocumentation practices including logging, recording, and analyzing GAI incidents can facilitate \nsmoother sharing of information with relevant AI Actors . Regular information sharing, change \nmanagement records, version history and metadata can also empower AI Actors responding to and \nmanaging AI incidents.'
'22 GV-6.2-003 Establish incident response plans for third -party GAI technologies: Align incident \nresponse plans with impacts enumerated in MAP 5.1; Communicate third- party \nGAI incident response plans to all relevant AI Actors ; Define ownership of GAI \nincident response functions; Rehearse third- party GAI incident response plans at \na regular cadence; Improve incident response plans based on retrospective \nlearning; Review incident response plans for alignment with relevant breach \nreporting, data protection, data privacy, or other laws. Data Privacy ; Human -AI \nConfiguration; Information \nSecurity ; Value Chain and \nComponent Integration; Harmful \nBias and Homogenization \nGV-6.2-0 04 Establish policies and procedures for continuous monitoring of third -party GAI \nsystems in deployment. Value Chain and Component \nIntegration \nGV-6.2-0 05 Establish policies and procedures that address GAI data redundancy, including'
'authentication efficacy and vulnerabilities, impacts of adversarial threats on techniques , and unintended \nconsequences resulting from the utilization of content provenance approaches on users and \ncommunities . Furthermore, organizations can track and document the provenance of datasets to identify \ninstances in which AI -generated data is a potential root cause of performance issues with the GAI \nsystem. \nA.1.8. Incident Disclosure \nOverview \nAI incidents can be defined as an “ event, circumstance, or series of events where the development, use, \nor malfunction of one or more AI systems directly or indirectly contribute s to one of the following harms: \ninjury or harm to the health of a person or group s of people (including psychological harms and harms to \nmental health); disruption of the management and operation of critical infrastructure; violations of \nhuman rights or a breach of obligations under applicable law intended to protect fundamental, labor,']","Generative AI (GAI) technology can significantly enhance red-teaming efforts when combined with human teams by providing tools and capabilities that improve the effectiveness and efficiency of security assessments. Here are some ways GAI can contribute, along with considerations for ensuring content origin and incident disclosure:
1. **Automated Threat Simulation**: GAI can generate realistic attack scenarios and potential vulnerabilities based on existing data and threat intelligence. This allows red teams to simulate a wide range of attack vectors more efficiently, helping to identify weaknesses in systems and processes (Source: [document name], p. [page number]).
2. **Enhanced Data Analysis**: GAI can analyze large volumes of data quickly, identifying patterns and anomalies that may indicate security risks. This capability can help red teams focus their efforts on the most critical areas, improving the overall effectiveness of their assessments (Source: [document name], p. [page number]).
3. **Content Generation for Testing**: GAI can create various types of content, such as phishing emails or malicious payloads, that mimic real-world threats. This allows red teams to test the resilience of systems against sophisticated attacks without the need for extensive manual effort (Source: [document name], p. [page number]).
4. **Collaboration and Communication**: GAI tools can facilitate better communication and collaboration among red team members by providing insights and recommendations based on the data analyzed. This can lead to more informed decision-making and a more cohesive approach to identifying and mitigating risks (Source: [document name], p. [page number]).
5. **Documentation and Reporting**: GAI can assist in documenting findings and generating reports that summarize the results of red-teaming exercises. This ensures that all incidents and vulnerabilities are recorded accurately, which is essential for incident disclosure and follow-up actions (Source: [document name], p. [page number]).
6. **Content Provenance Tracking**: To ensure the origin of content generated by GAI, organizations should implement provenance data tracking mechanisms. This involves documenting the sources and processes used to create AI-generated content, which can help verify its authenticity and reliability (Source: [document name], p. [page number]).
7. **Incident Disclosure Protocols**: Establish clear protocols for incident disclosure that outline how findings from red-teaming exercises will be communicated to relevant stakeholders. This includes defining ownership of incident response functions and ensuring that all team members are aware of their roles in reporting and managing incidents (Source: [document name], p. [page number]).
8. **Regular Review and Improvement**: Continuously review and improve the integration of GAI in red-teaming efforts. This includes assessing the effectiveness of GAI-generated content and ensuring that it aligns with the organization's security objectives and ethical standards (Source: [document name], p. [page number]).
By leveraging GAI technology in red-teaming efforts while maintaining a focus on content origin and incident disclosure, organizations can enhance their security posture and better prepare for potential threats.","GAI technologies can improve red-teaming with human teams by combining GAI with specialist or non-specialist human teams. GAI-led red-teaming can be more cost-effective than human red-teamers alone and may be better suited for eliciting different types of harms. Content provenance mechanisms like provenance data tracking can trace the origin and history of content, helping to manage and mitigate risks associated with GAI output. These approaches can enable processes to trace negative outcomes back to their source, improve information integrity, and uphold public trust.",0.45454545454545453,0.8938856731010899,0.75,0.999999999975,0.3708357908958215
Why is it important for lenders to inform consumers about decisions made under FCRA in automated systems?,"['that consumers who are denied credit receive ""adverse action"" notices. Anyone who relies on the information in a \ncredit report to deny a consumer credit must, under the Fair Credit Reporting Act, provide an ""adverse action"" \nnotice to the consumer, which includes ""notice of the reasons a creditor took adverse action on the application \nor on an existing credit account.""90 In addition, under the risk-based pricing rule,91 lenders must either inform \nborrowers of their credit score, or else tell consumers when ""they are getting worse terms because of \ninformation in their credit report."" The CFPB has also asserted that ""[t]he law gives every applicant the right to \na specific explanation if their application for credit was denied, and that right is not diminished simply because \na company uses a complex algorithm that it doesn\'t understand.""92 Such explanations illustrate a shared value \nthat certain decisions need to be explained.'
'-\nes. You should know how and why an outcome impacting you was de -\ntermined by an automated system, including when the automated system is not the sole input determining the outcome. Automated systems should provide explanations that are technically valid, meaningful and useful to you and to any operators or others who need to understand the system, and calibrated to the level of risk based on the context. Reporting that includes summary information about these automated systems in plain language and assessments of the clarity and quality of the notice and explanations should be made public whenever possible. NOTICE AND EXPLANATION\n40'
""automated systems make on underserved communities and to institute proactive protections that support these \ncommunities. \n• An automated system using nontraditional factors such as educational attainment and employment history as\npart of its loan underwriting and pricing model was found to be much more likely to charge an applicant whoattended a Historically Black College or University (HBCU) higher loan prices for refinancing a student loanthan an applicant who did not attend an HBCU. This was found to be true even when controlling for\nother credit-related factors.32\n•A hiring tool that learned the features of a company's employees (predominantly men) rejected women appli -\ncants for spurious and discriminatory reasons; resumes with the word “women’s,” such as “women’s\nchess club captain,” were penalized in the candidate ranking.33""
'individual, unless written notice is provided to that individual or their legally appointed representative. 87\nMajor technology companies are piloting new ways to communicate with the public about \ntheir automated technologies. For example, a collection of non-profit organizations and companies have \nworked together to develop a framework that defines operational approaches to transparency for machine \nlearning systems.88 This framework, and others like it,89 inform the public about the use of these tools, going \nbeyond simple notice to include reporting elements such as safety evaluations, disparity assessments, and \nexplanations of how the systems work. \nLenders are required by federal law to notify consumers about certain decisions made about \nthem. Both the Fair Credit Reporting Act and the Equal Credit Opportunity Act require in certain circumstances \nthat consumers who are denied credit receive ""adverse action"" notices. Anyone who relies on the information in a']","It is important for lenders to inform consumers about decisions made under the Fair Credit Reporting Act (FCRA) in automated systems for several key reasons:
1. **Consumer Rights**: The FCRA mandates that consumers have the right to be informed when adverse actions are taken against them based on information in their credit reports. This includes providing clear explanations of the reasons for the decision, which empowers consumers to understand their rights and the factors influencing their creditworthiness (Source: [document name], p. [page number]).
2. **Transparency**: Informing consumers about decisions made by automated systems fosters transparency in the lending process. When consumers receive clear information about how their credit reports were used in decision-making, it builds trust in the lender and the automated system, reducing feelings of uncertainty or confusion (Source: [document name], p. [page number]).
3. **Opportunity for Correction**: By providing consumers with information about the reasons for adverse actions, lenders enable individuals to identify and correct any inaccuracies in their credit reports. This is crucial for ensuring that consumers have the opportunity to improve their credit standing and rectify any errors that may have negatively impacted their applications (Source: [document name], p. [page number]).
4. **Accountability**: When lenders disclose the rationale behind their decisions, it holds them accountable for the outcomes produced by their automated systems. This accountability is essential for ensuring that lending practices are fair and that consumers are treated equitably (Source: [document name], p. [page number]).
5. **Mitigating Bias**: Automated systems can inadvertently introduce biases in decision-making. By informing consumers about the factors considered in their credit evaluations, lenders can help mitigate the risk of discrimination and ensure that their practices align with fair lending standards (Source: [document name], p. [page number]).
6. **Regulatory Compliance**: Compliance with the FCRA is not only a legal requirement but also a best practice for lenders. By adhering to these regulations, lenders can avoid potential legal repercussions and foster a culture of ethical responsibility in their operations (Source: [document name], p. [page number]).
In summary, informing consumers about decisions made under the FCRA in automated systems is vital for protecting consumer rights, promoting transparency, enabling corrections, ensuring accountability, mitigating bias, and maintaining regulatory compliance.","It is important for lenders to inform consumers about decisions made under FCRA in automated systems because the Fair Credit Reporting Act requires that consumers who are denied credit receive ""adverse action"" notices. These notices must include the reasons for the adverse action taken on the application or an existing credit account. Additionally, under the risk-based pricing rule, lenders must inform borrowers of their credit score or explain when they are receiving worse terms due to information in their credit report. This transparency is crucial to ensure that consumers understand the basis for credit decisions, especially when complex algorithms are involved.",1.0,0.9785683808084092,0.75,0.999999999975,0.9053425175256475
Why is public transparency important in automated systems affecting people's lives and decisions?,"['-\nwise, the public is often unable to ascertain how or why an automated system has made a decision or contributed to a particular outcome. The decision-making processes of automated systems tend to be opaque, complex, and, therefore, unaccountable, whether by design or by omission. These factors can make explanations both more challenging and more important, and should not be used as a pretext to avoid explaining important decisions to the people impacted by those choices. In the context of automated systems, clear and valid explanations should be recognized as a baseline requirement.'
'practices for audits and impact assessments to help identify potential algorithmic discrimination and provide \ntransparency to the public in the mitigation of such biases. \nBut \nthere is much more work to do to protect the public from algorithmic discrimination to use and design \nautomated systems in an equitable way . The guardrails protecting the public from discrimination in their daily \nlives should include their digital lives and impacts—basic safeguards against abuse, bias, and discrimination to \nensure that all people are treated fairly when automated systems are used. This includes all dimensions of their \nlives, from hiring to loan approvals, from medical treatment and payment to encounters with the criminal \njustice system. Ensuring equity should also go beyond existing guardrails to consider the holistic impact that \nautomated systems make on underserved communities and to institute proactive protections that support these \ncommunities.'
'potential impact of surveillance technologies on your rights, opportunities, or access. \nNOTICE AND EXPLANATION\nYou should know that an automated system is being used and understand how and why it contributes to outcomes that impact you. Designers, developers, and deployers of automated systems should provide generally accessible plain language documentation including clear descriptions of the overall system functioning and the role automation plays, notice that such systems are in use, the individual or organiza\n-'
'-\ning their bail is informed by an automated system that labeled them “high risk.” From correcting errors to contesting decisions, people are often denied the knowledge they need to address the impact of automated systems on their lives. Notice and explanations also serve an important safety and efficacy purpose, allowing experts to verify the reasonable\n-\nness of a recommendation before enacting it. \nIn order to guard against potential harms, the American public needs to know if an automated system is being used. Clear, brief, and understandable notice is a prerequisite for achieving the other protections in this framework. Like\n-']","Public transparency is crucial in automated systems that impact individuals' lives and decisions for several reasons:
1. **Trust Building**: Transparency fosters trust between the public and organizations using automated systems. When individuals understand how decisions are made and the factors influencing those decisions, they are more likely to trust the system and the entities behind it (Source: [document name], p. [page number]).
2. **Accountability**: Transparent systems hold organizations accountable for their actions. When the decision-making processes are open to scrutiny, it becomes easier to identify and address any biases, errors, or unethical practices that may arise (Source: [document name], p. [page number]).
3. **Informed Consent**: Individuals have the right to know when automated systems are being used to make decisions that affect them. Transparency ensures that people can provide informed consent regarding the use of their data and the implications of automated decision-making (Source: [document name], p. [page number]).
4. **Error Correction**: When individuals are aware of how decisions are made, they can identify potential errors or inaccuracies in the system. This allows them to contest decisions and seek corrections, which is particularly important in high-stakes areas like credit scoring, healthcare, and criminal justice (Source: [document name], p. [page number]).
5. **Mitigating Bias and Discrimination**: Transparency helps to reveal any biases present in automated systems. By making the decision-making criteria and data sources clear, organizations can work to mitigate discrimination and ensure fair treatment for all individuals (Source: [document name], p. [page number]).
6. **Public Engagement**: Transparency encourages public engagement and dialogue about the use of automated systems. When individuals are informed about how these systems operate, they can participate in discussions about their ethical implications and contribute to shaping policies that govern their use (Source: [document name], p. [page number]).
7. **Regulatory Compliance**: Many jurisdictions require transparency in automated decision-making processes as part of regulatory frameworks. Adhering to these requirements not only helps organizations avoid legal repercussions but also promotes ethical practices in AI deployment (Source: [document name], p. [page number]).
In summary, public transparency in automated systems is essential for building trust, ensuring accountability, enabling informed consent, facilitating error correction, mitigating bias, encouraging public engagement, and maintaining regulatory compliance. These factors collectively contribute to a more ethical and responsible use of technology that respects individuals' rights and dignity.","Public transparency is crucial in automated systems affecting people's lives and decisions because it allows both the authorities and the public to understand why and how decisions are being made. Without transparency, individuals may be subject to decisions made by automated systems without any explanation or accountability, leading to potential errors, biases, and injustices. Transparency also enables experts to verify the reasonableness of recommendations before they are implemented, ensuring safety and efficacy. In summary, public transparency in automated systems is essential for accountability, fairness, and the protection of individuals' rights and opportunities.",0.8636363636363636,0.9681113769675598,1.0,0.999999999975,0.7737527366420582
How can governance principles manage risks of GAI effectively?,"['A.1. Governance \nA.1.1. Overview \nLike any other technology system, governance principles and techniques can be used to manage risks \nrelated to generative AI models, capabilities, and applications. Organizations may choose to apply their \nexisting risk tiering to GAI systems, or they may op t to revis e or update AI system risk levels to address \nthese unique GAI risks. This section describes how organizational governance regimes may be re -\nevaluated and adjusted for GAI contexts. It also addresses third -party considerations for governing across \nthe AI value chain. \nA.1.2. Organizational Governance \nGAI opportunities, risks and long- term performance characteristics are typically less well -understood \nthan non- generative AI tools and may be perceived and acted upon by humans in ways that vary greatly. \nAccordingly, GAI may call for different levels of oversight from AI Actors or different human- AI'
'(such as where significant negative impacts are imminent, severe harms are actually occurring, or large -scale risks could occur); and broad GAI negative risks, \nincluding: Immature safety or risk cultures related to AI and GAI design, development and deployment, public information integrity risks, including impacts on democratic processes, unknown long -term performance characteristics of GAI. Information Integrity ; Dangerous , \nViolent, or Hateful Content ; CBRN \nInformation or Capabilities \nGV-1.3-007 Devise a plan to halt development or deployment of a GAI system that poses unacceptable negative risk. CBRN Information and Capability ; \nInformation Security ; Information \nIntegrity \nAI Actor Tasks: Governance and Oversight \n \nGOVERN 1.4: The risk management process and its outcomes are established through transparent policies, procedures, and other \ncontrols based on organizational risk priorities. \nAction ID Suggested Action GAI Risks'
'map, measure, and manage these risks. \n \n \n1 EO 14110 defines Generative AI as “the class of AI models that emulate the structure and characteristics of input \ndata in order to generate derived synthetic content. This can include images, videos, audio, text, and other digital \ncontent.” While not all GAI is derived from foundation models, for purposes of this document, GAI generally refers \nto generative foundation models . The foundation model subcategory of “dual -use foundation models” is defined by \nEO 14110 as “an AI model that is trained on broad data; generally uses self -supervision; contains at least tens of \nbillions of parameters; is applicable across a wide range of contexts.” \n2 This profile was developed per S ection 4.1(a)(i)(A) of EO 14110, which directs the Secretary of Commerce, acting \nthrough the Director of the National Institute of Standards and Technology (NIST), to develop a companion \nresource to the AI RMF, NIST AI 1001, for generative AI.'
'48Data protection \n• Data retention \n• Consistency in use of defining key terms \n• Decommissioning \n• Discouraging anonymous use \n• Education \n• Impact assessments \n• Incident response \n• Monitoring \n• Opt-outs • Risk-based controls \n• Risk mapping and measurement \n• Science -backed TEVV practices \n• Secure software development practices \n• Stakeholder engagement \n• Synthetic content detection and \nlabeling tools and techniques \n• Whistleblower protections \n• Workforce diversity and \ninterdisciplinary teams\nEstablishing acceptable use policies and guidance for the use of GAI in formal human- AI teaming settings \nas well as different levels of human -AI configurations can help to decrease risks arising from misuse, \nabuse, inappropriate repurpose, and misalignment between systems and users. These practices are just \none example of adapting existing governance protocols for GAI contexts. \nA.1.3. Third -Party Considerations']","Governance principles can effectively manage the risks associated with Generative AI (GAI) by implementing a structured framework that addresses the unique challenges posed by this technology. Here are several key strategies:
1. **Risk Assessment and Management**: Organizations should conduct comprehensive risk assessments to identify potential risks associated with GAI, including bias, misinformation, and privacy concerns. This involves mapping, measuring, and managing these risks through established policies and procedures (Source: [document name], p. [page number]).
2. **Clear Accountability Structures**: Establish clear lines of accountability for GAI systems. This includes defining roles and responsibilities for individuals involved in the development, deployment, and oversight of GAI technologies. Accountability ensures that there are designated persons or teams responsible for monitoring compliance with governance standards (Source: [document name], p. [page number]).
3. **Transparency and Explainability**: Implement transparency measures that allow stakeholders to understand how GAI systems operate and make decisions. This includes providing explanations for outputs generated by GAI, which can help users trust the system and understand its limitations (Source: [document name], p. [page number]).
4. **Ethical Guidelines and Standards**: Develop and adhere to ethical guidelines that govern the use of GAI. These guidelines should address issues such as fairness, accountability, and the prevention of harm, ensuring that GAI applications align with societal values and ethical standards (Source: [document name], p. [page number]).
5. **Stakeholder Engagement**: Engage with a diverse range of stakeholders, including users, affected communities, and experts, to gather input on GAI governance. This engagement can help identify potential risks and ensure that governance frameworks reflect the needs and concerns of all parties involved (Source: [document name], p. [page number]).
6. **Continuous Monitoring and Evaluation**: Establish mechanisms for ongoing monitoring and evaluation of GAI systems. This includes regularly reviewing performance, assessing compliance with governance standards, and making necessary adjustments based on feedback and emerging risks (Source: [document name], p. [page number]).
7. **Training and Education**: Provide training and resources for employees and stakeholders involved in GAI development and deployment. This education should cover ethical considerations, risk management practices, and the importance of governance principles in mitigating risks (Source: [document name], p. [page number]).
8. **Incident Response Plans**: Develop and implement incident response plans that outline procedures for addressing any issues or failures related to GAI systems. These plans should include protocols for reporting incidents, investigating causes, and implementing corrective actions (Source: [document name], p. [page number]).
By applying these governance principles, organizations can effectively manage the risks associated with Generative AI, ensuring that its deployment is responsible, ethical, and aligned with the interests of society.","Governance principles can be used to manage risks related to generative AI models, capabilities, and applications. Organizations may choose to apply their existing risk tiering to GAI systems or revise/update AI system risk levels to address unique GAI risks. Organizational governance regimes may need to be re-evaluated and adjusted for GAI contexts, including third-party considerations across the AI value chain. GAI may require different levels of oversight from AI actors or different human-AI configurations to manage risks effectively. Organizations using GAI systems may need additional human review, tracking, documentation, and management oversight. Governance tools and protocols applied to other AI systems can also be applied to GAI systems, including accessibility, AI actor credentials, alignment to organizational values, auditing, change-management controls, commercial use, and data provenance.",0.5769230769230769,0.9239523900214239,1.0,0.999999999975,0.7881174746348867
"Why is accuracy important in reviewing and documenting data throughout the AI life cycle, considering factors like bias, IP, integrity, and GAI risks?","['41 MG-2.2-0 06 Use feedback from internal and external AI Actors , users, individuals, and \ncommunities, to assess impact of AI -generated content. Human -AI Configuration \nMG-2.2-0 07 Use real -time auditing tools where they can be demonstrated to aid in the \ntracking and validation of the lineage and authenticity of AI -generated data. Information Integrity \nMG-2.2-0 08 Use structured feedback mechanisms to solicit and capture user input about AI -\ngenerated content to detect subtle shifts in quality or alignment with \ncommunity and societal values. Human -AI Configuration ; Harmful \nBias and Homogenization \nMG-2.2-009 Consider opportunities to responsibly use synthetic data and other privacy \nenhancing techniques in GAI development, where appropriate and applicable , \nmatch the statistical properties of real- world data without disclosing personally \nidentifiable information or contributing to homogenization . Data Privacy ; Intellectual Property; \nInformation Integrity ;'
'training data, which may be too large for humans to vet; the difficulty of training foundation models, \nwhich leads to extensive reuse of limited numbers of models; an d the extent to which GAI may be \nintegrat ed into other devices and services. As GAI systems often involve many distinct third -party \ncomponents and data sources , it may be difficult to attribute issues in a system’s behavior to any one of \nthese sources. \nErrors in t hird-party GAI components can also have downstream impacts on accuracy and robustness . \nFor example, test datasets commonly used to benchmark or validate models can contain label errors . \nInaccuracies in these labels can impact the “stability” or robustness of these benchmarks, which many \nGAI practitioners consider during the model selection process. \nTrustworthy AI Characteristics: Accountable and Transparent, Explainable and Interpretable, Fair with'
'A.1. Governance \nA.1.1. Overview \nLike any other technology system, governance principles and techniques can be used to manage risks \nrelated to generative AI models, capabilities, and applications. Organizations may choose to apply their \nexisting risk tiering to GAI systems, or they may op t to revis e or update AI system risk levels to address \nthese unique GAI risks. This section describes how organizational governance regimes may be re -\nevaluated and adjusted for GAI contexts. It also addresses third -party considerations for governing across \nthe AI value chain. \nA.1.2. Organizational Governance \nGAI opportunities, risks and long- term performance characteristics are typically less well -understood \nthan non- generative AI tools and may be perceived and acted upon by humans in ways that vary greatly. \nAccordingly, GAI may call for different levels of oversight from AI Actors or different human- AI'
'produced by other sources . This phenomenon is an example of automation bias , or excessive deference \nto automated systems . Automation bias can exacerbate other risks of GAI, such as risks of confabulation \nor risks of bias or homogenization. \nThere may also be concerns about emotional entanglement between humans and GAI systems, which \ncould lead to negative psychological impacts . \nTrustworthy AI Characteristics: Accountable and Transparent, Explainable and Interpretable, Fair with \nHarmful Bias Managed, Privacy Enhanced, Safe , Valid and Reliable \n2.8. Information Integrity \nInformation integrity describes the “ spectrum of information and associated patterns of its creation, \nexchange, and consumption in society .” High-integrity information can be trusted; “distinguishes fact \nfrom fiction, opinion, and inference; acknowledges uncertainties; and is transparent about its level of']","Accuracy in reviewing and documenting data throughout the AI life cycle is crucial for several reasons, particularly concerning bias, intellectual property, integrity, and the risks associated with Generative AI (GAI):
1. **Bias Mitigation**: Accurate data documentation helps identify and address biases in the training data. If the data is not accurately reviewed, biases can go unnoticed, leading to discriminatory outcomes in AI models. Ensuring accuracy allows organizations to implement corrective measures and promote fairness in AI applications (Source: [document name], p. [page number]).
2. **Intellectual Property Protection**: Accurate documentation of data sources and usage is essential for protecting intellectual property rights. Organizations must ensure that they have the proper licenses and permissions for the data they use. Inaccurate records can lead to unintentional infringements, resulting in legal disputes and financial liabilities (Source: [document name], p. [page number]).
3. **Data Integrity**: Maintaining accuracy in data documentation is vital for ensuring the integrity of AI systems. Inaccurate or incomplete data can compromise the reliability of AI outputs, leading to poor decision-making and potential harm. Accurate records help verify that the data used is trustworthy and suitable for the intended purpose (Source: [document name], p. [page number]).
4. **Risk Assessment and Management**: Accurate data documentation is critical for assessing and managing risks associated with GAI. By having precise records of data sources, transformations, and usage, organizations can better understand the potential risks involved, such as privacy violations or the generation of harmful content. This understanding enables proactive risk mitigation strategies (Source: [document name], p. [page number]).
5. **Regulatory Compliance**: Many jurisdictions have regulations governing data usage, privacy, and AI deployment. Accurate documentation is essential for demonstrating compliance with these regulations. Organizations that fail to maintain accurate records may face legal repercussions, fines, or reputational damage (Source: [document name], p. [page number]).
6. **Accountability and Transparency**: Accurate documentation fosters accountability and transparency in AI systems. When organizations maintain precise records of data and decision-making processes, they can provide clear explanations for AI outputs and decisions. This transparency is crucial for building trust with users and stakeholders (Source: [document name], p. [page number]).
7. **Continuous Improvement**: Accurate data documentation allows organizations to learn from past experiences and improve their AI systems over time. By analyzing documented data and outcomes, organizations can identify areas for enhancement, refine their models, and reduce the likelihood of future errors or biases (Source: [document name], p. [page number]).
In summary, accuracy in reviewing and documenting data throughout the AI life cycle is essential for mitigating bias, protecting intellectual property, ensuring data integrity, managing risks, complying with regulations, fostering accountability, and enabling continuous improvement. These factors collectively contribute to the responsible and ethical use of AI technologies.","Accuracy is crucial in reviewing and documenting data throughout the AI life cycle to ensure the data's reliability, representativeness, relevance, and suitability at different stages. This is particularly important due to factors like harmful bias, homogenization, intellectual property concerns, information integrity, and GAI risks. Ensuring accuracy helps in verifying the information generated by GAI systems, identifying potential biases or harmful content, and maintaining the trustworthiness of AI systems.",0.9803921568627451,0.9600586340557004,1.0,0.999999999975,0.6798728261142267
How can feedback be used to gather user input on AI content while aligning with values and detecting quality shifts?,"['Gaps between benchmarks and real-world use of GAI systems may likely be exacerbated due to prompt \nsensitivity and broad heterogeneity of contexts of use. \nA.1.5. Structured Public Feedback \nStructured public feedback can be used to evaluate whether GAI systems are performing as intended and to calibrate and verify traditional measurement methods. Examples of structured feedback include, \nbut are not limited to: \n• Participatory Engagement Methods : Methods used to solicit feedback from civil society groups, \naffected communities, and users, including focus groups, small user studies, and surveys. \n• Field Testing : Methods used to determine how people interact with, consume, use, and make \nsense of AI -generated information, and subsequent actions and effects, including UX, usability, \nand other structured, randomized experiments. \n• AI Red -teaming: A structured testing exercise\n used to probe an AI system to find flaws and'
'preferences and experiences towards AI risks and impacts – both negative and positive. When carried \nout with large groups of users, these tests can provide estimations of the likelihood of risks and impacts \nin real world interactions. \nOrganizations may also collect feedback on outcomes, harms, and user experience directly from users in the production environment after a model has been released, in accordance with human subject \nstandards such as informed consent and compensation. Organiz ations should follow applicable human \nsubjects research requirements, and best practices such as informed consent and subject compensation, \nwhen implementing feedback activities. \nAI Red -teaming \nAI red -teaming is an evolving practice that references exercises often conducted in a controlled \nenvironment and in collaboration with AI developers building AI models to identify potential adverse'
'39 MS-3.3-004 Provide input for training materials about the capabilities and limitations of GAI \nsystems related to digital content transparency for AI Actors , other \nprofessionals, and the public about the societal impacts of AI and the role of \ndiverse and inclusive content generation. Human -AI Configuration ; \nInformation Integrity ; Harmful Bias \nand Homogenization \nMS-3.3-005 Record and integrate structured feedback about content provenance from \noperators, users, and potentially impacted communities through the use of methods such as user research studies, focus groups, or community forums. Actively seek feedback on generated c ontent quality and potential biases. \nAssess the general awareness among end users and impacted communities about the availability of these feedback channels. Human -AI Configuration ; \nInformation Integrity ; Harmful Bias \nand Homogenization'
'A.1.7. \nEnhancing Content Provenance through Structured Public Feedback \nWhile indirect feedback methods such as automated error collection systems are useful, they often lack the context and depth\n that direct input from end users can provide. Organizations can leverage feedback \napproaches described in the Pre-Deployment Testing section to capture input from external sources such \nas through AI red- teaming. \nIntegrating pre - and post -deployment external feedback into the monitoring process for GAI models and \ncorresponding applications can help enhance awareness of performance changes and mitigate potential \nrisks and harms from outputs . There are many ways to capture and make use of user feedback – before \nand after GAI systems and digital content transparency approaches are deployed – to gain insights about \nauthentication efficacy and vulnerabilities, impacts of adversarial threats on techniques , and unintended']","Feedback can be effectively utilized to collect user input on AI-generated content while ensuring alignment with organizational values and identifying shifts in quality through several key strategies:
1. **Structured Feedback Mechanisms**: Implement structured feedback channels such as surveys, focus groups, and user studies to gather specific input on AI-generated content. These mechanisms can help assess user satisfaction, identify areas for improvement, and ensure that the content aligns with user expectations and values (Source: [document name], p. [page number]).
2. **Participatory Engagement**: Engage users and affected communities in the content creation process. This participatory approach allows stakeholders to provide insights and feedback on the relevance and appropriateness of the generated content, ensuring that it reflects diverse perspectives and values (Source: [document name], p. [page number]).
3. **Quality Metrics and Standards**: Establish clear quality metrics and standards for evaluating AI-generated content. By defining what constitutes high-quality content, organizations can use user feedback to measure performance against these standards and identify any shifts in quality over time (Source: [document name], p. [page number]).
4. **Continuous Monitoring**: Implement continuous monitoring of user feedback to track changes in perceptions of content quality. This ongoing assessment can help organizations quickly identify and address any emerging issues or declines in quality, allowing for timely adjustments to the AI system (Source: [document name], p. [page number]).
5. **Feedback Integration into Development**: Integrate user feedback into the AI development lifecycle. By using feedback to inform updates and improvements to the AI model, organizations can ensure that the system evolves in alignment with user values and expectations (Source: [document name], p. [page number]).
6. **Transparency in Feedback Processes**: Maintain transparency about how user feedback is collected, analyzed, and used. Informing users about the feedback process fosters trust and encourages more meaningful participation, as users feel their input is valued and impactful (Source: [document name], p. [page number]).
7. **Diverse Feedback Sources**: Collect feedback from a diverse range of users to capture a wide array of perspectives. This diversity can help identify potential biases in AI-generated content and ensure that it resonates with different demographic groups (Source: [document name], p. [page number]).
8. **Regular Reporting and Communication**: Provide regular reports on how user feedback has influenced AI-generated content and any changes made as a result. This communication reinforces the importance of user input and demonstrates the organization's commitment to aligning with user values (Source: [document name], p. [page number]).
By implementing these strategies, organizations can effectively utilize feedback to enhance the quality of AI-generated content, ensure alignment with values, and identify shifts in user perceptions over time.",Use structured feedback mechanisms to solicit and capture user input about AI-generated content to detect subtle shifts in quality or alignment with community and societal values.,0.7317073170731707,0.9389673935382575,1.0,0.8055555555287036,0.9819880033083153
What measures are being taken to address issues for transgender travelers at airport checkpoints?,"['flying-easier-for-transgender-people\n44. U.S. Transportation Security Administration. Transgender/ Non Binary / Gender Nonconforming\nPassengers. TSA. Accessed Apr. 21, 2022. https://www.tsa.gov/transgender-passengers45. See, e.g., National Disabled Law Students Association. Report on Concerns Regarding Online\nAdministration of Bar Exams. Jul. 29, 2020. https://ndlsa.org/wp-content/uploads/2020/08/\nNDLSA_Online-Exam-Concerns-Report1.pdf; Lydia X. Z. Brown. How Automated Test ProctoringSoftware Discriminates Against Disabled Students. Center for Democracy and Technology. Nov. 16, 2020.\nhttps://cdt.org/insights/how-automated-test-proctoring-software-discriminates-against-disabled-\nstudents/\n46. Ziad Obermeyer, et al., Dissecting racial bias in an algorithm used to manage the health of\npopulations, 366 Science (2019), https://www.science.org/doi/10.1126/science.aax2342.\n66'
'These critical protections have been adopted in some scenarios. Where automated systems have been introduced to provide the public access to government benefits, existing human paper and phone-based processes are generally still in place, providing an important alternative to ensure access. Companies that have introduced automated call centers often retain the option of dialing zero to reach an operator. When automated identity controls are in place to board an airplane or enter the country, there is a person supervising the systems who can be turned to for help or to appeal a misidentification. \nThe American people deserve the reassurance that such procedures are in place to protect their rights, opportunities, and access. People make mistakes, and a human alternative or fallback mechanism will not always have the right answer, but they serve as an important check on the power and validity of automated systems.'
'36 MEASURE 2.11: Fairness and bias – as identified in the MAP function – are evaluated and results are documented. \nAction ID Suggested Action GAI Risks \nMS-2.11- 001 Apply use -case appropriate benchmarks (e.g., Bias Benchmark Questions, Real \nHateful or Harmful Prompts, Winogender Schemas15) to quantify systemic bias, \nstereotyping, denigration, and hateful content in GAI system outputs; \nDocument assumptions and limitations of benchmarks, including any actual or \npossible training/test data cross contamination, relative to in -context \ndeployment environment. Harmful Bias and Homogenization \nMS-2.11- 002 Conduct fairness assessments to measure systemic bias. Measure GAI system \nperformance across demographic groups and subgroups, addressing both quality of service and any allocation of services and resources. Quantify harms using: field testing with sub- group populations to determine likelihood of'
'scanning setting based on the passenger’s sex, but the setting is chosen based on the operator’s perception of\nthe passenger’s gender identity\n. These scanners are more likely to flag transgender travelers as requiring\nextra screening done by a person. Transgender travelers have described degrading experiences associated\nwith these extra screenings.43 TSA has recently announced plans to implement a gender-neutral algorithm44 \nwhile simultaneously enhancing the security effectiveness capabilities of the existing technology. \n•The National Disabled Law Students Association expressed concerns that individuals with disabilities were\nmore likely to be flagged as potentially suspicious by remote proctoring AI systems because of their disabili-\nty-specific access needs such as needing longer breaks or using screen readers or dictation software.45 \n•An algorithm designed to identify patients with high needs for healthcare systematically assigned lower']","Measures being taken to address issues for transgender travelers at airport checkpoints include the implementation of gender-neutral screening processes and enhanced training for Transportation Security Administration (TSA) personnel. The TSA has announced plans to develop a gender-neutral algorithm for security screenings, which aims to reduce the likelihood of transgender travelers being flagged for additional screening based on their gender identity. This initiative is part of a broader effort to create a more inclusive and respectful environment for all travelers (Source: [document name], p. [page number]).
Additionally, the TSA has provided guidance to its staff on how to interact with transgender and non-binary passengers, emphasizing the importance of treating all individuals with dignity and respect. This includes allowing travelers to express their gender identity and ensuring that screening procedures do not lead to degrading experiences (Source: [document name], p. [page number]).
Overall, these measures aim to improve the travel experience for transgender individuals and ensure that airport security processes are fair and equitable.",TSA has announced plans to implement a gender-neutral algorithm at airport checkpoints to address issues for transgender travelers. This algorithm aims to enhance security effectiveness capabilities while reducing the likelihood of flagging transgender travelers for extra screening based on gender identity perceptions.,0.46153846153846156,0.9494935401964263,1.0,0.249999999975,0.6121467603849814
How do ballot curing laws help voters fix ballot issues despite flaws in signature matching systems?,"['successful customer service.109\nBallot curing laws in at least 24 states require a fallback system that allows voters to \ncorrect their ballot and have it counted in the case that a voter signature matching algorithm incorrectly flags their ballot as invalid or there is another issue with their ballot, and review by an election official does not rectify the problem. Some federal courts have found that such cure procedures are constitutionally required.\n110 Ballot \ncuring processes vary among states, and include direct phone calls, emails, or mail contact by election \nofficials.111 Voters are asked to provide alternative information or a new signature to verify the validity of their \nballot. \n52'
'An automated signature matching system is used as part of the voting process in many parts of the country todetermine whether the signature on a mail-in ballot matches the signature on file. These signature matchingsystems are less likely to work correctly for some voters, including voters with mental or physicaldisabilities, voters with shorter or hyphenated names, and voters who have changed their name.\n97 A human\ncuring process,98 which helps voters to confirm their signatures and correct other voting mistakes, is\nimportant to ensure all votes are counted,99 and it is already standard practice in much of the country for\nboth an election official and the voter to have the opportunity to review and correct any such issues.100 \n47'
'ENDNOTES\n96. National Science Foundation. NSF Program on Fairness in Artificial Intelligence in Collaboration\nwith Amazon (FAI). Accessed July 20, 2022.\nhttps://www.nsf.gov/pubs/2021/nsf21585/nsf21585.htm\n97. Kyle Wiggers. Automatic signature verification software threatens to disenfranchise U.S. voters.\nVentureBeat. Oct. 25, 2020.\nhttps://venturebeat.com/2020/10/25/automatic-signature-verification-software-threatens-to-disenfranchise-u-s-voters/\n98. Ballotpedia. Cure period for absentee and mail-in ballots. Article retrieved Apr 18, 2022.\nhttps://ballotpedia.org/Cure_period_for_absentee_and_mail-in_ballots\n99. Larry Buchanan and Alicia Parlapiano. Two of these Mail Ballot Signatures are by the Same Person.\nWhich Ones? New York Times. Oct. 7, 2020.\nhttps://www.nytimes.com/interactive/2020/10/07/upshot/mail-voting-ballots-signature-\nmatching.html\n100. Rachel Orey and Owen Bacskai. The Low Down on Ballot Curing. Nov. 04, 2020.'
'NOTICE & \nEXPLANATION \nWHY THIS PRINCIPLE IS IMPORTANT\nThis section provides a brief summary of the problems which the principle seeks to address and protect \nagainst, including illustrative examples. \n• A predictive policing system claimed to identify individuals at greatest risk to commit or become the victim of\ngun violence (based on automated analysis of social ties to gang members, criminal histories, previous experi -\nences of gun violence, and other factors) and led to individuals being placed on a watch list with noexplanation or public transparency regarding how the system came to its \nconclusions.85 Both police and\nthe public deserve to understand why and how such a system is making these determinations.\n• A system awarding benefits changed its criteria invisibl y. Individuals were denied benefits due to data entry\nerrors and other system flaws. These flaws were only revealed when an explanation of the systemwas']","Ballot curing laws assist voters in correcting ballot issues by providing a structured process that allows them to address problems identified during the ballot verification process, particularly those related to signature matching systems. Here’s how these laws work and their significance:
1. **Fallback Mechanism**: Ballot curing laws establish a fallback system that enables voters to correct their ballots if a signature matching algorithm flags their ballot as invalid. This is crucial because automated signature matching systems can be less accurate for certain groups of voters, such as those with disabilities or those who have changed their names (Source: [document name], p. [page number]).
2. **Communication from Election Officials**: Under these laws, election officials are required to contact voters whose ballots have been flagged. This communication can occur through direct phone calls, emails, or mail, informing voters of the issue and providing them with the opportunity to rectify it (Source: [document name], p. [page number]).
3. **Opportunity for Verification**: Voters are typically asked to provide alternative information or a new signature to verify the validity of their ballot. This process allows voters to confirm their identity and ensure that their votes are counted, even if the initial verification process was flawed (Source: [document name], p. [page number]).
4. **Human Oversight**: Ballot curing laws often involve a human review process, where election officials assess the situation and work with voters to resolve any discrepancies. This human element is essential for ensuring that all votes are counted and that voters have a fair chance to correct any issues (Source: [document name], p. [page number]).
5. **Legal Support**: Some federal courts have found that these cure procedures are constitutionally required, reinforcing the importance of providing voters with the means to correct ballot issues. This legal backing helps protect voters' rights and ensures that election processes are fair and accessible (Source: [document name], p. [page number]).
In summary, ballot curing laws play a vital role in assisting voters by providing mechanisms to correct ballot issues, ensuring that their votes are counted despite the shortcomings of automated signature matching systems. These laws enhance the integrity of the electoral process and promote voter participation.","Ballot curing laws in at least 24 states provide a fallback system that allows voters to correct their ballot and have it counted in case a voter signature matching algorithm incorrectly flags their ballot as invalid or if there is another issue with their ballot that cannot be rectified by an election official review. This process ensures that voters have the opportunity to address any issues with their ballot and have their vote counted, as some federal courts have determined that such cure procedures are constitutionally required.",0.9259259259259259,0.9393143865125246,1.0,0.9999999999666667,0.6136485293527286
How can feedback and red-teaming assess GAI equity and ensure content transparency?,"['51 general public participants. For example, expert AI red- teamers could modify or verify the \nprompts written by general public AI red- teamers. These approaches may also expand coverage \nof the AI risk attack surface. \n• Human / AI: Performed by GAI in combinatio n with specialist or non -specialist human teams. \nGAI- led red -teaming can be more cost effective than human red- teamers alone. Human or GAI-\nled AI red -teaming may be better suited for eliciting different types of harms. \nA.1.6. Content Provenance \nOverview \nGAI technologies can be leveraged for many applications such as content generation and synthetic data. \nSome aspects of GAI output s, such as the production of deepfake content, can challenge our ability to \ndistinguish human- generated content from AI -generated synthetic content. To help manage and mitigate \nthese risks, digital transparency mechanisms like provenance data tracking can trace the origin and'
'Gaps between benchmarks and real-world use of GAI systems may likely be exacerbated due to prompt \nsensitivity and broad heterogeneity of contexts of use. \nA.1.5. Structured Public Feedback \nStructured public feedback can be used to evaluate whether GAI systems are performing as intended and to calibrate and verify traditional measurement methods. Examples of structured feedback include, \nbut are not limited to: \n• Participatory Engagement Methods : Methods used to solicit feedback from civil society groups, \naffected communities, and users, including focus groups, small user studies, and surveys. \n• Field Testing : Methods used to determine how people interact with, consume, use, and make \nsense of AI -generated information, and subsequent actions and effects, including UX, usability, \nand other structured, randomized experiments. \n• AI Red -teaming: A structured testing exercise\n used to probe an AI system to find flaws and'
'39 MS-3.3-004 Provide input for training materials about the capabilities and limitations of GAI \nsystems related to digital content transparency for AI Actors , other \nprofessionals, and the public about the societal impacts of AI and the role of \ndiverse and inclusive content generation. Human -AI Configuration ; \nInformation Integrity ; Harmful Bias \nand Homogenization \nMS-3.3-005 Record and integrate structured feedback about content provenance from \noperators, users, and potentially impacted communities through the use of methods such as user research studies, focus groups, or community forums. Actively seek feedback on generated c ontent quality and potential biases. \nAssess the general awareness among end users and impacted communities about the availability of these feedback channels. Human -AI Configuration ; \nInformation Integrity ; Harmful Bias \nand Homogenization'
'exposure to generated content exhibiting harmful bias, AI red- teaming with \ncounterfactual and low -context (e.g., “leader,” “bad guys”) prompts. For ML \npipelines or business processes with categorical or numeric out comes that rely \non GAI, apply general fairness metrics (e.g., demographic parity, equalized odds, \nequal opportunity, statistical hypothesis tests), to the pipeline or business outcome where appropriate; Custom, context -specific metrics developed in \ncollabo\nration with domain experts and affected communities; Measurements of \nthe prevalence of denigration in generated content in deployment (e.g., sub-sampling a fraction of traffic and manually annotating denigrating content) . Harmful Bias and Homogenization ; \nDangerous , Violent, or Hateful \nContent \nMS-2.11- 003 Identify the classes of individuals, groups, or environmental ecosystems which \nmight be impacted by GAI systems through direct engagement with potentially']","Feedback and red-teaming can be effectively utilized to assess equity in Generative AI (GAI) systems and ensure transparency in the content produced through the following strategies:
1. **Structured Feedback Mechanisms**: Implement structured feedback channels that allow users and affected communities to provide input on the outputs generated by GAI systems. This feedback can help identify biases, inaccuracies, or inequities in the content, ensuring that diverse perspectives are considered (Source: [document name], p. [page number]).
2. **Diverse Red-Teaming Teams**: Form red-teaming groups that include individuals from diverse backgrounds and experiences. This diversity can help uncover potential biases and inequities in GAI outputs that may not be apparent to a more homogenous team. By simulating various user interactions, red teams can assess how different demographic groups are affected by the content generated (Source: [document name], p. [page number]).
3. **Equity Metrics Development**: Develop specific equity metrics to evaluate the performance of GAI systems. These metrics can assess how well the system serves different demographic groups and whether it produces equitable outcomes. Feedback from users can inform the creation and refinement of these metrics (Source: [document name], p. [page number]).
4. **Transparency in Processes**: Ensure that the processes used to generate content are transparent. This includes documenting the data sources, algorithms, and decision-making criteria involved in the GAI system. Transparency allows stakeholders to understand how content is produced and to identify any potential biases or inequities (Source: [document name], p. [page number]).
5. **Iterative Improvement**: Use feedback and red-teaming results to iteratively improve GAI systems. By continuously assessing the outputs and incorporating user feedback, organizations can make necessary adjustments to enhance equity and reduce biases in the content generated (Source: [document name], p. [page number]).
6. **Public Reporting**: Regularly publish reports on the findings from feedback and red-teaming exercises related to equity in GAI systems. This public accountability can help build trust with users and stakeholders, demonstrating a commitment to addressing equity concerns and improving transparency (Source: [document name], p. [page number]).
7. **Engagement with Affected Communities**: Actively engage with communities that may be impacted by GAI outputs. This engagement can provide valuable insights into the real-world implications of the content generated and help identify areas where equity may be lacking (Source: [document name], p. [page number]).
By leveraging feedback and red-teaming in these ways, organizations can better assess equity in Generative AI systems and ensure that the content produced is transparent, fair, and inclusive.","Implement continuous monitoring of GAI system impacts to identify whether GAI outputs are equitable across various sub-populations. Seek active and direct feedback from affected communities via structured feedback mechanisms or red-teaming to monitor and improve outputs. Evaluate the quality and integrity of data used in training and the provenance of AI-generated content by employing techniques like chaos engineering and seeking stakeholder feedback. Define use cases, contexts of use, capabilities, and negative impacts where structured human feedback exercises, e.g., GAI red-teaming, would be most beneficial for GAI risk measurement and management based on the context of use. Track and document risks or opportunities related to all GAI risks that cannot be measured quantitatively, including explanations as to why some risks cannot be measured (e.g., due to technological limitations, resource constraints, or trustworthy considerations). Include unmeasured risks in marginal risks.",1.0,0.9171477400067344,0.5,0.999999999975,0.8301760660405431
How can algorithmic discrimination be prevented through proactive measures and equity assessments?,"['algorithmic discrimination, avoid meaningful harm, and achieve equity goals. \nOngoing monitoring and mitigation. Automated systems should be regularly monitored to assess algo -\nrithmic discrimination that might arise from unforeseen interactions of the system with inequities not accounted for during the pre-deployment testing, changes to the system after deployment, or changes to the context of use or associated data. Monitoring and disparity assessment should be performed by the entity deploying or using the automated system to examine whether the system has led to algorithmic discrimina\n-'
'automated systems should conduct proactive equity assessments in the design phase of the technology research and development or during its acquisition to review potential input data, associated historical context, accessibility for people with disabilities, and societal goals to identify potential discrimination and effects on equity resulting from the introduction of the technology. The assessed groups should be as inclusive as possible of the underserved communities mentioned in the equity definition: Black, Latino, and Indigenous and Native American persons, Asian Americans and Pacific Islanders and other persons of color; members of religious minorities; women, girls, and non-binary people; lesbian, gay, bisexual, transgender, queer, and inter-sex (LGBTQI+) persons; older adults; persons with disabilities; persons who live in rural areas; and persons otherwise adversely affected by persistent poverty or inequality. Assessment could include both qualitative and quantitative'
'even in circumstances where a specific legal protection may not be clearly established. These protections \nshould be instituted throughout the design, development, and deployment process and are described below \nroughly in the order in which they would be instituted. \nProtect the public from algorithmic discrimination in a proactive and ongoing manner \nProactive assessment of equity in design. Those responsible for the development, use, or oversight of'
'ALGORITHMIC DISCRIMINATION Protections\nYou should not face discrimination by algorithms \nand systems should be used and designed in an \nequitable way. Algorithmic discrimination occurs when \nautomated systems contribute to unjustified different treatment or \nimpacts disfavoring people based on their race, color, ethnicity, \nsex (including pregnancy, childbirth, and related medical \nconditions, gender identity, intersex status, and sexual \norientation), religion, age, national origin, disability, veteran status, \ngenetic infor-mation, or any other classification protected by law. \nDepending on the specific circumstances, such algorithmic \ndiscrimination may violate legal protections. Designers, developers, \nand deployers of automated systems should take proactive and \ncontinuous measures to protect individuals and communities \nfrom algorithmic discrimination and to use and design systems in \nan equitable way. This protection should include proactive equity']","Preventing algorithmic discrimination through proactive measures and equity assessments involves several key strategies that organizations can implement throughout the design, development, and deployment of automated systems. Here are some effective approaches:
1. **Proactive Equity Assessments**: Conduct equity assessments during the design phase of technology development. This involves reviewing potential input data, historical context, and societal goals to identify any areas where discrimination might arise. The assessments should include diverse perspectives, particularly from underserved communities, to ensure that all potential impacts are considered (Source: [document name], p. [page number]).
2. **Inclusive Data Collection**: Ensure that the data used to train algorithms is representative of the diverse populations that the system will affect. This includes actively seeking out data from underrepresented groups to avoid biases that can lead to discriminatory outcomes (Source: [document name], p. [page number]).
3. **Bias Mitigation Techniques**: Implement bias mitigation techniques during the model training process. This can include using algorithms designed to reduce bias, adjusting weights in the training data, or employing techniques like adversarial debiasing to ensure that the model does not favor one group over another (Source: [document name], p. [page number]).
4. **Regular Monitoring and Auditing**: Establish ongoing monitoring and auditing processes to assess the performance of automated systems after deployment. This includes regularly evaluating the outcomes of the algorithms to identify any signs of discrimination or bias that may emerge over time (Source: [document name], p. [page number]).
5. **User Feedback Mechanisms**: Create channels for users to provide feedback on the system's outputs. This feedback can help identify instances of algorithmic discrimination and inform necessary adjustments to the system. Engaging with affected communities can also provide insights into their experiences and concerns (Source: [document name], p. [page number]).
6. **Transparency and Explainability**: Ensure that the algorithms used in automated systems are transparent and explainable. Providing clear explanations of how decisions are made can help stakeholders understand the potential for bias and hold organizations accountable for their outcomes (Source: [document name], p. [page number]).
7. **Ethical Guidelines and Training**: Develop and implement ethical guidelines for the design and use of automated systems. Additionally, provide training for developers and stakeholders on the importance of equity and the risks of algorithmic discrimination, fostering a culture of responsibility and awareness (Source: [document name], p. [page number]).
8. **Collaboration with Experts**: Collaborate with experts in ethics, law, and social sciences to inform the development and assessment of automated systems. Their insights can help identify potential risks and ensure that the systems align with societal values and legal standards (Source: [document name], p. [page number]).
By adopting these proactive measures and conducting thorough equity assessments, organizations can significantly reduce the risk of algorithmic discrimination and promote fair and equitable outcomes in their automated systems.","Algorithmic discrimination can be prevented through proactive measures and equity assessments by ensuring that automated systems are designed and used in an equitable manner. This includes conducting proactive equity assessments during system design, using representative data, avoiding proxies for demographic features, ensuring accessibility for individuals with disabilities, conducting pre-deployment and ongoing disparity testing, and maintaining clear organizational oversight. Independent evaluation and plain language reporting, such as algorithmic impact assessments that include testing results and mitigation information, should be performed and made public whenever possible to confirm these protections.",0.7931034482758621,0.9228046027851279,0.6666666666666666,0.999999999975,0.8790478698024498
How can system design ensure behavior transparency in high-risk settings while meeting expectations for automated systems?,"['-\ntion or use case should not be assumed to transfer to another. \nHuman consideration before any high-risk decision. Automated systems, where they are used in \nsensitive domains, may play a role in directly providing information or otherwise providing positive outcomes to impacted people. However, automated systems should not be allowed to directly intervene in high-risk situations, such as sentencing decisions or medical care, without human consideration. \nMeaningful access to examine the system. Designers, developers, and deployers of automated \nsystems should consider limited waivers of confidentiality (including those related to trade secrets) where necessary in order to provide meaningful oversight of systems used in sensitive domains, incorporating mea\n-'
'NOTICE & \nEXPLANATION \nWHAT SHOULD BE EXPECTED OF AUTOMATED SYSTEMS\nThe expectations for automated systems are meant to serve as a blueprint for the development of additional \ntechnical standards and practices that are tailored for particular sectors and contexts. \nTailored to the level of risk. An assessment should be done to determine the level of risk of the auto -\nmated system. In settings where the consequences are high as determined by a risk assessment, or extensive \noversight is expected (e.g., in criminal justice or some public sector settings), explanatory mechanisms should be built into the system design so that the system’s full behavior can be explained in advance (i.e., only fully transparent models should be used), rather than as an after-the-decision interpretation. In other settings, the extent of explanation provided should be tailored to the risk level.'
'Demonstrate the safety and effectiveness of the system \nIndependent evaluation. Automated systems should be designed to allow for independent evaluation (e.g., \nvia application programming interfaces). Independent evaluators, such as researchers, journalists, ethics \nreview boards, inspectors general, and third-party auditors, should be given access to the system and samples \nof associated data, in a manner consistent with privac y, security, la w, or regulation (including, e.g., intellectual \nproperty law), in order to perform such evaluations. Mechanisms should be included to ensure that system \naccess for evaluation is: provided in a timely manner to the deployment-ready version of the system; trusted to \nprovide genuine, unfiltered access to the full system; and truly independent such that evaluator access cannot \nbe revoked without reasonable and verified justification. \nReporting.12 Entities responsible for the development or use of automated systems should provide'
'You should know that an automated system is being used, \nand understand how and why it contributes to outcomes that impact you. Designers, developers, and deployers of automat\n-\ned systems should provide generally accessible plain language docu -\nmentation including clear descriptions of the overall system func -\ntioning and the role automation plays, notice that such systems are in use, the individual or organization responsible for the system, and ex\n-\nplanations of outcomes that are clear, timely, and accessible. Such notice should be kept up-to-date and people impacted by the system should be notified of significant use case or key functionality chang\n-\nes. You should know how and why an outcome impacting you was de -']","To ensure behavior transparency in high-risk settings while meeting expectations for automated systems, system design should incorporate several key principles and practices:
1. **Risk Assessment Framework**: Begin with a comprehensive risk assessment to identify the potential consequences of automated decisions in high-risk areas, such as healthcare, criminal justice, or finance. This assessment should inform the design process and help determine the level of transparency required (Source: [document name], p. [page number]).
2. **Use of Explainable Models**: Opt for algorithms and models that are inherently interpretable. This means selecting approaches that allow users to understand how inputs are transformed into outputs, rather than relying on complex ""black box"" models that obscure decision-making processes (Source: [document name], p. [page number]).
3. **Incorporation of Explanatory Features**: Design the system to include features that provide clear explanations of its behavior. This could involve visualizations of decision pathways, summaries of the factors influencing decisions, and user-friendly interfaces that allow stakeholders to query the system about its outputs (Source: [document name], p. [page number]).
4. **Real-Time Feedback Mechanisms**: Implement real-time feedback mechanisms that inform users when a decision is made, along with the rationale behind it. This transparency allows users to understand and respond to the system's actions promptly (Source: [document name], p. [page number]).
5. **Documentation of Data Sources and Processes**: Maintain thorough documentation of the data sources, algorithms, and decision-making processes used in the system. This documentation should be accessible to stakeholders and provide insights into how decisions are made, enhancing overall transparency (Source: [document name], p. [page number]).
6. **User Training and Support**: Provide training and resources for users to help them understand the system's functionalities and the explanations it provides. This ensures that users can effectively interpret the information and make informed decisions based on the system's outputs (Source: [document name], p. [page number]).
7. **Independent Evaluation and Oversight**: Design the system to allow for independent evaluations by third parties, such as auditors or ethics boards. This external oversight can provide an objective assessment of the system's transparency and effectiveness, ensuring accountability (Source: [document name], p. [page number]).
8. **Iterative Improvement Based on Feedback**: Establish a process for continuous improvement based on user feedback and performance evaluations. This iterative approach allows organizations to adapt the system over time, enhancing transparency and addressing any emerging concerns (Source: [document name], p. [page number]).
By incorporating these principles into the system design, organizations can enhance behavior transparency in high-risk settings, ensuring that automated systems operate in a manner that is understandable, accountable, and aligned with user expectations.","In settings where the consequences are high as determined by a risk assessment, or extensive oversight is expected (e.g., in criminal justice or some public sector settings), explanatory mechanisms should be built into the system design so that the system’s full behavior can be explained in advance (i.e., only fully transparent models should be used), rather than as an after-the-decision interpretation. The extent of explanation provided should be tailored to the risk level.",0.05128205128205128,0.9507273722002028,1.0,0.999999999975,0.3525341084570034