Sayan Layek

caprion

AI & ML interests

None yet

caprion's activity

upvoted a paper 5 months ago

How (un)ethical are instruction-centric responses of LLMs? Unveiling the vulnerabilities of safety guardrails to harmful queries

Paper • 2402.15302 • Published Feb 23 • 3

upvoted a collection 5 months ago

AI and Safety

6 items • Updated Jun 29 • 3

upvoted 3 papers 5 months ago

Breaking Boundaries: Investigating the Effects of Model Editing on Cross-linguistic Performance

Paper • 2406.11139 • Published Jun 17 • 12

SafeInfer: Context Adaptive Decoding Time Safety Alignment for Large Language Models

Paper • 2406.12274 • Published Jun 18 • 14

Safety Arithmetic: A Framework for Test-time Safety Alignment of Language Models by Steering Parameters and Activations

Paper • 2406.11801 • Published Jun 17 • 15

upvoted a paper 9 months ago

Sowing the Wind, Reaping the Whirlwind: The Impact of Editing Language Models

Paper • 2401.10647 • Published Jan 19 • 3