
Inquiries Regarding Detailed Heuristics and Attention Module Modifications in the Paper

#7
by danielpark - opened

Thank you for sharing valuable insights.

  1. Where can I find the detailed heuristics mentioned in the paper? I'd like to review the specific algorithms and code.

  2. Is it possible to explicitly verify the model's performance before and after the attention-module modification described in the paper? Specifically, can we obtain numbers quantifying the performance difference, and are there any models compared under controlled conditions where only this structural change varies?
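For context on what such a controlled comparison might look like: one common approach is to hold the evaluation data, tokenizer, and decoding settings fixed and compare perplexity of the two checkpoints on the same evaluation set. A minimal sketch of the perplexity arithmetic (the per-token log-probability values below are purely illustrative, not results from this model):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the mean negative log-likelihood per token.
    Lower is better; identical inputs let two checkpoints be compared fairly."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Hypothetical per-token log-probs on the SAME eval text, scored once by a
# checkpoint before and once after an attention-module change.
before = [-2.1, -1.8, -2.5, -1.9]
after  = [-1.9, -1.7, -2.2, -1.8]

print(f"before: {perplexity(before):.2f}, after: {perplexity(after):.2f}")
```

Reporting both numbers from a single fixed evaluation pipeline is what makes the comparison "controlled": any difference can then be attributed to the structural change rather than to data or decoding variation.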
