Using Adapter/PEFT for finetuning a Subnet extracted from the SmolLM2 for Arduino Tool Calling

#7
by MartialTerran - opened

This is a plan for extracting and training a subnet from the SmolLM2 model that is reliable for Arduino tool calling. An adapter trained with a PEFT method is added to patch any deficiencies in the extracted subnet and to support fine-tuning with a tokenizer built on the custom Arduino control vocabulary.

Project Plan: SmolLM2 Subnet for Arduino Tool Calling with PEFT Adapter
This plan enhances the previous approach by incorporating a Parameter-Efficient Fine-Tuning (PEFT) adapter to further specialize and improve the performance of the extracted SmolLM2 subnet for Arduino tool calling.

Target Arduino Boards: Arduino Portenta H7 preferred.

Phase 1: Subnet Extraction, Adapter Initialization, and Specialization

Task 1.1: Define a Custom Arduino Control Vocabulary: (Same as previous plan)

Task 1.2: Extract and Quantize a SmolLM2 Subnet:

Description: Extract a small subnet and quantize its weights as described in the previous plan. This serves as the base model for further adaptation.

Details: The initial subnet extraction doesn't require fine-tuning on a specialized dataset yet, as this will be handled by the PEFT adapter.
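
As a rough illustration of the quantization step, here is a minimal sketch of symmetric per-tensor int8 quantization. The function names and the toy weight values are assumptions made for this example; the actual quantization scheme is the one defined in the previous plan.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of a list of floats to int8 values."""
    max_abs = max(abs(w) for w in weights)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize_int8(q, scale):
    """Approximate reconstruction of the original floats."""
    return [v * scale for v in q]


weights = [0.5, -1.0, 0.25, 0.0]  # toy values, not real SmolLM2 weights
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
```

Each restored value differs from the original by at most half a quantization step (scale / 2), which is the error the adapter would later have to help compensate for.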

Task 1.3: Initialize the PEFT Adapter:

Description: Initialize a PEFT adapter (e.g., LoRA, AdaLoRA, or QLoRA) for the extracted subnet.

Details: The adapter's parameters will be trained to specialize the subnet for Arduino tool calling. Choose a PEFT method that is computationally efficient and has a small memory footprint, suitable for the Arduino environment.

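Deliverables: An initialized PEFT adapter. For LoRA-style adapters, one low-rank factor is typically initialized randomly and the other to zero, so the adapter leaves the subnet's outputs unchanged until training begins.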
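
A minimal sketch of LoRA-style initialization, assuming a single adapted weight matrix. The dimensions, the Gaussian standard deviation, and the helper names are illustrative assumptions, not values taken from SmolLM2 or from any library.

```python
import random


def init_lora(d_in, d_out, rank, seed=0, std=0.02):
    """Return factors A (rank x d_in, small random) and B (d_out x rank, zeros)."""
    rng = random.Random(seed)
    A = [[rng.gauss(0.0, std) for _ in range(d_in)] for _ in range(rank)]
    B = [[0.0 for _ in range(rank)] for _ in range(d_out)]
    return A, B


def delta_weight(A, B):
    """Dense weight update B @ A, shape d_out x d_in."""
    rank = len(A)
    return [[sum(B[i][k] * A[k][j] for k in range(rank)) for j in range(len(A[0]))]
            for i in range(len(B))]


A, B = init_lora(d_in=4, d_out=3, rank=2)
delta = delta_weight(A, B)  # all zeros, because B starts at zero
```

Because B starts at zero, the product B @ A is zero and the adapted layer initially behaves exactly like the frozen base layer.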

Task 1.4: Create Arduino Control Dataset & Fine-tune the Adapter:

Description: Create a dataset of natural language instructions paired with corresponding Arduino code, as described in the previous plan. Use this dataset to fine-tune only the PEFT adapter's parameters, keeping the base subnet's weights frozen.

Details: Fine-tuning the adapter is much more efficient than fine-tuning the entire subnet, making it feasible even with limited computational resources. Use the custom Arduino control vocabulary tokenizer during fine-tuning.

Deliverables: A fine-tuned PEFT adapter specialized for Arduino tool calling.
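
To make "fine-tune only the adapter, keep the base frozen" concrete, here is a toy sketch with one linear layer and a rank-1 LoRA-style adapter trained by plain gradient descent. All values, the learning rate, and the function names are assumptions for illustration; a real run would use a training framework and the Arduino control dataset.

```python
def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def forward(W, A, B, x):
    # Frozen base output plus low-rank adapter output: W x + B (A x)
    base = matvec(W, x)
    low = matvec(B, matvec(A, x))
    return [b + l for b, l in zip(base, low)]


def loss(W, A, B, x, target):
    y = forward(W, A, B, x)
    return 0.5 * sum((yi - ti) ** 2 for yi, ti in zip(y, target))


def train_adapter(W, A, B, x, target, lr=0.1, steps=20):
    """Gradient descent on A and B only; W is read but never modified."""
    rank, d_out, d_in = len(A), len(B), len(x)
    for _ in range(steps):
        err = [yi - ti for yi, ti in zip(forward(W, A, B, x), target)]
        ax = matvec(A, x)  # length: rank
        bt_err = [sum(B[i][k] * err[i] for i in range(d_out)) for k in range(rank)]
        # dL/dB[i][k] = err[i] * (A x)[k];  dL/dA[k][j] = (B^T err)[k] * x[j]
        grad_B = [[err[i] * ax[k] for k in range(rank)] for i in range(d_out)]
        grad_A = [[bt_err[k] * x[j] for j in range(d_in)] for k in range(rank)]
        B = [[B[i][k] - lr * grad_B[i][k] for k in range(rank)] for i in range(d_out)]
        A = [[A[k][j] - lr * grad_A[k][j] for j in range(d_in)] for k in range(rank)]
    return A, B


W = [[1.0, 0.0], [0.0, 1.0]]  # frozen base weights (toy values)
A0 = [[0.5, 0.5]]             # rank-1 factor, fixed start for repeatability
B0 = [[0.0], [0.0]]           # zero start: adapter is a no-op before training
x, target = [1.0, 2.0], [2.0, 1.0]

loss_before = loss(W, A0, B0, x, target)
A1, B1 = train_adapter(W, A0, B0, x, target)
loss_after = loss(W, A1, B1, x, target)
```

The loss drops while W stays untouched, which is the property that lets the same quantized base subnet be shared across adapters.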

Phase 2: Arduino Development Environment Setup (Same as previous plan, but includes loading the adapter weights)

Task 2.1: Integrate a Lightweight Interpreter/Code Executor: (Same as previous plan)

Task 2.2: Load Subnet and Adapter Weights:

Description: Implement loading the quantized base subnet weights and the fine-tuned adapter weights onto the Arduino, potentially from external storage.
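
One possible way to prepare weights for loading from external storage is to pack each quantized tensor into a small binary blob on the host. The layout below (a 4-byte tag, an element count, a float32 scale, then int8 values) and the `SUBN` tag are hypothetical choices for this sketch, not an existing file format.

```python
import struct

MAGIC = b"SUBN"          # hypothetical 4-byte tag for this sketch
HEADER_FMT = "<4sIf"     # little-endian: tag, uint32 count, float32 scale


def pack_tensor(q_values, scale):
    """Serialize int8 values plus their scale into a byte string."""
    header = struct.pack(HEADER_FMT, MAGIC, len(q_values), scale)
    body = struct.pack(f"<{len(q_values)}b", *q_values)
    return header + body


def unpack_tensor(blob):
    """Inverse of pack_tensor; mirrors what the on-device loader would parse."""
    magic, count, scale = struct.unpack_from(HEADER_FMT, blob, 0)
    if magic != MAGIC:
        raise ValueError("unexpected header tag")
    offset = struct.calcsize(HEADER_FMT)
    values = list(struct.unpack_from(f"<{count}b", blob, offset))
    return values, scale


blob = pack_tensor([64, -127, 32, 0], 0.5)
values, scale = unpack_tensor(blob)
```

A fixed little-endian layout like this keeps the on-device reader simple: it reads a 12-byte header and then streams the int8 body. The same layout could hold adapter tensors as well as base subnet tensors.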

Phase 3: Implementation and Integration

Task 3.1: Specialized Tokenization: (Same as previous plan)
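
For readers without the previous plan at hand, a fixed-vocabulary tokenizer for Arduino control code could look roughly like the sketch below. The vocabulary shown is hypothetical and far smaller than a real one would be; the actual vocabulary is whatever Task 1.1 defines.

```python
import re

# Hypothetical vocabulary for illustration only.
VOCAB = ["<unk>", "digitalWrite", "analogWrite", "delay", "HIGH", "LOW",
         "(", ")", ",", ";"] + list("0123456789")
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}

# Identifiers, single digits, or one of the punctuation marks; whitespace is skipped.
PATTERN = re.compile(r"[A-Za-z_]\w*|\d|[(),;]")


def encode(code):
    """Map code text to token IDs; unknown identifiers become <unk>."""
    return [TOKEN_TO_ID.get(piece, TOKEN_TO_ID["<unk>"])
            for piece in PATTERN.findall(code)]


def decode(ids):
    """Concatenate tokens back into code text (whitespace is not preserved)."""
    return "".join(VOCAB[i] for i in ids)


ids = encode("digitalWrite(13, LOW);")
text = decode(ids)
```

Splitting numbers into single digits keeps the vocabulary tiny while still letting the model emit any pin number or delay value.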

Task 3.2: Inference and Code Generation with Adapter:

Description: Implement the inference loop, similar to the previous plan, but incorporate the PEFT adapter during inference.

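Details: The adapter's contribution is combined with the base subnet's computation to generate the final Arduino control code. With a LoRA-style adapter, this means adding a scaled low-rank update to the output of each adapted layer, or merging that update into the layer's weights before deployment.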
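
Assuming a LoRA-style adapter, the combination for one adapted layer can be sketched as y = W x + (alpha / rank) * B (A x). The sketch below also shows that merging the update into W ahead of time gives the same result. The matrices are toy values chosen for easy checking.

```python
def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def lora_forward(W, A, B, x, alpha, rank):
    """Base output plus scaled low-rank update, computed without forming B @ A."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    s = alpha / rank
    return [b + s * u for b, u in zip(base, update)]


def merge(W, A, B, alpha, rank):
    """Fold the adapter into the weights: W + (alpha / rank) * B @ A."""
    s = alpha / rank
    return [[W[i][j] + s * sum(B[i][k] * A[k][j] for k in range(rank))
             for j in range(len(W[0]))] for i in range(len(W))]


W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 1.0]]        # rank x d_in
B = [[2.0], [3.0]]      # d_out x rank
x = [1.0, 2.0]

y_unmerged = lora_forward(W, A, B, x, alpha=2.0, rank=1)
y_merged = matvec(merge(W, A, B, alpha=2.0, rank=1), x)
```

Keeping the adapter separate costs two small extra matrix-vector products per adapted layer at inference time. Merging avoids that cost, but with a quantized base the merged weights would need to be requantized, so which option suits the Arduino target is an open design choice.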

Task 3.3: Interpreter/Executor Integration: (Same as previous plan)

Phase 4: Testing and Optimization (Similar to previous plans)

Advantages of using a PEFT Adapter:

Reduced Memory Footprint: Only the small set of adapter parameters is trained, so fine-tuning needs far less memory than updating the entire subnet. The adapter is stored alongside the frozen base subnet, and each additional task adds only its own adapter weights.

Faster Fine-tuning: Training the adapter is much faster than training the entire model.

Improved Performance: The adapter can effectively specialize the base subnet for Arduino tool calling, leading to better performance on the target task.

Flexibility: The base subnet can be reused for other tasks by training different adapters.
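
To put a number on the footprint argument, the arithmetic below compares the parameter count of one full weight matrix with that of a LoRA-style adapter for the same matrix. The dimension 576 and rank 8 are illustrative assumptions, not measurements of the extracted subnet.

```python
def full_params(d_in, d_out):
    """Parameters in a dense d_out x d_in weight matrix."""
    return d_in * d_out


def lora_params(d_in, d_out, rank):
    """Parameters in the two low-rank factors A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


d = 576      # illustrative layer width
rank = 8     # illustrative adapter rank

dense = full_params(d, d)            # 331776
adapter = lora_params(d, d, rank)    # 9216
ratio = dense // adapter             # 36
```

Under these assumptions the adapter for one layer holds 36 times fewer parameters than the layer itself, which is where the training-time and per-task storage savings come from.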

This revised plan improves the efficiency and flexibility of the system by incorporating a PEFT adapter. It allows for more specialized fine-tuning while minimizing resource usage on the Arduino. This approach makes complex natural language-based Arduino control more feasible on resource-constrained devices. Remember to choose a PEFT method and adapter size that is appropriate for the target hardware.

The following table lists example natural language instructions paired with corresponding Arduino code, as a starting dataset for fine-tuning the subnet with a PEFT adapter for Arduino tool calling:

Arduino Tool Calling Dataset Examples
| Instruction (Natural Language) | Arduino Code (Simplified Interpreter Format) | Notes |
| --- | --- | --- |
| Turn on the built-in LED. | digitalWrite(LED_BUILTIN, HIGH); | Basic digital output |
| Turn off the LED on pin 13. | digitalWrite(13, LOW); | Digital output on a numbered pin |
| Blink the red LED 3 times with a 1-second delay. | for (int i = 0; i < 3; i++) { digitalWrite(redLED, HIGH); delay(1000); digitalWrite(redLED, LOW); delay(1000); } | Looping and timing |
| Read the value from analog pin A0 and print it. | Serial.println(analogRead(A0)); | Analog input and serial output |
| If the button on pin 2 is pressed, turn on the green LED. | if (digitalRead(2) == HIGH) { digitalWrite(greenLED, HIGH); } | Digital input and conditional logic |
| Set the brightness of the LED on pin 9 to half. | analogWrite(9, 127); | Analog output (PWM) |
| Rotate the servo motor on pin 5 to 90 degrees. | servo.attach(5); servo.write(90); | Servo control |
| Play a tone of 440 Hz on pin 8 for 500 milliseconds. | tone(8, 440, 500); | Tone generation |
| Send "Hello" over serial. | Serial.println("Hello"); | Serial communication |
| Wait for 2 seconds, then read the temperature sensor on A1. | delay(2000); float temperature = analogRead(A1); | Combining delay and analog input (the value is a raw ADC reading, not yet converted to degrees) |
| If the temperature is greater than 25 degrees, turn on the fan. | if (temperature > 25) { digitalWrite(fanPin, HIGH); } | Combining analog input and conditional logic (assumes temperature is already in degrees) |
| Turn on the LED on pin 7 for 1 second, then turn it off for 2 seconds. Repeat 5 times. | for (int i = 0; i < 5; i++) { digitalWrite(7, HIGH); delay(1000); digitalWrite(7, LOW); delay(2000); } | More complex timing and control flow |
| Read the ultrasonic sensor on pins 3 and 4 and print the distance. | digitalWrite(3, HIGH); delayMicroseconds(10); digitalWrite(3, LOW); long duration = pulseIn(4, HIGH); float distance = duration * 0.034 / 2; Serial.println(distance); | Sensor timing with pulseIn (assumes trigger on pin 3 and echo on pin 4) |
| If the distance is less than 10 cm, turn on the buzzer. | if (distance < 10) { tone(buzzerPin, 1000); } | Combining sensor readings and actions |

Dataset Creation Guidelines:

Start Simple: Begin with basic commands and gradually increase complexity.

Vary Input Phrasing: Include different ways of expressing the same instruction (e.g., "Turn on LED 7", "Switch on the LED at pin 7", "Activate the LED connected to pin 7"). This helps the model generalize better.

Cover Different Peripherals and Functions: Include instructions for various Arduino functionalities (digital I/O, analog I/O, sensors, actuators, timing, etc.).

Include Error Handling: Consider adding instructions that involve error conditions (e.g., "If the sensor reading is invalid, blink the error LED").

Balance the Dataset: Ensure that the dataset has a good balance of different types of instructions and complexity levels.

Iterative Refinement: Start with a smaller dataset and iteratively expand it as needed during the fine-tuning process. Analyze the model's performance and add more examples for areas where it struggles.
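
The "vary input phrasing" guideline can be applied mechanically with templates. The sketch below expands one target command into several instruction variants and writes them as JSON Lines. The template list reuses the phrasings given above; the record field names and the JSONL layout are assumptions for this example.

```python
import json

TEMPLATES = [
    "Turn on LED {pin}",
    "Switch on the LED at pin {pin}",
    "Activate the LED connected to pin {pin}",
]


def expand(pin):
    """Pair every phrasing with the same target code for the given pin."""
    code = f"digitalWrite({pin}, HIGH);"
    return [{"instruction": t.format(pin=pin), "code": code} for t in TEMPLATES]


def to_jsonl(records):
    """One JSON object per line, a common layout for fine-tuning data."""
    return "\n".join(json.dumps(r) for r in records)


records = expand(7)
jsonl = to_jsonl(records)
```

Looping `expand` over several pins and adding matching templates for other commands is a cheap way to grow and balance the dataset before hand-writing the harder examples.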

This table provides a starting point for creating your dataset. Remember to adapt it based on the specific Arduino tools and functionalities you want to control. The quality and diversity of your dataset are critical for training a reliable and effective model.
