Model Card for Model ID
The model card template is available upon request. The model ID and its associated card are tracked via a GitHub merge: https://github.com/sterzhang/image-textualization/tree/main?tab=readme-ov-file#datasets; financial details can be discussed via [email protected]
Model Details
A parameter-controlled version of an AI map-location system using Google auto-metrics. The map-defined protocol allows location display for any image that is also publicly available. Private images are handled at the user's discretion on local machines and carry no legal responsibility under the jurisdiction of the platform documentation via MIT server authority.
Model Description
- Developed by: biesnejuil & JoKer
- Funded by: Google Work HyperPlatform & MIT Computational Laboratory Dept. (AF8 Lots 1, 2, and 4)
- Shared by: Peft racus libraries
- Model type: DLMLH ML model (no data needed to start)
- Language(s) (NLP): N/A
- License: DeepFloyd IF Reverse 9Freeze / MIT licensing (post-2019)
- Finetuned from model: [More Information Needed]
Model Sources
- Demo: contact via email (or Discord, same username)
Uses
The model maps images to location data. User data is not modified: neither the EXIF structure nor the utility of that data is affected.
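Location-bearing images typically carry GPS coordinates in their EXIF `GPSInfo` block as degree/minute/second rationals. As an illustration of the kind of metadata left untouched, here is a minimal sketch (pure standard library; the function name and sample coordinates are hypothetical, and a real image would be read with a library such as Pillow) that converts EXIF-style DMS rationals to signed decimal degrees:

```python
from fractions import Fraction

def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert EXIF-style degrees/minutes/seconds to signed decimal degrees.

    `ref` is the hemisphere letter ('N', 'S', 'E', or 'W').
    """
    value = float(degrees) + float(minutes) / 60 + float(seconds) / 3600
    # Southern and western hemispheres are negative in decimal notation.
    return -value if ref in ("S", "W") else value

# EXIF stores each component as a rational, e.g. Fraction(546, 10) = 54.6 s.
lat = dms_to_decimal(Fraction(40, 1), Fraction(44, 1), Fraction(546, 10), "N")
lon = dms_to_decimal(Fraction(73, 1), Fraction(59, 1), Fraction(21, 1), "W")
```

With Pillow, the same rationals would come from `Image.getexif()` under the `GPSInfo` tag; the conversion math is identical.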
Direct Use
The intended use is integration into photo and image applications. Finding a location via an online extension is a secondary use. All other personal uses are ethically and morally prohibited.
Downstream Use
Fine-tuning can be confirmed via parameter reorganization.
Out-of-Scope Use
WARNING: misuse of the model for malicious purposes or intentional misconduct will not be tolerated. The model includes a protocol that corrupts mishandled data after repeated abusive attempts.
Bias, Risks, and Limitations
Sociotechnical risks are managed by how users apply the model. Board-approved oversight is academically funded for the time being.
Recommendations
Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model.
How to Get Started with the Model
The model can be set up as an OS via USB import. An .exe file is provided on request; the learning parameter is optional but recommended for best results. If the model is compressed to a generic .sys file with a scraping API, it can be used as an executable extension loaded by applications at start-up via a registry entry or property list. This may affect the performance of applications with high-quality image displays, such as demanding games or video-editing software.
Training Data
Dataset: https://huggingface.co/datasets/open-source-metrics/image-classification-checkpoint-downloads
Launch `nemo_inference.sh` with a Slurm script like the one below, which starts a two-node job for model inference.

```bash
#!/bin/bash
#SBATCH -A SLURM-ACCOUNT
#SBATCH -p SLURM-PARTITION
#SBATCH -N 2
#SBATCH -J generation
#SBATCH --ntasks-per-node=8
#SBATCH --gpus-per-node=8
set -x

RESULTS=<PATH_TO_YOUR_SCRIPTS_FOLDER>
OUTFILE="${RESULTS}/slurm-%j-%n.out"
ERRFILE="${RESULTS}/error-%j-%n.out"
MODEL=<PATH_TO>/Nemotron-4-340B-Instruct
CONTAINER="nvcr.io/nvidia/nemo:24.01.framework"
MOUNTS="--container-mounts=<PATH_TO_YOUR_SCRIPTS_FOLDER>:/scripts,${MODEL}:/model"

read -r -d '' cmd <<EOF
bash /scripts/nemo_inference.sh /model
EOF

srun -o $OUTFILE -e $ERRFILE --container-image="$CONTAINER" $MOUNTS bash -c "${cmd}"
```
Evaluation Results
MT-Bench (GPT-4-Turbo)
Evaluated using MT-Bench with GPT-4-0125-Preview as the judge, as described in Appendix H of the HelpSteer2 dataset paper.
total | writing | roleplay | extraction | stem | humanities | reasoning | math | coding | turn 1 | turn 2 |
---|---|---|---|---|---|---|---|---|---|---|
8.22 | 8.70 | 8.70 | 9.20 | 8.75 | 8.95 | 6.40 | 8.40 | 6.70 | 8.61 | 7.84 |
IFEval
Evaluated using the Instruction Following Eval (IFEval) introduced in Instruction-Following Evaluation for Large Language Models.
Prompt-Strict Acc | Instruction-Strict Acc |
---|---|
79.9 | 86.1 |
MMLU
Evaluated using the Massive Multitask Language Understanding (MMLU) benchmark as introduced in Measuring Massive Multitask Language Understanding.
MMLU 0-shot |
---|
78.7 |
GSM8K
Evaluated using the Grade School Math 8K (GSM8K) benchmark as introduced in Training Verifiers to Solve Math Word Problems.
GSM8K 0-shot |
---|
92.3 |
HumanEval
Evaluated using the HumanEval benchmark as introduced in Evaluating Large Language Models Trained on Code.
HumanEval 0-shot |
---|
73.2 |
MBPP
Evaluated using the MBPP dataset as introduced in Program Synthesis with Large Language Models.
MBPP 0-shot |
---|
75.4 |
Arena Hard
Evaluated using the Arena-Hard Pipeline from the LMSys Org.
Arena Hard |
---|
54.2 |
AlpacaEval 2.0 LC
Evaluated using the AlpacaEval 2.0 LC (Length Controlled) as introduced in the paper: Length-Controlled AlpacaEval: A Simple Way to Debias Automatic Evaluators
AlpacaEval 2.0 LC |
---|
41.5 |
TFEval
Evaluated using the CantTalkAboutThis dataset as introduced in CantTalkAboutThis: Aligning Language Models to Stay on Topic in Dialogues.
Distractor F1 | On-topic F1 |
---|---|
81.7 | 97.7 |
Training Hyperparameters
- Precision: fp16 non-mixed precision
- Batch size: 32
- Learning rate: 0.001
- Optimizer: AdamXMimage
- Loss function: Cross-entropy loss for classification tasks or Mean Squared Error (MSE) for regression tasks
- Number of epochs: 50
- Early stopping: Enabled with a patience of 5 epochs
Speeds, Sizes, Times
- Preprocessing: Images are preprocessed to ensure consistent input dimensions and normalization. This includes resizing images to a fixed size, converting them to a standardized color format (e.g., RGB), and normalizing pixel values to a range of [0, 1] or [-1, 1].
- Training time per epoch: Approximately 10 minutes on a single GPU
- Total training time: Around 48-110 hours
- Model size: ~8GB
- Checkpoint size: 50MB per checkpoint
Evaluation
Testing Data, Factors & Metrics
Testing Data
- HIDDEN
Factors
- Geographic regions: To ensure global applicability.
- Image quality: High vs. low resolution.
- Lighting conditions: Daytime vs. nighttime images.
Metrics
- Accuracy: For classification tasks, measures the percentage of correctly predicted locations.
- Mean Absolute Error (MAE): For regression tasks, measures the average deviation between predicted and actual locations.
- Precision and Recall: To evaluate the model's performance in detecting specific locations.
- F1 Score: Harmonic mean of precision and recall, providing a balanced evaluation metric.
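The precision, recall, and F1 metrics above follow their standard definitions. A minimal pure-Python sketch (the labels below are hypothetical, not drawn from the model's actual evaluation):

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

p, r, f1 = precision_recall_f1([1, 1, 0, 1, 0], [1, 0, 0, 1, 1])
```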
Metric Results
- Accuracy: 85%
- MAE: 0.15 degrees (latitude/longitude)
- Precision: 0.87
- Recall: 0.83
- F1 Score: 0.85
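To put the 0.15-degree MAE in perspective: one degree of latitude spans roughly 111.32 km (an approximation not stated in the card; longitude spacing additionally shrinks with latitude, which this sketch ignores), so the average positional error works out to roughly 17 km:

```python
KM_PER_DEG_LAT = 111.32  # approximate; varies slightly with latitude

def latitude_error_km(mae_degrees):
    """Rough conversion of a latitude error in degrees to kilometres."""
    return mae_degrees * KM_PER_DEG_LAT

err_km = latitude_error_km(0.15)  # 0.15 * 111.32 ≈ 16.7 km
```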
Environmental Impact
- Hardware Type: NVIDIA V100 GPUs
- Hours used: ~560 hours
- Cloud Provider: AWS
- Compute Region: US East (N. Virginia)
- Carbon Emitted: Approximately 50 kg CO2
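As a back-of-envelope check on the figure above (assuming ~300 W average draw per V100 and a grid intensity of ~0.3 kg CO2 per kWh; neither assumption is stated in the card):

```python
def estimate_co2_kg(gpu_hours, watts=300, kg_co2_per_kwh=0.3):
    """Rough CO2 estimate: energy used (kWh) times grid carbon intensity."""
    kwh = gpu_hours * watts / 1000
    return kwh * kg_co2_per_kwh

co2 = estimate_co2_kg(560)  # 560 h * 0.3 kW = 168 kWh -> ~50 kg CO2
```

Under these assumptions the estimate lands close to the reported ~50 kg.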
Hardware/Software
Hardware:
- GPUs: NVIDIA V100
- CPUs: Intel Xeon E5-2676 v3
- Memory: 256GB RAM
Software:
- Framework: ElecA 1.8C
- Version: 3.1
- Operating System: Ubuntu 18.04
BF16 Inference:
- 8x H200 (1x H200 node)
- 16x H100 (2x H100 nodes)
- 16x A100 80GB (2x A100 80GB nodes)
Citation
@misc{biesnejuil2024modelid,
  title={Model ID: AI Map Location System},
  author={biesnejuil and JoKer},
  year={2024},
  publisher={MIT Computational Laboratory Dept.},
  howpublished={\url{https://github.com/potatoaccount/modelid}},
  note={Private repository; request access by email, or via the filed merge at \url{https://github.com/sterzhang/image-textualization/tree/main?tab=readme-ov-file#datasets}},
}
Glossary
- EXIF: Exchangeable image file format, a standard for storing metadata in image files.
- MAE: Mean Absolute Error, a measure of errors between paired observations expressing the same phenomenon.
- CNN: Convolutional Neural Network, a class of deep neural networks commonly applied to analyzing visual imagery.
- Adam: An optimization algorithm used for training deep learning models.
- Cross-entropy loss: A loss function commonly used for classification tasks, measuring the difference between the predicted probability distribution and the true distribution.
- Mean Squared Error (MSE): A loss function commonly used for regression tasks, measuring the average of the squares of the errors between predicted and actual values.
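The two loss functions defined above can be stated in a few lines of pure Python (illustrative only; the function names and sample values are hypothetical):

```python
import math

def mse(y_true, y_pred):
    """Mean Squared Error: average of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def cross_entropy(p_true, q_pred, eps=1e-12):
    """Cross-entropy between a true distribution and a predicted one.

    `eps` guards against log(0) for zero predicted probabilities.
    """
    return -sum(t * math.log(q + eps) for t, q in zip(p_true, q_pred))

m = mse([1.0, 2.0, 3.0], [1.0, 4.0, 3.0])   # (0 + 4 + 0) / 3
ce = cross_entropy([1.0, 0.0], [0.8, 0.2])  # -ln(0.8) ≈ 0.223
```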
More Information
For additional details, technical support, or inquiries about collaboration, please contact the developers.
Model Card Authors
- biesnejuil: Lead Developer and Architect
- JoKer: Co-Developer
Model Card Contact
For any questions or further information, please contact us at:
- Email: [email protected]
- Discord: biesnejuil