TomRB22 committed on
Commit
3a0d063
1 Parent(s): 13c33d8

Included full documentation in the README file

Files changed (1): README.md (+69 -4)
README.md CHANGED
@@ -67,7 +67,7 @@ The first one will clone the repository. Then, fluidsynth, a real-time MIDI synt

## Training Details

- Pivaenist was trained on the [MAESTRO v2.0.0 dataset](https://magenta.tensorflow.org/datasets/maestro), which contains 1282 midi files [check it in colab]. Their preprocessing involves splitting each note in pitch, duration and step, which compose a column of a 3xN matrix (which we call song map), where N is the number of notes and a row represents sequentially the different pitches, durations and steps. The VAE's objective is to reconstruct these matrices, making it then possible to generate random maps by sampling from the distribution, and then convert them to a MIDI file.
+ Pivaenist was trained on the MIDI files of the [MAESTRO v2.0.0 dataset](https://magenta.tensorflow.org/datasets/maestro). The preprocessing splits each note into pitch, duration and step, which together form one column of a 3xN matrix (a "song map"), where N is the number of notes and the rows hold, in order, the pitches, durations and steps. The VAE's objective is to reconstruct these matrices; random maps can then be sampled from the latent distribution and converted to MIDI files.

<figure>
<img src="https://huggingface.co/TomRB22/pivaenist/resolve/main/.images/map_example.png" style="width:30%; display:block; margin:auto">
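+
+ As a toy illustration of the song-map layout (the values here are invented, not taken from MAESTRO):
+
+ ```python
+ import pandas as pd
+
+ # 3xN matrix: one column per note; rows are pitch, duration and step
+ song_map = pd.DataFrame(
+     [[60, 64, 67],       # pitch (MIDI note numbers)
+      [0.5, 0.5, 1.0],    # duration in seconds
+      [0.0, 0.5, 0.5]],   # step: time since the previous note (assumed convention)
+     index=["pitch", "duration", "step"])
+ ```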
 
@@ -76,7 +76,35 @@ Pivaenist was trained on the [MAESTRO v2.0.0 dataset](https://magenta.tensorflow

# Documentation

- ## **_Audio_**
+ ## **_model.VAE_**
+
+ ### encode
+
+ ```python
+ def encode(self, x_input: tf.Tensor) -> tuple[tf.Tensor]:
+ ```
+ Take a song map and make a forward pass through the encoder, returning the latent representation and the parameters of the latent distribution.
+
+ Parameters:
+ * x_input (tf.Tensor): Song map to be encoded by the VAE.
+
+ Returns:
+ * tuple[tf.Tensor]: The parameters of the distribution which encode the song (mu, sd), and a latent representation sampled from this distribution (z_sample).
+
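+ A minimal usage sketch (the model instance and the tensor form of the song map are assumptions, not shown in this excerpt):
+
+ ```python
+ # Return order (mu, sd, z_sample) follows the doc entry above
+ mu, sd, z_sample = model.encode(song_map)
+ ```
+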
+ ### generate
+
+ ```python
+ def generate(self, z_sample: tf.Tensor=None) -> tf.Tensor:
+ ```
+ Decode a latent representation of a song.
+
+ Parameters:
+ * z_sample (tf.Tensor): Song encoding output by the encoder. If None, a sample is drawn from a unit Gaussian distribution.
+
+ Returns:
+ * tf.Tensor: Song map corresponding to the encoding.
+
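+ For instance (a sketch; model is assumed to be a trained VAE instance):
+
+ ```python
+ new_song_map = model.generate()            # unconditioned: z drawn from a unit Gaussian
+ reconstruction = model.generate(z_sample)  # decode a specific encoding
+ ```
+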
+ ## **_audio_**

### midi_to_notes

 
@@ -103,5 +131,42 @@ Parameters
* pm (pretty_midi.PrettyMIDI): PrettyMIDI object containing a song.
* seconds (int): Time fraction of the song to be displayed. When set to -1, the full length is taken.

- Returns
- * display.Audio: Song as an object allowing for display.
+ Returns:
+ * display.Audio: Song as an object allowing for display.
+
+ ### map_to_wav
+
+ ```python
+ def map_to_wav(song_map: pd.DataFrame, out_file: str, velocity: int=100) -> pretty_midi.PrettyMIDI:
+ ```
+ Convert a song map to a MIDI file (the reverse process with respect to midi_to_notes) and optionally save it, generating a PrettyMIDI object in the process.
+
+ Parameters:
+ * song_map (pd.DataFrame): 3xN matrix where each column is a note, composed of pitch, duration and step.
+ * out_file (str): Path or file to write the .mid file to. If None, no saving is done.
+ * velocity (int): Note loudness, i.e. how hard a piano key is struck.
+
+ Returns:
+ * pretty_midi.PrettyMIDI: PrettyMIDI object containing the song's representation.
+
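+ For example (a sketch; the audio module alias and the file name are assumptions):
+
+ ```python
+ # song_map_df: a 3xN DataFrame as described above
+ pm = audio.map_to_wav(song_map_df, out_file="sample.mid", velocity=100)
+ ```
+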
+ ### generate_and_display
+
+ ```python
+ def generate_and_display(model: VAE,
+                          out_file: str=None,
+                          z_sample: tf.Tensor=None,
+                          velocity: int=100,
+                          seconds: int=120) -> display.Audio:
+ ```
+ Generate a song, optionally save it, and display it.
+
+ Parameters:
+ * model (VAE): Instance of VAE to generate the song with.
+ * out_file (str): Path or file to write the .mid file to. If None, no saving is done.
+ * z_sample (tf.Tensor): Song encoding used to generate a song. If None, an unconditioned piece is generated.
+ * velocity (int): Note loudness, i.e. how hard a piano key is struck.
+ * seconds (int): Time fraction of the song to be displayed. When set to -1, the full length is taken.
+
+ Returns:
+ * display.Audio: Song as an object allowing for display.
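+
+ A possible end-to-end call (a sketch; the output path is an assumption):
+
+ ```python
+ # Sample an unconditioned piece, save it to disk and get a displayable object
+ audio.generate_and_display(model, out_file="sample.mid", seconds=-1)
+ ```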