| MIDI generation from the Note Map is
AudioExplorer's batch-mode MIDI-generation tool. Batch mode has
a major advantage over real-time MIDI generation: the entire audio
file can be scanned before any MIDI data is created, allowing
note events to be determined more accurately and with much less guesswork.
Generation of MIDI from a Note Map is
accomplished in three stages:
- generation of a dynamic note-map image based on current note
selections and the threshold/maximum envelopes.
- calculation and display of "note regions". The
note regions have several properties which can be edited
individually or as groups of notes. These properties provide
fine control over the MIDI generation step.
- MIDI generation based on the note regions and their properties.
|
|
|
|
Dynamic Note-map Generation
|
| The Note Map file stores a full signal vs.
time profile for each of the 128 MIDI notes, with one time-point for
each interval sampled and analyzed by the Power Spectrum Analyzer.
This allows AudioExplorer to very quickly generate a new graphic image
in response to changes made to the note
selection and/or the envelopes.
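
As a rough illustration of why this layout makes redrawing fast, the sketch below holds the Note Map as a 128-row array and rebuilds an image from it. The function names, the per-note threshold and maximum arrays, and the linear rescaling are assumptions made for illustration; the Note Map file format and AudioExplorer's actual rendering are not described here.

```python
import numpy as np

NUM_MIDI_NOTES = 128  # one signal vs. time profile per MIDI note


def make_note_map(num_time_points):
    """Hypothetical in-memory layout: rows are MIDI notes, columns are time points."""
    return np.zeros((NUM_MIDI_NOTES, num_time_points), dtype=np.float32)


def render_image(note_map, selected_notes, thresholds, maxima):
    """Rebuild an image for the selected notes only.

    Assumption: each selected profile is rescaled linearly so that the
    note's threshold maps to 0 and its maximum maps to 1.
    """
    image = np.zeros_like(note_map)
    for note in selected_notes:
        span = maxima[note] - thresholds[note]
        if span <= 0:
            continue  # degenerate envelope, leave the row blank
        scaled = (note_map[note] - thresholds[note]) / span
        image[note] = np.clip(scaled, 0.0, 1.0)
    return image
```

Because the profiles are already stored, changing the note selection or an envelope only requires recomputing the image, not re-analyzing the audio.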
|
|
|
|
|
|
|
Calculation of Note Regions |
|
| As AudioExplorer examines the signal vs. time profiles
of each tone in the Note Map, it first applies smoothing (if any) to
each profile, and then uses the tone's threshold to determine the time
intervals for which the tone is "on". These
above-threshold time intervals are referred to as "note
regions". |
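
The thresholding step can be sketched as a single pass over one tone's profile. This is a minimal illustration, not AudioExplorer's code: the function name is hypothetical, and it assumes a region ends at the first time point where the signal is back at or below the threshold.

```python
def find_note_regions(times, signal, threshold):
    """Return (start, end) time pairs during which the signal is above threshold."""
    regions = []
    start = None
    for t, value in zip(times, signal):
        if value > threshold and start is None:
            start = t  # signal has just risen above the threshold
        elif value <= threshold and start is not None:
            regions.append((start, t))  # signal has dropped back
            start = None
    if start is not None:
        # Still above threshold at the end of the profile.
        regions.append((start, times[-1]))
    return regions
```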
|
|
| Figure 1: |
Effects of smoothing |
|
Instead of examining every time point sampled by
the Power Spectrum Analyzer, the dynamic note map can use a
windowed average, causing a smoothing of each note's signal vs.
time curve. An example of smoothing is shown below for the
tone G3, which is seen to sound rhythmically during the time
interval shown. When smoothing is applied, many of the
very rapid changes (e.g., "rough edges" of the peaks)
are seen to disappear, and the highest peaks are reduced. |
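
A windowed average of this kind can be sketched with a simple moving-average kernel. The sketch assumes the smoothing value is a window width in time points, which the text does not state; the function name is hypothetical.

```python
import numpy as np


def smooth_profile(signal, window):
    """Moving average over `window` time points (assumed meaning of "smoothing")."""
    signal = np.asarray(signal, dtype=float)
    window = min(int(window), len(signal))  # keep the output length equal to the input
    if window <= 1:
        return signal.copy()  # smoothing of 0 or 1 leaves the profile unchanged
    kernel = np.ones(window) / window
    return np.convolve(signal, kernel, mode="same")
```

A single sharp peak is spread over its neighbors and its height drops, which matches the behavior described for the smoothed G3 profile.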
|
|
|
|
Smoothing = 0
|
Smoothing = 25
|
|

|
|
|
|
|
|
| Figure 2: |
Calculation of Note Regions from
a tone's signal vs. time profile and threshold |
|
The figure shows an area of the note-map image
(including the note region overlay) superimposed over the signal
vs. time profile from which it was generated.
The signal first rises above the threshold between 26.134 and
26.434 seconds, defining the first region.
The second region spans 26.502 to 27.229 seconds, and
includes two small shoulder peaks along with the main central
signal peak.
The third region spans 27.303 to 29.314 seconds, and clearly
includes multiple "events" which have been interpreted
as a single region. Note that these "events" -
or regular fluctuations in the tone's signal - may not be the
result of musical events (plucking or bowing a string, pressing a
key, etc.). They may also result from special effects
(such as delay or reverb) which have been applied to the
recording. Although AudioExplorer is capable of breaking
the region into multiple sub-regions using the peak-detect
function, this may not always be musically desirable.
|
|
|

|
|
|
|
|
|
| Figure 3: |
Illustration of the effects of
the minimum duration |
|
In the example below, a tone pulses rhythmically and its signal periodically rises above
threshold. However, if the minimum duration were set to
0.25 seconds (250 ms), several of these rises would be discarded,
since the time interval over which they remain above the
threshold is less than the minimum duration.
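
The minimum-duration rule amounts to a filter over the detected regions. In this sketch, regions are assumed to be (start, end) pairs in seconds, and the function name is hypothetical.

```python
def drop_short_regions(regions, min_duration):
    """Keep only regions that stay above threshold for at least min_duration seconds."""
    return [(start, end) for start, end in regions if (end - start) >= min_duration]
```

With a minimum duration of 0.25 seconds, a region lasting 0.1 seconds is discarded while one lasting 0.5 seconds is kept.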
|
|
|
|
|
|
|

|
|
|
|
|
When to use "Merge
Neighbors"? |
| Due to limitations in the resolution
of the Power Spectrum Analyzer, there can be "leakage"
of signal from a strongly sounding frequency into neighboring
frequencies. This problem is especially common in the
lowest frequencies. The strategy used to counter this
problem is similar to the "Shoulder
Merging" used by the real-time Note
Processor. |
|
|
| Figure 4A: |
To merge neighbors or not to
merge? |
|
In this figure, the note map clearly shows a
cluster of three adjacent notes sounding at the same time,
making this a candidate for neighbor merging.
Inspection of the signal vs. time profiles shows that the
center note, B flat 2 (Bb2), is the strongest of the three signals,
another bit of evidence that neighbor merging is
appropriate.
However, these profiles have two properties which suggest
that these regions are independent and should not be
merged. First, each of the three notes starts at a
different time, in order B2 - Bb2 - A2. Second, each
note's profile has a distinctive shape, quite unlike its
neighbor's. |
|
|
|
|
|
|

|
|
|
|
Figure 4B: |
In this example, the center note (E3) is again the
strongest note. Furthermore, the signal vs. time profiles are
quite similar - a strong center with a shoulder on each
side. E3 is a strong candidate for application of neighbor
merging. |
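
The three visual checks used in Figures 4A and 4B (the center note is strongest, the notes start together, and the profiles have similar shapes) can be written down as a rough heuristic. AudioExplorer leaves this decision to the user; the function below, its name, and its tolerance values are assumptions made purely for illustration.

```python
import numpy as np


def looks_like_leakage(center, neighbor, times, threshold,
                       max_onset_gap=0.05, min_correlation=0.9):
    """Guess whether a neighbor's signal is leakage from the center note.

    The tolerances are arbitrary illustrative values, not AudioExplorer settings.
    """
    center = np.asarray(center, dtype=float)
    neighbor = np.asarray(neighbor, dtype=float)

    # Check 1: the center note must be the strongest signal.
    if neighbor.max() >= center.max():
        return False

    def onset(profile):
        above = np.nonzero(profile > threshold)[0]
        return times[above[0]] if above.size else None

    # Check 2: both notes must rise above threshold at about the same time.
    center_onset, neighbor_onset = onset(center), onset(neighbor)
    if center_onset is None or neighbor_onset is None:
        return False
    if abs(center_onset - neighbor_onset) > max_onset_gap:
        return False

    # Check 3: the two profiles must have similar shapes.
    correlation = np.corrcoef(center, neighbor)[0, 1]
    return bool(correlation >= min_correlation)
```

A neighbor that starts noticeably later than the center note, as in Figure 4A, fails the second check and is treated as an independent note.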
|
|
|
|

|
|
|
|
|
|
|
Overtone Merging |
|
| If "pitch" as perceived by
the human ear is the dominant frequency of a sound, then the
overtones comprise the sound's "quality",
"character", or "timbre".
Dealing with overtones is, in a word, impossible. A
single note played on a violin might have a strong first
overtone, resulting in a strong signal one octave above the
fundamental note. Alternatively, the same violin might be
playing that note on one string and the note one octave higher
on another string, perhaps resulting in a very similar pair of
signals. Without assuming monophony, there is no way to
distinguish between these two possibilities without extra
information provided by you, the omniscient audio explorer.
By marking a note region with "Merge Overtones",
AudioExplorer is instructed to assume that the marked region is
a fundamental, and all available overtones are to be merged into
it. |
|
|
| Figure 5: |
When to merge overtones |
|
Figure 5 shows the profiles of a note (A4) and of
its overtones. Signals for the first (A5) and second (E6)
overtones are strong enough to have been interpreted as
"notes" by AudioExplorer. The profile shapes of
these notes are all quite similar, and it is quite likely that
they are in fact related as overtones.
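
The overtones of a note fall at integer multiples of its fundamental frequency, so the nearest MIDI note for each one can be computed directly. The sketch below reproduces the A4, A5, E6 relationship from Figure 5. The helper names are hypothetical, and the octave labels assume the common convention that MIDI note 60 is C4.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def overtone_notes(fundamental, count=4):
    """MIDI notes nearest to the first `count` overtones of a fundamental."""
    notes = []
    for harmonic in range(2, count + 2):
        # A frequency ratio of `harmonic` is 12 * log2(harmonic) semitones.
        note = fundamental + round(12 * math.log2(harmonic))
        if note <= 127:  # stay within the MIDI note range
            notes.append(note)
    return notes


def note_name(note):
    """Name a MIDI note, assuming note 60 is written as C4."""
    return f"{NOTE_NAMES[note % 12]}{note // 12 - 1}"
```

For A4 (MIDI note 69), the first two overtones land on notes 81 and 88, which are A5 and E6.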
|
|
 |
|
|
|
|
|
|
|
|
|
|
|
| Generation of MIDI from
the Note Regions
To generate MIDI, a new set of "derived" note regions is
created from the original set. In creating the derived
regions:
- "hidden" regions are excluded.
- for regions marked "Merge Neighbors", regions from
neighboring notes covering the same time interval are merged and
removed.
- for regions marked "Merge Overtones", regions from the
overtones covering the same time interval are merged and removed.
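
The three rules above can be sketched as follows. The `Region` class and its field names are hypothetical, "covering the same time interval" is interpreted as any overlap in time, and merged regions are simply dropped; AudioExplorer's actual handling may differ on each of these points.

```python
import math
from dataclasses import dataclass


@dataclass
class Region:
    note: int            # MIDI note number
    start: float         # seconds
    end: float           # seconds
    hidden: bool = False
    merge_neighbors: bool = False
    merge_overtones: bool = False


def overlaps(a, b):
    """True if the two regions share any stretch of time (assumed merge test)."""
    return a.start < b.end and b.start < a.end


def absorbed_notes(region, overtone_count=4):
    """Notes whose overlapping regions are merged into `region`."""
    notes = set()
    if region.merge_neighbors:
        notes.update({region.note - 1, region.note + 1})
    if region.merge_overtones:
        notes.update(region.note + round(12 * math.log2(k))
                     for k in range(2, overtone_count + 2))
    return notes


def derive_regions(regions):
    """Apply the hidden, Merge Neighbors, and Merge Overtones rules."""
    visible = [r for r in regions if not r.hidden]
    removed = set()
    for region in visible:
        targets = absorbed_notes(region)
        for index, other in enumerate(visible):
            if other is not region and other.note in targets and overlaps(region, other):
                removed.add(index)
    return [r for i, r in enumerate(visible) if i not in removed]
```

Each region returned by `derive_regions` would then yield exactly one MIDI note event.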
With all hidden and merged regions removed, generation of MIDI data
from this derived set of note regions is straightforward - one MIDI note
event is created from each remaining note region. Note Velocities
are calculated from the note's signal amplitude, threshold, and maximum:
|