RISM search: matching methods

All the methods we use in this demo are based on alignment. The incipits in the RISM dataset are parts of melodies. A melody is a sequence of notes. With an alignment algorithm we are going to match notes (or groups of notes) to each other. For every pair of notes, one from each sequence, we must have a measure of how well they match. This measure is computed by a rater.

Notes have a few characteristics that we take into account: pitch, duration and metric weight. We represent pitch with the Base 40 number system. This system has nice properties if we want to compute with intervals (subtract them) or if we want to transpose a piece (add a constant). We normally try to be invariant to transpositions of the melody. For this, we use a histogram technique. The pitches of the melodies are stored in a two histograms. A pitch shift is then computed to get maximal overlap of the histograms.

Duration is measured as an amount of time. Metric weight is computed using Inner Metric Analysis. This method finds the stress of a note based solely on the onsets (starting moment) of notes.

Description of the methods

All of the matching methods take pitch into account. The pitch raters can be combined with the metric weight and/or duration. We first describe the pitch raters and then ways of combining them.

Pitch raters

All pitch raters first transpose one of the melodies. We do this, using a histogram approach. The pitch values are stored in a histogram. Then a shift is computed that maximises the overlap of the histogram bins. All pitch raters yield values between -1.0 and 1.0.

Exact Pitch: If the notes have equal pitch, they have a benefit of 1. If they differ exactly by one or more octaves, the benefit is 0.5. Otherwise, the benefit is -1 (a penalty).
Zigzag Pitch: This method favours notes that are 'a little off' over notes that are further removed (modulo an octave). An exact match (modulo an octave) gives a benefit of 1. The benefit decreases linearly to -1 at a distance of 20 and then increases linearly to 1 at a distance of 40 (an octave).
Kranenburg Pitch: This method is described in the thesis of Peter van Kranenburg: A computational Approach to Content-Based Retrieval of Folk Song Melodies (6.3.2). We compare the difference in pitch, modulo an octave. The larger the interval, the worse the match value (from 1 to -1). This method is described in the thesis of Peter van Kranenburg: A computational Approach to Content-Based Retrieval of Folk Song Melodies (6.3.2). It was the most succesful method in that context, but is not very intuitive.

Weight based raters

We can combine the pitch raters in two ways with metric weight as computed with the inner metric analysis (ima) method.

Ima weighted: In this method the influence of a note depends on its metric weight. We scale the weights of the notes of each melody such that the average weight is 1. We then take the value computed by the pitch rater and multiply it by the average of the weights of the two notes that are compared.
Ima combined: In this method, the metric weight has a more independent character than in the previous method. We look at the absolute difference between the two metric weight values and combine this with the value produced by the pitch rater.

Duration based raters

The methods described above do not take the duration of notes into account. Here we describe two methods that do. They are basically the same, except for a preprocessing step.

Duration raters are combined with one of the raters described above (the base rater). They align one note of a melody with one or more notes of the other melody. The value produced by the base rater is multiplied by the duration that the two notes overlap.

If the base rater is an ima weighted rater, we take a variant of this rater where the average weight is computed taking the duration into account.

Fixed duration: Takes the duration as notated without any scaling.
Scaled duration: Uses histograms to collect the duration values and multiplies the note duration of the melodies such that the overlap of their histograms is maximised. This will find similarities if one of the melodies is notated in double tempo. It will work worse for variations on a theme, though.