Does Mozart use sonata form?

A number of works composed by Haydn, Mozart and Beethoven are recognized as in sonata forms, especially first movements of string quartets, concerti, symphonies, and piano sonatas.

transactions.ismir.net - Learning Sonata Form Structure on Mozart's String Quartets

The results show that the sonata form is better identified when the parameters are learned rather than manually set up. We also study how the granularity of the model (i.e. the number of possible states) influences the success of the detection (Section 4). Reproducible MIR research needs to be grounded on publicly available datasets. Here, we systematically study a corpus containing most of the sonata-form movements in Mozart’s string quartets, and we release an open dataset providing two independent analyses of each movement, encoded manually, based on formal modeling of sonata form (Section 2). Extending the approach we introduced before ( Bigo et al., 2017 ), we propose several models of sonata form using Hidden Markov Models for which parameters, emission probabilities, and transition probabilities are automatically learned on the corpus. The states of the HMMs represent the different sections of a sonata form and the observations consist of binary analytical features computed through the pieces (Section 3). We discuss the relationship between the occurrences of these features and the sonata form sections. Finally, some research in the MIR community specifically targets sonata form structure: Jiang and Müller ( 2013 ) detected exposition/recapitulation pairs in Beethoven piano sonatas with self-similarity matrices. They also traced transpositions and harmonic changes through the different parts. Weiß and Müller ( 2014 ) proposed a model of “tonal complexity” and mapped it on sections of sonata forms. Baratè et al. ( 2005 ) introduced a model of sonata form structure based on Petri Nets. We previously proposed a model based on a Hidden Markov Model (HMM) emitting analytical features ( Bigo et al., 2017 ). This model relied on human expertise, following the layout of sonata form as presented by Hepokoski and Darcy ( 2006 ). 
This previous approach was applied to a small set of pieces and the parameters of the model were hard-coded, based on music theory assumptions. MIR modeling of high-level structures has also been employed in the field of music generation, wherein algorithms often have difficulties in producing long-term coherence. Herremans and Chew ( 2017 ) proposed to formulate this task as a combinatorial optimization problem. Nika et al. ( 2016 ) used harmonic scenarios to produce structured music improvisation. Medeot et al. ( 2018 ) elaborated a Recurrent Neural Network trained on a dataset of structural elements. On the one hand, “analyzing a sonata form”, which implies identifying the boundaries of its successive sections, often requires a number of musicological judgments that are piece-specific, which makes its automation difficult. Being strongly linked to music history, music analysis may indeed include ideas that involve the singularity of the piece, a comparison between composers as well as some aesthetic considerations. On the other hand, music analyses are often built upon specific analytical elements, like themes or patterns that structure the harmony and the texture of the piece. Analyses can therefore be modelled with Music Information Retrieval (MIR) algorithms that can be properly evaluated. Finally, the identification of a large-scale structure such as the sonata form requires the combination of these local features to reach a piece-level analysis, which is itself a challenge for MIR research. We previously reviewed research on computational analysis of musical form ( Giraud et al., 2015 ). Chen et al. ( 2004 ) proposed to segment the musical piece into sections called “sentences”, clustering phrases predicted by the LDBM algorithm by Cambouropoulos ( 2001 ). Rafael and Oertl ( 2010 ) built a global structure from patterns extracted by the algorithm from Hsu et al. ( 1998 ). Some studies, such as by Hamanaka et al. 
( 2016 ), have attempted to compute large-scale structures as theorized by Schenker ( 1935 ) or later by the Generative Theory of Tonal Music (GTTM) of Lerdahl and Jackendoff ( 1983 ). Other works also modeled specific large-scale features, such as tonal tension ( Lerdahl and Krumhansl, 2007 ; Farbood, 2010 ). A number of works composed by Haydn, Mozart and Beethoven are recognized as in sonata forms, especially first movements of string quartets, concerti, symphonies, and piano sonatas. However, the theories about the classical sonata form were introduced almost fifty years after its early golden era ( Reicha, 1824 ; Marx, 1845 ; Czerny, 1848 ). One of its earliest formalizations seems to be the grande coupe binaire that Reicha ( 1824 ) described 30 years after Mozart died. The sonata form finally became a normative structure for several generations of romantic composers, being transmitted both through explicit teaching as well as implicit exposure. The large-scale structure referred to as sonata form is a post-hoc formalization of a widely used composer practice since the middle of the 18th century. It is built on a piece-level tonal path concept involving both a primary thematic zone (P) and a contrasting secondary thematic zone (S) (Figure 1 ). This creates a polarization between two tonalities and induces a dramatic turn to the piece. The sonata form can be viewed as an evolution of both aria and concerto Baroque forms ( Rosen, 1980 ; Hepokoski and Darcy, 2006 ). Greenberg ( 2017 ) investigated how sonata-form recapitulation may have come from both the double return of the tonic key and the parallel endings in a two-part movement. The annotation sets described above are distributed as Supplementary Files and at http://www.algomus.fr/data/ under the Open Database License (ODbL v1.0). 
These analyses are encoded as json files containing labels, each label being defined by a type (Structure/Cadence/Harmony), by an onset and possibly by a duration (Figure 3 , right). Moreover, they are available through Dezrann, an interactive web platform for music annotation and analysis ( Giraud et al., 2018 , http://www.dezrann.net/ ). Finally, 3 out of these 32 movements are differently annotated in the two sets of analyses: We see some movements as sonata forms, while Flothuis favors the loosened two-part form (K155.2, K168.2, K172.2). Moreover, he did not consider the form including a continuous exposition without a medial caesura (K458.1, K499.1). Indeed, Mozart frequently “reopens” PACs by repeating S material. He often restates the immediately preceding cadential progression and sometimes expands it. Thus, we identify an EEC when we encounter a PAC if what follows has not been heard shortly before. “(…) one could not consider S to be completed if either it or its cadential material is immediately restated. The PAC that ends the first statement of S proposes an EEC: by repeating the melody or a portion thereof, the composer reopens the PAC and shifts the EEC forward to the next PAC.” Despite some divergences (see Figure 3 ), 77% of the P/TR/S/C labels of A start at the same location in F. The majority of the differences between A and F occur when annotating the start of C. Indeed, Flothuis usually identifies the end of the S section on the first encountered PAC. On the contrary, Caplin ( 1998 ) usually extends S until a last strong PAC providing a conclusion to the theme or to a group of themes, and keeps in C only post-cadential material called codettas. We follow here the first-PAC rule as stated and nuanced by ( Hepokoski and Darcy, 2006, p. 120 and 156 ): The two encodings were done independently. They total 1939 labels, including more than 600 section labels and more than 500 cadences. 
A reference annotation requires an agreement on a set of sections that need to be identified but also on the location of their boundaries. Some structural elements, such as the location of the cadences or the boundaries of the S theme, are especially subject to debate, and some of them may even be non-pertinent. For instance, there may be no precise border between P and TR. Reference datasets with divergent analyses may thus be particularly helpful. Following the above notations, we encoded two sets of analyses of the 32 sonata forms included in the corpus (Figure 3 ): Between 1770 and 1790, Mozart composed 23 string quartets totaling 86 movements ( King, 1968 ). We denote by K171.4 the 4th movement of K171. Out of these 86 movements, 42 are in sonata form, including 4 rondo sonata movements (K171.4, K173.1, K465.4, and K499.4), and 6 movements with special forms (K155.2, K168.2, K170.3, K171.1, K458.1, and K499.1). Special forms may include sections in unusual places, as for example the introduction and a “written” repeat of P’ and TR’ before the Coda in K171.1, or a strong bithematic unity (K168.2, continuous exposition in K458.1 “The Hunt” and K499.1). Ten out of these 42 sonata forms were left out because of unavailable clean encoding (K158.2, K160.1, K160.2, K160.3, K169.2, K170.3, K458.4, K464.1, K499.4, K575.1). Note that the dataset does not include pieces with an unusual sonata-form structure, such as K387.2, which is a minuet in sonata form without development, or K387.4, which is a fugue-sonata. The corpus used in this work includes 32 sonata-form movements of string quartets composed by Mozart. The pieces are encoded as .krn Humdrum files ( Huron, 2002 ) downloaded from http://github.com/musedata/humdrum-mozart-quartets . These files were originally available from http://kern.humdrum.org and encoded by Edmund Correia, Jr. and Frances Bennion. 
Figure 2 displays layouts of sonata form at different granularity, including the sections described above along with short transitional sections. Some of these sections or transitional states may be skipped, leading to forward transitions between non-adjacent states. These models are seen as topologies of Hidden Markov Models, detailed in Section 3. Annotating musical structure is challenging, subjective, and may involve different hypotheses from the analyst. Although different analysts might model sonata forms differently, there are points of consensus. In this work, we follow the notations of Hepokoski and Darcy ( 2006 ). Basically, a sonata form is built by following a piece-level tonal path involving a primary thematic zone (P) and a contrasting secondary thematic zone (S). This is illustrated in Figure 1 on a specific movement. that may be present or absent at each quarter note and a Hidden Markov Model predicting the structure based on these features. Analysis features describe harmony, melody, or other local elements. In this section, we present the different models used in our experiments (section 3.1), the analysis features selected for this study (section 3.2), and the learning method used to set up the parameters of the model (section 3.3). as well as to every valuewith i ≤ j (preventing backward transitions). Note that we considered that the features are independent both in the learning phase and when using the models. This is not true in the general case, especially for features that are mutually exclusive such as the tonality features, but this nevertheless allows for a practical approximation. andbe the observed counts of transitions and emissions on the learning corpus, andthe total duration of the section i on the learning corpus. 
Any transition or emission probability can be computed by the following ratios:

The parameters of the HMM can be learned by relating the section boundaries that are manually annotated in the whole corpus and the analysis features that are computed at each quarter note.

Note that all features are somewhat heuristic and may not be perfect. Nevertheless, the next section will show that some of them are significantly present or absent in some sections of the sonata form and that they may be used to learn the sonata-form structure.

The absence or presence of each feature is computed at every quarter note in every piece of the corpus. Features occurring at the limit between two sections are counted in both sections.

All the features consider only information on note pitches and durations as well as on rests. They do not look at any other information such as annotation marks, dynamics, or repeat bars. In particular, in almost all the pieces of the corpus, repeat bars are found at the end of the exposition and could ease the analysis. However, even without this repeat bar, this boundary is almost always unambiguous and can be predicted by automated methods.

We added the following two new features that may match more closely particular sections of the sonata form, like the Medial Caesura (Figure 5):

In (Bigo et al., 2017), we selected binary features “according to whether their presence or absence could be characteristic of (…) sections in a sonata form”. We first included these features:

The probability that the model follows a path P = (p_1, …, p_t), entering by an input state p_1 and exiting from an output state p_t, while outputting the sequence A_1 … A_t, one state outputting some symbols at each step, is given by:

A path is defined by a t-tuple of integers P = (p_1, …, p_t) ∈ [1, n]^t, meaning that the path goes through the t states q_{p_1} … q_{p_t}.
We also consider a sequence of sets of symbols A_1 … A_t, each of them being a subset of 𝒜. Since several features can be predicted at the same step, any state may simultaneously output a set of symbols A ⊂ 𝒜. The probabilities of the initial state and of the final state are respectively represented by π = (π_1, …, π_n) and τ = (τ_1, …, τ_n). T(i, j) is the transition probability, i.e. the probability that the state q_i goes to the state q_j, and E(i, α_k) is the emission probability, i.e. the probability that the state q_i emits the feature α_k. An HMM on 𝒜 is defined by a set of n states Q = {q_1, …, q_n} corresponding to the successive sections of sonata form. We experimented with different sets of states targeting several model topologies (Figure):
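The quantities defined above (initial probabilities π, final probabilities τ, transitions T, emissions E, and a set of features observed at each step) are enough to sketch how a section path can be decoded. The following Python sketch is an illustration, not the implementation used in the study. It assumes that features are independent and that an absent feature contributes 1 − E(i, α_k), and the two-state model and its numbers are invented for the example.

```python
import math


def safe_log(x):
    """Logarithm that maps a zero probability to minus infinity."""
    return math.log(x) if x > 0 else -math.inf


def emission_log_prob(e_row, observed):
    """Log-probability that a state emits exactly the feature set `observed`.

    Assumption of this sketch: features are independent, so a present
    feature contributes E and an absent one contributes 1 - E.
    """
    return sum(
        safe_log(p if feature in observed else 1.0 - p)
        for feature, p in e_row.items()
    )


def viterbi(pi, tau, T, E, observations):
    """Return the most likely sequence of state indices."""
    n = len(pi)
    # score[i]: best log-probability of a path ending in state i
    score = [safe_log(pi[i]) + emission_log_prob(E[i], observations[0])
             for i in range(n)]
    back = []
    for obs in observations[1:]:
        new_score, pointers = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: score[i] + safe_log(T[i][j]))
            pointers.append(best_i)
            new_score.append(score[best_i] + safe_log(T[best_i][j])
                             + emission_log_prob(E[j], obs))
        score = new_score
        back.append(pointers)
    # The last state is weighted by its probability of being final
    last = max(range(n), key=lambda i: score[i] + safe_log(tau[i]))
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical two-state model: state 0 stands for P, state 1 for S
pi = [1.0, 0.0]                    # the path must start in P
tau = [0.0, 1.0]                   # and must end in S
T = [[0.8, 0.2], [0.0, 1.0]]       # no backward transition from S to P
E = [{"ton:I": 0.9, "ton:V": 0.1}, {"ton:I": 0.1, "ton:V": 0.9}]
observations = [{"ton:I"}, {"ton:I"}, {"ton:V"}, {"ton:V"}]
decoded = viterbi(pi, tau, T, E, observations)
print(decoded)
```

Working in log space avoids numerical underflow on long pieces, and the zero entry in T is what forbids a backward move in this toy topology.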

4 Evaluation and Results

Our experiments, including the computation of the analysis features and the HMM parameters, and the implementation of the Viterbi algorithm, were done in Python 3 within the music21 framework (Cuthbert and Ariza, 2010), extended with analytic labels (Bagan et al., 2015). Every analytical feature was computed at each quarter note of every piece included in the corpus. Their occurrences in the corpus are discussed below. To avoid overfitting, the learning strategy was evaluated with Leave-One-Piece-Out cross-validation. The sonata-form structure was predicted on each of the 32 pieces by the four HMMs described above, their parameters being learned on the 31 remaining pieces of the corpus. The cross-validation process was conducted on the whole corpus, as the size and the heterogeneity of the corpus did not allow for a separate test set dedicated to a final evaluation. Note that we did not identify any hyperparameter in the model that we tried to optimize, apart from the various topologies and feature subsets that are discussed below.
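The Leave-One-Piece-Out protocol described above can be sketched in a few lines of Python. This is only an outline of the splitting logic: the `train` and `predict` callables, the piece names, and the majority-label "model" are hypothetical stand-ins, not the HMM learning used in the study.

```python
from collections import Counter


def leave_one_piece_out(pieces, train, predict):
    """Predict every piece with a model trained on all the other pieces."""
    predictions = {}
    for held_out, data in pieces.items():
        training = {name: d for name, d in pieces.items() if name != held_out}
        model = train(training)            # never sees the held-out piece
        predictions[held_out] = predict(model, data)
    return predictions


def train_majority(training):
    """Toy 'model': the most frequent section label in the training pieces."""
    labels = Counter(label for piece in training.values() for label in piece)
    return labels.most_common(1)[0][0]


def predict_constant(model, piece):
    """Toy prediction: the same label at every position of the piece."""
    return [model] * len(piece)


# Hypothetical corpus of three pieces with one section label per position
toy_corpus = {
    "piece_a": ["P", "P", "P", "P"],
    "piece_b": ["P", "S"],
    "piece_c": ["S", "S", "S"],
}
toy_predictions = leave_one_piece_out(toy_corpus, train_majority, predict_constant)
print(toy_predictions)
```

With 32 pieces, the same loop would train 32 models, each on the 31 remaining pieces.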

The results of the computation of the analysis features, as well as the learned probabilities, can be downloaded from http://www.algomus.fr/data/.

4.1 Discussion on feature statistics

Table 1 shows the number of occurrences of the computed features within the 18 sections of the sonata form as indicated by the annotation set A. Comparing occurrences of features or other elements against their expected number in “random” situations helps to evaluate their significance (Conklin and Anagnostopoulou, 2001). For example, the first primary zones (P) span 1130 quarter notes, that is 7.9% of the 14318 quarter notes of the corpus. In all the corpus, ton:I is activated on 4491 quarter notes. Should this feature be randomly distributed, ton:I would be activated on about 354 (= 4491 × 7.9%) quarter notes in P. However, there are actually 553 quarter notes out of these 1130 quarter notes in P where ton:I is activated.

Table 1: occurrences of each feature in each section. Each cell gives the observed count followed by the expected count in parentheses. An asterisk marks a count that is significantly above (presence) or below (absence) the expected one. A ≪ or ≫ at the start of a cell sits between that column and the column to its left.

| Feature | quarters | Intro | P | TR | MC | S | C | TC | d | Dev | RT |
|---|---|---|---|---|---|---|---|---|---|---|---|
| pat:P | 2448 | 35 (20) | ≪ 858* (193) | ≫ 341* (235) | 15 (13) | ≫ 23* (250) | ≫ 0* (200) | 0* (13) | 0* (12) | 10* (385) | 8* (50) |
| pat:S | 3008 | 0* (25) | 0* (237) | ≪ 482* (289) | 36* (16) | 686* (308) | ≫ 304* (246) | 6 (16) | 0* (14) | 8* (473) | 0* (61) |
| ton:I | 4491 | 41 (38) | 553* (354) | ≫ 311* (432) | 22 (23) | 229* (460) | 222* (367) | 18 (24) | ≫ 0* (22) | ≪ 360* (707) | ≪ 115 (91) |
| ton:II | 510 | 0 (4) | 18* (40) | ≪ 74 (49) | 12* (2) | 72 (52) | 79* (41) | 1 (2) | 14* (2) | ≫ 107 (80) | 2 (10) |
| ton:III | 479 | 0 (4) | 27 (37) | 54 (46) | 4 (2) | 70 (49) | 64* (39) | 8 (2) | 5 (2) | 125* (75) | 9 (9) |
| ton:IV | 1734 | 31* (14) | 186* (136) | ≫ 104* (166) | 1 (9) | 34* (177) | 36* (141) | ≪ 21 (9) | 4 (8) | 168* (273) | 25 (35) |
| ton:V | 2514 | 0* (21) | 41* (198) | ≫ 467* (242) | 36* (13) | 683* (257) | 468* (205) | ≫ 1* (13) | 4 (12) | 327* (396) | 70 (51) |
| ton:VI | 479 | 6 (4) | ≫ 12* (37) | ≫ 54 (46) | 2 (2) | 98* (49) | 66* (39) | 0 (2) | 2 (2) | ≫ 90 (75) | ≫ 0* (9) |
| ton:VII | 386 | 9 (3) | 20 (30) | 30 (37) | 0 (2) | 14* (39) | 23 (31) | 4 (2) | 0 (1) | 94* (60) | 1 (7) |
| ton:i | 892 | 3 (7) | 84 (70) | 52* (85) | 3 (4) | 16* (91) | 23* (72) | ≪ 12 (4) | 22* (4) | 148 (140) | ≪ 42* (18) |
| ton:ii | 534 | 4 (4) | 33 (42) | ≫ 10* (51) | 0 (2) | 13* (54) | 17* (43) | 1 (2) | 0 (2) | 156* (84) | 8 (10) |
| ton:iii | 356 | 6 (3) | 5* (28) | ≪ 70* (34) | 4 (1) | 68* (36) | 46 (29) | 0 (1) | ≪ 12* (1) | ≫ 48 (56) | 14 (7) |
| ton:iv | 349 | 12* (2) | 6* (27) | 0* (33) | 0 (1) | 16 (35) | ≪ 51* (28) | 0 (1) | 8 (1) | 114* (54) | 18 (7) |
| ton:v | 460 | 0 (3) | 22 (36) | ≪ 69 (44) | 5 (2) | 38 (47) | 8* (37) | 3 (2) | 1 (2) | 162* (72) | ≫ 2 (9) |
| ton:vi | 1052 | 0 (8) | 46* (83) | 73 (101) | 2 (5) | 112 (107) | 51* (86) | ≪ 14 (5) | ≫ 0 (5) | ≪ 368* (165) | ≫ 0* (21) |
| ton:vii | 187 | 3 (1) | 0* (14) | ≪ 22 (18) | 0 (0) | 21 (19) | 36* (15) | 0 (1) | 4 (0) | 14 (29) | 0 (3) |
| cad:PAC | 416 | 4 (3) | 20 (32) | 22 (40) | ≪ 9 (2) | 48 (42) | 72* (34) | 1 (2) | 0 (2) | 29* (65) | 4 (8) |
| cad:rIAC | 142 | 2 (1) | ≫ 16 (11) | 8 (13) | 3 (0) | 15 (14) | 9 (11) | 0 (0) | 0 (0) | 29 (22) | 1 (2) |
| harm:# | 144 | 2 (1) | 13 (11) | 18 (13) | 6 (0) | ≫ 7 (14) | 5 (11) | 1 (0) | 1 (0) | 27 (22) | 1 (2) |
| harm:7 | 1122 | 4 (9) | 49* (88) | ≪ 116 (108) | 0 (5) | 68* (115) | 86 (91) | ≪ 18* (6) | 3 (5) | 271* (176) | 17 (22) |
| ped | 971 | 10 (8) | 116* (76) | ≫ 76 (93) | 0 (5) | 42* (99) | 66 (79) | 1 (5) | 0 (4) | 186 (152) | 20 (19) |
| rest | 331 | 6 (2) | 35 (26) | ≫ 11* (31) | ≪ 12* (1) | ≫ 15 (33) | ≪ 38 (27) | 4 (1) | 2 (1) | 39 (52) | ≪ 18 (6) |
| seq | 1254 | 24 (10) | 57* (99) | 61* (120) | 2 (6) | 95 (128) | 60* (102) | 3 (6) | ≪ 29* (6) | ≫ 420* (197) | ≫ 0* (25) |
| unison | 685 | 16 (5) | 91* (54) | ≫ 43 (65) | 7 (3) | 43 (70) | 59 (56) | ≪ 24* (3) | 27* (3) | ≫ 68* (107) | 12 (13) |
| break | 482 | 1 (4) | 24 (38) | 50 (46) | ≪ 12* (2) | ≫ 36 (49) | 47 (39) | 5 (2) | 1 (2) | 74 (75) | 9 (9) |
| hammer | 268 | 0 (2) | 14 (21) | 8* (25) | ≪ 14* (1) | ≫ 49* (27) | 20 (21) | 0 (1) | 0 (1) | 52 (42) | 7 (5) |
| Total | 14318 | 122 | 1130 | 1378 | 76 | 1468 | 1171 | 78 | 71 | 2255 | 292 |

Table 1 (continued):

| Feature | quarters | r | P’ | TR’ | MC’ | S’ | C’ | TC’ | Coda |
|---|---|---|---|---|---|---|---|---|---|
| pat:P | 2448 | 5 (5) | ≪ 770* (184) | ≫ 317* (243) | 15 (12) | ≫ 20* (264) | ≫ 0* (218) | 0* (17) | 32* (125) |
| pat:S | 3008 | 0 (6) | 1* (227) | ≪ 444* (299) | 34* (15) | 692* (324) | ≫ 307 (269) | ≫ 0* (20) | 7* (153) |
| ton:I | 4491 | 20 (10) | 582* (339) | ≫ 524* (447) | 44* (23) | 640* (484) | 497* (401) | ≪ 17 (31) | ≪ 295* (229) |
| ton:II | 510 | 0 (1) | 16* (38) | 28 (50) | 5 (2) | 25* (54) | ≪ 56 (45) | 0 (3) | 0* (26) |
| ton:III | 479 | 1 (1) | 18 (36) | 24* (47) | 0 (2) | 16* (51) | ≪ 46 (42) | 6 (3) | ≫ 1* (24) |
| ton:IV | 1734 | 0 (3) | 156 (130) | 230* (172) | 9 (9) | 269* (187) | 256* (155) | 16 (12) | 188* (88) |
| ton:V | 2514 | 4 (5) | 36* (189) | ≪ 132* (250) | 13 (13) | 112* (271) | 77* (224) | 10 (17) | 34* (128) |
| ton:VI | 479 | 3 (1) | 18 (36) | 35 (47) | 0 (2) | 58 (51) | 27 (42) | 0 (3) | 7* (24) |
| ton:VII | 386 | 0 (0) | 17 (29) | ≪ 80* (38) | 3 (2) | 35 (41) | 27 (34) | ≪ 12* (2) | ≫ 18 (19) |
| ton:i | 892 | 5 (2) | 115* (67) | 148* (88) | 10 (4) | 109 (96) | 46* (79) | ≪ 21* (6) | ≫ 32 (45) |
| ton:ii | 534 | 0 (1) | 44 (40) | 64 (53) | 2 (2) | 81 (57) | 36 (47) | 8 (3) | 58* (27) |
| ton:iii | 356 | 0 (0) | 9* (26) | 18 (35) | 0 (1) | 26 (38) | 28 (31) | 2 (2) | 0* (18) |
| ton:iv | 349 | 0 (0) | 9* (26) | 9* (34) | 0 (1) | 23 (37) | ≪ 78* (31) | 0 (2) | 4 (17) |
| ton:v | 460 | 0 (1) | 24 (34) | 43 (45) | 0 (2) | 33 (49) | 27 (41) | 2 (3) | 21 (23) |
| ton:vi | 1052 | 0 (2) | 57 (79) | 77 (104) | 2 (5) | 104 (113) | 68 (94) | 2 (7) | 77 (53) |
| ton:vii | 187 | 0 (0) | 0* (14) | ≪ 34 (18) | 1 (0) | 23 (20) | 17 (16) | 7 (1) | 5 (9) |
| cad:PAC | 416 | 0 (0) | 26 (31) | 25 (41) | 8 (2) | 46 (44) | 73* (37) | 1 (2) | 28 (21) |
| cad:rIAC | 142 | 0 (0) | 18 (10) | 5 (14) | 1 (0) | 17 (15) | 9 (12) | 0 (0) | 9 (7) |
| harm:# | 144 | 0 (0) | 14 (10) | 14 (14) | 4 (0) | 16 (15) | 7 (12) | 2 (1) | 6 (7) |
| harm:7 | 1122 | 1 (2) | 53* (84) | 93 (111) | 0 (5) | 100 (121) | ≪ 196* (100) | 12 (7) | 35 (57) |
| ped | 971 | 1 (2) | 131* (73) | ≫ 67 (96) | 0 (5) | 47* (104) | ≪ 84 (86) | 2 (6) | ≪ 120* (49) |
| rest | 331 | 5 (0) | 35 (24) | ≫ 14 (32) | ≪ 14* (1) | ≫ 16 (35) | 33 (29) | 3 (2) | 31 (16) |
| seq | 1254 | 0 (2) | 58* (94) | ≪ 172* (125) | 3 (6) | 118 (135) | 120 (112) | 0 (8) | 32* (64) |
| unison | 685 | 8* (1) | 96* (51) | ≫ 44 (68) | 7 (3) | 43* (73) | 49 (61) | ≪ 27* (4) | ≫ 22 (35) |
| break | 482 | 3 (1) | 24 (36) | 52 (48) | ≪ 14* (2) | ≫ 37 (52) | 52 (43) | 5 (3) | 36 (24) |
| hammer | 268 | 1 (0) | 14 (20) | 8* (26) | ≪ 12* (1) | ≫ 46 (28) | 16 (23) | 0 (1) | 10 (13) |
| Total | 14318 | 32 | 1081 | 1427 | 74 | 1545 | 1280 | 99 | 732 |

For each feature and each section, p-values are estimated by an exact Fisher test computed by the Python scipy package. Fisher tests are computed independently. To account for the large number of tests, both on features and on sections, only features with p-values under 10⁻⁴ are considered as significant, either by their presence or by their absence (both marked *). For example, as expected, the feature ton:I is significantly present in P and significantly absent in S (both times p < 10⁻³⁰). The ≫ and ≪ symbols between two adjacent columns show the features which can be considered as significant to distinguish these two states, again with a 10⁻⁴ threshold on another Fisher test. For example, the feature ton:II is significantly more present in TR than in P (p < 10⁻⁹), even if it is not significantly present in TR compared to all sections.

Although most features are not specific to a section, many of them differ significantly from one section to another, which confirms their pertinence for the task of sonata form detection. A first observation is that the expected tonal path is confirmed by the ton:x features. Indeed, ton:I is met for most of the P quarter notes while ton:V and ton:III (dominant and relative major tonalities) are significantly present in S. This highlights the opposition between the two tonal zones of the exposition.
As expected, this “large-scale dissonance” is resolved by the recapitulation. Indeed, both P’ and S’ are characterized by a high prevalence of ton:I. Another result considering the tonality features is the symmetry between TR and TR’. Whereas TR usually induces an ascending fifth move from ton:I to ton:V, our results confirm that, in TR’, Mozart often moves to ton:IV (called a tonal adjustment by Caplin (1998) or a feint by Rosen (1980) and Hepokoski and Darcy (2006)) in order to reach S’ in ton:I with a move of the same interval. The Perfect Authentic Cadences (PAC) are significantly present in C and C’, and only there. Indeed, S and S’ generally end with a strong structural EEC and ESC although the rest of S and S’ do not significantly contain cadences. The thematic pattern pat:P is significantly present for P and P’, but also for TR and TR’. This is because the starts of TR and TR’ are often the same. The thematic pattern pat:S is significantly present for S and S’, but also for TR, C, TR’ and C’. This is because the part of the exposition that is exactly transposed often starts (contrarily to Figure 1) inside TR and continues through S’ and C’. Features break, harm:#, and rest are especially significant on MC and MC’. Some of these features are triggered by the themes in P/P’ or S/S’ at relevant places. Long harmonic sequences and pedals significantly appear in the developments, but they are also present in other sections. In the small transitional sections before the development (TC, d), before the recapitulation (r), and before the Coda (TC’), many unisons are encountered, but again they are significantly found at other places as well. 
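The comparison between observed and expected counts used throughout this discussion can be reproduced with a short computation. The sketch below uses only the Python standard library and a one-sided exact hypergeometric tail, which is an assumption of this example: the study reports Fisher tests computed with scipy, whose exact options are not given here. The four numbers are those quoted in the text for ton:I in P.

```python
from math import comb


def expected_count(feature_total, section_length, corpus_length):
    """Occurrences expected in a section if the feature were spread at random."""
    return feature_total * section_length / corpus_length


def upper_tail_p_value(observed, feature_total, section_length, corpus_length):
    """Exact probability of at least `observed` occurrences in the section.

    One-sided hypergeometric tail: the section is treated as a random draw
    of `section_length` quarter notes among `corpus_length`.
    """
    others = corpus_length - feature_total
    favourable = sum(
        comb(feature_total, k) * comb(others, section_length - k)
        for k in range(observed, min(feature_total, section_length) + 1)
    )
    # Exact integer arithmetic; the final division yields a float
    return favourable / comb(corpus_length, section_length)


# ton:I in the primary zones P, with the counts quoted in the text
expected = expected_count(4491, 1130, 14318)
p_value = upper_tail_p_value(553, 4491, 1130, 14318)
print(round(expected), p_value < 1e-4)
```

The expected count rounds to 354, the value given in Table 1, and the tail probability falls far below the 10⁻⁴ threshold, consistent with ton:I being reported as significantly present in P.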
4.2 Ability to retrieve the sonata-form structure We evaluate the performance of the four HMMs with learned parameters M 3 , M 7 , M 14 , and M 18 M18 documentclass[10pt]{article} usepackage{wasysym} usepackage[substack]{amsmath} usepackage{amsfonts} usepackage{amssymb} usepackage{amsbsy} usepackage[mathscr]{eucal} usepackage{mathrsfs} usepackage{pmc} usepackage[Euler]{upgreek} pagestyle{empty} oddsidemargin -1.0in egin{document} [ {{cal M}_3},,{{cal M}_7},,{{cal M}_{14}},,( m and),{{cal M}_{18}} ] end{document} , as well as the HMM with hard-coded parameters proposed previously ( Bigo et al., 2017 M 14 * M19 documentclass[10pt]{article} usepackage{wasysym} usepackage[substack]{amsmath} usepackage{amsfonts} usepackage{amssymb} usepackage{amsbsy} usepackage[mathscr]{eucal} usepackage{mathrsfs} usepackage{pmc} usepackage[Euler]{upgreek} pagestyle{empty} oddsidemargin -1.0in egin{document} [ {cal M}_{14}^* ] end{document} . 4.2.1 Evaluation measures Tables 2 (focus on quarter notes) and 3 (focus on boundaries) show the performance of the five HMMs using the cross-validation process described above on the 32 pieces of the corpus. Table 2 shows F 1 -measures for all the considered classifiers and for each predicted label. The top table further shows the confusion matrix for M 18 M20 documentclass[10pt]{article} usepackage{wasysym} usepackage[substack]{amsmath} usepackage{amsfonts} usepackage{amssymb} usepackage{amsbsy} usepackage[mathscr]{eucal} usepackage{mathrsfs} usepackage{pmc} usepackage[Euler]{upgreek} pagestyle{empty} oddsidemargin -1.0in egin{document} [ {{cal M}_{18}} ] end{document} that details for each predicted label (rows), the number of corresponding quarter notes in the reference annotation (columns). For example, the second row shows that 36 quarter notes are predicted as P but are labeled Intro in the reference annotation (false positives), whereas 751 quarter notes are labeled as P (true positives). 
Q 18 Intro P TR MC S C TC d Dev RT r P’ TR’ MC’ S’ C’ TC’ Coda Intro 0 154 30 3 · · · · · · · · · · · · · · P 36 751 238 12 4 · · · · · · · · · · · · · TR 1 86 175 10 121 47 · 28 35 · · 16 32 4 29 · · · MC 1 4 19 3 6 · · · · · · · · · 1 · · · S 1 · 608 27 588 357 2 · 11 · · · · · 30 9 · · C 1 2 40 6 364 355 10 · 202 · 5 6 · · · 38 · 0 TC · 23 68 · 5 1 0 21 114 · · 12 · · · · · · d 3 · 29 · 5 2 6 6 101 · · 9 · · · · · · Dev 49 85 134 11 268 353 60 16 1320 87 3 67 56 2 62 110 32 36 RT 30 24 20 · 30 12 · · 393 141 14 57 51 3 12 · · 5 r · · · · · · · · 20 25 2 49 2 · · · · · P’ · · 1 · 1 · · · 0 · 7 713 282 11 35 14 TR’ · · · · · · · · · · · 46 161 4 174 3 · 1 MC’ · · 1 · 1 · · · · · · 2 18 8 6 · · 3 S’ · · 14 3 73 14 · · · · · 7 549 20 471 393 · 16 C’ · · · · · 25 · · 21 15 · 58 197 10 353 213 32 58 TC’ · · · · · 4 · · 34 · · 9 45 3 42 49 11 12 Coda · · · · · · · · 2 24 · 28 32 8 328 463 24 587 quarter notes 122 1130 1378 76 1468 1171 78 71 2255 292 32 1081 1427 74 1545 1280 99 732 F 1 ( M 18 M23 documentclass[10pt]{article} usepackage{wasysym} usepackage[substack]{amsmath} usepackage{amsfonts} usepackage{amssymb} usepackage{amsbsy} usepackage[mathscr]{eucal} usepackage{mathrsfs} usepackage{pmc} usepackage[Euler]{upgreek} pagestyle{empty} oddsidemargin -1.0in egin{document} [ {{cal M}_{18}} ] end{document} , c-val.) 
0.00 0.69 0.18 0.05 0.38 0.32 0.00 0.05 0.53 0.26 0.03 0.66 0.18 0.15 0.30 0.19 0.07 0.53 F 1 (equal) 0.00 0.56 0.14 0.04 0.29 0.24 0.00 0.00 0.30 0.12 0.20 0.42 0.02 0.00 0.19 0.15 0.00 0.26 F 1 (fixed) 0.02 0.15 0.18 0.01 0.19 0.15 0.01 0.01 0.27 0.04 0.00 0.14 0.18 0.01 0.19 0.16 0.01 0.10 Q 14 P TR MC S C d Dev RT r P’ TR’ MC’ S’ C’ quarter notes 1130 1378 76 1468 1250 71 2255 292 32 1081 1427 74 1562 2095 F 1 ( M 14 M24 documentclass[10pt]{article} usepackage{wasysym} usepackage[substack]{amsmath} usepackage{amsfonts} usepackage{amssymb} usepackage{amsbsy} usepackage[mathscr]{eucal} usepackage{mathrsfs} usepackage{pmc} usepackage[Euler]{upgreek} pagestyle{empty} oddsidemargin -1.0in egin{document} [ {{cal M}_{14}} ] end{document} , c-val.) 0.76 0.17 0.05 0.38 0.28 0.05 0.58 0.25 0.03 0.66 0.18 0.15 0.28 0.56 F 1 ( M 14 * M25 documentclass[10pt]{article} usepackage{wasysym} usepackage[substack]{amsmath} usepackage{amsfonts} usepackage{amssymb} usepackage{amsbsy} usepackage[mathscr]{eucal} usepackage{mathrsfs} usepackage{pmc} usepackage[Euler]{upgreek} pagestyle{empty} oddsidemargin -1.0in egin{document} [ {cal M}_{14}^* ] end{document} ) 0.66 0.35 0.03 0.27 0.26 0.04 0.16 0.14 0.02 0.29 0.33 0.09 0.29 0.61 F 1 (equal) 0.40 0.05 0.00 0.20 0.04 0.00 0.16 0.08 0.11 0.23 0.00 0.00 0.12 0.31 F 1 (fixed) 0.15 0.18 0.01 0.19 0.16 0.01 0.27 0.04 0.00 0.14 0.18 0.01 0.20 0.26 Q7 P S C Dev P’ S’ C’ quarter notes 2582 1471 1321 2580 2580 1565 2095 F 1 ( M 7 M26 documentclass[10pt]{article} usepackage{wasysym} usepackage[substack]{amsmath} usepackage{amsfonts} usepackage{amssymb} usepackage{amsbsy} usepackage[mathscr]{eucal} usepackage{mathrsfs} usepackage{pmc} usepackage[Euler]{upgreek} pagestyle{empty} oddsidemargin -1.0in egin{document} [ {{cal M}_{7}} ] end{document} , c-val.) 
                                   0.65  0.37  0.25  0.68  0.54  0.33  0.54
$F_1$ (equal)                      0.50  0.36  0.23  0.44  0.39  0.18  0.37
$F_1$ (fixed)                      0.31  0.19  0.17  0.31  0.31  0.20  0.26

$Q_{3}$                            Exp   Dev   Rec
quarter notes                      5374  2580  6240
$F_1$ ($\mathcal{M}_{3}$, c-val.)  0.76  0.57  0.85
$F_1$ (equal)                      0.41  0.30  0.68
$F_1$ (fixed)                      0.55  0.31  0.61

that details, for each predicted label (rows), the number of corresponding quarter notes in the reference annotation (columns). For example, the second row shows that 36 quarter notes are predicted as P but are labeled Intro in the reference annotation (false positives), whereas 751 quarter notes predicted as P are also labeled P (true positives).

To evaluate whether the model benefits from learning transition probabilities, we also compared the learned models to HMMs with “equal” transition probabilities (restricted to forward transitions) but with learned emission probabilities. We also show the best $F_1$-measure for “fixed” classifiers, which always predict the same section. For example, the “fixed” classifier for $Q_{18}$ on P always predicts P on the 14318 quarter notes of the corpus and has an $F_1$-measure of 0.15, far below the $F_1$-measure of 0.69 obtained by $\mathcal{M}_{18}$.

In Table 3, the first four columns (main boundaries) show the results of the evaluation on four boundaries (starts of sections S, Dev, P’ and S’) corresponding to milestones in the tonal path of sonata form. The last four columns (all boundaries) show the results of the evaluation when considering the boundaries of all modeled sections. In what follows, the prediction of a section boundary is considered “correct” (+ or =) if its distance from the corresponding boundary in the reference annotation is at most 3 measures.

                                        main boundaries (total: 124)    all boundaries
                                        +     =     –     !             +     =     –     !
$\mathcal{M}_{14}^*$                    23    4     54    43            68    21    154   115
$\mathcal{M}_{18}$                      34    17    53    20            90    45    147   104
$\mathcal{M}_{14}$                      31    16    56    21            87    38    146   87
$\mathcal{M}_{7}$                       35    12    61    16            70    15    101   30
$\mathcal{M}_{3}$                       16    8     40    0             46    8     42    0
$\mathcal{M}_{18}$, no pat:P/pat:S      13    7     97    7             32    29    229   96
$\mathcal{M}_{18}$, no ton:*            3     11    100   10            32    31    236   87
$\mathcal{M}_{18}$, no cad:*            35    16    57    16            90    40    159   97
$\mathcal{M}_{18}$, only ton:*          3     8     104   9             24    27    247   88
$\mathcal{M}_{18}$, no break features   33    12    61    18            85    36    168   97

4.2.2 Prediction evaluation

For the majority of the sections, the learned HMMs have much better $F_1$-measures than HMMs with equal transition probabilities, showing that the model can benefit from learned transitions. The HMM $\mathcal{M}_{14}^*$ with hard-coded parameters successfully predicted 27 main boundaries (22%) and 89 out of all boundaries (25%). Table 3 shows that learning parameters using the very simple $\mathcal{M}_{3}$ model gives a poor prediction, with 24 main boundaries correctly predicted. Indeed, as $\mathcal{M}_{3}$ merges P and S themes, even most tonality features are not very significant.
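The “fixed” baseline values reported for $Q_{3}$ can be checked by hand. The following is a minimal sketch, assuming the $F_1$-measure is computed per quarter note over the 14318 quarter notes of the corpus; the function name and the per-note counting are illustrative assumptions, not the authors' code. A classifier that always predicts one section has perfect recall on that section, and every other quarter note counts as a false positive.

```python
def fixed_classifier_f1(section_size: int, corpus_size: int) -> float:
    """F1 of a classifier that labels every quarter note with one section.

    Assumption: evaluation is done per quarter note.
    """
    true_pos = section_size                 # every note of the section is retrieved
    false_pos = corpus_size - section_size  # all other notes are wrongly labeled
    false_neg = 0                           # no note of the section is missed
    return 2 * true_pos / (2 * true_pos + false_pos + false_neg)


CORPUS_QUARTER_NOTES = 14318  # corpus size quoted in the text
Q3_SECTION_SIZES = {"Exp": 5374, "Dev": 2580, "Rec": 6240}  # from the Q3 table

fixed_f1 = {
    name: round(fixed_classifier_f1(size, CORPUS_QUARTER_NOTES), 2)
    for name, size in Q3_SECTION_SIZES.items()
}
print(fixed_f1)  # {'Exp': 0.55, 'Dev': 0.31, 'Rec': 0.61}
```

Under this assumption the three values match the $F_1$ (fixed) row of the $Q_{3}$ table, which suggests that a long section such as Rec gives even a trivial classifier a fairly high score.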
Better predictions are achieved by $\mathcal{M}_{7}$, $\mathcal{M}_{14}$, and $\mathcal{M}_{18}$. The model $\mathcal{M}_{14}$ correctly predicts 47 main boundaries (38%) and 125 (35%) out of all boundaries, improving on the results obtained by the HMM with hard-coded parameters. $F_1$-measures are also improved for most of the sections. Even better results are obtained with $\mathcal{M}_{18}$ (41% and 38%). However, $\mathcal{M}_{18}$ models many sections. Some of the 18 corresponding states appear too rarely in the pieces of the corpus to be consistently learned by the model, as shown by the very low $F_1$-measure on sections Intro, TC, d, RT, and TC’. For example, the Intro section is found in only two movements in the whole corpus, leading to incorrect predictions between Intro and P sections.

Note that many false positives reported in the confusion matrix for $\mathcal{M}_{18}$ come from only a few pieces. Indeed, 132 of the 134 = 49 + 85 quarter notes predicted as Dev instead of Intro or P come from the wrong prediction on K465.1 (see below and Figure 7), and 60 out of the 61 = 25 + 21 + 15 quarter notes predicted as C’ instead of C, Dev, or RT come from the wrong prediction of K171.1 (data not shown).
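The percentages quoted for main boundaries follow directly from the “+” and “=” columns of Table 3. The short sketch below reproduces that arithmetic; the dictionary layout and the function name are illustrative choices, and only the counts and the total of 124 come from the table.

```python
MAIN_BOUNDARIES_TOTAL = 124  # total number of main boundaries, from Table 3

# (+, =) counts for the main boundaries, copied from Table 3
main_counts = {
    "M14*": (23, 4),
    "M3": (16, 8),
    "M7": (35, 12),
    "M14": (31, 16),
    "M18": (34, 17),
}


def correct_main_boundaries(plus: int, equal: int) -> tuple[int, int]:
    """Return (number of correct boundaries, rounded percentage of the total)."""
    correct = plus + equal  # a boundary is "correct" when marked + or =
    return correct, round(100 * correct / MAIN_BOUNDARIES_TOTAL)


summary = {
    model: correct_main_boundaries(plus, equal)
    for model, (plus, equal) in main_counts.items()
}
for model, (correct, percent) in summary.items():
    print(f"{model}: {correct} correct main boundaries ({percent}%)")
```

This gives 27 (22%) for the hard-coded model, 24 for $\mathcal{M}_{3}$, 47 (38%) for $\mathcal{M}_{14}$, and 51 (41%) for $\mathcal{M}_{18}$, in line with the figures in the text.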
Table 3 also shows the results on $\mathcal{M}_{18}$ while restricting the set of features. This confirms that pat:P and pat:S features are important to ground the prediction, but other features also contribute, even if the cadence features do not appear to improve the detection.

Finally, Figure 6 details the success of the prediction for the start of each section. Apart from the trivial start of P, the best-predicted boundary is the start of P’, that is, the start of the recapitulation. Whereas $\mathcal{M}_{14}^*$, the HMM with hard-coded parameters proposed previously, predicts 9 starts of P’ exactly or within 1 measure compared to A, models $\mathcal{M}_{3}$, $\mathcal{M}_{7}$, $\mathcal{M}_{14}$, and $\mathcal{M}_{18}$ respectively predict 10, 15, 17, and 18 such boundaries. As P’ always appears in the reference, no spurious P’ is predicted. This success in detecting the start of P’ is likely to come from the correlation between this section and the features representing both the thematic pattern pat:P and the tonality ton:I, which is strongly captured by the model, as Table 1 attests. TR and TR’ sections are badly predicted, especially at their start, which may be caused by the blend between P/P’ and TR/TR’ in our model.

As a global result, $\mathcal{M}_{18}$ correctly predicts the sections of 8 movements, only some sections of 20 movements, and incorrectly the sections of 4 movements.
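The two tolerances used in this evaluation (within 1 measure, and at most 3 measures for a boundary to count as correct) can be expressed as a small helper. This is a hedged sketch: the function name, the returned labels, and the measure numbers in the usage lines are hypothetical, and no mapping to the “+” and “=” symbols of Table 3 is asserted.

```python
def boundary_distance_class(predicted_measure: int, reference_measure: int) -> str:
    """Classify a predicted section start by its distance to the reference.

    Thresholds follow the text: 1 measure (exact or near-exact) and
    3 measures (still counted as correct). The labels are illustrative.
    """
    distance = abs(predicted_measure - reference_measure)
    if distance <= 1:
        return "within 1 measure"
    if distance <= 3:
        return "within 3 measures"
    return "too far"


def is_correct(predicted_measure: int, reference_measure: int, tolerance: int = 3) -> bool:
    """A boundary counts as correct when it lies at most `tolerance` measures away."""
    return abs(predicted_measure - reference_measure) <= tolerance


# Hypothetical measure numbers, for illustration only
print(boundary_distance_class(101, 100))  # within 1 measure
print(boundary_distance_class(103, 100))  # within 3 measures
print(boundary_distance_class(110, 100))  # too far
```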
