<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.9.0">Jekyll</generator><link href="http://davidglavas.github.io/feed.xml" rel="self" type="application/atom+xml" /><link href="http://davidglavas.github.io/" rel="alternate" type="text/html" /><updated>2021-11-03T17:29:40+00:00</updated><id>http://davidglavas.github.io/feed.xml</id><title type="html">David Glavas</title><subtitle>A blog about computer science related topics.</subtitle><entry><title type="html">Not so Social GAN</title><link href="http://davidglavas.github.io/Not-so-Social-GAN/" rel="alternate" type="text/html" title="Not so Social GAN" /><published>2020-10-02T12:44:00+00:00</published><updated>2020-10-02T12:44:00+00:00</updated><id>http://davidglavas.github.io/Not%20so%20Social%20GAN</id><content type="html" xml:base="http://davidglavas.github.io/Not-so-Social-GAN/">&lt;p align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://github.com/davidglavas/BlogFigures/blob/master/_posts/Figures/2019-07-23-Not%20so%20Social%20GAN/cover44slowww.gif?raw=true&quot; /&gt;
&lt;/p&gt;

&lt;h3 id=&quot;tldr&quot;&gt;TL;DR&lt;/h3&gt;
&lt;p&gt;We take a look at &lt;a href=&quot;https://github.com/agrimgupta92/sgan&quot;&gt;Social GAN’s pooling module&lt;/a&gt;, which, according to the authors, helps the model “predict socially acceptable trajectories which avoid collisions”. Surprisingly, our experiments show that models without the pooling module tend to have &lt;strong&gt;just as many or fewer&lt;/strong&gt; collisions than models with the pooling module. We also show that models with a Random-Pooling function match the performance of the original Social GAN models with a Max-Pooling function. We conclude that Social GAN’s pooling module may not be as effective at encouraging socially acceptable trajectories as claimed in the paper.&lt;/p&gt;

&lt;h3 id=&quot;what-is-social-gan-about&quot;&gt;What is Social GAN about?&lt;/h3&gt;
&lt;p&gt;While moving, people obey a large number of unwritten common sense rules that comply with social conventions. The ability to model these rules and use them to understand and predict human motion in complex real world environments is valuable for the development of socially aware systems. For example, an autonomous vehicle should be able to predict the future positions of pedestrians and adjust its path to avoid collisions. The problem of trajectory prediction can be viewed as a sequence generation task, where we are interested in predicting the future trajectories of people based on their past positions. In this case, future and past positions are sequences of coordinates (x, y tuples). Following the recent successes of Recurrent Neural Networks (RNNs) for sequence prediction tasks, the authors use an RNN Encoder-decoder framework to predict people’s future trajectories, given their past trajectories.&lt;/p&gt;

&lt;p&gt;The main contributions of Social GAN are its approach to consider interactions between people (pooling module) and the way it encourages diverse sample generation (variety loss function). Next, we will take a closer look at Social GAN’s pooling module.&lt;/p&gt;

&lt;h3 id=&quot;whats-wrong-with-social-gan&quot;&gt;What’s wrong with Social GAN?&lt;/h3&gt;
&lt;p&gt;According to the authors, the pooling module causes the models to “predict socially acceptable trajectories which avoid collisions.” Surprisingly, the experiments that we will look at next contradict this claim. We will see that models without the pooling module tend to have &lt;strong&gt;just as many or fewer&lt;/strong&gt; collisions than models with the pooling module.&lt;/p&gt;

&lt;p&gt;The authors provide pre-trained models with and without the pooling module (the 20V-20 and 20VP-20 models from the paper). They trained two models—prediction lengths 8 and 12—for each of the five datasets. That’s twenty models in total, ten with and another ten without the pooling module. You can find these exact models &lt;a href=&quot;https://github.com/agrimgupta92/sgan&quot;&gt;here&lt;/a&gt;. Note that these are pre-trained by the author and we didn’t modify these models in any way. By using the evaluation script provided by the authors we get the same results they do.&lt;/p&gt;

&lt;p&gt;Here is one example situation in which two pedestrians walk next to each other. The prediction differs from the ground truth, but it looks socially acceptable. Note that colors differentiate pedestrians, dots are ground truth trajectories, and dashed lines are predictions of a model:&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://github.com/davidglavas/BlogFigures/blob/master/_posts/Figures/2019-07-23-Not%20so%20Social%20GAN/exampleSlow.gif?raw=true&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;Each situation consists of a sequence of frames, and each frame consists of the current position (x, y coordinates) of all pedestrians in the situation. We say that a situation contains a collision if there is a collision in any of its frames, and that there is a collision in a frame if at least two pedestrians are closer to each other than a given threshold. So for a higher threshold we expect more collisions, and for a high enough threshold we expect all situations to contain collisions. We are interested in a threshold for which we can say that the resulting collisions are not socially acceptable. Then we can use this threshold to detect predictions that are not socially acceptable.&lt;/p&gt;
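
&lt;p&gt;The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the experiments: the function names and the toy coordinates are made up for this example, and a situation is assumed to be a list of frames, each frame being a list of (x, y) tuples.&lt;/p&gt;

```python
import math
from itertools import combinations


def frame_has_collision(positions, threshold):
    """True if any two pedestrians in one frame are closer than the threshold."""
    return any(
        threshold > math.dist(a, b)
        for a, b in combinations(positions, 2)
    )


def situation_has_collision(frames, threshold):
    """A situation contains a collision if any of its frames does."""
    return any(frame_has_collision(frame, threshold) for frame in frames)


def collision_percentage(situations, threshold):
    """Percentage of situations that contain at least one collision."""
    if not situations:
        return 0.0
    hits = sum(situation_has_collision(s, threshold) for s in situations)
    return 100.0 * hits / len(situations)


# Two hypothetical situations, each with two frames and two pedestrians.
apart = [[(0.0, 0.0), (1.0, 0.0)], [(0.0, 0.5), (1.0, 0.5)]]
converging = [[(0.0, 0.0), (1.0, 0.0)], [(0.5, 0.5), (0.55, 0.5)]]

print(collision_percentage([apart, converging], threshold=0.1))  # 50.0
```

&lt;p&gt;Raising the threshold to 2.0 in this toy example flags both situations, which mirrors the observation that a high enough threshold makes every situation contain a collision.&lt;/p&gt;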

&lt;p&gt;For each frame we compute the Euclidean distance between all pairs of pedestrians. If any of these distances is below the threshold, we classify the situation that this frame belongs to as containing a collision. The following figure shows the percentage of situations that contain collisions for different thresholds, for all of the pretrained models that the authors published (20V-20 and 20VP-20 with prediction lengths 8 and 12). Each of the four charts describes five of the twenty models that the authors provided:&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://github.com/davidglavas/BlogFigures/blob/master/_posts/Figures/2019-07-23-Not%20so%20Social%20GAN/collisionFigure.jpg?raw=true&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;Interestingly, the pooling module doesn’t seem to reduce the number of collisions. The models without the pooling module tend to have &lt;strong&gt;just as many or fewer&lt;/strong&gt; collisions than the models with the pooling module. This contradicts the rationale behind the pooling module, which is supposed to help the model avoid collisions. As expected, the more crowded datasets UNIV and ZARA2 have more collisions: when more pedestrians pass each other there is less space, so we expect them to get closer to each other. Also as expected, the number of collisions doesn’t change significantly between the prediction lengths. If anything, a larger prediction length tends to result in more collisions, which makes sense since longer predicted trajectories offer more chances for collisions.&lt;/p&gt;

&lt;p&gt;Next we are interested in inspecting specific situations which have been classified as having a collision. We evaluate the pretrained 20VP-20 model that is provided by the authors on ETH (prediction length 8). Let’s take a look at two arbitrarily chosen situations that were classified as having a collision (threshold = 0.1):&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://github.com/davidglavas/BlogFigures/blob/master/_posts/Figures/2019-07-23-Not%20so%20Social%20GAN/combinedCollisionExamples.gif?raw=true&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;We see that despite the pooling module, the predicted trajectories deviate from the ground truth in a way that’s not socially acceptable. In the first example, the trajectories of the two pedestrians converge into one trajectory. In the second example, the blue and green pedestrians collide while crossing each other’s path.&lt;/p&gt;

&lt;p&gt;As described in the paper, Social GAN uses Max-Pooling as the pooling function. To see how important a specific pooling function is, we replace the Max-Pooling with Mean-Pooling (instead of selecting the maximum value, compute the mean over all values) and Random-Pooling (instead of selecting the maximum value, randomly select one of the values). Note that we only swap out the max pooling function; we leave the rest of the pooling module as well as the training and evaluation procedures as they are. For the different pooling modules we compute the same two error metrics as in the paper. The Average Displacement Error (ADE) is the average L2 distance between the ground truth and the prediction over all predicted time steps. The Final Displacement Error (FDE) is the distance between the predicted final destination and the true final destination at the last time step. We add a third metric to evaluate the social acceptability of the predicted trajectories: the percentage of situations that contain a collision for a given threshold. The following table contains the three metrics across all datasets for the pretrained models provided by the authors (20V-20 and 20VP-20 with prediction length 8), the retrained 20VP-20 model (no modifications, simply the models from the paper retrained with exactly the same arguments as used by the authors), and the two pooling module variations (exactly the same as the original 20VP-20 models except for the swapped pooling function).&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://github.com/davidglavas/BlogFigures/blob/master/_posts/Figures/2019-07-23-Not%20so%20Social%20GAN/EvaluationTable.png?raw=true&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;We see that none of the pooling functions improves the collision metric. Surprisingly, as measured by the ADE and FDE, the 20VP-20 model with the random pooling function matches the performance of the other models.&lt;/p&gt;
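
&lt;p&gt;To make the swapped pooling functions and the two error metrics concrete, here is a small sketch in plain Python. It is an illustration under assumptions, not the experiment code: the function names are hypothetical, each neighbor is assumed to contribute one feature vector, and pooling is assumed to be applied separately to each feature dimension. The actual pooling module works on tensors of learned embeddings.&lt;/p&gt;

```python
import math
import random


def pool(neighbor_features, mode, rng=random):
    """Combine one feature vector per neighbor into a single vector."""
    columns = list(zip(*neighbor_features))  # one tuple per feature dimension
    if mode == "max":
        return [max(col) for col in columns]
    if mode == "mean":
        return [sum(col) / len(col) for col in columns]
    if mode == "random":
        return [rng.choice(col) for col in columns]
    raise ValueError("unknown pooling mode: " + mode)


def ade(predicted, truth):
    """Average L2 distance over all predicted time steps."""
    return sum(math.dist(p, t) for p, t in zip(predicted, truth)) / len(predicted)


def fde(predicted, truth):
    """L2 distance between the final predicted and final true position."""
    return math.dist(predicted[-1], truth[-1])


features = [[1.0, 5.0], [3.0, 2.0]]  # two neighbors, two feature dimensions
print(pool(features, "max"))   # [3.0, 5.0]
print(pool(features, "mean"))  # [2.0, 3.5]

truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
predicted = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
print(ade(predicted, truth), fde(predicted, truth))  # 1.0 2.0
```

&lt;p&gt;In this sketch, Random-Pooling returns, for every feature dimension, a value taken from one arbitrarily chosen neighbor, so it carries no deliberate selection of socially relevant information.&lt;/p&gt;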

&lt;p&gt;The code for the discussed experiments can be found &lt;a href=&quot;https://github.com/davidglavas/sgan-experiments&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;</content><author><name>davidglavas</name></author><category term="blog" /><category term="Pedestrian Trajectory Prediction" /><category term="Generative Adversarial Networks (GANs)" /><summary type="html"></summary></entry><entry><title type="html">Create your first adversarial examples</title><link href="http://davidglavas.github.io/quickly-craft-your-first-adversarial-examples/" rel="alternate" type="text/html" title="Create your first adversarial examples" /><published>2020-07-06T14:15:00+00:00</published><updated>2020-07-06T14:15:00+00:00</updated><id>http://davidglavas.github.io/quickly-craft-your-first-adversarial-examples</id><content type="html" xml:base="http://davidglavas.github.io/quickly-craft-your-first-adversarial-examples/">&lt;p align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://github.com/davidglavas/BlogFigures/blob/master/_posts/Figures/2019-04-06-quickly-craft-your-first-adversarial-examples/coverComparison.png?raw=true&quot; /&gt;
&lt;/p&gt;

&lt;h2 id=&quot;tldr&quot;&gt;TL;DR&lt;/h2&gt;
&lt;p&gt;The goal of this post is to help you quickly create (craft, generate, construct, call it whatever you want) your first adversarial examples. We use Keras running on top of TensorFlow to train the target neural network, then we craft the adversarial examples and demonstrate their effect on the target network. Train the target network yourself by running &lt;a href=&quot;https://github.com/davidglavas/Craft-your-first-adversarial-examples/blob/master/trainTargetModel.py&quot;&gt;this&lt;/a&gt; or download it &lt;a href=&quot;https://github.com/davidglavas/Craft-your-first-adversarial-examples/blob/master/MNIST_model.h5&quot;&gt;here&lt;/a&gt;. Craft the adversarial examples by running &lt;a href=&quot;https://github.com/davidglavas/Craft-your-first-adversarial-examples/blob/master/craftAdversarialExamples.py&quot;&gt;this&lt;/a&gt;, make sure to have the target network in the same directory.&lt;/p&gt;

&lt;h3 id=&quot;what-are-adversarial-examples&quot;&gt;What are adversarial examples?&lt;/h3&gt;
&lt;p&gt;An adversarial example is an input to a machine learning (ML) model that has been intentionally designed to cause the model to malfunction. We will see how to generate such inputs for deep neural networks using one of the earliest methods, the fast gradient sign method (FGSM). But first, let’s motivate the study of adversarial examples by taking a look at some of the threats they pose to deployed models in the real world. Feel free to skip this part if you want to craft asap.&lt;/p&gt;

&lt;p&gt;Most early adversarial example research was performed under unrealistic conditions, but recently there has been an increasing number of more practical attacks [1]. Many works assume the adversary to have full access to the target model (white-box), but Papernot et al. have shown that attacks are possible even if the adversary has no access to the underlying model (black-box) [2]. Even state of the art machine learning models that are being offered as a service have been shown to be vulnerable to such black-box attacks [3]. Most works assume a threat model in which the adversary can feed data directly into the classifier on a digital level, but researchers have shown that adversarial examples that are printed onto paper and are perceived through a camera by the target network, are also classified incorrectly [4]. Researchers even printed 3D adversarial objects that are robust towards viewpoint shifts, camera noise, and other natural transformations [5].&lt;/p&gt;

&lt;p&gt;Sharif et al. create physical adversarial examples to deceive state of the art neural network based face detection and commercial face recognition systems [6]. These systems are widely used for various sensitive purposes such as surveillance and access control. They print a pair of eyeglass frames, which allows the adversary who wears them to evade being recognized or to impersonate another individual. Their attack is physically realizable and inconspicuous, meaning that they create not a digital, but a physical adversarial accessory which doesn’t attract the attention of humans (e.g., a security guard), but which effectively turns the carrier of the accessory into an adversarial example.&lt;/p&gt;

&lt;p&gt;Another interesting practical attack involves the use of adversarial examples to deceive a road sign recognition network [7]. Eykholt et al. apply stickers to road signs which cause the target network to interpret a physical stop sign as a speed limit 45 sign. They show that attackers can physically modify objects such as road signs to reliably cause classification errors in deep learning based systems under widely varying distances, angles, and resolutions.&lt;/p&gt;

&lt;p&gt;Despite these threats, and despite the many approaches to protect neural networks that have been proposed, there is no known reliable defense against adversarial examples so far.&lt;/p&gt;

&lt;h3 id=&quot;train-the-target-network&quot;&gt;Train the target network&lt;/h3&gt;
&lt;p&gt;Adversarial examples need a target, some model to deceive. You can use any neural network really, but you might need to adapt the crafting process if the model’s interface changes. Here we train a Keras sequential model that achieves &amp;gt;99% accuracy on MNIST. You can either run the &lt;a href=&quot;https://github.com/davidglavas/Craft-your-first-adversarial-examples/blob/master/trainTargetModel.py&quot;&gt;code&lt;/a&gt; and train the model yourself, or &lt;a href=&quot;https://github.com/davidglavas/Craft-your-first-adversarial-examples/blob/master/MNIST_model.h5&quot;&gt;download&lt;/a&gt; the trained model and proceed to the next section. Note that the training might take well over an hour if you run TensorFlow on a CPU.&lt;/p&gt;

&lt;h3 id=&quot;craft-the-adversarial-examples&quot;&gt;Craft the adversarial examples&lt;/h3&gt;
&lt;p&gt;At this point you should have a trained Keras sequential model stored on disk. You can craft the adversarial examples by running &lt;a href=&quot;https://github.com/davidglavas/Craft-your-first-adversarial-examples/blob/master/craftAdversarialExamples.py&quot;&gt;this&lt;/a&gt; (put the target model in the same directory as the script). The script loads the MNIST dataset, loads the trained model, crafts adversarial examples for the trained model from the MNIST test set, and stores the adversarial examples on disk. Congratulations, you have crafted 10000 adversarial examples, 61.31% of which cause the target model to return the wrong result. Note that this is the same model that achieves &amp;gt;99% accuracy on the original test set.&lt;/p&gt;

&lt;p&gt;For a given example, FGSM computes the derivative of the model’s loss function with respect to each pixel, then it modifies each pixel in the direction of the gradient by a chosen perturbation size $\epsilon$. Given an example $x$, this method computes an adversarial example $x^*$ as&lt;/p&gt;

\[x^{*} = x + \epsilon \cdot \text{sign}(\nabla_x J(\theta, x, y)),\]

&lt;p&gt;where $J(\theta, x, y)$ is the target model’s loss function, with $\theta$ as the model’s parameters, and $y$ as the label of the given example $x$.&lt;/p&gt;
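
&lt;p&gt;The update rule above can be tried out without any deep learning framework. The following sketch is not the crafting script linked earlier; it applies the same rule to a tiny logistic regression model, for which the gradient of the cross-entropy loss with respect to the input is simply (p - y) times the weights. The weights, the input, and the clipping to the range [0, 1] are assumptions made for this illustration.&lt;/p&gt;

```python
import math


def sign(v):
    """Return -1, 0 or 1 depending on the sign of v."""
    return (v > 0) - (0 > v)


def predict(w, b, x):
    """Probability of class 1 under a logistic regression model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def loss_gradient_wrt_input(w, b, x, y):
    """Gradient of the cross-entropy loss with respect to the input x."""
    p = predict(w, b, x)
    return [(p - y) * wi for wi in w]


def fgsm(w, b, x, y, epsilon):
    """Move every input value by epsilon in the direction that increases the loss."""
    grad = loss_gradient_wrt_input(w, b, x, y)
    perturbed = [xi + epsilon * sign(g) for xi, g in zip(x, grad)]
    return [min(1.0, max(0.0, v)) for v in perturbed]  # keep a valid pixel range


w, b = [2.0, -3.0, 1.0], 0.0   # hypothetical model parameters
x, y = [0.6, 0.2, 0.5], 1      # hypothetical input with true label 1

x_adv = fgsm(w, b, x, y, epsilon=0.25)
print(round(predict(w, b, x), 3))      # 0.75
print(round(predict(w, b, x_adv), 3))  # 0.401
```

&lt;p&gt;The model assigns the original input to class 1 with probability 0.75, while the perturbed input drops below 0.5 and is therefore assigned to the wrong class. For a deep network the only difference is that the gradient is obtained by backpropagation instead of a closed-form expression.&lt;/p&gt;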

&lt;p&gt;Let’s take a closer look at the effect of an adversarial example on the target network. On the left we see a natural, on the right an adversarial example:&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://github.com/davidglavas/BlogFigures/blob/master/_posts/Figures/2019-04-06-quickly-craft-your-first-adversarial-examples/fourSideBySide.png?raw=true&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;Here is the output of the model’s softmax layer for the left (natural) example:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;p&quot;&gt;[[&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.000000&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.000000&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.000000&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.000000&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.999992&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.000000&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.000000&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.000000&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.000000&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.000008&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Here is the output for the right (adversarial) example:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;p&quot;&gt;[[&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.051990&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.006389&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.050749&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.010629&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.253468&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.015621&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.035309&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.007679&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.120034&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.448132&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We can interpret each of the numbers as the model’s certainty. The number at the i-th index is how certain the model is that the given example belongs to the i-th class. The index of the greatest number in this array corresponds to the predicted class. We can see that the network is almost completely certain that the left image is a four, whereas for the right image it predicts a nine, although with a much lower certainty of about 0.45.&lt;/p&gt;
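
&lt;p&gt;Reading the predicted class off the softmax output amounts to taking the index of the largest value. A short sketch using the two outputs shown above (the helper name is made up for this example):&lt;/p&gt;

```python
natural = [0.000000, 0.000000, 0.000000, 0.000000, 0.999992,
           0.000000, 0.000000, 0.000000, 0.000000, 0.000008]
adversarial = [0.051990, 0.006389, 0.050749, 0.010629, 0.253468,
               0.015621, 0.035309, 0.007679, 0.120034, 0.448132]


def predicted_class(probabilities):
    """Index of the largest softmax value, i.e. the predicted digit."""
    return max(range(len(probabilities)), key=probabilities.__getitem__)


print(predicted_class(natural), natural[predicted_class(natural)])
# 4 0.999992
print(predicted_class(adversarial), adversarial[predicted_class(adversarial)])
# 9 0.448132
```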

&lt;p&gt;Note that generated adversarial examples don’t necessarily succeed at deceiving the target network. Out of the 10000 adversarial examples we created, 38.69% fail at deceiving the target network. We can see one such case by examining the following example:&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://github.com/davidglavas/BlogFigures/blob/master/_posts/Figures/2019-04-06-quickly-craft-your-first-adversarial-examples/twoSideBySide.png?raw=true&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;Here is the output for the left (natural) example:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;p&quot;&gt;[[&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.000000&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.000000&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;1.000000&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.000000&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.000000&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.000000&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.000000&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.000000&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.000000&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.000000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Here is the output for the right (adversarial) example:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;p&quot;&gt;[[&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.163652&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.104641&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.446153&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.014896&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.021194&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.021860&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.091441&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.009633&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.096387&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.030143&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We see that for the natural example, the model is certain that it’s a two. For the adversarial example, the model’s certainty that it’s a two drops significantly, but it still thinks it’s a two.&lt;/p&gt;

&lt;p&gt;To conclude, we trained a neural network on MNIST that achieves &amp;gt;99% accuracy on the test set. Then we used FGSM to craft an adversarial test set for which the same network achieves 38.69% accuracy. In the end, we examined the effect of adversarial examples on the model’s softmax output. Note that there are more powerful attack algorithms that result in less perceptible perturbations of the image, and that cause the target model to make mistakes with much higher certainty. For example, performing the same experiment using Carlini and Wagner’s attack (C&amp;amp;W) results in the target model having less than 1% accuracy on the adversarial test set.
&lt;br /&gt;&lt;br /&gt;
&lt;br /&gt;&lt;br /&gt;&lt;/p&gt;

&lt;h3 id=&quot;references&quot;&gt;References:&lt;/h3&gt;

&lt;p&gt;[1]: Sun, Lu, Mingtian Tan, and Zhe Zhou. “A survey of practical adversarial example attacks.” Cybersecurity 1.1 (2018): 9.&lt;/p&gt;

&lt;p&gt;[2]: Papernot, Nicolas, Patrick McDaniel, and Ian Goodfellow. “Transferability in machine learning: from phenomena to black-box attacks using adversarial samples.” arXiv preprint arXiv:1605.07277 (2016).&lt;/p&gt;

&lt;p&gt;[3]: Papernot, Nicolas, et al. “Practical black-box attacks against machine learning.” Proceedings of the 2017 ACM on Asia conference on computer and communications security. ACM, 2017.&lt;/p&gt;

&lt;p&gt;[4]: Kurakin, Alexey, Ian Goodfellow, and Samy Bengio. “Adversarial examples in the physical world.” arXiv preprint arXiv:1607.02533 (2016).&lt;/p&gt;

&lt;p&gt;[5]: Athalye, Anish, et al. “Synthesizing robust adversarial examples.” arXiv preprint arXiv:1707.07397 (2017).&lt;/p&gt;

&lt;p&gt;[6]: Sharif, Mahmood, et al. “Accessorize to a crime: Real and stealthy attacks on state-of-the-art face recognition.” Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. ACM, 2016.&lt;/p&gt;

&lt;p&gt;[7]: Eykholt, Kevin, et al. “Robust physical-world attacks on deep learning models.” arXiv preprint arXiv:1707.08945 (2017).&lt;/p&gt;</content><author><name>davidglavas</name></author><category term="blog" /><category term="adversarial examples" /><category term="neural networks" /><summary type="html">[1]: Sun, Lu, Mingtian Tan, and Zhe Zhou. “A survey of practical adversarial example attacks.” Cybersecurity 1.1 (2018): 9. [2]: Papernot, Nicolas, Patrick McDaniel, and Ian Goodfellow. “Transferability in machine learning: from phenomena to black-box attacks using adversarial samples.” arXiv preprint arXiv:1605.07277 (2016). [3]: Papernot, Nicolas, et al. “Practical black-box attacks against machine learning.” Proceedings of the 2017 ACM on Asia conference on computer and communications security. ACM, 2017. [4]: Kurakin, Alexey, Ian Goodfellow, and Samy Bengio. “Adversarial examples in the physical world.” arXiv preprint arXiv:1607.02533 (2016). [5]: Athalye, Anish, et al. “Synthesizing robust adversarial examples.” arXiv preprint arXiv:1707.07397 (2017). [6]: Sharif, Mahmood, et al. “Accessorize to a crime: Real and stealthy attacks on state-of-the-art face recognition.” Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. ACM, 2016. [7]: Eykholt, Kevin, et al. “Robust physical-world attacks on deep learning models.” arXiv preprint arXiv:1707.08945 (2017).</summary></entry><entry><title type="html">On Writing (Code) Well</title><link href="http://davidglavas.github.io/on-writing-code-well/" rel="alternate" type="text/html" title="On Writing (Code) Well" /><published>2019-10-01T15:22:00+00:00</published><updated>2019-10-01T15:22:00+00:00</updated><id>http://davidglavas.github.io/on-writing-code-well</id><content type="html" xml:base="http://davidglavas.github.io/on-writing-code-well/">&lt;p align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://raw.githubusercontent.com/davidglavas/BlogFigures/master/_posts/Figures/2018-03-31-on-writing-code-well/FrontCovers.jpg&quot; /&gt;
&lt;/p&gt;

&lt;h2 id=&quot;tldr&quot;&gt;TL;DR&lt;/h2&gt;
&lt;p&gt;I point out similarities between programming and writing nonfiction based on interchangeable advice in Steve McConnell’s &lt;em&gt;Code Complete&lt;/em&gt; and William Zinsser’s &lt;em&gt;On Writing Well&lt;/em&gt;. For the identified similarities—clarity of thought, simplicity, and the importance of iterations—I elaborate on McConnell’s advice for writing code well.&lt;/p&gt;

&lt;p&gt;The goal of this post is to share low-hanging fruit, that is, practical and immediately applicable advice any programmer can benefit from. I read &lt;em&gt;On Writing Well&lt;/em&gt; and &lt;em&gt;Code Complete&lt;/em&gt; in parallel, which made me notice some similarities between them. To &lt;del&gt;justify the time I spent procrastinating&lt;/del&gt; keep this post interesting, I relate software construction to nonfiction writing and use the relationship as a basis for McConnell’s advice. First I’ll briefly introduce the books and give you a reason to believe the relationship exists. Then, I’ll go over the three main similarities (clarity of thought, simplicity, and the importance of iterations) that I found to be especially relevant for constructing software.&lt;/p&gt;

&lt;p&gt;Let’s take a brief look at the two books.&lt;/p&gt;

&lt;h2 id=&quot;the-books&quot;&gt;The Books&lt;/h2&gt;

&lt;p&gt;Zinsser’s &lt;em&gt;On Writing Well&lt;/em&gt; is all about expressing oneself with clarity, simplicity, brevity and humanity. It gives a glimpse into the habits of a professional writer and covers general advice such as perseverance, consistency, how to write good leads and endings, how to not sound emotionless or like a copycat, and much more. He then shows how to apply this advice to various forms such as interviews, travel articles, memoirs, science, technology, business writing, humor and more.&lt;/p&gt;

&lt;p&gt;McConnell’s &lt;em&gt;Code Complete&lt;/em&gt; is a guided tour of many widely used development practices. It covers all kinds of issues related to software construction, from variables and statements to code tuning and collaborative construction. Besides learning a ton of new things, I enjoyed seeing tricks that I had been using for some time, especially those which I did unconsciously and never bothered to stop and think more about.
To all those little things (like minimizing the distance in lines between the initialization of variables and their references) McConnell gives names such as live time and span. He manages to give names to things that most of us do intuitively but don’t consciously think about. Regardless of the specific names, the descriptions allow readers to put a finger on what they already know while picking up lots of new tricks along the way.&lt;/p&gt;
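
&lt;p&gt;As a small illustration of the idea of keeping a variable’s initialization close to its references, compare the two versions below. The example is my own and not taken from the book, and the function names are made up. Both functions compute the same result; only the placement of the initializations differs.&lt;/p&gt;

```python
def mean_of_positives_before(values):
    # 'total' and 'count' are initialized here, several lines before
    # the loop that actually uses them.
    total = 0.0
    count = 0
    positives = [v for v in values if v > 0]
    if not positives:
        return None
    for v in positives:
        total += v
        count += 1
    return total / count


def mean_of_positives_after(values):
    positives = [v for v in values if v > 0]
    if not positives:
        return None
    # 'total' and 'count' are now initialized right before their first use,
    # so a reader has fewer lines to keep in mind.
    total = 0.0
    count = 0
    for v in positives:
        total += v
        count += 1
    return total / count


print(mean_of_positives_before([1, -2, 3]))  # 2.0
print(mean_of_positives_after([1, -2, 3]))   # 2.0
```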

&lt;p&gt;So is there a connection between programming and nonfiction writing? If so, then we should be able to find parts in the books with the same underlying ideas.&lt;/p&gt;

&lt;p&gt;For the following four quotes, try and guess to which of the two books each of them belongs:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;“Look for the clutter and prune it ruthlessly. Be grateful for everything you can throw away. Are you hanging on to something useless just because you think it’s beautiful? Simplify, simplify.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;blockquote&gt;
  &lt;p&gt;“The point is that you have to strip your work down before you can build it back up. You must know what the essential tools are and what job they were designed to do.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;blockquote&gt;
  &lt;p&gt;“Sometimes you will despair of finding the right solution—or any solution. You’ll think, “If I live to be ninety I’ll never get out of this mess.” I’ve often thought it myself. But when I finally do solve the problem it’s because I’m like a surgeon removing his 500th appendix; I’ve been there before.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;blockquote&gt;
  &lt;p&gt;“When you find yourself in such a situation, look at the troublesome element and ask, “Do I need it at all?” Probably you don’t. It was trying to do an unnecessary job all along.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If you guessed &lt;em&gt;On Writing Well&lt;/em&gt; four times, well done. The point is that these statements could easily fit into both books. Let’s interpret the interchangeability of these (and many other) statements as evidence of a connection between programming and nonfiction writing.&lt;/p&gt;

&lt;p&gt;In the rest of this post I’ll cover McConnell’s advice on three points which are mentioned repeatedly throughout both books—clarity of thought, simplicity, and the importance of iterations.&lt;/p&gt;

&lt;h2 id=&quot;1-clarify-your-thoughts-first&quot;&gt;1. Clarify your thoughts first&lt;/h2&gt;

&lt;p align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://raw.githubusercontent.com/davidglavas/BlogFigures/master/_posts/Figures/2018-03-31-on-writing-code-well/Clear.jpg&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;Clear minds tend to write clear sentences and produce clear code.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Writers must therefore constantly ask: what am I trying to say? Surprisingly often they don’t know. Then they must look at what they have written and ask: have I said it? Is it clear to someone encountering the subject for the first time? If it’s not, some fuzz has worked its way into the machinery. The clear writer is someone clearheaded enough to see this stuff for what it is: fuzz.&lt;/p&gt;

  &lt;p&gt;— &lt;cite&gt;William Zinsser&lt;/cite&gt;&lt;br /&gt;
&lt;br /&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;More complicated structures require more careful planning, and they also benefit from planning at different levels. McConnell says that “from a technical point of view, planning means understanding what you want to build so that you don’t waste money building the wrong thing.” Precisely documenting requirements, so that you don’t build the wrong features and satisfy the wrong requirements, is one form of planning. The same goes for system design, object design, and any other kind of design.&lt;/p&gt;

&lt;p&gt;In a sense, planning is a form of clarifying our thoughts. We don’t talk about requirements and create time-consuming design documents for their own sake. We design until we feel confident in our ability to get the job done. The point is to plan enough so that a lack of planning doesn’t create major problems later.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Code Complete&lt;/em&gt; is all about software construction, so the planning McConnell writes about most is related to the nitty-gritty: how to approach constructing classes and routines from variables, statements and control structures. This is not to say that other levels of planning, such as requirements and architecture, are less important. In fact, he spends the first part of the book talking about their importance and their relation to construction activities.&lt;/p&gt;

&lt;p&gt;Let’s take a look at the nitty-gritty.&lt;/p&gt;

&lt;h3 id=&quot;the-pseudocode-programming-process&quot;&gt;The Pseudocode Programming Process&lt;/h3&gt;
&lt;p&gt;McConnell dedicates a whole chapter to this topic. The goal is to solve problems at the level of intent before jumping deep into implementation details. It’s often easier, and therefore tempting, to start writing code for a routine before clearly stating the problem it’s supposed to solve and the steps it will take. Blindly writing code is a gamble. You are betting your time (and therefore someone’s money) that the code you write will make it into production. The more you write, the more attached you become to the code, which makes it harder to abandon once you realize it isn’t needed. Before making such bets, improve your chances with the PPP, the Pseudocode Programming Process.&lt;/p&gt;

&lt;p&gt;The following may sound obvious, but bear with me for a few sentences. The goal is to think the problem through and identify the steps to solve it. As soon as you are sure that you can implement a certain step (or part of it), just write down a line of pseudocode stating the intent of that step (or substep). This saves you the time of working out the details, which is good because you don’t yet know whether this step will make it into production code.&lt;/p&gt;

&lt;p&gt;How to Pseudocode:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Use English-like statements to precisely describe operations.&lt;/li&gt;
  &lt;li&gt;Make it as programming-language-independent as possible.&lt;/li&gt;
  &lt;li&gt;Keep it at a high enough level to justify its use. Write at the level of intent (what the operation does rather than the specific steps to do it).&lt;/li&gt;
  &lt;li&gt;Keep it at a low enough level such that you feel comfortable converting it to production code.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The more familiar you are with the language you use and the problem you are solving, the &lt;a href=&quot;https://i1.wp.com/usethebitcoin.com/wp-content/uploads/2017/11/BITCOIN_9000.jpg?w=618&amp;amp;ssl=1&quot;&gt;higher level&lt;/a&gt; your pseudocode tends to be. A beginner might have to write down the specific steps at first, but after encountering the same problem multiple times, they will eventually chunk it into one line of pseudocode.&lt;/p&gt;

&lt;p&gt;I often struggle with getting the granularity of pseudocode right. Sometimes I write pseudocode that’s detailed to the point where I might as well write code directly. At the other extreme, I sometimes write pseudocode at too high a level, which leads me to gloss over problematic parts of the code that I later try (and sometimes fail) to write.&lt;/p&gt;

&lt;p&gt;Ideally, after converting the pseudocode into actual code you will be able to reuse the pseudocode as comments (but drop any comment that is redundant because the code is already clear). This tends to improve readability, which makes maintaining and reviewing your code easier.&lt;/p&gt;
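&lt;p&gt;As a rough illustration (my own sketch, not an example from the book), here is how a few lines of intent-level pseudocode might survive as comments once a routine is coded. The routine name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;average_order_value&lt;/code&gt; and the shape of the order records are made up for this example.&lt;/p&gt;

```python
def average_order_value(orders):
    """Return the mean total of all paid orders, or 0.0 if there are none.

    `orders` is a hypothetical list of dicts with "total" and "paid" keys.
    """
    # Keep only the orders that have been paid.
    paid_totals = [order["total"] for order in orders if order["paid"]]

    # If nothing has been paid, report an average of zero.
    if not paid_totals:
        return 0.0

    # Otherwise, return the sum of the paid totals divided by their count.
    return sum(paid_totals) / len(paid_totals)


if __name__ == "__main__":
    sample = [
        {"total": 10.0, "paid": True},
        {"total": 30.0, "paid": True},
        {"total": 99.0, "paid": False},
    ]
    print(average_order_value(sample))
```

&lt;p&gt;Each comment started life as one line of pseudocode. None of them restates the code below it; they say what the step is for.&lt;/p&gt;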

&lt;p&gt;Keep the above idea—thinking the problem through at the level of intent and only then fully committing to turning your solution into code—in mind while we next take a look at McConnell’s tips for constructing classes and routines.&lt;/p&gt;

&lt;h3 id=&quot;tips-for-constructing-classes&quot;&gt;Tips for Constructing Classes:&lt;/h3&gt;
&lt;ol&gt;
  &lt;li&gt;Create a general design for the class.
    &lt;ul&gt;
      &lt;li&gt;Define the class’s responsibilities.&lt;/li&gt;
      &lt;li&gt;Define what information the class will hide.&lt;/li&gt;
      &lt;li&gt;Define exactly what abstraction the class interface will capture.&lt;/li&gt;
      &lt;li&gt;Include the last three points as a comment in the source code if possible.&lt;/li&gt;
      &lt;li&gt;Make sure that the class’s interface represents a consistent abstraction. (ex. If you offer a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;findEmployee()&lt;/code&gt; routine, it shouldn’t throw an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;EOFException&lt;/code&gt; but an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;EmployeeNotFoundException&lt;/code&gt;)&lt;/li&gt;
      &lt;li&gt;Determine whether the class will be derived from another class and whether other classes will be allowed to derive from it.&lt;/li&gt;
      &lt;li&gt;Identify key public methods.&lt;/li&gt;
      &lt;li&gt;Identify and design nontrivial data structures.&lt;/li&gt;
      &lt;li&gt;Minimize accessibility, avoid exposing data and functionality when it’s not necessary to do so.&lt;/li&gt;
      &lt;li&gt;Minimize coupling to other classes, avoid depending on code outside of the class as much as practically possible.&lt;/li&gt;
      &lt;li&gt;Preserve integrity of the class’s interface and documentation as you modify it.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Construct the routines within the class.
    &lt;ul&gt;
      &lt;li&gt;Follow steps for constructing routines (see below).&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Review and test the class as a whole.
    &lt;ul&gt;
      &lt;li&gt;Ideally, each routine is tested as it’s created. After the class starts taking shape it should be reviewed and tested as a whole in order to uncover any issues that can’t be tested at the individual routine level.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Repeat if necessary.
    &lt;ul&gt;
      &lt;li&gt;Like most other processes in software engineering, this is by no means a linear process. For example, during construction of the individual routines (step 2), design errors (such as the need for additional routines) might become apparent. If so, go back to designing the class (step 1) before continuing with construction.&lt;/li&gt;
      &lt;li&gt;Iterate until you are satisfied.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ol&gt;
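&lt;p&gt;To make the “consistent abstraction” point a bit more concrete, here is a minimal sketch in Python. The class &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;EmployeeDirectory&lt;/code&gt; and its dict-based storage are assumptions for illustration, and the names follow Python conventions (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;find_employee&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;EmployeeNotFoundError&lt;/code&gt;) rather than the Java-style names used above.&lt;/p&gt;

```python
class EmployeeNotFoundError(Exception):
    """Raised when no employee matches the requested ID."""


class EmployeeDirectory:
    """Responsibility: look up employees by ID.

    Hides: how and where the records are stored (here, a plain dict).
    Abstraction: a directory of employees, not a file or a list.
    """

    def __init__(self, records):
        # The storage detail stays private to the class.
        self._records = dict(records)

    def find_employee(self, employee_id):
        try:
            return self._records[employee_id]
        except KeyError:
            # Translate the storage-level error into one that matches the
            # abstraction the interface presents.
            raise EmployeeNotFoundError(employee_id) from None


if __name__ == "__main__":
    directory = EmployeeDirectory({1: "Ada", 2: "Grace"})
    print(directory.find_employee(1))
```

&lt;p&gt;Callers only ever see employee-level concepts. If the storage later moves from a dict to a file or a database, the exception they handle stays the same.&lt;/p&gt;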

&lt;h3 id=&quot;tips-for-constructing-routines&quot;&gt;Tips for Constructing Routines:&lt;/h3&gt;
&lt;ol&gt;
  &lt;li&gt;Design the routine.
    &lt;ul&gt;
      &lt;li&gt;&lt;strong&gt;Clearly&lt;/strong&gt; define the problem the routine is supposed to solve.&lt;/li&gt;
      &lt;li&gt;Name the routine such that the problem it solves is apparent.&lt;/li&gt;
      &lt;li&gt;Define information that the routine will hide.&lt;/li&gt;
      &lt;li&gt;Define inputs and outputs.&lt;/li&gt;
      &lt;li&gt;Define pre- and post-conditions (what is guaranteed to be true before and after the routine is called)&lt;/li&gt;
      &lt;li&gt;Think about efficiency but don’t sacrifice readability for dubious performance gains.&lt;/li&gt;
      &lt;li&gt;Research available algorithms and data structures, don’t reinvent wheels.&lt;/li&gt;
      &lt;li&gt;Summarize the routine’s job. Use the summary as a comment in the routine’s header. Ideally, the reader could treat the routine as a black box and only go into the implementation details if necessary.&lt;/li&gt;
      &lt;li&gt;Write the pseudocode (level of intent).&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Code the routine.
    &lt;ul&gt;
      &lt;li&gt;Convert the pseudocode into actual code.&lt;/li&gt;
      &lt;li&gt;Errors in the pseudocode might become more apparent while converting it to actual code. Expect to go back to designing the routine (step 1) if you uncover serious errors that impact the whole routine.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Review and test the code and design.
    &lt;ul&gt;
      &lt;li&gt;Mentally check your routine for errors.&lt;/li&gt;
      &lt;li&gt;Does the pseudocode fully solve your problem?&lt;/li&gt;
      &lt;li&gt;Does the code correspond to the pseudocode?&lt;/li&gt;
      &lt;li&gt;Step through your routine with a debugger. This step is so underrated. If you fully understand the routine you just wrote then it shouldn’t take much effort to go through it with a debugger.&lt;/li&gt;
      &lt;li&gt;Test your routine.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Repeat if necessary.
    &lt;ul&gt;
      &lt;li&gt;Expect to heavily iterate over the above steps. You will often have to go into the details and implement some pseudocode to validate your approach, then go back to the pseudocode, then back into implementation details, and so on. Just make sure to minimize the time you spend on implementation details. Only implement things to support your reasoning at the pseudocode level; reasoning at the implementation level costs more time.&lt;/li&gt;
      &lt;li&gt;Iterate until you are satisfied.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ol&gt;
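&lt;p&gt;Here is a small sketch of what a header summary plus pre- and post-conditions can look like in practice. The routine &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;clamp&lt;/code&gt; is my own example, and checking the conditions with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;assert&lt;/code&gt; is just one possible way to do it.&lt;/p&gt;

```python
def clamp(value, low, high):
    """Return `value` limited to the closed range [low, high].

    Precondition: high >= low.
    Postcondition: the result lies within [low, high].
    """
    # Precondition: the range must not be empty.
    assert high >= low, "clamp() requires high >= low"

    # Pick the nearest value inside the range.
    result = max(low, min(value, high))

    # Postcondition: the result is inside the range.
    assert high >= result >= low
    return result


if __name__ == "__main__":
    print(clamp(42, 0, 10))
```

&lt;p&gt;The docstring lets a reader treat the routine as a black box, and the assertions turn the stated conditions into something that fails loudly during development.&lt;/p&gt;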

&lt;p&gt;Tips for testing routines:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Think about how you will test the routine, both before and as you write it. This tends to result in a modular design and often uncovers errors sooner.&lt;/li&gt;
  &lt;li&gt;Test all branches of your routine (ex. if you have a switch statement, test all cases).&lt;/li&gt;
  &lt;li&gt;Boundary analysis: test values just below, at, and just above each boundary to avoid off-by-one errors.&lt;/li&gt;
  &lt;li&gt;Dirty tests: check that your code fails when it should (too little or too much data, invalid data, etc.).&lt;/li&gt;
  &lt;li&gt;Consider generating random inputs.&lt;/li&gt;
  &lt;li&gt;Ensure compatibility with old tests if available.&lt;/li&gt;
&lt;/ul&gt;
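&lt;p&gt;A minimal sketch of boundary analysis and a dirty test, using plain assertions instead of a test framework. The routine &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;is_valid_percentage&lt;/code&gt; and its 0 to 100 range are hypothetical and only serve as something to test.&lt;/p&gt;

```python
def is_valid_percentage(value):
    """Return True if `value` is a number between 0 and 100 inclusive."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError("value must be a number")
    return 100 >= value >= 0


def run_tests():
    # Boundary analysis: just below, at, and just above each boundary.
    assert is_valid_percentage(-1) is False
    assert is_valid_percentage(0) is True
    assert is_valid_percentage(1) is True
    assert is_valid_percentage(99) is True
    assert is_valid_percentage(100) is True
    assert is_valid_percentage(101) is False

    # Dirty test: invalid data should fail loudly instead of returning a value.
    try:
        is_valid_percentage("fifty")
    except TypeError:
        pass
    else:
        raise AssertionError("expected a TypeError for non-numeric input")


if __name__ == "__main__":
    run_tests()
    print("all tests passed")
```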

&lt;p&gt;I know, I know. Pre- and post-conditions? Pseudocode? Stepping through with a debugger? For every routine? The above tips sound tedious (they are) and your job is to ship code, that’s &lt;a href=&quot;https://blog.codinghorror.com/not-all-bugs-are-worth-fixing/&quot;&gt;fine&lt;/a&gt;. The above tips are suggestions to bring more structure into our thought process. Being aware of these optional steps and where they fit into our coding habits is in itself valuable.&lt;/p&gt;

&lt;h2 id=&quot;2-keep-it-simple&quot;&gt;2. Keep it simple&lt;/h2&gt;

&lt;p align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://raw.githubusercontent.com/davidglavas/BlogFigures/master/_posts/Figures/2018-03-31-on-writing-code-well/Simple.jpe&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;Lots of advice specific to writing nonfiction or writing code can be reduced to this: keep it simple.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;People at every level are prisoners of the notion that a simple style reflects a simple mind. Actually a simple style is the result of hard work and hard thinking; a muddled style reflects a muddled thinker or a person too arrogant, or too dumb, or too lazy to organize his thoughts.&lt;/p&gt;

  &lt;p&gt;— &lt;cite&gt;William Zinsser&lt;/cite&gt;&lt;br /&gt;
&lt;br /&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;McConnell repeatedly writes that “managing complexity is Software’s Primary Technical Imperative”. At one point he refers to Fred Brooks’s &lt;em&gt;No Silver Bullet&lt;/em&gt; &lt;a href=&quot;http://worrydream.com/refs/Brooks-NoSilverBullet.pdf&quot;&gt;paper&lt;/a&gt;, which distinguishes two types of complexity: essential and accidental. The point is that we should accept only as much complexity as necessary, namely the essential complexity of the problem at hand. Any extra difficulty we introduce on the way to the final solution is accidental and should be minimized. In a sense, all advice geared towards improving readability, modularity, maintainability and similar design goals aims to increase understanding by reducing complexity. Note that in this post the word complexity refers to intellectual manageability, not computational complexity.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Projects that fail for technical reasons mostly do so because the software is allowed to grow so complex that no one really knows what it does. When a project reaches the point at which no one completely understands the impact that code changes in one area will have on other areas, progress grinds to a halt.&lt;/p&gt;

  &lt;p&gt;— &lt;cite&gt;Steve McConnell&lt;/cite&gt;&lt;br /&gt;
&lt;br /&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;So what can developers do to fight accidental complexity?&lt;/p&gt;

&lt;p&gt;Below I list some of the notes I took while reading &lt;em&gt;Code Complete&lt;/em&gt;. Each bullet point is my attempt at summarizing a key idea from McConnell’s discussions. Depending on the amount of experience you have, some points will make more sense and some less. The only way to make the most of the advice is to go through the accompanying stories, studies, and code examples in the book. Nonetheless, I’m sure you will find something useful down there.&lt;/p&gt;

&lt;p&gt;Treat the following list as a buffet, move on if something doesn’t seem interesting and feel free to pick up and adopt any suggestion you find useful.&lt;/p&gt;

&lt;p&gt;Some of McConnell’s advice for reducing complexity:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;General
    &lt;ul&gt;
      &lt;li&gt;Before construction, make sure that the groundwork has been laid (problem is well defined, requirements are reasonably stable, architecture is sufficiently well defined, major risks have been addressed etc.)&lt;/li&gt;
      &lt;li&gt;“Hide complexity so that your brain doesn’t have to deal with it unless you’re specifically concerned with it.” Be as restrictive as practically possible when it comes to visibility.&lt;/li&gt;
      &lt;li&gt;Limit the negative impact of changes by encapsulating areas that are likely to change such as business rules, hardware dependencies, input and output.&lt;/li&gt;
      &lt;li&gt;Make central points of control when possible (ex. put all code related to processing customer payments into one class or subsystem and keep it there). “The reduced-complexity benefit is that the fewer places you have to look for something the easier and safer it will be to change.”&lt;/li&gt;
      &lt;li&gt;Assign responsibilities to everything: subsystems, classes, routines, variables, etc. This should help to justify their existence and clarify their usage.&lt;/li&gt;
      &lt;li&gt;Use standard techniques whenever possible (widely known algorithms, data structures, design patterns, etc.)&lt;/li&gt;
      &lt;li&gt;Consider using brute force. “A brute-force solution that works is better than an elegant solution that doesn’t work.”&lt;/li&gt;
      &lt;li&gt;Minimize the amount of knowledge required to make a change. Push unnecessary details to another level so that you can think about them when you want to rather than thinking about all of the details all of the time.&lt;/li&gt;
      &lt;li&gt;Make code readable from top to bottom.&lt;/li&gt;
      &lt;li&gt;Quality gates. Set up checks during your project that determine if current work quality is good enough to continue working.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Design
    &lt;ul&gt;
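      &lt;li&gt;Form consistent abstractions. Keep the level of abstraction of public interfaces consistent. For example, if a class offers &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;addEmployee()&lt;/code&gt; then it shouldn’t offer &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;nextItemInList()&lt;/code&gt; but &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;nextEmployee()&lt;/code&gt;. Make sure to encapsulate data such that the resulting interface is consistent with the abstraction you want your class to represent.&lt;/li&gt;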
      &lt;li&gt;Reduce coupling by keeping relations (between subsystems, classes, routines, etc.) small, direct, and flexible.&lt;/li&gt;
      &lt;li&gt;Aim for strong cohesion. Code inside of a subsystem, or class, or routine should be strongly related and support a central purpose. (ex. avoid classes that encapsulate unrelated data or behavior).&lt;/li&gt;
      &lt;li&gt;Formalize pre- and post-conditions, what must be true in order to use X and what must be true after X finishes its job.&lt;/li&gt;
      &lt;li&gt;“Design the interfaces so that changes are limited to the inside of the class and the outside remains unaffected. Any other class using the changed class should be unaware that the change has occurred.”&lt;/li&gt;
      &lt;li&gt;Design classes with a high fan-in (ex. your class should be used by a lot of other classes such as a utility class), and a low-to-medium fan-out (ex. your class shouldn’t use and depend on lots of other classes).&lt;/li&gt;
      &lt;li&gt;Design for test. Designing such that testing is easy often results in formalized interfaces and decoupled subsystems which is generally beneficial.&lt;/li&gt;
      &lt;li&gt;Keep your design modular. Draw diagrams and look at your code as a bunch of black boxes with well-defined interfaces.&lt;/li&gt;
      &lt;li&gt;Experimental prototyping. Often you can’t fully define the design problem until you’ve at least partially solved it. In this context, prototyping means writing the absolute minimum amount of throwaway code that’s needed to answer a specific design question.&lt;/li&gt;
      &lt;li&gt;Split up validation and work classes. On a class level you could designate the data validation to public classes and let private classes assume that the data they handle is clean. Code outside of the safety zone throws exceptions, code inside the safety zone uses assertions.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Routines:
    &lt;ul&gt;
      &lt;li&gt;Aim for functional cohesion, statements in a routine should all work together to accomplish exactly one job.&lt;/li&gt;
      &lt;li&gt;Check pre- and post-conditions.&lt;/li&gt;
      &lt;li&gt;Make the name of a routine as short or as long as necessary to make it understandable (other developers should know what the routine does by looking at its name).&lt;/li&gt;
      &lt;li&gt;If possible, don’t exceed 200 lines of code, and use no more than 7 parameters.&lt;/li&gt;
      &lt;li&gt;Consistently order parameters, group similar ones.&lt;/li&gt;
      &lt;li&gt;Avoid using routine parameters as working variables, use local variables in your routine instead.&lt;/li&gt;
      &lt;li&gt;Avoid passing parameters to store the output into, return the result.&lt;/li&gt;
      &lt;li&gt;Document assumptions as you write the routine (you will forget them later).&lt;/li&gt;
      &lt;li&gt;Keep &lt;a href=&quot;https://en.wikipedia.org/wiki/Cyclomatic_complexity&quot;&gt;cyclomatic complexity&lt;/a&gt; of your routines (number of paths, start with 1 and increment for every if, for, while, repeat, and, or, case…) below ~10.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Loops:
    &lt;ul&gt;
      &lt;li&gt;Comment loop headers, allow the reader to treat loops as black boxes.&lt;/li&gt;
      &lt;li&gt;Put control structure related code at the beginning or end of the loop (ex. increment counter indices in a while loop at the beginning or end of its body).&lt;/li&gt;
      &lt;li&gt;Loops should be short enough to be viewed all at once, aim for 15-20 lines.&lt;/li&gt;
      &lt;li&gt;Loop depth should be at most 3; prepare a good explanation whenever you exceed this limit.&lt;/li&gt;
      &lt;li&gt;Avoid altering the loop index in weird ways from inside of the loop.&lt;/li&gt;
      &lt;li&gt;Avoid using the loop index outside of the loop.&lt;/li&gt;
      &lt;li&gt;Give the loop index, its initialization, and end condition meaningful names (avoid i, j, k, etc.).&lt;/li&gt;
      &lt;li&gt;Using break or continue (or anything else inside of the loop that alters the control structure) increases the loop’s complexity because the reader has to understand the loop’s body in order to understand the loop. Ideally, the reader should understand how the loop behaves by just looking at the contents of (say) a for loop’s parentheses.&lt;/li&gt;
      &lt;li&gt;Make entry and termination obvious, minimize the number of ways the loop can start and terminate.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Conditionals:
    &lt;ul&gt;
      &lt;li&gt;Outsource complicated tests to routines or variables. Ideally, the reader should understand what a complicated test checks without having to understand the implementation details.&lt;/li&gt;
      &lt;li&gt;Show the normal path first, then exceptions. (ex. cover the most common cases first in else-if statements)&lt;/li&gt;
      &lt;li&gt;Fully parenthesize expressions (how is a &amp;lt; b == c == d evaluated?)&lt;/li&gt;
      &lt;li&gt;Write numeric expressions in number line order, (min &amp;lt; i &amp;amp;&amp;amp; i &amp;lt; max) instead of (max &amp;gt; i &amp;amp;&amp;amp; min &amp;lt; i).&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Variables:
    &lt;ul&gt;
      &lt;li&gt;Initialize constants at the beginning of a program, initialize member variables in constructors.&lt;/li&gt;
      &lt;li&gt;Binding time. We differentiate coding time (magic numbers) &amp;lt; compile time (named constants) &amp;lt; load time (read from a file) &amp;lt; object instantiation time (read and set a variable upon object initialization) &amp;lt; just in time (read and set a value every time it is used).
After compile time, flexibility increases but so does complexity. The goal is to find a good trade-off based on the project’s requirements.&lt;/li&gt;
      &lt;li&gt;Make sure that every variable does exactly one job, never reuse a variable for unrelated purposes.&lt;/li&gt;
      &lt;li&gt;Avoid implicit meanings (ex. special meaning when an integer variable is negative).&lt;/li&gt;
      &lt;li&gt;Encapsulate primitive types in case you expect changes (ex. use a Weight class that internally uses doubles instead of just using doubles).&lt;/li&gt;
      &lt;li&gt;Think about rounding errors, division-by-zero errors, and overflows. Avoid hard-coding data.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Developer Testing:
    &lt;ul&gt;
      &lt;li&gt;Unit, component, integration, regression, system testing.&lt;/li&gt;
      &lt;li&gt;Do clean and dirty tests, test if your code works but also test if your code fails when it should.&lt;/li&gt;
      &lt;li&gt;Use coverage monitors to ensure high test coverage, choose a good metric such as branch coverage. Most developers are too optimistic when not using tools.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Debugging:
    &lt;ul&gt;
      &lt;li&gt;Understand the problem before trying to fix it.&lt;/li&gt;
      &lt;li&gt;Add a unit test that triggers the error and keep it in order to prevent others/yourself from reintroducing the error.&lt;/li&gt;
      &lt;li&gt;Don’t ignore compiler warnings, fix and understand all of them to avoid weird problems.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Code Tuning:
    &lt;ul&gt;
      &lt;li&gt;The best preparation for code tuning is writing clean code that is easy to understand and modify.&lt;/li&gt;
      &lt;li&gt;Performance relationships vary across languages, compilers, libraries, machines and versions. Mistrust any general claims about one technique being more efficient than another.&lt;/li&gt;
      &lt;li&gt;Always use execution profilers to understand where your program spends its time.&lt;/li&gt;
      &lt;li&gt;Only start tuning code if it works correctly. Measure bottlenecks, back up your working code, then tune and measure the impact of every change.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Comments:
    &lt;ul&gt;
      &lt;li&gt;Avoid unnecessary comments that explain the obvious. Question the existence of each comment, delete if it’s not helpful (ex. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;employees.addEmployee(employee); // adds employee&lt;/code&gt;).&lt;/li&gt;
      &lt;li&gt;Comment data declarations (intent, usage), blocks of code (intent), routines (inputs, outputs, assumptions, limitations, source of algorithms, global effects), loops (intent), class headers, etc.&lt;/li&gt;
      &lt;li&gt;Only use commenting styles that are easy to maintain (ex. avoid fancy boxes and indentations).&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Collaborative Construction:
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Software_inspection&quot;&gt;Formal inspections&lt;/a&gt;, &lt;a href=&quot;https://en.wikipedia.org/wiki/Pair_programming&quot;&gt;pair programming&lt;/a&gt;, walk-throughs, code readings.&lt;/li&gt;
      &lt;li&gt;Studies show that formal inspections and pair programming are at least as effective as testing.&lt;/li&gt;
      &lt;li&gt;Show your code to others, get feedback, and use it to improve not just your current work but your approach towards future problems.&lt;/li&gt;
      &lt;li&gt;Do regression tests, they are essential to producing complex systems.&lt;/li&gt;
      &lt;li&gt;Studies show that to maximize defect detection we should combine different testing techniques (formal/informal inspections, prototyping, developer testing, etc.) and views of different people during all stages (requirements, design, construction).&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Organizational:
    &lt;ul&gt;
      &lt;li&gt;If you are behind, you likely won’t catch up and it will only get worse. &lt;a href=&quot;https://en.wikipedia.org/wiki/Brooks%27s_law&quot;&gt;Carefully&lt;/a&gt; expand the team, reduce the scope of the project (focus on the most important parts), or postpone deadlines.&lt;/li&gt;
      &lt;li&gt;Don’t mindlessly apply changes that pop into your head. Store ideas (and requests) for change and deal with them systematically.&lt;/li&gt;
      &lt;li&gt;Do daily builds (compile, link, produce an executable, get the code running), and smoke tests (thorough tests of the main features, doesn’t have to be exhaustive).&lt;/li&gt;
      &lt;li&gt;Check in code frequently and work in small increments to reduce the number of integration errors.&lt;/li&gt;
      &lt;li&gt;Make quality assurance a part of all development stages. Don’t postpone all testing till the end, it won’t be done properly.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Miscellaneous
    &lt;ul&gt;
      &lt;li&gt;Avoid recursion if it makes you feel uneasy. Only use it as a last resort when an iterative solution is very complex. Think about stack space when using recursion.&lt;/li&gt;
      &lt;li&gt;Avoid gotos. There are situations where their use is justified but use them as a last resort. They often hinder compiler optimizations, so mistrust claims that they make code more efficient.&lt;/li&gt;
      &lt;li&gt;You should never have to look at the source code of some class in order to understand how to use it. You should be able to use a class by reading its documentation. Knowing a class’s implementation allows you to exploit it while using the class (often unconsciously). This is error-prone because the class’s developers are only responsible for maintaining the interface; implementation details that you rely on might change and break your code.&lt;/li&gt;
      &lt;li&gt;Be aware of technology waves. Working with technology that is not mature yet means spending a large portion of the day trying to figure out how to use the technology (early wave). Most problems that you face will feel as if you are the first person experiencing them.
Working with mature technology means spending more time on building new functionality and less time on understanding the technology since most of the common tasks have been thought of and made easy by other programmers (late wave).&lt;/li&gt;
      &lt;li&gt;Use assertions to identify things that should NEVER happen. When an assertion goes off, it’s (ideally) not handled at run time; instead, the source code needs to be fixed. Use assertions to verify pre- and post-conditions or to check assumptions in safety zones (ex. private routines), such as checking the range of variables, the state of a file or variable, that an object is not null, that the size of a data structure meets some criteria, etc.&lt;/li&gt;
      &lt;li&gt;“Establish programming conventions (naming, formatting, commenting, etc.) at the start of the project before you begin programming. It’s nearly impossible to change code to match them later.”&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;
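&lt;p&gt;A couple of these points are easier to see in code. The sketch below illustrates “outsource complicated tests to routines or variables” and “fully parenthesize expressions”. The discount rule, its thresholds, and the function name are invented for this example.&lt;/p&gt;

```python
def is_eligible_for_discount(customer_age, order_total, is_member):
    """Return True if a member is a senior or has placed a large order."""
    # Name each part of the test so the final condition reads like a sentence.
    is_senior = customer_age >= 65
    is_large_order = order_total >= 100.0

    # Parentheses make the intended grouping explicit.
    return is_member and (is_senior or is_large_order)


if __name__ == "__main__":
    print(is_eligible_for_discount(70, 20.0, True))
```

&lt;p&gt;The caller can now write &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;if is_eligible_for_discount(...)&lt;/code&gt; and understand the check without reading how it is computed.&lt;/p&gt;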

&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
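&lt;p&gt;The “validation outside, assertions inside” idea from the list can be sketched like this. It is a simplified version at the function level, and the rating example with its 1 to 5 range is my own assumption: the public routine validates untrusted input and raises exceptions, while the private helper sits inside the safety zone and only asserts.&lt;/p&gt;

```python
def average_rating(raw_ratings):
    """Public entry point: validate untrusted input, then delegate."""
    if not raw_ratings:
        raise ValueError("at least one rating is required")
    ratings = []
    for raw in raw_ratings:
        rating = int(raw)  # raises ValueError for non-numeric text
        if not 5 >= rating >= 1:
            raise ValueError(f"rating out of range: {rating}")
        ratings.append(rating)
    return _mean_of_clean_ratings(ratings)


def _mean_of_clean_ratings(ratings):
    """Private helper inside the safety zone: input is assumed clean."""
    # A failing assertion here means a bug in the calling code, not bad input.
    assert ratings, "caller must pass at least one rating"
    assert all(5 >= r >= 1 for r in ratings), "ratings must be within 1..5"
    return sum(ratings) / len(ratings)


if __name__ == "__main__":
    print(average_rating(["4", "5"]))
```

&lt;p&gt;Bad input from the outside world is an expected event and gets an exception the caller can handle. A violated assertion inside the helper signals a programming error that should be fixed in the source.&lt;/p&gt;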

&lt;p&gt;Most of this advice is there to keep developers from writing code that’s more complex than it has to be. Whenever I find myself leaning in a little too close towards the screen while working on some “smart” code, I try to lean back for a reality check. Does this have to be difficult? Am I just being silly and making things more difficult than they have to be? Am I using enough &lt;a href=&quot;https://i.redd.it/ym82hxxxq2d01.jpg&quot;&gt;hash maps&lt;/a&gt;? More often than not I end up ditching the “smart” code and doing it the good old “boring” way.&lt;/p&gt;

&lt;h2 id=&quot;3-iterate-iterate-and-iterate-again&quot;&gt;3. Iterate, iterate, and iterate again&lt;/h2&gt;

&lt;p align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://raw.githubusercontent.com/davidglavas/BlogFigures/master/_posts/Figures/2018-03-31-on-writing-code-well/Iterate.jpeg&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;Books, articles, blog posts and non-trivial systems aren’t written in one go. Both authors emphasize the importance of heavily iterating over their work.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Rewriting is the essence of writing well: it’s where the game is won or lost. That idea is hard to accept. We all have an emotional equity in our first draft; we can’t believe that it wasn’t born perfect. But the odds are close to 100 percent that it wasn’t. Most writers don’t initially say what they want to say, or say it as well as they could.&lt;/p&gt;

  &lt;p&gt;— &lt;cite&gt;William Zinsser&lt;/cite&gt;&lt;br /&gt;
&lt;br /&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The point is not to start with one approach and keep working on it till it’s good enough. The point is to acknowledge that you will make mistakes and learn from them while making and abandoning attempts on a best-effort basis. McConnell writes: “A first attempt might produce a solution that works, but it’s unlikely to produce the best solution.”&lt;/p&gt;

&lt;p&gt;Fun &lt;a href=&quot;https://arxiv.org/ftp/arxiv/papers/1702/1702.01715.pdf&quot;&gt;fact&lt;/a&gt;: Google rewrites most of its software every few years.&lt;/p&gt;

&lt;p&gt;I’ll leave you with McConnell’s emphasis on the importance of an iterative process:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Iteration is appropriate for many software-development activities. During your initial specification of a system, you work with the user through several versions of requirements until you’re sure you agree on them. That’s an iterative process. When you build flexibility into your process by building and delivering a system in several increments, that’s an iterative process. If you use prototyping to develop several alternative solutions quickly and cheaply before crafting the final product, that’s another form of iteration. Iterating on requirements is perhaps as important as any other aspect of the software-development process. Projects fail because they commit themselves to a solution before exploring alternatives. Iteration provides a way to learn about a product before you build it.&lt;/p&gt;

  &lt;p&gt;— &lt;cite&gt;Steve McConnell&lt;/cite&gt;&lt;br /&gt;
&lt;br /&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id=&quot;wind-up&quot;&gt;Wind Up&lt;/h2&gt;
&lt;p&gt;I hope that you found some of the tips as useful as I did. You won’t remember (or need) all of them; I tried to summarize the ones I think could help most developers. There is much more advice in the book. I would recommend &lt;em&gt;Code Complete&lt;/em&gt; to people who have programmed for a year or two and would like to fill in gaps and get an overview of software construction.&lt;/p&gt;

&lt;p&gt;To summarize, we talked about three main issues related to both nonfiction writing and software construction. First, we acknowledged the importance of clarifying thoughts and saw examples of how to structure the class and routine construction processes. Then we took a look at &lt;del&gt;eleventy&lt;/del&gt; suggestions on how to keep it simple by avoiding accidental complexity. Finally, we underlined the importance of iterating over and over again until you are satisfied with the outcome.&lt;/p&gt;

&lt;p&gt;In case you are interested in more books related to software engineering, McConnell provides a neat reading list at the end of the book. You can also find it &lt;a href=&quot;http://www.construx.com/Thought_Leadership/Books/Survival_Guide/Resources_By_Chapter/Recommended_Reading_Lists/&quot;&gt;online&lt;/a&gt;.&lt;/p&gt;</content><author><name>davidglavas</name></author><category term="blog" /><category term="code complete 2" /><category term="on writing well" /><category term="software engineering advice" /><summary type="html"></summary></entry><entry><title type="html">Let’s build an audio classifier</title><link href="http://davidglavas.github.io/lets-build-an-audio-classifier/" rel="alternate" type="text/html" title="Let’s build an audio classifier" /><published>2019-05-20T12:08:00+00:00</published><updated>2019-05-20T12:08:00+00:00</updated><id>http://davidglavas.github.io/lets-build-an-audio-classifier</id><content type="html" xml:base="http://davidglavas.github.io/lets-build-an-audio-classifier/">&lt;p align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://github.com/davidglavas/BlogFigures/blob/master/_posts/Figures/2018-05-20-lets-build-an-audio-classifier/Pipeline.png?raw=true&quot; /&gt;
&lt;/p&gt;

&lt;h2 id=&quot;tldr&quot;&gt;TL;DR&lt;/h2&gt;
&lt;p&gt;I use a neural network to build a simple audio classifier and evaluate its performance on the UrbanSound8K dataset.&lt;/p&gt;

&lt;p&gt;For this post, our goal is learning how to:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;turn sound recordings into feature vectors which a neural network can use,&lt;/li&gt;
  &lt;li&gt;build a simple classifier with TensorFlow’s estimator API,&lt;/li&gt;
  &lt;li&gt;run experiments and interpret results.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We will start by taking a look at the dataset. Then, we will learn how to extract meaningful information from audio signals to obtain a compact, yet expressive description that the classifier can use. Next, we will learn how to use TensorFlow’s estimator API to build a simple neural network with which we will classify the extracted features. Finally, we will learn how to run experiments, interpret the results, and use what we have learned in future projects.&lt;/p&gt;

&lt;h2 id=&quot;the-data&quot;&gt;The Data&lt;/h2&gt;
&lt;p&gt;In order for the neural network to generalize well we need a sufficiently large amount of varied labeled data. Creating such datasets is challenging as it usually involves first finding quality sources of data, then setting up a system to automate the collection process from various sources, and finally manually labeling the data. &lt;a href=&quot;https://serv.cusp.nyu.edu/projects/urbansounddataset/salamon_urbansound_acmmm14.pdf&quot;&gt;Salamon and Bello&lt;/a&gt; did just that. Their efforts resulted in the &lt;a href=&quot;https://serv.cusp.nyu.edu/projects/urbansounddataset/urbansound8k.html&quot;&gt;UrbanSound8K dataset&lt;/a&gt;, which we will use in this post. I chose this dataset because it is relatively large and easy to work with.&lt;/p&gt;

&lt;p&gt;The UrbanSound8K dataset is a collection of 8732 short (~4 second) labeled audio recordings (.wav files) of various urban sounds. Each recording belongs to one of the following 10 classes: air_conditioner, car_horn, children_playing, dog_bark, drilling, engine_idling, gun_shot, jackhammer, siren, or street_music. The file names encode several pieces of metadata, of which we will use only the class ID; in other words, the label of each audio recording is contained in its file name. The recordings are conveniently pre-sorted into 10 folds to help with the reproduction and comparison of results.&lt;/p&gt;
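
&lt;p&gt;As a minimal sketch of reading the label from a file name, the snippet below assumes the naming scheme &lt;code&gt;[fsID]-[classID]-[occurrenceID]-[sliceID].wav&lt;/code&gt; from the dataset’s documentation, and that class IDs follow the order of the class list above. The function name &lt;code&gt;class_id_from_file_name&lt;/code&gt; and the example path are made up for illustration.&lt;/p&gt;

```python
import os

# Class names in class-ID order (assumed: ID 0 is the first name, ID 9 the last).
CLASS_NAMES = [
    "air_conditioner", "car_horn", "children_playing", "dog_bark", "drilling",
    "engine_idling", "gun_shot", "jackhammer", "siren", "street_music",
]


def class_id_from_file_name(path):
    """Return the integer class ID encoded in an UrbanSound8K file name.

    Assumes the naming scheme fsID-classID-occurrenceID-sliceID.wav.
    """
    base_name = os.path.basename(path)
    stem, _extension = os.path.splitext(base_name)
    fields = stem.split("-")
    if len(fields) != 4:
        raise ValueError("unexpected file name: " + base_name)
    return int(fields[1])  # the second field holds the class ID


example_path = "fold5/100032-3-0-0.wav"  # illustrative path, not read from disk
label = class_id_from_file_name(example_path)
print(label, CLASS_NAMES[label])  # prints: 3 dog_bark
```

&lt;p&gt;Raising an error on an unexpected name is a deliberate choice here: a silently wrong label would be much harder to track down later than a failed preprocessing run.&lt;/p&gt;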

&lt;h2 id=&quot;feature-extraction&quot;&gt;Feature Extraction&lt;/h2&gt;
&lt;p&gt;In this section we will turn our audio recordings into feature vectors. We do this because of the network structure we use to build the audio classifier—our data must conform to the input layer’s structure. Hence, we need to turn our audio recordings into arrays of floating point numbers—the feature vectors.&lt;/p&gt;

&lt;p&gt;We must also keep in mind that the length of our feature vector is equal to the number of units in the network’s input layer—we therefore have full control over it. However, the cost of training a network (time and compute) usually increases with the number of units in the network. Therefore we will try to keep our feature vectors as small as possible.&lt;/p&gt;

&lt;p&gt;A major challenge in developing audio classifiers is identifying appropriate content-based features to represent the audio recordings. The exact number of features that people have used so far is unknown and irrelevant; the point is that there are many ways to concisely represent audio signals. &lt;a href=&quot;https://www.sciencedirect.com/science/article/pii/S0065245810780037&quot;&gt;This&lt;/a&gt; is a great overview of the most used features. Which feature(s) to extract depends on the nature of the audio sources. Different types of audio sources have different characteristics; the goal is to find features that capture relevant differences between our audio recordings, which the neural network can then use during classification.&lt;/p&gt;

&lt;p&gt;Our recordings are a kind of environmental sound. An overview of effective features for such audio recordings can be found &lt;a href=&quot;https://ac.els-cdn.com/S0167865503001478/1-s2.0-S0167865503001478-main.pdf?_tid=8901d567-2f27-44bd-ba5b-b48290fd89c8&amp;amp;acdnat=1525870758_f2ee1c0c2b690073c98a4232873a5974&quot;&gt;here&lt;/a&gt;. The feature we will use is inspired by the human auditory system and has proven to be very effective: the Mel-frequency cepstrum (MFC). For the purposes of this post we need to know that we extract one MFC per audio recording, that an MFC is made up of Mel-frequency cepstral coefficients (MFCCs), and that an MFC can be stored as a matrix. Each column in an MFCC matrix holds the MFCCs for one frame, and each row holds the values of one coefficient across all frames (note that some other libraries swap the rows and columns). So an MFCC matrix is a sequence of MFCCs. In case you are interested in how the matrix is computed, see &lt;a href=&quot;http://www.practicalcryptography.com/miscellaneous/machine-learning/guide-mel-frequency-cepstral-coefficients-mfccs/&quot;&gt;this&lt;/a&gt;. Next we will describe how to turn the MFCC matrix into a feature vector by summarizing the extracted sequences of MFCCs (rows).&lt;/p&gt;

&lt;p&gt;Let’s assume that we extracted all the MFCCs for our audio recordings. We now have one matrix per audio recording, which is supposed to be a good representation of the recording’s content. Finally, we obtain the feature vectors by summarizing each MFCC matrix into one vector. For example, we can summarize one MFCC matrix by computing the mean of every sequence of MFCCs (row in the matrix); the feature vector would then consist of the means, and its length would equal the number of MFCC sequences (rows in the matrix). We will go a step further and use multiple summary statistics (minimum, maximum, median, mean, variance, skewness, kurtosis) and obtain the final feature vector by concatenating the vectors we obtained for each of the summary statistics. Therefore, the length of our final feature vector will be the number of coefficients (rows) in our MFCC matrix multiplied by the number of summary statistics we use. In our case we have a feature vector of length 20 * 7 = 140. Note that I chose 20 as the number of coefficients through experimenting with different values.&lt;/p&gt;
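
&lt;p&gt;To make the 20 * 7 = 140 arithmetic concrete, here is a small sketch that summarizes a matrix row by row using only NumPy. It rests on a few assumptions: a random matrix stands in for a real MFCC matrix, the 173 frames are an arbitrary choice, the helper name &lt;code&gt;summarize_rows&lt;/code&gt; is hypothetical, and skewness and kurtosis are computed from central moments (the biased estimators, with 3 subtracted for kurtosis), which may differ from whatever implementation you end up using.&lt;/p&gt;

```python
import numpy as np


def summarize_rows(matrix):
    """Concatenate seven row-wise summary statistics into one vector."""
    mean = np.mean(matrix, axis=1)
    centered = matrix - mean[:, np.newaxis]
    second_moment = np.mean(centered ** 2, axis=1)  # population variance
    third_moment = np.mean(centered ** 3, axis=1)
    fourth_moment = np.mean(centered ** 4, axis=1)

    # Moment-based (biased) estimators; undefined for constant rows.
    skewness = third_moment / second_moment ** 1.5
    excess_kurtosis = fourth_moment / second_moment ** 2 - 3.0

    # Order: minimum, maximum, median, mean, variance, skewness, kurtosis.
    return np.concatenate([
        np.min(matrix, axis=1),
        np.max(matrix, axis=1),
        np.median(matrix, axis=1),
        mean,
        second_moment,
        skewness,
        excess_kurtosis,
    ])


rng = np.random.default_rng(seed=0)
fake_mfccs = rng.normal(size=(20, 173))  # 20 coefficients, 173 frames (made up)
feature_vector = summarize_rows(fake_mfccs)
print(feature_vector.shape)  # (140,)
```

&lt;p&gt;The number of frames drops out entirely: recordings of different lengths produce matrices with different numbers of columns, but every one of them is reduced to the same 140 numbers, which is what a fixed-size input layer requires.&lt;/p&gt;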

&lt;p&gt;To extract the features we will use &lt;a href=&quot;https://librosa.github.io/librosa/index.html&quot;&gt;LibROSA&lt;/a&gt;—a package for music and audio analysis. Given the path to one of the recordings, we can compute the corresponding MFCC matrix and create our feature vector as follows:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;extract_features_from_file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;raw_sound&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sample_rate&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;librosa&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;# one row per extracted coefficient, one column per frame
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;mfccs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;librosa&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;feature&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mfcc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;raw_sound&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sample_rate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n_mfcc&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;20&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;mfccs_min&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;min&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mfccs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;axis&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# row-wise summaries
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;mfccs_max&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;max&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mfccs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;axis&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;mfccs_median&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;median&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mfccs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;axis&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;mfccs_mean&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mean&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mfccs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;axis&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;mfccs_variance&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mfccs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;axis&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;mfccs_skeweness&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;skew&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mfccs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;axis&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;mfccs_kurtosis&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kurtosis&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mfccs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;axis&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_min&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_max&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_median&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_mean&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_variance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_skeweness&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_kurtosis&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Let’s take a closer look at a feature vector we create with the above function. Explicitly, the first feature in our feature vector is the minimum of the first coefficient (first MFCC sequence) of the recording’s MFCC matrix. The second feature is the minimum value of the second coefficient in the recording’s MFCC matrix. Assume that the MFCC matrix has $k$ rows and that indices start at $1$. Then the $(k+1)$-th feature in our feature vector is the maximum (second summary statistic) value of the first coefficient in the recording’s MFCC matrix. The $(2k+1)$-th feature is the median of the first coefficient in the recording’s MFCC matrix. And so on for the rest of the summary statistics we chose: the $m$-th summary statistic makes up the $k$ features starting from the $((m-1)k + 1)$-th feature in the feature vector.&lt;/p&gt;
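
&lt;p&gt;The layout just described can be written down as a tiny lookup, which is handy when you later want to know which statistic and coefficient a given input unit corresponds to. This is only a sketch: the function name &lt;code&gt;feature_position&lt;/code&gt; and the &lt;code&gt;STATISTICS&lt;/code&gt; list are hypothetical helpers, and the order of the statistics is assumed to match the order in which they are concatenated.&lt;/p&gt;

```python
# Order in which the summary statistics are concatenated (assumed).
STATISTICS = ["min", "max", "median", "mean", "variance", "skewness", "kurtosis"]


def feature_position(statistic, coefficient, k=20):
    """Return the 1-based position of a feature in the concatenated vector.

    statistic is a name from STATISTICS, coefficient is the 1-based row of
    the MFCC matrix, and k is the number of rows (coefficients).
    """
    if coefficient not in range(1, k + 1):
        raise ValueError("coefficient must be between 1 and k")
    m = STATISTICS.index(statistic) + 1  # 1-based number of the statistic
    return (m - 1) * k + coefficient


print(feature_position("min", 1))        # 1
print(feature_position("min", 2))        # 2
print(feature_position("max", 1))        # 21, i.e. k + 1
print(feature_position("median", 1))     # 41, i.e. 2k + 1
print(feature_position("kurtosis", 20))  # 140, the last feature
```

&lt;p&gt;Keep in mind that these positions are 1-based to match the text; subtract one before using them to index a NumPy array.&lt;/p&gt;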

&lt;p&gt;The following example shows how to use librosa for performing the feature extraction I described above.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# First we load the audio file.
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;raw_sound&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sample_rate&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;librosa&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# file must be in the root folder of your project
&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;raw_sound:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;raw_sound&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;raw_sound.shape:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;raw_sound&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;mfccs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;librosa&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;feature&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mfcc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;raw_sound&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sample_rate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n_mfcc&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;20&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# compute the MFCC matrix
&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;# Next we compute the summary statistics, each of them summarizes the MFCC matrix in its own way.
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mfccs_min&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;min&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mfccs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;axis&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# row-wise minimum, etc
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mfccs_max&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;max&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mfccs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;axis&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;mfccs_median&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;median&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mfccs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;axis&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;mfccs_mean&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mean&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mfccs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;axis&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;mfccs_variance&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mfccs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;axis&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;mfccs_skeweness&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;skew&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mfccs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;axis&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;mfccs_kurtosis&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kurtosis&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mfccs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;axis&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;# We obtain the feature vector by concatenating the different summaries.
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;finalFeatureVector&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;concatenate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mfccs_min&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_max&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_median&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_mean&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_variance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_skeweness&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_kurtosis&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;mfccs:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;mfccs.shape:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;mfccs_min:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_min&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;mfccs_min.shape:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_min&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;mfccs_max:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_max&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;mfccs_max.shape:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_max&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;mfccs_median:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_median&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;mfccs_median.shape:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_median&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;mfccs_mean:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_mean&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;mfccs_mean.shape:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_mean&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;mfccs_variance:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_variance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;mfccs_variance.shape:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_variance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;mfccs_skeweness:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_skeweness&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;mfccs_skeweness.shape:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_skeweness&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;mfccs_kurtosis:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_kurtosis&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;mfccs_kurtosis.shape:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_kurtosis&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;finalFeatureVector&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;finalFeatureVector&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;finalFeatureVector.shape:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;finalFeatureVector&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The output of the example above shows the data at each stage of the feature extraction process described earlier:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;raw_sound&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.05454996&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.13038099&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.1837227&lt;/span&gt;  &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;0.02945998&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;0.00751382&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.03976963&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;raw_sound&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;88200&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,)&lt;/span&gt;


&lt;span class=&quot;n&quot;&gt;mfccs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;9.78217290e+01&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.15200264e+02&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.40122382e+02&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.96356880e+02&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.88242525e+02&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.70682403e+02&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
 &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;1.32125684e+02&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.27208858e+02&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.15394368e+02&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.34137467e+02&lt;/span&gt;
   &lt;span class=&quot;mf&quot;&gt;1.27680399e+02&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.22533017e+02&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
 &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;6.74225340e+01&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;6.66515689e+01&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;5.92767531e+01&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;3.84596698e+01&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;4.27690513e+01&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;4.38633941e+01&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
 &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
 &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;2.10589355e+00&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;2.17217352e+00&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.09952991e+01&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.37435825e+01&lt;/span&gt;
   &lt;span class=&quot;mf&quot;&gt;1.12389371e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.08770824e+01&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
 &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;5.70890730e-02&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.68226600e+00&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;6.01086590e+00&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;3.57389057e+00&lt;/span&gt;
   &lt;span class=&quot;mf&quot;&gt;3.50404748e+00&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;3.94450867e+00&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
 &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;6.35962364e+00&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;3.37899134e+00&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;6.57503920e+00&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;7.80786112e+00&lt;/span&gt;
   &lt;span class=&quot;mf&quot;&gt;6.53761178e+00&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;7.81919162e+00&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;mfccs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;20&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;173&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;


&lt;span class=&quot;n&quot;&gt;mfccsMin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;262.1882601&lt;/span&gt;   &lt;span class=&quot;mf&quot;&gt;103.68769703&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;70.92737613&lt;/span&gt;   &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;7.06636488&lt;/span&gt;   &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;2.57441136&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;14.41774564&lt;/span&gt;    &lt;span class=&quot;mf&quot;&gt;3.2479818&lt;/span&gt;     &lt;span class=&quot;mf&quot;&gt;5.35776817&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;28.04321332&lt;/span&gt;   &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;5.00591745&lt;/span&gt;
   &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;6.77451869&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;19.54877245&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;13.61704643&lt;/span&gt;    &lt;span class=&quot;mf&quot;&gt;1.02649618&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;16.86513529&lt;/span&gt;
   &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;2.19287018&lt;/span&gt;   &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;5.88764237&lt;/span&gt;   &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;6.10989292&lt;/span&gt;   &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;6.86591663&lt;/span&gt;   &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.98465032&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;mfccsMin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;20&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,)&lt;/span&gt;


&lt;span class=&quot;n&quot;&gt;mfccsMax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;97.82172897&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;168.4061548&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;28.47245793&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;38.02142843&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;21.11869576&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;16.15369493&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;30.22304018&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;30.83948912&lt;/span&gt;   &lt;span class=&quot;mf&quot;&gt;7.8262849&lt;/span&gt;   &lt;span class=&quot;mf&quot;&gt;23.61299204&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;12.44763379&lt;/span&gt;   &lt;span class=&quot;mf&quot;&gt;6.71378187&lt;/span&gt;   &lt;span class=&quot;mf&quot;&gt;9.72676893&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;18.96199983&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;16.05955284&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;24.25329661&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;23.63521651&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;21.56967718&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;14.08734175&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;18.52652466&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;mfccsMax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;20&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,)&lt;/span&gt;


&lt;span class=&quot;n&quot;&gt;mfccsMedian&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;211.31858874&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;143.49744481&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;54.27675658&lt;/span&gt;   &lt;span class=&quot;mf&quot;&gt;20.94282905&lt;/span&gt;    &lt;span class=&quot;mf&quot;&gt;9.16580307&lt;/span&gt;
   &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.21533032&lt;/span&gt;   &lt;span class=&quot;mf&quot;&gt;12.92539716&lt;/span&gt;   &lt;span class=&quot;mf&quot;&gt;19.48127578&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;15.44112234&lt;/span&gt;    &lt;span class=&quot;mf&quot;&gt;8.92052176&lt;/span&gt;
    &lt;span class=&quot;mf&quot;&gt;2.95877848&lt;/span&gt;   &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;9.21938462&lt;/span&gt;   &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;3.30998802&lt;/span&gt;   &lt;span class=&quot;mf&quot;&gt;12.52612904&lt;/span&gt;    &lt;span class=&quot;mf&quot;&gt;5.7662368&lt;/span&gt;
   &lt;span class=&quot;mf&quot;&gt;10.03349768&lt;/span&gt;    &lt;span class=&quot;mf&quot;&gt;7.50772457&lt;/span&gt;    &lt;span class=&quot;mf&quot;&gt;6.01753001&lt;/span&gt;    &lt;span class=&quot;mf&quot;&gt;2.9066586&lt;/span&gt;    &lt;span class=&quot;mf&quot;&gt;10.10142573&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;mfccsMedian&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;20&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,)&lt;/span&gt;


&lt;span class=&quot;n&quot;&gt;mfccsMean&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;2.07550980e+02&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.41899004e+02&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;5.35410878e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;2.05936852e+01&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;9.21844173e+00&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.36437309e-01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.33962109e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.92881863e+01&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.49302789e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;9.21802490e+00&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;3.25152418e+00&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;8.93898900e+00&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;3.07507361e+00&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.17549153e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;4.85579146e+00&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;9.71038003e+00&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;7.07648620e+00&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;6.55743907e+00&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;2.49942263e+00&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;9.86277491e+00&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;mfccsMean&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;20&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,)&lt;/span&gt;


&lt;span class=&quot;n&quot;&gt;mfccsVariance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;889.79984189&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;268.69156081&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;88.54314337&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;75.46131586&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;25.79757966&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;32.35109778&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;36.47155209&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;27.3671436&lt;/span&gt;   &lt;span class=&quot;mf&quot;&gt;42.67346134&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;35.26865155&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;14.07979971&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;26.54864055&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;19.22512523&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;15.06969567&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;35.02728229&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;20.04404124&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;21.50347683&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;29.0669686&lt;/span&gt;   &lt;span class=&quot;mf&quot;&gt;15.21941274&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;14.67586581&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;mfccsVariance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;20&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,)&lt;/span&gt;


&lt;span class=&quot;n&quot;&gt;mfccsSkeweness&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.87052288&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.47742533&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;0.35495499&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.09604224&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.1575763&lt;/span&gt;   &lt;span class=&quot;mf&quot;&gt;0.25362987&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;0.50156373&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.12891151&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;0.49370992&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;0.00782589&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;0.10513811&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;0.37190028&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;0.17411075&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.45545584&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.59793439&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;0.19273781&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.054743&lt;/span&gt;    &lt;span class=&quot;mf&quot;&gt;0.27577162&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.07832378&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.30175746&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;mfccsSkeweness&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;20&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,)&lt;/span&gt;


&lt;span class=&quot;n&quot;&gt;mfccsKurtosis&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.90751202&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.4865782&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.6830255&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.64238508&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.3210699&lt;/span&gt;   &lt;span class=&quot;mf&quot;&gt;0.08463533&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.26465951&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.29554489&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;0.11507486&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.42076076&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.32018804&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.18306213&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;0.03231722&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.42274134&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;0.1122558&lt;/span&gt;   &lt;span class=&quot;mf&quot;&gt;0.02620816&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;0.24541489&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.14746111&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;0.13099534&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.15450902&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;mfccsKurtosis&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;20&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,)&lt;/span&gt;


&lt;span class=&quot;n&quot;&gt;finalFeatureVector&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;2.62188260e+02&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.03687697e+02&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;7.09273761e+01&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;7.06636488e+00&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;2.57441136e+00&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.44177456e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;3.24798180e+00&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;5.35776817e+00&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;2.80432133e+01&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;5.00591745e+00&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;6.77451869e+00&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.95487725e+01&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.36170464e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.02649618e+00&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.68651353e+01&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;2.19287018e+00&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;5.88764237e+00&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;6.10989292e+00&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;6.86591663e+00&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;9.84650318e-01&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;9.78217290e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.68406155e+02&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;2.84724579e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;3.80214284e+01&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;2.11186958e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.61536949e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;3.02230402e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;3.08394891e+01&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;7.82628490e+00&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;2.36129920e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.24476338e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;6.71378187e+00&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;9.72676893e+00&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.89619998e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.60595528e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;2.42532966e+01&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;2.36352165e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;2.15696772e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.40873418e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.85265247e+01&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;2.11318589e+02&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.43497445e+02&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;5.42767566e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;2.09428290e+01&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;9.16580307e+00&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;2.15330319e-01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.29253972e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.94812758e+01&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.54411223e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;8.92052176e+00&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;2.95877848e+00&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;9.21938462e+00&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;3.30998802e+00&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.25261290e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;5.76623680e+00&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.00334977e+01&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;7.50772457e+00&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;6.01753001e+00&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;2.90665860e+00&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.01014257e+01&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;2.07550980e+02&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.41899004e+02&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;5.35410878e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;2.05936852e+01&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;9.21844173e+00&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.36437309e-01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.33962109e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.92881863e+01&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.49302789e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;9.21802490e+00&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;3.25152418e+00&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;8.93898900e+00&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;3.07507361e+00&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.17549153e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;4.85579146e+00&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;9.71038003e+00&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;7.07648620e+00&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;6.55743907e+00&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;2.49942263e+00&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;9.86277491e+00&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;8.89799842e+02&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;2.68691561e+02&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;8.85431434e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;7.54613159e+01&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;2.57975797e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;3.23510978e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;3.64715521e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;2.73671436e+01&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;4.26734613e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;3.52686516e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.40797997e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;2.65486406e+01&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;1.92251252e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.50696957e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;3.50272823e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;2.00440412e+01&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;2.15034768e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;2.90669686e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.52194127e+01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.46758658e+01&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;8.70522876e-01&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;4.77425325e-01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;3.54954990e-01&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;9.60422388e-02&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.57576301e-01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;2.53629873e-01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;5.01563735e-01&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.28911511e-01&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;4.93709923e-01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;7.82588861e-03&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.05138109e-01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;3.71900277e-01&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;1.74110753e-01&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;4.55455841e-01&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;5.97934394e-01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.92737805e-01&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;5.47430015e-02&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;2.75771615e-01&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;7.83237770e-02&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;3.01757456e-01&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;9.07512022e-01&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;4.86578202e-01&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;6.83025497e-01&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;6.42385082e-01&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;3.21069903e-01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;8.46353279e-02&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;2.64659508e-01&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;2.95544886e-01&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;1.15074865e-01&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;4.20760761e-01&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;3.20188037e-01&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.83062131e-01&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;3.23172225e-02&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;4.22741337e-01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.12255804e-01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;2.62081612e-02&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;2.45414889e-01&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.47461115e-01&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;1.30995342e-01&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.54509024e-01&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;finalFeatureVector&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;140&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Note that the feature vectors can vary a lot due to the nature of the recordings (e.g. the maximum for a gunshot recording can be much larger than for an air conditioner recording). Hence, we will mean normalize the extracted features, i.e. subtract each feature’s mean and divide by its sample standard deviation, as follows:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;mean_normalize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;featureMatrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;mean&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mean&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;featureMatrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;axis&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# compute mean of each column (feature)
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;featureMatrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;axis&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ddof&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# compute sample std of each column (feature)
&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;featureMatrix&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mean&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# subtract each column's mean from every value in the corresponding column
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;featureMatrix&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# divide values in each column with the corresponding sample std for that column
&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;featureMatrix&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
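
&lt;p&gt;Two caveats with the function above: a feature with zero variance leads to a division by zero, and the input matrix is overwritten in place. The following is a minimal sketch of a variant that guards against both and allows one set of statistics to be reused on another matrix. The names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;fit_normalizer&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;apply_normalizer&lt;/code&gt; and the toy data are assumptions for illustration, not part of the original code.&lt;/p&gt;

```python
import numpy as np


def fit_normalizer(feature_matrix):
    """Return the per-column mean and sample std of a feature matrix."""
    mean = np.mean(feature_matrix, axis=0)
    std = np.std(feature_matrix, axis=0, ddof=1)
    # assumption: constant features are left unscaled instead of dividing by zero
    std = np.where(std == 0, 1.0, std)
    return mean, std


def apply_normalizer(feature_matrix, mean, std):
    """Standardize a feature matrix without modifying the input."""
    return (feature_matrix - mean) / std


# toy data standing in for real feature vectors (rows) and features (columns)
rng = np.random.default_rng(0)
train = rng.normal(loc=5.0, scale=3.0, size=(100, 4))
validation = rng.normal(loc=5.0, scale=3.0, size=(20, 4))

mean, std = fit_normalizer(train)
train_normalized = apply_normalizer(train, mean, std)
validation_normalized = apply_normalizer(validation, mean, std)
```

&lt;p&gt;Whether the validation set should reuse the training statistics, as in this sketch, depends on the experiment setup.&lt;/p&gt;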

&lt;p&gt;The feature extraction phase tends to take a long time: my PC takes about 6 minutes to extract feature vectors of length 140 for one fold of the UrbanSound8K dataset. We will therefore store the extracted features. This way we only need to pay the price of extracting features once, and can reuse them while experimenting during classification later on.&lt;/p&gt;

&lt;p&gt;I plan to do a 10-fold cross-validation, and the following two functions make it easy to extract and store the features systematically. I use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;extract_features_from_directories&lt;/code&gt; to iterate through the folds and extract features from the recordings they contain. All the extracted feature vectors are combined into one feature matrix (rows are feature vectors, columns are features).&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;extract_features_from_directories&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;parent_dir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sub_dirs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;file_ext&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;*.wav&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;feature_matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;labels&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;empty&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;featureVectorLength&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;empty&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;label&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sub_dir&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;enumerate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sub_dirs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;glob&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;glob&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;parent_dir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sub_dir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;file_ext&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)):&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;try&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;mfccs_min&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_max&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_median&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_mean&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_variance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_skeweness&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_kurtosis&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;extract_features_from_file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Finished processing file: &quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;except&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;Exception&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Error while processing file: &quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;continue&lt;/span&gt;

            &lt;span class=&quot;c1&quot;&gt;# concatenate extracted features
&lt;/span&gt;            &lt;span class=&quot;n&quot;&gt;new_feature_vector&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;hstack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mfccs_min&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_max&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_median&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_mean&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_variance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_skeweness&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mfccs_kurtosis&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;

            &lt;span class=&quot;c1&quot;&gt;# add current feature vector as last row in feature matrix
&lt;/span&gt;            &lt;span class=&quot;n&quot;&gt;feature_matrix&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;vstack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;feature_matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_feature_vector&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;

            &lt;span class=&quot;c1&quot;&gt;# extract the class ID from the file name; os.path.basename works on both Windows and Unix
&lt;/span&gt;            &lt;span class=&quot;n&quot;&gt;labels&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;labels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;basename&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;split&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'-'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;array&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;feature_matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;array&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;labels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dtype&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
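
&lt;p&gt;Calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;np.vstack&lt;/code&gt; inside the loop copies the whole feature matrix on every iteration. As a sketch of an alternative, the feature vectors can be collected in a Python list and stacked once at the end. The helper &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;stack_feature_vectors&lt;/code&gt; and the toy vectors are hypothetical and only serve as an illustration.&lt;/p&gt;

```python
import numpy as np


def stack_feature_vectors(feature_vectors, feature_vector_length):
    """Stack 1-D feature vectors into a matrix with one row per vector."""
    if not feature_vectors:
        # mirror the empty (0, length) matrix used as the starting point above
        return np.empty((0, feature_vector_length))
    return np.vstack(feature_vectors)


# toy stand-ins for per-recording feature vectors of length 140
rows = [np.full(140, float(i)) for i in range(3)]
feature_matrix = stack_feature_vectors(rows, 140)
```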

&lt;p&gt;I use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;prepare_features&lt;/code&gt; to extract and store the training and validation sets for one run of the 10-fold cross-validation.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;prepare_features&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;training_dirs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;validation_dirs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;training_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;validation_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;parent_dir&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'Sound-Data'&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# name of the directory which contains the recordings
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;training_sub_dirs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;training_dirs&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;validation_sub_dirs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;validation_dirs&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;# ndarrays
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;training_features&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;training_labels&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;extract_features_from_directories&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;parent_dir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;training_sub_dirs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;test_features&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;test_labels&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;extract_features_from_directories&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;parent_dir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;validation_sub_dirs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;# convert ndarray to pandas dataframe
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;training_examples&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DataFrame&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;training_features&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;columns&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;featureVectorLength&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# convert ndarray to pandas series
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;training_labels&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Series&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;training_labels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tolist&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;# convert ndarray to pandas dataframe
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;validation_examples&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DataFrame&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;test_features&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;columns&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;featureVectorLength&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# convert ndarray to pandas series
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;validation_labels&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Series&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;test_labels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tolist&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;# store extracted training data (os.path.join keeps the path portable across Windows and Unix)
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;training_examples&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;to_pickle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'Extracted_Features'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;training_name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'_features.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;training_labels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;to_pickle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'Extracted_Features'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;training_name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'_labels.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;# store extracted validation data
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;validation_examples&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;to_pickle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'Extracted_Features'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;validation_name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'_features.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;validation_labels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;to_pickle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'Extracted_Features'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;validation_name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'_labels.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
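
&lt;p&gt;The stored features can later be loaded again with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pd.read_pickle&lt;/code&gt;. The following is a minimal round-trip sketch; it uses a temporary directory and toy data instead of the real extracted features, and the file names are only illustrative.&lt;/p&gt;

```python
import os
import tempfile

import numpy as np
import pandas as pd

# toy stand-ins for an extracted feature matrix and its labels
features = pd.DataFrame(np.arange(6.0).reshape(2, 3), columns=[1, 2, 3])
labels = pd.Series([3, 7])

with tempfile.TemporaryDirectory() as out_dir:
    features_path = os.path.join(out_dir, 'fold10_features.pkl')
    labels_path = os.path.join(out_dir, 'fold10_labels.pkl')

    # store once
    features.to_pickle(features_path)
    labels.to_pickle(labels_path)

    # reload whenever a new experiment needs them
    loaded_features = pd.read_pickle(features_path)
    loaded_labels = pd.read_pickle(labels_path)
```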

&lt;p&gt;The following snippet uses the above functions to create the training and validation sets that are used during the first run of the 10-fold cross-validation:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# On the first run I use the first 9 folds for training, the tenth for validation.
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;training_dirs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;fold1&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;fold2&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;fold3&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;fold4&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;fold5&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;fold6&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;fold7&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;fold8&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;fold9&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;validation_dirs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;fold10&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;# extracts and stores training and validation sets that are used for the first run of the 10-fold cross-validation.
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;prepare_features&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;training_dirs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;validation_dirs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'notFold10'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'fold10'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
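
&lt;p&gt;The remaining nine runs follow the same pattern with a different held-out fold. As a sketch, the splits for all runs could be generated as follows. The generator &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;fold_splits&lt;/code&gt; is hypothetical, and extending the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;notFold10&lt;/code&gt; naming scheme to the other folds is an assumption.&lt;/p&gt;

```python
def fold_splits(num_folds=10):
    """Yield (training_dirs, validation_dirs, training_name, validation_name) per run."""
    folds = ['fold%d' % i for i in range(1, num_folds + 1)]
    # start with the last fold held out, as in the first run above
    for held_out in reversed(folds):
        training_dirs = [fold for fold in folds if fold != held_out]
        # naming follows the 'notFold10' / 'fold10' pattern used above
        training_name = 'notFold' + held_out[len('fold'):]
        yield training_dirs, [held_out], training_name, held_out


splits = list(fold_splits())
first_training_dirs, first_validation_dirs, first_training_name, first_validation_name = splits[0]
```

&lt;p&gt;Each tuple could then be unpacked into a call such as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;prepare_features(*split)&lt;/code&gt;.&lt;/p&gt;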

&lt;p&gt;To summarize, we extracted one MFCC matrix per recording. The features we will use during classification are the summary statistics of the MFCC matrix’s coefficients (rows). By concatenating the different summaries of the MFCC matrix we obtain a compact representation of the original audio recording—the final feature vector. The code for this section can be found &lt;a href=&quot;https://gist.github.com/davidglavas/c33a9eb5bec736e47438ec546f629520&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Next, we will discuss the classification phase and how the neural network will use the feature vectors we developed in this section.&lt;/p&gt;

&lt;h2 id=&quot;classification&quot;&gt;Classification&lt;/h2&gt;
&lt;p&gt;In this section we will use a neural network to classify the audio recordings. We want to classify each audio recording into one of the 10 classes mentioned earlier. Hence, we are dealing with a multi-class classification problem where the classes are mutually exclusive (a recording can belong to only one class). Therefore, we want to build a neural network with a softmax layer as its output layer.&lt;/p&gt;
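
&lt;p&gt;As a reminder, softmax turns the raw output scores (logits) of the network, one per class, into probabilities that sum to one, which fits mutually exclusive classes. Here is a small NumPy sketch with made-up logits; it is purely illustrative and not taken from the code of this post.&lt;/p&gt;

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    # subtracting the maximum does not change the result but avoids overflow in exp
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)


# made-up scores for one recording, one score per class
logits = np.array([2.0, 0.5, -1.0, 0.0, 3.0, 1.0, -0.5, 0.2, 0.1, -2.0])
probabilities = softmax(logits)
predicted_class = int(np.argmax(probabilities))
```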

&lt;p&gt;Our goal is to create a simple but flexible framework which we can use for experiments. We want to control hyperparameters such as the learning rate, the regularization strength, the number of steps the optimization algorithm makes, the batch size, the number of hidden layers, and the number of hidden units in each layer. We also want to leave the door open for trying out different types of regularization (L1, L2, Dropout) and activation functions, changing the number of classes, and testing different optimization algorithms. We want this flexibility while minimizing the number of errors when switching between configurations during experiments. For everything else we are fine with reasonable defaults (set by TensorFlow’s designers), as long as they leave enough freedom to quickly try out interesting configurations.&lt;/p&gt;

&lt;p&gt;TensorFlow’s &lt;a href=&quot;https://www.tensorflow.org/api_docs/python/tf/estimator/Estimator&quot;&gt;Estimator API&lt;/a&gt; offers everything we need. Next we will see how to use the Estimator API’s &lt;a href=&quot;https://www.tensorflow.org/api_docs/python/tf/estimator/DNNClassifier&quot;&gt;DNNClassifier&lt;/a&gt; to build our network. The code that follows is a modified version of &lt;a href=&quot;https://developers.google.com/machine-learning/crash-course/multi-class-neural-networks/programming-exercise&quot;&gt;this programming exercise&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;We have to let our estimator know what type of data it can expect; we do this by defining a &lt;a href=&quot;https://www.tensorflow.org/get_started/feature_columns&quot;&gt;feature column&lt;/a&gt;. Then, we have to let our estimator know how to fetch data from our dataset, which we do by defining an &lt;a href=&quot;https://www.tensorflow.org/get_started/premade_estimators#create_input_functions&quot;&gt;input function&lt;/a&gt;. Finally, we will initialize the classifier and set up the training loop, where we also set up the monitoring of the metrics we are interested in while training the classifier (loss curves, confusion matrix).&lt;/p&gt;

&lt;p&gt;We start by defining the feature column:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;construct_feature_columns&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot;Construct the TensorFlow Feature Columns.

    Returns:
      A set of feature columns
    &quot;&quot;&quot;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;set&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;feature_column&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;numeric_column&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'audioFeatures'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;shape&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;featureVectorSize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Next we will define two functions for creating the input functions, one for fetching data from the training set, the other for the validation set:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;create_training_input_fn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;features&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;labels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;batch_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;num_epochs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;shuffle&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot;A custom input_fn for sending our feature vectors to the estimator for training.

    Args:
      features: The training features.
      labels: The training labels.
      batch_size: Batch size to use during training.

    Returns:
      A function that returns batches of training features and labels during training.
    &quot;&quot;&quot;&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;_input_fn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;num_epochs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;num_epochs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;shuffle&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shuffle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;idx&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;random&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;permutation&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;features&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;raw_features&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;audioFeatures&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;features&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reindex&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;idx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)}&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;raw_labels&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;array&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;labels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;idx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;

        &lt;span class=&quot;n&quot;&gt;ds&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Dataset&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from_tensor_slices&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;raw_features&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;raw_labels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;ds&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ds&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;batch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;batch_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;repeat&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;num_epochs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;shuffle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;ds&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ds&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shuffle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

        &lt;span class=&quot;c1&quot;&gt;# Returns the next batch of data.
&lt;/span&gt;        &lt;span class=&quot;n&quot;&gt;feature_batch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;label_batch&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ds&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;make_one_shot_iterator&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_next&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;feature_batch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;label_batch&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_input_fn&lt;/span&gt;


&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;create_predict_input_fn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;features&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;labels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;batch_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot;A custom input_fn for sending our feature vectors to the estimator for predictions.

    Args:
      features: The features to base predictions on.
      labels: The labels of the prediction examples.
      batch_size: Batch size to use when making predictions.

    Returns:
      A function that returns features and labels for predictions.
    &quot;&quot;&quot;&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;_input_fn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;raw_features&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;audioFeatures&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;features&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;values&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;raw_labels&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;array&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;labels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

        &lt;span class=&quot;n&quot;&gt;ds&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Dataset&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from_tensor_slices&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;raw_features&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;raw_labels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;ds&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ds&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;batch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;batch_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

        &lt;span class=&quot;c1&quot;&gt;# Returns the next batch of data.
&lt;/span&gt;        &lt;span class=&quot;n&quot;&gt;feature_batch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;label_batch&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ds&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;make_one_shot_iterator&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_next&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;feature_batch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;label_batch&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_input_fn&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
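
&lt;p&gt;One detail of the training input function is worth spelling out: the dataset is batched before it is shuffled, so the shuffle step reorders whole batches rather than individual examples. The examples themselves still arrive in random order because the rows were already permuted with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;np.random.permutation&lt;/code&gt;. The following is a minimal sketch of that effect in plain Python, without TensorFlow; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;make_batches&lt;/code&gt; is a hypothetical helper that only imitates batching and is not part of the code above.&lt;/p&gt;

```python
import random


def make_batches(rows, batch_size):
    """Group consecutive rows into batches (hypothetical stand-in for a batch step)."""
    return [rows[i:i + batch_size] for i in range(0, len(rows), batch_size)]


rows = list(range(8))            # stand-in for eight training examples
batched = make_batches(rows, 4)  # [[0, 1, 2, 3], [4, 5, 6, 7]]

# Shuffling after batching reorders the batches, not the rows inside them.
shuffled = list(batched)
random.Random(0).shuffle(shuffled)
print(shuffled)
```

&lt;p&gt;Every batch in the shuffled list is one of the original batches, only their order can change.&lt;/p&gt;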

&lt;p&gt;Next, we define the function that trains the model while periodically printing loss metrics to guide our hyperparameter search during experiments. We divide the total number of training steps into 10 periods. In each period, we train the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DNNClassifier&lt;/code&gt; for $steps/10$ steps, print the loss metrics, and move on to the next period. Each period continues training the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DNNClassifier&lt;/code&gt; from where the previous period left off, so the classifier we have after the last period is our final model.&lt;/p&gt;
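
&lt;p&gt;The period scheme can be sketched in isolation before looking at the full function. This is a toy illustration in plain Python, not the training code itself: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;run_in_periods&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;global_step&lt;/code&gt; are hypothetical names, and the counter merely stands in for the training state that the classifier carries from one period to the next.&lt;/p&gt;

```python
def run_in_periods(total_steps, periods=10):
    """Split total_steps into equal periods and record the running step count."""
    steps_per_period = total_steps // periods  # whole number of steps per period
    global_step = 0  # assumption: stands in for the model's persisted training state
    history = []
    for period in range(periods):
        # Each period resumes from the step count the previous period reached.
        global_step += steps_per_period
        # This is the point where loss metrics would be computed and printed.
        history.append((period, global_step))
    return history


history = run_in_periods(total_steps=1000)
print(history[0], history[-1])  # (0, 100) (9, 1000)
```

&lt;p&gt;With 1000 total steps, each of the 10 periods adds 100 steps on top of the previous one, so the last entry reflects the fully trained model.&lt;/p&gt;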

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;train_nn_classification_model&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;learning_rate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;regularization_strength&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;steps&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;batch_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;hidden_units&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;training_examples&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;training_labels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;validation_examples&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;validation_labels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;model_Name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'no_Name'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot;Trains a neural network classification model.

    In addition to training, this function also prints training progress information,
    a plot of the training and validation loss over time, as well as a confusion
    matrix.

    Args:
      learning_rate: A `float`, the learning rate to use.
      regularization_strength: A float, the regularization strength.
      steps: A non-zero `int`, the total number of training steps. A training step
        consists of a forward and backward pass using a single batch.
      batch_size: A non-zero `int`, the batch size.
      hidden_units: A `list` of int values, specifying the number of units in each layer.
      training_examples: A `DataFrame` containing the training features.
      training_labels: A `DataFrame` containing the training labels.
      validation_examples: A `DataFrame` containing the validation features.
      validation_labels: A `DataFrame` containing the validation labels.
      model_Name: A `string` containing the model's name which is used when storing the loss curve and confusion
       matrix plots.

    Returns:
      The trained `DNNClassifier` object.
    &quot;&quot;&quot;&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;periods&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;steps_per_period&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;steps&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;//&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;periods&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;# Create the input functions.
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;predict_training_input_fn&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;create_predict_input_fn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;training_examples&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;training_labels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;batch_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;predict_validation_input_fn&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;create_predict_input_fn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;validation_examples&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;validation_labels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;batch_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;training_input_fn&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;create_training_input_fn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;training_examples&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;training_labels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;batch_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;# Create feature columns.
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;feature_columns&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;construct_feature_columns&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;# Create a DNNClassifier object.
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;my_optimizer&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;train&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ProximalAdagradOptimizer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;learning_rate&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;learning_rate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;l2_regularization_strength&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;regularization_strength&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# can be swapped for l1 regularization
&lt;/span&gt;    &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;classifier&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;estimator&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DNNClassifier&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;feature_columns&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;feature_columns&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;n_classes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;hidden_units&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;hidden_units&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;optimizer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;my_optimizer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;config&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;contrib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;learn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RunConfig&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;keep_checkpoint_max&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;# Train the model, but do so inside a loop so that we can periodically assess loss metrics.
&lt;/span&gt;    &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Training model...&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;LogLoss error (on validation data):&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;training_errors&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;validation_errors&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;period&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;periods&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;# Train the model, starting from the prior state.
&lt;/span&gt;        &lt;span class=&quot;n&quot;&gt;classifier&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;train&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;input_fn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;training_input_fn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;steps&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;steps_per_period&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

        &lt;span class=&quot;c1&quot;&gt;# Use the current model to make predictions on both the training and validation sets.
&lt;/span&gt;        &lt;span class=&quot;n&quot;&gt;training_predictions&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;classifier&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;predict&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;input_fn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;predict_training_input_fn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;training_pred_class_id&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;array&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'class_ids'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;item&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;training_predictions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;training_pred_one_hot&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;keras&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;utils&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;to_categorical&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;training_pred_class_id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

        &lt;span class=&quot;n&quot;&gt;validation_predictions&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;classifier&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;predict&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;input_fn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;predict_validation_input_fn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;validation_pred_class_id&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;array&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'class_ids'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;item&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;validation_predictions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;validation_pred_one_hot&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;keras&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;utils&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;to_categorical&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;validation_pred_class_id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

        &lt;span class=&quot;c1&quot;&gt;# Use predictions to compute training and validation errors.
&lt;/span&gt;        &lt;span class=&quot;n&quot;&gt;training_log_loss&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;metrics&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;log_loss&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;training_labels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;training_pred_one_hot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;validation_log_loss&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;metrics&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;log_loss&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;validation_labels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;validation_pred_one_hot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

        &lt;span class=&quot;c1&quot;&gt;# Print validation error of current model.
&lt;/span&gt;        &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;  period %02d : %0.2f&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;period&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;validation_log_loss&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;

        &lt;span class=&quot;c1&quot;&gt;# Store loss metrics so we can plot them later.
&lt;/span&gt;        &lt;span class=&quot;n&quot;&gt;training_errors&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;training_log_loss&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;validation_errors&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;validation_log_loss&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Model training finished.&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# Remove event files to save disk space.
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;remove&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;glob&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;glob&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;classifier&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;model_dir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'events.out.tfevents*'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))))&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;# Compute predictions of final model.
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;final_predictions&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;classifier&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;predict&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;input_fn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;predict_validation_input_fn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;final_predictions&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;array&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'class_ids'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;item&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;final_predictions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;# Evaluate predictions of final model.
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;accuracy&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;metrics&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;accuracy_score&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;validation_labels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;final_predictions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Final accuracy (on validation data): %0.2f&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;accuracy&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;# Output a graph of loss metrics over periods.
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;plt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ylabel&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;LogLoss&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;plt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xlabel&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Periods&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;plt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;title&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;LogLoss vs. Periods&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;plt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;plot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;training_errors&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;label&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;training&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;plt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;plot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;validation_errors&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;label&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;validation&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;plt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;legend&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# plt.show()  # blocks execution
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;plt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;savefig&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'Results&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\\&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;model_Name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'_loss_curve.png'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bbox_inches&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'tight'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;plt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;gcf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;clear&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;# Create a confusion matrix.
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;cm&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;metrics&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;confusion_matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;validation_labels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;final_predictions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;# Normalize the confusion matrix by the number of samples in each class (rows).
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;cm_normalized&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cm&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;astype&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;float&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cm&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;axis&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[:,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;newaxis&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ax&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sns&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;heatmap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cm_normalized&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cmap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;bone_r&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;set_aspect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;plt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;title&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Confusion matrix&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;plt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ylabel&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;True label&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;plt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xlabel&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Predicted label&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# plt.show()  # blocks execution
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;plt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;savefig&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'Results&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\\&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;model_Name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'_confusion_matrix.png'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bbox_inches&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'tight'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;plt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;gcf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;clear&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;classifier&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We train the model by calling the training function we defined above like this:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# unpickle and prepare training data
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;training_examples&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mean_normalize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read_pickle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'Extracted_Features&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\\&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;notFold10_features.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;training_labels&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read_pickle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'Extracted_Features&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\\&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;notFold10_labels.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;# unpickle and prepare validation data
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;validation_examples&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mean_normalize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read_pickle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'Extracted_Features&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\\&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;fold10_features.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;validation_labels&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read_pickle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'Extracted_Features&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\\&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;fold10_labels.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;train_nn_classification_model&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;learning_rate&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.003&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;regularization_strength&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;steps&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;batch_size&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;32&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;hidden_units&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;120&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# One layer with 120 units, for more layers simply add more integers to the list.
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;training_examples&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;training_examples&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;training_labels&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;training_labels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;validation_examples&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;validation_examples&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;validation_labels&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;validation_labels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;To summarize, we used the Estimator API’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DNNClassifier&lt;/code&gt; to build a simple neural network and used it to classify the feature vectors we created earlier. The code for this section can be found &lt;a href=&quot;https://gist.github.com/davidglavas/60d102bb236cda4f2ff129324352dc86&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;results&quot;&gt;Results&lt;/h2&gt;

&lt;p&gt;To estimate how a given model configuration will perform on unseen data, we do a 10-fold cross-validation using the folds already provided with the UrbanSound8K dataset. My machine takes about 4 minutes (on average across different configurations with 1 layer) to train a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DNNClassifier&lt;/code&gt; on 9 folds and evaluate it on 1 fold, so a full 10-fold cross-validation takes about 40 minutes. This severely limits my hyperparameter search. I therefore searched for a good model configuration while training on 9 folds and validating on 1. Once I found a configuration that performed well, I ran a 10-fold cross-validation to see how it generalizes.&lt;/p&gt;
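
&lt;p&gt;To put these numbers in perspective, here is a rough sketch of the time cost. The per-run time is the roughly 4 minutes mentioned above; the number of configurations (12) and the helper name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;search_minutes&lt;/code&gt; are hypothetical and chosen only for illustration:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;MINUTES_PER_RUN = 4  # rough time to train on 9 folds and evaluate on 1
NUM_FOLDS = 10


def search_minutes(num_configurations, runs_per_configuration):
    '''Estimated wall-clock minutes for a hyperparameter search.'''
    return num_configurations * runs_per_configuration * MINUTES_PER_RUN


num_configurations = 12  # hypothetical size of the search
# one train/validate run per configuration
single_split = search_minutes(num_configurations, 1)
# a full 10-fold cross-validation per configuration
full_cross_validation = search_minutes(num_configurations, NUM_FOLDS)
print(single_split, full_cross_validation)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;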

&lt;p&gt;To perform a 10-fold cross-validation we first have to load the features we extracted and stored earlier. Then we use the training function to train models for the different folds:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;k_fold_cross_validation&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;training_set_names&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;validation_set_names&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot;
    Performs a k-fold cross validation. Trains k different models and lets you know how they perform by using the
    corresponding validation set.
    :param training_set_names: List of training sets stored as tuples. Each tuple is a pair of strings, first
    element is the name of the training examples, second element is the name of the corresponding training labels.
    :param validation_set_names: List of validation sets stored as tuples. Each tuple is a pair of strings, first
     element is the name of the validation examples, second element is the name of the corresponding validation labels.
    &quot;&quot;&quot;&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;# group each training set with its corresponding validation set
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;folds&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;zip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;training_set_names&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;validation_set_names&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;training_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;validation_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;folds&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;

        &lt;span class=&quot;n&quot;&gt;training_examples&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;training_labels&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;load_features&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;training_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;validation_examples&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;validation_labels&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;load_features&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;validation_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;#####################################################################################&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Model is trained with &quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;training_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;and validated with&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;validation_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;train_nn_classification_model&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;learning_rate&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.003&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;regularization_strength&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;steps&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;batch_size&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;32&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;hidden_units&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;120&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;training_examples&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;training_examples&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;training_labels&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;training_labels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;validation_examples&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;validation_examples&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;validation_labels&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;validation_labels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;model_Name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;training_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
 

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;load_features&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dataset_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot;
    Unpickles the given examples and labels. Mean normalizes the examples.
    :param dataset_name: Pair of names referring to an example and corresponding label set.
    :return: Actual dataset as a pair, first element are the mean normalized examples (pandas DataFrame), second
     element are the corresponding labels (pandas Series).
    &quot;&quot;&quot;&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;examples_path&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'Extracted_Features&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\\&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dataset_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# unpickles and mean normalizes examples
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;examples&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mean_normalize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read_pickle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;examples_path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;# unpickles labels
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;labels_path&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'Extracted_Features&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\\&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dataset_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;labels&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read_pickle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;labels_path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;examples&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;labels&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
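
&lt;p&gt;The hard-coded backslash in the paths above is Windows-specific. A minimal sketch of a portable alternative, assuming the same &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Extracted_Features&lt;/code&gt; directory (the helper name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;feature_path&lt;/code&gt; is hypothetical and not part of the original code):&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;import os

FEATURES_DIR = 'Extracted_Features'


def feature_path(file_name):
    '''Joins the feature directory and a file name using the separator of the current OS.'''
    return os.path.join(FEATURES_DIR, file_name)


# build the paths for one examples/labels pair
examples_path = feature_path('fold1_features.pkl')
labels_path = feature_path('fold1_labels.pkl')
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;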

&lt;p&gt;Finally, we perform the 10-fold cross-validation by specifying the folds and calling the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;k_fold_cross_validation&lt;/code&gt; function defined above. Note that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;k&lt;/code&gt; is equal to the number of pairs in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;training_set_names&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;validation_set_names&lt;/code&gt;. The first cross-validation run uses the first tuple in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;training_set_names&lt;/code&gt; (all folds except the first) for training and the first tuple in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;validation_set_names&lt;/code&gt; (the first fold) for validation. The remaining runs follow the same pattern, ending with training on the first 9 folds and validating on the tenth.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# order in training_set_names matches the order in validation_set_names
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;training_set_names&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'notFold1_features.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'notFold1_labels.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
                      &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'notFold2_features.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'notFold2_labels.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
                      &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'notFold3_features.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'notFold3_labels.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
                      &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'notFold4_features.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'notFold4_labels.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
                      &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'notFold5_features.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'notFold5_labels.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
                      &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'notFold6_features.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'notFold6_labels.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
                      &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'notFold7_features.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'notFold7_labels.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
                      &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'notFold8_features.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'notFold8_labels.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
                      &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'notFold9_features.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'notFold9_labels.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
                      &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'notFold10_features.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'notFold10_labels.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;validation_set_names&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'fold1_features.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'fold1_labels.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
                        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'fold2_features.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'fold2_labels.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
                        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'fold3_features.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'fold3_labels.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; 
                        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'fold4_features.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'fold4_labels.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
                        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'fold5_features.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'fold5_labels.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; 
                        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'fold6_features.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'fold6_labels.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
                        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'fold7_features.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'fold7_labels.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; 
                        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'fold8_features.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'fold8_labels.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
                        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'fold9_features.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'fold9_labels.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
                        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'fold10_features.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'fold10_labels.pkl'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt;


&lt;span class=&quot;n&quot;&gt;k_fold_cross_validation&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;training_set_names&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;validation_set_names&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
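
&lt;p&gt;Typing out all twenty file names by hand is error-prone. A minimal sketch of building the same two lists programmatically, assuming the files follow the naming pattern used above (the helper name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;make_fold_names&lt;/code&gt; is hypothetical):&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;def make_fold_names(num_folds=10):
    '''Builds (features, labels) file name pairs for every cross-validation run.'''
    training_set_names = []
    validation_set_names = []
    for i in range(1, num_folds + 1):
        # training data: every fold except fold i
        training_set_names.append(('notFold%d_features.pkl' % i, 'notFold%d_labels.pkl' % i))
        # validation data: fold i only
        validation_set_names.append(('fold%d_features.pkl' % i, 'fold%d_labels.pkl' % i))
    return training_set_names, validation_set_names


# same order in both lists, so run i pairs notFold_i with fold_i
training_set_names, validation_set_names = make_fold_names()
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;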

&lt;p&gt;In my (fairly limited) experiments, the best configuration I found achieved an average accuracy of 70.3% across all 10 folds:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;classifier&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;train_nn_classification_model&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;learning_rate&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.003&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;regularization_strength&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;steps&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;batch_size&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;32&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;hidden_units&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;120&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;training_examples&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;training_examples&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;training_labels&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;training_labels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;validation_examples&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;validation_examples&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;validation_labels&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;validation_labels&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;model_Name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;training_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The loss curves and confusion matrices are similar across folds. The following is obtained by training on the first 9 folds and validating on the 10th:&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://github.com/davidglavas/BlogFigures/blob/master/_posts/Figures/2018-05-20-lets-build-an-audio-classifier/classification_results.png?raw=true&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;While trying to compare results I found only one &lt;a href=&quot;http://iaser.org/Vol-2/Session%203/02EEECS212.pdf&quot;&gt;article&lt;/a&gt; that used simple fully connected neural networks to classify the UrbanSound8K dataset. Their best classification accuracy is 72.2% using 2 hidden layers and 3000 units per layer.&lt;/p&gt;

&lt;p&gt;Note that there are much better approaches for classifying the UrbanSound8K dataset. I used a simple fully connected neural network with one hidden layer and few hidden units due to hardware constraints. The most common deep learning based approach for classification of sounds is to convert the audio file to an image (e.g. a spectrogram, MFCC, or CRP), and then use a convolutional neural network to classify the image. The best classification accuracy on the UrbanSound8K dataset I could find is 93%; the approach is described &lt;a href=&quot;https://www.sciencedirect.com/science/article/pii/S1877050917316599&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The approach I used in this post is by no means a good one when it comes to maximizing classification performance. It’s good for learning because it can easily be transferred to similar problems to get working prototypes quickly. If you wish to use a different dataset of audio recordings, you could research what features work well for your data and adapt the feature extraction process. If you use fewer or more classes, simply change the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;n_classes&lt;/code&gt; parameter. If you want to try out more layers and/or units, simply change the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;hidden_units&lt;/code&gt; parameter. You can swap &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;l2_regularization_strength&lt;/code&gt; for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;l1_regularization_strength&lt;/code&gt; or add &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dropout=0.5&lt;/code&gt; to the initialization phase of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DNNClassifier&lt;/code&gt;. You can swap the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ProximalAdagradOptimizer&lt;/code&gt; with another &lt;a href=&quot;https://www.tensorflow.org/api_docs/python/tf/train/Optimizer&quot;&gt;optimizer&lt;/a&gt; (e.g. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tf.train.AdagradOptimizer(learning_rate=learning_rate)&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;To conclude, we learned how to use librosa to extract features from audio files. Then we learned how to build a simple but flexible framework for quick experiments with the Estimator API. Finally we learned how to evaluate results and compared it with the results of others. Feel free to comment on what I could have done better or what you would have done differently.&lt;/p&gt;</content><author><name>davidglavas</name></author><category term="blog" /><category term="audio classification" /><category term="TensorFlow" /><summary type="html"></summary></entry><entry><title type="html">Computing Visibility Polygons</title><link href="http://davidglavas.github.io/computing-visibility-polygons/" rel="alternate" type="text/html" title="Computing Visibility Polygons" /><published>2019-02-20T12:34:00+00:00</published><updated>2019-02-20T12:34:00+00:00</updated><id>http://davidglavas.github.io/computing-visibility-polygons</id><content type="html" xml:base="http://davidglavas.github.io/computing-visibility-polygons/">&lt;p align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://raw.githubusercontent.com/davidglavas/BlogFigures/master/_posts/Figures/2018-02-20-computing-visibility-polygons/FrontVisibilityPolygon.jpg&quot; /&gt;
&lt;/p&gt;

&lt;h2 id=&quot;tldr&quot;&gt;TL;DR&lt;/h2&gt;
&lt;p&gt;I discuss the gist and limitations of my &lt;a href=&quot;https://github.com/davidglavas/Visibility-Polygons-by-Joe-Simpson&quot;&gt;implementation&lt;/a&gt; of Joe and Simpson’s &lt;a href=&quot;https://cs.uwaterloo.ca/research/tr/1985/CS-85-38.pdf&quot;&gt;visibility polygon algorithm&lt;/a&gt;, that is, an asymptotically optimal algorithm for computing the visibility polygon from a point inside of a simple polygon.&lt;/p&gt;

&lt;h2 id=&quot;the-problem&quot;&gt;The Problem&lt;/h2&gt;
&lt;p&gt;Recently, I took a course on computational geometry and got interested in the notion of visibility. Besides its applications in hidden surface removal (HSR) algorithms and exact robot motion planning, it received a great amount of attention through the art gallery problem. The first application that came to my mind was video games that incorporate a &lt;a href=&quot;https://legends2k.github.io/2d-fov/&quot;&gt;fog of war&lt;/a&gt;, that is, the computation of the part of a player’s surroundings that is visible to them as they navigate a map. Unfortunately, the algorithm we will be talking about doesn’t work with obstacles, making it of limited use in games.&lt;/p&gt;

&lt;p&gt;There are many different types of &lt;a href=&quot;https://en.wikipedia.org/wiki/Visibility_(geometry)#Concepts_and_problems&quot;&gt;visibility problems&lt;/a&gt;. Some deal with finding viewpoints to illuminate certain parts of the environment, while others deal with computing visible parts of the environment for a given viewpoint. Some of them consider a single viewpoint, some consider multiple viewpoints, and yet others consider different types of view elements than a point. The list goes on and on. In this post we will focus our attention on the following variant.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Problem Statement.&lt;/strong&gt; Given a viewpoint $z$ inside of a simple polygon $P$ with $n$ vertices, we want to compute the visibility polygon $VP(P, z)$, which consists of all points in $P$ visible from the viewpoint $z$. We say that point $p$ is visible from point $q$ (and conversely, $q$ is visible from $p$) if and only if the line segment $\overline{pq}$ lies completely in $P$.&lt;/p&gt;

&lt;h2 id=&quot;background&quot;&gt;Background&lt;/h2&gt;
&lt;p&gt;The visibility polygon from a single viewpoint $z$ can be computed naively in $\mathcal{O}(n^2)$ time. Simply cast a ray from $z$ towards every vertex of the polygon $P$ in, let’s say, counter-clockwise order. For each ray we iterate through all edges and store the closest intersection (by Euclidean distance to $z$) as a vertex of the visibility polygon. Correctness follows from the observation that the visibility polygon’s boundary changes its shape only due to vertices of the polygon.&lt;/p&gt;
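
&lt;p&gt;The ray-edge intersection at the heart of the naive approach can be written down in a few lines. This is only a sketch in my own notation: the symbols $d$, $e$, $t$, $u$ and the 2D cross product $p \times q = p_x q_y - p_y q_x$ are assumptions made for illustration and are not taken from the paper. Let $d = v_i - z$ be the direction of the ray towards vertex $v_i$, and let an edge run from $a$ to $b$ with $e = b - a$. Solving $z + t\,d = a + u\,e$ for $t$ and $u$ gives&lt;/p&gt;

&lt;p&gt;$t = \frac{(a - z) \times e}{d \times e}, \qquad u = \frac{(a - z) \times d}{d \times e}.$&lt;/p&gt;

&lt;p&gt;The ray hits the edge when $d \times e \neq 0$, $t \ge 0$, and $0 \le u \le 1$, and the hit closest to $z$ is the one with the smallest $t$. Each of the $n$ rays is tested against all $n$ edges in constant time per test, which is where the $\mathcal{O}(n^2)$ bound comes from.&lt;/p&gt;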

&lt;p&gt;The above approach is simple yet computationally inefficient for large polygons. Fortunately, more efficient algorithms have been published, for example &lt;a href=&quot;https://search.ieice.org/bin/summary.php?id=e68-e_9_557&quot;&gt;Asano’s&lt;/a&gt; $\mathcal{O}(n \log n)$ time sweeping algorithm and &lt;a href=&quot;https://cs.uwaterloo.ca/research/tr/1985/CS-85-38.pdf&quot;&gt;Joe and Simpson’s&lt;/a&gt; $\mathcal{O}(n)$ time algorithm (yes, those are the ones used by &lt;a href=&quot;https://arxiv.org/pdf/1403.3905.pdf&quot;&gt;CGAL&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Quick Background of Joe and Simpson’s algorithm.&lt;/strong&gt; Linear time algorithms have been shown to be optimal for computing the visibility polygon from a single viewpoint inside of a simple polygon. Such an algorithm was first proposed by &lt;a href=&quot;https://www.sciencedirect.com/science/article/pii/0196677481900195&quot;&gt;ElGindy and Avis&lt;/a&gt;; it requires three stacks and is quite complicated. Then, a conceptually simpler algorithm requiring only one stack was proposed by &lt;a href=&quot;https://www.sciencedirect.com/science/article/pii/0734189X83900658&quot;&gt;Lee&lt;/a&gt;. Later, Joe and Simpson showed that both algorithms return wrong results for polygons that wind sufficiently, and they published a &lt;a href=&quot;https://cs.uwaterloo.ca/research/tr/1985/CS-85-38.pdf&quot;&gt;correction of Lee’s algorithm&lt;/a&gt;. It is their algorithm that we’ll take a look at.&lt;/p&gt;

&lt;h2 id=&quot;the-algorithm&quot;&gt;The Algorithm&lt;/h2&gt;
&lt;p&gt;Instead of a futile attempt to try and capture the algorithm in more detail than Joe &amp;amp; Simpson did, I’ll present to you an overview that will be enough to understand the main idea. We’ll also take a closer look at parts of the algorithm for which there is no pseudocode in the paper—the pre- and post-processing.&lt;/p&gt;

&lt;p&gt;The algorithm runs in $\mathcal{O}(n)$ time and space. It makes assumptions on the input which we establish in the preprocessing step—this produces a list of vertices $V = v_0, v_1, \cdots, v_n $ that represents the boundary of $P$ in a specific order, depending on the position of viewpoint $z$ as described in the paper (section two, first two paragraphs).&lt;/p&gt;

&lt;h3 id=&quot;preprocessing&quot;&gt;Preprocessing&lt;/h3&gt;
&lt;p&gt;The preprocessing will shift the polygon such that the viewpoint $z$ becomes the new origin and it will rotate the polygon such that the closest vertex $v_0$ lies on the x-axis next to $z$—this ensures a line of sight between the first vertex in $V$ and our point of reference $z$ which makes the algorithm’s subsequent design simpler. We’ll make rotations simpler by working with coordinates in polar form.&lt;/p&gt;

&lt;div class=&quot;language-java highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;	&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Pair&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;VsRep&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Double&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;preprocess&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;CCWPolygon&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pol&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Point2D&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;z&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
		&lt;span class=&quot;c1&quot;&gt;// shift polygon such that z is origin&lt;/span&gt;
		&lt;span class=&quot;n&quot;&gt;pol&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pol&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;shiftToOrigin&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;z&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
		
		&lt;span class=&quot;c1&quot;&gt;// determines the closest vertex to z&lt;/span&gt;
		&lt;span class=&quot;kt&quot;&gt;boolean&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;zIsVertex&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pol&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;vertices&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;contains&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;CommonUtils&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;origin2D&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
		&lt;span class=&quot;nc&quot;&gt;PolarPoint2D&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;getInitialVertex&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pol&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;zIsVertex&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
		
		&lt;span class=&quot;c1&quot;&gt;// converts the polygon's vertices from Cartesian to polar&lt;/span&gt;
		&lt;span class=&quot;nc&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;PolarPoint2D&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;V&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pol&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;vertices&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;stream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;PolarPoint2D&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)).&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;collect&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;Collectors&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;toList&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;());&lt;/span&gt;
		
		&lt;span class=&quot;c1&quot;&gt;// adjusts list V such that v0 is at the beginning&lt;/span&gt;
		&lt;span class=&quot;n&quot;&gt;placeV0First&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;V&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
		
		&lt;span class=&quot;c1&quot;&gt;// if z is on boundary then [v0, v1, ..., vk, z] -&amp;gt; [z, v0, v1, ..., vk]&lt;/span&gt;
		&lt;span class=&quot;n&quot;&gt;adjustPositionOfz&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;V&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;zIsVertex&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;z&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
		
		&lt;span class=&quot;c1&quot;&gt;// rotate all points of the shifted polygon clockwise such that v0 lies on the x axis&lt;/span&gt;
		&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;PolarPoint2D&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;curr&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;V&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
			&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;curr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;isOrigin&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;())&lt;/span&gt;
				&lt;span class=&quot;n&quot;&gt;curr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;rotateClockWise&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;v0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;theta&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
		&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
		&lt;span class=&quot;c1&quot;&gt;// return the preprocessed vertices and the angle of rotation&lt;/span&gt;
		&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;Pair&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;VsRep&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;V&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;zIsVertex&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;theta&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
	&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;main-idea&quot;&gt;Main Idea&lt;/h3&gt;
&lt;p&gt;Then the algorithm proceeds towards its three main procedures, &lt;em&gt;Advance&lt;/em&gt;, &lt;em&gt;Retard&lt;/em&gt;, and &lt;em&gt;Scan&lt;/em&gt;, which handle the three different scenarios that can occur during the monotone scan of $V$, that is, of $P$’s boundary. While iterating through $V$, a stack $S = s_0, s_1, \cdots, s_t$ of vertices is maintained. It holds the candidate vertices of the visibility polygon found so far; note that vertices in $S$ are not necessarily vertices of the final visibility polygon. The three procedures are responsible for handling vertices while iterating through $V$ such that the final content of $S$ is all the information necessary for the postprocessing step to construct the final visibility polygon. &lt;em&gt;Advance&lt;/em&gt; is responsible for pushing vertices from $V$ onto the stack $S$, &lt;em&gt;Retard&lt;/em&gt; for popping vertices from $S$, and &lt;em&gt;Scan&lt;/em&gt; for skipping vertices in $V$ that have no business modifying $S$. Assume that $V$ is being scanned with $v_j, v_{j+1}$ as the current edge, and that $v_j$ is visible. For $v_{j+1}$ one of the following three cases can occur:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;$v_{j+1}$ is visible and the newly discovered edge doesn’t obstruct previous vertices $\implies$ &lt;em&gt;Advance&lt;/em&gt; is called. It pushes $v_{j+1}$ onto $S$.&lt;/li&gt;
  &lt;li&gt;$v_{j+1}$ is visible and the newly discovered edge obstructs previous vertices $\implies$ &lt;em&gt;Retard&lt;/em&gt; is called. It pops obstructed vertices from $S$ and pushes $v_{j+1}$.&lt;/li&gt;
  &lt;li&gt;$v_{j+1}$ is not visible because it is obstructed by previous vertices $\implies$ &lt;em&gt;Scan&lt;/em&gt; is called. It doesn’t modify $S$ and keeps iterating through $V$ until it reaches a visible vertex.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The algorithm switches between the three procedures until $V$ is scanned completely. At this point $S$ contains all the necessary information to construct the final visibility polygon. The switching between procedures depends on the cumulative angular displacement of scanned vertices with respect to the viewpoint $z$, which is updated whenever a new vertex from $V$ is handled; the paper describes exactly how.&lt;/p&gt;
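
&lt;p&gt;To get a feeling for why an angular displacement is tracked instead of a plain polar angle, here is an informal sketch. The symbols $\alpha$, $\theta$ and $\delta_i$ are my shorthand, and the paper’s exact definition (including how ties and degenerate cases are handled) takes precedence. Writing $\theta(v)$ for the polar angle of $v$ around $z$, let $\delta_i$ be the signed angle at $z$ swept when moving from $v_{i-1}$ to $v_i$, with $-\pi \le \delta_i \le \pi$ and counterclockwise counted as positive. Then&lt;/p&gt;

&lt;p&gt;$\alpha(v_0) = \theta(v_0), \qquad \alpha(v_i) = \alpha(v_{i-1}) + \delta_i.$&lt;/p&gt;

&lt;p&gt;Unlike $\theta$, which wraps around after a full turn, $\alpha$ keeps growing or shrinking as the boundary winds around $z$. For example, five consecutive sweeps of $\frac{\pi}{2}$ starting from $\alpha(v_0) = 0$ end at $\alpha = \frac{5\pi}{2}$, while the polar angle of that last vertex is just $\frac{\pi}{2}$.&lt;/p&gt;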

&lt;h3 id=&quot;postprocessing&quot;&gt;Postprocessing&lt;/h3&gt;

&lt;p&gt;The preprocessing step modified the original input polygon $P$ in order to establish assumptions made by the rest of the algorithm, mainly that the first vertex $v_{0}$ that we process is visible, which eliminates the need for subsequent special cases. The main algorithm therefore worked on a modified input polygon $P'$ and viewpoint $z'$, and produced $VP(P', z')$, whose coordinates correspond to $P'$ and not to our original input polygon $P$.&lt;/p&gt;

&lt;p&gt;The postprocessing converts $VP(P', z')$ to the corresponding visibility polygon in $P$ by applying the “inverse” of the operations applied to $P$ and $z$ during the preprocessing. Before rotating and shifting $VP(P', z')$ back to obtain $VP(P, z)$, we have to reverse the stack’s content to establish counterclockwise order: we pushed vertices onto the stack while iterating in counterclockwise order, so just popping them would give us $VP(P', z')$ in clockwise order.&lt;/p&gt;

&lt;div class=&quot;language-java highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;	&lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;CCWPolygon&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;postprocess&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;VertDispl&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pre_s&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;VsRep&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;V&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Point2D&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;z&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;double&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;initAngle&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
		&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;V&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;zIsVertex&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
			&lt;span class=&quot;n&quot;&gt;pre_s&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;VertDispl&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;PolarPoint2D&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;CommonUtils&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;origin2D&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;));&lt;/span&gt;
		
		&lt;span class=&quot;c1&quot;&gt;// reverse order of stack to establish CCW order of final visibility polygon&lt;/span&gt;
		&lt;span class=&quot;nc&quot;&gt;Collections&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;reverse&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pre_s&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
		
		&lt;span class=&quot;c1&quot;&gt;// convert VertDispl to PolarPoint2D&lt;/span&gt;
		&lt;span class=&quot;nc&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;PolarPoint2D&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rotatedPol&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pre_s&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;stream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;v&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;collect&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;Collectors&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;toList&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;());&lt;/span&gt;
		
		&lt;span class=&quot;c1&quot;&gt;// rotates points back to original position before the rotation in preprocess()&lt;/span&gt;
		&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;PolarPoint2D&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;curr&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rotatedPol&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
			&lt;span class=&quot;n&quot;&gt;curr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;rotateClockWise&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;initAngle&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
		&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
		
		&lt;span class=&quot;c1&quot;&gt;// convert PolarPoint2D to Point2D&lt;/span&gt;
		&lt;span class=&quot;nc&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;Point2D&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;shiftedPol&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rotatedPol&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;stream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;v&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;toCartesian&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()).&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;collect&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;Collectors&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;toList&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;());&lt;/span&gt;
		
		&lt;span class=&quot;c1&quot;&gt;// shifts points back to their position before the shift in preprocess()&lt;/span&gt;
		&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;Point2D&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;curr&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;shiftedPol&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
			&lt;span class=&quot;n&quot;&gt;curr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;setLocation&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;curr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getX&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;z&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getX&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;curr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getY&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;z&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getY&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;());&lt;/span&gt;
		
		&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;CCWPolygon&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shiftedPol&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
	&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
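
&lt;p&gt;As a sanity check, the pre- and postprocessing can be traced by hand for a single vertex. This is a worked example with made-up coordinates, and it assumes that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rotateClockWise(a)&lt;/code&gt; subtracts &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a&lt;/code&gt; from a point’s polar angle, which is how the calls in the code read but is not shown in this post. Take $z = (1, 1)$, $v_0 = (2, 2)$ and another vertex $v = (1, 3)$:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Shift: $v_0 - z = (1, 1)$ and $v - z = (0, 2)$, which are $(\sqrt{2}, \frac{\pi}{4})$ and $(2, \frac{\pi}{2})$ in polar form $(r, \theta)$.&lt;/li&gt;
  &lt;li&gt;Rotate clockwise by $\theta_0 = \frac{\pi}{4}$: $v_0$ becomes $(\sqrt{2}, 0)$, which lies on the x-axis, and $v$ becomes $(2, \frac{\pi}{4})$.&lt;/li&gt;
  &lt;li&gt;Undo the rotation: rotating clockwise by $-\theta_0$ maps $(2, \frac{\pi}{4})$ back to $(2, \frac{\pi}{2})$, that is, $(0, 2)$ in Cartesian coordinates.&lt;/li&gt;
  &lt;li&gt;Undo the shift: $(0, 2) + z = (1, 3)$, the original $v$.&lt;/li&gt;
&lt;/ul&gt;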

&lt;p&gt;&lt;strong&gt;Time and space complexity.&lt;/strong&gt; Each vertex is scanned just once, at most two vertices are pushed onto the stack $S$ at each iteration, and popped vertices are never pushed again. This implies that $V$ and $S$ contain a linear number of vertices with respect to the input polygon’s size $n$. Hence, the algorithm runs in $\mathcal{O}(n)$ time and space.&lt;/p&gt;
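
&lt;p&gt;Spelled out as a rough count (a sketch that only uses the facts stated above, with $V$ holding $n + 1$ entries): at most two pushes per scanned vertex gives at most $2(n + 1)$ pushes in total. Every pop removes an element that was pushed earlier, so there are at most as many pops as pushes. The stack therefore sees at most $2(n + 1) + 2(n + 1) = 4(n + 1)$ operations, which is $\mathcal{O}(n)$.&lt;/p&gt;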

&lt;p&gt;&lt;strong&gt;What I left out.&lt;/strong&gt; The paper covers the degenerate cases and the intricacies of the angular displacement, including how it affects the switching between &lt;em&gt;Advance&lt;/em&gt;, &lt;em&gt;Scan&lt;/em&gt;, and &lt;em&gt;Retard&lt;/em&gt;, together with a detailed example of how exactly this algorithm avoids the errors made by previously published algorithms. It also contains the rationale that justifies the use of angular displacements and of course a proof of the algorithm’s correctness.&lt;/p&gt;

&lt;h2 id=&quot;tests&quot;&gt;Tests&lt;/h2&gt;
&lt;p&gt;I included tests of the pre- and post-processing steps, see &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TestPreprocessing.java&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TestVisibilityPol.java&lt;/code&gt; for details. All of the following visualizations can be reproduced via the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DrawVisibilityPolygons.java&lt;/code&gt; file.&lt;/p&gt;

&lt;p&gt;Given the input polygon $P$ and a viewpoint $z$, I created tests for the following six scenarios. The polygon can be either convex or concave; for both types $z$ can be in $P$’s interior, on an edge of $P$’s boundary, or on one of $P$’s vertices. The following figures are visualizations of those six scenarios.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://raw.githubusercontent.com/davidglavas/BlogFigures/master/_posts/Figures/2018-02-20-computing-visibility-polygons/ConvexVisibilityFigure1.jpg&quot; /&gt;
&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://raw.githubusercontent.com/davidglavas/BlogFigures/master/_posts/Figures/2018-02-20-computing-visibility-polygons/ConcaveVisibilityFigure1.jpg&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;The algorithm can also be used to create the illusion of computing the visibility region from multiple viewpoints, that is, all points in $P$ visible from at least one of the viewpoints. I say illusion because the algorithm actually computes the visibility polygon for each of the viewpoints individually and unions them by drawing them onto the same plane. This points to a natural approach to actually compute the visibility region: we could use Joe and Simpson’s algorithm to compute the visibility polygons individually and union them with, say, &lt;a href=&quot;http://www.cs.ucr.edu/~vbz/cs230papers/martinez_boolean.pdf&quot;&gt;Martinez et al.’s algorithm&lt;/a&gt;.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://raw.githubusercontent.com/davidglavas/BlogFigures/master/_posts/Figures/2018-02-20-computing-visibility-polygons/HMSExample.PNG&quot; /&gt;
&lt;/p&gt;
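
&lt;p&gt;As a rough illustration of the union step, the sketch below uses &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;java.awt.geom.Area&lt;/code&gt; from the JDK as a stand-in; it is not Martinez et al.’s algorithm. It assumes the visibility polygons are already available as non-empty arrays of {x, y} vertex coordinates. The class &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;VisibilityRegionSketch&lt;/code&gt; and its methods are hypothetical names that are not part of the repository.&lt;/p&gt;

```java
import java.awt.geom.Area;
import java.awt.geom.Path2D;

// Hypothetical helper, not part of the original implementation.
final class VisibilityRegionSketch {

    // Builds a closed path from vertices given as {x, y} pairs (assumed non-empty).
    static Path2D.Double toPath(double[][] vertices) {
        Path2D.Double path = new Path2D.Double();
        for (double[] v : vertices) {
            if (path.getCurrentPoint() == null) {
                path.moveTo(v[0], v[1]); // the first vertex starts the path
            } else {
                path.lineTo(v[0], v[1]);
            }
        }
        path.closePath();
        return path;
    }

    // Unions the given polygons into a single region.
    static Area union(double[][]... polygons) {
        Area region = new Area();
        for (double[][] polygon : polygons) {
            region.add(new Area(toPath(polygon)));
        }
        return region;
    }
}
```

&lt;p&gt;Note that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Area&lt;/code&gt; also works in double precision, so this stand-in is not immune to round-off errors either.&lt;/p&gt;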

&lt;h2 id=&quot;usage&quot;&gt;Usage&lt;/h2&gt;
&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;VisibilityPolygon&lt;/code&gt; class can be used to compute the visibility polygon from a point inside of a simple polygon (given as n vertices in counterclockwise order) in O(n) time and space. Here is an example:&lt;/p&gt;

&lt;div class=&quot;language-java highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;    &lt;span class=&quot;c1&quot;&gt;// initialize polygon vertices in CCW order&lt;/span&gt;
	&lt;span class=&quot;nc&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;Point2D&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;vertices&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;ArrayList&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;();&lt;/span&gt;
	&lt;span class=&quot;n&quot;&gt;vertices&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Point2D&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;Double&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;));&lt;/span&gt;
	&lt;span class=&quot;n&quot;&gt;vertices&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Point2D&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;Double&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;));&lt;/span&gt;
	&lt;span class=&quot;n&quot;&gt;vertices&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Point2D&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;Double&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;));&lt;/span&gt;
	&lt;span class=&quot;n&quot;&gt;vertices&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Point2D&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;Double&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;));&lt;/span&gt;
	&lt;span class=&quot;n&quot;&gt;vertices&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Point2D&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;Double&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;));&lt;/span&gt;
	&lt;span class=&quot;n&quot;&gt;vertices&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Point2D&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;Double&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;));&lt;/span&gt;
	
	&lt;span class=&quot;c1&quot;&gt;// initialize polygon&lt;/span&gt;
	&lt;span class=&quot;nc&quot;&gt;CCWPolygon&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pol&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;CCWPolygon&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;vertices&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
	
	&lt;span class=&quot;c1&quot;&gt;// initialize viewpoint&lt;/span&gt;
	&lt;span class=&quot;nc&quot;&gt;Point2D&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;z&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Point2D&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;Double&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
	
	&lt;span class=&quot;c1&quot;&gt;// VP contains the visibility polygon from z in pol in CCW order.&lt;/span&gt;
	&lt;span class=&quot;nc&quot;&gt;CCWPolygon&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;VP&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;VisibilityPolygon&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;computeVisPol&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pol&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;z&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;robustness-issues&quot;&gt;Robustness Issues&lt;/h2&gt;
&lt;p&gt;Substituting floating-point arithmetic for the real arithmetic assumed in the paper doesn’t go unpunished. My implementation will fail for certain inputs due to round-off errors caused by the inherent limitations of floating-point arithmetic. A straightforward solution would be to use a library that provides arbitrary-precision arithmetic, such as &lt;a href=&quot;http://www.apfloat.org/apfloat_java/&quot;&gt;Apfloat&lt;/a&gt; or &lt;a href=&quot;http://jscience.org/&quot;&gt;JScience&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Another approach, which would yield a less robust but presumably more efficient implementation than arbitrary-precision arithmetic, would be to modify the predicates and experimentally assess the improvement in robustness. In order to implement the above algorithm in a robust manner, it is necessary to robustly implement the predicates upon which it relies. The algorithm repeatedly runs a two-dimensional orientation test to determine whether a point lies to the left of, to the right of, or on a line defined by two other points. It also computes intersections between lines and segments, half-lines and segments, and between two segments. We will see that all the predicates can be reduced to orientation tests.&lt;/p&gt;

&lt;p&gt;For the orientation test we use the &lt;a href=&quot;https://www.cs.cmu.edu/~quake/robust.html&quot;&gt;determinant approach&lt;/a&gt;—it’s fast and immediately applicable to double precision floating-point inputs.&lt;/p&gt;

&lt;p&gt;The orientation test is performed by evaluating the sign of $orientation(A, B, C)$:&lt;/p&gt;

&lt;p&gt;\begin{equation}
orientation(A, B, C) = 
\begin{vmatrix}
a_x &amp;amp; a_y &amp;amp; 1 \\ 
b_x &amp;amp; b_y &amp;amp; 1 \\ 
c_x &amp;amp; c_y &amp;amp; 1 
\end{vmatrix}
=
\begin{vmatrix}
a_x - c_x &amp;amp; a_y - c_y \\ 
b_x - c_x &amp;amp; b_y - c_y
\end{vmatrix}
\end{equation}&lt;/p&gt;

&lt;p&gt;If $orientation(A, B, C)$ is less than 0, then $C$ lies to the right of the directed line that goes through $A$ and $B$; if it is greater than 0, then $C$ lies to the left of that line; and if it is equal to 0, then $C$ lies on the line.&lt;/p&gt;
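
&lt;p&gt;A minimal sketch of the determinant test above in plain double-precision arithmetic. The class name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;OrientationSketch&lt;/code&gt; and the method signature are assumptions made for illustration; the version in the repository may look different.&lt;/p&gt;

```java
import java.awt.geom.Point2D;

// Hypothetical helper, not taken from the original implementation.
final class OrientationSketch {

    // Returns +1 if c lies to the left of the directed line a->b,
    // -1 if it lies to the right, and 0 if the computed determinant is zero.
    // Round-off can make the computed sign differ from the true sign.
    static int orientation(Point2D a, Point2D b, Point2D c) {
        double det = (a.getX() - c.getX()) * (b.getY() - c.getY())
                   - (a.getY() - c.getY()) * (b.getX() - c.getX());
        return (int) Math.signum(det);
    }
}
```

&lt;p&gt;For example, with $A = (0, 0)$, $B = (4, 0)$ and $C = (2, 3)$ the determinant is $(-2)(-3) - (-3)(2) = 12$, so $C$ lies to the left of the line.&lt;/p&gt;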

&lt;p&gt;Next we’ll take a look at the connection between the orientation test and the other geometric predicates that my implementation uses. The problem of testing whether a line and a segment intersect can be reduced to two orientation tests. To test whether a line $l$ and a line segment $ls$ intersect, we test whether an endpoint of $ls$ lies on $l$ or whether the interior of $ls$ intersects $l$. For the former we simply test whether the endpoints of $ls$ lie on $l$ with the line equation in point-slope form. Testing whether the interior of $ls$ intersects $l$ is equivalent to testing whether the endpoints of $ls$ lie on opposite sides of $l$, which can be determined with two orientation tests.
The other tests can be reduced similarly, although they have more special cases; see the provided implementation for details. The fact that all predicates are reducible to orientation tests makes me believe that implementing the orientation test in a robust manner could significantly improve the algorithm’s robustness.&lt;/p&gt;
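
&lt;p&gt;The line-segment reduction can be sketched as follows. This is an illustration under simplifying assumptions: unlike the approach described above, it also detects the endpoint-on-line case with the orientation test rather than with the point-slope form, and the names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LineSegmentSketch&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lineIntersectsSegment&lt;/code&gt; are hypothetical.&lt;/p&gt;

```java
import java.awt.geom.Point2D;

// Hypothetical helper, not taken from the original implementation.
final class LineSegmentSketch {

    // Sign of the orientation determinant: +1 left, -1 right, 0 on the line.
    static int orientation(Point2D a, Point2D b, Point2D c) {
        double det = (a.getX() - c.getX()) * (b.getY() - c.getY())
                   - (a.getY() - c.getY()) * (b.getX() - c.getX());
        return (int) Math.signum(det);
    }

    // Tests whether the line through p and q intersects the segment from s to t.
    static boolean lineIntersectsSegment(Point2D p, Point2D q, Point2D s, Point2D t) {
        int sideOfS = orientation(p, q, s);
        int sideOfT = orientation(p, q, t);
        // Either an endpoint lies on the line, or the endpoints lie on opposite sides.
        return sideOfS == 0 || sideOfT == 0 || sideOfS != sideOfT;
    }
}
```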

&lt;p&gt;At this point I’ll quote &lt;a href=&quot;http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.543.6920&amp;amp;rep=rep1&amp;amp;type=pdf&quot;&gt;Schirra’s advice&lt;/a&gt; that is directly applicable to our problem:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;The straightforward approach to implement geometric algorithms reliably is to use exact rational arithmetic instead of inherently imprecise floating-point arithmetic. Unfortunately, this slows down the code by orders of magnitude. As suggested by the exact geometric computation paradigm a better approach is to combine exact rational arithmetic with floating-point filters, e.g. interval arithmetic, in order to save most of the efficiency of floating-point arithmetic for nondegenerate cases. This approach is implemented in the exact geometry kernels of CGAL and LEDA. The use of adaptive predicates à la Shewchuk is highly recommended.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Therefore, the algorithm’s robustness issues could be partially resolved by replacing the current naive implementation of the orientation test with &lt;a href=&quot;https://people.eecs.berkeley.edu/~jrs/papers/robust-predicates.pdf&quot;&gt;Shewchuk’s adaptive approach&lt;/a&gt;. It would be interesting to compare the impact on robustness and running time of replacing the current orientation test with Shewchuk’s more robust but presumably slower approach. Note that my implementation would still fail for some inputs after this replacement, even if all orientation tests were performed flawlessly, because round-off errors also affect the comparisons between doubles that are sprinkled all over the code.&lt;/p&gt;
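
&lt;p&gt;To give a feel for the filter idea from the quote, here is a simplified sketch: it evaluates the determinant in double precision and falls back to exact arithmetic only when the result is too close to zero to trust. This is not Shewchuk’s full adaptive predicate. It borrows only the first-stage error bound from his orientation test and swaps in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;java.math.BigDecimal&lt;/code&gt; for the exact fallback. It assumes finite inputs and no overflow or underflow, and the class name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;FilteredOrientationSketch&lt;/code&gt; is hypothetical.&lt;/p&gt;

```java
import java.math.BigDecimal;

// Hypothetical helper, not taken from the original implementation.
final class FilteredOrientationSketch {

    // 2^-53, half the gap between 1.0 and the next larger double.
    private static final double EPS = Math.ulp(1.0) / 2;
    // Coefficient of the first-stage error bound in Shewchuk's orientation test.
    private static final double ERR_BOUND = (3.0 + 16.0 * EPS) * EPS;

    // Returns +1 if c is left of the directed line a->b, -1 if right, 0 if on it.
    static int orientation(double ax, double ay, double bx, double by,
                           double cx, double cy) {
        double left = (ax - cx) * (by - cy);
        double right = (ay - cy) * (bx - cx);
        double det = left - right;
        double bound = ERR_BOUND * (Math.abs(left) + Math.abs(right));
        if (Math.abs(det) > bound) {
            return (int) Math.signum(det); // filter succeeded, the sign is reliable
        }
        return exactOrientation(ax, ay, bx, by, cx, cy); // too close to call
    }

    // Exact evaluation: every finite double converts to a BigDecimal without
    // rounding, and BigDecimal subtraction and multiplication are exact.
    static int exactOrientation(double ax, double ay, double bx, double by,
                                double cx, double cy) {
        BigDecimal acx = new BigDecimal(ax).subtract(new BigDecimal(cx));
        BigDecimal acy = new BigDecimal(ay).subtract(new BigDecimal(cy));
        BigDecimal bcx = new BigDecimal(bx).subtract(new BigDecimal(cx));
        BigDecimal bcy = new BigDecimal(by).subtract(new BigDecimal(cy));
        return acx.multiply(bcy).subtract(acy.multiply(bcx)).signum();
    }
}
```

&lt;p&gt;The appeal of this design is that well-separated inputs only pay for a handful of floating-point operations, while the slow exact path is reserved for nearly degenerate configurations.&lt;/p&gt;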

&lt;h3 id=&quot;summary&quot;&gt;Summary&lt;/h3&gt;

&lt;p&gt;In this post we discussed the gist of Joe and Simpson’s algorithm for computing the visibility polygon from a viewpoint inside of a simple polygon and concluded by taking a look at the robustness issues of my implementation. We saw how the preprocessing step establishes the assumptions about the input that were made in the paper. Then, we discussed the main idea behind the three routines (Advance, Retard and Scan) and how they modify the stack. We took a look at the postprocessing step, which constructs the final visibility polygon from the stack content left by the three routines after the algorithm has finished iterating through the polygon’s boundary. We concluded with a discussion of robustness issues, where we mentioned one approach to resolving them, arbitrary-precision arithmetic, and an idea for future work that makes use of adaptive predicates.&lt;/p&gt;</content><author><name>davidglavas</name></author><category term="blog" /><category term="algorithms" /><category term="computational geometry" /><summary type="html"></summary></entry></feed>