Figure 4.4. The numbers 0 through 9 created by averaging pixel values of handwritten digits.

call these new digits test digits. We'll store each image of a handwritten digit as a 784 x 1 vector. So, test digits become 784 x 1 vectors. There are a number of ways, some quite advanced, to use the training data to create an identification algorithm. We'll try a very simple one that utilizes vector norms. The training set contains 5,923 images of the number 0. Rather than compare our test digit to each image, we will create an average handwritten 0 from the training set. The algorithm is fairly simple. Let a pixel in our "average 0" equal the average (or mean) of the pixel in the same location for all zeros in the training set. For example, for the pixel in the upper left-hand corner of the "average 0," take the average of the pixel in the upper left-hand corner of all zeros in the training set. What does such an average digit look like? You can see all ten, computed for all the numbers from 0 to 9, in Figure 4.4.

We are now ready to classify a test digit. We'll denote the associated test vector by t. We'll also denote by m_i the vector associated with the average number i, again formed from the pixels of all digits classified as the number i in the training set. To classify our test digit, compute ||m_i - t||_2 for 0 <= i <= 9. This computes the distance between the test vector and the vector for each average digit computed above. Whichever value of i produces the smallest distance (the value of ||m_i - t||_2) is our classification for the test digit. For instance, if the smallest norm comes from the distance between the vector for the average 4 and the test digit, then we classify the test digit as a 4.
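This nearest-average rule is short to write in code. Below is a minimal Python sketch, assuming the training images have already been flattened into 784-entry vectors and grouped by digit; the data layout and function names are my own, not the book's.

```python
import numpy as np

def average_digits(images_by_digit):
    """Compute the mean 784-entry vector m_i for each digit 0-9.

    images_by_digit[i] is assumed to be an (n_i, 784) array holding all
    training images labeled i, flattened to vectors (a hypothetical layout).
    """
    return [images_by_digit[i].mean(axis=0) for i in range(10)]

def classify(t, means):
    """Return the digit i whose average vector m_i is closest to the test
    vector t in the Euclidean norm, i.e., the i minimizing ||m_i - t||_2."""
    distances = [np.linalg.norm(m - t) for m in means]
    return int(np.argmin(distances))
```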
After zeroing out all the elements in the first column under the main diagonal, we got an updated matrix system. The values in the second and last rows changed because they were also affected by the addition of a multiple of the first row. So, the final process of Gaussian elimination on the image in Figure 6.5 (a) produces the image in (b).

In the last three sections, we've seen three types of row operations needed to solve a linear system. Amazingly, we only need these three operations and no more. At one level, solving a linear system can seem simple. Keep in mind, though, the process of solving a linear system is a common and important procedure in scientific computing. In the next section, we apply linear algebra to cryptography.

6.2 Being Cryptic

Billions of dollars are spent on online purchases, resulting in credit card numbers flying through the Internet. For such transactions, you want to be on a secure site. Security comes from the encryption of the data. While essentially gibberish for almost anyone, encrypted data can be decrypted by the receiver. Why can't others decipher it? Broadly speaking, many of the most secure encryption techniques are based on factoring really huge numbers. How big? I'm referring to a number of the size of 1 quattuorvigintillion, which is 10 to the 75th power. Keep in mind that the number of atoms in the observable universe is estimated to be 10 to the 80th power. Such techniques rely heavily on number theory, which, as we'll see in this section, can overlap with linear algebra.

Let's start with one of the simplest encryption methods, named the Caesar cipher after Julius Caesar. To begin, letters are enumerated in alphabetical order from 0 to 25 as seen below, which will also be helpful for reference.

A  B  C  D  E  F  G  H  I  J  K   L   M
0  1  2  3  4  5  6  7  8  9  10  11  12

N   O   P   Q   R   S   T   U   V   W   X   Y   Z
13  14  15  16  17  18  19  20  21  22  23  24  25
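If you want to experiment on a computer, the enumeration above is a one-line dictionary in Python; the names letter_to_number and number_to_letter are my own, not notation from the book.

```python
import string

# Enumerate the letters A-Z as 0-25, matching the table above.
letter_to_number = {letter: i for i, letter in enumerate(string.ascii_uppercase)}
number_to_letter = dict(enumerate(string.ascii_uppercase))

print(letter_to_number['L'])   # 11
print(number_to_letter[17])    # R
```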
matrix. Let's consider a matrix B that has more rows than columns. Performing the SVD on a matrix returns three matrices, which are denoted U, Sigma, and V^T. For B, our SVD calculator would return numerical entries for each of U, Sigma, and V^T. The product U Sigma V^T will equal B, where U is a matrix with the same number of rows and columns as B, and Sigma and V^T will have the same number of columns as U. Further, Sigma and V have the same number of rows as columns. Sigma is a diagonal matrix, which means it only contains nonzero values on the diagonal, which are called the singular values. We will assume that the singular values of Sigma are ordered from highest to lowest, with the highest appearing in the upper left-hand corner of the matrix. It is interesting that the columns of U and the rows of V^T in the SVD are related to our computations for PCA in Section 8.4.

While the matrix product U Sigma V^T equals B, the three matrices can be used to approximate B, but that's getting a bit ahead of ourselves. Before discussing the approximation of matrices, let us think a bit about the approximation of numbers. What do we mean when we say a number is a good approximation to another number? Generally, we are referring to distance. For example, 3.1111 would be considered a closer approximation to pi than 3.0. Similarly, we can define distance for matrices. One such measure is the Frobenius norm, which is denoted ||M||_F and defined as the square root of the sum of the squares of the entries of M.
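If you want to try this yourself, NumPy's SVD routine can play the role of the "SVD calculator." The small matrix below is a stand-in with more rows than columns, not the matrix B from the text, and the last line checks the reconstruction using the Frobenius norm just defined.

```python
import numpy as np

# A stand-in matrix with more rows than columns (not the book's B).
B = np.array([[1.0, 4.0],
              [5.0, 6.0],
              [7.0, 10.0],
              [11.0, 0.0]])

# The reduced SVD: U has the same shape as B, while Sigma and V^T are
# square with as many rows and columns as B has columns.
U, s, Vt = np.linalg.svd(B, full_matrices=False)
Sigma = np.diag(s)    # singular values on the diagonal, largest first

print(np.allclose(U @ Sigma @ Vt, B))             # True: U Sigma V^T recovers B
print(np.linalg.norm(B - U @ Sigma @ Vt, 'fro'))  # essentially 0 in the Frobenius norm
```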
Figure 2.6. Detail of the Mona Lisa that will be approximated with the numbers in a matrix.

Let's work a bit smaller and create a matrix of Mona Lisa's eyes as seen in Figure 2.6. We will break the interval from 0 to 255 into four intervals: from 0 to 63, 64 to 127, 128 to 191, and 192 to 255. All pixels from 0 to 63 will be replaced by a 5 or an 8. All pixels from 64 to 127 will be replaced by a 0 or a 3. All remaining pixels will be replaced by a 1. When there is more than one choice for a number, like choosing a 5 or an 8, we choose a value randomly with a flip of a coin. The darkest pixels are replaced with a 5 or an 8 as they require more pixels to draw than a 1, which replaces the lightest pixels. Performing this replacement process on the Mona Lisa's eyes creates the image seen in Figure 2.7. If you look carefully at it, you may notice that Mona Lisa's face is thinner. The aspect ratio is off. This is due to the fact that typed characters, particularly in this font, do not fill a square space, but instead a rectangular one, which can be seen in the text below.

DON'T BE SQUARE

Each character is almost twice as tall as it is wide. So, we will adapt our algorithm.

Figure 2.7. Image of the Mona Lisa that replaces each pixel with a number.
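The replacement scheme is easy to mimic in code. Here is a minimal Python sketch, assuming the image is already a matrix of grayscale values from 0 to 255; the character choices follow the text, and the "coin flip" is a call to a random number generator.

```python
import numpy as np

rng = np.random.default_rng()

def pixel_to_char(value):
    """Map a grayscale value (0-255) to a typed character, darkest first."""
    if value <= 63:
        return rng.choice(['5', '8'])     # darkest pixels: flip a coin between 5 and 8
    if value <= 127:
        return rng.choice(['0', '3'])
    return '1'                            # lightest pixels

def image_to_text(image):
    """Convert a 2-D array of grayscale values into lines of characters."""
    return '\n'.join(''.join(pixel_to_char(v) for v in row) for row in image)

# Tiny illustrative example (not the Mona Lisa data).
demo = np.array([[10, 200], [100, 30]])
print(image_to_text(demo))
```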
Figure 2.1. A well-known image created with a printer and numbers, but to much less satisfying effect than with a brush and paint.

is a submatrix, which you'll see in Figure 2.1. So, where is it? It's part of Mona Lisa's right eye. Here lies an important lesson in linear algebra. Sometimes, we need an entire matrix, even if it is large, to get insight about the underlying application. Other times, we look at submatrices for our insight. At times, the hardest part is determining what part of the information is most helpful.

2.1 Sub Swapping

As we gain more tools to mathematically manipulate matrices, we'll look at applications such as finding a group of high school friends among a larger number of Facebook friends. We'll identify such a group with submatrices. The tools that we develop will help with such identification. While such mathematical methods are helpful, we can do interesting things simply with submatrices, particularly in computer graphics.
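As a small taste of the computer-graphics idea, here is a Python sketch that pulls a block out of a matrix and writes it back flipped; the matrix and the index ranges are made-up stand-ins, not data from the book.

```python
import numpy as np

# A stand-in grayscale image matrix (values 0-255), not actual image data.
image = np.tile(np.arange(12) * 20, (8, 1))

# Extract a submatrix: rows 2 through 5 and columns 3 through 8.
block = image[2:6, 3:9]

# Flip the block left-to-right and write it back into the same position.
image[2:6, 3:9] = block[:, ::-1]
print(image)
```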
Let's encrypt the word

LINEAR

which corresponds to the sequence of numbers 11, 8, 13, 4, 0, and 17. The Caesar cipher, as it is called, adds 3 to every number. A vector v is encrypted by computing v + 3u, where u is the vector of 1s that is the same size as v. Encrypting v = [11 8 13 4 0 17] leads to [14 11 16 7 3 20]. We form the encrypted message by writing the letters associated with the entries of the encrypted vector. For this message we get

OLQHDU

So, the word "linear" becomes "olqhdu," which I, at least, would struggle to pronounce. In cryptography, the receiver must be able to easily decode a message. Suppose you receive the encrypted message

PDWULA

and want to decrypt it. Again, begin by creating the vector associated with these letters, which is [15 3 22 20 11 0]. Decryption on a vector v is performed by calculating v - 3u. For our message, the decrypted word vector is [12 0 19 17 8 -3].

Simple enough, but we aren't quite done. Our subtraction gave us a negative number. Which letter is this? A similar issue arises if we add 3 to 23, giving 26. When we reach 26, we want to loop around to 0. Returning to the letters, we want a shift of 1 to take us from the letter "z" to the letter "a." In this way, we are working modulo 26, meaning that any number larger than 25 or smaller than 0 becomes the remainder of that number divided by 26. So, -3 corresponds to the same letter as -3 + 26 = 23. Now we're ready to find our decrypted message since our decrypted vector, modulo 26, equals [12 0 19 17 8 23]. This corresponds to the letters

MATRIX

Ready to start passing encrypted notes or sending encrypted text messages? In this example, we shifted by 3. You could just as easily shift by 11, 22, or any other amount between 1 and 25. Just let your receiver know and you are ready to send encrypted messages to a friend.

But, don't get too confident and send sensitive information. This cipher is fairly easy to break, regardless of your shift. If you suspect or know this type of cipher was used, you could try all shifts between 1 and 25 and see if
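Here is a minimal Python sketch of the cipher, using the 0-25 enumeration of the letters and working modulo 26 exactly as described; the function names are my own, not the book's.

```python
def to_numbers(word):
    """Enumerate letters A-Z as 0-25."""
    return [ord(c) - ord('A') for c in word.upper()]

def to_letters(numbers):
    """Map numbers back to letters, wrapping around modulo 26."""
    return ''.join(chr(n % 26 + ord('A')) for n in numbers)

def caesar(word, shift):
    """Shift every letter by `shift`; a negative shift decrypts."""
    return to_letters([n + shift for n in to_numbers(word)])

print(caesar('LINEAR', 3))    # OLQHDU
print(caesar('PDWULA', -3))   # MATRIX
```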
Figure 9.1. The grayscale image (a) of the data contained in the matrix in (9.1). A 648 by 509 pixel image (b) of Albrecht Dürer's print Melencolia I.

in Figure 9.2 (a). An advantage of this low rank approximation can be in the storage savings. The original matrix stored 648(509) = 329,832 numbers, whereas A_1 can be stored with 648 numbers from U (the first column), 1 singular value, and 509 numbers from V^T (the first row). So, A_1 requires storing only 1,158 numbers, or 0.35% of the original data. That's a very small fraction of the original data! Great savings on storage, but not a great representation of the original data.

Let's increase the rank of our approximation. The closest rank 10 approximation to A is depicted in Figure 9.2 (b). Let's look at the storage savings again. This matrix requires storing 10 column vectors of U, 10 singular values, and 10 row vectors of V^T. This correlates to a storage requirement of 6,480 + 10 + 5,090 numbers, or about 3.5% of the original

Figure 9.2. The closest rank 1 (a) and rank 10 (b) approximations to the Dürer print. In (c) and (d), we see the closest rank 50 approximation and the original print. Can you tell which is which?
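In code, the rank-k approximation comes from keeping only the first k columns of U, the first k singular values, and the first k rows of V^T. A minimal Python sketch follows, assuming the image has already been read into a matrix A of grayscale values; the storage counts reproduce the arithmetic in the text.

```python
import numpy as np

def rank_k_approximation(A, k):
    """Closest rank-k approximation to A: keep the first k columns of U,
    the first k singular values, and the first k rows of V^T."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

def storage_count(rows, cols, k):
    """Numbers stored by the rank-k form, versus rows*cols for the full matrix."""
    return k * rows + k + k * cols

print(storage_count(648, 509, 1))    # 1158 numbers, about 0.35% of 648*509 = 329,832
print(storage_count(648, 509, 10))   # 11580 numbers, about 3.5% of the original
```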
While we are clearly choosing small rectangular portions of a larger image, how does this connect to matrices? Where are the numbers? In Figure 2.1, the numbers formed the image. In this example, the numbers are equally important and, in their own way, play a role in what we see. To see this, let's make everything black and white, or actually gray.

2.2 Spying on the Matrix

Any picture can be viewed as a matrix, and matrices can be viewed as images to gain insight on the data. Let's start with the numbers by looking at a matrix M of pixel values. This matrix M stores the data of a grayscale image. Since the matrix has sixteen rows and twelve columns, the associated image would have sixteen rows and twelve columns of pixels. All the values are between 0 and 255, where 0 stores black and 255 white. So, the upper left-hand pixel has a value of 128 and would be gray. A zoomed-in view of the corresponding image is seen in Figure 2.3 (a). Like the previous section, we again have a submatrix, in this case of the matrix corresponding to the image in Figure 2.3 (b).

Even when the data in a matrix isn't related to a picture, graphically viewing the table of numbers can be helpful. One application is finding structure in a matrix by creating a plot of the matrix, sometimes called a
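Viewing a matrix of numbers as a grayscale image takes a single plotting command. Here is a minimal Python sketch with matplotlib; the tiny matrix is made up for illustration and is not the 16 x 12 matrix M described above.

```python
import numpy as np
import matplotlib.pyplot as plt

# A made-up matrix of grayscale values between 0 and 255 (0 = black, 255 = white).
M = np.array([[128,  64, 200],
              [ 30, 180,  90],
              [255,  10, 140]])

plt.imshow(M, cmap='gray', vmin=0, vmax=255)   # render the matrix as pixels
plt.colorbar(label='pixel value')
plt.show()
```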
the vector ||u - v||_2. There is also the taxicab norm, which is computed as the sum of the absolute values of the vector's entries, |v_1| + |v_2|. Suppose we are interested in the points (-1, 5) and (2, 4). The distance between them is the length of the vector u = [-3 1] under the vector norm you choose. Using the Euclidean norm, the length of the vector is sqrt((-3)^2 + (1)^2) = sqrt(10), which again correlates to the Euclidean distance between the points. Using the taxicab norm, the length of the vector u equals |-3| + |1| = 4.

Sometimes, we don't want or need to specify which norm we'll use. In such cases, we can write a general norm by writing ||v||, which is a function that maps vectors in n-dimensional space to the set of real numbers. Regardless of what norm you choose, the norm will have specific properties. In particular, for any two n-dimensional vectors u and v and real number a,

1. ||av|| = |a| ||v||.
2. ||u + v|| <= ||u|| + ||v||, which is called the triangle inequality.
3. If ||v|| = 0, then v equals the zero vector.

This allows mathematics to be developed without being specific about a norm. In some cases, you want and need to be specific. Sometimes, you can work generally and later decide if you are using the Euclidean norm, taxicab norm, or another norm of your choosing.

4.1 Recommended Movie

The Euclidean norm can be extended into more than two dimensions. In fact, let's see how to use vector norms in the fourth dimension to find people with similar tastes in movies. Let's look at two of my friends, Oscar and Emmy, and compare their preferences for a few movies to my own. To keep things simple, we'll look at films that were nominated for a 2014 Academy Award for best picture: American Hustle, Gravity, Her, and Philomena. Each person rates the films between -5 and 5. A rating of 5 correlates to wanting to see the film (for the first time or again) and -5 means definitely not wanting to see it again. To keep things general, I'll let a random number generator create ratings for all of us. The ratings appear in Table 4.1.

Now, we take the ratings and create preference vectors. For instance, my preferences become the vector t = [-3 3 5 -4], Oscar's is o =
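Measuring how close two preference vectors are is a one-line computation for either norm. Below is a minimal Python sketch using my rating vector t from the text; Oscar's and Emmy's vectors are made-up stand-ins, since Table 4.1 isn't reproduced here.

```python
import numpy as np

t = np.array([-3, 3, 5, -4])          # my ratings from the text
oscar = np.array([1, -2, 4, -1])      # made-up stand-ins, not the values in Table 4.1
emmy = np.array([-4, 2, 5, -3])

for name, v in [('Oscar', oscar), ('Emmy', emmy)]:
    euclidean = np.linalg.norm(t - v)          # Euclidean distance ||t - v||_2
    taxicab = np.linalg.norm(t - v, ord=1)     # taxicab distance ||t - v||_1
    print(name, round(euclidean, 2), taxicab)
```

The friend whose vector sits at the smaller distance from t is the one with tastes closer to mine, whichever norm you decide to use.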