102 When Life is Linear Figu re 10.6. Sierpifiski's triangle, a fractal, named after its founder, waclaw Sierpifiski. Rules 1. PIace a dot halfway between square ー and 2. 2. RoII a die and place a new dot halfway between where you placed thelast dot and square 1 if you roll 1 or 2 , square 2 if you roll 3 or 4 , or square 3 ifyou roll 5 or 6. 3. Return to Step 2. Play a few times! 、 Mhat shape emerges? If you played long and accurately enough, the emergmg lmage is a fractal known as Sierpifiski's tnangle seen in Figure 10.6. The image contains three copies Ofthe larger lmage. There is one at the top and tWO along the bOttom. Magnifying an ObJect and seeing similarities tO the whole is an important property of fractals. An obJect with self-similanty has the property 0 ロ 00k ⅲ g the same as or similar tO itself under increaslng magnification. HOW dO we create this shape using linear algebra? Let's ok care- fully at the rules. Let's represent the squares in the game by the vectors = [ 0 1 卩 ' P2 = [ 00 ド , and P3 = い 0 ] T. The game weJt1St played, sometlmes called the chaos game, entails taking our current vector v and letting the new vector equal half the distance between the current v and vectors for squares し 2 , or 3 , which we choose randomly. lfwe let vn denote
What Are the Chances? をを物謇 - 呶内一、納・ 93 物 , イ第物 , をぉ bilities ofgoing om one square tO another. ROW / Ofthe matr1X corresponds Our first step is forming a ″〃 s ″〃〃 ta な , which contains the proba- matnx will be 1 . correspond tO probabilities. This means that the sum Of each row Of every these, we'll use イ am ・んの , 勧ⅲ〃歩 which use matnces ⅲ which the elements moves are necessary for it tO be 50 % likely that someone has won? TO answer algebra. What is the mimmum number Of moves in the game? HOW many We will ask a couple of questions, which will be answered with linear at square 9 tO complete the game. square 8. When are you are on square 8 , you can roll 3 , 4 , 5 , or 6 and land down the snake t0 square 2. If you land on square 6 , you climb the ladder t0 you roll 5 or 6 , move ahead two squares. げ you land on square 4 , you slide and stay on your current square. lfyou roll 3 or 4 , move ahead one square. げ results in winning. On each turn, roll a die. げ you roll 1 or 2 , d0 not move game board in Figure 10.2. The game starts on square Reaching square 9 PermiSSlon. www.hinduismtoday.com/ Copyright 2009 by Himalayan Academy. Reprinted with From 、、 Hinduism Endures: Ⅱ 00 to 50 , ' ' 2009 , Hinduism T0day, Figu re 10.1. Game of snakes and Ladders on cloth 什 om the 19th century in lndia.
What Are the Chances? 95 TO analyze the game, we begin with a vector, containing probabilities that sum t0 1 , representing where the game begins, which is (with probability 1 ) on square 1. This corresponds tO the vector = い 0 0 0 0 0 0 0 0 ] . Where will we be after one turn? This involves computing Ⅵ = vA , SO = い / 3 1 / 3 1 / 3 0 0 0 0 0 0 ] . This indicates that you have a 1 / 3 chance of being at square 1 , 2 , or 3 after 1 move. NO 、 V, V2 = VIA, SO = [ 0.11 Ⅱ 0.4444 0.3333 0 0.11 Ⅱ 0 0 0 0 ] . This indicates that after two moves, there is about an 1 1 % chance you are still at square 1 , a 44 % chance you 0f being at square 2 , a 33 % chance 0f being at square 3 , and an Ⅱ % chance Of being at square 5. Let's find the minlmum number ofmoves in the game. TO find this, we compute vn = until the last element in vn IS nonzero. This happens when 〃 = 4. LOOk at the game and verify that one could win it in four moves. What's the probability ofthis happening? Just under 4 % since = [ 0.0123 0.4074 0.2593 0 0.14 新 0 0.0617 0.0741 0.0370 ] . ()ur Other questlon was hOW many moves are necessary for it tO be 50 % likely that someone has won? We compute vn = v 〃 until the last element in first becomes larger than 0.50. This h 叩 pens for = [ 0.0000 0.1854 0. Ⅱ 46 0 0.0708 0 0.0436 0.0700 0.5156 ] . SO, it is possible that you could win in four moves but generally it will take ten. げ you know how long a turn takes, you can even compute about hOW long the game will 0ften last! Suppose, we decide t0 change the game. R011ing 1 or 2 still results in not moving. NOW, rolling 3 , 4 , or 5 will move your P1ece one square. R011ing a 6 moves you two squares. The minimum length 0f the game is still four moves but that now will only occur about 1 % Ofthe time. HOW about game length? Now, after 1 5 moves you are 50 % likely t0 be done the game. Let's answer these questions for a larger game board as seen in Figure 10.3. げ we set up the transition matnx, we can analyze the game. We'Il assume you again have a ー / 3 chance Of staying on your current square, moving 1 , or moving 2 squares forward. NOW, the mimmum length 0f the game SIX turns happening with about a 1 % chance.
which is called the 尾豆面記 0 Returning tO our first chOice Of 川 residual vector IS 58 The error in the approximation is computed by ー 0.0204 and わ = 50.7372 , the When Life is Linear 8.7480 11.1840 9.3360 9.63 12 9.99 2012 1896 1984 1 1 1 ー 0.0204 50.7372 9.63 12 9.99 ー 0.8820 0.2736 0.0588 0.0624 10.2636 12.0588 9.6924 ー 0.6540 ー 0.8160 9.63 12 9.99 Between these two choices for 襯 and わ , which is better? TO answer this, we need a mathematical sense Of what is meant by 、、 better. " this chapter, a smaller / e づ la 尾 s 夜・ ro implies a better approximation tO the data. Least-squares error equals the sum Of the squares Of the terms Of the re Si dual vector. For the residual vector ー 0.8820 ー 0.8160 ー 0.6540 the least squares error equals ( ー 0.8820 ) 2 十 ( ー 0.8160 ) 2 十 ( ー 0.6540 ) 2 = 1 .8715. For the residual vector 0.0624 0.0588 0.2736 the least squares error equals ( 0.0624 ) 2 十 ( 0.0588 ) 2 十 ( 0.2736 ) 2 = 0.0822. Since our first chOice for 川 and わ produced a lower least-squares error, the resulting line, ー 0.0204X 十 50.7372 , is a better fit to the data. ThiS iS where the process Of mimnuzatl()n enters. ()ur question is: among all choices for 川 and わ , which produces the smallest least-squares error? The least-squares error is the square of the length of the residual vector. SO, our problem can alSO be Vie 、 as choosing values Of 襯 and わ that produce the shortest residual vector.
83 Zombie Math—Decomposing the sum Of the squares Of all the elements Ofthe matnx. This is the distance from the matrlx 月 tO the zero matrlx and can be wrltten as いし , = 回引 2 , where .4 is an 川 x 〃 matrIX. Let's Ok at an example. If 1 2 2 then = ( 22 + 12 + 22 + ( ー 4 ) 2 ) We square every element Of the matnx, sum them, and take the square root. We have a norm, which we defined for vectors in Chapter 4. Matr1X norms have similar propertles. With the Frobenius norm at our disposal, we are about tO see the power ofthe SVD. First, we need tO discuss what it means for a matrIX tO have rank ん . A matrix has rank 1 ifevery row a multiple 0fthe first row. The matnx an example Of a rank 1 matr1X. We can a ト 0 talk about a rank ん matnx. A matnx has rank ん if there exist ん rows such that all remammg rows are a linear combination Of those ん rows, and ん is the smallest such number. For instance, an example Of a rank 2 matr1X. This is true since you can add the first and second rows tO form the third, and no row a constant multiple Of another ro 、 V. The SVD gives the closest rank ー approximatlon Of a matr1X as mea- sured by the Frobenius norm. Finding this rank 1 matnx with the SVD is 1 1 1 0.5 0.5 M = 0.5 2 2 2 一 1 っ ~ っ 0 ー 1 っ ~
10 What Are the Chances? The world is infused with chance. Random events have underlying probabil- ities. A fair flip 0fa coin has the same chance oflanding heads as tails.ln this chapter, we use linear algebra tO analyze situatlons that involve randomness. 10.1 Down the Chute Ever play Monop01y for several days? HOW often does this h 叩 pen? HOW short a game is possible? These types 0f questions can be analyzed using matnces tO determine such things as average and shortest game length. Let's try this with a few different games. First, we willlook at Chutes and Ladders, 引 so known as Snakes and Ladders. Chutes and Ladders onginated in lndia as a game 0f knowledge and was known as Jfiåna Chaupår. Landing on a virtue resulted in climbing a ladder toward the god Vishnu. Rather than sliding down a chute, the game had a player swallowed by a snake, which resulted in death and a new start. This game entered middle-class Victonan parlors 0f England in the nineteenth century as Snakes and Ladders. Board game pioneer Milton Bradley introduced the game as Chutes and Ladders in the United States ⅲ円 43 promoting it as the 、、 improved new version of snakes and ladders, England's famous indoor sport. The game is played with two or more players on a board with numbered squares on a grld as seen in Figure 10. l. Players begin on the starting square and the first to reach the finish square wins. This 」 ourney from beginning to end is helped and hindered by ladders and chutes ()r snakes) that 叩 pear on the board. Let's see how linear algebra can aid us in analyzing the game. We'II ok at a smaller game board t0 simplify the computations. We'll analyze the 92
94 9 FINISH 1 4 HERE START 8 5 2 When Life is Linear 7 6 3 Figu re 10.2. A small game of snakes and Ladders. to being on square / in the game. ln the jth column 0f row / , we place the probability 0f going t0 square . / 仕 om square / when the die is rolled. Let's form the first row Of the matr1X. This corresponds tO being on square l. SO, we have a 1 / 3 chance 0f staying on square 1 and a 1 / 3 chance Of moving tO square 2 or square 3. SO, the first row IS い / 3 1 / 3 レ 3 0 0 0 0 0 0 ]. What happens on square 2 ? There is a 1 / 3 chance of staying on square 2 or moving t0 square 3. There is also a 1 / 3 probability oflanding on square 4 , which actually returns us tO square 2. SO, the second row : [ 0 2 / 3 1 / 3 0 Ver1fy that the transition matr1X 月 is 0 0 0 0 1 / 3 0 0 0 0 0 0 0 0 1 / 3 2 / 3 1 / 3 0 0 0 0 0 0 1 / 3 1 / 3 1 / 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 / 3 0 1 / 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 / 3 0 1 / 3 0 0 0 ] ・ 0 0 0 0 1 / 3 0 1 / 3 1 / 3 0 0 0 0 0 0 0 1 / 3 2 / 3 1
Mining f0 「 Meaning From smartphones tO tablets tO laptops and even tO supercomputers, data is being collected and produced. With so many bits and bytes, data analytics and data mimng play unprecedented roles in computing. Linear algebra is an important tOOl in this field. ln this chapter, we touch on some tOOls in data mining that use linear algebra, many built on ideas presented earlier in the book. Before we start, how much data is 心 ot ofdata? Let's ok to Facebook. What were you doing 15 minute ago? ln that time, the number of photos uploaded t0 Facebook is greater than the number 0f photographs stored ⅲ the New York PubIic Library photo archives. Think about the amount ofdata produced in the past tWO hours or since yesterday or last week. Even more lmpresslve is hOW Facebook can organize the data SO it can appear quickly intO your news feed. 11.1 SIice and Dice ln Section 8.3 , we looked at clusterlng and saw how to break data into two groups usmg an elgenvector. As we saw in that section, it can be helpful and sometimes necessary for larger networks, tO 可 Ot the adjacency matr1X 0fa gr 叩 h.ln Figure 11.1 , we see an example where a black square is placed where there iS a nonzero entry in the matr1X and a white square iS placed Othe ハ Vise. The go 引 ofclusterlng is tO find maximally intraconnected components and minimally interconnected components.ln a 可 Ot Of a matr1X, this results in darker square reglons. We saw this for a network of about fifty Face- book friends in Figure 8.5 (b). Now, let's turn to an even larger network. We'll analyze the graph of approximately 500 of my frlends on Facebook. 106
Entering the Matrix Figure 2.6. Detail ofthe Mona Lisa that will be approximated with the numbers ⅲ a matrlX. Let'S 、 vork a bit smaller and create a matr1X Of Mona Lisa'S eyes as seen ln Figure 2.6. We will break the interval 什 om 0 to 255 into four intervals from 0 to 63 , 64 to に 7 , に 8 to 191 , and 192 to 255. AII pixels from 0 to 63 will be replacedby a 5 oran 8. AII pixels from 64t0 に 7 will be replacedby a() ora 3. AII remaimng pixels will be replaced by a I. When there is more than one ChOice for a number, like choosing a 5 or an 8 , we choose a value randomly with a flip ofa coin. The darkest pixels are replaced with a 5 or an 8 as they require more pixels t0 draw than a し which replaces the lightest pixels. Performing this replacement process on the Mona Lisa'S eyes creates the image seen ln Figure 2.7. lfyou 0k carefully at it, you may notice that Mona Lisa's face is thinner. The aspect ratiO is 0 This is due tO the fact that typed characters, particularly in this font, d0 not fill a square space, but instead a rectangular one, which can be seen in the text below. D 〇 N' T BE SQUARE Each character is almost twice as tall as it is wide. SO, we will adapt our algorithm. Figure 2.7. lmage ofthe Mona Lisa that replaces each pixel with a number.
Math tO the Max Table 7.3. 乃℃浦 ct 〃 4ctu 記立 0 尾をルド Sc ん 00 な Using んⅲビ a' ・ , 4 / ビわ ra 〃 d ん ea Squares. 63 School B ates Macalester Bryn Mawr Oberlin Predi cted Score 83.5 8L6 83.6 79.2 Actual Score 83 82 where D is the data for the schools, w is the computed weights, and s is the final overall score. Trying to solve this system, we quickly see that we need least squares, as in the preV1011S section. Again, we'll omit some data 仕 om our linear system in order tO see the accuracy Of the predicted scores. When we have computed w for the linear system, we find the predicted scores for the schOOls omitted om the linear system; this produces Table 7.3. Grmnell was omitted since no data was submitted for the SAT. Now ヨ 00k at the predicted scores and how closely they come t0 approximating the overall scores computed by し & News & わイ火 0 . Keep ⅲ mind, though, that we solved the system using least squares. SO, no 、 our predicted scores even for schools in the onginal system are no longer exact. For instance, the predicted scores for Williams and Amherst were 99.5 and 98.2 , whereas the actual overall scores were 100 and 98. Let's reflect on our method. First, the least squares method for solving a linear system does a reasonable 」 Ob at predicting the actual scores ofcolleges. Keep in mind that it is unlikely these are the actual weights since ( 1 ) the reported scores were probably rounded and ( 2 ) the scores were probably scaled t0 give that year's top college, in this case Williams, a score 0f 100. lnsights are possible, even with our mexact weights. lnterestingly, aca- demic reputation has the highest weight. lt's over double the magnitude 0f the nextlargest weight. This is interesting in light Of a controversy that occurred in 2007 when the Presidents Letter was signed by twelve college and university presidents. lt called for colleges and universities tO no longer use the rankings as an indication Of their quality and tO refuse tO 倉Ⅱ out the reputational survey. Since the onginal letter was published, more than 50 0ther college and umversity presidents JOined in signing the letter. The letter can be found on the website for The Education Conservancy.