API: lrand48
Kingdom: Security Feature
Description: The random function is a Linear Congruential Generator (LCG) used to create pseudorandom integers. That by itself is not a security issue. However, how the numbers are used can be a problem. The algorithm that generates the numbers is well known, the range of numbers generated is very small (in a cryptographic context), and the generated numbers can be guessed with reasonable ease. Hence, if the pseudorandom numbers are used as the basis for encryption computations, then it becomes a security problem. There is simply not enough randomness or entropy in the pseudorandom numbers generated by the LCGs for them to be used in high-security encryption.

API: lstat
Kingdom: Time and State
Description: The stat() function obtains information about the file pointed to by path. Read, write, or execute permission of the named file is not required, but all directories listed in the pathname leading to the file must be searchable. lstat() is like stat() except in the case where the named file is a symbolic link, in which case lstat() returns information about the link, while stat() returns information about the file the link references. fstat() obtains the same information about an open file known by the file descriptor fd. stat() is used in combination with other functions that manipulate the file being queried (e.g., mkdir is vulnerable to TOCTOU attacks). A call to stat() should be flagged if the first argument (the directory name) is used later in a "use" category call.

API: mbstowcs
Kingdom: Input Validation and Representation
Description: Internal stack-allocated buffer can be overflowed on some versions. Also watch for the NULL terminator.

API: memcpy
Kingdom: Input Validation and Representation
Description: Many functions are susceptible to off-by-one and bounds-checking errors. There are many generic types of errors that can apply to usage of a wide variety of functions.
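To make the lstat/stat entry concrete, here is a minimal sketch of the check-then-use race the rule aims at, together with a descriptor-based alternative. The code is illustrative only (the function names are invented, and O_NOFOLLOW is assumed to be available on the target platform); it is not taken from ITS4 or from the rule table itself.

#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

int open_if_regular_racy(const char *path)
{
    struct stat st;

    /* "Check": stat() inspects whatever path names right now. */
    if (stat(path, &st) != 0 || !S_ISREG(st.st_mode))
        return -1;

    /* "Use": by the time open() runs, path may name something else
     * (for instance, a symbolic link planted by an attacker). This is
     * the check-then-use window the rule above tells a scanner to flag. */
    return open(path, O_RDONLY);
}

int open_if_regular(const char *path)
{
    struct stat st;

    /* Safer ordering: open first, then interrogate the object actually
     * opened via its descriptor with fstat(), so the check and the use
     * cannot be split by a rename or relink. O_NOFOLLOW (where available)
     * additionally refuses to follow a symbolic link at the last component. */
    int fd = open(path, O_RDONLY | O_NOFOLLOW);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) != 0 || !S_ISREG(st.st_mode)) {
        close(fd);
        return -1;
    }
    return fd;
}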
API: srand, srand48 (continued from the previous page)
Description: ...for encryption computations, then it becomes a security problem. There is simply not enough randomness or entropy in pseudorandom numbers generated by LCGs for them to be used in high-security encryption.

API: sscanf
Kingdom: Input Validation and Representation
Description: The scanf family of functions scans input according to a format as described below. This format may contain conversion specifiers; the results from such conversions, if any, are stored through the pointer arguments. The scanf function reads input from the standard input stream stdin, fscanf reads input from the stream pointer stream, and sscanf reads its input from the character string pointed to by str. The vulnerability of the scanf() function resides in the fact that it has no bounds-checking capability. If the string that is being accepted is longer than the buffer size, the characters will overflow into the adjoining memory space. This is a classic buffer overflow security vulnerability problem. The scanf() function is susceptible to buffer overflow.

API: stat
Kingdom: Time and State
Description: The stat() function obtains information about the file pointed to by path. Read, write, or execute permission of the named file is not required, but all of the directories listed in the pathname leading to the file must be searchable. lstat() is like stat() except in the case where the named file is a symbolic link, in which case lstat() returns information about the link, while stat() returns information about the file the link references. fstat() obtains the same information about an open file known by the file descriptor fd. stat() is used in combination with other functions that manipulate the file being queried (e.g., mkdir is vulnerable to TOCTOU attacks). A call to stat() should be flagged if the first argument (the directory name) is used later in a "use" category call.
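As a concrete illustration of the sscanf() entry above, the fragment below contrasts an unbounded "%s" conversion with a width-limited one. This is an illustrative sketch (the buffer size and the input string are arbitrary), not code from the rule table; the unsafe call is left commented out so the example runs cleanly.

#include <stdio.h>

int main(void)
{
    char name[16];
    const char *line = "a-token-much-longer-than-sixteen-bytes";

    /* Unbounded conversion: "%s" keeps copying until whitespace, with no
     * regard for the 16-byte destination, so long input overflows name[].
     * This is the classic bug the scanf() entries describe. */
    /* sscanf(line, "%s", name); */

    /* Bounded conversion: a field width of 15 leaves room for the
     * terminating '\0'. */
    if (sscanf(line, "%15s", name) == 1)
        printf("read: %s\n", name);
    return 0;
}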
• Splint extends the lint concept into the security realm [Larochelle and Evans 2001]. By adding annotations, developers can enable Splint to find abstraction violations, unannounced modifications to global variables, and possible use-before-initialization errors. Splint can also reason about and flag array bounds accesses if it is provided with function pre- and postconditions.

Many static analysis approaches hold promise but have yet to be directly applied to security. Some of the more noteworthy ones include ESP (a large-scale property verification approach) [Das, Lerner, and Seigle 2002], model checkers such as SLAM and BLAST (which use predicate abstraction to examine program safety properties) [Ball and Rajamani 2001; Henzinger et al. 2003], and FindBugs (a lightweight checker with a good reputation for unearthing common errors in Java programs) [Hovemeyer and Pugh 2004].

Academic work on static analysis continues apace, and research results are published with some regularity at conferences such as USENIX Security, IEEE Security and Privacy (Oakland), ISOC Network and Distributed System Security, and Programming Language Design and Implementation (PLDI). Although it often takes years for results to make a commercial impact, solid technology transfer paths have been established, and the pipeline looks good. Expect great progress in static analysis during the next several years.

Commercial Tool Vendors

In 2004 and 2005, a number of startups formed to address the software security space. Many of these vendors have built and are selling basic source code analysis tools. Major vendors in the space include the following:

• Coverity <http://www.coverity.com/>
• Fortify <http://www.fortifysoftware.com/>
• Ounce Labs <http://www.ouncelabs.com/>
• Secure Software <http://www.securesoftware.com/>

The technological approach taken by many of these vendors is very similar, although some are more academically inclined than others. By basing their tools on compiler technology, these vendors have upped the level of sophistication far beyond the early, almost unusable tools like ITS4.8

8Beware of security consultants armed with ITS4 who aren't software people. Consultants with code review tools are rapidly becoming to the software security world what consultants with penetration testing tools are to the network security world. Make sure you carefully vet your vendors.
about a program can be reduced to the halting problem,3 applies in spades to static analysis tools. In scientific terms, static analysis problems are undecidable in the worst case. The practical ramifications of Rice's theorem are that all static analysis tools are forced to make approximations and that these approximations lead to less-than-perfect output.

Static analysis tools suffer from false negatives (in which the program contains bugs that the tool doesn't report) and false positives (in which the tool reports bugs that the program doesn't really contain). False positives cause immediate grief to any analyst who has to sift through them, but false negatives are much more dangerous because they lead to a false sense of security.

A tool is sound if, for a given set of assumptions, it produces no false negatives. Unfortunately, the downside to always erring on the side of caution is a potentially debilitating number of false positives. The static analysis crowd jokes that too high a percentage of false positives leads to 100% false negatives because that's what you get when people stop using a tool. A tool is unsound if it tries to reduce false positives at the cost of sometimes letting a false negative slip by. Most commercial tools these days are unsound.

Ancient History

The first code scanner built to look for security problems in code was Cigital's ITS4 <http://www.cigital.com/its4/>.4 Since ITS4's release in early 2000, the idea of detecting security problems by looking over source code with a tool has come of age. Much better approaches exist and are being rapidly commercialized.

ITS4 and its counterparts RATS <http://www.securesoftware.com/> and Flawfinder <http://www.dwheeler.com/flawfinder/> are extremely simple: the tools scan through a file (lexically), looking for syntactic matches based on a number of simple "rules" that might indicate possible security vulnerabilities. One such rule might be "use of strcpy() should be avoided," which can be applied by looking through the software for the pattern "strcpy" and alerting the user when and where it is found. This is

3See <http://en.wikipedia.org/wiki/Halting_problem> if you're not a computer science theory junkie.

4ITS4 is actually an acronym for "It's The Software Stupid Security Scanner," a name we invented much to the dismay of our poor marketing people. That was back in the day when Cigital was called Reliable Software Technologies.
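To illustrate the strcpy() rule mentioned above, the fragment below shows the unbounded call such a scanner flags and a bounded alternative. This is an invented example (the names and the 32-byte size are arbitrary), not output from ITS4 and not text from the book.

#include <stdio.h>
#include <string.h>

/* The pattern the "strcpy" rule matches: strcpy() copies until it finds a
 * '\0' in src, so a source longer than the destination overflows dst. */
void copy_unbounded(char dst[32], const char *src)
{
    strcpy(dst, src);
}

/* The usual replacement: snprintf() truncates to the destination size and
 * always NUL-terminates. */
void copy_bounded(char *dst, size_t dst_sz, const char *src)
{
    snprintf(dst, dst_sz, "%s", src);
}

int main(void)
{
    char buf[32];
    copy_bounded(buf, sizeof buf, "hello");
    printf("%s\n", buf);
    return 0;
}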
Code Review with a Tool1

[Chapter-opening figure: the software security touchpoints (abuse cases, security requirements, code review with tools, risk analysis, risk-based security tests, penetration testing, security operations, and external review) mapped onto software artifacts (requirements and use cases, architecture and design, test plans, code, tests and test results, and feedback from the field).]

Debugging is at least twice as hard as programming. If your code is as clever as you can possibly make it, then by definition you're not smart enough to debug it.

BRIAN KERNIGHAN

All software projects are guaranteed to have one artifact in common: source code. Because of this basic guarantee, it makes sense to center a software assurance activity around code itself. Plus, a large number of security problems are caused by simple bugs that can be spotted in code (e.g., a buffer overflow vulnerability is the common result of misusing various string functions including strcpy() in C).

In terms of bugs and flaws, code review is about finding and fixing bugs. Together with architectural risk analysis (see Chapter 5), code review for security tops the list of software security touchpoints. In this chapter, I describe how to automate source code security analysis with static analysis tools.

1Parts of this chapter appeared in original form in IEEE Security & Privacy magazine coauthored with Brian Chess [Chess and McGraw 2004].
API: freopen (continued from the previous page)
Description: The freopen() function opens the file whose pathname is the string pointed to by filename and associates the stream pointed to by stream with it. The mode argument is used just as in fopen(). freopen() is vulnerable to TOCTOU attacks. A call to freopen() should be flagged if the first argument (the directory or filename) is used earlier in a "check" category call. On Windows platforms the APIs _freopen, _tfreopen, and _wfreopen are synonymous with freopen.

API: fscanf
Kingdom: Input Validation and Representation
Description: The scanf family of functions scans input according to a format as described below. This format may contain conversion specifiers; the results from such conversions, if any, are stored through the pointer arguments. The scanf function reads input from the standard input stream stdin, fscanf reads input from the stream pointer stream, and sscanf reads its input from the character string pointed to by str. The vulnerability of the scanf() function resides in the fact that it has no bounds-checking capability. If the string that is being accepted is longer than the buffer size, the characters will overflow into the adjoining memory space. This is a classic buffer overflow security vulnerability problem. The scanf() function is susceptible to buffer overflow.

API: fstat, ftok, ftw
Kingdom: Time and State
Description: Verify file states before file operations; they are susceptible to races. (Also make sure that buffers are large enough.)

API: fwprintf
Kingdom: Input Validation and Representation
Description: The printf family of functions is susceptible to a variety of format string and buffer overflow attacks. Flag any instance of the printf() family of functions in the code. Determine whether or not the format string is being provided through some input channel. If it is using a single argument, this is a definite vulnerability. Replace the code with the "fix" section.
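To make the printf-family entry concrete, here is a minimal sketch of the single-argument pattern it warns about and the usual fix. The function and variable names are invented for illustration, and the code is not drawn from the rule table; the dangerous call is left commented out so the example runs cleanly.

#include <stdio.h>

void log_message(const char *user_supplied)
{
    /* Single-argument form: the attacker-controlled string becomes the
     * format, so conversion sequences such as %x or %n in the input drive
     * printf() itself. This is the "definite vulnerability" the entry
     * above describes. */
    /* printf(user_supplied); */

    /* Fix: pass a constant format string and treat the input purely as
     * data. */
    printf("%s\n", user_supplied);
}

int main(void)
{
    log_message("ordinary text that happens to contain %x");
    return 0;
}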
Appendix B

ITS4 Rules

ITS4 <http://www.cigital.com/its4/> and its counterparts RATS and Flawfinder provided an early set of software security rules built into very basic static analysis tools. See Chapter 4 for more on static analysis tools and their use.

The rules shown here are enforced in ITS4 by essentially grepping through source code looking for simple patterns, an approach filled with potential false positives. Not surprisingly, most of these rules are about APIs in UNIX- or Windows-based systems. What follows is a complete list of the kinds of rules that were built into ITS4. RATS added several hundred more rules of a very similar nature.1

The rules shown here were taken from Cigital's extensive knowledge base of software security rules. Only three (of many more) fields are shown.

Every basic security scanner should include these rules. Any scanner that doesn't is not worth its salt. Consider this the tiniest minimum set of security rules that every static analysis tool should cover. A better minimum set would include all rules from ITS4, RATS, and SourceScope (see Chapter 4).

This is not an endorsement of ITS4, which is ancient technology that should no longer be used. Instead, the idea is to give you an idea of the kinds of rules that static analysis tools enforce.

Surgeon General's Warning
Use of ITS4 by clueless security people in the name of imposing software security on unsuspecting developers may cause a severe allergic reaction.

1A Venn diagram of rules overlap for early tools can be seen in Figure 4-1 (of Chapter 4).
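As an illustration of why pattern matching of this kind is noisy, consider a hypothetical fragment like the following (invented for this note, not taken from ITS4's rule list or test suite):

#include <stdio.h>
#include <string.h>

/* A pattern-matching scanner flags the strcpy() call below even though the
 * source is a constant known to fit in the 16-byte destination: the tool
 * matches the token "strcpy" and knows nothing about buffer sizes, which is
 * where many of the false positives come from. */
static void set_default_name(char name[16])
{
    strcpy(name, "anonymous");
}

int main(void)
{
    char name[16];
    set_default_name(name);
    printf("%s\n", name);
    return 0;
}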
Figure 4-6  The Software Security Manager Dashboard helps bring source code analysis up out of the weeds.

The Software Security Manager is a Web-based security policy and reporting interface that enables development teams to manage and control risk across multiple projects and releases. The Software Security Manager helps to centralize reporting, enable trend analysis, and produce software security reports for management. The Software Security Manager includes a number of predefined metrics that cover the number and type of vulnerabilities, policy violations, and severity. Figure 4-6 shows the Software Security Manager Dashboard.

The Fortify Knowledge Base

The most critical feature of any static analysis tool involves the knowledge built into it. We've come a long way since the early days of RATS and ITS4 when a simple grep for a possibly dangerous API might suffice. Today, the software security knowledge expected to drive static analysis tools is much more sophisticated.

A complete taxonomy of software security vulnerabilities that can be uncovered using automated tools is discussed in Chapter 12. Software
of Saltzer and Schroeder [1975]) and rules (identified and captured in static analysis tools such as ITS4 [Viega et al. 2000a]) are fairly well understood. Knowledge catalogs only more recently identified include guidelines (often built into prescriptive frameworks for technologies such as .NET and J2EE), attack patterns [Hoglund and McGraw 2004], and historical risks. Together, these various knowledge catalogs provide a basic foundation for a unified knowledge architecture supporting software security.

Software security knowledge can be successfully applied at various stages throughout the entire SDLC. One effective way to apply such knowledge is through the use of software security touchpoints. For example, rules are extremely useful for static analysis and code review activities.

Figure 1-12 shows an enhanced version of the software security touchpoints diagram introduced in Figure 1-9. In Figure 1-12, I identify those activities and artifacts most clearly impacted by the knowledge catalogs briefly mentioned above. More information about these catalogs can be found in Chapter 11.

Awareness of the software security problem is growing among researchers and some security practitioners. However, the most important audience has in some sense experienced the least exposure: for the most part, software architects, developers, and testers remain blithely unaware of the problem. One obvious way to spread software security knowledge is to train software development staff on critical software security issues. The most effective form of training begins with a description of the problem and demonstrates its impact and importance. During the Windows security push in February and March 2002, Microsoft provided basic awareness training to all of its developers. Many other organizations have ongoing software security training programs. Beyond awareness, advanced software security training should offer coverage of security engineering, design principles and guidelines, implementation bugs, design flaws, analysis techniques, and security testing. Special tracks should be made available to quality assurance personnel, especially those who carry out testing.

Of course, the best training programs will offer extensive and detailed coverage of the touchpoints covered in this book. Putting the touchpoints into practice requires cultural change, and that means training. Assembling a complete software security program at the enterprise level is the subject of Chapter 10.

The good news is that the three pillars of software security (risk management, touchpoints, and knowledge) can be applied in a sensible, evolutionary manner no matter what your existing software development approach is.
that knows this stuff) will be better prepared to build security in than those who don't. Though this taxonomy is incomplete and imperfect, it provides an important start.

One of the problems of all categorization schemes like this is that they don't leave room for new (often surprising) kinds of vulnerabilities. Nor do they take into account higher-level concerns such as the architectural flaws and associated risks described in Chapter 5.2 Even when it comes to simple security-related coding issues themselves, this taxonomy is not perfect. Coding problems in embedded control software and common bugs in high-assurance software developed using formal methods are poorly represented here, for example. The bulk of this taxonomy is influenced by the kinds of security coding problems often found in large enterprise software projects. Of course, only coding problems are represented since the purpose of this taxonomy is to feed a static analysis engine with knowledge.

The taxonomy as it stands is neither comprehensive nor theoretically complete. Instead, it is practical and based on real-world experience. The focus is on collecting common errors and explaining them in such a way that they make sense to programmers. The taxonomy is expected to evolve and change as time goes by and coding issues (e.g., platform, language of choice, and so on) change.

This version of the taxonomy places emphasis on concrete and specific problems over abstract or theoretical ones. In some sense, the taxonomy may err in favor of omitting "big-picture" errors in favor of covering specific and widespread errors.

The taxonomy is made up of two distinct kinds of sets (which we're stealing from biology). What is called a phylum is a type or particular kind of coding error; for example, Illegal Pointer Value is a phylum. What is called a kingdom is a collection of phyla that share a common theme. That is, kingdoms are sets of phyla; for example, Input Validation and Representation is a kingdom. Both kingdoms and phyla naturally emerge from a soup of coding rules relevant to enterprise software. For this reason, the taxonomy is likely to be incomplete and may be missing certain coding errors. In some cases, it is easier and more effective to talk about a category of errors than it is to talk about any particular attack. Though categories are certainly related to attacks, they are not the same as attack patterns.

2This should really come as no surprise. Static analysis for architectural flaws would require a formal architectural description so that pattern matching could occur. No such architectural description exists. (And before you object, UML doesn't cut it.)