C++ Source Code Analyzer Example
This example uses XQuery and the xmlpatterns command line utility to query C++ source code.
Introduction
Suppose we want to analyze C++ source code to find coding standard violations and instances of bad or inefficient patterns. We can do it using the common searching and pattern matching utilities to process the C++ files (e.g., grep, sed, and awk). Now we can also use XQuery with the Qt XML Patterns module.
An extension to the g++ open source C++ compiler (GCC-XML) generates an XML description of C++ source code declarations. Qt XML Patterns can then run XQueries that navigate this XML description and produce a report. Consider the problem of finding mutable global variables:
Reporting Uses of Mutable Global Variables
Suppose we want to introduce threading to a C++ application that was originally written without threading. In a threaded program, mutable global variables can cause bugs, because one thread might change a global variable that other threads are reading, or two threads might try to set the same global variable. So when converting our program to use threading, one of the things we must do is protect the global variables to prevent the bugs described above. How can we use XQuery and GCC-XML to find the variables that need protecting?
A C++ application
Consider the declarations in this hypothetical C++ application:
 1. int mutablePrimitive1;
 2. int mutablePrimitive2;
 3. const int constPrimitive1 = 4;
 4. const int constPrimitive2 = 3;
 5.
 6. class ComplexClass
 7. {
 8. public:
 9.     ComplexClass();
10.     ComplexClass(const ComplexClass &);
11.     ~ComplexClass();
12. };
13.
14. ComplexClass mutableComplex1;
15. ComplexClass mutableComplex2;
16. const ComplexClass constComplex1;
17. const ComplexClass constComplex2;
18.
19. int main()
20. {
22.     int localVariable;
23.     localVariable = 0;
24.     return localVariable;
25. }
The XML description of the C++ application
Submitting this C++ source to GCC-XML produces this XML description:
<?xml version="1.0"?>
<GCC_XML>
  <Namespace id="_1" name="::" members="_3 _4 _5 _6 _7 _8 _9 _10 _11 _12 _13 _14 _15 " mangled="_Z2::"/>
  <Namespace id="_2" name="std" context="_1" members="" mangled="_Z3std"/>
  <Function id="_3" name="_GLOBAL__D_globals.cppwVRo3a" returns="_16" context="_1" location="f0:14" file="f0" line="14" endline="14"/>
  <Function id="_4" name="_GLOBAL__I_globals.cppwVRo3a" returns="_16" context="_1" location="f0:14" file="f0" line="14" endline="14"/>
  <Function id="_5" name="__static_initialization_and_destruction_0" returns="_16" context="_1" mangled="_Z41__static_initialization_and_destruction_0ii" location="f0:23" file="f0" line="23" endline="14">
    <Argument name="__initialize_p" type="_17"/>
    <Argument name="__priority" type="_17"/>
  </Function>
  <Function id="_6" name="main" returns="_17" context="_1" location="f0:20" file="f0" line="20" endline="24"/>
  <Variable id="_7" name="constComplex2" type="_11c" context="_1" location="f0:17" file="f0" line="17"/>
  <Variable id="_8" name="constComplex1" type="_11c" context="_1" location="f0:16" file="f0" line="16"/>
  <Variable id="_9" name="mutableComplex2" type="_11" context="_1" location="f0:15" file="f0" line="15"/>
  <Variable id="_10" name="mutableComplex1" type="_11" context="_1" location="f0:14" file="f0" line="14"/>
  <Class id="_11" name="ComplexClass" context="_1" mangled="12ComplexClass" location="f0:7" file="f0" line="7" members="_19 _20 _21 " bases=""/>
  <Variable id="_12" name="constPrimitive2" type="_17c" init="3" context="_1" location="f0:4" file="f0" line="4"/>
  <Variable id="_13" name="constPrimitive1" type="_17c" init="4" context="_1" location="f0:3" file="f0" line="3"/>
  <Variable id="_14" name="mutablePrimitive2" type="_17" context="_1" location="f0:2" file="f0" line="2"/>
  <Variable id="_15" name="mutablePrimitive1" type="_17" context="_1" location="f0:1" file="f0" line="1"/>
  <FundamentalType id="_16" name="void"/>
  <FundamentalType id="_17" name="int"/>
  <CvQualifiedType id="_11c" type="_11" const="1"/>
  <Constructor id="_19" name="ComplexClass" context="_11" mangled="_ZN12ComplexClassC1Ev *INTERNAL* " location="f0:9" file="f0" line="9" extern="1"/>
  <Constructor id="_20" name="ComplexClass" context="_11" mangled="_ZN12ComplexClassC1ERKS_ *INTERNAL* " location="f0:10" file="f0" line="10" extern="1">
    <Argument type="_23"/>
  </Constructor>
  <Destructor id="_21" name="ComplexClass" context="_11" mangled="_ZN12ComplexClassD1Ev *INTERNAL* " location="f0:11" file="f0" line="11" extern="1">
  </Destructor>
  <CvQualifiedType id="_17c" type="_17" const="1"/>
  <ReferenceType id="_23" type="_11c"/>
  <File id="f0" name="globals.cpp"/>
</GCC_XML>
The XQuery for finding global variables
We need an XQuery to find the global variables in the XML description. Here is our XQuery source. We walk through it in XQuery Code Walk-Through.
(:
    This XQuery loads a GCC-XML file and reports the locations of all
    global variables in the original C++ source. To run the query,
    use the command line:

    xmlpatterns reportGlobals.xq -param fileToOpen=globals.gccxml -output globals.html

    "fileToOpen=globals.gccxml" binds the file name "globals.gccxml"
    to the variable "fileToOpen" declared and used below.
:)

declare variable $fileToOpen as xs:anyURI external;
declare variable $inDoc as document-node() := doc($fileToOpen);

(:
    This function determines whether the typeId is a complex type,
    e.g. QString. We only check whether it's a class. To be strictly
    correct, we should check whether the class has a non-synthesized
    constructor. We accept both mutable and const types.
:)
declare function local:isComplexType($typeID as xs:string) as xs:boolean
{
    exists($inDoc/GCC_XML/Class[@id = $typeID])
    or
    exists($inDoc/GCC_XML/Class[@id = $inDoc/GCC_XML/CvQualifiedType[@id = $typeID]/@type])
};

(:
    This function determines whether the typeId is a primitive type.
:)
declare function local:isPrimitive($typeId as xs:string) as xs:boolean
{
    exists($inDoc/GCC_XML/FundamentalType[@id = $typeId])
};

(:
    This function constructs a line for the report. The line contains
    a variable name, the source file, and the line number.
:)
declare function local:location($block as element()) as xs:string
{
    concat($inDoc/GCC_XML/File[@id = $block/@file]/@name, " at line ", $block/@line)
};

(:
    This function generates the report. Note that it is called once
    in the <body> element of the <html> output.

    It ignores const variables of simple types but reports all others.
:)
declare function local:report() as element()+
{
    let $complexVariables as element(Variable)* := $inDoc/GCC_XML/Variable[local:isComplexType(@type)]
    return if (exists($complexVariables))
           then (<p xmlns="http://www.w3.org/1999/xhtml/">Global variables with complex types:</p>,
                 <ol xmlns="http://www.w3.org/1999/xhtml/">
                 {
                     (: For each Variable in $complexVariables... :)
                     $complexVariables/<li><span class="variableName">{string(@name)}</span> in {local:location(.)}</li>
                 }
                 </ol>)
           else <p xmlns="http://www.w3.org/1999/xhtml/">No complex global variables found.</p>

    ,

    (: The occurrence indicator is * so that an empty result reaches the else branch. :)
    let $primitiveVariables as element(Variable)* := $inDoc/GCC_XML/Variable[local:isPrimitive(@type)]
    return if (exists($primitiveVariables))
           then (<p xmlns="http://www.w3.org/1999/xhtml/">Mutable global variables with primitives types:</p>,
                 <ol xmlns="http://www.w3.org/1999/xhtml/">
                 {
                     (: For each Variable in $primitiveVariables... :)
                     $primitiveVariables/<li><span class="variableName">{string(@name)}</span> in {local:location(.)}</li>
                 }
                 </ol>)
           else <p xmlns="http://www.w3.org/1999/xhtml/">No mutable primitive global variables found.</p>
};

(:
    This is where the <html> report is output. First
    there is some style stuff, then the <body> element,
    which contains the call to local:report() declared above.
:)
<html xmlns="http://www.w3.org/1999/xhtml/" xml:lang="en" lang="en">
    <head>
        <title>Global variables report for {$fileToOpen}</title>
    </head>
    <style type="text/css">
        .details
        {{
            text-align: left;
            font-size: 80%;
            color: blue
        }}
        .variableName
        {{
            font-family: courier;
            color: blue
        }}
    </style>

    <body>
        <p class="details">Start report: {current-dateTime()}</p>
        {
            local:report()
        }
        <p class="details">End report: {current-dateTime()}</p>
    </body>

</html>
Running the XQuery
To run the XQuery using the xmlpatterns command line utility, enter the following command:
xmlpatterns reportGlobals.xq -param fileToOpen=globals.gccxml -output globals.html
The XQuery output
The xmlpatterns command loads and parses globals.gccxml, runs the XQuery reportGlobals.xq, and generates this report:
Start report: 2008-12-16T13:43:49.65Z
Global variables with complex types:
- mutableComplex1 in globals.cpp at line 14
- mutableComplex2 in globals.cpp at line 15
- constComplex1 in globals.cpp at line 16
- constComplex2 in globals.cpp at line 17
Mutable global variables with primitives types:
- mutablePrimitive1 in globals.cpp at line 1
- mutablePrimitive2 in globals.cpp at line 2
End report: 2008-12-16T13:43:49.65Z
XQuery Code Walk-Through
The XQuery source is in examples/xmlpatterns/xquery/globalVariables/reportGlobals.xq. It begins with two variable declarations:
declare variable $fileToOpen as xs:anyURI external;
declare variable $inDoc as document-node() := doc($fileToOpen);
The first variable, $fileToOpen, appears in the xmlpatterns command shown earlier, as -param fileToOpen=globals.gccxml. This binds the variable name to the file name. This variable is then used in the declaration of the second variable, $inDoc, as the parameter to the doc() function. The doc() function returns the document node of globals.gccxml, which is assigned to $inDoc to be used later in the XQuery as the root node of our searches for global variables.
Next, skip to the end of the XQuery, where the <html> element is constructed. The <html> element contains a <head> element that sets the page title, followed by some style instructions for displaying the text, and then the <body> element.
<html xmlns="http://www.w3.org/1999/xhtml/" xml:lang="en" lang="en">
    <head>
        <title>Global variables report for {$fileToOpen}</title>
    </head>
    <style type="text/css">
        .details
        {{
            text-align: left;
            font-size: 80%;
            color: blue
        }}
        .variableName
        {{
            font-family: courier;
            color: blue
        }}
    </style>

    <body>
        <p class="details">Start report: {current-dateTime()}</p>
        {
            local:report()
        }
        <p class="details">End report: {current-dateTime()}</p>
    </body>

</html>
The <body> element contains a call to the local:report() function, which is where the query does the "heavy lifting." Note the two return clauses separated by the comma operator about halfway down:
declare function local:report() as element()+
{
    let $complexVariables as element(Variable)* := $inDoc/GCC_XML/Variable[local:isComplexType(@type)]
    return if (exists($complexVariables))
           then (<p xmlns="http://www.w3.org/1999/xhtml/">Global variables with complex types:</p>,
                 <ol xmlns="http://www.w3.org/1999/xhtml/">
                 {
                     (: For each Variable in $complexVariables... :)
                     $complexVariables/<li><span class="variableName">{string(@name)}</span> in {local:location(.)}</li>
                 }
                 </ol>)
           else <p xmlns="http://www.w3.org/1999/xhtml/">No complex global variables found.</p>

    ,

    (: The occurrence indicator is * so that an empty result reaches the else branch. :)
    let $primitiveVariables as element(Variable)* := $inDoc/GCC_XML/Variable[local:isPrimitive(@type)]
    return if (exists($primitiveVariables))
           then (<p xmlns="http://www.w3.org/1999/xhtml/">Mutable global variables with primitives types:</p>,
                 <ol xmlns="http://www.w3.org/1999/xhtml/">
                 {
                     (: For each Variable in $primitiveVariables... :)
                     $primitiveVariables/<li><span class="variableName">{string(@name)}</span> in {local:location(.)}</li>
                 }
                 </ol>)
           else <p xmlns="http://www.w3.org/1999/xhtml/">No mutable primitive global variables found.</p>
};
The return clauses are like two separate queries. The comma operator separating them means that both return clauses are evaluated and their results are concatenated, in order, into the sequence that the function returns. The first return clause searches for global variables with complex types, and the second searches for mutable global variables with primitive types.
Here is the html generated for the <body> element. Compare it with the XQuery code above:
<body>
    <p class="details">Start report: 2008-12-16T13:43:49.65Z</p>
    <p>Global variables with complex types:</p>
    <ol>
        <li>
            <span class="variableName">mutableComplex1</span> in globals.cpp at line 14</li>
        <li>
            <span class="variableName">mutableComplex2</span> in globals.cpp at line 15</li>
        <li>
            <span class="variableName">constComplex1</span> in globals.cpp at line 16</li>
        <li>
            <span class="variableName">constComplex2</span> in globals.cpp at line 17</li>
    </ol>
    <p>Mutable global variables with primitives types:</p>
    <ol>
        <li>
            <span class="variableName">mutablePrimitive1</span> in globals.cpp at line 1</li>
        <li>
            <span class="variableName">mutablePrimitive2</span> in globals.cpp at line 2</li>
    </ol>
    <p class="details">End report: 2008-12-16T13:43:49.65Z</p>
</body>
The XQuery declares three more local functions that are called in turn by the local:report() function. isComplexType() returns true if the variable has a complex type. The variable can be mutable or const.
declare function local:isComplexType($typeID as xs:string) as xs:boolean
{
    exists($inDoc/GCC_XML/Class[@id = $typeID])
    or
    exists($inDoc/GCC_XML/Class[@id = $inDoc/GCC_XML/CvQualifiedType[@id = $typeID]/@type])
};
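isPrimitive() returns true if the variable has a primitive type. The variable must be mutable: the type attribute of a const variable refers to a CvQualifiedType (for example, _17c) rather than to a FundamentalType, so the test does not match it.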
declare function local:isPrimitive($typeId as xs:string) as xs:boolean
{
    exists($inDoc/GCC_XML/FundamentalType[@id = $typeId])
};
location() returns a string built from the name of the variable's source file and its line number.
declare function local:location($block as element()) as xs:string
{
    concat($inDoc/GCC_XML/File[@id = $block/@file]/@name, " at line ", $block/@line)
};