File System Example

Using Qt XML Patterns for querying non-XML data that is modeled to look like XML.

This example shows how to use Qt XML Patterns for querying non-XML data that is modeled to look like XML.

Introduction

The example models your computer's file system to look like XML and allows you to query the file system with XQuery. Suppose, for example, that we want to find all the cpp files in the subtree beginning at /filetree.

The User Interface

The example is shown below. First, we use File->Open Directory (not shown) to select the /filetree directory. Then we use the combobox on the right to select the XQuery that searches for cpp files (listCPPFiles.xq). Selecting an XQuery runs the query, which in this case traverses the model looking for all the cpp files. The XQuery text and the query results are shown on the right.

Don't be misled by the XML representation of the /filetree directory shown on the left. This is not the node model itself but the XML obtained by traversing the node model and outputting it as XML. Constructing and using the custom node model is explained in the code walk-through.

Running your own XQueries

You can write your own XQuery files and run them in the example program. The file xmlpatterns/filetree/queries.qrc is the resource file for this example. main.cpp loads it by calling Q_INIT_RESOURCE(queries). The resource file lists the XQuery files (.xq) that can be selected in the combobox:

 <!DOCTYPE RCC>
 <RCC version="1.0">
 <qresource>
     <file>queries/listCPPFiles.xq</file>
     <file>queries/wholeTree.xq</file>
 </qresource>
 </RCC>

To add your own queries to the example's combobox, store your .xq files in the examples/xmlpatterns/filetree/queries directory and add them to queries.qrc as shown above.

Code Walk-Through

The strategy is to create a custom node model that represents the directory tree of the computer's file system. That tree structure is non-XML data. The custom node model must have the same callback interface as the XML node models that the Qt XML Patterns query engine uses to execute queries. The query engine can then traverse the custom node model as if it were traversing the node model built from an XML document.

The required callback interface is in QAbstractXmlNodeModel, so we create a custom node model by subclassing QAbstractXmlNodeModel and providing implementations for its pure virtual functions. In many cases, the implementations of several of the virtual functions are always the same, so Qt XML Patterns also provides QSimpleXmlNodeModel, which subclasses QAbstractXmlNodeModel and provides implementations for the callback functions that you can ignore. By subclassing QSimpleXmlNodeModel instead of QAbstractXmlNodeModel, you can reduce development time.

The Custom Node Model Class: FileTree

The custom node model for this example is class FileTree, which is derived from QSimpleXmlNodeModel. FileTree implements all the callback functions that don't have standard implementations in QSimpleXmlNodeModel. When you implement your own custom node model, you must provide implementations for these callback functions:

 virtual QXmlNodeModelIndex::DocumentOrder compareOrder(const QXmlNodeModelIndex&, const QXmlNodeModelIndex&) const;
 virtual QXmlName name(const QXmlNodeModelIndex &node) const;
 virtual QUrl documentUri(const QXmlNodeModelIndex &node) const;
 virtual QXmlNodeModelIndex::NodeKind kind(const QXmlNodeModelIndex &node) const;
 virtual QXmlNodeModelIndex root(const QXmlNodeModelIndex &node) const;
 virtual QVariant typedValue(const QXmlNodeModelIndex &node) const;
 virtual QVector<QXmlNodeModelIndex> attributes(const QXmlNodeModelIndex &element) const;
 virtual QXmlNodeModelIndex nextFromSimpleAxis(SimpleAxis, const QXmlNodeModelIndex&) const;

The FileTree class declares four data members:

 mutable QVector<QFileInfo>  m_fileInfos;
 const QDir::Filters         m_filterAllowAll;
 const QDir::SortFlags       m_sortFlags;
 QVector<QXmlName>           m_names;

The QVector m_fileInfos will contain the node model. Each QFileInfo in the vector will represent a file or a directory in the file system. Note that although the node model class for this example (FileTree) actually builds and contains the custom node model, building a custom node model isn't always required. For example, it is possible to use an already existing QObject tree as a node model and just implement the callback interface for that existing data structure. In this file system example, however, the existing data structure, i.e. the file system, is not in memory and is not in a form we can use. So we must build an analog of the file system in memory from instances of QFileInfo, and we use that analog as the custom node model.

The two sets of flags, m_filterAllowAll and m_sortFlags, contain OR'ed flags from QDir::Filters and QDir::SortFlags respectively. They are set by the FileTree constructor and used in calls to QDir::entryInfoList() for getting the child list for a directory node, i.e. a QFileInfoList containing the file and directory nodes for all the immediate children of a directory.

The QVector m_names is an auxiliary component of the node model. It holds the XML element and attribute names (QXmlName) for all the node types that will be found in the node model. m_names is indexed by the enum FileTree::Type, which specifies the node types:

 enum Type {
     File,
     Directory,
     AttributeFileName,
     AttributeFilePath,
     AttributeSize,
     AttributeMIMEType,
     AttributeSuffix
 };

Directory and File will represent the XML element nodes for directories and files respectively, and the other enum values will represent the XML attribute nodes for a file's path, name, suffix, size in bytes, and MIME type. The FileTree constructor initializes m_names with an appropriate QXmlName for each element and attribute type:

 FileTree::FileTree(const QXmlNamePool& pool)
   : QSimpleXmlNodeModel(pool),
     m_filterAllowAll(QDir::AllEntries |
                      QDir::AllDirs |
                      QDir::NoDotAndDotDot |
                      QDir::Hidden),
     m_sortFlags(QDir::Name)
 {
     QXmlNamePool np = namePool();
     m_names.resize(7);
     m_names[File]               = QXmlName(np, QLatin1String("file"));
     m_names[Directory]          = QXmlName(np, QLatin1String("directory"));
     m_names[AttributeFileName]  = QXmlName(np, QLatin1String("fileName"));
     m_names[AttributeFilePath]  = QXmlName(np, QLatin1String("filePath"));
     m_names[AttributeSize]      = QXmlName(np, QLatin1String("size"));
     m_names[AttributeMIMEType]  = QXmlName(np, QLatin1String("mimeType"));
     m_names[AttributeSuffix]    = QXmlName(np, QLatin1String("suffix"));
 }
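The constructor sizes m_names with the literal 7, which has to be kept in step with the Type enum by hand. As a minimal, Qt-free sketch of the same enum-indexed lookup table, the following uses std::string in place of QXmlName and adds a sentinel enumerator, TypeCount, so the table size follows the enum. TypeCount, makeNames() and nameFor() are hypothetical names introduced for illustration and are not part of the example source.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Same node types as FileTree::Type, plus a hypothetical sentinel
// (TypeCount) that is not part of the original example.
enum Type {
    File,
    Directory,
    AttributeFileName,
    AttributeFilePath,
    AttributeSize,
    AttributeMIMEType,
    AttributeSuffix,
    TypeCount // number of real node types; keeps the table size in sync
};

using NameTable = std::array<std::string, TypeCount>;

// Builds the name table; std::string stands in for QXmlName.
inline NameTable makeNames()
{
    NameTable names;
    names[File]              = "file";
    names[Directory]         = "directory";
    names[AttributeFileName] = "fileName";
    names[AttributeFilePath] = "filePath";
    names[AttributeSize]     = "size";
    names[AttributeMIMEType] = "mimeType";
    names[AttributeSuffix]   = "suffix";
    return names;
}

// Looks up the name for a node type, in the way FileTree::name()
// uses the node's additionalData() as an index into m_names.
inline const std::string &nameFor(const NameTable &names, Type type)
{
    const std::size_t i = static_cast<std::size_t>(type);
    assert(i < names.size()); // reject the sentinel and stray values
    return names[i];
}
```

With this layout, adding a node type means adding one enumerator before TypeCount and one assignment in makeNames(); no separate size constant needs updating.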

Note that the constructor does not pre-build the entire node model. Instead, the node model is built incrementally as the query engine evaluates a query. To see how the query engine causes the node model to be built incrementally, see Building And Traversing The Node Model. To see how the query engine accesses the node model, see Accessing The Node Model. See also: Node Model Building Strategy.

Accessing The Node Model

Since the node model is stored outside the query engine in the FileTree class, the query engine knows nothing about it and can only access it by calling functions in the callback interface. When the query engine calls any callback function to access data in the node model, it passes a QXmlNodeModelIndex to identify the node in the node model that it wants to access. Hence all the virtual functions in the callback interface use a QXmlNodeModelIndex to uniquely identify a node in the model.

We use the index of a QFileInfo in m_fileInfos to uniquely identify a node in the node model. To get the QXmlNodeModelIndex for a QFileInfo, the class uses the private function toNodeIndex():

 QXmlNodeModelIndex
 FileTree::toNodeIndex(const QFileInfo &fileInfo, Type attributeName) const
 {
     const int indexOf = m_fileInfos.indexOf(fileInfo);

     if (indexOf == -1) {
         m_fileInfos.append(fileInfo);
         return createIndex(m_fileInfos.count()-1, attributeName);
     }
     else
         return createIndex(indexOf, attributeName);
 }

It searches the m_fileInfos vector for a QFileInfo that matches fileInfo. If a match is found, its array index is passed to QAbstractXmlNodeModel::createIndex() as the data value for the QXmlNodeModelIndex. If no match is found, the unmatched QFileInfo is appended to the vector, so this function is also doing the actual incremental model building (see Building And Traversing The Node Model).
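The find-or-append step can be looked at in isolation. The following is a minimal sketch that assumes std::vector and std::string as stand-ins for QVector and QFileInfo; findOrAppend() is a hypothetical helper name, not a function from the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns the position of `item` in `store`, appending it first if it is
// not there yet. This mirrors the indexOf()/append() pair in toNodeIndex(),
// with std::string standing in for QFileInfo.
inline std::size_t findOrAppend(std::vector<std::string> &store,
                                const std::string &item)
{
    const auto it = std::find(store.begin(), store.end(), item);
    if (it != store.end())
        return static_cast<std::size_t>(it - store.begin());

    store.push_back(item);        // first request: the "model" grows
    assert(store.back() == item);
    return store.size() - 1;
}
```

Because the lookup is a linear search, each call costs time proportional to the number of entries already stored. That is simple and adequate for a small model; a larger one might pair the vector with a hash map from item to position, which is a design option and not something the example does.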

Note that toNodeIndex() gets a node type as the second parameter, which it just passes on to createIndex() as the additionalData value. Logically, this second parameter represents a second dimension in the node model, where the first dimension represents the element nodes, and the second dimension represents each element's attribute nodes. This means that each QFileInfo in the m_fileInfos vector can represent an element node and one or more attribute nodes. In particular, the QFileInfo for a file will contain the values for the attribute nodes path, name, suffix, size, and MIME type (see FileTree::attributes()). Since the attributes are contained in the QFileInfo of the file element, there aren't actually any separate attribute nodes stored in the node model. Hence, a flat QVector of QFileInfo is sufficient for m_fileInfos.
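To make the two dimensions concrete, here is a small Qt-free sketch in which one stored record answers for an element node and for its attribute nodes, depending on the type carried in the index. Entry, NodeIndex and valueOf() are hypothetical stand-ins (for QFileInfo, QXmlNodeModelIndex and a value lookup), and the enum is a reduced subset of FileTree::Type. The walk-through does not show how the example itself returns node values, so this should be read as an illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Reduced subset of FileTree::Type, enough for the illustration.
enum Type { File, Directory, AttributeFileName, AttributeFilePath, AttributeSize };

// Hypothetical stand-in for QFileInfo: one record per file or directory.
struct Entry {
    std::string path;
    std::string name;
    long long   size;
};

// Hypothetical stand-in for QXmlNodeModelIndex: `row` plays the role of
// data() and `type` plays the role of additionalData().
struct NodeIndex {
    std::size_t row;
    Type        type;
};

// Returns the string value of a node. Element nodes have no value in this
// sketch; attribute nodes read their value from the element's own record.
inline std::string valueOf(const std::vector<Entry> &entries, NodeIndex index)
{
    assert(index.row < entries.size());
    const Entry &e = entries[index.row];
    switch (index.type) {
    case AttributeFileName: return e.name;
    case AttributeFilePath: return e.path;
    case AttributeSize:     return std::to_string(e.size);
    case File:
    case Directory:         break;
    }
    return std::string();
}
```

Two indexes with the same row but different types therefore refer to the same record and differ only in which part of it they expose.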

A convenience overload of toNodeIndex() is also called in several places, wherever it is known that the QXmlNodeModelIndex being requested is for a directory or a file and not for an attribute. The convenience function takes only the QFileInfo parameter and calls the other toNodeIndex(), after obtaining either the Directory or File node type directly from the QFileInfo:

 QXmlNodeModelIndex FileTree::toNodeIndex(const QFileInfo &fileInfo) const
 {
     return toNodeIndex(fileInfo, fileInfo.isDir() ? Directory : File);
 }

Note that the auxiliary vector m_names is accessed using the node type, for example:

 QXmlName FileTree::name(const QXmlNodeModelIndex &node) const
 {
     return m_names.at(node.additionalData());
 }

Most of the virtual functions in the callback interface are as simple as the ones described so far, but the callback function used for traversing (and building) the node model is more complex.

Building And Traversing The Node Model

The node model in FileTree is not fully built before the query engine begins evaluating the query. In fact, when the query engine begins evaluating its first query, the only node in the node model is the one representing the root directory for the selected part of the file system. See The UI Class: MainWindow below for details about how the UI triggers creation of the model.

The query engine builds the node model incrementally each time it calls the nextFromSimpleAxis() callback function, as it traverses the node model to evaluate a query. Thus the query engine only builds the region of the node model that it needs for evaluating the query.
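The effect of building on demand can be sketched without Qt. The following assumes a hypothetical in-memory map, FakeFileSystem, in place of the real file system, and a hypothetical class, LazyModel, that starts with the root only and adds a node the first time a traversal step reaches it. The constant kNoNode plays the role of a default constructed QXmlNodeModelIndex. None of these names come from the example source.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for the file system: each directory
// path maps to the sorted paths of its immediate children.
using FakeFileSystem = std::map<std::string, std::vector<std::string>>;

// Row value meaning "no such node" (analog of a null node index).
const std::size_t kNoNode = static_cast<std::size_t>(-1);

class LazyModel
{
public:
    LazyModel(const FakeFileSystem &fs, const std::string &root)
        : m_fs(fs)
    {
        m_nodes.push_back(root); // the model starts with the root only
    }

    // Analog of the FirstChild axis: returns the row of the first child,
    // adding it to the model on first use, or kNoNode for a leaf.
    std::size_t firstChild(std::size_t row)
    {
        assert(row < m_nodes.size());
        const auto dir = m_fs.find(m_nodes[row]);
        if (dir == m_fs.end() || dir->second.empty())
            return kNoNode;
        return rowFor(dir->second.front());
    }

    // Number of nodes materialized so far.
    std::size_t nodeCount() const { return m_nodes.size(); }

private:
    // Find-or-append, as in toNodeIndex().
    std::size_t rowFor(const std::string &path)
    {
        const auto it = std::find(m_nodes.begin(), m_nodes.end(), path);
        if (it != m_nodes.end())
            return static_cast<std::size_t>(it - m_nodes.begin());
        m_nodes.push_back(path);
        return m_nodes.size() - 1;
    }

    const FakeFileSystem    &m_fs;    // must outlive the model
    std::vector<std::string> m_nodes; // nodes reached so far
};
```

A traversal that only asks for the first child of the root leaves the model with two nodes, however large the underlying tree is; asking again reuses the existing row instead of adding a duplicate.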

nextFromSimpleAxis() takes an axis identifier and a node identifier as parameters. The node identifier represents the context node (i.e. the query engine's current location in the model), and the axis identifier represents the direction we want to move from the context node. The function finds the appropriate next node and returns its QXmlNodeModelIndex.

nextFromSimpleAxis() is where most of the work of implementing a custom node model will be required. The obvious way to do it is to use a switch statement with a case for each axis.

 QXmlNodeModelIndex
 FileTree::nextFromSimpleAxis(SimpleAxis axis, const QXmlNodeModelIndex &nodeIndex) const
 {
     const QFileInfo fi(toFileInfo(nodeIndex));
     const Type type = Type(nodeIndex.additionalData());

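     if (type != File && type != Directory) {
         Q_ASSERT_X(axis == Parent, Q_FUNC_INFO, "An attribute only has a parent!");
         // The parent of an attribute is its owning element, which may be a
         // file or a directory; the one-argument overload picks the type.
         return toNodeIndex(fi);
     }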

     switch (axis) {
         case Parent:
             return toNodeIndex(QFileInfo(fi.path()), Directory);

         case FirstChild:
         {
             if (type == File) // A file has no children.
                 return QXmlNodeModelIndex();
             else {
                 Q_ASSERT(type == Directory);
                 Q_ASSERT_X(fi.isDir(), Q_FUNC_INFO, "It isn't really a directory!");
                 const QDir dir(fi.absoluteFilePath());
                 Q_ASSERT(dir.exists());

                 const QFileInfoList children(dir.entryInfoList(QStringList(),
                                                                m_filterAllowAll,
                                                                m_sortFlags));
                 if (children.isEmpty())
                     return QXmlNodeModelIndex();
                 const QFileInfo firstChild(children.first());
                 return toNodeIndex(firstChild);
             }
         }

         case PreviousSibling:
             return nextSibling(nodeIndex, fi, -1);

         case NextSibling:
             return nextSibling(nodeIndex, fi, 1);
     }

     Q_ASSERT_X(false, Q_FUNC_INFO, "Don't ever get here!");
     return QXmlNodeModelIndex();
 }

The first thing this function does is call toFileInfo() to get the QFileInfo of the context node. The call to QVector::at() in toFileInfo() is guaranteed to succeed because the context node must already be in the node model, and hence must have a QFileInfo in m_fileInfos.

 const QFileInfo&
 FileTree::toFileInfo(const QXmlNodeModelIndex &nodeIndex) const
 {
     return m_fileInfos.at(nodeIndex.data());
 }

The Parent case looks up the context node's parent by constructing a QFileInfo from the context node's path and passing it to toNodeIndex() to find the QFileInfo in m_fileInfos.

The FirstChild case requires that the context node must be a directory, because a file doesn't have children. If the context node is not a directory, a default constructed QXmlNodeModelIndex is returned. Otherwise, QDir::entryInfoList() constructs a QFileInfoList of the context node's children. The first QFileInfo in the list is passed to toNodeIndex() to get its QXmlNodeModelIndex. Note that this will add the child to the node model, if it isn't in the model yet.

The PreviousSibling and NextSibling cases call the nextSibling() helper function. It takes the QXmlNodeModelIndex of the context node, the QFileInfo of the context node, and an offset of +1 or -1. The context node is a child of some parent, so the function gets the parent and then gets the child list for the parent. The child list is searched to find the QFileInfo of the context node. It must be there. Then the offset is applied, -1 for the previous sibling and +1 for the next sibling. If the resulting position falls outside the child list, there is no such sibling and a default constructed QXmlNodeModelIndex is returned. Otherwise, the QFileInfo at that position is passed to toNodeIndex() to get its QXmlNodeModelIndex. Note again that this will add the sibling to the node model, if it isn't in the model yet.
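The offset-and-bounds step of that description can be sketched on its own, without Qt. The sketch assumes a std::vector of names as the parent's child list; siblingPosition() and kNoSibling are hypothetical names. Unlike the helper in the example, the sketch also asserts that the context node was actually found in the list, which is an addition made here for clarity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Position value meaning "no such sibling"; it plays the role of a
// default constructed (null) QXmlNodeModelIndex.
const std::size_t kNoSibling = static_cast<std::size_t>(-1);

// Returns the position of the sibling `offset` steps away from `self`
// in `siblings`, or kNoSibling if that would step outside the list.
inline std::size_t siblingPosition(const std::vector<std::string> &siblings,
                                   const std::string &self,
                                   int offset)
{
    assert(offset == -1 || offset == 1);

    // The context node must be in its parent's child list.
    const auto it = std::find(siblings.begin(), siblings.end(), self);
    assert(it != siblings.end());

    // Signed arithmetic, so stepping back from position 0 yields -1
    // instead of wrapping around.
    const std::ptrdiff_t target = (it - siblings.begin()) + offset;
    if (target < 0 || target >= static_cast<std::ptrdiff_t>(siblings.size()))
        return kNoSibling;
    return static_cast<std::size_t>(target);
}
```

For a child list of a.cpp, b.h and c.cpp, stepping +1 from b.h gives position 2, stepping -1 from a.cpp gives kNoSibling, and stepping +1 from c.cpp gives kNoSibling as well.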

 QXmlNodeModelIndex FileTree::nextSibling(const QXmlNodeModelIndex &nodeIndex,
                                          const QFileInfo &fileInfo,
                                          qint8 offset) const
 {
     Q_ASSERT(offset == -1 || offset == 1);

     // Get the context node's parent.
     const QXmlNodeModelIndex parent(nextFromSimpleAxis(Parent, nodeIndex));

     if (parent.isNull())
         return QXmlNodeModelIndex();

     // Get the parent's child list.
     const QFileInfo parentFI(toFileInfo(parent));
     Q_ASSERT(Type(parent.additionalData()) == Directory);
     const QFileInfoList siblings(QDir(parentFI.absoluteFilePath()).entryInfoList(QStringList(),
                                                                                  m_filterAllowAll,
                                                                                  m_sortFlags));
     Q_ASSERT_X(!siblings.isEmpty(), Q_FUNC_INFO, "Can't happen! We started at a child.");

     // Find the index of the child where we started.
     const int indexOfMe = siblings.indexOf(fileInfo);

     // Apply the offset.
     const int siblingIndex = indexOfMe + offset;
     if (siblingIndex < 0 || siblingIndex > siblings.count() - 1)
         return QXmlNodeModelIndex();
     else
         return toNodeIndex(siblings.at(siblingIndex));
 }

The UI Class: MainWindow

The example's UI class, MainWindow, is a conventional Qt GUI class that inherits QMainWindow and the Ui_MainWindow base class generated from the Qt Designer form.

 #include "filetree.h"
 #include "ui_mainwindow.h"

 class MainWindow : public QMainWindow, private Ui_MainWindow
 {
     Q_OBJECT

 public:
     MainWindow();

 private slots:
     void on_actionOpenDirectory_triggered();
     void on_actionAbout_triggered();
     void on_queryBox_currentIndexChanged(const QString &);

 private:
     void loadDirectory(const QString &directory);
     void evaluateResult();

     const QXmlNamePool  m_namePool;
     const FileTree      m_fileTree;
     QXmlNodeModelIndex  m_fileNode;
 };

It contains the custom node model (m_fileTree) and an instance of QXmlNodeModelIndex (m_fileNode) used for holding the node index for the root of the file system subtree. m_fileNode will be bound to a $variable in the XQuery to be evaluated.

Two actions of interest are handled by slot functions: Selecting A Directory To Model and Selecting And Running An XQuery.

Selecting A Directory To Model

The user selects File->Open Directory to choose a directory to be loaded into the custom node model. Choosing a directory signals the on_actionOpenDirectory_triggered() slot:

 void MainWindow::on_actionOpenDirectory_triggered()
 {
     const QString directoryName = QFileDialog::getExistingDirectory(this);
     if (!directoryName.isEmpty())
         loadDirectory(directoryName);
 }

The slot function simply calls the private function loadDirectory() with the path of the chosen directory:

 void MainWindow::loadDirectory(const QString &directory)
 {
     Q_ASSERT(QDir(directory).exists());

     m_fileNode = m_fileTree.nodeFor(directory);

     QXmlQuery query(m_namePool);
     query.bindVariable("fileTree", m_fileNode);
     query.setQuery(QUrl("qrc:/queries/wholeTree.xq"));

     QByteArray output;
     QBuffer buffer(&output);
     buffer.open(QIODevice::WriteOnly);

     QXmlFormatter formatter(query, &buffer);
     query.evaluateTo(&formatter);

     treeInfo->setText(tr("Model of %1 output as XML.").arg(directory));
     // QXmlFormatter encodes its output as UTF-8 by default.
     fileTree->setText(QString::fromUtf8(output.constData()));
     evaluateResult();
 }

loadDirectory() demonstrates a standard code pattern for using Qt XML Patterns programmatically. First it gets the node model index for the root of the selected directory. Then it creates an instance of QXmlQuery and calls QXmlQuery::bindVariable() to bind the node index to the XQuery variable $fileTree. It then calls QXmlQuery::setQuery() to load the XQuery text.

Note: QXmlQuery::bindVariable() must be called before calling QXmlQuery::setQuery(), which loads and parses the XQuery text and must have access to the variable binding as the text is parsed.

The next lines create an output device for outputting the query result, which is then used to create a QXmlFormatter to format the query result as XML. QXmlQuery::evaluateTo() is called to run the query, and the formatted XML output is displayed in the left panel of the UI window.

Finally, the private function evaluateResult() is called to run the currently selected XQuery over the custom node model.

Note: As described in Building And Traversing The Node Model, the FileTree class is designed to build the custom node model incrementally as the query engine evaluates an XQuery. But, because the loadDirectory() function runs the wholeTree.xq XQuery, it actually builds the entire node model anyway. See Node Model Building Strategy for a discussion about building your custom node model.

Selecting And Running An XQuery

The user chooses an XQuery from the menu in the combobox on the right. Choosing an XQuery signals the on_queryBox_currentIndexChanged() slot:

 void MainWindow::on_queryBox_currentIndexChanged(const QString &currentText)
 {
     QFile queryFile(":/queries/" + currentText);
     queryFile.open(QIODevice::ReadOnly);

     queryEdit->setPlainText(QString::fromLatin1(queryFile.readAll()));
     evaluateResult();
 }

The slot function opens and loads the query file and then calls the private function evaluateResult() to run the query:

 void MainWindow::evaluateResult()
 {
     if (queryBox->currentText().isEmpty() || m_fileNode.isNull())
         return;

     QXmlQuery query(m_namePool);
     query.bindVariable("fileTree", m_fileNode);
     query.setQuery(QUrl("qrc:/queries/" + queryBox->currentText()));

     QByteArray formatterOutput;
     QBuffer buffer(&formatterOutput);
     buffer.open(QIODevice::WriteOnly);

     QXmlFormatter formatter(query, &buffer);
     query.evaluateTo(&formatter);

     // QXmlFormatter encodes its output as UTF-8 by default.
     output->setText(QString::fromUtf8(formatterOutput.constData()));
 }

evaluateResult() is a second example of the same code pattern shown in loadDirectory(). In this case, it runs the XQuery currently selected in the combobox instead of qrc:/queries/wholeTree.xq, and it outputs the query result to the panel on the lower right of the UI window.

Node Model Building Strategy

We saw that FileTree is designed to build its custom node model incrementally, but we also saw that the MainWindow::loadDirectory() function in the UI class immediately subverts the incremental build by running the wholeTree.xq XQuery, which traverses the entire selected directory, thereby causing the entire node model to be built.

If we want to preserve the incremental build capability of the FileTree class, we can strip the running of wholeTree.xq out of MainWindow::loadDirectory():

 void MainWindow::loadDirectory(const QString &directory)
 {
     Q_ASSERT(QDir(directory).exists());

     m_fileNode = m_fileTree.nodeFor(directory);
 }

Note, however, that FileTree doesn't have the capability of deleting all or part of the node model. The node model, once built, is only deleted when the FileTree instance goes out of scope.

In this example, each element node in the node model represents a directory or a file in the computer's file system, and each node is represented by an instance of QFileInfo. An instance of QFileInfo is not costly to produce, but you might imagine a node model where building new nodes is very costly. In such cases, the capability to build the node model incrementally is important, because it allows us to only build the region of the model we need for evaluating the query. In other cases, it will be simpler to just build the entire node model.
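For the costly-node case, one possible shape is a container that builds a node the first time it is requested and reuses it afterwards. The sketch below is Qt-free and hypothetical throughout: OnDemandNodes, its builder callback and buildCount() are illustration names, and Node stands for whatever is expensive to construct in a real model.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical container that builds a node the first time it is
// requested and reuses it afterwards.
template <typename Node>
class OnDemandNodes
{
public:
    explicit OnDemandNodes(std::function<Node(const std::string &)> builder)
        : m_builder(std::move(builder))
    {
        assert(m_builder != nullptr); // a builder is required
    }

    // Returns the node for `key`, building it only if it was never requested.
    const Node &node(const std::string &key)
    {
        auto it = m_nodes.find(key);
        if (it == m_nodes.end()) {
            it = m_nodes.emplace(key, m_builder(key)).first;
            ++m_buildCount;
        }
        return it->second;
    }

    // Number of nodes that have actually been built.
    std::size_t buildCount() const { return m_buildCount; }

private:
    std::function<Node(const std::string &)> m_builder;
    std::map<std::string, Node>              m_nodes;
    std::size_t                              m_buildCount = 0;
};
```

A query that touches two keys out of thousands pays for two builds. Like FileTree, this sketch never discards a node once it has been built, so memory use grows with the part of the model that has been visited.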

Example project @ code.qt.io