Manual

Revision as of 13:47, 15 December 2017 by Mat007

This section provides a complete description of every feature in the library.

XML

Using the library

Only one include statement is necessary :

#include <xeumeuleu/xml.hpp>

The whole library is enclosed within the namespace xml.

All input streams derive from xml::xistream and all output streams from xml::xostream.
All operations described in this manual are actually available on those base stream types.

Should an error occur while manipulating a stream, an exception of type xml::exception, which derives from std::exception, will be raised.

Furthermore a textual representation of the current context of an input stream can be retrieved using the context() method. It usually contains the line/column number and any information to identify the input (such as file name).

Creating a stream

The first step is to create the proper stream depending on the needs.

Input streams

The following code creates a stream from an XML document contained in an std::string :

xml::xistringstream xis( string );

The following code creates a stream from an XML document contained in a file file.xml :

xml::xifstream xis( "file.xml" );

The following code creates a stream from an XML document contained in an std::istream :

xml::xistreamstream xis( is );

The following code clones a given existing stream xis :

xml::xibufferstream xibs( xis );

The purpose of this stream implementation is to buffer XML in order to read it later.

The following code branches a given existing stream xis :

xml::xisubstream xiss( xis );

The purpose of this stream implementation is to perform operations on xiss without impacting xis.
Obviously the life-time of the sub-stream must not exceed the life-time of the underlying stream.

The following code combines two streams xis1 and xis2 into a single stream :

xml::ximultistream xis( xis1, xis2 );

The purpose of this stream implementation is to read from two streams transparently as if there was one stream.
If one piece of data is available in both streams, the first one takes precedence, otherwise it is pulled from any of the streams as needed.
The life-time of the composite stream must not exceed the life-time of the underlying streams.
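The precedence rule can be pictured with a small stand-alone model. This is only a sketch and not the library's implementation : the source alias and the find_in_both helper are hypothetical names, and plain maps stand in for the two XML streams.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a stream : element name -> text content.
using source = std::map< std::string, std::string >;

// Return the content found for 'name', looking in 'first' before 'second',
// or a null pointer when neither source has it.
const std::string* find_in_both( const source& first, const source& second, const std::string& name )
{
    const auto in_first = first.find( name );
    if( in_first != first.end() )
        return &in_first->second;
    const auto in_second = second.find( name );
    if( in_second != second.end() )
        return &in_second->second;
    return nullptr;
}
```

In this model a name present in both sources always resolves to the first one, while a name present in only one of them is found wherever it lives.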

Output streams

The following code creates a stream to write an XML document to an std::string :

xml::xostringstream xos;
const std::string xml = xos.str();

After the serialization is finished the resulting document can be retrieved using the str() method.

The following code creates a stream to write an XML document to a file file.xml :

xml::xofstream xos( "file.xml" );

The following code adapts an std::ostream to output an XML document into it :

xml::xostreamstream xos( os );

The following code creates a stream to write an XML document to a buffer :

xml::xobufferstream xobs;

The purpose of this type of stream is to also be an input stream, therefore the buffer content can be read back later.

The following code branches a given existing stream xos :

xml::xosubstream xoss( xos );

The purpose of this stream implementation is to perform operations on xoss without impacting xos.
Obviously the life-time of the sub-stream must not exceed the life-time of the underlying stream.

Encoding

By default the encoding of an input stream is detected from the document prolog if available, for example :

<?xml version="1.0" encoding="UTF-8" ?>
<element/>

If no encoding is specified or no prolog is available, the encoding will be assumed to be UTF-8.

An output stream will be encoded in UTF-8 by default.

The encoding type can be forced at creation time, for example :

xml::xostringstream xos( xml::encoding( "UTF-16" ) );

Or :

xml::xifstream xis( "file.xml", xml::encoding( "ISO-8859-1" ) );

For the list of possible encoding values see "What encodings are supported by Xerces-C / XML4C?".

Validation

By default validation will only occur if an XML schema definition (XSD) is specified when creating an input stream :

xml::xistringstream xis( string, xml::external_grammar( "example.xsd" ) );

The file name is relative to the current working directory.

Note that passing a URI is valid, for instance :

xml::xistringstream xis( string, xml::external_grammar( "http://example.org/example.xsd" ) );

Sometimes an embedded schema definition has been specified in the document :

<?xml version="1.0" encoding="UTF-8" ?>
<element xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="example.xsd"/>

A special xml::internal_grammar has to be used :

xml::xistringstream xis( string, xml::internal_grammar() );

The default behaviour is not to validate against internal schema definitions because it may require an internet connection.

In-memory schemas are also supported through the use of xml::memory_grammar :

const std::string schema = "<xs:schema ...";
xml::xistringstream xis( string, xml::memory_grammar( schema ) );

Combining several schemas can be achieved using the composite grammar xml::grammars :

xml::memory_grammar schema_1( "<xs:schema ..." );
xml::memory_grammar schema_2( "<xs:schema ..." );
xml::grammars composite;
composite.add( schema_1 );
composite.add( schema_2 );
xml::xistringstream xis( string, composite );

Or in a shorter version :

xml::xistringstream xis( string, xml::memory_grammar( "<xs:schema ..." ) + xml::memory_grammar( "<xs:schema ..." ) );

The only combinations supported are with external and memory schemas : internal schemas cannot be mixed with other types of schemas.

Walking the tree

Elements

Elements are (un)serialized using xml::start and xml::end, for example :

xos << xml::start( "element" ) << xml::end;

An important feature concerning input streams is that when unserializing an element the first available child matching the given name will be pulled.
This makes it possible to ignore the ordering of elements, for example given the following XML document :

<root>
  <first-element/>
  <second-element/>
</root>

The following unserialization code is valid :

xis >> xml::start( "root" )
      >> xml::start( "second-element" ) >> xml::end
      >> xml::start( "first-element" ) >> xml::end;

As a side effect the following code is also valid :

xis >> xml::start( "root" )
      >> xml::start( "first-element" ) >> xml::end
      >> xml::start( "first-element" ) >> xml::end
      >> xml::start( "first-element" ) >> xml::end;

It is also considered valid for some nodes to be left unserialized and therefore no exception will be raised.
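The lookup rule described above can be modelled in a few lines of standard C++. This is a sketch under assumptions, not the library's code : children and first_child are hypothetical names, and the model returns -1 for a missing child where the real stream would raise an exception.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical model : the names of the children of the current element, in document order.
using children = std::vector< std::string >;

// Return the index of the first child called 'name', or -1 if there is none.
// The search always restarts from the first child, so the order of the calls is free
// and asking twice for the same name finds the same child twice.
int first_child( const children& nodes, const std::string& name )
{
    const auto it = std::find( nodes.begin(), nodes.end(), name );
    return it == nodes.end() ? -1 : static_cast< int >( it - nodes.begin() );
}
```

Because nothing is consumed by a lookup, reading second-element before first-element, or reading first-element several times, succeeds in this model just as in the examples above.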

An xml::xistream can also be used as an element in order to be directly copied into an xml::xostream, for example given the following input :

<root>
  <element>
    <sub-element/>
  </element>
</root>

The following code :

xml::xifstream xis( "file.xml" );
xis >> xml::start( "root" );
xml::xostringstream xos;
xos << xis;

Will result in the following output document :

<?xml version="1.0" encoding="UTF-8" ?>
<element>
  <sub-element/>
</element>

Note that the serialization can also be written the other way around, that is :

xis >> xos;

Testing the presence of a child element on an input stream can be achieved with :

bool has_child = xis.has_child( "element" );

Lists of elements

In order to cope with lists of elements of the same name in input streams xml::list can be used, for example for the following document :

<root>
  <element/>
  <element/>
  <element/>
</root>

The following code will be used :

xis >> xml::start( "root" )
      >> xml::list( "element", my_instance, &my_class::my_method );

Where my_instance is an instance of a class my_class whose method my_method has the following signature :

class my_class
{
public:
    void my_method( xml::xistream& xis );
};

Note that the functor method can be declared const if needed.

For each element the functor method will be called and the stream passed as parameter will be ready to handle the branch as if the corresponding child element had been unserialized.
If no child of the given name can be found the functor method will not be called but no exception will be raised.
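The calling convention can be illustrated with a stand-alone model. It is a sketch only : node, for_each_named and collector are hypothetical names, and a plain vector replaces the XML stream.

```cpp
#include <string>
#include <vector>

// Hypothetical model of a child element : a name and some text.
struct node
{
    std::string name;
    std::string text;
};

// Call ( instance.*method )( child ) for every child called 'name', in document order.
// Nothing is called, and nothing is thrown, when no child matches.
template< typename T >
void for_each_named( const std::vector< node >& children, const std::string& name,
                     T& instance, void ( T::*method )( const node& ) )
{
    for( const node& child : children )
        if( child.name == name )
            ( instance.*method )( child );
}

// Hypothetical handler collecting the text of every visited child.
struct collector
{
    void visit( const node& child ) { texts.push_back( child.text ); }
    std::vector< std::string > texts;
};
```

Calling for_each_named( children, "element", my_collector, &collector::visit ) mirrors the shape of the xml::list call : an element name, an instance and a pointer to one of its methods.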

Additionally if custom parameters are needed within the functor method they can be passed to xml::list, for example :

int my_integer = 3;
std::string my_string = "something";
xis >> xml::start( "root" )
      >> xml::list( "element", my_instance, &my_class::my_method, my_integer, my_string );

The functor method is then expected to accept the given parameters :

class my_class
{
public:
    void my_method( xml::xistream& xis, int my_integer, const std::string& my_string ) const;
};

Note that each parameter can be declared const or not depending on the needs.

Another version of xml::list exists which does not take an element name as first parameter, but instead discovers every element and for each one calls a functor with its name.
This makes it possible to read a heterogeneous list of elements when the ordering matters, for example with the document :

<root>
  <first-element/>
  <second-element/>
  <third-element/>
</root>

The following code will call a functor for each element :

xis >> xml::start( "root" )
      >> xml::list( my_instance, &my_class::my_method );

With the functor method signature being :

class my_class
{
public:
    void my_method( const std::string& name, xml::xistream& xis ) const;
};

The name argument will be first-element, second-element and then third-element : the order of the calls matches the order of the elements in the document.

Additionally STL-style functors are compatible with xml::list, for instance the function :

void my_function( xml::xistream& xis )
{
    xis >> ...
}

Can be used to read data with :

xis >> xml::start( "root" )
      >> xml::list( "element", &my_function );

In the same way, the following functor :

class my_functor
{
public:
    void operator()( xml::xistream& xis )
    {
        xis >> ...
    }
};

Can be used like this :

xis >> xml::start( "root" )
      >> xml::list( "element", my_functor() );

In this case a copy of the functor is made internally.

In order to pass the functor by reference the template parameter of xml::list has to be explicitly specified :

my_functor functor;
xis >> xml::start( "root" )
      >> xml::list< my_functor& >( "element", functor );
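The difference between the copied and the referenced functor is plain C++ template behaviour and can be reproduced without the library. In the following sketch counter and visit_all are hypothetical names standing in for a user functor and for xml::list :

```cpp
#include <vector>

// Hypothetical functor counting how many times it has been called.
struct counter
{
    void operator()( int ) { ++calls; }
    int calls = 0;
};

// Model of an algorithm taking its functor as template parameter F.
// With F deduced as counter the functor is copied ;
// with F forced to counter& the caller's own object is used.
template< typename F >
void visit_all( const std::vector< int >& items, F functor )
{
    for( int item : items )
        functor( item );
}
```

With a counter c, visit_all( items, c ) leaves c.calls untouched because only the internal copy is incremented, whereas visit_all< counter& >( items, c ) updates c itself. This is why a functor that accumulates state usually needs the explicit reference form.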

Optional elements

If an element is optional in an input stream, xml::optional can be used during unserialization, for example the following document :

<element/>

Can be unserialized with :

xis >> xml::start( "element" )
      >> xml::optional >> xml::start( "non-existing-child" )
        >> xml::start( "non-existing-grand-child" )
          >> xml::attribute( "name", name );

No exception will be raised as the whole non-existing-child branch will be skipped.
Obviously name will not be read, which points out the fact that when using xml::optional data should probably be initialized with default values prior to unserialization.
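The advice about default values can be shown with a minimal model of an optional read. This is a sketch, not the library's implementation : attributes and read_optional are hypothetical names, and a map stands in for the attributes of an element.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the attributes of an element : name -> value.
using attributes = std::map< std::string, std::string >;

// Copy the attribute into 'value' if it exists ; otherwise leave 'value' untouched.
// Returns whether something has been read.
bool read_optional( const attributes& attrs, const std::string& name, std::string& value )
{
    const auto it = attrs.find( name );
    if( it == attrs.end() )
        return false;
    value = it->second;
    return true;
}
```

Since a skipped read leaves the variable as it was, initializing it beforehand (for instance std::string name = "unnamed";) is what gives it a meaningful value when the data is absent.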

Similarly in an output stream an xml::optional element will only be serialized if non-optional elements or data follow :

 xos << xml::optional << xml::start( "non-existing-child" )
     << xml::end;

The non-existing-child will not be serialized, however given :

 xos << xml::optional << xml::start( "element" )
       << xml::start( "sub-element" )
       << xml::end
     << xml::end;

both elements will be serialized as usual.

Managing data

Supported data types are :

  • std::string
  • const char* (handled as strings and only for output streams)
  • bool
  • short
  • int
  • long
  • long long
  • float
  • double
  • long double
  • unsigned short
  • unsigned int
  • unsigned long
  • unsigned long long

Contents

The content of an element can be directly (un)serialized, for example :

xos << xml::start( "element" )
      << "the text content of the node"
    << xml::end;

Will produce the following XML document :

<?xml version="1.0" encoding="UTF-8" ?>
<element>
  the text content of the node
</element>

As a convenience, xml::content is a shortcut which performs the previous manipulation in one step, for example :

xos << xml::content( "element", "the text content of the node" );

Will perform the exact same operation.

A CDATA section is unserialized transparently, either directly or with an xml::content, however it must be serialized explicitly, for example :

xos << xml::start( "element" )
      << xml::cdata( "the CDATA section of the node" )
    << xml::end;

It is worth noting that xml::cdata can be used inside an xml::content :

xos << xml::content( "element", xml::cdata( "the CDATA section of the node" ) );

Reading the value of a node is also possible directly using the following method, either :

const std::string result = xis.value< std::string >();

Or by providing a default value if content is optional :

const std::string result = xis.value( "a default value" );

Alternatively another method reads the content inside a given node directly :

const std::string result = xis.content< std::string >( "element" );

Or also by providing a default value if the node and/or its content is optional :

const std::string result = xis.content( "element", "a default value" );

Note that the first forms require the template to be explicit whereas the second forms deduce it from the default value parameter type.
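The reason why one form needs an explicit template argument and the other does not can be seen in a stand-alone model. This sketch makes no claim about how the library is written : text_node is a hypothetical type, and the conversion simply goes through std::istringstream.

```cpp
#include <sstream>
#include <string>

// Hypothetical model of a node whose text content may be empty.
struct text_node
{
    std::string text;

    // No parameter mentions T : the caller has to name the type, e.g. value< int >().
    template< typename T >
    T value() const
    {
        std::istringstream in( text );
        T result{};
        in >> result;
        return result;
    }

    // T appears in the parameter list : it is deduced from the default, e.g. value( 7 ).
    template< typename T >
    T value( const T& fallback ) const
    {
        return text.empty() ? fallback : value< T >();
    }
};
```

In this model node.value( 7 ) returns an int because the default is an int, while node.value< int >() has nothing to deduce from and must spell the type out.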

Testing whether a content is available or not in an input stream can be achieved by :

bool has_content = xis.has_content();

Attributes

Handling attributes involves using xml::attribute, for example :

xos << xml::start( "element" )
      << xml::attribute( "name", "the-name" )
    << xml::end;

produces the following XML document :

<?xml version="1.0" encoding="UTF-8" ?>
<element name="the-name"/>

Similarly :

 std::string name;
 xis >> xml::start( "element" )
       >> xml::attribute( "name", name )
     >> xml::end;

reads the name attribute into the variable name.

If the attribute might not exist, a default value can be provided :

 std::string name;
 xis >> xml::start( "element" )
       >> xml::attribute( "name", name, "the-name" )
     >> xml::end;

Similarly the writing of an attribute can be omitted if it matches a given default value :

std::string name = "the-name";
xos << xml::start( "element" )
      << xml::attribute( "name", name, "the-name" )
    << xml::end;

Because name equals the provided default the-name, the attribute will not be written.
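The rule for skipping an attribute on output can be sketched without the library. The helpers write_attribute and element_with_name below are hypothetical, write into a plain std::ostringstream and perform no escaping, so they only model the decision and not real XML serialization.

```cpp
#include <sstream>
#include <string>

// Append name="value" to 'out' unless the value equals the given default.
void write_attribute( std::ostringstream& out, const std::string& name,
                      const std::string& value, const std::string& fallback )
{
    if( value != fallback )
        out << ' ' << name << "=\"" << value << '"';
}

// Build an empty element whose name attribute defaults to "the-name".
std::string element_with_name( const std::string& name )
{
    std::ostringstream out;
    out << "<element";
    write_attribute( out, "name", name, "the-name" );
    out << "/>";
    return out.str();
}
```

Pairing the same default on the reading and on the writing side keeps documents short while a round trip still restores the original value.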

Reading an attribute is also possible in a one line statement, using either :

const std::string result = xis.attribute< std::string >( "attribute" );

Or by providing a default value if the attribute is optional :

const std::string result = xis.attribute( "attribute", "a default value" );

Note that the first version requires the template to be explicit whereas the second version deduces it from the default value parameter type.

Testing the presence of an attribute on an input stream can be achieved by :

bool has_attribute = xis.has_attribute( "an attribute" );

Lists of attributes

In the same way as nodes, attributes can also be enumerated, for example given the following document :

<root first="value 1" second="value 2"/>

A functor to handle the attributes when their names are not known beforehand would be :

class my_class
{
public:
    void my_method( const std::string& name, xml::xistream& xis ) const;
};

Note that this functor has the exact same signature as the similar xml::list functor.

With my_instance being an object of type my_class, the document can be read by the following piece of code :

xis >> xml::start( "root" )
      >> xml::attributes( my_instance, &my_class::my_method );

The functor will be called a first time with name equal to first and a second time with name equal to second.
The xis parameter will be an input stream already placed on the root node.

Additionally STL-style functors are compatible with xml::attributes, for instance the function :

void my_function( xml::xistream& xis )
{
    xis >> ...
}

Can be used to read data with :

xis >> xml::start( "root" )
      >> xml::attributes( &my_function );

In the same way, the following functor :

class my_functor
{
public:
    void operator()( xml::xistream& xis )
    {
        xis >> ...
    }
};

Can be used like this :

xis >> xml::start( "root" )
      >> xml::attributes( my_functor() );

In this case a copy of the functor is made internally (similarly to most of the STL algorithms).

In order to pass the functor by reference the template parameter of xml::attributes has to be explicitly specified :

my_functor functor;
xis >> xml::start( "root" )
      >> xml::attributes< my_functor& >( functor );

Optional data

An optional content can be handled by the xml::optional feature :

xis >> xml::optional >> content;

The variable content will not be read if the current node does not have any content and no exception will be raised.

Just the same way an optional attribute can be skipped silently :

xis >> xml::start( "element" )
      >> xml::optional >> xml::attribute( "name", name );

The variable name will not be read if the attribute name does not exist in the XML document and no exception will be raised.

Similarly an optional attribute will be serialized to an output stream only if its parent element gets serialized :

 xos << xml::optional << xml::start( "non-existing-element" )
       << xml::optional << xml::attribute( "non-existing-attribute", 42 )
     << xml::end;

Neither the element nor the attribute is serialized, however :

 xos << xml::optional << xml::start( "element" )
       << xml::optional << xml::attribute( "attribute", 42 )
       << xml::start( "sub-element" )
       << xml::end
     << xml::end;

yields the elements and the attribute in the output stream.

Namespaces

By default input streams ignore namespaces, that is the following XML document :

<ns:element xmlns:ns="http://www.example.org"/>

Can be unserialized with :

xis >> xml::start( "element" );

However namespaces are also supported (for instance for filtering) with the help of xml::ns :

xis >> xml::ns( "http://www.example.org" ) >> xml::start( "element" );

The xml::ns filter is only valid for the next operation.

Furthermore a prefix associated with a given namespace can be retrieved using xml::prefix, for instance the code :

std::string prefix;
xis >> xml::start( "element" )
      >> xml::prefix( "http://www.example.org", prefix );

Stores the string "ns" in prefix.

Output streams behave symmetrically, for instance :

xos << xml::ns( "http://www.example.org" ) << xml::start( "element" );

Yields the following document with a default namespace declaration :

<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<element xmlns="http://www.example.org"/>

Setting a prefix is also possible, for instance :

xos << xml::ns( "http://www.example.org" ) << xml::start( "element" ) << xml::prefix( "http://www.example.org", "ns" );

Yields the following document :

<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<ns:element xmlns:ns="http://www.example.org"/>

Processing instructions

A processing instruction can be added by the xml::instruction feature, for instance :

xos << xml::instruction( "xml-stylesheet", "type=\"text/xsl\" href=\"my_stylesheet.xsl\"" );

Yields the following line in the resulting document's prolog :

<?xml-stylesheet type="text/xsl" href="my_stylesheet.xsl"?>

Which effectively associates the document with the given style sheet.

Customizing for user data

Extending the library in order to provide support for custom types is usually as trivial as implementing a custom (un)serialization operator, for instance :

xml::xistream& operator>>( xml::xistream& xis, some_type& t )
{
  // ...
  return xis;
}

This is sufficient to (un)serialize this type as a base type :

some_type t;
xis >> xml::start( "element" )
      >> t;

And of course also :

some_type t;
xis >> xml::content( "element", t );

But it might also be possible to (un)serialize it as an attribute :

some_type t;
xis >> xml::attribute( "attribute", t );

The only requirement is that the (un)serialization operator only reads data :

xml::xistream& operator>>( xml::xistream& xis, some_type& t )
{
  std::string data;
  xis >> data;
  // initialize t based on data
  return xis;
}

Should the operator perform tree walking operations (e.g. xml::start, xml::list, etc.), an exception will be raised.
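The "initialize t based on data" step is ordinary string parsing and can be kept in a helper that does not depend on the library. The sketch below assumes a hypothetical point type stored as text such as "3,4" ; both point and parse_point are illustrative names, not part of the library.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical user type, stored in XML as text such as "3,4".
struct point
{
    int x;
    int y;
};

// Build a point from its textual form, or throw if the text is malformed.
point parse_point( const std::string& data )
{
    std::istringstream in( data );
    point p{ 0, 0 };
    char separator = 0;
    if( !( in >> p.x >> separator >> p.y ) || separator != ',' )
        throw std::invalid_argument( "expected \"x,y\" but got: " + data );
    return p;
}
```

Inside the unserialization operator the helper would be called right after xis >> data, for instance with t = parse_point( data ). Keeping the parsing separate makes it testable on its own and keeps the operator limited to reading data, as required above.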

XSL

Using the library

Only one include statement is necessary :

#include <xeuseuleu/xsl.hpp>

Should an error occur while manipulating a transformation, an exception of type xsl::exception, which derives from std::exception, will be raised.

Creating a transformation

The first step is to create the proper transformation depending on the needs.

The following code creates a transformation from an XSL document stylesheet.xsl to write the resulting output to a file output :

xsl::xftransform xft( "stylesheet.xsl", "output" );

The following code creates a transformation from an XSL document stylesheet.xsl to write the resulting output to an std::string :

xsl::xstringtransform xst( "stylesheet.xsl" );

After the transformation is finished the resulting output can be retrieved :

const std::string xml = xst.str();

The following code creates a transformation from an XSL document stylesheet.xsl to write the resulting output to a buffer :

xsl::xbuffertransform xbt( "stylesheet.xsl" );

The purpose of this type of transformation is to chain transformations.

Performing a transformation

A transformation outputs a result when receiving an XML input.

For instance the following code manually injects XML :

xsl::xstringtransform xst( "stylesheet.xsl" );
xst << xml::start( "root" )
      << xml::content( "element", 42 )
    << xml::end;

The transformation is actually performed when the input document is valid and has been completely received, therefore after the xml::end in this example.

The following demonstrates the possibility to directly send an xml::xistream to be transformed :

xsl::xstringtransform xst( "stylesheet.xsl" );
xml::xistringstream xis(
    "<root>"
        "<element/>"
    "</root>"
);
xst << xis;

Chaining transformations

In order to avoid a complex and monolithic stylesheet one solution is to divide it into several smaller stylesheets.
The xsl::xbuffertransform has been introduced specifically for the purpose of applying several transformations one after another.

For instance to apply stylesheet-1.xsl then stylesheet-2.xsl :

xsl::xstringtransform xst( "stylesheet-2.xsl" );
xsl::xbuffertransform xbt( "stylesheet-1.xsl" );
xml::xistringstream xis( "<root/>" );
xst << xbt << xis;

The final transformation cannot be an xsl::xbuffertransform.
Moreover the output generated by an xsl::xbuffertransform must be an XML document because it is handed as input to the next transformation.

An xsl::xbuffertransform can be used as many times as needed, for example :

xsl::xstringtransform xst( "stylesheet-2.xsl" );
xsl::xbuffertransform xbt( "stylesheet-1.xsl" );
xml::xistringstream xis( "<root/>" );
xst << xml::start( "root" )
      << xbt << xis
      << xbt << xis
    << xml::end;

Stylesheet parameters

Stylesheet parameters can be defined by using xsl::parameter, for example with the XSL document being :

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:param name="my-key" select="'default'"/>
  <xsl:template match="/root"/>
</xsl:stylesheet>

The following code sets the parameter my-key to expression :

xsl::xstringtransform xst( "stylesheet.xsl" );
xml::xistringstream xis( "<root/>" );
xst << xsl::parameter( "my-key", "expression" )
      << xis;