Loading External Resources
A guide to using external resources with XmlPrime
Overview
        For security reasons, by default the set of available documents,
        collections and unparsed text resources is empty.  This means that the
        doc, document,
        collection and
        unparsed-text functions will always raise an
        error, and doc-available and
        unparsed-text-available will always return
        false.  This guide explains how to load
        external documents and query them with XmlPrime.
      
The Document Set
        The doc, document,
        collection and
        unparsed-text functions are all defined to
        be stable.  This means that every time they are
        called with the same arguments within an XQuery program or XPath
        expression they must return the same result.  Since we cannot make
        the same guarantee for external resources, the accessibility and
        content of the documents must be cached throughout the evaluation of
        the query.  The caching of resources is handled by the
        DocumentSet.
      
Because a document set contains the documents used during query evaluation, the document set must be bound to a name table, which is specified in the constructor.
When a resource is used by a query or expression, it is requested from the document set. If the resource (or the fact that the resource is unavailable) is cached in the document set then it is returned (or an error is raised). Otherwise the document set proceeds to retrieve the resource through its resolvers.
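The lookup order described above can be illustrated with a small, self-contained sketch. It is not the XmlPrime implementation: StableResourceCache, its Get method and the delegate-based resolvers are hypothetical names invented for this example, and resources are modelled as plain strings. The sketch only shows the idea that both successful and failed lookups are remembered, so repeated requests for the same URI give the same outcome.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the caching behaviour described above.
// This is NOT the XmlPrime DocumentSet API.
public sealed class StableResourceCache
{
    // One cache entry: either a resource or the error that made it unavailable.
    private sealed class Entry
    {
        public string Resource;
        public Exception Error;
    }

    private readonly Dictionary<string, Entry> cache = new Dictionary<string, Entry>();
    private readonly IList<Func<string, string>> resolvers;

    public StableResourceCache(IList<Func<string, string>> resolvers)
    {
        this.resolvers = resolvers;
    }

    public string Get(string uri)
    {
        Entry entry;
        if (!cache.TryGetValue(uri, out entry))
        {
            entry = Load(uri);
            cache[uri] = entry; // remember successes and failures alike
        }
        if (entry.Error != null)
            throw entry.Error;
        return entry.Resource;
    }

    private Entry Load(string uri)
    {
        try
        {
            foreach (Func<string, string> resolve in resolvers)
            {
                string resource = resolve(uri); // null means "not handled"
                if (resource != null)
                    return new Entry { Resource = resource };
            }
            return new Entry { Error = new InvalidOperationException("Resource not available: " + uri) };
        }
        catch (Exception e)
        {
            return new Entry { Error = e }; // a failed retrieval is cached too
        }
    }
}
```

With a single resolver that handles only "a.xml", two calls to Get("a.xml") invoke the resolver once, and two calls to Get("b.xml") both fail while invoking it once more. Unlike the real document set, this sketch makes no attempt at thread safety.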
To avoid reloading resources, and to allow sharing of the cached documents, the document set can be shared between different queries and XPath expressions. The document set is designed to be thread-safe, so it can also be shared between queries and expressions executing concurrently (assuming that the name table used is also thread-safe, for example ConcurrentNameTable). The document set to be used for evaluation of a query or expression is specified by the DynamicContextSettings.DocumentSet property.
It is recommended that any documents passed as arguments to an XQuery 1.0 program or an XPath 2.0 expression are loaded through the document set to improve consistency.
Pre-populating the Document Set
The document set can be populated programmatically. This provides bindings from URIs to resources that override those specified by the resolvers. Documents, collections and unparsed-text resources can all be added to the document set before it is used.
Any document that has a non-empty document URI and contains nodes supplied as the context item or as a parameter is automatically added in a similar fashion.
Resolving Resources
The document set defines which documents are available via the document resolver, collection resolver and resource resolver, which are passed to the constructor. These are used to retrieve any resources that are not already in the cache.
XmlPrime provides specialized interfaces to resolve resources rather than using an XmlResolver. This is so that resources already loaded in memory do not have to be serialized and reparsed. It also allows flexibility in which document representations are used.
Document Resolvers
            A document resolver is a class implementing the
            IDocumentResolver
            interface.  The interface includes the
            Resolve
            method, which is called to resolve external documents as requested
            by the doc and
            document functions.
          
The method is passed the URI of the document to resolve, the document set itself and the name table to use when loading any new documents. The method returns null if the URI is not handled, returns the document if it was retrieved successfully, or throws an exception if there was an error retrieving the document. The document resolver should not attempt to add or retrieve the document with the requested URI from the document set, as this will result in a deadlock.
The document set is passed in for the case that a resolver wants to use other available resources to retrieve the document.
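The three possible outcomes of a resolver call (not handled, resolved, failed) can be sketched as follows. This is only an illustration: ISketchDocumentResolver and InMemoryDocumentResolver are hypothetical names, the method takes no document set argument, and System.Xml.XmlDocument is used as the document representation purely for convenience. The actual IDocumentResolver signature and return type should be taken from the XmlPrime API reference.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical stand-in for a document resolver; NOT the XmlPrime interface.
public interface ISketchDocumentResolver
{
    // Returns null when the URI is not handled, the document on success,
    // and throws when retrieval or parsing fails.
    XmlDocument Resolve(Uri uri, XmlNameTable nameTable);
}

// Serves documents from XML text held in memory.
public sealed class InMemoryDocumentResolver : ISketchDocumentResolver
{
    private readonly Dictionary<Uri, string> sources = new Dictionary<Uri, string>();

    public void Add(Uri uri, string xml)
    {
        sources[uri] = xml;
    }

    public XmlDocument Resolve(Uri uri, XmlNameTable nameTable)
    {
        string xml;
        if (!sources.TryGetValue(uri, out xml))
            return null; // not handled by this resolver

        // Build the document on the supplied name table.
        var document = new XmlDocument(nameTable);
        document.LoadXml(xml); // throws XmlException for malformed input
        return document;
    }
}
```

Note that the sketch never consults a cache for the URI it was asked to resolve, in keeping with the deadlock warning above.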
Two default implementations of IDocumentResolver are provided by XmlPrime:
- UnparsedTextDocumentResolver
 - This resolver retrieves the unparsed text with the specified URI from the document set, and then attempts to parse it as a document.
- XmlReaderDocumentResolver
 - This resolver uses the supplied XmlReaderSettings to retrieve the document at the specified URL. Note that this does not make the resource available to the unparsed-text function.
Collection Resolvers
            A collection resolver is a class implementing the
            ICollectionResolver
            interface.  The interface includes the
            Resolve
            method, which is called to resolve external collections as requested
            by the collection function.
          
The method is passed the URI of the collection to resolve, the document set itself and the name table to use when loading any new documents. The method returns null if the URI is not handled, returns the collection if it is retrieved successfully, or throws an exception if there was an error retrieving the collection. If a null URI is passed in then this indicates that the default collection should be resolved.
              Any nodes returned as part of a collection must either have an
              empty document URI, or must be in the document set.  This is to
              enforce the rule in XQuery that
              doc(document-uri($N)) is $N is always
              true for any document node $N.
            
This is easiest to enforce if all documents returned are loaded from the document set.
The collection resolver should not attempt to add or retrieve the collection with the requested URI from the document set, as this will result in a deadlock.
Resource Resolvers
            A resource resolver is a class implementing the
            IResourceResolver
            interface.  The interface includes the
            Resolve
            method, which is called to resolve external resources as requested
            by the unparsed-text function.
          
The method is passed the URI of the resource to resolve. It returns null if the URI is not handled, returns the resource if it was retrieved successfully, or throws an exception if there was an error retrieving the resource.
The XmlResourceResolver is a resource resolver that wraps the specified XmlResolver.
Using an XmlResolver to Resolve Documents
            The
            DocumentSet(XmlResolver, XmlReaderSettings)
            constructor initializes a new document set with an
            UnparsedTextDocumentResolver
            and an
            XmlResourceResolver
            wrapping the
            XmlReaderSettings
            and
            XmlResolver
            passed in.  Any document requested will first be retrieved as
            unparsed text, and then parsed to create a document.  This
            ensures that the resources returned by
            unparsed-text and
            doc remain consistent.  If a query or
            expression never uses the unparsed-text
            function, then the raw data of every document retrieved is
            cached in memory unnecessarily.  In this case
            it is better to construct the
            DocumentSet
            using an
            XmlReaderDocumentResolver
            to avoid caching the unparsed data.  The code below shows how to
            set up a document set with an
            XmlUrlResolver
            without caching unparsed data.
          
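using System.Xml;
using XmlPrime;

// This should be the name table used for the query/expression.
XmlNameTable nameTable = new NameTable();
XmlResolver resolver = new XmlUrlResolver();

// Reader settings used to load documents directly from their URLs.
XmlReaderSettings readerSettings = new XmlReaderSettings();
readerSettings.NameTable = nameTable;
readerSettings.XmlResolver = resolver;

// Resolve documents with an XmlReader, so no unparsed text is cached.
IDocumentResolver documentResolver = new XmlReaderDocumentResolver(readerSettings);

// No collection resolver or resource resolver is supplied.
DocumentSet documentSet = new DocumentSet(nameTable, documentResolver, null, null);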
Setting the Types of External Resources
        It is often advantageous to specify the types of external resources.
        This can indicate that particular documents will conform to a
        particular schema for example, and can help improve static type
        checking and aid in optimization.  A call to
        doc, document or
        collection calls upon the document type resolver
        or collection type resolver
        to identify the static type of the document or collection respectively.
      
        If a document or collection retrieved during evaluation of an XQuery
        program or XPath expression does not match the type declared by the
        document type resolver or collection type resolver then an
        XPTY0004 type error is raised.
      
Document Type Resolvers
            A document type resolver implements the
            IDocumentTypeResolver
            interface.  When the URI of a document can be determined statically,
            the
            Resolve
            method is called, which returns the static type of the document.  If
            the URI of the document cannot be determined statically, or the
            Resolve
            method returns null, then the type is set to the value of the
            DefaultType
            property, or document-node() if it returns
            null.
          
An implementation of IDocumentTypeResolver can be retrieved from a document set with the DocumentTypeResolver property. This resolver resolves all the documents requested statically, and returns their actual type.
The document type resolver is set by setting the StaticContextSettings.DocumentTypeResolver property.
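The fallback order described earlier in this section can be summarized in a few lines. The sketch below is purely illustrative: StaticTypeFallback and StaticDocumentType are hypothetical names, types are modelled as strings rather than XmlPrime type objects, and a null staticUri stands for a URI that could not be determined statically.

```csharp
using System;

// Hypothetical illustration of the fallback order; NOT part of the XmlPrime API.
public static class StaticTypeFallback
{
    public static string StaticDocumentType(
        string staticUri,               // null when the URI is not known statically
        Func<string, string> resolve,   // stands in for the Resolve method
        string defaultType)             // stands in for the DefaultType property
    {
        // The resolver is only consulted when the URI is known statically.
        string type = staticUri != null ? resolve(staticUri) : null;

        // Fall back to the default type, then to document-node().
        return type ?? defaultType ?? "document-node()";
    }
}
```

For example, with a resolver that returns null for every URI and no default type, the result is "document-node()".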
Collection Type Resolvers
            A collection type resolver implements the
            ICollectionTypeResolver
            interface.  When the URI of a collection can be determined statically,
            the
            Resolve
            method is called, which returns the static type of the collection.  If
            the URI of the collection cannot be determined statically, or the
            Resolve
            method returns null, then the type is set to the value of the
            DefaultType
            property, or node()* if it returns
            null.
          
The static type of the default collection is determined by first passing a null URI to Resolve. If this returns null the process proceeds as above.
An implementation of ICollectionTypeResolver can be retrieved from a document set with the CollectionTypeResolver property. This resolver resolves all the collections requested statically, and returns their actual type.
