Import

From Sense/Net Wiki
  • 100%
  • 6.0
  • Enterprise
  • Community
  • Planned

Overview

Sense/Net ECMS has a Content Repository that stores every Content in the database. When you want to add new content or modify existing content, you can use the built-in import tool to import it from the file system. This article describes how administrators and developers can use the import tool.

Details

Content files in the file system

When using the import tool you need a folder in the file system that contains all the content and folders to be imported into the database. Each Content is represented by an xml file with the .Content extension, and any binary attachment is placed beside it as a regular file. An image, for example, can be represented by two files: a .Content file that describes the content to be created (it defines the Content Type, display name, other Field values, etc.) and an image file (e.g. .png) that holds the binary image data. The folder structure corresponds to the Content Repository tree structure to be created in the database.
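The correspondence between the folder structure and the repository tree can be illustrated with a small Python sketch. It is not part of the import tool; the function name repository_path is hypothetical, and the sketch assumes that the source folder itself maps to the target path (as the examples at the end of this article suggest) and that a .Content file yields a content named after the file without the .Content extension.

```python
from pathlib import PurePosixPath, PureWindowsPath


def repository_path(source_root: str, file_path: str, target: str = "/Root") -> str:
    """Return the repository path a file under source_root would map to.

    Assumption: the source folder corresponds to the target path and
    sub-folders are mirrored one-to-one below it. A ContentName element
    inside a .Content file could override the name; that is ignored here.
    Raises ValueError if file_path is not located under source_root.
    """
    relative = PureWindowsPath(file_path).relative_to(PureWindowsPath(source_root))
    parts = list(relative.parts)
    # A .Content file describes a content named like the file without the extension.
    if parts and parts[-1].lower().endswith(".content"):
        parts[-1] = parts[-1][: -len(".Content")]
    return str(PurePosixPath(target, *parts))


if __name__ == "__main__":
    print(repository_path(r"C:\Import\Root", r"C:\Import\Root\Sites\Workshop.Content"))
```

Windows-style paths are parsed with PureWindowsPath so the sketch behaves the same on any operating system.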

Content file format

The .Content files mentioned in the previous section are xml files containing all the metadata that is necessary to create, update or delete content in the Content Repository. The following example displays a simple file that represents an article:

<?xml version="1.0" encoding="utf-8"?>
<ContentMetaData>
  <ContentType>Article</ContentType>
  <ContentName>Workshop</ContentName>
  <Fields>
    <DisplayName>Workshop</DisplayName>
    <Lead><![CDATA[
      <p>: Do you have opinion about real enterprise criteria? What kind of opportunities are still hidden in .Net platform?
      We are inviting the great authorities of the subjects.</p>
    ]]></Lead>
    <Body><![CDATA[
      <p>We are constantly collecting professional opinions about Enterprise Content Management, .Net platform and open source software.
      The next workshop will be held on 23-24 November, 2014 in Vienna (Austria). Participation and social events are gratuitous. Find
      the agenda listed on page Enterprise Workshop, Vienna  and send your application for us.</p>
    ]]></Body>
    <Pinned>false</Pinned>
    <ArchiveDate>2013-06-30T00:00:00</ArchiveDate>
    <Index>0</Index>
  </Fields>
  <Permissions>
    <Clear />
  </Permissions>
</ContentMetaData>

The content xml may contain the following values:

  • ContentType: content type name.
  • ContentName (optional): name of the content to be created or updated. If it is not present, the name of the .Content file will be used.
  • Fields (optional): all metadata comes here. Values must be in the appropriate format to be able to be imported by the system (e.g. number fields cannot contain text, etc.).
  • Permissions (optional): explicit permission entries set on this content and whether permission inheritance should be broken here. See the Permission import section for details.
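When many content files have to be produced, they can be generated instead of written by hand. The following Python sketch builds the xml described above with the standard library; the function name build_content_xml is hypothetical, and field values are passed as strings that the caller has already formatted appropriately.

```python
import xml.etree.ElementTree as ET


def build_content_xml(content_type, fields, content_name=None):
    """Build the xml of a .Content file as a string (without xml declaration).

    fields maps field names to already formatted string values,
    for example {"Pinned": "false", "Index": "0"}.
    """
    root = ET.Element("ContentMetaData")
    ET.SubElement(root, "ContentType").text = content_type
    # ContentName is optional: without it the file name is used.
    if content_name is not None:
        ET.SubElement(root, "ContentName").text = content_name
    # Fields is optional too, so it is only written when there is something to write.
    if fields:
        fields_element = ET.SubElement(root, "Fields")
        for name, value in fields.items():
            ET.SubElement(fields_element, name).text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_content_xml("Article", {"DisplayName": "Workshop", "Index": "0"}, "Workshop"))
```

ElementTree escapes special characters in the values, so rich text such as the Lead and Body fields is written with escaped markup rather than in a CDATA section.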

Binary attachment

If a content (e.g. a file or a Portlet Page) has one or more binary fields, its .Content file contains an entry that refers to a separate file containing the actual binary. The following example displays the content file of a resource content that refers to another xml file containing the string resources:

<?xml version="1.0" encoding="utf-8"?>
<ContentMetaData>
	<ContentName>NotificationResources.xml</ContentName>
	<ContentType>Resource</ContentType>
	<Fields>
		<Index>10</Index>
		<Binary attachment="NotificationResources.xml" />
	</Fields>
</ContentMetaData>
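Because the binary lives in a separate file, it can be useful to check before an import that every referenced attachment is really there. The following Python sketch is a hypothetical helper, not a feature of the import tool; it assumes that attachments are referenced through the attachment attribute of a field element, as in the example above, and that the path is relative to the folder of the .Content file.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def missing_attachments(content_file):
    """Return the attachment names referenced by a .Content file
    that do not exist beside it in the file system."""
    content_file = Path(content_file)
    fields = ET.parse(str(content_file)).getroot().find("Fields")
    if fields is None:
        return []
    missing = []
    for field in fields:
        attachment = field.get("attachment")
        # The binary is expected next to the .Content file as a regular file.
        if attachment and not (content_file.parent / attachment).is_file():
            missing.append(attachment)
    return missing
```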

Permission import

It is possible to import permissions for a content. The .Content file contains a Permissions section that describes how the system should set the permissions on the content. The following example displays all the possibilities of this feature:

<Permissions>
  <Break />
  <Clear />
  <Identity path="/Root/IMS/BuiltIn/Portal/Administrators">
    <See>Allow</See>
	<Preview>Allow</Preview>
    <PreviewWithoutWatermark>Allow</PreviewWithoutWatermark>
    <PreviewWithoutRedaction>Allow</PreviewWithoutRedaction>
    <Open>Allow</Open>
    <OpenMinor>Allow</OpenMinor>
    <Save>Allow</Save>
    <Publish>Allow</Publish>
    <ForceCheckin>Allow</ForceCheckin>
    <AddNew>Allow</AddNew>
    <Approve>Allow</Approve>
    <Delete>Allow</Delete>
    <RecallOldVersion>Allow</RecallOldVersion>
    <DeleteOldVersion>Allow</DeleteOldVersion>
    <SeePermissions>Allow</SeePermissions>
    <SetPermissions>Allow</SetPermissions>
    <RunApplication>Allow</RunApplication>
    <ManageListsAndWorkspaces>Allow</ManageListsAndWorkspaces>
  </Identity>
</Permissions>

The example above instructs the system to perform a permission break on this content, clear all inherited entries and add a single entry that allows all built-in permissions for the Administrators group.
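A Permissions section like this one can also be generated. The Python sketch below is only an illustration with a hypothetical function name (build_permissions); it uses the element names shown in the example above and the Allow value, which is the only value this article demonstrates.

```python
import xml.etree.ElementTree as ET


def build_permissions(entries, break_inheritance=False, clear=False):
    """Build a Permissions element.

    entries maps an identity path to a dict of permission name -> value,
    e.g. {"/Root/IMS/BuiltIn/Portal/Administrators": {"See": "Allow"}}.
    """
    permissions = ET.Element("Permissions")
    if break_inheritance:
        ET.SubElement(permissions, "Break")
    if clear:
        ET.SubElement(permissions, "Clear")
    for identity_path, values in entries.items():
        identity = ET.SubElement(permissions, "Identity", {"path": identity_path})
        for name, value in values.items():
            ET.SubElement(identity, name).text = value
    return permissions


if __name__ == "__main__":
    admins = {"/Root/IMS/BuiltIn/Portal/Administrators": {"See": "Allow", "Open": "Allow"}}
    print(ET.tostring(build_permissions(admins, break_inheritance=True, clear=True), encoding="unicode"))
```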

Deleting content - from version 6.3

From version 6.3 it is possible to delete existing content by 'importing' a simple content file that states that the content should be deleted. This content file must not contain any fields, and may contain the name of the content. The following example demonstrates the usage of the delete attribute:

<?xml version="1.0" encoding="utf-8"?>
<ContentMetaData delete="true">
  <ContentName>Workshop</ContentName>
</ContentMetaData>

If the ContentName node is not present in the xml, the name of the content file will be used to find the content in the repository. If the content does not exist, a log entry will be written to the import log.

Importing version information - from version 6.4.1

From Sense/Net ECM version 6.4.1 it is possible to define the version information of a content in the import file. This means you may tell the system that the version of a particular content after import should be 1.0. This may be useful if you are importing to a document library where versioning is switched on, but you do not want all new documents to be created with the initial version 0.1 draft.

<?xml version="1.0" encoding="utf-8"?>
<ContentMetaData>
  <Fields>
     <Version>2.0</Version>
  </Fields>
</ContentMetaData>

Please note that this does not mean that you can import multiple versions of the same content - that is not yet possible. This is only about the version information of the latest version.

Limitations

  • You cannot update existing content with a smaller version.
  • You cannot import content with an explicit version defined if the content is locked, pending or rejected.
  • You cannot provide the state of the content. It is determined by the provided version number. A major version (e.g. 2.0) will become Approved, a minor version (e.g. 2.3) will become a Draft.
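The rules above can be expressed as a short Python sketch, which may help when preparing content files in bulk. The function names are hypothetical, and the sketch assumes version numbers of the form major.minor; whether re-importing the same version number is accepted is not stated in this article, so treating it as allowed is an assumption.

```python
def parse_version(version: str) -> tuple:
    """Split a 'major.minor' version string into a tuple of integers."""
    major, minor = version.split(".")
    return int(major), int(minor)


def state_for_version(version: str) -> str:
    """A major version (e.g. 2.0) becomes Approved, a minor version (e.g. 2.3) a Draft."""
    _, minor = parse_version(version)
    return "Approved" if minor == 0 else "Draft"


def can_update(existing: str, incoming: str) -> bool:
    """Existing content cannot be updated with a smaller version.

    Assumption: an equal version number is treated as allowed.
    """
    return parse_version(incoming) >= parse_version(existing)


if __name__ == "__main__":
    print(state_for_version("2.0"), state_for_version("2.3"))
```

Versions are compared as integer pairs rather than as text, so 2.10 is correctly treated as newer than 2.9.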

Usage

The tool is located either in the web folder or, if you are using the source version, under the

Source\SenseNet\Tools\Import\bin\Debug

folder. When executed without parameters, it displays the usage screen with all possible parameters:

Sense/Net Content Repository Import tool Usage:
Import [-?] [-HELP]
Import [-SCHEMA <schema>] [-SOURCE <source> [-TARGET <target>]] [-ASM <asm>]
       [-CONTINUEFROM <continuepath>] [-NOVALIDATE]
 
Parameters:
<schema>:       The filesystem path of the directory that contains Content Type Definitions and Aspects.
<source>:       The filesystem path of the file or directory that contains content to import.
<target>:       Sense/Net Content Repository path (folder, site, workspace, etc) as the import target
                (default: /Root).
<asm>:          The filesystem path of the directory that contains the required assemblies
                (default: location of Import.exe).
<continuepath>: The filesystem path of the file or directory as the restart point of the <source> tree.
NOVALIDATE:     Disables content validation.
 
Comments:
The <schema>, <source> and <asm> paths can be valid local or network
    filesystem paths.
Schema elements (content type definitions and aspects) will be imported before
    any other contents if the -SCHEMA parameter is presented. During content
    import schema elements will be skipped even if <schema> is contained by
    <source>.

The following apply to the parameters:

  • -SCHEMA: (optional) use this parameter if you want to install or reinstall CTDs and Aspects. Provide the path of the Schema folder, which can contain an 'Aspects' folder for aspects, a 'ContentTypes' folder for content type definition xmls, or both.
  • -SOURCE: the path provided here should point to the file system file or directory containing the Content files to be imported.
  • -TARGET: (optional) the path provided here should be the Content Repository path of the Content to be created. The default is /Root.
  • -ASM: (optional) by default this is the folder of the import tool. If the import tool is not placed in the webfolder, it is advisable to give the webfolder bin path here, because Content Handler dlls (system and third party) are required for the import and they are always present in the webfolder.
  • -CONTINUEFROM: (optional) the import tool is able to continue importing from a given location if it was previously terminated, without the need to restructure the files in the file system under the source folder.
  • -NOVALIDATE: (optional) if given, the fields of the content are saved without field validation (e.g. if an Integer Field is configured in the type's CTD to disallow negative numbers and the NOVALIDATE option is given, the field will be imported without errors even if the .Content file contains a negative number as the field value).

See the Example/Tutorials section for examples.
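If the import is started from a script, the argument list can be assembled programmatically. The following Python sketch only uses the parameters listed on the usage screen above; the function name import_command and the default executable path are hypothetical.

```python
def import_command(source, target="/Root", schema=None, asm=None,
                   continue_from=None, no_validate=False, exe="Import.exe"):
    """Assemble the argument list for the import tool.

    Optional parameters are only added when a value is given.
    """
    command = [exe]
    if schema:
        command += ["-SCHEMA", schema]
    command += ["-SOURCE", source, "-TARGET", target]
    if asm:
        command += ["-ASM", asm]
    if continue_from:
        command += ["-CONTINUEFROM", continue_from]
    if no_validate:
        command.append("-NOVALIDATE")
    return command


if __name__ == "__main__":
    print(import_command(r"..\Source\SenseNet\WebSite\Root\IMS", "/Root/IMS",
                         asm=r"..\Source\SenseNet\WebSite\bin"))
```

The returned list can be passed to subprocess.run(command, check=True) on a machine where the tool is available; passing a list instead of a single string avoids quoting problems with paths that contain spaces.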

Import and indexing

By default, indexing is switched off at import time. Therefore, after every import operation it is necessary to reindex the whole Content Repository using the Index Populator tool.

Indexing can be switched on in the Import.exe.config; however, this is not recommended. It is advised to use the IndexPopulator at the end of the import procedure instead, as it is faster and the index created this way is optimized.

The IndexPopulator requires the website to be stopped, so it is highly recommended to shut down the portal before importing.

Configuration

Because the import tool serves a very specific use case, its configuration (Import.exe.config) differs slightly from the system's web.config. The most important differences are the following:

  • the import config does not contain any loggingConfiguration sections, and therefore the system's logging is disabled during import. Note that the import tool has its own logging mechanism and a logfile is created after every import (see the Log section).
  • operation trace is switched off (OperationTrace key).
  • journal is switched off (CreateJournalItems key).
  • audit log background logic is switched off (AuditEnabled key).
  • system-defined NodeObservers are switched off (DisabledNodeObservers key).
  • MSMQ is switched off (ClusterChannelProvider and MsmqChannelQueueName keys).
  • indexing is switched off (EnableOuterSearchEngine and IndexDirectoryPath keys). See the Import and indexing section for details.
  • you can control whether the tool should skip missing content references (SkipImportingMissingReferences key) - only works for CreatedBy and ModifiedBy fields. Default value is False.
  • you can control whether the tool should skip binary attachments if they are missing (SkipBinaryImportIfFileDoesNotExist key) - useful if you want to update only file metadata. Default value is False.

Special file extensions

Some files that you want to import do not need to have a .Content file placed next to them to hold Content Type information. An image for example usually has all the information in the image file itself that is necessary to create an image content in the Content Repository: the binary image data, the name and the extension that indicates its type (Image). The import tool is able to detect such files and create the corresponding content in the Content Repository automatically.

From version 6.3, the content type detection is based on a setting that is stored in the Content Repository. You can find (and modify) it in the /Root/System/Settings/Portal.settings file as the UploadFileExtensions setting.

UploadFileExtensions: {
	"jpg": "Image",
	"jpeg": "Image",
	"gif": "Image",
	"png": "Image",
	"bmp": "Image",
	"svg": "Image",
	"svgz": "Image",
	"tif": "Image",
	"tiff": "Image",
	"xaml": "WorkflowDefinition",
	"DefaultContentType": "File"
}

Before version 6.3, this was done using a list of extension and content type pairs defined in the application configuration of the import tool. Below you can see an excerpt from the Importer.exe.config:

  <sensenet>
    <uploadFileExtensions>
      <add key=".jpg" value="Image" />
      <add key=".jpeg" value="Image" />
      <add key=".gif" value="Image" />
      <add key=".png" value="Image" />
      <add key=".bmp" value="Image" />
    </uploadFileExtensions>
  </sensenet>

Any file whose extension is not listed in this config section and is not a .Content file will be created as a content of File Content Type.
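The extension-based detection described in this section boils down to a dictionary lookup with a fallback. The Python sketch below illustrates the idea using the UploadFileExtensions values shown above; the function name is hypothetical, and matching the extension case-insensitively is an assumption.

```python
from pathlib import PurePath

# Values copied from the UploadFileExtensions setting shown above.
UPLOAD_FILE_EXTENSIONS = {
    "jpg": "Image", "jpeg": "Image", "gif": "Image", "png": "Image",
    "bmp": "Image", "svg": "Image", "svgz": "Image", "tif": "Image",
    "tiff": "Image", "xaml": "WorkflowDefinition",
    "DefaultContentType": "File",
}


def content_type_for(file_name, mapping=UPLOAD_FILE_EXTENSIONS):
    """Return the Content Type name for a file that has no .Content file.

    Assumption: the extension is compared case-insensitively.
    """
    extension = PurePath(file_name).suffix.lstrip(".").lower()
    default = mapping.get("DefaultContentType", "File")
    return mapping.get(extension, default)


if __name__ == "__main__":
    for name in ("logo.png", "approval.xaml", "report.pdf"):
        print(name, "->", content_type_for(name))
```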

Log

The import tool creates a logfile placed next to the executable, with the name importlog.txt. This file contains the verbose output of the import tool with timestamps, and thus contains any errors or exceptions that occurred during import.

The importlog.txt is not a rolling file, and thus every import overwrites the log file of any previous import.
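Since the log is overwritten by the next import, you may want to keep a copy of it after each run. The following Python sketch is one possible way to do that; the function name, the archive folder and the timestamped file name pattern are hypothetical, and only the importlog.txt name and its location next to the executable come from this article.

```python
import shutil
from datetime import datetime
from pathlib import Path


def archive_import_log(tool_dir, archive_dir, now=None):
    """Copy importlog.txt from the tool folder to archive_dir under a
    timestamped name. Returns the new path, or None if there is no log."""
    log_file = Path(tool_dir) / "importlog.txt"
    if not log_file.is_file():
        return None
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    archive = Path(archive_dir)
    archive.mkdir(parents=True, exist_ok=True)
    destination = archive / f"importlog-{stamp}.txt"
    # copy2 also preserves the modification time of the original log.
    shutil.copy2(log_file, destination)
    return destination
```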

Example/Tutorials

All examples are executed from the Deployment folder. Importing a full repository:

..\Source\SenseNet\Tools\Import\bin\Debug\Import.exe ^
   -SCHEMA ..\Source\SenseNet\WebSite\Root\System\Schema ^
   -SOURCE ..\Source\SenseNet\WebSite\Root ^
   -TARGET /Root ^
   -ASM ..\Source\SenseNet\WebSite\bin
 
rem build index files
..\Source\SenseNet\WebSite\bin\IndexPopulator.exe

Importing Content Templates subtree:

..\Source\SenseNet\Tools\Import\bin\Debug\Import.exe ^
   -SOURCE ..\Source\SenseNet\WebSite\Root\ContentTemplates ^
   -TARGET /Root/ContentTemplates ^
   -ASM ..\Source\SenseNet\WebSite\bin
 
rem build index files
..\Source\SenseNet\WebSite\bin\IndexPopulator.exe

Importing all users:

..\Source\SenseNet\Tools\Import\bin\Debug\Import.exe ^
   -SOURCE ..\Source\SenseNet\WebSite\Root\IMS ^
   -TARGET /Root/IMS ^
   -ASM ..\Source\SenseNet\WebSite\bin
 
rem build index files
..\Source\SenseNet\WebSite\bin\IndexPopulator.exe
