This article answers the following questions:
- What is CMIS?
- What is the purpose of CMIS?
- How is it used?
- Does it deliver what it promises?
What CMIS is
The acronym CMIS stands for Content Management Interoperability Services. The 'Content Management' part should be quite clear, so let's focus on the 'IS' part of CMIS. Interoperability means that any two systems which provide CMIS are compatible with each other as far as content access via CMIS is concerned. The 'Services' part means that CMIS is provided as a Web Service, either a RESTful or a SOAP-based one. So, in theory, CMIS was invented to make content access more independent of the content management system. It is a standard access layer, realized as Web Services, for accessing a content repository.
What CMIS is not
CMIS is an open standard, so the specific features of a particular ECM system or content repository may not be covered by it. Think of it as the smallest common subset of functionality that any ECM system or content repository should provide.
Some history
The CMIS standard comes from OASIS (Organization for the Advancement of Structured Information Standards), a consortium which approved CMIS as a standard in May 2010. All the bigger ECM players were involved (Adobe, Alfresco, ASG, Booz Allen Hamilton, Day Software, dotCMS, EMC, FatWire, fme AG, IBM, ISIS Papyrus, Liferay, Microsoft, Nuxeo, Open Text, Oracle, SAP, Saperion, WeWebU). So you can see that CMIS is quite a young standard.
Explaining the standard
Before we get hands-on and test CMIS with some typical vendors (EMC Documentum and Alfresco), let's shine a spotlight on the specification. The whole specification is available here: http://docs.oasis-open.org/cmis/CMIS/v1.0/os/cmis-spec-v1.0.pdf
The official definition is "The Content Management Interoperability Services (CMIS) standard defines a domain model and set of bindings that include Web Services and ReSTful AtomPub that can be used by applications to work with one or more Content Management repositories/systems."
The domain model mentioned above includes a data model. Some metadata about the repository (capabilities, ...) is part of the domain model, too. There are four base types of objects:
- Document Objects (entities)
- Folder Objects (container)
- Relationship Objects (directional, optional)
- Policy Objects (optional)
Additional types may be defined in the repository as subtypes of these base types. (However, these object types must not extend or alter the behavior or semantics of the CMIS services, so there can be constraints on the object types underneath CMIS.) Each object has an object id, which is unique and constant. Every object has a set of named, unordered properties (although the repository should always return the properties in a consistent order). A document may also have a content stream. In fact, a document can have multiple content streams by means of renditions. (CMIS allows renditions to be exposed, but it provides no capability to create or update the renditions that are accessed through the rendition services.) Objects may have Access Control Lists.
Properties are typed key-value pairs. A property can be either single-valued or multi-valued. There is no NULL value; instead, a property is simply not set. The following types are supported:
- string
- boolean
- decimal
- integer
- datetime
- uri
- id (type of the object id)
- html
Inheritance works in the following way:
- A base type does not have a parent type
- Object type attributes (characteristics of the type itself, not properties: fileable, queryable, ...) are not inherited
- The property definitions are inherited
- The scope of a query on a given object type is automatically expanded to include all descendant types
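The inheritance rules above can be sketched in a few lines of plain Java. This is only a toy model to illustrate the rules, not an OpenCMIS class: the class name `TypeSketch`, its fields and the type `ecg:invoice` are made up for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical, simplified model of a CMIS object type (not an OpenCMIS class). */
final class TypeSketch {
    final String id;
    final TypeSketch parent;   // null for a base type
    final boolean queryable;   // a type attribute: set per type, never inherited
    final Set<String> ownProperties = new LinkedHashSet<String>();
    final List<TypeSketch> children = new ArrayList<TypeSketch>();

    TypeSketch(String id, TypeSketch parent, boolean queryable, String... ownProperties) {
        this.id = id;
        this.parent = parent;
        this.queryable = queryable;
        for (String property : ownProperties) {
            this.ownProperties.add(property);
        }
        if (parent != null) {
            parent.children.add(this);
        }
    }

    /** Property definitions are inherited: the parent's come first, then the type's own. */
    Set<String> allProperties() {
        Set<String> result = (parent == null) ? new LinkedHashSet<String>() : parent.allProperties();
        result.addAll(ownProperties);
        return result;
    }

    /** A query on this type also covers all of its descendant types. */
    List<String> queryScope() {
        List<String> scope = new ArrayList<String>();
        scope.add(id);
        for (TypeSketch child : children) {
            scope.addAll(child.queryScope());
        }
        return scope;
    }

    public static void main(String[] args) {
        TypeSketch document = new TypeSketch("cmis:document", null, true, "cmis:name", "cmis:objectId");
        TypeSketch invoice = new TypeSketch("ecg:invoice", document, false, "ecg:amount");
        System.out.println(invoice.allProperties()); // inherited plus own properties
        System.out.println(document.queryScope());   // the type itself plus descendants
        System.out.println(invoice.queryable);       // not taken over from the parent
    }
}
```

Note that the sketch ignores the 'includedInSupertypeQuery' attribute, which a real repository uses to decide whether a subtype takes part in queries on its supertype.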
At least the following attributes are required for an OBJECT TYPE definition:
- id: id
- localName: string
- localNamespace: string
- queryName: string
- displayName: string
- baseId: Enum (indicates the base object type)
- parentId: id
- description: string
- creatable: boolean
- fileable: boolean (can be filed in a folder)
- queryable: boolean (can occur in the FROM clause of a query statement)
- controllablePolicy: boolean (can be controlled via policies)
- fulltextIndexed: boolean
- includedInSupertypeQuery: boolean
An object type definition may contain multiple PROPERTY DEFINITIONS:
- id: id
- localName: string
- localNamespace: string
- queryName: string
- displayName: string
- description: string
- propertyType: Enum
- cardinality: Enum (single, multi)
- updatability: Enum (readonly, readwrite, whencheckedout, oncreate)
- inherited: boolean
- required: boolean
- queryable: boolean (can occur in the WHERE clause)
- orderable: boolean
- choices: <PropertyChoiceType list> (allowed values)
- openChoice: boolean (regarding the choices attribute)
- defaultValue: <PropertyType>
Depending on the type, there are further attributes: minValue and maxValue (for numbers), resolution (for dates: Year, Date, Time), precision (for decimals) and maxLength (for strings).
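To make the property definition attributes a bit more tangible, here is a minimal sketch of how a repository might check values against a string property definition. It is plain Java and only an illustration: the class `StringPropertyDefSketch`, the property `ecg:status` and the exact rules (for example that a maxLength of 0 means "no limit") are assumptions of this example, not something taken from the specification or from OpenCMIS.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical, simplified string property definition (not an OpenCMIS class). */
final class StringPropertyDefSketch {
    enum Cardinality { SINGLE, MULTI }

    final String id;
    final Cardinality cardinality;
    final boolean required;
    final int maxLength;         // 0 or less means "no limit" in this sketch
    final List<String> choices;  // an empty list means "no choices defined"
    final boolean openChoice;    // true: values outside 'choices' are allowed

    StringPropertyDefSketch(String id, Cardinality cardinality, boolean required,
                            int maxLength, List<String> choices, boolean openChoice) {
        this.id = id;
        this.cardinality = cardinality;
        this.required = required;
        this.maxLength = maxLength;
        this.choices = choices;
        this.openChoice = openChoice;
    }

    /** Returns the list of violations; an empty list means the values are acceptable. */
    List<String> validate(List<String> values) {
        List<String> problems = new ArrayList<String>();
        if (values.isEmpty()) {
            // There is no NULL value: an empty list models "property not set".
            if (required) {
                problems.add(id + " is required but not set");
            }
            return problems;
        }
        if (cardinality == Cardinality.SINGLE && values.size() > 1) {
            problems.add(id + " is single-valued but got " + values.size() + " values");
        }
        for (String value : values) {
            if (maxLength > 0 && value.length() > maxLength) {
                problems.add("'" + value + "' is longer than maxLength " + maxLength);
            }
            if (!choices.isEmpty() && !openChoice && !choices.contains(value)) {
                problems.add("'" + value + "' is not one of the choices");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        StringPropertyDefSketch status = new StringPropertyDefSketch(
                "ecg:status", Cardinality.SINGLE, true, 10, Arrays.asList("draft", "final"), false);
        System.out.println(status.validate(Arrays.asList("draft")));
        System.out.println(status.validate(Arrays.asList("unknown")));
        System.out.println(status.validate(Collections.<String>emptyList()));
    }
}
```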
A rendition (an alternative content stream) has the following attributes:
- streamId: id
- mimeType: String
- length: integer (length in bytes)
- kind: string (cmis:thumbnail OR a repository specific kind)
- title: string
- height: integer (for images)
- width: integer (for images)
- renditionDocumentId: id (if the rendition should be also accessed as a document object)
Additionally, the CMIS standard describes which properties have to be assigned to which base object type.
CMIS also supports access control. An Access Control List (ACL) is a list of Access Control Entries (ACEs). An ACE holds a 'principalId' and one or more strings with the names of permissions. An ACE additionally has an attribute 'direct', which indicates whether the ACE is directly assigned to the object or was derived somehow. The basic permissions are 'cmis:read', 'cmis:write' and 'cmis:all'. CMIS also allows you to check whether a specific action would be allowed without actually performing it: the allowable actions of an object are exposed as flags named 'can{$Action identifier}' (e.g. canUpdateProperties).
CMIS supports versioning. In CMIS 1.0 only Documents can be versioned; Folders, Relationships and Policies cannot. Each version of a document is itself an object. A version series is an ordered list of the versions of a document, and it is possible to get the latest version. CMIS makes no semantic distinction between major and minor versions in a version series; the repository is responsible for applying additional constraints regarding major versions. A new version of a document is created when a check-in operation is performed. To be able to check in, the document has to be checked out first. A check-out creates a Private Working Copy, which is checked in later. After the check-in, the Private Working Copy has to disappear. CMIS supports querying for multiple versions; however, a repository-specific implementation only has to indicate whether this is supported.
CMIS provides a type-based query service to discover objects by specific criteria. The query language consists of a subset of the SQL-92 grammar plus some extensions (e.g. the IN_TREE function). You can imagine a relational view on top of the object-oriented one: a virtual table is equivalent to a queryable object type, and a virtual column is equivalent to an object property. The 'queryName' attribute is used to refer to types and properties in queries. There is just one virtual table for all object types that are part of one object type hierarchy. (Only true for user-defined types, because otherwise there would be just one single table, right?) A virtual table only provides access to the metadata and not to the content streams of an object.
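The ACL structure described above maps quite naturally onto a small data structure. The following plain Java sketch is an illustration only: the names `AclSketch`, `Ace` and `isPermitted` are invented here, and the rule that 'cmis:all' covers every other permission is a simplifying assumption of this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical, simplified ACL model (not an OpenCMIS class). */
final class AclSketch {

    /** One Access Control Entry: a principal, its permissions and the 'direct' flag. */
    static final class Ace {
        final String principalId;
        final Set<String> permissions;
        final boolean direct; // true: assigned to the object itself, false: derived

        Ace(String principalId, boolean direct, String... permissions) {
            this.principalId = principalId;
            this.direct = direct;
            this.permissions = new HashSet<String>(Arrays.asList(permissions));
        }
    }

    /** Assumption of this sketch: 'cmis:all' covers every other permission. */
    static boolean isPermitted(List<Ace> acl, String principalId, String permission) {
        for (Ace ace : acl) {
            if (!ace.principalId.equals(principalId)) {
                continue;
            }
            if (ace.permissions.contains(permission) || ace.permissions.contains("cmis:all")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Ace> acl = Arrays.asList(
                new Ace("admin", true, "cmis:all"),
                new Ace("guest", false, "cmis:read"));
        System.out.println(isPermitted(acl, "admin", "cmis:write"));
        System.out.println(isPermitted(acl, "guest", "cmis:write"));
    }
}
```

A real repository may map these basic permissions onto its own, finer-grained permission model, so treat the check above as a picture of the data structure rather than of any actual evaluation logic.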
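The check-out and check-in cycle can be pictured as a tiny state machine. The sketch below is plain Java and deliberately naive: the class `VersionSeriesSketch` and its method names are made up, a "version" is just a string, and major/minor versions are ignored.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical, simplified version series (not an OpenCMIS class). */
final class VersionSeriesSketch {
    private final List<String> versions = new ArrayList<String>(); // ordered, oldest first
    private String privateWorkingCopy;                             // null while not checked out

    VersionSeriesSketch(String initialContent) {
        versions.add(initialContent);
    }

    /** Check-out creates the Private Working Copy from the latest version. */
    void checkOut() {
        if (privateWorkingCopy != null) {
            throw new IllegalStateException("already checked out");
        }
        privateWorkingCopy = latest();
    }

    void updateWorkingCopy(String content) {
        requireCheckedOut();
        privateWorkingCopy = content;
    }

    /** Check-in appends a new version; afterwards the working copy disappears. */
    void checkIn() {
        requireCheckedOut();
        versions.add(privateWorkingCopy);
        privateWorkingCopy = null;
    }

    /** Cancelling a check-out discards the working copy without a new version. */
    void cancelCheckOut() {
        requireCheckedOut();
        privateWorkingCopy = null;
    }

    boolean isCheckedOut() {
        return privateWorkingCopy != null;
    }

    String latest() {
        return versions.get(versions.size() - 1);
    }

    List<String> allVersions() {
        return Collections.unmodifiableList(versions);
    }

    private void requireCheckedOut() {
        if (privateWorkingCopy == null) {
            throw new IllegalStateException("not checked out");
        }
    }

    public static void main(String[] args) {
        VersionSeriesSketch series = new VersionSeriesSketch("first draft");
        series.checkOut();
        series.updateWorkingCopy("second draft");
        series.checkIn();
        System.out.println(series.allVersions());
        System.out.println(series.isCheckedOut());
    }
}
```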
- SELECT [virtual columns] FROM [virtual table names] WHERE [conditions] ORDER BY [sort specification]
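To see how the template above turns into a concrete statement, here is a minimal sketch of a helper that assembles a query string from its clauses. It is plain Java; the class `QuerySketch`, the method `select` and the folder id used in the demo are invented for this example, and the helper does no escaping or validation at all.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that assembles a CMIS query string (no escaping, no validation). */
final class QuerySketch {

    /**
     * @param columns queryNames of the virtual columns; an empty list selects '*'
     * @param table   queryName of the virtual table (a queryable object type)
     * @param where   optional condition, may be null
     * @param orderBy optional sort specification, may be null
     */
    static String select(List<String> columns, String table, String where, String orderBy) {
        StringBuilder query = new StringBuilder("SELECT ");
        query.append(columns.isEmpty() ? "*" : String.join(", ", columns));
        query.append(" FROM ").append(table);
        if (where != null && !where.isEmpty()) {
            query.append(" WHERE ").append(where);
        }
        if (orderBy != null && !orderBy.isEmpty()) {
            query.append(" ORDER BY ").append(orderBy);
        }
        return query.toString();
    }

    public static void main(String[] args) {
        // 'some-folder-id' is a placeholder, not a real object id
        System.out.println(select(
                Arrays.asList("cmis:name", "cmis:objectId"),
                "cmis:document",
                "IN_TREE('some-folder-id') AND cmis:name LIKE 'Test%'",
                "cmis:name"));
    }
}
```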
The specification describes in detail what the service interfaces look like, so the following is just an overview:
- Repository Services
  - getRepositories
  - getRepositoryInfo
  - getTypeChildren
  - getTypeDescendants
  - getTypeDefinition
- Navigation Services
  - getChildren
  - getDescendants
  - getFolderTree
  - getFolderParent
  - getObjectParents
  - getCheckedOutDocs
- Object Services
  - createDocument
  - createDocumentFromSource
  - createFolder
  - createRelationship
  - createPolicy
  - getAllowableActions
  - getProperties
  - getObjectByPath
  - getContentStream
  - getRenditions
  - updateProperties
  - moveObject
  - deleteObject
  - deleteTree
  - setContentStream
  - deleteContentStream
- Multi-filing Services
  - addObjectToFolder
  - removeObjectFromFolder
- Discovery Services
  - query
  - getContentChanges
- Versioning Services
  - checkOut
  - cancelCheckOut
  - checkIn
  - getObjectOfLatestVersion
  - getPropertiesOfLatestVersion
  - getAllVersions
- Relationship Services
  - getObjectRelationships
- Policy Services
  - applyPolicy
  - removePolicy
  - getAppliedPolicies
- ACL Services
  - getACL
  - applyACL
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<atom:entry xmlns:cmis="http://docs.oasis-open.org/ns/cmis/core/200908/"
    xmlns:cmism="http://docs.oasis-open.org/ns/cmis/messaging/200908/"
    xmlns:atom="http://www.w3.org/2005/Atom"
    xmlns:app="http://www.w3.org/2007/app"
    xmlns:cmisra="http://docs.oasis-open.org/ns/cmis/restatom/200908/">
  <atom:author>
    <atom:name>Al Brown</atom:name>
  </atom:author>
  <atom:id>urn:uuid:efe0542e-8933-4b3e-93f2-4d1caa3fc2d9</atom:id>
  <atom:title type="text">CMIS Example Document</atom:title>
  <atom:updated>2010-01-25T10:20:58.318-08:00</atom:updated>
  <atom:content type="text">some text</atom:content>
  <cmisra:object>
    <cmis:properties>
      <cmis:propertyId localName="rep-cmis:objectTypeId" propertyDefinitionId="cmis:objectTypeId">
        <cmis:value>invoice</cmis:value>
      </cmis:propertyId>
      <cmis:propertyString localName="rep-cmis:name" propertyDefinitionId="cmis:name">
        <cmis:value>CMIS Example Document</cmis:value>
      </cmis:propertyString>
    </cmis:properties>
  </cmisra:object>
</atom:entry>
The specification explains the CMIS AtomPub protocol in detail.
The other way is to use a SOAP-based Web Service. This is a message-based approach in which the SOAP protocol is used to exchange XML messages. In detail, such a Web Service follows the guidelines of the Web Services Interoperability Organization (WS-I).
Intermediate summary
As we can see, CMIS is not suitable for covering everything you might want to do with your content repository. Features such as 'Create a Policy' cannot be covered by a standard which aims to be system-independent. The specification contains a lot of MAYs (may or may not), which means that two different CMIS implementations may not even have the same set of features available. The specification then allows an implementation to just set the related 'featureSupported' flag to 'false'. The following table may give you an idea of what I mean:
Feature | MAY | SHOULD | MUST
More than one repository | x | |
Further implementation-specific metadata | x | |
Additional object types | x | |
Content streams | x | |
Renditions | x | |
ACLs | x | |
Multi-value properties | x | |
Referential constraints | x | |
Policies | x | |
Relationships | x | |
Value constraints | x | |
... | | |
Permanent and unique ids | | x |
Consistent property order | | x |
Meaningful error messages | | x |
Property inheritance | | x |
Normalized string identifiers (e.g. camel case) | | x |
Preserved order of multi values | | x |
... | | |
Get a list of repositories via service method | | | x
Unique repository id | | | x
Provide information for each object type definition on whether content streams are supported or required | | | x
Empty set as multiple value set disallowed | | | x
queryName property | | | x
Folder and document base types | | | x
... | | |
Hands on
Alfresco is an open source ECM system (http://www.alfresco.com). So let's try out Alfresco's CMIS implementation. Alfresco itself states that it has a full implementation of the specification (whatever that means, given that even a repository without the full feature set could be 100% compliant). Alfresco 3.3 or higher is required.
(BTW: Alfresco also mentions a repository-to-repository use case: http://wiki.alfresco.com/wiki/CMIS)
There are some Alfresco-specific things. A custom aspect in Alfresco is handled like a policy, which means that aspects cannot be created via CMIS. There are further examples as well, but you could say that Alfresco just extended the standard while still being compliant. It seems that Alfresco uses the "OpenCMIS" framework from Apache. Jeff Potts from Alfresco wrote a good tutorial about CMIS and Alfresco, so this blog article will just follow parts of his tutorial.
The tutorial covers:
- How to authenticate against the repository (not CMIS specific)
- Getting basic repository information
- Create a folder and content via CMIS
curl -uadmin:admin http://localhost:8080/alfresco/s/cmis
What you get back is a lot of XML. What you are actually getting here is the result of the 'getRepositoryInfo' method. This entry point tells you where to find the specific resources to interact with.
There are collection elements that provide the set of URLs which make up the RESTful service.
...
<collection href="http://localhost:8080/alfresco/s/cmis/s/workspace:SpacesStore/i/4eb6a431-3c56-4767-816a-4ceca2295ae2/children">
  <atom:title>root collection</atom:title>
  <cmisra:collectionType>root</cmisra:collectionType>
</collection>
...
<collection href="http://localhost:8080/alfresco/s/cmis/types">
  <atom:title>type collection</atom:title>
  <cmisra:collectionType>types</cmisra:collectionType>
</collection>
...
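If you would rather not pick the URLs out of the XML by eye, the collection elements can be read with the XML parser that ships with Java. The following is a minimal sketch, assuming the service document is already available as a string (fetching it over HTTP with authentication is left out). The class `ServiceDocSketch` and the method `collectionHref` are names made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper that looks up a collection URL in an AtomPub service document. */
final class ServiceDocSketch {
    static final String APP_NS = "http://www.w3.org/2007/app";
    static final String CMISRA_NS = "http://docs.oasis-open.org/ns/cmis/restatom/200908/";

    /** Returns the href of the collection with the given cmisra:collectionType, or null. */
    static String collectionHref(String serviceDocumentXml, String collectionType) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for the *NS lookups below
            Document doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(serviceDocumentXml.getBytes(StandardCharsets.UTF_8)));
            NodeList collections = doc.getElementsByTagNameNS(APP_NS, "collection");
            for (int i = 0; i < collections.getLength(); i++) {
                Element collection = (Element) collections.item(i);
                NodeList types = collection.getElementsByTagNameNS(CMISRA_NS, "collectionType");
                if (types.getLength() > 0
                        && collectionType.equals(types.item(0).getTextContent().trim())) {
                    return collection.getAttribute("href");
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse the service document", e);
        }
    }

    public static void main(String[] args) {
        // A tiny stand-in for the real response, which is much longer
        String xml = "<service xmlns=\"" + APP_NS + "\" xmlns:atom=\"http://www.w3.org/2005/Atom\""
                + " xmlns:cmisra=\"" + CMISRA_NS + "\"><workspace>"
                + "<collection href=\"http://localhost:8080/alfresco/s/cmis/types\">"
                + "<atom:title>type collection</atom:title>"
                + "<cmisra:collectionType>types</cmisra:collectionType>"
                + "</collection></workspace></service>";
        System.out.println(collectionHref(xml, "types"));
    }
}
```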
The capabilities element tells you which parts of the CMIS specifications are implemented:
...
<cmis:capabilityAllVersionsSearchable>false</cmis:capabilityAllVersionsSearchable>
<cmis:capabilityChanges>none</cmis:capabilityChanges>
<cmis:capabilityContentStreamUpdatability>anytime</cmis:capabilityContentStreamUpdatability>
...
The root collection above gives information about the root directory of your repository. If you request its URL, you get the entries that correspond to the objects inside the root folder of the repository.
The type collection above gives information about the types that are available inside the repository. You can also copy its URL into a curl call to get the result back.
To create a new folder you need to create an XML file and POST it to the right URL:
<?xml version="1.0" encoding="utf-8"?>
<entry xmlns="http://www.w3.org/2005/Atom"
    xmlns:cmisra="http://docs.oasis-open.org/ns/cmis/restatom/200908/"
    xmlns:cmis="http://docs.oasis-open.org/ns/cmis/core/200908/">
  <title>Someco</title>
  <cmisra:object>
    <cmis:properties>
      <cmis:propertyId propertyDefinitionId="cmis:objectTypeId"><cmis:value>cmis:folder</cmis:value></cmis:propertyId>
    </cmis:properties>
  </cmisra:object>
</entry>
Here is how such a POST could look via curl:
curl -X POST -uadmin:admin "http://localhost:8080/alfresco/s/cmis/p/children" -H "Content-Type: application/atom+xml" -d @/${path}/testCreateSomecoFolder.atom.xml
This creates a folder named Someco inside the root folder of the repository.
You can also upload content by using a POST request. The content has to be part of the XML file and must be Base64-encoded:
openssl base64 -in ./sample-a.doc -out ./sample-a.doc.base64
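The same encoding step can be done in Java with the standard library (Java 8 or later), which is handy if you build the XML programmatically. This is only a small sketch; the class name `Base64Sketch` is made up. One difference to be aware of: openssl breaks its output into lines, while the basic Java encoder used here produces a single line.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical helper showing Base64 encoding and decoding with java.util.Base64. */
final class Base64Sketch {

    static String encode(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    static byte[] decode(String encoded) {
        return Base64.getDecoder().decode(encoded);
    }

    public static void main(String[] args) {
        // In practice the bytes would come from the file to upload,
        // e.g. via java.nio.file.Files.readAllBytes(...)
        byte[] raw = "some text".getBytes(StandardCharsets.UTF_8);
        String encoded = encode(raw);
        System.out.println(encoded); // prints c29tZSB0ZXh0
        System.out.println(new String(decode(encoded), StandardCharsets.UTF_8));
    }
}
```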
At this point we stop following the tutorial and ask for an easier way to access the repository via CMIS. Indeed there is one; all we need is a CMIS client. So the first step is to download the client libraries from the OpenCMIS web site. Afterwards you can create a Java project (e.g. by using Eclipse) which references the client libraries and their dependencies. So let's do some examples:
- Connect to the repository
- Create a folder
- Create a document
- Get an object and its properties
package de.ecmgeek.opencmis;

import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.chemistry.opencmis.client.api.Repository;
import org.apache.chemistry.opencmis.client.api.Session;
import org.apache.chemistry.opencmis.client.api.SessionFactory;
import org.apache.chemistry.opencmis.client.runtime.SessionFactoryImpl;
import org.apache.chemistry.opencmis.commons.SessionParameter;
import org.apache.chemistry.opencmis.commons.enums.BindingType;

public class ConnectionExample {

    public static void main(String[] args) {
        SessionFactory sessionFactory = SessionFactoryImpl.newInstance();

        // Connection parameters for the AtomPub binding
        Map<String, String> params = new HashMap<String, String>();
        params.put(SessionParameter.USER, "admin");
        params.put(SessionParameter.PASSWORD, "admin");
        params.put(SessionParameter.ATOMPUB_URL, "http://192.168.190.129:8080/alfresco/s/cmis");
        params.put(SessionParameter.BINDING_TYPE, BindingType.ATOMPUB.value());

        // List all repositories that are reachable via this endpoint
        List<Repository> repositories = sessionFactory.getRepositories(params);
        for (Repository r : repositories) {
            System.out.println("Id: " + r.getId() + ";Name: " + r.getName());
        }

        // Open a session on the first repository
        Session session = repositories.get(0).createSession();
    }
}
Let's just assume that we have a helper method that returns the params. Then the following code can be used to get the root folder:
package de.ecmgeek.opencmis;

import java.util.Map;

import org.apache.chemistry.opencmis.client.api.CmisObject;
import org.apache.chemistry.opencmis.client.api.Folder;
import org.apache.chemistry.opencmis.client.api.Repository;
import org.apache.chemistry.opencmis.client.api.Session;
import org.apache.chemistry.opencmis.client.api.SessionFactory;
import org.apache.chemistry.opencmis.client.runtime.SessionFactoryImpl;

public class RootFolderExample {

    public static void main(String[] args) {
        SessionFactory sessionFactory = SessionFactoryImpl.newInstance();
        Map<String, String> params = Helper.createParams("admin", "admin", "http://192.168.190.129:8080/alfresco/s/cmis");

        // Assuming that only one repository exists
        Repository repo = sessionFactory.getRepositories(params).get(0);
        Session session = repo.createSession();

        // Print the root folder and its direct children
        Folder root = session.getRootFolder();
        System.out.println("+-" + root.getName());
        for (CmisObject child : root.getChildren()) {
            System.out.println("+--" + child.getName());
        }
    }
}
This prints out the following in my case:
+-Firmen-Home
+--Sites
+--Datenverzeichnis
+--Besucher-Home
+--Benutzer-Homes
+--Test
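The examples in this article call 'Helper.createParams', which is not shown. A minimal sketch of such a helper could look like the following. To keep it free of dependencies it uses literal strings for the parameter keys; I believe these are the values behind 'SessionParameter.USER', 'SessionParameter.PASSWORD', 'SessionParameter.ATOMPUB_URL', 'SessionParameter.BINDING_TYPE' and 'BindingType.ATOMPUB.value()', but that is an assumption, so in a real project use the OpenCMIS constants as in the connection example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper matching the Helper.createParams(...) calls in the examples. */
final class Helper {

    // Assumed values of the OpenCMIS SessionParameter constants;
    // prefer SessionParameter.USER etc. when the library is on the class path.
    private static final String USER = "org.apache.chemistry.opencmis.user";
    private static final String PASSWORD = "org.apache.chemistry.opencmis.password";
    private static final String ATOMPUB_URL = "org.apache.chemistry.opencmis.binding.atompub.url";
    private static final String BINDING_TYPE = "org.apache.chemistry.opencmis.binding.spi.type";

    private Helper() {
        // static helper, no instances
    }

    /** Builds the session parameters for an AtomPub connection. */
    static Map<String, String> createParams(String user, String password, String atomPubUrl) {
        Map<String, String> params = new HashMap<String, String>();
        params.put(USER, user);
        params.put(PASSWORD, password);
        params.put(ATOMPUB_URL, atomPubUrl);
        params.put(BINDING_TYPE, "atompub"); // assumed value of BindingType.ATOMPUB.value()
        return params;
    }

    public static void main(String[] args) {
        System.out.println(createParams("admin", "admin", "http://localhost:8080/alfresco/s/cmis"));
    }
}
```

To use it together with the examples, put it into the same package (de.ecmgeek.opencmis) and make the class and the method public if needed.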
Localization problem? I tried to use the parameter 'params.put(SessionParameter.LOCALE_ISO639_LANGUAGE, "en");' to get the English names, but this does not seem to work. The primary names in Alfresco seem to be German because my JVM was configured for German when I started Alfresco for the first time. (Before the first startup you can set the following Java arguments to have the names set to English by default: -Duser.country=US -Duser.language=en.) Alfresco then supports multilingual titles.
Now let's try to get some Alfresco-specific properties. It turns out that this is not as simple as it could be. A folder can have a 'cm:title' attribute in Alfresco. Surprisingly, there is no such property in the property list of the root folder mentioned above. So what's wrong here? The answer seems to be the following:
- OpenCMIS parses the XML by taking the 'cmis:properties' tag into account. For Alfresco, this XML node also has a subnode named 'alf:aspects'. The 'cm:title' property is part of an aspect, and CMIS does not support Alfresco's aspects.
There is an extension library available at 'http://code.google.com/a/apache-extras.org/p/alfresco-opencmis-extension/'. With it you can set the 'OBJECT_FACTORY_CLASS' parameter to use the Alfresco-specific object factory.
params.put(SessionParameter.OBJECT_FACTORY_CLASS, "org.alfresco.cmis.client.impl.AlfrescoObjectFactoryImpl");
However, you can imagine that you can't just do that for multiple repositories. Here is the code:
package de.ecmgeek.opencmis;

import java.util.Map;

import org.alfresco.cmis.client.AlfrescoFolder;
import org.apache.chemistry.opencmis.client.api.CmisObject;
import org.apache.chemistry.opencmis.client.api.Folder;
import org.apache.chemistry.opencmis.client.api.Repository;
import org.apache.chemistry.opencmis.client.api.Session;
import org.apache.chemistry.opencmis.client.api.SessionFactory;
import org.apache.chemistry.opencmis.client.runtime.SessionFactoryImpl;

public class AlfrescoRootFolderExample {

    public static void main(String[] args) {
        SessionFactory sessionFactory = SessionFactoryImpl.newInstance();
        Map<String, String> params = Helper.createParams("admin", "admin", "http://192.168.190.129:8080/alfresco/s/cmis");

        // Assuming that only one repository exists
        Repository repo = sessionFactory.getRepositories(params).get(0);
        Session session = repo.createSession();

        // Casting is not required at this point, just for demo purposes
        AlfrescoFolder root = (AlfrescoFolder) session.getRootFolder();
        System.out.println("+-" + root.getPropertyValue("cm:title"));
        for (CmisObject child : root.getChildren()) {
            System.out.println("+--" + child.getPropertyValue("cm:title"));
        }
    }
}
The result is:
+-Firmen-Home
+--Sites
+--Datenverzeichnis
+--Besucher-Home
+--Benutzer-Homes
+--
The last entry is empty because the folder 'Test' was not assigned a title.
So let's continue the tutorial by creating a folder and a document!
package de.ecmgeek.opencmis;

import java.util.HashMap;
import java.util.Map;

import org.apache.chemistry.opencmis.client.api.Document;
import org.apache.chemistry.opencmis.client.api.Folder;
import org.apache.chemistry.opencmis.client.api.Repository;
import org.apache.chemistry.opencmis.client.api.Session;
import org.apache.chemistry.opencmis.client.api.SessionFactory;
import org.apache.chemistry.opencmis.client.runtime.SessionFactoryImpl;
import org.apache.chemistry.opencmis.commons.enums.VersioningState;

public class CreateFolderAndDocExample {

    public static void main(String[] args) {
        SessionFactory sessionFactory = SessionFactoryImpl.newInstance();
        Map<String, String> params = Helper.createParams("admin", "admin", "http://192.168.190.129:8080/alfresco/s/cmis");

        // Assuming that only one repository exists
        Repository repo = sessionFactory.getRepositories(params).get(0);
        Session session = repo.createSession();
        Folder root = session.getRootFolder();

        // The new folder's properties
        Map<String, String> newFolderProps = new HashMap<String, String>();
        newFolderProps.put("cmis:objectTypeId", "cmis:folder");
        newFolderProps.put("cmis:name", "Test Folder 1");
        Folder newFolder = root.createFolder(newFolderProps);
        System.out.println("Created folder with name: " + newFolder.getName());

        // The new document's properties
        Map<String, String> newDocProps = new HashMap<String, String>();
        newDocProps.put("cmis:objectTypeId", "cmis:document");
        newDocProps.put("cmis:name", "Test Doc 1");

        // No content stream (null) and no versioning for this simple example
        Document newDoc = newFolder.createDocument(newDocProps, null, VersioningState.NONE);
        System.out.println("Created document with name: " + newDoc.getName());
    }
}
The last part of the tutorial shows the basics of using the CMIS Query Language. Let's first assume that you have already created a new document type inside your Alfresco repository. I named it 'ecg:document'. Then you can query all documents of this specific type by using the following source code:
package de.ecmgeek.opencmis;

import java.util.Map;

import org.apache.chemistry.opencmis.client.api.QueryResult;
import org.apache.chemistry.opencmis.client.api.Repository;
import org.apache.chemistry.opencmis.client.api.Session;
import org.apache.chemistry.opencmis.client.api.SessionFactory;
import org.apache.chemistry.opencmis.client.runtime.SessionFactoryImpl;

public class QueryExample {

    public static void main(String[] args) {
        SessionFactory sessionFactory = SessionFactoryImpl.newInstance();
        Map<String, String> params = Helper.createParams("admin", "admin", "http://192.168.190.129:8080/alfresco/s/cmis");
        Repository repo = sessionFactory.getRepositories(params).get(0);
        Session session = repo.createSession();

        // Query all documents of the custom type; 'false' means latest versions only
        String query = "SELECT * FROM ecg:document";
        for (QueryResult r : session.query(query, false)) {
            System.out.println(r.getPropertyValueById("cmis:name"));
        }
    }
}
BTW: Alfresco has built-in types like 'cm:content'. It's not possible to query directly for 'cm:content' because it is mapped to 'cmis:document'.
Outlook
This was only a small tutorial, so it did not cover all facets of CMIS. However, I hope it helped you to understand it a bit. A further subject could be the Query Language. It seems to provide enough material for another blog entry ;-).