
Documentum problems and how to fix them: #1 - Character encoding

Written by Willem Lavrijssen on 12/10/17

Documentum offers a very broad spectrum of enterprise document management features and solutions, ranging from scanning software through document management and business process management to document generation. It goes without saying that, even after 15 years of working as a Documentum consultant, you sometimes still run into things that were never an issue before.

Because of this, we’re starting a series of posts, "Documentum problems and how to fix them", in which we describe problems our team has encountered when implementing Documentum for customers and share how to fix them.

 

Problem: Errors on attribute field length

At the start of every new Documentum project, testers often test basic functionality, e.g. whether attribute fields accept the number of characters specified as their maximum length. Although this seems pretty basic, in my current project we found that entering special characters common in Western European character sets (such as ë, é, etc.) results in undesired behavior - we had a character encoding problem.

Say, for example, a certain string attribute is defined with a maximum length of 4 characters, and the user enters three normal characters and one special character (e.g. “theé”). The system throws an error stating that the string exceeds the maximum string length of 4 characters, which is confusing for regular end-users. Why does the system misinterpret the string length?

 

Getting to the bottom of it

As a first step in analyzing the issue, we entered the same string directly in the database (which was configured to use UTF-8). This confirmed that the database was capable of holding the string. Then, we did the same using DQLTester, to confirm that the content server was capable of passing the string to the database.

This also worked as expected, which left xCP and DFC as the only possible culprits. Since the stack trace clearly suggested that the error was raised at the DFC level, we decided to raise a support request with OpenText. Meanwhile, we found a support note stating that attribute length is measured in bytes, not in characters. This matched our findings, since the special characters take up 2 bytes instead of 1 for normal characters. OpenText’s surprising response to the support request: "I have researched the issue further and found that an improvement request was logged last year documenting the issue you describe and was reviewed by the Product Manager that covers core services (Content server and DFC) but there was insufficient time to commit to addressing this for the 7.3 release and it has comments to indicate it would be reviewed again for the next major release."

The response also stated that raising the issue with partner or account managers could help in increasing the visibility of a core defect that is very hard to explain to end-users.

 

Solution: Setting up recognition of different character encoding types

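The answer is that attribute length is measured in bytes, not in characters. In UTF-8, accented characters such as é and ë take up 2 bytes, versus 1 byte for unaccented characters, so a 4-character string like “theé” needs 5 bytes and no longer fits in a field with a maximum length of 4.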
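To make the difference between characters and bytes concrete, here is a minimal Python sketch using the example value from this post. It only illustrates how UTF-8 encodes the string; it does not call Documentum or DFC, and the helper name utf8_length is made up for this illustration.

```python
def utf8_length(text: str) -> int:
    """Return the number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


# "theé", with the é written as an escape so the source file encoding is irrelevant.
value = "the\u00e9"

print("characters:", len(value))       # counts Unicode code points
print("bytes:", utf8_length(value))    # counts UTF-8 encoded bytes

# Per-character breakdown: only the accented character needs more than one byte.
for ch in value:
    print(repr(ch), utf8_length(ch))
```

The string has 4 characters but occupies 5 bytes, which is why a length check that counts bytes rejects it for a field defined as 4 long.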

After we raised a service request, this issue was associated with Feature Enhancement CS-49851 – “Server does not recognize a UTF-8 enabled database and unnecessarily errors on attribute length”, planned for a future Documentum release.
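Until such an enhancement is available, one possible stopgap is to check the byte length of a value in your own code before it is saved, so that end-users get a message that actually explains the problem. The sketch below is an assumption on our side, not OpenText's fix and not part of any Documentum API; the function name check_attribute_value and the wording of the message are hypothetical.

```python
from typing import Optional


def check_attribute_value(value: str, max_bytes: int) -> Optional[str]:
    """Return an explanatory message if the value exceeds max_bytes, else None.

    Hypothetical helper: it assumes the attribute length is enforced in
    UTF-8 bytes, as described in the support note mentioned in this post.
    """
    used = len(value.encode("utf-8"))
    if used <= max_bytes:
        return None
    return (
        f"This value needs {used} bytes but the field holds at most "
        f"{max_bytes}. Accented characters count as more than one byte."
    )


# Usage: the accented value is rejected with an explanation, the plain one passes.
print(check_attribute_value("the\u00e9", 4))
print(check_attribute_value("thee", 4))
```

Another option, where the data model allows it, is to define string attributes with some headroom above the number of characters users are expected to enter.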



Topics:
Documentum, OpenText, Document Management







Willem Lavrijssen is an ECS Technical Consultant at AMPLEXOR, based in Eindhoven, The Netherlands. As a certified Documentum Proven Professional, Willem has over 15 years of implementation experience with the Documentum product suite across a variety of industries.
