Documentum problems and how to fix them: #2 – Using diacritic characters in search

    Learn how to fix a common problem in Documentum: using diacritic characters in case-insensitive xCP searches on Postgres.

    This is the second post in the series “Documentum problems and how to fix them”, in which we describe problems our team has encountered when implementing Documentum for customers and share how we fixed them. This time we look at using diacritic characters in case-insensitive xCP searches on Postgres.


    Problem: Using diacritic characters in search

    One of our testers found that searching for a term containing a diacritic character did not work in xCP: the system returned no results, even though there clearly were documents matching the criteria the tester had entered.

    Solution: Change locale support settings in the database

    Activating DFC tracing revealed the actual query that was being run. As it turns out, OpenText has implemented case-insensitive search in xCP by uppercasing both sides of the comparison: the database converts the values in the search column to uppercase, and xCP uses Java to convert the search term entered by the user to uppercase, e.g. étienne becomes ÉTIENNE. This makes sense and works on an Oracle database, but not on a Postgres database.
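
    To make the application side of this comparison concrete, here is a minimal Java sketch of uppercasing a search term. The class and method names are hypothetical, and the use of Locale.ROOT is an assumption; the article does not show which call or locale xCP actually uses.

```java
import java.util.Locale;

public class UpperCaseSearchTerm {
    // Hypothetical helper: uppercases a search term with Java's
    // Unicode-aware case mapping. Locale.ROOT is an assumption made
    // here to keep the result independent of the JVM default locale.
    public static String normalize(String term) {
        return term.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // "\u00e9" is the letter é, written as an escape so the source
        // file encoding does not matter.
        String term = "\u00e9tienne";
        // The accented letter is uppercased too: the result starts with \u00c9 (É).
        System.out.println(normalize(term));
    }
}
```

    Java's case mapping follows Unicode, so the accented first letter is converted along with the ASCII letters. The database has to do the same for the two sides to match.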

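    The behavior of the UPPER function in Postgres turns out to depend on the locale the database was created with. Strictly speaking, case conversion is governed by LC_CTYPE (character classification and case mapping), while LC_COLLATE governs string sort order. The two are usually set to the same locale when the database is created, so checking LC_COLLATE is a reasonable indicator of both.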

    See for example:

    SHOW LC_COLLATE;
     -- "fr_FR.UTF-8"
    SELECT upper('étienne');
     -- "ÉTIENNE"

    Versus

    SHOW LC_COLLATE;
     -- "C"
    SELECT upper('étienne');
     -- "éTIENNE"

    Since Java converts the search term entered by the user to ÉTIENNE, the result of the UPPER function with LC_COLLATE set to "C" (which is the default) does not match it, and the search returns no results.
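
    The mismatch can be reproduced without a database. The sketch below imitates an ASCII-only uppercase conversion, which is how UPPER behaves under the "C" locale, and compares it with Java's Unicode-aware conversion. The class and method names are hypothetical, and asciiUpper is only a stand-in for the database function, not its actual implementation.

```java
import java.util.Locale;

public class CLocaleUpperDemo {
    // Stand-in for UPPER under the "C" locale (an approximation for
    // illustration): only the ASCII letters a-z are converted, every
    // other character is passed through unchanged.
    public static String asciiUpper(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            sb.append(c >= 'a' && c <= 'z' ? (char) (c - 32) : c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String stored = "\u00e9tienne"; // value in the search column (é written as an escape)

        // Application side: Unicode-aware uppercase of the search term.
        String searchTerm = stored.toUpperCase(Locale.ROOT);

        // Database side under "C": the leading é stays lowercase.
        String dbSide = asciiUpper(stored);

        System.out.println(dbSide.equals(searchTerm)); // false
    }
}
```

    Because only the ASCII letters are converted on one side, the two strings differ in their first character and the equality test in the generated query can never succeed for such values.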

    To get the search function to work, we needed to take the following steps:

    1. Change the LC_COLLATE setting to Dutch_Netherlands.1252 (in PostgreSQL this is chosen when the database is created).
    2. Set the PGCLIENTENCODING Windows environment variable to UTF8.
    3. Use the PostgreSQL ANSI (x64) ODBC data source instead of the PostgreSQL Unicode (x64) data source.

    Published on 30/10/17    Last updated on 21/06/18

    #Documentum, #OpenText, #Document Management

    About the author

    Willem Lavrijssen is ECS Technical Consultant at AMPLEXOR, based in Eindhoven, The Netherlands. A certified Documentum Proven Professional, Willem has over 15 years of implementation experience with the Documentum product suite across a variety of industries.
