[odb-users] Re: custom type mapping using value_traits specialization and memory leak prevention

Sat May 2 16:00:02 EDT 2020

Hello

Answering my own question. I analyzed the source code in
mssql/statement.cxx and concluded that everything needed is already
provided, so no additional state machine information is required. Here is
the source code of the read_callback() registered by
value_traits<string, id_long_nstring>::set_value():

using Poco::UTF8Encoding;
using Poco::UTF16Encoding;

/*
SQLRETURN SQLGetData(
      SQLHSTMT       StatementHandle,
      SQLUSMALLINT   Col_or_Param_Num,
      SQLSMALLINT    TargetType,
      SQLPOINTER     TargetValuePtr,
      SQLLEN         BufferLength,
      SQLLEN *       StrLen_or_IndPtr);

After SQLGetData() is called with BufferLength=0, the callback function is
called for the first time with the 'chunk' parameter set to one of the
following values:
- chunk_null, if 'StrLen_or_IndPtr' == SQL_NULL_DATA;
- chunk_one, if 'StrLen_or_IndPtr' == 0;
- chunk_first, otherwise.
Since BufferLength=0, the buffer does not contain any data yet.

Unless 'chunk' is chunk_null, chunk_one or chunk_last, the callback
function must set the variables pointed to by the 'buffer' and 'size'
parameters. SQLGetData() will store at most 'size' bytes in the buffer.
*/

void value_traits<string, id_long_nstring>::read_callback(
    void *context,          // User context.
    size_t *position,       // Position context. An implementation is free
                            // to use this to track position information. It
                            // is initialized to zero before the first call.
    void **buffer,          // [in/out] Buffer to copy the data to. On the
                            // first call it contains a pointer to the
                            // long_callback struct (used for redirections).
    size_t *size,           // [in/out] In: amount of data copied into the
                            // buffer after the previous call. Out: capacity
                            // of the buffer.
    chunk_type chunk,       // The position of this chunk; chunk_first means
                            // this is the first call, chunk_last means
                            // there is no more data, chunk_null means this
                            // value is NULL, and chunk_one means the value
                            // is empty.
    size_t size_left,       // Contains the amount of data left or 0 if this
                            // information is not available.
    void *tmp_buffer,       // A temporary buffer that may be used by the
                            // implementation.
    size_t tmp_capacity     // Capacity of the temporary buffer.
)
{
    string &result(*static_cast<string *>(context));
    if(chunk == chunk_null || chunk == chunk_one) {
        result.clear();
        return;
    }

    if(chunk == chunk_first) {
        *buffer = tmp_buffer;
        *size = tmp_capacity;
        return;
    }

    /* Convert at most
     * (char *)(*buffer) + *size - (char *)tmp_buffer
     * bytes containing the UTF16 character sequence from 'tmp_buffer' and
     * append the resulting UTF8 characters to the 'result'. If the sequence
     * is truncated, move the unconverted bytes to the beginning of the
     * 'tmp_buffer', increase the pointer stored in the variable pointed to
     * by 'buffer' by the number of bytes containing the unconverted UTF16
     * characters and decrease the variable pointed to by 'size' by the said
     * number.
     */

    UTF8Encoding utf8;
    UTF16Encoding utf16(UTF16Encoding::LITTLE_ENDIAN_BYTE_ORDER);
    size_t chunk_left = (char *)(*buffer) + *size - (char *)tmp_buffer;
    auto chunk_p = (unsigned char *)(tmp_buffer);
    while(chunk_left != 0) {
        auto char_size = utf16.sequenceLength(chunk_p, chunk_left);
        if(char_size <= 0) break;
        if((size_t)char_size > chunk_left) break;
        auto ch = utf16.queryConvert(chunk_p, chunk_left);

        unsigned char utf8_char[6];
        auto utf8_char_size = utf8.convert(ch, utf8_char,
                                           sizeof(utf8_char));
        result.append((char *)utf8_char, utf8_char_size);

        chunk_p += char_size;
        chunk_left -= char_size;
    }

    if(chunk == chunk_last) {
        if(chunk_left == 0) return;

        Scope;
        Error << "nvarchar contains truncated data, bytes left "
              << chunk_left;
        return;
    }

    if(chunk_left != 0) memmove(tmp_buffer, chunk_p, chunk_left);
    *buffer = (char *)tmp_buffer + chunk_left;
    *size = tmp_capacity - chunk_left;
}
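
For anyone who wants to check the carry-over logic without Poco or a
database, here is a minimal sketch that uses only the standard library. The
helper names (append_utf8, convert_utf16le, decode_in_chunks) are made up
for illustration and are not part of ODB or Poco. decode_in_chunks only
imitates the driver refilling the buffer, and the low surrogate is assumed
to be valid rather than checked.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Append one Unicode code point to 'out' as UTF-8 (1 to 4 bytes).
void append_utf8(std::string &out, char32_t cp)
{
    if(cp < 0x80) {
        out += char(cp);
    } else if(cp < 0x800) {
        out += char(0xC0 | (cp >> 6));
        out += char(0x80 | (cp & 0x3F));
    } else if(cp < 0x10000) {
        out += char(0xE0 | (cp >> 12));
        out += char(0x80 | ((cp >> 6) & 0x3F));
        out += char(0x80 | (cp & 0x3F));
    } else {
        out += char(0xF0 | (cp >> 18));
        out += char(0x80 | ((cp >> 12) & 0x3F));
        out += char(0x80 | ((cp >> 6) & 0x3F));
        out += char(0x80 | (cp & 0x3F));
    }
}

// Convert every complete UTF-16LE character in data[0..size) and append it
// to 'out'. Returns the number of bytes consumed; the remaining 0 to 3
// bytes belong to a character that has not fully arrived yet.
// Assumption: surrogate pairs are well formed (no validation is done).
std::size_t convert_utf16le(const unsigned char *data, std::size_t size,
                            std::string &out)
{
    std::size_t pos = 0;
    while(size - pos >= 2) {
        char32_t cp = data[pos] | (data[pos + 1] << 8);
        std::size_t len = 2;
        if(cp >= 0xD800 && cp <= 0xDBFF) {          // high surrogate
            if(size - pos < 4) break;               // low surrogate missing
            char32_t low = data[pos + 2] | (data[pos + 3] << 8);
            cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
            len = 4;
        }
        append_utf8(out, cp);
        pos += len;
    }
    return pos;
}

// Hypothetical stand-in for the driver: pushes 'input' through a scratch
// buffer of 'capacity' bytes and moves any truncated tail to the front of
// the buffer before the next refill, as read_callback() does with memmove.
std::string decode_in_chunks(const std::string &input, std::size_t capacity)
{
    unsigned char tmp[64];
    assert(capacity >= 4 && capacity <= sizeof(tmp));

    std::string result;
    std::size_t carried = 0;    // unconverted bytes at the start of tmp
    std::size_t offset = 0;     // bytes of 'input' handed out so far
    while(offset < input.size()) {
        std::size_t n = std::min(capacity - carried, input.size() - offset);
        std::memcpy(tmp + carried, input.data() + offset, n);
        offset += n;

        std::size_t avail = carried + n;
        std::size_t used = convert_utf16le(tmp, avail, result);
        carried = avail - used;
        if(carried != 0) std::memmove(tmp, tmp + used, carried);
    }
    return result;
}
```

With a 5-byte buffer the surrogate pair in a short test string is split
across two refills, which exercises the same path as the memmove at the end
of read_callback().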

On Sat, May 2, 2020 at 6:50 PM Sten Kultakangas <ratkaisut at gmail.com> wrote:

> Hello
>
> I am implementing an nvarchar <-> UTF-8-encoded std::string database type
> mapping. Everything seems straightforward for nvarchar fields that do not
> exceed a certain length limit:
>
> #include <cstring>
> #include <Poco/UnicodeConverter.h>
> #include "core/db_types.h"
>
> using Poco::UnicodeConverter;
> using namespace std;
>
> namespace odb {
> namespace mssql {
>
> void value_traits<string, id_nstring>::set_value(string &value,
> const ucs2_char *buffer, size_t buffer_size, bool is_null)
> {
>   if(is_null) value = "";
>   else UnicodeConverter::convert(buffer, buffer_size, value);
> }
>
>
> void value_traits<string, id_nstring>::set_image(ucs2_char *buffer,
> size_t buffer_size, size_t &actual_size, bool &is_null, const string
> &value)
> {
>   Poco::UTF16String utf16;
>   UnicodeConverter::convert(value, utf16);
>
>   is_null = false;
>   actual_size = utf16.size();
>   if(actual_size > buffer_size) actual_size = buffer_size;
>   memcpy(buffer, utf16.data(), actual_size * sizeof(ucs2_char));
> }
>
> }
> }
>
> However, I would also like to implement the specialization for the
> id_long_nstring type so that I can work with nvarchar fields exceeding the
> length limit. My main concern is whether the callback is called with
> chunk_type=chunk_last even when an exception is thrown due to an I/O
> error. I could not find any destructor that would guarantee the callback
> is called in such a scenario. If the callback is not called after an I/O
> error, the non-trivially destructible object referenced by the "user
> context" parameter will not be destroyed and a memory leak will occur. If
> there is such a limitation, then I must design a trivially destructible
> state machine object and place it in the provided buffer to prevent memory
> leaks.
>
> Can you confirm whether the callback is indeed not called when an
> exception is thrown during the set_value/set_image operation for the
> id_long_nstring type?
>
> Best regards,
> Sten Kultakangas
>
>