From boris at codesynthesis.com  Fri Oct  1 09:16:07 2010
From: boris at codesynthesis.com (Boris Kolpackov)
Date: Fri Oct  1 09:08:17 2010
Subject: [odb-users] Support for database schema versioning/evolution
Message-ID: <boris.20101001151005@codesynthesis.com>

User 'kurige' at reddit/r/cpp posted[1] the following comment that
I am reposting here:

Better and better! Christmas has come early this year! Seriously, though,
keep up the good work - you guys are amazing. (On a related note: Is it
"you guys," plural? Or is CodeSynthesis a one-man-shop?)

First off, a bit of a correction. Apparently my terminilogy is mistaken.

"a database supports schema evolution if it permits modification of the
 schema without the loss of extant data; in addition, it supports schema
 versioning if it allows the querying of all data through user-definable
 version interfaces. ... we will consider the schema evolution as a special
 case of the schema versioning where only the current schema version is
 retained."[2]

So, what we are referring to is schema evolution.

I guess what I had in mind was not necessarily a full-blown schema evolution
system, but rather a set of "helper" classes and functions that would let me
manually write upgrade/downgrade scripts in C++ rather than in SQL, making it
completely independent of any specific backend database.

See here for example [3].

If you do end up going down the rabbit-hole, then at a minimum, whatever
versioning system you end up developing needs to be able to support the
following operations:

    * Add column
    * Drop column
    * Change column attributes
    * Change column name
    * Add class
    * Drop class
    * Change class name

Simply being able to add columns to existing tables is not enough. It's
essential that any versioning system support all of the above. If it
doesn't, then the first time I need to perform a modification to the
database that the ODB versioning doesn't support I'll be unable to use
the ODB versioning system from that point forward since the physical
database and the ODB generated model of the database will be in an
irreconcilable state on client's machines.

To complicate matters, there's no elegant solution to delete columns/tables
using "#pragma" directives withought adding quite a bit of unsightly cruft
to my header files, since even "deleted" columns would need to remain in
my header indefinitely with a "#pragma" above them indicated that they,
in fact, no longer exist.

A possible workaround would be for ODB to automagically create (at user's
request) and maintain a "version history" header that keeps track of all
changes made to a schema each time ODB is run. This could detect
additions/removals from the schema, modifications to column types, and
changes to the names of tables/columns via the db column('foo') and db
object table('bar') pragmas.

When I say automagically... I'm not kidding about the magic part. What
form that "version history" header file would take is a bit of a mystery
to me, but you might be able to use the semantic approach described in [2].


[1] http://www.reddit.com/r/cpp/comments/dkwwj/odb_compilerbased_orm_for_c/c111r9t

[2] "A semantic approach for schema evolution and versioning in object-oriented databases"
    http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.19.7241&rep=rep1&type=pdf

[3] "Active Record Migrations"
    http://api.rubyonrails.org/classes/ActiveRecord/Migration.html

From boris at codesynthesis.com  Tue Oct  5 09:53:49 2010
From: boris at codesynthesis.com (Boris Kolpackov)
Date: Tue Oct  5 09:41:56 2010
Subject: [odb-users] Support for database schema versioning/evolution
In-Reply-To: <boris.20101001151005@codesynthesis.com>
References: <boris.20101001151005@codesynthesis.com>
Message-ID: <boris.20101005145252@codesynthesis.com>

Hi Christopher,

Christopher Gateley <...> writes:

> First off, a bit of a correction. Apparently my terminilogy is mistaken.
> 
> "a database supports schema evolution if it permits modification of the
>  schema without the loss of extant data; in addition, it supports schema
>  versioning if it allows the querying of all data through user-definable
>  version interfaces. ... we will consider the schema evolution as a special
>  case of the schema versioning where only the current schema version is
>  retained."[2]
> 
> So, what we are referring to is schema evolution.

Well, yes, if we follow the above (academic) terminology. In practice,
what we are trying to achieve is automatic updating of the relational
database schema and, in some cases, the data stored in this database
based on the changes to persistent C++ classes. The paper refers to
what I would call "schema versions" as "schema snapshots". I think
"schema version" is a much better term.

My ideal workflow for schema evolution is this: the application developer
requests the ODB compiler to generate an upgrade schema (SQL file) between
two specific versions. He/she then runs this SQL file against the database,
similar to the way the initial schema is created, and the database is 
upgraded.


> I guess what I had in mind was not necessarily a full-blown schema evolution
> system, but rather a set of "helper" classes and functions that would let me
> manually write upgrade/downgrade scripts in C++ rather than in SQL, making it
> completely independent of any specific backend database.
> 
> See here for example [3].

They still refer to tables, columns, etc., in [3]. So it is just a wrapper
for executing DDL queries. Ideally we would want to support automatic schema
evolution and if the developer's input is required, then it should be in the
form of conversions between C++ values (see below).


> If you do end up going down the rabbit-hole, then at a minimum, whatever
> versioning system you end up developing needs to be able to support the
> following operations:
> 
>     * Add column

Adding a member would automatically add a column to the database. The
interesting part here is what to do with existing rows. The two possible
approaches that I can see are: (1) allow the developer to provide an 
expression that returns a default value if the value is not set and (2)
proactively set the default value for all existing rows.

With the first approach we would need to use some kind of a marker (e.g.,
NULL) to indicate the "not set" condition, which could be a problem


>     * Drop column

If we have a way to capture the information about the deleted member, then
generating the SQL query to remove the column will be easy:

#pragma db version(2)
#pragma db member(middle_) delete


>     * Change column attributes

I assume the most common scenario will be the change of a member's type.

// born_ used to be string in version 1
//
#pragma db member(born_) type (string)

#pragma db version(2)
date born_; // now it is date

The tricky part is how to convert the existing data from one type to
the other. Seems like the developer will have to provide the conversion
function and ODB will have to generate some C++ code (in addition to SQL)
to automatically convert the existing objects. Sounds messy.


>     * Change column name

// first_name_ used to be called first_
//
#pragma db member(first_name_) column("first")

#pragma db version(2)
string first_name_;


>     * Add class

This will be easy:

#pragma db version(2)
#pragma db object
class new_class
{
  ...
};


>     * Drop class

#pragma db version(2)
#pragma db object(old_class) delete


>     * Change class name

#pragma db object(person) table("people")

#pragma db version(2)
#pragma db object
class person
{
  ..
}

So the hard part in all these operations is how to upgrade the existing
data to the new schema. Let me know if you have some ideas on this.


> To complicate matters, there's no elegant solution to delete columns/tables
> using "#pragma" directives withought adding quite a bit of unsightly cruft
> to my header files, since even "deleted" columns would need to remain in
> my header indefinitely with a "#pragma" above them indicated that they,
> in fact, no longer exist.

I don't see why they should remain in the header indefinitely. You can
remove the cruft corresponding to a version as soon as there are no more
databases that you need to upgrade. I would expect most people to only 
keep a few versions at a time.


> A possible workaround would be for ODB to automagically create (at user's
> request) and maintain a "version history" header that keeps track of all
> changes made to a schema each time ODB is run. This could detect
> additions/removals from the schema, modifications to column types, and
> changes to the names of tables/columns via the db column('foo') and db
> object table('bar') pragmas.

No, creating and maintaining a "state" file is definitely not something
I would want to do. It is a really big can of worms (backups, parallel
builds, renames, etc., etc).

Boris

From raindog at macrohmasheen.com  Sun Oct 17 19:31:51 2010
From: raindog at macrohmasheen.com (raindog)
Date: Mon Oct 18 02:29:13 2010
Subject: [odb-users] Questions and future direction of ODB
Message-ID: <4CBB8767.5080904@macrohmasheen.com>

Hello,

I wanted to ask several questions about the development of ODB.

1. Why did you choose the GCC plugin system of clang? At this time clang 
is supposedly a fully featured C++ compiler that conforms to the 
standard, it seems like clang was designed specifically for these types 
of projects.

2. When will support for stored procedures come?

3. Have you thought about immitating the syntax of LINQ?

4. Can't the ODB code generator insert the "friend class odb::access;" 
in entities that it finds so that the user does not have to do that? 
Same with the #include <odb/core.hxx>

5. Comment: Finally a library that imitates STL lowercase naming!

6. Was your goal to have dependencies only on the standard library and 
the various database library front-ends? I say this because of your use 
of std::auto_ptr which makes it difficult to store things in STL containers.

7. Why does retrieving data from the database require being wrapped in a 
transaction?

8. Is this used anywhere in production yet?

9. What about "paging" (returning a subset of results per query based on 
some page #) support, some databases have support for it, but others 
like MS SQL don't, can we emulate that with ODB?

10. It looks like ODB will do really well for objects that need 
persisted, but what about for use cases where we just need to run 
arbitrary queries for say, generating a table and we don't really need 
persistent objects for that.

11. The limitation with having only one uncached result set per 
transaction, is this a design limitation or database limitation? In an 
application with a large number of concurrent users (website, game 
server, etc), having whole result sets cached can become a huge memory 
overhead.

12. Is it possible for ODB to connect to the database to validate that 
the persistence code one has written is in some way 'correct', in this 
case, the transient pragma could be removed or automatically determined.

13. How are relationships treated? It seems like you could just add a 
field to the persistable object for the FK of the relationship and then 
use the load functionality, but in that case it might be quite slow if 
you have returned a large result set and now you need to load 
individually all of the foreign entities one by one.


14. What is the dev roadmap?


Thanks!

From boris at codesynthesis.com  Mon Oct 18 12:25:28 2010
From: boris at codesynthesis.com (Boris Kolpackov)
Date: Mon Oct 18 12:15:02 2010
Subject: [odb-users] Questions and future direction of ODB
In-Reply-To: <4CBB8767.5080904@macrohmasheen.com>
References: <4CBB8767.5080904@macrohmasheen.com>
Message-ID: <boris.20101018170727@codesynthesis.com>

Hi,

raindog <raindog@macrohmasheen.com> writes:

> 1. Why did you choose the GCC plugin system of clang? At this time clang  
> is supposedly a fully featured C++ compiler that conforms to the  
> standard, it seems like clang was designed specifically for these types  
> of projects.

I assume you mean "GCC plugin system over clang" above. There are several
reasons:

1. When we started working on ODB (which was about a year ago), Clang still
   wasn't a fully-conforming C++ compiler. In fact, the project announced
   full conformity only a couple of weeks ago, which still needs to be 
   tested by real-world usage.

2. As far as I know, Clang is not very well supported on Windows. Its
   primary target is OS X with Linux being an "also supported" platform.

3. GCC is a very mature, cross-platform C++ compiler. While Clang claims
   to provide a conforming C++ frontend, I am not sure about the whole 
   compilation system (runtimes, STL, etc, etc). While in most cases
   ODB only needs a C++ frontend, there are situations where one may
   need a complete C++ compiler on which ODB is based. This, for example,
   would be necessary if persistent classes used headers from another
   library which has to be configured for each specific C++ compiler.
   That's the reason, for example, why pre-compiled binaries for ODB
   include working GCC compiler binaries with all the libraries, etc.

If, however, in the future we see that Clang becomes a better choice
for the C++ compiler frontend, it probably won't be very difficult
to switch since ODB builds its own "semantics graph" from the GCC
syntax tree and generated all the code based on that.


> 2. When will support for stored procedures come?

I am not sure which C++ class constructs would be mapped to stored
procedures. Do you have a sample scenario in mind?


> 3. Have you thought about immitating the syntax of LINQ?

We already do something similar with the ODB Query Language, given
the constraints of C++.


> 4. Can't the ODB code generator insert the "friend class odb::access;"  
> in entities that it finds so that the user does not have to do that?  
> Same with the #include <odb/core.hxx>

While the friend declaration can be annoying, I don't think it is a
good idea for ODB to modify hand-written C++ headers. Plus, there is
no guarantee that the header won't be compiled by the native C++
compiler before ODB had a chance to change it.


> 5. Comment: Finally a library that imitates STL lowercase naming!

Yes, we decided to use the standard C++ naming convention.


> 6. Was your goal to have dependencies only on the standard library and  
> the various database library front-ends?

Yes, we tried to keep the external dependencies to a minimum.


> I say this because of your use of std::auto_ptr which makes it difficult
> to store things in STL containers.

By default, the ODB runtime returns objects as ordinary pointers so you
can use any smart pointer. We use auto_ptr in examples to avoid extra
dependencies (tr1, unfortunately, is not yet portable).

Plus, in the text release of ODB you will be able to specify your own
smart pointer so that if you specify, for example, boost::shared_ptr,
then that's what will be returned by the ODB runtime.


> 7. Why does retrieving data from the database require being wrapped in
> a transaction?

Most database systems that support proper transactions start and
terminate a transaction implicitly even if one is not started explicitly.
The drawback of this is that if you have 5 database operations, there
will be 5 transactions, which is wasteful. Plus, with 5 individual
transactions you don't have the consistent read guarantee. So we decided
to keep it simple and always require a transaction.


> 8. Is this used anywhere in production yet?

Yes, ODB is used in a couple of real-world projects that started "beta-
using" it before the official release.


> 9. What about "paging" (returning a subset of results per query based on  
> some page #) support, some databases have support for it, but others  
> like MS SQL don't, can we emulate that with ODB?

If the underlying database doesn't support this, then it will be hard
to emulate in ODB and hope for any kind of efficiency. On the other
hand, if the underlying database supports this, then it should be
possible to use even now with the native query syntax.


> 10. It looks like ODB will do really well for objects that need  
> persisted, but what about for use cases where we just need to run  
> arbitrary queries for say, generating a table and we don't really need  
> persistent objects for that.

You mean DDL queries? If so, then this is on the TODO list.


> 11. The limitation with having only one uncached result set per  
> transaction, is this a design limitation or database limitation?

It is a database limitation that I think is fairly common. Most database
systems have a notion of a connection and it is impossible to read result
sets for several queries over the same connection at the same time.


> 12. Is it possible for ODB to connect to the database to validate that  
> the persistence code one has written is in some way 'correct', in this  
> case, the transient pragma could be removed or automatically determined.

Hm, do you mean that ODB validates that the persistent class declarations
actually correspond to the database schema?


> 13. How are relationships treated? It seems like you could just add a  
> field to the persistable object for the FK of the relationship and then  
> use the load functionality, but in that case it might be quite slow if  
> you have returned a large result set and now you need to load  
> individually all of the foreign entities one by one.

Yes, that is the recommended way to handle this in the meantime. Proper
support for relationships is coming in the next version.


> 14. What is the dev roadmap?

The next release, due mid-to-end-November, will add support for relationships
and containers. In subsequent versions we will be adding support for other
database systems (PostgreSQL, SQLite, Oracle are all in the pipeline) plus
additional features based on developer feedback, just like yours ;-)

Boris

From raindog at macrohmasheen.com  Mon Oct 18 19:56:47 2010
From: raindog at macrohmasheen.com (Raindog)
Date: Mon Oct 18 19:56:52 2010
Subject: [odb-users] Questions and future direction of ODB
In-Reply-To: <boris.20101018170727@codesynthesis.com>
References: <4CBB8767.5080904@macrohmasheen.com>
	<boris.20101018170727@codesynthesis.com>
Message-ID: <4CBCDEBF.6090405@macrohmasheen.com>

On 10/18/2010 9:25 AM, Boris Kolpackov wrote:
> Hi,
>
> raindog<raindog@macrohmasheen.com>  writes:
>
> >  1. Why did you choose the GCC plugin system of clang? At this time clang
> >  is supposedly a fully featured C++ compiler that conforms to the
> >  standard, it seems like clang was designed specifically for these types
> >  of projects.
>
> I assume you mean "GCC plugin system over clang" above. There are several
> reasons:
>
> 1. When we started working on ODB (which was about a year ago), Clang still
>     wasn't a fully-conforming C++ compiler. In fact, the project announced
>     full conformity only a couple of weeks ago, which still needs to be
>     tested by real-world usage.
>
> 2. As far as I know, Clang is not very well supported on Windows. Its
>     primary target is OS X with Linux being an "also supported" platform.
>    
Windows support is being consistently improved. In fact, it compiles 
very easy out of the box now on windows.

> >  2. When will support for stored procedures come?
>
> I am not sure which C++ class constructs would be mapped to stored
> procedures. Do you have a sample scenario in mind?
>    
>>  10. It looks like ODB will do really well for objects that need
>>  persisted, but what about for use cases where we just need to run
>>  arbitrary queries for say, generating a table and we don't really need
>>  persistent objects for that.

>You mean DDL queries? If so, then this is on the TODO list.

Sorry, I meant something like a use case for #2. If I have a query or 
stored procedure that is not just pulling single entities from a 
database, IE i have several joins, etc, and I want to display these 
results in something like an HTML table, it seems like the current 
feature set makes that complex to do.

>
> >  3. Have you thought about immitating the syntax of LINQ?
>
> We already do something similar with the ODB Query Language, given
> the constraints of C++.
>    
If you are being bound to valid syntax of C++, isn't it possible to 
build a DSL on top of something like say, boost.proto similar to how 
boost.spirit is a great way for EBNF in C++

>
> >  4. Can't the ODB code generator insert the "friend class odb::access;"
> >  in entities that it finds so that the user does not have to do that?
> >  Same with the #include<odb/core.hxx>
>
> While the friend declaration can be annoying, I don't think it is a
> good idea for ODB to modify hand-written C++ headers. Plus, there is
> no guarantee that the header won't be compiled by the native C++
> compiler before ODB had a chance to change it.
>
>    
If you already generate code, what harm is there in making it a little 
easier to use by making it harder to forget "boilerplate" code.

> >  6. Was your goal to have dependencies only on the standard library and
> >  the various database library front-ends?
>
> Yes, we tried to keep the external dependencies to a minimum.
>
>
> >  I say this because of your use of std::auto_ptr which makes it difficult
> >  to store things in STL containers.
>
> By default, the ODB runtime returns objects as ordinary pointers so you
> can use any smart pointer. We use auto_ptr in examples to avoid extra
> dependencies (tr1, unfortunately, is not yet portable).
>
> Plus, in the text release of ODB you will be able to specify your own
> smart pointer so that if you specify, for example, boost::shared_ptr,
> then that's what will be returned by the ODB runtime.
>    

Ok, that makes sense, I've just been reading the documentation without 
looking at any of the source to ODB yet.

> >  9. What about "paging" (returning a subset of results per query based on
> >  some page #) support, some databases have support for it, but others
> >  like MS SQL don't, can we emulate that with ODB?
>
> If the underlying database doesn't support this, then it will be hard
> to emulate in ODB and hope for any kind of efficiency. On the other
> hand, if the underlying database supports this, then it should be
> possible to use even now with the native query syntax.
>    
It is possible in MS SQL, it's just not very easy, hence by abstracting 
the difficulty into a feature of the library, it would be much easier to 
use and rely upon.


> >  11. The limitation with having only one uncached result set per
> >  transaction, is this a design limitation or database limitation?
>
> It is a database limitation that I think is fairly common. Most database
> systems have a notion of a connection and it is impossible to read result
> sets for several queries over the same connection at the same time.
>    
Yes, but most support returning several result sets at once.

> >  12. Is it possible for ODB to connect to the database to validate that
> >  the persistence code one has written is in some way 'correct', in this
> >  case, the transient pragma could be removed or automatically determined.
>
> Hm, do you mean that ODB validates that the persistent class declarations
> actually correspond to the database schema?
>    
Yes. This way the user can receive some form of compile time 
verification that his code matches his schema, IE column names, table 
names, etc all match.

> >  14. What is the dev roadmap?
>
> The next release, due mid-to-end-November, will add support for relationships
> and containers. In subsequent versions we will be adding support for other
> database systems (PostgreSQL, SQLite, Oracle are all in the pipeline) plus
> additional features based on developer feedback, just like yours ;-)
>
> Boris
>
>    

From boris at codesynthesis.com  Tue Oct 19 09:50:43 2010
From: boris at codesynthesis.com (Boris Kolpackov)
Date: Tue Oct 19 09:40:11 2010
Subject: [odb-users] Questions and future direction of ODB
In-Reply-To: <4CBCDEBF.6090405@macrohmasheen.com>
References: <4CBB8767.5080904@macrohmasheen.com>
	<boris.20101018170727@codesynthesis.com>
	<4CBCDEBF.6090405@macrohmasheen.com>
Message-ID: <boris.20101019145510@codesynthesis.com>

Hi,

Raindog <raindog@macrohmasheen.com> writes:

> Windows support is being consistently improved. In fact, it compiles  
> very easy out of the box now on windows.

That's good to know, thanks. Just to clarify, is this just the compiler
frontend or the whole toolchain? If it is the toolchain, does it reuse
some existing bits, e.g., from MinGW?


> Sorry, I meant something like a use case for #2. If I have a query or  
> stored procedure that is not just pulling single entities from a  
> database, IE i have several joins, etc, and I want to display these  
> results in something like an HTML table, it seems like the current  
> feature set makes that complex to do.

The "object-oriented database way" of doing it would be to pull individual
objects into the application's memory and then intersect them there using
C++. The advantage of this approach is that it may reduce the database
load since the processing will be done by the application instead of the
database server. I said "may" because it is also possible that this will
increase the database load if the number of returned objects is large.

But it seems that this is not a very popular approach (you are the second
person who asks for support for joins). So we will need to think of a way
to support this. Maybe something like a read-only, "view object":

#pragma view table("table1", "table2")
class test_view
{
  #pragma column("table1.v1")
  string v1;

  #pragma column("table2.v2")
  string v2;
};

db->query<test_view> (query::v1 == query::v2);

What do you think?


> If you are being bound to valid syntax of C++, isn't it possible to  
> build a DSL on top of something like say, boost.proto similar to how  
> boost.spirit is a great way for EBNF in C++

Yes, but which functionality will this provide that is not already
possible with the ODB Query Language? Can you give an example?

Plus, Spirit is nice on toy languages. When you need to handle something
non-trivial, things get very complicated very fast (I used Spirit to
create a compiler for CORBA IDL).


> If you already generate code, what harm is there in making it a little  
> easier to use by making it harder to forget "boilerplate" code.

I am not sure I follow you here. What is important to remember is that
the friend declaration is there not for the ODB compiler but for the
native C++ compiler that will be used to compile the generated code.
So here we would have to modify the hand-written class declarations,
not the generated code.


> Ok, that makes sense, I've just been reading the documentation without  
> looking at any of the source to ODB yet.

Ok, I guess we need to clarify this in the documentation. Added to the
TODO.


> Yes, but most support returning several result sets at once.

Ok, I will need to look into this in more detail for other databases,
but MySQL definitely doesn't support this.


> Yes. This way the user can receive some form of compile time  
> verification that his code matches his schema, IE column names, table  
> names, etc all match.

Hm, I see how this can be useful but I don't see how this can be 
implemented in a manageable way. Ideally the ODB compiler would do 
this checking when compiling the header. Automatically connecting
the database doesn't sound like a good idea. Perhaps we could use
a DDL file that was dumped by the database to perform this 
verification. Would this meet your requirements?

Boris

From raindog at macrohmasheen.com  Tue Oct 19 23:20:08 2010
From: raindog at macrohmasheen.com (Raindog)
Date: Tue Oct 19 23:20:12 2010
Subject: [odb-users] Questions and future direction of ODB
In-Reply-To: <boris.20101019145510@codesynthesis.com>
References: <4CBB8767.5080904@macrohmasheen.com>
	<boris.20101018170727@codesynthesis.com>
	<4CBCDEBF.6090405@macrohmasheen.com>
	<boris.20101019145510@codesynthesis.com>
Message-ID: <4CBE5FE8.1060003@macrohmasheen.com>

On 10/19/2010 6:50 AM, Boris Kolpackov wrote:
> Hi,
>
> Raindog<raindog@macrohmasheen.com>  writes:
>
> >  Windows support is being consistently improved. In fact, it compiles
> >  very easy out of the box now on windows.
>
> That's good to know, thanks. Just to clarify, is this just the compiler
> frontend or the whole toolchain? If it is the toolchain, does it reuse
> some existing bits, e.g., from MinGW?
>    
I'm not exactly sure about that to be honest. I seem to recall them not 
having their own library implementation yet, at least not a complete 
one, and IMO, it would serve them well just to use GCC's
>
> >  Sorry, I meant something like a use case for #2. If I have a query or
> >  stored procedure that is not just pulling single entities from a
> >  database, IE i have several joins, etc, and I want to display these
> >  results in something like an HTML table, it seems like the current
> >  feature set makes that complex to do.
>
> The "object-oriented database way" of doing it would be to pull individual
> objects into the application's memory and then intersect them there using
> C++. The advantage of this approach is that it may reduce the database
> load since the processing will be done by the application instead of the
> database server. I said "may" because it is also possible that this will
> increase the database load if the number of returned objects is large.
>    
I can see how in a website for example, it might reduce load on the DB 
by offsetting it to the front-end, which in general is easier to scale 
than a DB, however for general purpose data processing, this can lead to 
orders of magnitude slower performance due to the latency in retrieving 
data from the database, just think of how slow this approach can be if 
the database is not on the same machine as the application making the DB 
requests.

> But it seems that this is not a very popular approach (you are the second
> person who asks for support for joins). So we will need to think of a way
> to support this. Maybe something like a read-only, "view object":
>
> #pragma view table("table1", "table2")
> class test_view
> {
>    #pragma column("table1.v1")
>    string v1;
>
>    #pragma column("table2.v2")
>    string v2;
> };
>
> db->query<test_view>  (query::v1 == query::v2);
>
> What do you think?
>    
I think defining a structure with which one can store results is 
perfectly acceptable, although many times I think that just using a 
Boost.Tuple is appropriate. Where something like this:

       result r ("select a,b,c from t");

       for (result::iterator i (r.begin ()); i != r.end (); ++i)
       {
         cout << "a: " << boost::at<0>(*i) << " b: " boost::at<1>(*i) << 
" c: " << boost::at<2>(*i) << endl;
       }

Where the result::iterator dereferences to, for example, a 
boost::tuple<x,x,x>;

Also, I can understand not wanting to take external dependencies where 
possible, but I think when it makes sense, taking a dependency on boost 
can save a lot of headache if you are instead faced with implementing 
something already provided by boost. You may otherwise get in same 
scenario that STL programmers face when trying to use MFC.


>
> >  If you are being bound to valid syntax of C++, isn't it possible to
> >  build a DSL on top of something like say, boost.proto similar to how
> >  boost.spirit is a great way for EBNF in C++
>
> Yes, but which functionality will this provide that is not already
> possible with the ODB Query Language? Can you give an example?
>    
My suggestion there was not about providing additional features to ODB 
QL, but rather suggested that the compiler plugin might have been 
unneeded altogether.

> Plus, Spirit is nice on toy languages. When you need to handle something
> non-trivial, things get very complicated very fast (I used Spirit to
> create a compiler for CORBA IDL).
>    
Someone wrote C++ parser called scalpel with Spirit. They recently 
abandoned the project however due to inability to compete with Clang due 
to the project being only 1 person, and Recently Joel wrote a lisp or 
scheme compiler in Spirit.
>
> >  If you already generate code, what harm is there in making it a little
> >  easier to use by making it harder to forget "boilerplate" code.
>
> I am not sure I follow you here. What is important to remember is that
> the friend declaration is there not for the ODB compiler but for the
> native C++ compiler that will be used to compile the generated code.
> So here we would have to modify the hand-written class declarations,
> not the generated code.
>
>    
Right. Which is exactly the point I am saying is not needed. If you are 
generating code, your code generator can easily determine that you are 
missing the include, or missing the friend declaration. The fact that it 
is needed by the generated code is an implementation detail that the 
client of ODB should not need to worry about.

> >  Ok, that makes sense, I've just been reading the documentation without
> >  looking at any of the source to ODB yet.
>
> Ok, I guess we need to clarify this in the documentation. Added to the
> TODO.
>
>
> >  Yes, but most support returning several result sets at once.
>
> Ok, I will need to look into this in more detail for other databases,
> but MySQL definitely doesn't support this.
>    
In MS SQL a stored procedure can return several result sets. I think 
that pgsql has the same ability.
>
> >  Yes. This way the user can receive some form of compile time
> >  verification that his code matches his schema, IE column names, table
> >  names, etc all match.
>
> Hm, I see how this can be useful but I don't see how this can be
> implemented in a manageable way. Ideally the ODB compiler would do
> this checking when compiling the header. Automatically connecting
> the database doesn't sound like a good idea. Perhaps we could use
> a DDL file that was dumped by the database to perform this
> verification. Would this meet your requirements?
>    
I think so, but IMO that would be more complex to parse than just 
connecting to the database as you will be getting a lot more data than 
needed, plus, each DB implementation will have slightly different 
formats. I think an alternative is to connect to the database, gather 
only the data you need and store it for later use. IMO lacking this 
means to validate ones code would prevent a lot of people from using ODB 
as it could potentially become just too complicated to debug all your 
table/column name mismatches and in cases where you do not have 100% 
code coverage or testing, you might find out about these errors days or 
weeks down the line.


Maybe I am wrong, but it seems you are really limiting the scope, and 
therefore overall utility of ODB by not automating as much as possible. 
IMO what the C++ community is lacking is something that has 
feature-parity of something like Hibernate, LINQ, Entity Framework or 
ActiveRecord. Data access in C++ is still more complex today than it was 
in C# 1.0 when all you had was bare bones ADO.NET. I envision your 
approach with ODB as the way to give C++ programmers a path towards 
being able to easily solve the kinds of problems that ActiveRecord and 
Hibernate are able to solve, but at this point, C++ programmers can't 
even begin to think about something like ActiveRecord++ because there 
still isn't any solid foundation upon which one can build.

> Boris
>
>    

From boris at codesynthesis.com  Wed Oct 20 10:07:01 2010
From: boris at codesynthesis.com (Boris Kolpackov)
Date: Wed Oct 20 09:56:22 2010
Subject: [odb-users] Questions and future direction of ODB
In-Reply-To: <4CBE5FE8.1060003@macrohmasheen.com>
References: <4CBB8767.5080904@macrohmasheen.com>
	<boris.20101018170727@codesynthesis.com>
	<4CBCDEBF.6090405@macrohmasheen.com>
	<boris.20101019145510@codesynthesis.com>
	<4CBE5FE8.1060003@macrohmasheen.com>
Message-ID: <boris.20101020154802@codesynthesis.com>

Hi,

Raindog <raindog@macrohmasheen.com> writes:
    
> Right. Which is exactly the point I am saying is not needed. If you are  
> generating code, your code generator can easily determine that you are  
> missing the include, or missing the friend declaration.

Yes, we can detect this. And, perhaps, issue a warning/error.


> The fact that it is needed by the generated code is an implementation 
> detail that the client of ODB should not need to worry about.

I still think we are not on the same page here. The friend declaration
is needed by the generated code so that it can be compiled by the native
compiler. If it were possible to put some magic in the generated code to
make the friend declaration unnecessary, I would be the first person to
advocate for it.

At the moment I don't know any way other than having this declaration
in the hand-written header. And automatically modifying it seems like
too brittle an approach to me.

Or do you see any other way?


> In MS SQL a stored procedure can return several result sets. I think  
> that pgsql has the same ability.

Well, that's not what we need. We need the ability to multiplex result
sets from multiple queries that were initiated at different times.


> I think so, but IMO that would be more complex to parse than just  
> connecting to the database as you will be getting a lot more data than  
> needed, plus, each DB implementation will have slightly different  
> formats.

Agreed.


> I think an alternative is to connect to the database, gather only the
> data you need and store it for later use. 

Yes, that's a pretty good idea. As long as we don't have to have this
in the ODB compiler. I think a separate client for each database system
will be a reasonable approach.


> Maybe I am wrong, but it seems you are really limiting the scope, and  
> therefore overall utility of ODB by not automating as much as possible.

Of course the goal is to automate as much as possible. The difficult
part is to find a way to automate things so that we won't bring more 
problems than we solve. Plus we have to work in the constraints of C++
and performance expectations of C++ developers.

Boris

From amr.ali.cc at gmail.com  Sat Oct 30 14:39:54 2010
From: amr.ali.cc at gmail.com (Amr Ali)
Date: Sat Oct 30 15:44:08 2010
Subject: [odb-users] [BUG] Missing paths in Automake build system files
Message-ID: <4CCC667A.9000700@gmail.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I cloned your ODB ORM git repository into my project and added it as a
submodule. I tried the following steps to compile it ...

$ ./bootstrap
libtoolize: putting auxiliary files in AC_CONFIG_AUX_DIR, `config'.
libtoolize: copying file `config/config.guess'
libtoolize: copying file `config/config.sub'
libtoolize: copying file `config/install-sh'
libtoolize: copying file `config/ltmain.sh'
libtoolize: putting macros in AC_CONFIG_MACRO_DIR, `m4'.
libtoolize: copying file `m4/libtool.m4'
libtoolize: copying file `m4/ltoptions.m4'
libtoolize: copying file `m4/ltsugar.m4'
libtoolize: copying file `m4/ltversion.m4'
libtoolize: copying file `m4/lt~obsolete.m4'
configure.ac:12: installing `config/missing'
automake: no `Makefile.am' found for any configure output
automake: Did you forget AC_CONFIG_FILES([Makefile]) in configure.ac?
autoreconf: automake failed with exit status: 1

After facing this error, which after I looked in the file in question
(configure.ac) I found that most of the paths/directories/files are replaced
with a rather weird `__file__' or `__path__'.

I downloaded your TAR packages from your website and compared the build system
files in general and found out that this is a general phenomena. I'm rather
confused as of why you did this, as after I gone through your various versions
for the different packages (ex. odb, libodb, libodb-mysql, etc.) I found it to
be everywhere.

Example diff:
7c7
< AC_INIT([libodb], [1.0.0], [odb-users@codesynthesis.com])
- ---
> AC_INIT([libodb], [__value__(version)], [odb-users@codesynthesis.com])
56c56
< AC_CONFIG_FILES([libodb.pc Makefile odb/Makefile])
- ---
> AC_CONFIG_FILES([__path__(config_files)])
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)

iEYEARECAAYFAkzMZnoACgkQ2VxGY2VcpoiGjwCfdVEvcg5CGd4BV4t9w0u4ioUe
8iQAniI/yZtddfeZiyFlV+IGpz/lZqWc
=leZb
-----END PGP SIGNATURE-----

From boris at codesynthesis.com  Sun Oct 31 11:53:12 2010
From: boris at codesynthesis.com (Boris Kolpackov)
Date: Sun Oct 31 11:45:20 2010
Subject: [odb-users] [BUG] Missing paths in Automake build system files
In-Reply-To: <4CCC667A.9000700@gmail.com>
References: <4CCC667A.9000700@gmail.com>
Message-ID: <boris.20101031174038@codesynthesis.com>

Hi Amr,

Amr Ali <amr.ali.cc@gmail.com> writes:

> I cloned your ODB ORM git repository into my project and added it as a
> submodule. I tried the following steps to compile it ...
> 
> [...]
> 
> After facing this error, which after I looked in the file in question
> (configure.ac) I found that most of the paths/directories/files are replaced
> with a rather weird `__file__' or `__path__'.

The source code from the repository does not contain autotools-based makefiles
(or Visual Studio project files). Rather, it contains templates for these
files as well as its own, custom build system. This build system provides
some of the features that autotools lack (e.g., non-recursive, multi-makefile
architectures, cross-project dependency tracking, massively-parallel builds,
etc). At the same time it is not as portable as autotools.

As a result, the custom build system is used internally for development as 
well as to automatically generate the autotools (and Visual Studio) files.

One drawback of this setup, as you have discovered, is that using the source
code from the repository is not as easy as bootstrap, configure, and make.
I see two ways to work around this:

1. Use the custom build system to build everything or to create the source
   distributions with the autotools build system. I have added the
   INSTALL-GIT file to each package that describes the necessary steps.

2. We can start creating source distribution snapshots that will contain
   the latest source code from the repository along with the bootstrapped
   autotools build system.

Let me know which approach will work best for you.

Boris