From boris at codesynthesis.com  Mon Apr  1 04:57:46 2024
From: boris at codesynthesis.com (Boris Kolpackov)
Date: Mon Apr  1 04:48:11 2024
Subject: [odb-users] Re: How to perform an aggregate subquery
In-Reply-To: <4F874EAC-04FE-472E-9E16-23AC177F10CB@gmail.com>
References: <4F874EAC-04FE-472E-9E16-23AC177F10CB@gmail.com>
Message-ID: <boris.20240401105527@codesynthesis.com>

Aldo Laiseca <alaiseca@gmail.com> writes:

> I need to find the most recent T1 (based on field timestamp_action) having a foreign key to T2 matching a particular value of T2.name. In SQL, something like this: 
> 
> select t1.* from t1
>   where t1.timestamp_action = (select max(t1.timestamp_action)
>                                  from t1 join t2 on t1.id_t2 = t2.id
>                                  where t2.name = ?)
> 
> 
> How could I write such a query using ODB?

By using a view:

https://www.codesynthesis.com/products/odb/doc/manual.xhtml#10

Generally, a lot of "how do I do X in ODB" can be answered by at least
skimming through the documentation once so that you have an idea of the
functionality available.

From alaiseca at gmail.com  Tue Apr  2 18:54:50 2024
From: alaiseca at gmail.com (Aldo Laiseca)
Date: Fri Apr  5 08:31:42 2024
Subject: [odb-users] Re: How to perform an aggregate subquery
In-Reply-To: <boris.20240401105527@codesynthesis.com>
References: <4F874EAC-04FE-472E-9E16-23AC177F10CB@gmail.com>
	<boris.20240401105527@codesynthesis.com>
Message-ID: <F8D98861-D04E-45D6-A5EE-2CC1C16A6EB1@gmail.com>

Yes, actually I had figured out that a view was the approach to follow. I declared this: 

#pragma db view object(T1) object(T2)
struct ViewT1T2 {

    #pragma db column("max(" + T1::timestamp_action + ")")
    #pragma db type("TIMESTAMP")
    boost::posix_time::ptime maxTimestamp;
};

However, after that, I don't know how to recover the entire T1 object having the max value. Right now I solved my requirement in two steps:

1. Execute a query_value against the view to grab the maximum timestamp value.
2. Execute a query_one against the object T1 to get the row matching the value obtained in step 1.

Is there any way to do the above in one step? I read about object loading views; however, I don't see how to include a max condition in a query in an object loading view.

Thanks



From finjulhich at gmail.com  Mon Apr  8 21:40:57 2024
From: finjulhich at gmail.com (MM)
Date: Mon Apr  8 21:31:41 2024
Subject: [odb-users] Re: How to perform an aggregate subquery
Message-ID: <CA+Jb2D2bw12izbFROKTFnX=f+idf=JK4arLJv=+rA=Vk9aPWaA@mail.gmail.com>

On Fri, 5 Apr 2024 at 13:41, Aldo Laiseca <alaiseca@gmail.com> wrote:

> Yes, actually I had figured out that a view was the approach to follow. I
> declared this:
>
> #pragma db view object(T1) object(T2)
> struct ViewT1T2 {
>
>     #pragma db column("max(" + T1::timestamp_action + ")")
>     #pragma db type("TIMESTAMP")
>     boost::posix_time::ptime maxTimestamp;
> };
>
> However, after that, I don't know how to recover the entire T1 object
> having the max value. Right now I solved my requirement in two steps:
>
> 1. Execute a query_value against the view to grab the maximum timestamp
> value.
> 2. Execute a query_one against the object T1 to get the row matching the
> value obtained in step 1.
>
> Is there any way to do the above in one step? I read about object loading
> views; however, I don't see how to include a max condition in a query in an
> object loading view.
>
> Thanks
>
>
I have the same question here

class B {
date begin;
date end;
};

class A {
  std::vector<B>  bs_;
};

Both A and B are persistent. B's table has an object-id column that refers
to A, plus an index column and value_* columns.

How to write the view to pick up the B row that matches the max index of
the vector?
Is the max() aggregation expressible neatly in a view?

Rds,
From boris at codesynthesis.com  Tue Apr  9 23:40:26 2024
From: boris at codesynthesis.com (Boris Kolpackov)
Date: Tue Apr  9 23:30:54 2024
Subject: [odb-users] Re: How to perform an aggregate subquery
In-Reply-To: <CA+Jb2D2bw12izbFROKTFnX=f+idf=JK4arLJv=+rA=Vk9aPWaA@mail.gmail.com>
References: <CA+Jb2D2bw12izbFROKTFnX=f+idf=JK4arLJv=+rA=Vk9aPWaA@mail.gmail.com>
Message-ID: <boris.20240410053453@codesynthesis.com>

MM <finjulhich@gmail.com> writes:
 
> class B {
> date begin;
> date end;
> };
> 
> class A {
>   std::vector<B>  bs_;
> };
>  
> How to write the view to pick up the B row that matches the max index of
> the vector?
> Is the max() aggregation expressible neatly in a view?

ODB currently does not support containers in views (with the
exception of inverse containers that establish relationships).
So to accomplish this you will need to use a table (or native)
view over the container's underlying table.

From boris at codesynthesis.com  Wed Apr 10 00:02:46 2024
From: boris at codesynthesis.com (Boris Kolpackov)
Date: Tue Apr  9 23:53:15 2024
Subject: [odb-users] Re: How to perform an aggregate subquery
In-Reply-To: <F8D98861-D04E-45D6-A5EE-2CC1C16A6EB1@gmail.com>
References: <4F874EAC-04FE-472E-9E16-23AC177F10CB@gmail.com>
	<boris.20240401105527@codesynthesis.com>
	<F8D98861-D04E-45D6-A5EE-2CC1C16A6EB1@gmail.com>
Message-ID: <boris.20240410055626@codesynthesis.com>

Aldo Laiseca <alaiseca@gmail.com> writes:

> Yes, actually I had figured out that a view was the approach to
> follow. I declared this: 
> 
> #pragma db view object(T1) object(T2)
> struct ViewT1T2 {
> 
>     #pragma db column("max(" + T1::timestamp_action + ")")
>     #pragma db type("TIMESTAMP")
>     boost::posix_time::ptime maxTimestamp;
> };
> 
> However, after that, I don't know how to recover the entire T1
> object having the max value. Right now I solved my requirement
> in two steps:
> 
> 1. Execute a query_value against the view to grab the maximum
>    timestamp value.
> 2. Execute a query_one against the object T1 to get the row
>    matching the value obtained in step 1.
> 
> Is there any way to do the above in one step? I read about object loading
> views; however, I don't see how to include a max condition in a query in
> an object loading view.

The first question that you need to answer is how to do this in SQL.
Then try to map this answer to an appropriate ODB view. In SQL you
cannot just say `SELECT ... WHERE max(timestamp_action)` (well, maybe
there is support for something like this in the specific database that
you are using).
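
One way the advice above could play out, offered as a sketch rather than
an answer confirmed anywhere in this thread: in SQL, "the most recent row"
can also be written as ORDER BY ... DESC LIMIT 1, and that form needs no
aggregate in the WHERE clause. The declarations below assume that T1
refers to T2 through an object pointer (so ODB can join the two), use a
simplified integer timestamp instead of boost::posix_time::ptime, and
name the view latest_t1; all three are assumptions. LIMIT is not
available in every database. The header compiles as plain C++ because a
regular compiler ignores the db pragmas, but the query in the trailing
comment needs the ODB-generated code and a transaction.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-ins for the poster's classes.
#pragma db object
struct T2
{
  #pragma db id
  unsigned long id;

  std::string name;
};

#pragma db object
struct T1
{
  #pragma db id
  unsigned long id;

  // Assumed to be an object pointer so that ODB can join T1 and T2.
  #pragma db column("id_t2")
  std::shared_ptr<T2> t2;

  // Simplified; the original uses boost::posix_time::ptime.
  long long timestamp_action;
};

// Object loading view: each result row carries a complete T1 object.
#pragma db view object(T1) object(T2)
struct latest_t1
{
  std::shared_ptr<T1> t1;
};

// With the ODB-generated code for this header, and inside a transaction,
// the lookup might read as follows (name is the T2 name to match):
//
//   typedef odb::query<latest_t1> query;
//   typedef odb::result<latest_t1> result;
//
//   result r (db.query<latest_t1> (
//     (query::T2::name == name) +
//     "ORDER BY" + query::T1::timestamp_action + "DESC LIMIT 1"));
//
//   result::iterator i (r.begin ());
//   if (i != r.end ())
//   {
//     // i->t1 points to the most recent matching T1.
//   }
```

Unlike the original subquery, this returns a single row even if several
T1 rows share the maximum timestamp.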

From me at raphieps.com  Sat Apr 13 15:13:44 2024
From: me at raphieps.com (Raphael Palefsky-Smith)
Date: Sat Apr 13 15:04:26 2024
Subject: [odb-users] Object Loading Views with Containers / Alternatives?
Message-ID: <CADwJ639f2m0VDDXaBYngUYv6D+F9MywusPh5dtmOvtKbOVSMoQ@mail.gmail.com>

Hello! When it first starts up, my application loads a top-level object
with a deep hierarchy of related objects stored in odb::vectors.

For example, say I have an Employer that has many Employees, who each have
many Pets. These are all stored in odb::vectors, though I'm open to using
other schemes to make this work. Some pseudocode:

#pragma db object
class employer {
    unsigned long id;
    string address;
    odb::vector<shared_ptr<employee>> employees; // non-lazy, need at initial load
};

#pragma db object
class employee {
    unsigned long id;
    string name;
    odb::vector<shared_ptr<pet>> pets; // non-lazy, need at initial load
};

#pragma db object
class pet {
    unsigned long id;
    string nickname;
};

For a given Employer ID, I'd like to load all the associated Employees and
all their Pets (along with the non-relational data members on those
Employer/Employees/Pets). It would be great to perform this load all in one
shot with a single JOIN'd SELECT; the naive db.load() executes a ton of
individual statements and is far too slow. It seems like Object Loading
Views are designed exactly for this purpose, but I can't seem to get them
working with containers.

My object loading view looks like this:

#pragma db view object(employer) object(employee) object(pet)
struct employer_with_employees_and_pets {
    shared_ptr<employer> e;
};

When I query using this view, the stderr_tracer reports that individual
SELECTs are still being executed for each employee and pet instance. Is
there a way to modify the view so that it runs with a single SELECT?

If Object Loading Views are incompatible here, is there a workaround, even
if it involves hand-writing a bunch of SQL? Performance is more important
than source-code elegance in this specific case.

Thank you!
From boris at codesynthesis.com  Fri Apr 19 10:35:53 2024
From: boris at codesynthesis.com (Boris Kolpackov)
Date: Fri Apr 19 10:26:10 2024
Subject: [odb-users] Object Loading Views with Containers / Alternatives?
In-Reply-To: <CADwJ639f2m0VDDXaBYngUYv6D+F9MywusPh5dtmOvtKbOVSMoQ@mail.gmail.com>
References: <CADwJ639f2m0VDDXaBYngUYv6D+F9MywusPh5dtmOvtKbOVSMoQ@mail.gmail.com>
Message-ID: <boris.20240419161819@codesynthesis.com>

Raphael Palefsky-Smith <me@raphieps.com> writes:

> #pragma db object
> class employer {
>     unsigned long id;
>     string address;
>     odb::vector<shared_ptr<employee>> employees; // non-lazy, need at initial load
> };
> 
> #pragma db object
> class employee {
>     unsigned long id;
>     string name;
>     odb::vector<shared_ptr<pet>> pets; // non-lazy, need at initial load
> };
> 
> #pragma db object
> class pet {
>     unsigned long id;
>     string nickname;
> };
> 
> For a given Employer ID, I'd like to load all the associated Employees and
> all their Pets (along with the non-relational data members on those
> Employer/Employees/Pets). It would be great to perform this load all in one
> shot with a single JOIN'd SELECT; the naive db.load() executes a ton of
> individual statements and is far too slow. It seems like Object Loading
> Views are designed exactly for this purpose, but I can't seem to get them
> working with containers.
> 
> My object loading view looks like this:
> 
> #pragma db view object(employer) object(employee) object(pet)
> struct employer_with_employees_and_pets {
>     shared_ptr<employer> e;
> };
> 
> When I query using this view, the stderr_tracer reports that individual
> SELECTs are still being executed for each employee and pet instance. Is
> there a way to modify the view so that it runs with a single SELECT?
> 
> If Object Loading Views are incompatible here, is there a workaround, even
> if it involves hand-writing a bunch of SQL? Performance is more important
> than source-code elegance in this specific case.

There is really no way to do this perfectly from the efficiency POV in
an SQL database: you want to receive 1 row from the employer table, N
rows from the employee table, and N groups of M rows from the pet table,
while a SELECT can only return a set of uniform rows.

In SQL the best you can do is probably join all three tables and, for
each pet, return duplicate employee data and duplicate employer data.
I say probably because whether it is the most efficient way may depend
on the database and the data used. For example, if your employer object
contains a bunch of large BLOBs and you use an in-process database like
SQLite with fast queries, then I won't be surprised if performing a
bunch of SELECTs with small results is actually faster than a single
SELECT with a large amount of duplicated data.

Now, let's say you've decided you want a single SELECT. The way I would
try to map it to an ODB view is as follows (see the manual for details
on the by-value loading):

#pragma db view object(employer) object(employee) object(pet)
struct employer_with_employees_and_pets
{
  shared_ptr<employer> ee;
  employee er;
  pet pt;
};

I would then use the following two tricks:

1. I would use a session so that only the first employer object
   (from a bunch of duplicates) is actually instantiated.

2. I would put the containers in employer and employee into lazy-
   loaded sections so that they are not automatically loaded
   when creating the employer object. Instead I would populate
   these containers from the data returned in the view manually.
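
The manual population step can be sketched without ODB at all. The code
below is an illustration under stated assumptions, not the generated ODB
code: joined_row is a hypothetical flattened row of the three-table join
(one row per pet), the classes are plain stand-ins for the persistent
ones, and a std::map plays the role the session would play in
de-duplicating employee instances. It also assumes that every row
carries a pet and that all rows belong to the same employer.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Plain stand-ins for the persistent classes (no ODB needed to compile).
struct pet
{
  unsigned long id;
  std::string nickname;
};

struct employee
{
  unsigned long id;
  std::string name;
  std::vector<std::shared_ptr<pet>> pets;
};

struct employer
{
  unsigned long id;
  std::string address;
  std::vector<std::shared_ptr<employee>> employees;
};

// Hypothetical flattened row of the three-table join: one row per pet,
// with the employee and employer columns repeated.
struct joined_row
{
  unsigned long employer_id; std::string address;
  unsigned long employee_id; std::string name;
  unsigned long pet_id;      std::string nickname;
};

// Rebuild the hierarchy for a single employer from the joined rows.
std::shared_ptr<employer>
fold (const std::vector<joined_row>& rows)
{
  std::shared_ptr<employer> er;

  // Plays the role of the session: one instance per employee id.
  std::map<unsigned long, std::shared_ptr<employee>> seen;

  for (const joined_row& r : rows)
  {
    if (!er)
      er = std::make_shared<employer> (employer {r.employer_id, r.address, {}});

    assert (er->id == r.employer_id); // All rows must belong to one employer.

    std::shared_ptr<employee>& ee (seen[r.employee_id]);
    if (!ee)
    {
      // First row for this employee: create it and attach it once.
      ee = std::make_shared<employee> (employee {r.employee_id, r.name, {}});
      er->employees.push_back (ee);
    }

    ee->pets.push_back (std::make_shared<pet> (pet {r.pet_id, r.nickname}));
  }

  return er; // Null if there were no rows.
}
```

Employees without pets would need a LEFT JOIN and a check for missing
pet columns, which this sketch leaves out.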

From me at raphieps.com  Fri Apr 19 17:09:20 2024
From: me at raphieps.com (Raphael Palefsky-Smith)
Date: Fri Apr 19 17:00:02 2024
Subject: [odb-users] Object Loading Views with Containers / Alternatives?
In-Reply-To: <boris.20240419161819@codesynthesis.com>
References: <CADwJ639f2m0VDDXaBYngUYv6D+F9MywusPh5dtmOvtKbOVSMoQ@mail.gmail.com>
	<boris.20240419161819@codesynthesis.com>
Message-ID: <CADwJ639h3sdLFnNmw8+ck6sGMHEvoYwzuy0J5wxroezoHu+Zzg@mail.gmail.com>

Hi Boris - many thanks for your really thoughtful reply!

Your two tricks make total sense - utilizing the session for de-duplication
and manually populating lazy-sectioned containers is brilliant, and I'm
already seeing a ~10x reduction in my load times from using these two
methods.

The part I'm a little confused by is the by-value loading, even after
reading the docs. Could you explain what the difference between these three
options would be? Assuming the same db pragmas as before:

struct option_a {
  shared_ptr<employer> ee;
  employee er;
  pet pt;
};

struct option_b {
  shared_ptr<employer> ee;
  shared_ptr<employee> er;
  pet pt;
};

struct option_c {
  shared_ptr<employer> ee;
  shared_ptr<employee> er;
  shared_ptr<pet> pt;
};

I'm using option_c (put a shared_ptr on everything in the view) to get that
reduction in load time, but I'm curious how the internal logic changes
between the three. Even if the other options don't yield higher
performance, I'd love to understand a little more of what's going on under
the hood. Thanks again!

From boris at codesynthesis.com  Mon Apr 22 10:34:33 2024
From: boris at codesynthesis.com (Boris Kolpackov)
Date: Mon Apr 22 10:24:49 2024
Subject: [odb-users] Object Loading Views with Containers / Alternatives?
In-Reply-To: <CADwJ639h3sdLFnNmw8+ck6sGMHEvoYwzuy0J5wxroezoHu+Zzg@mail.gmail.com>
References: <CADwJ639f2m0VDDXaBYngUYv6D+F9MywusPh5dtmOvtKbOVSMoQ@mail.gmail.com>
	<boris.20240419161819@codesynthesis.com>
	<CADwJ639h3sdLFnNmw8+ck6sGMHEvoYwzuy0J5wxroezoHu+Zzg@mail.gmail.com>
Message-ID: <boris.20240422163030@codesynthesis.com>

Raphael Palefsky-Smith <me@raphieps.com> writes:

> The part I'm a little confused by is the by-value loading, even after
> reading the docs.

Yes, I don't know what made me think we need the by-value loading. It
would have been beneficial (to avoid dynamic memory allocations) if
the containers we wanted to populate manually stored their elements by
value, but that's clearly not the case here. My only excuse is that it
was Fri ;-).

From me at raphieps.com  Wed Apr 24 11:22:06 2024
From: me at raphieps.com (Raphael Palefsky-Smith)
Date: Wed Apr 24 11:12:46 2024
Subject: [odb-users] Object Loading Views with Containers / Alternatives?
In-Reply-To: <boris.20240422163030@codesynthesis.com>
References: <CADwJ639f2m0VDDXaBYngUYv6D+F9MywusPh5dtmOvtKbOVSMoQ@mail.gmail.com>
	<boris.20240419161819@codesynthesis.com>
	<CADwJ639h3sdLFnNmw8+ck6sGMHEvoYwzuy0J5wxroezoHu+Zzg@mail.gmail.com>
	<boris.20240422163030@codesynthesis.com>
Message-ID: <CADwJ638sqDzY=Vpz_06C2+adbvsv95S2gSyDtGALUta3c5TCdw@mail.gmail.com>

The Friday brain! A phenomenon I know all too well. Awesome to hear the
shared_ptr approach is correct and I'll carry onward. Thanks again for the
great explanation!
