[odb-users] Object Loading Views with Containers / Alternatives?

Fri Apr 19 10:35:53 EDT 2024

Raphael Palefsky-Smith <me at raphieps.com> writes:

> #pragma db object
> class employer {
>     unsigned long id;
>     string address;
>     odb::vector<shared_ptr<employee>> employees; // non-lazy, need at
> initial load
> };
> 
> #pragma db object
> class employee {
>     unsigned long id;
>     string name;
>     odb::vector<shared_ptr<pet>> pets; // non-lazy, need at initial load
> };
> 
> #pragma db object
> class pet {
>     unsigned long id;
>     string nickname;
> };
> 
> For a given Employer ID, I'd like to load all the associated Employees and
> all their Pets (along with the non-relational data members on those
> Employer/Employees/Pets). It would be great to perform this load all in one
> shot with a single JOIN'd SELECT; the naive db.load() executes a ton of
> individual statements and is far too slow. It seems like Object Loading
> Views are designed exactly for this purpose, but I can't seem to get them
> working with containers.
> 
> My object loading view looks like this:
> 
> #pragma db view object(employer) object(employee) object(pet)
> struct employer_with_employees_and_pets {
>     shared_ptr<employer> e;
> };
> 
> When I query using this view, the stderr_tracer reports that individual
> SELECTs are still being executed for each employee and pet instance. Is
> there a way to modify the view so that it runs with a single SELECT?
> 
> If Object Loading Views are incompatible here, is there a workaround, even
> if it involves hand-writing a bunch of SQL? Performance is more important
> than source-code elegance in this specific case.

There is really no way to do this perfectly from the efficiency POV in
an SQL database: You want to receive 1 row from the employer table, N
rows from the employee table, and NxM groups of rows from the pet table.
While SELECT can only return one or more uniform rows.

In SQL the best you can do is probably join all three tables and, for
each pet return duplicate employee data and duplicate employer data.
I say probably because whether it is the most efficient way may depend
on the database and the data used. For example, if your employer object
contains a bunch of large BLOBs and you use an in-process database like
SQLite with fast queries, than I won't be surprised if performing a
bunch of SELECTs with small results is actually faster than a single
SELECT with a large number of duplicated data.

Now, let's say you've decided you want a single SELECT. The way I would
try to map it to an ODB view is as follows (see the manual for details
on the by-value loading):

#pragma db view object(employer) object(employee) object(pet)
struct employer_with_employees_and_pets
{
  shared_ptr<employer> ee;
  employee er;
  pet pt;
};

I would then use the following two tricks:

1. I would use a session so that only the first employer object
   (from a bunch of duplicates) is actually instantiated.

2. I would put the containers in employer and employee into lazy-
   loaded sections so that they are not automatically loaded
   when creating the employer object. Instead I would populate
   these containers from the data returned in the view manually.