[odb-users] Object Loading Views with Containers / Alternatives?

Fri Apr 19 17:09:20 EDT 2024

Hi Boris - many thanks for your really thoughtful reply!

Your two tricks make total sense - utilizing the session for de-duplication
and manually populating lazy-sectioned containers is brilliant, and I'm
already seeing a ~10x reduction in my load times from using these two
methods.

The part I'm a little confused by is the by-value loading, even after
reading the docs. Could you explain what the difference between these three
options would be? Assuming the same db pragmas as before:

struct option_a {
  shared_ptr<employer> ee;
  employee er;
  pet pt;
};

struct option_b {
  shared_ptr<employer> ee;
  shared_ptr<employee> er;
  pet pt;
};

struct option_c {
  shared_ptr<employer> ee;
  shared_ptr<employee> er;
  shared_ptr<pet> pt;
};

I'm using option_c (put a shared_ptr on everything in the view) to get that
reduction in load time, but I'm curious how the internal logic changes
between the three. Even if the other options don't yield higher
performance, I'd love to understand a little more of what's going on under
the hood. Thanks again!

On Fri, Apr 19, 2024 at 10:35 AM Boris Kolpackov <boris at codesynthesis.com>
wrote:

> Raphael Palefsky-Smith <me at raphieps.com> writes:
>
> > #pragma db object
> > class employer {
> >     unsigned long id;
> >     string address;
> >     odb::vector<shared_ptr<employee>> employees; // non-lazy, need at
> > initial load
> > };
> >
> > #pragma db object
> > class employee {
> >     unsigned long id;
> >     string name;
> >     odb::vector<shared_ptr<pet>> pets; // non-lazy, need at initial load
> > };
> >
> > #pragma db object
> > class pet {
> >     unsigned long id;
> >     string nickname;
> > };
> >
> > For a given Employer ID, I'd like to load all the associated Employees
> and
> > all their Pets (along with the non-relational data members on those
> > Employer/Employees/Pets). It would be great to perform this load all in
> one
> > shot with a single JOIN'd SELECT; the naive db.load() executes a ton of
> > individual statements and is far too slow. It seems like Object Loading
> > Views are designed exactly for this purpose, but I can't seem to get them
> > working with containers.
> >
> > My object loading view looks like this:
> >
> > #pragma db view object(employer) object(employee) object(pet)
> > struct employer_with_employees_and_pets {
> >     shared_ptr<employer> e;
> > };
> >
> > When I query using this view, the stderr_tracer reports that individual
> > SELECTs are still being executed for each employee and pet instance. Is
> > there a way to modify the view so that it runs with a single SELECT?
> >
> > If Object Loading Views are incompatible here, is there a workaround,
> even
> > if it involves hand-writing a bunch of SQL? Performance is more important
> > than source-code elegance in this specific case.
>
> There is really no way to do this perfectly from the efficiency POV in
> an SQL database: You want to receive 1 row from the employer table, N
> rows from the employee table, and NxM groups of rows from the pet table.
> While SELECT can only return one or more uniform rows.
>
> In SQL the best you can do is probably join all three tables and, for
> each pet return duplicate employee data and duplicate employer data.
> I say probably because whether it is the most efficient way may depend
> on the database and the data used. For example, if your employer object
> contains a bunch of large BLOBs and you use an in-process database like
> SQLite with fast queries, than I won't be surprised if performing a
> bunch of SELECTs with small results is actually faster than a single
> SELECT with a large number of duplicated data.
>
> Now, let's say you've decided you want a single SELECT. The way I would
> try to map it to an ODB view is as follows (see the manual for details
> on the by-value loading):
>
> #pragma db view object(employer) object(employee) object(pet)
> struct employer_with_employees_and_pets
> {
>   shared_ptr<employer> ee;
>   employee er;
>   pet pt;
> };
>
> I would then use the following two tricks:
>
> 1. I would use a session so that only the first employer object
>    (from a bunch of duplicates) is actually instantiated.
>
> 2. I would put the containers in employer and employee into lazy-
>    loaded sections so that they are not automatically loaded
>    when creating the employer object. Instead I would populate
>    these containers from the data returned in the view manually.
>