From boris at codesynthesis.com Mon Apr 1 04:57:46 2024
From: boris at codesynthesis.com (Boris Kolpackov)
Date: Mon Apr 1 04:48:11 2024
Subject: [odb-users] Re: How to perform an aggregate subquery
In-Reply-To: <4F874EAC-04FE-472E-9E16-23AC177F10CB@gmail.com>
References: <4F874EAC-04FE-472E-9E16-23AC177F10CB@gmail.com>
Message-ID:

Aldo Laiseca writes:

> I need to find the most recent T1 (based on field timestamp_action)
> having a foreign key to T2 matching a particular value of T2.name.
> In SQL, something like this:
>
> select t1.* from t1
> where t1.timestamp_action = (select max(t1.timestamp_action)
>                              from t1 join t2 on t1.id_t2 = t2.id
>                              where t2.name = ?)
>
> How could I write such a query using ODB?

By using a view:

https://www.codesynthesis.com/products/odb/doc/manual.xhtml#10

Generally, a lot of "how do I do X in ODB" can be answered by at least
skimming through the documentation once so that you have an idea of the
functionality available.

From alaiseca at gmail.com Tue Apr 2 18:54:50 2024
From: alaiseca at gmail.com (Aldo Laiseca)
Date: Fri Apr 5 08:31:42 2024
Subject: [odb-users] Re: How to perform an aggregate subquery
In-Reply-To:
References: <4F874EAC-04FE-472E-9E16-23AC177F10CB@gmail.com>
Message-ID:

Yes, actually I had figured out that a view was the approach to follow.
I declared this:

#pragma db view object(T1) object(T2)
struct ViewT1T2 {

  #pragma db column("max(" + T1::timestamp_action + ")")
  #pragma db type("TIMESTAMP")
  boost::posix_time::ptime maxTimestamp;
};

However, after that, I don't know how to recover the entire T1 object
having the max value. Right now I solved my requirement in two steps:

1. Execute a query_value against the view to grab the maximum timestamp
   value.
2. Execute a query_one against the object T1 to get the row matching
   the value obtained in step 1.

Is there any way to do the above in one step?
I read about object loading views; however, I don't see how to include
a max condition in a query in an object loading view.

Thanks

> On 1 Apr 2024, at 2:57, Boris Kolpackov wrote:
>
> Aldo Laiseca writes:
>
>> I need to find the most recent T1 (based on field timestamp_action)
>> having a foreign key to T2 matching a particular value of T2.name.
>> In SQL, something like this:
>>
>> select t1.* from t1
>> where t1.timestamp_action = (select max(t1.timestamp_action)
>>                              from t1 join t2 on t1.id_t2 = t2.id
>>                              where t2.name = ?)
>>
>> How could I write such a query using ODB?
>
> By using a view:
>
> https://www.codesynthesis.com/products/odb/doc/manual.xhtml#10
>
> Generally, a lot of "how do I do X in ODB" can be answered by at least
> skimming through the documentation once so that you have an idea of the
> functionality available.

From finjulhich at gmail.com Mon Apr 8 21:40:57 2024
From: finjulhich at gmail.com (MM)
Date: Mon Apr 8 21:31:41 2024
Subject: [odb-users] Re: How to perform an aggregate subquery
Message-ID:

On Fri, 5 Apr 2024 at 13:41, Aldo Laiseca wrote:

> Yes, actually I had figured out that a view was the approach to follow. I
> declared this:
>
> #pragma db view object(T1) object(T2)
> struct ViewT1T2 {
>
>   #pragma db column("max(" + T1::timestamp_action + ")")
>   #pragma db type("TIMESTAMP")
>   boost::posix_time::ptime maxTimestamp;
> };
>
> However, after that, I don't know how to recover the entire T1 object
> having the max value. Right now I solved my requirement in two steps:
>
> 1. Execute a query_value against the view to grab the maximum timestamp
>    value.
> 2. Execute a query_one against the object T1 to get the row matching
>    the value obtained in step 1.
>
> Is there any way to do the above in one step? I read about object loading
> views; however, I don't see how to include a max condition in a query in an
> object loading view.
>
> Thanks
>

I have the same question here:

class B {
  date begin;
  date end;
};

class A {
  std::vector<B> bs_;
};

Both A and B are persistent. B's table has an object-id column pointed
to from A, an index column, and value_* columns.

How to write the view to pick up the B row that matches the max index of
the vector?
Is the max() aggregation expressible neatly in a view?

Rds,

From boris at codesynthesis.com Tue Apr 9 23:40:26 2024
From: boris at codesynthesis.com (Boris Kolpackov)
Date: Tue Apr 9 23:30:54 2024
Subject: [odb-users] Re: How to perform an aggregate subquery
In-Reply-To:
References:
Message-ID:

MM writes:

> class B {
>   date begin;
>   date end;
> };
>
> class A {
>   std::vector<B> bs_;
> };
>
> How to write the view to pick up the B row that matches the max index of
> the vector?
> Is the max() aggregation expressible neatly in a view?

ODB currently does not support containers in views (with the exception
of inverse containers that establish relationships). So to accomplish
this you will need to use a table (or native) view over the container's
underlying table.

From boris at codesynthesis.com Wed Apr 10 00:02:46 2024
From: boris at codesynthesis.com (Boris Kolpackov)
Date: Tue Apr 9 23:53:15 2024
Subject: [odb-users] Re: How to perform an aggregate subquery
In-Reply-To:
References: <4F874EAC-04FE-472E-9E16-23AC177F10CB@gmail.com>
Message-ID:

Aldo Laiseca writes:

> Yes, actually I had figured out that a view was the approach to
> follow. I declared this:
>
> #pragma db view object(T1) object(T2)
> struct ViewT1T2 {
>
>   #pragma db column("max(" + T1::timestamp_action + ")")
>   #pragma db type("TIMESTAMP")
>   boost::posix_time::ptime maxTimestamp;
> };
>
> However, after that, I don't know how to recover the entire T1
> object having the max value. Right now I solved my requirement
> in two steps:
>
> 1. Execute a query_value against the view to grab the maximum
>    timestamp value.
> 2. Execute a query_one against the object T1 to get the row
>    matching the value obtained in step 1.
>
> Is there any way to do the above in one step? I read about object loading
> views; however, I don't see how to include a max condition in a query in
> an object loading view.

The first question that you need to answer is how to do this in SQL.
Then try to map this answer to an appropriate ODB view. In SQL you
cannot just say `SELECT ... WHERE max(timestamp_action)` (well, maybe
there is support for something like this in the specific database that
you are using).

From me at raphieps.com Sat Apr 13 15:13:44 2024
From: me at raphieps.com (Raphael Palefsky-Smith)
Date: Sat Apr 13 15:04:26 2024
Subject: [odb-users] Object Loading Views with Containers / Alternatives?
Message-ID:

Hello!

When it first starts up, my application loads a top-level object with a
deep hierarchy of related objects stored in odb::vectors. For example,
say I have an Employer that has many Employees, who each have many Pets.
These are all stored in odb::vectors, though I'm open to using other
schemes to make this work. Some pseudocode:

#pragma db object
class employer {
  unsigned long id;
  string address;
  odb::vector<shared_ptr<employee>> employees; // non-lazy, need at initial load
};

#pragma db object
class employee {
  unsigned long id;
  string name;
  odb::vector<shared_ptr<pet>> pets; // non-lazy, need at initial load
};

#pragma db object
class pet {
  unsigned long id;
  string nickname;
};

For a given Employer ID, I'd like to load all the associated Employees
and all their Pets (along with the non-relational data members on those
Employer/Employees/Pets). It would be great to perform this load all in
one shot with a single JOIN'd SELECT; the naive db.load() executes a ton
of individual statements and is far too slow. It seems like Object
Loading Views are designed exactly for this purpose, but I can't seem to
get them working with containers.
My object loading view looks like this:

#pragma db view object(employer) object(employee) object(pet)
struct employer_with_employees_and_pets {
  shared_ptr<employer> e;
};

When I query using this view, the stderr_tracer reports that individual
SELECTs are still being executed for each employee and pet instance. Is
there a way to modify the view so that it runs with a single SELECT?

If Object Loading Views are incompatible here, is there a workaround,
even if it involves hand-writing a bunch of SQL? Performance is more
important than source-code elegance in this specific case.

Thank you!

From boris at codesynthesis.com Fri Apr 19 10:35:53 2024
From: boris at codesynthesis.com (Boris Kolpackov)
Date: Fri Apr 19 10:26:10 2024
Subject: [odb-users] Object Loading Views with Containers / Alternatives?
In-Reply-To:
References:
Message-ID:

Raphael Palefsky-Smith writes:

> #pragma db object
> class employer {
>   unsigned long id;
>   string address;
>   odb::vector<shared_ptr<employee>> employees; // non-lazy, need at initial load
> };
>
> #pragma db object
> class employee {
>   unsigned long id;
>   string name;
>   odb::vector<shared_ptr<pet>> pets; // non-lazy, need at initial load
> };
>
> #pragma db object
> class pet {
>   unsigned long id;
>   string nickname;
> };
>
> For a given Employer ID, I'd like to load all the associated Employees and
> all their Pets (along with the non-relational data members on those
> Employer/Employees/Pets). It would be great to perform this load all in one
> shot with a single JOIN'd SELECT; the naive db.load() executes a ton of
> individual statements and is far too slow. It seems like Object Loading
> Views are designed exactly for this purpose, but I can't seem to get them
> working with containers.
>
> My object loading view looks like this:
>
> #pragma db view object(employer) object(employee) object(pet)
> struct employer_with_employees_and_pets {
>   shared_ptr<employer> e;
> };
>
> When I query using this view, the stderr_tracer reports that individual
> SELECTs are still being executed for each employee and pet instance. Is
> there a way to modify the view so that it runs with a single SELECT?
>
> If Object Loading Views are incompatible here, is there a workaround, even
> if it involves hand-writing a bunch of SQL? Performance is more important
> than source-code elegance in this specific case.

There is really no way to do this perfectly from the efficiency POV in
an SQL database: You want to receive 1 row from the employer table, N
rows from the employee table, and NxM groups of rows from the pet table,
while SELECT can only return one or more uniform rows.

In SQL the best you can do is probably join all three tables and, for
each pet, return duplicate employee data and duplicate employer data.
I say probably because whether it is the most efficient way may depend
on the database and the data used. For example, if your employer object
contains a bunch of large BLOBs and you use an in-process database like
SQLite with fast queries, then I won't be surprised if performing a
bunch of SELECTs with small results is actually faster than a single
SELECT with a large amount of duplicated data.

Now, let's say you've decided you want a single SELECT. The way I would
try to map it to an ODB view is as follows (see the manual for details
on the by-value loading):

#pragma db view object(employer) object(employee) object(pet)
struct employer_with_employees_and_pets
{
  shared_ptr<employer> ee;
  employee er;
  pet pt;
};

I would then use the following two tricks:

1. I would use a session so that only the first employer object
   (from a bunch of duplicates) is actually instantiated.

2. I would put the containers in employer and employee into
   lazy-loaded sections so that they are not automatically loaded
   when creating the employer object. Instead I would populate
   these containers from the data returned in the view manually.

From me at raphieps.com Fri Apr 19 17:09:20 2024
From: me at raphieps.com (Raphael Palefsky-Smith)
Date: Fri Apr 19 17:00:02 2024
Subject: [odb-users] Object Loading Views with Containers / Alternatives?
In-Reply-To:
References:
Message-ID:

Hi Boris - many thanks for your really thoughtful reply!

Your two tricks make total sense - utilizing the session for
de-duplication and manually populating lazy-sectioned containers is
brilliant, and I'm already seeing a ~10x reduction in my load times from
using these two methods.

The part I'm a little confused by is the by-value loading, even after
reading the docs. Could you explain what the difference between these
three options would be? Assuming the same db pragmas as before:

struct option_a {
  shared_ptr<employer> ee;
  employee er;
  pet pt;
};

struct option_b {
  shared_ptr<employer> ee;
  shared_ptr<employee> er;
  pet pt;
};

struct option_c {
  shared_ptr<employer> ee;
  shared_ptr<employee> er;
  shared_ptr<pet> pt;
};

I'm using option_c (put a shared_ptr on everything in the view) to get
that reduction in load time, but I'm curious how the internal logic
changes between the three. Even if the other options don't yield higher
performance, I'd love to understand a little more of what's going on
under the hood.

Thanks again!
On Fri, Apr 19, 2024 at 10:35 AM Boris Kolpackov wrote:

> Raphael Palefsky-Smith writes:
>
> > #pragma db object
> > class employer {
> >   unsigned long id;
> >   string address;
> >   odb::vector<shared_ptr<employee>> employees; // non-lazy, need at initial load
> > };
> >
> > #pragma db object
> > class employee {
> >   unsigned long id;
> >   string name;
> >   odb::vector<shared_ptr<pet>> pets; // non-lazy, need at initial load
> > };
> >
> > #pragma db object
> > class pet {
> >   unsigned long id;
> >   string nickname;
> > };
> >
> > For a given Employer ID, I'd like to load all the associated Employees
> > and all their Pets (along with the non-relational data members on those
> > Employer/Employees/Pets). It would be great to perform this load all in
> > one shot with a single JOIN'd SELECT; the naive db.load() executes a ton
> > of individual statements and is far too slow. It seems like Object
> > Loading Views are designed exactly for this purpose, but I can't seem to
> > get them working with containers.
> >
> > My object loading view looks like this:
> >
> > #pragma db view object(employer) object(employee) object(pet)
> > struct employer_with_employees_and_pets {
> >   shared_ptr<employer> e;
> > };
> >
> > When I query using this view, the stderr_tracer reports that individual
> > SELECTs are still being executed for each employee and pet instance. Is
> > there a way to modify the view so that it runs with a single SELECT?
> >
> > If Object Loading Views are incompatible here, is there a workaround,
> > even if it involves hand-writing a bunch of SQL? Performance is more
> > important than source-code elegance in this specific case.
>
> There is really no way to do this perfectly from the efficiency POV in
> an SQL database: You want to receive 1 row from the employer table, N
> rows from the employee table, and NxM groups of rows from the pet table,
> while SELECT can only return one or more uniform rows.
>
> In SQL the best you can do is probably join all three tables and, for
> each pet, return duplicate employee data and duplicate employer data.
> I say probably because whether it is the most efficient way may depend
> on the database and the data used. For example, if your employer object
> contains a bunch of large BLOBs and you use an in-process database like
> SQLite with fast queries, then I won't be surprised if performing a
> bunch of SELECTs with small results is actually faster than a single
> SELECT with a large amount of duplicated data.
>
> Now, let's say you've decided you want a single SELECT. The way I would
> try to map it to an ODB view is as follows (see the manual for details
> on the by-value loading):
>
> #pragma db view object(employer) object(employee) object(pet)
> struct employer_with_employees_and_pets
> {
>   shared_ptr<employer> ee;
>   employee er;
>   pet pt;
> };
>
> I would then use the following two tricks:
>
> 1. I would use a session so that only the first employer object
>    (from a bunch of duplicates) is actually instantiated.
>
> 2. I would put the containers in employer and employee into
>    lazy-loaded sections so that they are not automatically loaded
>    when creating the employer object. Instead I would populate
>    these containers from the data returned in the view manually.
>

From boris at codesynthesis.com Mon Apr 22 10:34:33 2024
From: boris at codesynthesis.com (Boris Kolpackov)
Date: Mon Apr 22 10:24:49 2024
Subject: [odb-users] Object Loading Views with Containers / Alternatives?
In-Reply-To:
References:
Message-ID:

Raphael Palefsky-Smith writes:

> The part I'm a little confused by is the by-value loading, even after
> reading the docs.

Yes, I don't know what made me think we need the by-value loading. It
would have been beneficial (to avoid dynamic memory allocations) if
the containers we wanted to populate manually stored their elements by
value, but that's clearly not the case here. My only excuse is that it
was Fri ;-).
From me at raphieps.com Wed Apr 24 11:22:06 2024
From: me at raphieps.com (Raphael Palefsky-Smith)
Date: Wed Apr 24 11:12:46 2024
Subject: [odb-users] Object Loading Views with Containers / Alternatives?
In-Reply-To:
References:
Message-ID:

The Friday brain! A phenomenon I know all too well. Awesome to hear the
shared_ptr approach is correct and I'll carry onward. Thanks again for
the great explanation!

On Mon, Apr 22, 2024 at 10:34 AM Boris Kolpackov wrote:

> Raphael Palefsky-Smith writes:
>
> > The part I'm a little confused by is the by-value loading, even after
> > reading the docs.
>
> Yes, I don't know what made me think we need the by-value loading. It
> would have been beneficial (to avoid dynamic memory allocations) if
> the containers we wanted to populate manually stored their elements by
> value, but that's clearly not the case here. My only excuse is that it
> was Fri ;-).
>
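
The regrouping step discussed in this thread (one JOIN'd SELECT that
repeats employer and employee data for every pet, followed by manual
population of the containers) could look roughly like the sketch below.
It is plain C++ with no ODB calls, so it only illustrates the idea: the
joined_row struct and the regroup() function are hypothetical names that
do not come from ODB or from the posts above, the structs are simplified
stand-ins for the persistent classes, and the std::map plays the
de-duplication role that the thread assigns to the ODB session.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Simplified stand-ins for the persistent classes from the thread
// (no ODB pragmas, plain std::vector instead of odb::vector).
struct pet {
  unsigned long id;
  std::string nickname;
};

struct employee {
  unsigned long id;
  std::string name;
  std::vector<std::shared_ptr<pet>> pets;
};

struct employer {
  unsigned long id;
  std::string address;
  std::vector<std::shared_ptr<employee>> employees;
};

// Hypothetical shape of one row of the three-table JOIN: the employer
// and employee columns are duplicated for every pet.
struct joined_row {
  unsigned long employer_id;
  std::string address;
  unsigned long employee_id;
  std::string name;
  unsigned long pet_id;
  std::string nickname;
};

// Rebuild the hierarchy from flat rows. Assumes all rows belong to one
// employer and that each pet appears in exactly one row.
std::shared_ptr<employer> regroup(const std::vector<joined_row>& rows) {
  std::shared_ptr<employer> root;
  // Identity map: the first row for an employee creates the object,
  // later duplicates reuse it.
  std::map<unsigned long, std::shared_ptr<employee>> seen;

  for (const joined_row& r : rows) {
    if (!root)
      root = std::make_shared<employer>(
          employer{r.employer_id, r.address, {}});
    assert(root->id == r.employer_id);

    std::shared_ptr<employee>& emp = seen[r.employee_id];
    if (!emp) {
      emp = std::make_shared<employee>(
          employee{r.employee_id, r.name, {}});
      root->employees.push_back(emp);
    }
    emp->pets.push_back(std::make_shared<pet>(pet{r.pet_id, r.nickname}));
  }
  return root; // null if there were no rows
}
```

With an inner join, employees without pets (and an employer without
employees) produce no rows at all, so a real query would probably need
LEFT JOINs plus a check for NULL ids before creating the objects; the
sketch leaves that out.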