PGSQL PERFORMANCE 31 WHICH IS MORE EFFICIENT
From: mike@no-spam ("Mike G.")
Subject: Which is more efficient?
Date: Fri, 17 Dec 2004 12:52:01 -0600


Hi,

I have data that I am taking from 2 tables, pulling out specific columns and inserting into one table.


Is it more efficient to do:
a) insert into x select z from y;
insert into x select z from a;

b) insert into x select z from y union all select z from a;

I have run both through explain.
a) 650ms b) 741.57ms
According to the planner option a, select z from y takes 545.93 ms Under option b select z from y takes 553.34 ms
Shouldn't the time predicted for the select z from y be the same?

I would believe b would be more efficient as the inserts could be done in a batch rather than individual transactions but the planner doesn't recognize that. When I run option a through the planner I have to highlight each insert separately since the planner stops executing after the first ; it comes across.


Mike
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to majordomo@no-spam

From: Christopher Browne (cbbrowne@no-spam)
Subject: Re: Which is more efficient?
Date: Fri, 17 Dec 2004 21:55:27 -0500

A long time ago, in a galaxy far, far away, mike@no-spam ("Mike G.") wrote:
> Hi,
>
> I have data that I am taking from 2 tables, pulling out specific columns and inserting into one table.

>
> Is it more efficient to do:
> a) insert into x > select z from y;
> insert into x > select z from a;
>
> b) insert into x > select z from y > union all > select z from a;
>
> I have run both through explain.
> a) 650ms > b) 741.57ms >
> According to the planner option a, select z from y takes 545.93 ms > Under option b select z from y takes 553.34 ms >
> Shouldn't the time predicted for the select z from y be the same?

No, these are approximations. They can't be expected to be identical,
and as you can see there's no material difference, as 545.93 only differs from 553.34 by 1.34%.

The point of EXPLAIN is to show the query _plans_ so you can evaluate how sane they seem. They're pretty well identical, so EXPLAIN's doing what might be expected.

> I would believe b would be more efficient as the inserts could be > done in a batch rather than individual transactions but the planner > doesn't recognize that. When I run option a through the planner I > have to highlight each insert separately since the planner stops > executing after the first ; it comes across.

The case where there would be a _material_ difference would be where there were hardly any rows in either of the tables you're adding in,
and in that case, query planning becomes a significant cost, at which point simpler is probably better.

If you do the queries in separate transactions, there's some addition of cost of COMMIT involved, but if they can be kept in a single transaction, the approaches oughtn't be materially different in cost,
and that's what you're finding.
-- select 'cbbrowne' || '@no-spam || 'gmail.com';
http://www3.sympatico.ca/cbbrowne/x.html MICROS~1: Where do you want to go today? Linux: Been there, done that.