PGSQL PERFORMANCE 26 IMPROVE PERFORMANCE OF QUERY
From: richard@no-spam (Richard Rowell)
Subject: Improve performance of query
Date: Thu, 16 Dec 2004 10:11:07 -0600


I'm trying to port our application from MS-SQL to Postgres. We have implemented all of our rather complicated application security in the database. The query that follows takes a half of a second or less on MS-SQL server and around 5 seconds on Postgres. My concern is that this data set is rather "small" by our applications standards. It is not unusual for the da_answer table to have 2-4 million records. I'm worried that if this very small data set is taking 5 seconds, then a "regular sized" data set will take far too long.

I originally thought the NOT EXISTS on the "da_data_restrict_except_open" table was killing performance, but the query took the exact same amount of time after I deleted all rows from this table. Note that the hard-coded 999999999.0, and 4000 parameters,
as well as the parameter to svp_getparentproviders are the three variables that change from one run of this query to the next.

I'm using Postgres 7.4.5 as packaged in Debian. shared_buffers is set to 57344 and sort_mem=4096.

The machine has an AMD 1.8+ and ` gig of RAM. Here are some relevant performance statistics:
richard:/usr/share/cups/model# cat /proc/sys/kernel/shmmax 536870912
richard:/usr/share/cups/model# cat /proc/sys/kernel/shmall 536870912
richard:/home/richard# hdparm -tT /dev/hda Timing cached reads: 1112 MB in 2.00 seconds = 556.00 MB/sec Timing buffered disk reads: 176 MB in 3.02 seconds = 58.28 MB/sec
I have included an EXPLAIN ANALYZE, relevant table counts, and relevant indexing information. If anyone has any suggestions on how to improve performance.... TIA!

SELECT tab.answer_id, client_id, question_id, recordset_id,
date_effective, virt_field_name FROM (
SELECT a.uid AS answer_id, a.client_id, a.question_id, recordset_id,
date_effective FROM da_answer a WHERE a.date_effective <= 9999999999.0
AND a.inactive != 1
AND (
5000 = 4000 OR (EXISTS (SELECT * FROM svp_getparentproviderids(1) WHERE svp_getparentproviderids = a.provider_id))
)
UNION SELECT a.uid AS answer_id, a.client_id, a.question_id, recordset_id,
date_effective FROM da_answer a,
( SELECT main_id FROM da_data_restrict WHERE type_id = 2 AND (provider_id IN (SELECT * FROM svp_getparentproviderids(1)))
UNION SELECT sa.uid AS main_id FROM da_answer sa JOIN da_data_restrict_except_closed dr ON dr.main_id =
sa.uid AND dr.type_id = 2 AND dr.except_provider_id = 1
WHERE (restricted = 1) AND (restricted_closed_except = 1) AND sa.covered_by_roi = 1
UNION SELECT sa.uid AS main_id FROM da_answer sa WHERE (restricted = 0) AND (restricted_open_except = 1) AND (NOT EXISTS (SELECT dr.main_id FROM da_data_restrict_except_open dr WHERE (dr.main_id = sa.uid) AND (dr.type_id = 2) AND (dr.except_provider_id in (select * from svp_getparentproviderids(1)))))
AND sa.covered_by_roi = 1
UNION SELECT sa.uid AS main_id FROM da_answer sa WHERE (restricted = 0) AND (restricted_open_except = 0)
AND sa.covered_by_roi = 1
) sec WHERE a.covered_by_roi = 1
AND a.date_effective <= 9999999999.0
AND a.inactive != 1
AND a.uid = sec.main_id AND 5000 > 4000
) tab, da_question q WHERE tab.question_id = q.uid AND (min_access_level <= 4000 OR min_access_level IS NULL)

Table counts from relevant tables da_question 1095
da_answer 21117
da_question 1095
da_data_restrict_except_closed 3087
da_data_restrict_except_open 13391
svp_getparentproviderids(1) 1

Relevant Index create index in_da_data_restrict_provider_id on da_data_restrict(provider_id);
create index in_da_data_restrict_main_id on da_data_restrict(main_id);
create index in_da_data_restrict_type_id on da_data_restrict(type_id);
create index in_da_data_restrict_client_id on da_data_restrict(client_id);
create index in_da_dr_type_provider on da_data_restrict(type_id,provider_id);

create index in_da_data_rec_provider_id ON da_data_restrict_except_closed(provider_id);
create index in_da_data_rec_type_id ON da_data_restrict_except_closed(type_id);
create index in_da_data_rec_main_id ON da_data_restrict_except_closed(main_id);
create index in_da_data_rec_except_provider_id ON da_data_restrict_except_closed(except_provider_id);

create index in_da_data_reo_provider_id ON da_data_restrict_except_open(provider_id);
create index in_da_data_reo_type_id ON da_data_restrict_except_open(type_id);
create index in_da_data_reo_main_id ON da_data_restrict_except_open(main_id);
create index in_da_data_reo_except_provider_id ON da_data_restrict_except_open(except_provider_id);

create index in_da_answer_client_id ON da_answer(client_id);
create index in_da_answer_provider_id ON da_answer(provider_id);
create index in_da_answer_question_id ON da_answer(question_id);
create index in_da_answer_recordset_id ON da_answer(recordset_id);
create index in_da_answer_restricted ON da_answer(restricted);
create index in_da_answer_restricted_open_except ON da_answer(restricted_open_except);
create index in_da_answer_restricted_closed_except ON da_answer(restricted_closed_except);
create index in_da_answer_date_effective ON da_answer(date_effective);
create index in_da_answer_inactive ON da_answer(inactive);
create index in_da_answer_covered_by_roi ON da_answer(covered_by_roi);

create index in_da_ed_inactive_roi ON da_answer(date_effective,inactive,
covered_by_roi);

create index in_da_question_mal ON da_question(min_access_level);

QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Hash Join (cost=1054186.23..1054631.36 rows=4496 width=67) (actual time=4902.250..4979.060 rows=7653 loops=1)

Hash Cond: ("outer".question_id = "inner".uid)
-> Subquery Scan tab (cost=1054123.62..1054457.09 rows=13339 width=24) (actual time=4896.963..4933.896 rows=7657 loops=1)

-> Unique (cost=1054123.62..1054323.70 rows=13339 width=24) (actual time=4896.948..4915.498 rows=7657 loops=1)

-> Sort (cost=1054123.62..1054156.96 rows=13339 width=24) (actual time=4896.944..4903.402 rows=7717 loops=1)

Sort Key: answer_id, client_id, question_id, recordset_id, date_effective
-> Append (cost=0.00..1053209.67 rows=13339 width=24) (actual time=279.091..4841.605 rows=7717 loops=1)

-> Subquery Scan "*SELECT* 1" (cost=0.00..64034.15 rows=10540 width=24) (actual time=279.089..4419.371 rows=161 loops=1)

-> Seq Scan on da_answer a (cost=0.00..63928.75 rows=10540 width=24) (actual time=279.080..4418.808 rows=161 loops=1)

Filter: ((date_effective <= 9999999999::double precision) AND (inactive <> 1) AND (subplan))

SubPlan -> Function Scan on svp_getparentproviderids (cost=0.00..15.00 rows=5 width=4) (actual time=0.203..0.203 rows=0 loops=21089)

Filter: (svp_getparentproviderids = $1)

-> Subquery Scan "*SELECT* 2" (cost=988627.58..989175.52 rows=2799 width=24) (actual time=290.730..417.720 rows=7556 loops=1)

-> Hash Join (cost=988627.58..989147.53 rows=2799 width=24) (actual time=290.722..395.739 rows=7556 loops=1)

Hash Cond: ("outer".main_id = "inner".uid)

-> Subquery Scan sec (cost=987913.23..988002.59 rows=5957 width=4) (actual time=203.862..225.462 rows=7567 loops=1)

-> Unique (cost=987913.23..987943.02 rows=5957 width=4) (actual time=203.851..215.834 rows=7567 loops=1)

-> Sort (cost=987913.23..987928.12 rows=5957 width=4) (actual time=203.843..207.273 rows=7567 loops=1)

Sort Key: main_id -> Append (cost=160.61..987539.72 rows=5957 width=4) (actual time=35.798..187.293 rows=7567 loops=1)

-> Subquery Scan "*SELECT* 1" (cost=160.61..164.53 rows=46 width=4) (actual time=35.796..35.923 rows=35 loops=1)

-> Hash Join (cost=160.61..164.07 rows=46 width=4) (actual time=35.791..35.868 rows=35 loops=1)

Hash Cond: ("outer".svp_getparentproviderids = "inner".provider_id)

-> HashAggregate (cost=15.00..15.00 rows=200 width=4) (actual time=0.491..0.492 rows=1 loops=1)

-> Function Scan on svp_getparentproviderids (cost=0.00..12.50 rows=1000 width=4) (actual time=0.473..0.474 rows=1 loops=1)

-> Hash (cost=145.50..145.50 rows=46 width=8) (actual time=35.256..35.256 rows=0 loops=1)

-> Index Scan using in_da_dr_type_provider on da_data_restrict (cost=0.00..145.50 rows=46 width=8) (actual time=0.041..26.627 rows=7280 loops=1)

Index Cond: (type_id = 2)

-> Subquery Scan "*SELECT* 2" (cost=0.00..53.69 rows=1 width=4) (actual time=0.030..0.030 rows=0 loops=1)

-> Nested Loop (cost=0.00..53.68 rows=1 width=4) (actual time=0.028..0.028 rows=0 loops=1)

-> Index Scan using in_da_data_rec_except_provider_id on da_data_restrict_except_closed dr (cost=0.00..50.65 rows=1 width=4) (actual time=0.026..0.026 rows=0 loops=1)

Index Cond: (except_provider_id = 1)

Filter: (type_id = 2)

-> Index Scan using da_answer_pkey on da_answer sa (cost=0.00..3.02 rows=1 width=4) (never executed)

Index Cond: ("outer".main_id = sa.uid)

Filter: ((restricted = 1) AND (restricted_closed_except = 1) AND (covered_by_roi = 1))

-> Subquery Scan "*SELECT* 3" (cost=0.00..986638.62 rows=678 width=4) (actual time=0.346..77.393 rows=1841 loops=1)

-> Index Scan using in_da_answer_restricted_open_except on da_answer sa (cost=0.00..986631.84 rows=678 width=4) (actual time=0.342..74.614 rows=1841 loops=1)

Index Cond: (restricted_open_except = 1)

Filter: ((restricted = 0) AND (covered_by_roi = 1) AND (NOT (subplan)))

SubPlan
-> Nested Loop IN Join (cost=0.00..227.09 rows=1 width=4) (actual time=0.026..0.026 rows=0 loops=1841)

Join Filter: ("outer".except_provider_id = "inner".svp_getparentproviderids)

-> Index Scan using in_da_data_reo_main_id on da_data_restrict_except_open dr (cost=0.00..212.09 rows=1 width=8) (actual time=0.009..0.014 rows=2 loops=1841)

Index Cond: (main_id = $0)

Filter: (type_id = 2)

-> Function Scan on svp_getparentproviderids (cost=0.00..12.50 rows=1000 width=4) (actual time=0.002..0.003 rows=1 loops=3793)

-> Subquery Scan "*SELECT* 4" (cost=0.00..682.87 rows=5232 width=4) (actual time=0.064..69.107 rows=5691 loops=1)

-> Seq Scan on da_answer sa (cost=0.00..630.55 rows=5232 width=4) (actual time=0.059..61.052 rows=5691 loops=1)

Filter: ((restricted = 0) AND (restricted_open_except = 0) AND (covered_by_roi = 1))

-> Hash (cost=630.55..630.55 rows=9922 width=24) (actual time=86.699..86.699 rows=0 loops=1)

-> Seq Scan on da_answer a (cost=0.00..630.55 rows=9922 width=24) (actual time=0.043..73.232 rows=10062 loops=1)

Filter: ((covered_by_roi = 1) AND (date_effective <= 9999999999::double precision) AND (inactive <> 1))

-> Hash (cost=61.69..61.69 rows=369 width=47) (actual time=5.241..5.241 rows=0 loops=1)

-> Seq Scan on da_question q (cost=0.00..61.69 rows=369 width=47) (actual time=0.026..4.071 rows=1087 loops=1)

Filter: ((min_access_level <= 4000) OR (min_access_level IS NULL))

Total runtime: 4986.508 ms (58 rows)

count -------
21117
(1 row)

[plaintext explain.txt]

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match

From: john@no-spam (John A Meinel)
Subject: Re: Improve performance of query
Date: Thu, 16 Dec 2004 10:59:05 -0600

The first thing to check... Did you do a recent VACUUM ANALYZE? This updates all the statistics. There are a number of places where it says "rows=1000" which is usually the "I have no idea, let me guess 1000". Also, there are a number of places where the estimates are pretty far off. For instance:

Richard Rowell wrote:

>-> Subquery Scan "*SELECT* 1" (cost=0.00..64034.15 rows=10540 width=24) (actual time=279.089..4419.371 rows=161 loops=1)

> >
estimating 10,000 when only 161 is a little bit different.

> -> Seq Scan on da_answer a (cost=0.00..63928.75 rows=10540 width=24) (actual time=279.080..4418.808 rows=161 loops=1)

> Filter: ((date_effective <= 9999999999::double precision) AND (inactive <> 1) AND (subplan))

> >
Though this could be a lack of cross-column statistics. If 2 columns are correlated, the planner isn't as accurate as it could be. Also, date_effective <= 9999999999 doesn't seem very restrictive, could you use a between statement? (date between 0 and 9999999). I know for timestamps usually giving a between is better than a single sided query.

This one was underestimated.

>-> Subquery Scan "*SELECT* 2" (cost=988627.58..989175.52 rows=2799 width=24) (actual time=290.730..417.720 rows=7556 loops=1)

> -> Hash Join (cost=988627.58..989147.53 rows=2799 width=24) (actual time=290.722..395.739 rows=7556 loops=1)

> Hash Cond: ("outer".main_id = "inner".uid)

> >
This is one of the ones that looks like it didn't have any ideas. It could be because of the function. You might consider adding a function index, though I think there are some caveats there.

>-> Function Scan on svp_getparentproviderids (cost=0.00..12.50 rows=1000 width=4) (actual time=0.473..0.474 rows=1 loops=1)

> >
Another very poor estimation. It might be a need to increase the statistics for this column (ALTER TABLE, ALTER COLUMN, SET STATISTICS). IIRC, compared with other db's postgres defaults to a much lower statistics value. Try changing it from 10 (?) to 100 or so. There was a discussion that every column with an index should use higher statistics.

>-> Index Scan using in_da_dr_type_provider on da_data_restrict (cost=0.00..145.50 rows=46 width=8) (actual time=0.041..26.627 rows=7280 loops=1)

> >
I'm not a great optimizer, these are just some first things to look at. Your sort mem seems pretty low to me (considering you have 1GB of RAM). Perhaps you could bump that up to 40MB instead of 4MB. Also, if you run this query twice in a row, is it still slow? (Sometimes it takes a bit of work to get the right indexes loaded into ram, but then it is faster.)

Just some guesses,
John =:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (Cygwin)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
iD8DBQFBwb7bJdeBCYSNAAMRArmYAKCQ8N27hjN+eSBXTmH2kB70NL0Y9gCfT3kU vnwYgtnYTtrh0l7ZVeOoQi0=
=sD8c -----END PGP SIGNATURE-----

[plaintext signature.asc.1103216345.737575461]


From: sfrost@no-spam (Stephen Frost)
Subject: Re: Improve performance of query
Date: Thu, 16 Dec 2004 12:02:20 -0500

* Richard Rowell (richard@no-spam wrote:
> I have included an EXPLAIN ANALYZE, relevant table counts, and relevant > indexing information. If anyone has any suggestions on how to improve > performance.... TIA!

Just a thought- do the UNION's actually have to be union's or would having them be 'UNION ALL's work?

Stephen
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)

iD8DBQFBwb+arzgMPqB3kigRAjmVAJ9HOb0yZZ8L8hmL4JVurBr7ZC4yQwCeL32M 9rb52sFyjs5Ij3+K9ILXBE8=
=SNzJ -----END PGP SIGNATURE-----

[plaintext signature.asc.1103216540.1185202810]


From: dev@no-spam (Richard Huxton)
Subject: Re: Improve performance of query
Date: Thu, 16 Dec 2004 17:15:29 +0000

Richard Rowell wrote:
> I'm trying to port our application from MS-SQL to Postgres. We have > implemented all of our rather complicated application security in the > database. The query that follows takes a half of a second or less on > MS-SQL server and around 5 seconds on Postgres. My concern is that this > data set is rather "small" by our applications standards. It is not > unusual for the da_answer table to have 2-4 million records. I'm > worried that if this very small data set is taking 5 seconds, then a > "regular sized" data set will take far too long.
> > I originally thought the NOT EXISTS on the > "da_data_restrict_except_open" table was killing performance, but the > query took the exact same amount of time after I deleted all rows from > this table. Note that the hard-coded 999999999.0, and 4000 parameters,
> as well as the parameter to svp_getparentproviders are the three > variables that change from one run of this query to the next.
> > I'm using Postgres 7.4.5 as packaged in Debian. shared_buffers is set > to 57344 and sort_mem=4096.

That shared_buffers value sounds too large for 1GB RAM - rewind to 10000 say. Also make sure you've read the "performance tuning" article at:
http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php
> I have included an EXPLAIN ANALYZE, relevant table counts, and relevant > indexing information. If anyone has any suggestions on how to improve > performance.... TIA!

I think it's the function call(s).

> SELECT tab.answer_id, client_id, question_id, recordset_id,
> date_effective, virt_field_name > FROM > (
> SELECT a.uid AS answer_id, a.client_id, a.question_id, recordset_id,
> date_effective > FROM da_answer a > WHERE a.date_effective <= 9999999999.0
> AND a.inactive != 1
> AND > (
> 5000 = 4000 > OR > (EXISTS (SELECT * FROM svp_getparentproviderids(1) WHERE > svp_getparentproviderids = a.provider_id))
> )
...
>SubPlan > -> Function Scan on svp_getparentproviderids (cost=0.00..15.00 rows=5 width=4) (actual time=0.203..0.203 rows=0 loops=21089)

> Filter: (svp_getparentproviderids = $1)

Here it's running 21,089 loops around your function. Each one isn't costing much, but it's the total that's killing you I think. It might be possible to mark the function STABLE or such, depending on what it does - see http://www.postgresql.org/docs/7.4/static/sql-createfunction.html
-- Richard Huxton Archonet Ltd
---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

From: tgl@no-spam (Tom Lane)
Subject: Re: Improve performance of query
Date: Thu, 16 Dec 2004 12:19:11 -0500

Richard Rowell <richard@no-spam> writes:
> I'm trying to port our application from MS-SQL to Postgres. We have > implemented all of our rather complicated application security in the > database. The query that follows takes a half of a second or less on > MS-SQL server and around 5 seconds on Postgres.

The EXPLAIN shows that most of the time is going into repeated executions of svp_getparentproviderids() in the first UNION arm:

> -> Seq Scan on da_answer a (cost=0.00..63928.75 rows=10540 width=24) (actual time=279.080..4418.808 rows=161 loops=1)

> Filter: ((date_effective <= 9999999999::double precision) AND (inactive <> 1) AND (subplan))

> SubPlan > -> Function Scan on svp_getparentproviderids (cost=0.00..15.00 rows=5 width=4) (actual time=0.203..0.203 rows=0 loops=21089)

> Filter: (svp_getparentproviderids = $1)


I'd suggest replacing the EXISTS coding by IN:
(EXISTS (SELECT * FROM svp_getparentproviderids(1) WHERE svp_getparentproviderids = a.provider_id))

to (a.provider_id IN (SELECT * FROM svp_getparentproviderids(1)))
The latter form is likely to be significantly faster in PG 7.4.

It's also possible that the speed loss compared to MSSQL is really inside the svp_getparentproviderids function; you should look into that rather than assuming this query per se is at fault.

Also, do you actually need UNION as opposed to UNION ALL? The duplicate-elimination behavior of UNION is a bit expensive if not needed. It looks from the EXPLAIN output that some of the unions aren't actually eliminating any rows.

regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

From: john@no-spam (John A Meinel)
Subject: Re: Improve performance of query
Date: Thu, 16 Dec 2004 11:24:26 -0600

The first thing to check... Did you do a recent VACUUM ANALYZE? This updates all the statistics. There are a number of places where it says "rows=1000" which is usually the "I have no idea, let me guess 1000".
Also, there are a number of places where the estimates are pretty far off. For instance:

Richard Rowell wrote:

>-> Subquery Scan "*SELECT* 1" (cost=0.00..64034.15 rows=10540 width=24) (actual time=279.089..4419.371 rows=161 loops=1)

> >
estimating 10,000 when only 161 is a little bit different.

> -> Seq Scan on da_answer a (cost=0.00..63928.75 rows=10540 width=24) (actual time=279.080..4418.808 rows=161 loops=1)

> Filter: ((date_effective <= 9999999999::double precision) AND (inactive <> 1) AND (subplan))

> >
Though this could be a lack of cross-column statistics. If 2 columns are correlated, the planner isn't as accurate as it could be. Also,
date_effective <= 9999999999 doesn't seem very restrictive, could you use a between statement? (date between 0 and 9999999). I know for timestamps usually giving a between is better than a single sided query.

This one was underestimated.

>-> Subquery Scan "*SELECT* 2" (cost=988627.58..989175.52 rows=2799 width=24) (actual time=290.730..417.720 rows=7556 loops=1)

> -> Hash Join (cost=988627.58..989147.53 rows=2799 width=24) (actual time=290.722..395.739 rows=7556 loops=1)

> Hash Cond: ("outer".main_id = "inner".uid)

> >
This is one of the ones that looks like it didn't have any ideas. It could be because of the function. You might consider adding a function index, though I think there are some caveats there.

>-> Function Scan on svp_getparentproviderids (cost=0.00..12.50 rows=1000 width=4) (actual time=0.473..0.474 rows=1 loops=1)

> >
Another very poor estimation. It might be a need to increase the statistics for this column (ALTER TABLE, ALTER COLUMN, SET STATISTICS).
IIRC, compared with other db's postgres defaults to a much lower statistics value. Try changing it from 10 (?) to 100 or so. There was a discussion that every column with an index should use higher statistics.

>-> Index Scan using in_da_dr_type_provider on da_data_restrict (cost=0.00..145.50 rows=46 width=8) (actual time=0.041..26.627 rows=7280 loops=1)

> >
I'm not a great optimizer, these are just some first things to look at.
Your sort mem seems pretty low to me (considering you have 1GB of RAM).
Perhaps you could bump that up to 40MB instead of 4MB. Also, if you run this query twice in a row, is it still slow? (Sometimes it takes a bit of work to get the right indexes loaded into ram, but then it is faster.)

Just some guesses,
John =:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (Cygwin)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
iD8DBQFBwcTKJdeBCYSNAAMRAkmEAKClhpUHULd2IENkFdzSStbdPH0bxQCdE34l nieGHqeczah1CFyhitKbydw=
=08oE -----END PGP SIGNATURE-----

[plaintext signature.asc.1103217866.68234853]