Логика одна, планы запроса очень сильно различаются.

Nov 07, 2014 09:01

Коллеги, прошу помощи, однако, ибо сижу дурак дураком и ничего не понимаю.
Есть мониторинг (не спрашивайте, кто и как его делал - не знаю), там есть такие таблички:

monitoring=> \d interface
Таблица "public.interface"
Колонка | Тип | Модификаторы
-------------+-----------------------------+--------------------------------------------------------
id | integer | NOT NULL DEFAULT nextval('interface_id_seq'::regclass)
host_id | integer | NOT NULL
name | text | NOT NULL
ether | macaddr |
first_time | timestamp without time zone | NOT NULL
last_time | timestamp without time zone |
remove_time | timestamp without time zone |
Индексы:
"interface_pkey" PRIMARY KEY, btree (host_id, name, first_time)
"interface_id_key" UNIQUE CONSTRAINT, btree (id)
"interface_remove_time_idx" btree (remove_time) WHERE remove_time IS NULL
Ограничения внешнего ключа:
"interface_host_id_fkey" FOREIGN KEY (host_id) REFERENCES host(id)
Ссылки извне:
TABLE "arp" CONSTRAINT "arp_interface_id_fkey" FOREIGN KEY (interface_id) REFERENCES interface(id)
TABLE "inet" CONSTRAINT "inet_interface_id_fkey" FOREIGN KEY (interface_id) REFERENCES interface(id)
TABLE "neighbour" CONSTRAINT "neighbour_interface_id_fkey" FOREIGN KEY (interface_id) REFERENCES interface(id)

monitoring=> \d host
Таблица "public.host"
Колонка | Тип | Модификаторы
--------------+-----------------------+---------------------------------------------------
id | integer | NOT NULL DEFAULT nextval('host_id_seq'::regclass)
fqdn | character varying(64) | NOT NULL
ip | character varying(15) |
multihomed | boolean |
baseboard_id | integer |
ussd_version | integer |
Индексы:
"host_pkey" PRIMARY KEY, btree (fqdn)
"host_id_key" UNIQUE CONSTRAINT, btree (id)
Ссылки извне:
TABLE "baseboard_host" CONSTRAINT "baseboard_host_host_id_fkey" FOREIGN KEY (host_id) REFERENCES host(id)
TABLE "bios_host" CONSTRAINT "bios_host_host_id_fkey" FOREIGN KEY (host_id) REFERENCES host(id)
TABLE "hdd_host" CONSTRAINT "hdd_host_host_id_fkey" FOREIGN KEY (host_id) REFERENCES host(id)
TABLE "host_role" CONSTRAINT "host_role_host_id_fkey" FOREIGN KEY (host_id) REFERENCES host(id)
TABLE "interface" CONSTRAINT "interface_host_id_fkey" FOREIGN KEY (host_id) REFERENCES host(id)
TABLE "memory_module_host" CONSTRAINT "memory_module_host_host_id_fkey" FOREIGN KEY (host_id) REFERENCES host(id)
TABLE "processor_host" CONSTRAINT "processor_host_host_id_fkey" FOREIGN KEY (host_id) REFERENCES host(id)

monitoring=> \d neighbour
Таблица "public.neighbour"
Колонка | Тип | Модификаторы
--------------+-----------------------------+--------------------------------------------------------
id | integer | NOT NULL DEFAULT nextval('neighbour_id_seq'::regclass)
interface_id | integer | NOT NULL
ether | macaddr | NOT NULL
vlan | integer | NOT NULL
first_time | timestamp without time zone | NOT NULL
last_time | timestamp without time zone |
remove_time | timestamp without time zone |
Индексы:
"neighbour_pkey" PRIMARY KEY, btree (interface_id, ether, vlan, first_time)
"neighbour_id_key" UNIQUE CONSTRAINT, btree (id)
"neighbour_ether_idx" btree (ether) WHERE remove_time IS NULL
"neighbour_remove_time_idx" btree (remove_time) WHERE remove_time IS NULL
Ограничения внешнего ключа:
"neighbour_interface_id_fkey" FOREIGN KEY (interface_id) REFERENCES interface(id)

Взбрело мне в голову, что timestamp - это сильно больше, чем bool, и решил я немного поколдовать со схемой:

monitoring=> alter table interface add column alive bool;
alter table neighbour add column alive bool;

Добавил индексы по образу и подобию (remove_time):

create index interface_alive_idx on interface (alive) where alive='t';
create index neighbour_ether_alive_idx on neighbour (ether) where alive='t';
create index neighbour_alive_idx on neighbour (alive) where alive='t';

И еще парочку по наитию. Без этих индексов картина была совсем грустной.

create index interface_id_alive_idx on interface (id) where alive='t';
create index neighbour_id_alive_idx on neighbour (id) where alive='t';

Смотрю на explain analyze и немного удивляюсь. Старый:

monitoring=> explain analyze SELECT name, fqdn, COUNT(*) FROM (
SELECT i.name, s.fqdn, n1.ether
FROM neighbour n JOIN interface i
ON i.id = n.interface_id JOIN host s
ON s.id = i.host_id JOIN neighbour n1
ON n1.interface_id = i.id
WHERE n.remove_time IS NULL
AND i.remove_time IS NULL
AND n1.remove_time IS NULL
AND n.ether = '34:40:b5:88:f5:d6' GROUP BY i.name, s.fqdn, n1.ether) as l
GROUP BY name, fqdn ORDER BY COUNT(*), fqdn, name;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=59039.44..59039.47 rows=12 width=27) (actual time=236.000..236.000 rows=8 loops=1)
Sort Key: (count(*)), s.fqdn, i.name
Sort Method: quicksort Memory: 25kB
-> HashAggregate (cost=59039.11..59039.23 rows=12 width=27) (actual time=236.000..236.000 rows=8 loops=1)
-> HashAggregate (cost=59038.78..59038.90 rows=12 width=33) (actual time=224.000..228.000 rows=2545 loops=1)
-> Nested Loop (cost=10282.30..59038.69 rows=12 width=33) (actual time=28.000..216.000 rows=2545 loops=1)
-> Nested Loop (cost=0.70..528.12 rows=5 width=35) (actual time=0.000..0.000 rows=8 loops=1)
-> Nested Loop (cost=0.42..488.74 rows=5 width=19) (actual time=0.000..0.000 rows=8 loops=1)
-> Index Scan using neighbour_ether_idx on neighbour n (cost=0.00..184.63 rows=36 width=4) (actual time=0.000..0.000 rows=8 loops=1)
Index Cond: (ether = '34:40:b5:88:f5:d6'::macaddr)
-> Index Scan using interface_id_key on interface i (cost=0.42..8.44 rows=1 width=15) (actual time=0.000..0.000 rows=1 loops=8)
Index Cond: (id = n.interface_id)
Filter: (remove_time IS NULL)
-> Index Scan using host_id_key on host s (cost=0.28..7.87 rows=1 width=24) (actual time=0.000..0.000 rows=1 loops=8)
Index Cond: (id = i.host_id)
-> Bitmap Heap Scan on neighbour n1 (cost=10281.60..11698.42 rows=369 width=10) (actual time=27.000..27.000 rows=318 loops=8)
Recheck Cond: ((interface_id = i.id) AND (remove_time IS NULL))
-> BitmapAnd (cost=10281.60..10281.60 rows=369 width=0) (actual time=27.000..27.000 rows=0 loops=8)
-> Bitmap Index Scan on neighbour_pkey (cost=0.00..1203.16 rows=23318 width=0) (actual time=1.000..1.000 rows=6418 loops=8)
Index Cond: (interface_id = i.id)
-> Bitmap Index Scan on neighbour_remove_time_idx (cost=0.00..9015.33 rows=251055 width=0) (actual time=26.000..26.000 rows=286088 loops=8)
Index Cond: (remove_time IS NULL)
Total runtime: 236.000 ms
(23 строки)

monitoring=> explain analyze SELECT name, fqdn, COUNT(*) FROM (
SELECT i.name, s.fqdn, n1.ether
FROM neighbour n JOIN interface i
ON i.id = n.interface_id JOIN host s
ON s.id = i.host_id JOIN neighbour n1
ON n1.interface_id = i.id
WHERE n.alive = 't'
AND i.alive = 't'
AND n1.alive = 't'
AND n.ether = '34:40:b5:88:f5:d6' GROUP BY i.name, s.fqdn, n1.ether) as l
GROUP BY name, fqdn ORDER BY COUNT(*), fqdn, name;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=4431.06..4431.09 rows=12 width=27) (actual time=412.000..412.000 rows=8 loops=1)
Sort Key: (count(*)), s.fqdn, i.name
Sort Method: quicksort Memory: 25kB
-> HashAggregate (cost=4430.72..4430.84 rows=12 width=27) (actual time=412.000..412.000 rows=8 loops=1)
-> HashAggregate (cost=4430.39..4430.51 rows=12 width=33) (actual time=408.000..408.000 rows=2545 loops=1)
-> Hash Join (cost=482.91..4430.30 rows=12 width=33) (actual time=124.000..404.000 rows=2545 loops=1)
Hash Cond: (n1.interface_id = i.id)
-> Index Scan using neighbour_alive_idx on neighbour n1 (cost=0.42..3006.24 rows=251055 width=10) (actual time=0.000..208.000 rows=286088 loops=1)
Index Cond: (alive = true)
-> Hash (cost=482.42..482.42 rows=5 width=35) (actual time=0.000..0.000 rows=8 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Nested Loop (cost=0.99..482.42 rows=5 width=35) (actual time=0.000..0.000 rows=8 loops=1)
-> Nested Loop (cost=0.71..443.04 rows=5 width=19) (actual time=0.000..0.000 rows=8 loops=1)
-> Index Scan using neighbour_ether_alive_idx on neighbour n (cost=0.42..147.70 rows=36 width=4) (actual time=0.000..0.000 rows=8 loops=1)
Index Cond: (ether = '34:40:b5:88:f5:d6'::macaddr)
-> Index Scan using interface_id_alive_idx on interface i (cost=0.29..8.19 rows=1 width=15) (actual time=0.000..0.000 rows=1 loops=8)
Index Cond: (id = n.interface_id)
-> Index Scan using host_id_key on host s (cost=0.28..7.87 rows=1 width=24) (actual time=0.000..0.000 rows=1 loops=8)
Index Cond: (id = i.host_id)
Total runtime: 412.000 ms
(20 строк)

Вопрос: почему такие разные планы запросов при совершенно, по моему скромному мнению, идентичной логике? Что почитать, чего я не понимаю? Ткните, пожалуйста, в доки, а то у меня уже ум за разум заходит. :(

postgresql, планы запросов, производительность

Previous post Next post
Up