现在有个需要查询postgresql下表的信息同时统计表下的geometry的数目和对应类型,如果有多个geometry需要合并为一条返回.下面是查询的步骤和考虑.
- 查询pg_statio_all_tables
SELECT * FROM pg_catalog.pg_statio_all_tables WHERE schemaname = ‘public’
上面SQL主要是查询catalog下面的目录在public为schema下的表数据
- 查询information_schema.tables
SELECT * FROM information_schema.tables WHERE table_schema = ‘public’ ;
上面的SQL其实类似但是整体条SQL会包含很多 内置的表非常不考虑使用
- 查询geometry_columns
select f_table_name ,type,COUNT(f_table_name) as geomcount from geometry_columns WHERE f_table_schema = ‘public’ GROUP BY f_table_name ;
是查询对应的 geometry_columns 表下记录的geometry的字段和对应的类型(点,线,面,多点…等)
- 联合查询 pg_catalog.pg_statio_all_tables 和geometry_columns
SELECT
gc.type AS geo_type,
pg_statio_all_tables.relname,
obj_description(relid) AS description,
pg_total_relation_size(relid) / pg_column_size(relid::regclass) AS data_size,
COUNT() as num_duplicates
FROM
pg_catalog.pg_statio_all_tables
LEFT OUTER JOIN geometry_columns AS gc ON gc.f_table_name = relname
WHERE schemaname = ‘public’
GROUP BY geo_type, relname
HAVING COUNT() > 1;
这个结果只能过滤 一个表有多个geometry的表不符合要求
到这一步开始是考虑联合查询,不论是左连接 还是右连接查询
SELECT f_table_name, MAX(type) AS type ,COUNT(type) geomcount
FROM geometry_columns WHERE f_table_schema = ‘public’ GROUP BY f_table_name
查询 geometry_columns 过滤需要用到 MAX函数,而不是distinct .
distinct 的这种方式例子如下
SELECT f_table_name, type, COUNT(*) as count
FROM (
SELECT DISTINCT ON (f_table_name) f_table_name, type
FROM geometry_columns
ORDER BY f_table_name, type
) subquery
GROUP BY f_table_name;
查询结果其实也是会重复,而且拿不到重复的的geometry的type不合适.
- 最终版方式
SELECT
gc.type AS geo_type,
pg_statio_all_tables.relname,
obj_description(relid) AS description,
pg_total_relation_size(relid) / pg_column_size(relid::regclass) AS data_size,
gc.geomcount
FROM
pg_catalog.pg_statio_all_tables
LEFT OUTER JOIN (SELECT f_table_name,
MAX(type) AS type ,COUNT(type) geomcount
FROM geometry_columns GROUP BY f_table_name ) AS gc ON gc.f_table_name = relname
WHERE schemaname = ‘public’