From: | Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: paths in partitions of a dummy partitioned table |
Date: | 2017-09-06 07:08:24 |
Message-ID: | CAFjFpRcx4BzpzmLHha-PeoZ6csYmyTSRMW5Wc6nvfD=f9=yq6A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Aug 25, 2017 at 10:46 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Thu, Jul 6, 2017 at 11:35 AM, Ashutosh Bapat
> <ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
>> If a partitioned table is proven dummy, set_rel_pathlist() doesn't mark the
>> partition relations dummy and thus doesn't set any (dummy) paths in the
>> partition relations. The lack of paths in the partitions means that we can
>> not use partition-wise join while joining this table with some other similarly
>> partitioned table as the partitions do not have any paths that child-joins can
>> use. This means that we can not create partition relations for such a join and
>> thus can not consider the join to be partitioned. This doesn't matter much when
>> the dummy relation is the outer relation, since the resultant join is also
>> dummy. But when the dummy relation is inner relation, the resultant join is not
>> dummy and can be considered to be partitioned with same partitioning scheme as
>> the outer relation to be joined with other similarly partitioned table. Not
>> having paths in the partitions deprives us of this future optimization.
>
> I think it's wrong for any code to be examining the path list for a
> rel marked dummy, so I would suggest approaching this from a different
> direction.
Me and Robert had an offline discussion about this. I am summarizing
it here for the sake of completeness.
A dummy relation is identified by the only dummy path that exists in
its pathlist. There is no flag in RelOptInfo which tells whether a
given relation is dummy or not, it's the dummy path which tells that.
A dummy path is an Append path with no subpaths. Planner doesn't treat
dummy relations any different from other relations when it comes to
using paths. When a dummy relation participates in a join, the dummy
path is used as one of the joining paths and converted to a Result
plan at the time of planning. So, for a partition-wise join where one
of the joining relations is dummy, its every child must have dummy
path which can be used to construct child-join paths.
But we don't need to mark partition relations dummy (if their parent
is dummy) even when it's not going to participate in partition-wise
join. The partition relations will be marked dummy when we know that
they will be required for partition-wise join. I was worried that we
might mark base relation dummy during join planning this way, but we
already have a precedence for that in add_paths_to_join_rel(). So,
shouldn't be a problem. So, I have now added a patch in partition-wise
join set to mark partition relations dummy when their parent is dummy.
> Given A LEFT JOIN B where Bk is dummy, I suggest
> constructing the path for (AB)k by taking a path from Ak and applying
> an appropriate PathTarget. You don't really need a join path at all;
> a path for the non-dummy input is fine - and, in fact, better, since
> it will be cheaper to execute. One problem is that it may not produce
> the correct output columns. (AB) may drop some columns that were
> being generated by A because they were only needed to perform the
> join, and it may add additional columns taken from B. But I think
> that if you take the default PathTarget for (AB) and replace
> references to columns of B with NULL constants, you should be able to
> apply the resulting PathTarget to a path for Ak and get a valid path
> for (AB)k. Maybe there is some reason why that won't work exactly,
> but I think it is probably more promising to attack the problem from
> this angle than to do what you propose. Sticking dummy joins into the
> query plan that are really just projecting out NULLs is not appealing.
>
This might help in the cases when the RelOptInfo itself is missing
e.g. missing partitions in partition matching as discussed in [1]. I
will discuss this approach on that thread.
[1] /message-id/CAFjFpRdjQvaUEV5DJX3TW6pU5eq54NCkadtxHX2JiJG_GvbrCA@mail.gmail.com
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2017-09-06 07:11:55 | Re: CLUSTER command progress monitor |
Previous Message | Andres Freund | 2017-09-06 07:03:40 | Changing Jobs |