PostgreSQL Oracle FDW vs DBI_Link
As promised in my previous post on the PostgreSQL Oracle FDW, I’ve done some performance testing verses our old method of using DBI-Link to replicate tables from an Oracle 8i instance to postgres 9.1.2. I’ve got good news on several fronts!
First being that there is a definite improvement on the replication speed. Here’s some stats on our old process that copied an entire table of about 1.5 million rows
job_name | timestamp | completion_timestamp | duration
------------------------+----------------------------+----------------------------+-----------------
DBI-LINK TABLE REFRESH | 2012-02-10 05:00:07.720262 | 2012-02-10 05:07:40.509401 | 00:07:32.789139
DBI-LINK TABLE REFRESH | 2012-02-09 05:00:07.674933 | 2012-02-09 05:07:59.087194 | 00:07:51.412261
DBI-LINK TABLE REFRESH | 2012-02-08 05:00:07.880216 | 2012-02-08 05:07:31.350684 | 00:07:23.470468
DBI-LINK TABLE REFRESH | 2012-02-07 05:00:08.1697 | 2012-02-07 05:07:47.219942 | 00:07:39.050242
DBI-LINK TABLE REFRESH | 2012-02-06 05:00:07.793401 | 2012-02-06 05:07:48.86486 | 00:07:41.071459
DBI-LINK TABLE REFRESH | 2012-02-05 05:00:08.158177 | 2012-02-05 05:07:42.889557 | 00:07:34.73138
DBI-LINK TABLE REFRESH | 2012-02-04 05:00:07.665676 | 2012-02-04 05:07:12.19166 | 00:07:04.525984
DBI-LINK TABLE REFRESH | 2012-02-03 05:00:08.077096 | 2012-02-03 05:07:09.999919 | 00:07:01.922823
DBI-LINK TABLE REFRESH | 2012-02-02 05:00:07.727106 | 2012-02-02 05:07:36.716676 | 00:07:28.98957
DBI-LINK TABLE REFRESH | 2012-02-01 05:00:07.767412 | 2012-02-01 05:07:15.261486 | 00:07:07.494074
So you can see on average this job took about 7 1/2 minutes. Now lets see what using the FDW tables does for this process
job_name | timestamp | completion_timestamp | duration
-------------------+----------------------------+----------------------------+-----------------
FDW TABLE REFRESH | 2012-02-20 05:00:10.347331 | 2012-02-20 05:04:29.999205 | 00:04:19.651874
FDW TABLE REFRESH | 2012-02-19 05:00:07.649312 | 2012-02-19 05:03:41.616585 | 00:03:33.967273
FDW TABLE REFRESH | 2012-02-18 05:00:10.010323 | 2012-02-18 05:04:01.782797 | 00:03:51.772474
FDW TABLE REFRESH | 2012-02-17 05:00:10.373712 | 2012-02-17 05:04:18.442016 | 00:04:08.068304
FDW TABLE REFRESH | 2012-02-16 05:00:08.5677 | 2012-02-16 05:04:16.405462 | 00:04:07.837762
FDW TABLE REFRESH | 2012-02-15 05:00:10.68691 | 2012-02-15 05:05:09.513021 | 00:04:58.826111
FDW TABLE REFRESH | 2012-02-14 05:00:07.595426 | 2012-02-14 05:03:47.697134 | 00:03:40.101708
FDW TABLE REFRESH | 2012-02-13 05:00:07.557273 | 2012-02-13 05:03:54.2805 | 00:03:46.723227
FDW TABLE REFRESH | 2012-02-12 05:00:08.176132 | 2012-02-12 05:03:49.145094 | 00:03:40.968962
FDW TABLE REFRESH | 2012-02-11 05:00:08.053425 | 2012-02-11 05:04:03.345984 | 00:03:55.292559
On average about 4 minutes. That’s almost a 50% decrease! We have a lot of tables being refreshed in this manner, some much larger and others many times per day, so in total this will have a pretty big impact on the time spent in replication. We will still need to keep DBI_Link
around, though, since some of our jobs require writing back to Oracle which FDW does not support at this time.
I’ve generalized some of the code we use for doing these materialized snapshot views and made it available in case others may find it useful. Note that I only have Oracle 8i to work with at the moment, so some different setup steps may be required for your (hopefully) more modern versions. If the setup is different for other versions, I would appreciate anyone contributing back a setup file. I think the most useful part is really the function to automatically make an FDW table given only a remote Oracle table name. I’ll probably try and make that a little more general in the future so it’s not tied quite so much to the whole snapshot system (Update 4/27/2012
: This has been done and committed to github).
https://github.com/keithf4/oracle_fdw_snapshot
And some other good news out of this testing was helping the Oracle FDW author out and letting him know about a bug that came up when a query on an FDW table would fail under certain conditions and invalidate all further FDW queries for that session. If you’ve run into this issue as well, he was able to get it fixed and now it handles query errors much more smoothly.
Looking forward to seeing how this will help make the transition from Oracle to PostgreSQL that much easier in the future.