[Spacewalk-list] Keepalive for Database Connections?

Discussion:

s***@mms-dresden.de

2013-06-07 08:30:21 UTC

Hi,

there is a firewall between our Spacewalk-Server and our Oracle-DB, that
silently drops connections that have not been used for one hour.

When the Spacewalk-Server tries to reuse Database connections after they
have been idle for more than 60 minutes, it won't succeed. It takes quite a
while until it gives up reusing such connections and just opens up a new
one.

When a spacewalk-client tries to contact the spacewalk-server it usually
takes more than 10 minutes of constant retries until a connection succeeds.
That seems to be the time the spacewalk-server needs to get a new, stable
connection to the Database.

Since the firewall's behavior can't be adjusted, the most obvious solution
would be to enable keepalive probes for the database connection. But I did
not find a way to do this.

First I tried to set "SQLNET.EXPIRE_TIME = 10" in
/etc/rhn/tns_admin/osa-dispatcher/sqlnet.ora - which should tell the Oracle
Instant Client to use its implemented keepalives procedures. It just doesn't
work.

Then I tried to set appropriate TCP-Keepalive Values in the Linux Kernel
(sysctl -w net.ipv4.tcp_keepalive_time=600). But the sockets opened by
spacewalk don't use "SO_KEEPALIVE" - which effectively disables
tcp-keepalive.

Can someone help me out?

We use spacewalk 1.8 (from the yum repo yum.spacewalkproject.org) on RHEL6
with a separate Oracle 11 database. We use the Oracle Instant Client from
the same repo to connect to the DB.

Best regards,
Stefan

Michael Mraka

2013-06-07 09:57:42 UTC

Permalink

***@mms-dresden.de wrote:
% Hi,
%
% there is a firewall between our Spacewalk-Server and our Oracle-DB, that
% silently drops connections that have not been used for one hour.
...
% Since the firewall's behavior can't be adjusted, the most obvious solution
% would be to enable keepalive probes for the database connection. But I did
% not find a way to do this.
%
% First I tried to set "SQLNET.EXPIRE_TIME = 10" in
% /etc/rhn/tns_admin/osa-dispatcher/sqlnet.ora - which should tell the Oracle
% Instant Client to use its implemented keepalives procedures. It just doesn't
% work.

Hi Stefan,

SQLNET.EXPIRE_TIME is server option so you have to put it into
sqlnet.ora on database server side not spacewalk (=database client)
side.

Regards,

--
Michael Mráka
Satellite Engineering, Red Hat

Jan Pazdziora

2013-06-07 12:29:46 UTC

Permalink

Post by s***@mms-dresden.de
First I tried to set "SQLNET.EXPIRE_TIME = 10" in
/etc/rhn/tns_admin/osa-dispatcher/sqlnet.ora - which should tell the Oracle
Instant Client to use its implemented keepalives procedures. It just doesn't
work.

As pointed out by Michael, this is server-side thing which allows the
server to detect dead connections and release server resources.

Post by s***@mms-dresden.de
Then I tried to set appropriate TCP-Keepalive Values in the Linux Kernel
(sysctl -w net.ipv4.tcp_keepalive_time=600). But the sockets opened by
spacewalk don't use "SO_KEEPALIVE" - which effectively disables
tcp-keepalive.
Can someone help me out?

I believe you need to specify

ENABLE=BROKEN

in your tnsnames.ora to enable TCP keepalive.

--
Jan Pazdziora
Principal Software Engineer, Identity Management Engineering, Red Hat

s***@mms-dresden.de

2013-06-10 09:18:22 UTC

Permalink

We have enabled keepalive on the Oracle server using SQLNET.EXPIRE_TIME, but
the situation has not changed.

Post by Jan Pazdziora
I believe you need to specify
ENABLE=BROKEN
in your tnsnames.ora to enable TCP keepalive.

We can not use "ENABLE=BROKEN" on the Oracle Client, since Spacewalk uses
Oracle Instant Client - which has no tnsnames.ora.
Is there a place where I can tell spacewalk to enable keepalive when using
Oracle Instant Client? Maybe an option for /etc/rhn/cluster.ini or
/etc/rhn/rhn.conf ?

--
Stefan Lasche
Unix System Engineer

Michael Mraka

2013-06-10 09:49:56 UTC

Permalink

***@mms-dresden.de wrote:
% We have enabled keepalive on the Oracle server using SQLNET.EXPIRE_TIME, but
% the situation has not changed.
%
% > I believe you need to specify
% >
% > ENABLE=BROKEN
% >
% > in your tnsnames.ora to enable TCP keepalive.
%
% We can not use "ENABLE=BROKEN" on the Oracle Client, since Spacewalk uses
% Oracle Instant Client - which has no tnsnames.ora.
% Is there a place where I can tell spacewalk to enable keepalive when using
% Oracle Instant Client? Maybe an option for /etc/rhn/cluster.ini or
% /etc/rhn/rhn.conf ?

Hi Stefan,

you can create /etc/tnsname.ora with connect string you need.
Then just change db_name in /etc/rhn/rhn.conf to the tnsname you've created
(instead of default //host:port/name) and that's it :).

Regards,

--
Michael Mráka
Satellite Engineering, Red Hat

Stefan Lasche

2013-06-13 08:48:03 UTC

Permalink

Post by Michael Mraka
Hi Stefan,
you can create /etc/tnsname.ora with connect string you need.
Then just change db_name in /etc/rhn/rhn.conf to the tnsname you've created
(instead of default //host:port/name) and that's it :).

Thanks a lot Michael! That did the trick. The file is actually called /etc/tnsnames.ora and not /etc/tnsname.ora
Small difference, with big effects ;)
I wrote a small tutorial - just in case anyone else is interested:

How to implement Keepalive for Oracle Instant Client (for use in Spacewalk). Oracle Client will use the TCP Keepalive Feature of the Operating System (in my case Linux).

1) create /etc/tnsnames.ora by closely following the example in the "ENABLE" Section of the Oracle Docs: http://docs.oracle.com/cd/B28359_01/network.111/b28317/tnsnames.htm#CHDCDGCE

2) use Michael's magic and edit Spacewalk's /etc/rhn/rhn.conf. Change "db_name = //your_db_server_name/your_db "
into "db_name = net_service_name" (use the same net_service_name you used in your /etc/tnsnames.ora)

3) check the kernel's current keepalive values:
# sysctl -a | grep keepalive
net.ipv4.tcp_keepalive_time = 7200 # send the first keepalive after 3600 seconds (2h)
net.ipv4.tcp_keepalive_intvl = 75 # after the first keepalive, send keepalives every 75 seconds
net.ipv4.tcp_keepalive_probes = 9 # kill connection after 9 failed keepalives

4) Adjust the kernels Keepalive values to your needs:
# sysctl –w net.ipv4.tcp_keepalive_time=1800
# echo “net.ipv4.tcp_keepalive_time = 1800” >> /etc/sysctl.conf

5) restart Spacewalk
# /usr/sbin/spacewalk-service stop && /usr/sbin/spacewalk-service start

6) check if keepalive is now enabled for your database connections (use "netstat -tpno" and check the last two columns for the word "keepalive" and the corresponding timer values)
# netstat -tnpo
tcp 0 0 192.168.1.1:54729 192.168.2.2:1521 ESTABLISHED 26988/java keepalive (1530.03/0/0)
tcp 0 0 192.168.1.1:54718 192.168.2.2:1521 ESTABLISHED 26944/httpd keepalive (1522.62/0/0)

7) enjoy your space walk ;)

__
regards,
Stefan

Michael Mraka

2013-06-13 09:20:48 UTC

Permalink

Stefan Lasche wrote:
% > Hi Stefan,
% >
% > you can create /etc/tnsname.ora with connect string you need.
% > Then just change db_name in /etc/rhn/rhn.conf to the tnsname you've created
% > (instead of default //host:port/name) and that's it :).
% >
%
% Thanks a lot Michael! That did the trick. The file is actually called /etc/tnsnames.ora and not /etc/tnsname.ora
% Small difference, with big effects ;)

Oops, sorry for the typo.
I hope you didn't spend ages to find it out ;).

% I wrote a small tutorial - just in case anyone else is interested:
...

Thanks both for success report and tutorial.

Regards,

--
Michael Mráka
Satellite Engineering, Red Hat