Discussion:
[Spacewalk-list] osa-dispatcher problems. no system ever, reports
Stefan Bluhm
2010-02-09 22:06:36 UTC
Permalink
Hi Josh,

I have just now done a very fresh clean install:
1 VMWare Fedora 12
2 yum update
3 installation of spacewalk nightly (using an existing Oracle DB server on a different machine)

I still get

Starting osa-dispatcher: /usr/lib/python2.6/site-packages/jabber/jabber.py:68: DeprecationWarning: the sha module is deprecated; use the hashlib module instead
import sha, time
RHN 4682 2010/02/09 21:36:39 +02:00: ('Traceback (most recent call last):\n File "/usr/share/rhn/osad/jabber_lib.py", line 254, in setup_connection\n c = self._get_jabber_client(js)\n File "/usr/share/rhn/osad/jabber_lib.py", line 311, in _get_jabber_client\n c.connect()\n File "/usr/share/rhn/osad/jabber_lib.py", line 577, in connect\n jabber.Client.connect(self)\n File "/usr/lib/python2.6/site-packages/jabber/xmlstream.py", line 464, in connect\n else: self._sock.connect((self._hostIP, self._port))\n File "<string>", line 1, in connect\nerror: [Errno 111] Connection refused\n',)


Now I am testing around a bit:
telnet spacewalk 5222 -> Connection Successful
netstat -ptnl -> c2s is listening on 5222 IPv4
db_verify authreg.db sm.db ; echo $? -> 0
xmllint --noout /etc/jabberd/*.xml -> no error
/etc/rhn/rhn.conf looks ok

Some log extracts which might mean something:
osa-dispatcher.log
2010/02/09 21:01:17 +02:00 4701 0.0.0.0: rhnSQL/driver_cx_Oracle.connect('Connecting to database', 'XE')
2010/02/09 21:01:17 +02:00 4701 0.0.0.0: osad/jabber_lib.__init__
2010/02/09 21:01:17 +02:00 4701 0.0.0.0: osad/jabber_lib.print_message('socket error',)
2010/02/09 21:01:17 +02:00 4701 0.0.0.0: osad/jabber_lib.print_message('Could not connect to jabber server', 'spacewalk.domain.com')


messages log on /usr/sbin/rhn-satellite stop
Feb 9 21:49:51 spacewalk jabberd/c2s[7024]: connection to router closed
Feb 9 21:49:51 spacewalk jabberd/c2s[7024]: attempting reconnect (3 left)
Feb 9 21:49:51 spacewalk jabberd/sm[7001]: connection to router closed
Feb 9 21:49:51 spacewalk jabberd/sm[7001]: attempting reconnect (3 left)
Feb 9 21:49:51 spacewalk jabberd/s2s[7047]: connection to router closed
Feb 9 21:49:51 spacewalk jabberd/s2s[7047]: attempting reconnect (3 left)
Feb 9 21:49:52 spacewalk jabberd/sm[7001]: attempting connection to router at 127.0.0.1, port=5347
Feb 9 21:49:52 spacewalk jabberd/sm[7001]: shutting down
Feb 9 21:49:53 spacewalk jabberd/c2s[7024]: attempting connection to router at 127.0.0.1, port=5347
Feb 9 21:49:53 spacewalk jabberd/c2s[7024]: [5] [router] write error: Connection refused (111)
Feb 9 21:49:53 spacewalk jabberd/c2s[7024]: connection to router closed
Feb 9 21:49:53 spacewalk jabberd/c2s[7024]: attempting reconnect (2 left)
Feb 9 21:49:53 spacewalk jabberd/s2s[7047]: attempting connection to router at 127.0.0.1, port=5347
Feb 9 21:49:53 spacewalk jabberd/s2s[7047]: [6] [router] write error: Connection refused (111)
Feb 9 21:49:53 spacewalk jabberd/s2s[7047]: connection to router closed
Feb 9 21:49:53 spacewalk jabberd/s2s[7047]: attempting reconnect (2 left)
Feb 9 21:49:53 spacewalk jabberd/c2s[7024]: attempting connection to router at 127.0.0.1, port=5347
Feb 9 21:49:53 spacewalk jabberd/c2s[7024]: shutting down
Feb 9 21:49:54 spacewalk jabberd/s2s[7047]: attempting connection to router at 127.0.0.1, port=5347
Feb 9 21:49:54 spacewalk jabberd/s2s[7047]: shutting down
Feb 9 21:49:54 spacewalk jabberd/s2s[7047]: connection to router closed

the start only shows successes in the messages log.

I keep getting these errors via email. Are they of any concern?
com.redhat.rhn.common.hibernate.HibernateRuntimeException: HibernateException executing CachedStatement
at com.redhat.rhn.common.db.datasource.CachedStatement.execute(CachedStatement.java:280)
at com.redhat.rhn.common.db.datasource.SelectMode.execute(SelectMode.java:105)
at com.redhat.rhn.taskomatic.task.repomd.ChannelRepodataDriver.getCandidates(ChannelRepodataDriver.java:86)
at com.redhat.rhn.taskomatic.task.threaded.TaskQueue.run(TaskQueue.java:106)
at com.redhat.rhn.taskomatic.task.ChannelRepodata.execute(ChannelRepodata.java:58)
at com.redhat.rhn.taskomatic.task.SingleThreadedTestableTask.execute(SingleThreadedTestableTask.java:54)
at org.quartz.core.JobRunShell.run(JobRunShell.java:203)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:520)
Caused by: org.hibernate.StaleStateException: Batch update returned unexpected row count from update [0]; actual row count: 0;
expected: 1
at org.hibernate.jdbc.Expectations$BasicExpectation.checkBatched(Expectations.java:61)
at org.hibernate.jdbc.Expectations$BasicExpectation.verifyOutcome(Expectations.java:46)
at org.hibernate.jdbc.NonBatchingBatcher.addToBatch(NonBatchingBatcher.java:24)
at org.hibernate.persister.entity.AbstractEntityPersister.delete(AbstractEntityPersister.java:2520)
at org.hibernate.persister.entity.AbstractEntityPersister.delete(AbstractEntityPersister.java:2697)
at org.hibernate.action.EntityDeleteAction.execute(EntityDeleteAction.java:74)
at org.hibernate.engine.ActionQueue.execute(ActionQueue.java:250)
at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:234)
at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:146)
at org.hibernate.event.def.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:298)
at org.hibernate.event.def.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:27)
at org.hibernate.impl.SessionImpl.flush(SessionImpl.java:1000)
at com.redhat.rhn.common.db.datasource.CachedStatement.stealConnection(CachedStatement.java:853)
at com.redhat.rhn.common.db.datasource.CachedStatement.execute(CachedStatement.java:441)
... 11 more


Is there a way I can initiate and test the connections manually and manually simulate the behaviour?
Joshua Roys
2010-02-09 22:25:20 UTC
Permalink
Post by Stefan Bluhm
telnet spacewalk 5222 -> Connection Successful
netstat -ptnl -> c2s is listening on 5222 IPv4
db_verify authreg.db sm.db ; echo $? -> 0
xmllint --noout /etc/jabberd/*.xml -> no error
/etc/rhn/rhn.conf looks ok
osa-dispatcher.log
2010/02/09 21:01:17 +02:00 4701 0.0.0.0: rhnSQL/driver_cx_Oracle.connect('Connecting to database', 'XE')
2010/02/09 21:01:17 +02:00 4701 0.0.0.0: osad/jabber_lib.__init__
2010/02/09 21:01:17 +02:00 4701 0.0.0.0: osad/jabber_lib.print_message('socket error',)
2010/02/09 21:01:17 +02:00 4701 0.0.0.0: osad/jabber_lib.print_message('Could not connect to jabber server', 'spacewalk.domain.com')
That is very strange that telnet can connect but osa-dispatcher can't.
Are they connecting to the same IP/port?

Let's try 2 more things:

# osa-dispatcher -N -vvvvvvvvvvvv

and look in the output for anything suspicious that might be causing
this, and then if that doesn't:

# strace -fe trace=network osa-dispatcher -N
(and if that doesn't work, just `strace -f osa-dispatcher -N')

Good luck,

Josh
Stefan Bluhm
2010-02-09 22:54:36 UTC
Permalink
Thanks Josh,

this is osa-dispatcher on vvvvvvvvv

<error><conflict xmlns = 'urn:ietf:params:xml:ns:xmpp-streams' /></error>


and this is the strace
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 6
socket(PF_NETLINK, SOCK_RAW, 0) = 7
bind(7, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 0
getsockname(7, {sa_family=AF_NETLINK, pid=2557, groups=00000000}, [12]) = 0
sendto(7, "\24\0\0\0\26\0\1\3f\332qK\0\0\0\0\0\0\0\0", 20, 0,
{sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 20
recvmsg(7, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000},
msg_iov(1)=[{"0\0\0\0\24\0\2\0f\332qK\375\t\0\0\2\10\200\376\1\0\0\0\10\0\1\0\177\0\0\1"...,
4096}], msg_controllen=0, msg_flags=0}, 0) = 108
recvmsg(7, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000},
msg_iov(1)=[{"@\0\0\0\24\0\2\0f\332qK\375\t\0\0\n\200\200\376\1\0\0\0\24\0\1\0\0\0\0\0"...,
4096}], msg_controllen=0, msg_flags=0}, 0) = 128
recvmsg(7, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000},
msg_iov(1)=[{"\24\0\0\0\3\0\2\0f\332qK\375\t\0\0\0\0\0\0\1\0\0\0\24\0\1\0\0\0\0\0"...,
4096}], msg_controllen=0, msg_flags=0}, 0) = 20
socket(PF_INET, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 7
connect(7, {sa_family=AF_INET, sin_port=htons(53),
sin_addr=inet_addr("192.168.0.12")}, 28) = 0
send(7, "\26\332\1\0\0\1\0\0\0\0\0\0\tspacewalk\10bluhm-de\3"..., 40,
MSG_NOSIGNAL) = 40
recvfrom(7,
"\26\332\205\200\0\1\0\1\0\1\0\0\tspacewalk\10bluhm-de\3"..., 1024, 0,
{sa_family=AF_INET, sin_port=htons(53),
sin_addr=inet_addr("192.168.0.12")}, [16]) = 78
connect(6, {sa_family=AF_INET, sin_port=htons(5222),
sin_addr=inet_addr("192.168.0.28")}, 16) = 0
send(6, "<?xml version='1.0' encoding='UT"..., 165, 0) = 165
recv(6, "<?xml version='1.0'?><stream:str"..., 1024, 0) = 459
send(6, "<starttls xmlns='urn:ietf:params"..., 52, 0) = 52
recv(6, "<proceed xmlns='urn:ietf:params:"..., 1024, 0) = 50
RHN 2557 2010/02/09 22:58:24 +02:00: ('Error caught:',)
RHN 2557 2010/02/09 22:58:24 +02:00: ('Traceback (most recent call
last):\n File "/usr/share/rhn/osad/jabber_lib.py", line 122, in
main\n self.process_forever(c)\n File
"/usr/share/rhn/osad/jabber_lib.py", line 180, in process_forever\n
self.process_once(client)\n File
"/usr/share/rhn/osad/osa_dispatcher.py", line 146, in process_once\n
client.retrieve_roster()\n File "/usr/share/rhn/osad/jabber_lib.py",
line 718, in retrieve_roster\n stanza = self.get_one_stanza()\n File
"/usr/share/rhn/osad/jabber_lib.py", line 790, in get_one_stanza\n
self.process(timeout=tm)\n File "/usr/share/rhn/osad/jabber_lib.py",
line 1048, in process\n raise SSLError("OpenSSL error; will retry",
str(e))\nSSLError: (\'OpenSSL error; will retry\', "(-1, \'Unexpected
EOF\')")\n',)



To me it looks like it might have something to do the SSL or so.
Bushby, Bruce (London)(c)
2010-02-09 23:34:40 UTC
Permalink
Perhaps try a tidy up of your /etc/hosts file and look at hostname/FQDN
?

"....Now I am testing around a bit:
telnet spacewalk 5222 -> Connection Successful...."

Yet in your logs
".......jabber_lib.print_message('Could not connect to jabber server',
'spacewalk.domain.com')......"

So is your domain "domain.com"?


Clean out /etc/hosts (if you can), mine looks similar to this:
#
127.0.0.1 localhost.localdomain localhost
#
192.168.1.10 spacewalk.home.co.uk spacewalk
#


Make sure /etc/nsswitch.conf has dns (files first)

What hostname (FQDN) did you give for the SSL cert?

I had a similar problem and fixed it by cleaning up my hosts file and
making sure I only used FQDN.





-----Original Message-----
From: spacewalk-list-***@redhat.com
[mailto:spacewalk-list-***@redhat.com] On Behalf Of Stefan Bluhm
Sent: 09 February 2010 22:55
To: spacewalk-***@redhat.com
Subject: Re: [Spacewalk-list] osa-dispatcher problems. no system
ever,reports

Thanks Josh,

this is osa-dispatcher on vvvvvvvvv

<error><conflict xmlns = 'urn:ietf:params:xml:ns:xmpp-streams'
/></error>


and this is the strace
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 6
socket(PF_NETLINK, SOCK_RAW, 0) = 7
bind(7, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 0
getsockname(7, {sa_family=AF_NETLINK, pid=2557, groups=00000000}, [12])
= 0 sendto(7, "\24\0\0\0\26\0\1\3f\332qK\0\0\0\0\0\0\0\0", 20, 0,
{sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 20 recvmsg(7,
{msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000},
msg_iov(1)=[{"0\0\0\0\24\0\2\0f\332qK\375\t\0\0\2\10\200\376\1\0\0\0\10\
0\1\0\177\0\0\1"...,
4096}], msg_controllen=0, msg_flags=0}, 0) = 108 recvmsg(7,
{msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000},
msg_iov(1)=[{"@\0\0\0\24\0\2\0f\332qK\375\t\0\0\n\200\200\376\1\0\0\0\24
\0\1\0\0\0\0\0"...,
4096}], msg_controllen=0, msg_flags=0}, 0) = 128 recvmsg(7,
{msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000},
msg_iov(1)=[{"\24\0\0\0\3\0\2\0f\332qK\375\t\0\0\0\0\0\0\1\0\0\0\24\0\1\
0\0\0\0\0"...,
4096}], msg_controllen=0, msg_flags=0}, 0) = 20 socket(PF_INET,
SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 7 connect(7, {sa_family=AF_INET,
sin_port=htons(53), sin_addr=inet_addr("192.168.0.12")}, 28) = 0 send(7,
"\26\332\1\0\0\1\0\0\0\0\0\0\tspacewalk\10bluhm-de\3"..., 40,
MSG_NOSIGNAL) = 40
recvfrom(7,
"\26\332\205\200\0\1\0\1\0\1\0\0\tspacewalk\10bluhm-de\3"..., 1024, 0,
{sa_family=AF_INET, sin_port=htons(53),
sin_addr=inet_addr("192.168.0.12")}, [16]) = 78 connect(6,
{sa_family=AF_INET, sin_port=htons(5222),
sin_addr=inet_addr("192.168.0.28")}, 16) = 0 send(6, "<?xml
version='1.0' encoding='UT"..., 165, 0) = 165 recv(6, "<?xml
version='1.0'?><stream:str"..., 1024, 0) = 459 send(6, "<starttls
xmlns='urn:ietf:params"..., 52, 0) = 52 recv(6, "<proceed
xmlns='urn:ietf:params:"..., 1024, 0) = 50 RHN 2557 2010/02/09 22:58:24
+02:00: ('Error caught:',) RHN 2557 2010/02/09 22:58:24 +02:00:
('Traceback (most recent call last):\n File
"/usr/share/rhn/osad/jabber_lib.py", line 122, in
main\n self.process_forever(c)\n File
"/usr/share/rhn/osad/jabber_lib.py", line 180, in process_forever\n
self.process_once(client)\n File
"/usr/share/rhn/osad/osa_dispatcher.py", line 146, in process_once\n
client.retrieve_roster()\n File "/usr/share/rhn/osad/jabber_lib.py",
line 718, in retrieve_roster\n stanza = self.get_one_stanza()\n File

"/usr/share/rhn/osad/jabber_lib.py", line 790, in get_one_stanza\n
self.process(timeout=tm)\n File "/usr/share/rhn/osad/jabber_lib.py",
line 1048, in process\n raise SSLError("OpenSSL error; will retry",
str(e))\nSSLError: (\'OpenSSL error; will retry\', "(-1, \'Unexpected
EOF\')")\n',)



To me it looks like it might have something to do the SSL or so.

_______________________________________________
Spacewalk-list mailing list
Spacewalk-***@redhat.com
https://www.redhat.com/mailman/listinfo/spacewalk-list

**********************************************************************
Please consider the environment before printing this email or its attachments.
The contents of this email are for the named addressees only. It contains information which may be confidential and privileged. If you are not the intended recipient, please notify the sender immediately, destroy this email and any attachments and do not otherwise disclose or use them. Email transmission is not a secure method of communication and Man Investments cannot accept responsibility for the completeness or accuracy of this email or any attachments. Whilst Man Investments makes every effort to keep its network free from viruses, it does not accept responsibility for any computer virus which might be transferred by way of this email or any attachments. This email does not constitute a request, offer, recommendation or solicitation of any kind to buy, subscribe, sell or redeem any investment instruments or to perform other such transactions of any kind. Man Investments reserves the right to monitor, record and retain all electronic communications through its network to ensure the integrity of its systems, for record keeping and regulatory purposes.
Visit us at: www.maninvestments.com
TG0908
**********************************************************************
Stefan Bluhm
2010-02-10 07:12:41 UTC
Permalink
Hi Bruce,

that did the job!!
I added my IP and FQDN to the hosts file and now it works (new entry
192.168.0.28 spacewalk.domain.com)

This is rather strange behavior though as I am able to connect from the
same machine via telnet and use the FQDN from the command line.
Also this was a very clean and fresh install just for the spacewalk
purpose so not adding the FQDN to hosts seems to be a Fedora 12 default.
Post by Bushby, Bruce (London)(c)
So is your domain "domain.com"?
No, I obscured it as I didn't want all my details to be archived on the
internet

Thanks a lot for this pointer Bruce. I guess I would have never found it
as manual testing worked.

Best wishes,

Stefan

Loading...