History for UnfairLinuxThreadsAndPythonOhMy
From a post to comp.lang.python, 20 April 1999, by NeilSchemenaur:
I think I might have found part of the problem. My Debian Linux
system has glibc-2.1.3 which includes LinuxThreads 0.7. From the
LinuxThreads FAQ:
-- D.6: Scheduling seems to be very unfair when there is strong
contention on a mutex: instead of giving the mutex to each
thread in turn, it seems that it's almost always the same
thread that gets the mutex. Isn't this completely broken
behavior?
-- What happens is the following: when a thread unlocks a mutex,
all other threads that were waiting on the mutex are sent a
signal which makes them runnable. However, the kernel
scheduler may or may not restart them immediately. If the
thread that unlocked the mutex tries to lock it again
immediately afterwards, it is likely that it will succeed,
because the threads haven't yet restarted. This results in an
apparently very unfair behavior, when the same thread
repeatedly locks and unlocks the mutex, while other threads
can't lock the mutex.
-- This is perfectly acceptable behavior with respect to the
POSIX standard: for the default scheduling policy, POSIX
makes no guarantees of fairness, such as "the thread waiting
for the mutex for the longest time always acquires it first".
This allows implementations of mutexes to remain simple and
efficient. Properly written multithreaded code avoids that
kind of heavy contention on mutexes, and does not run into
fairness problems. If you need scheduling guarantees, you
should consider using the real-time scheduling policies
SCHED_RR and SCHED_FIFO, which have precisely defined
scheduling behaviors.
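The behavior the FAQ describes can be observed directly by counting which thread wins a heavily contended lock. Below is a minimal sketch in modern Python (an assumption of this example: the `threading` module rather than the 1999-era `thread` module, and an arbitrary half-second run). On some platforms the counts come out heavily skewed toward one thread, which is exactly the "unfair" pattern described above.

```python
import threading
import time
from collections import Counter

counts = Counter()          # lock acquisitions per thread name
lock = threading.Lock()
stop = threading.Event()

def worker(name):
    # Lock and unlock in a tight loop with no pause: the unlocking
    # thread often re-acquires before the waiters are rescheduled.
    while not stop.is_set():
        with lock:
            counts[name] += 1

threads = [threading.Thread(target=worker, args=("t%d" % i,))
           for i in range(4)]
for t in threads:
    t.start()
time.sleep(0.5)
stop.set()
for t in threads:
    t.join()
print(dict(counts))         # a very skewed distribution suggests unfairness
```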
Threaded Python contends heavily for a few mutexes. Adding
sched_yield() to a few strategic places seems to improve things a
lot but I don't know if it is the proper solution. Does anyone
else know better? LinuxThreads 0.8 is supposed to be more fair.
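Neil's `sched_yield()` workaround can be sketched in modern Python, where the call is exposed as `os.sched_yield()` (an assumption here: Python 3.3+ on a POSIX system; the worker function and iteration count are made up for illustration). Yielding right after the unlock gives a waiting thread a chance to be scheduled before the same thread tries to relock.

```python
import os
import threading

lock = threading.Lock()
acquired = 0

def polite_worker(iterations=1000):
    global acquired
    for _ in range(iterations):
        with lock:
            acquired += 1   # critical section
        # Hint to the scheduler: let a waiting thread run before
        # we try to take the lock again.
        os.sched_yield()

workers = [threading.Thread(target=polite_worker) for _ in range(2)]
for w in workers:
    w.start()
for w in workers:
    w.join()
print(acquired)             # 2 threads x 1000 iterations = 2000
```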
I think the attached code shows the problem (or maybe I just
don't understand threads at all :). On my _uniprocessor_ machine
I get about four stars before the new thread seems to stop
running::
======================================================================
import thread
import os
import sys

def run():
    while 1:
        if os.fork() == 0:
            sys.stderr.write('*')
            break
        os.wait()

thread.start_new_thread(run, ())
while 1:
    pass
======================================================================
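For readers on modern systems, here is a rough Python 3 rendering of the same test (assumptions in this sketch: the removed `thread` module is replaced by `threading`, the fork loop is capped at ten iterations, the child calls `os._exit()` so it cannot fall through into the spinning loop, and a deadline is added so the script terminates either way). Note that modern NPTL and scheduler changes may mean the 1999 behavior does not reproduce.

```python
import os
import sys
import threading
import time

def run(n=10):
    # Fork n children; each prints a star and exits, the parent reaps it.
    for _ in range(n):
        if os.fork() == 0:
            sys.stderr.write('*')
            os._exit(0)     # child: leave immediately
        os.wait()           # parent: reap the child, then fork again

t = threading.Thread(target=run)
t.start()
# Busy-wait like the original main thread, but with a deadline so the
# sketch terminates even if the fork thread stalls.
deadline = time.time() + 10
while t.is_alive() and time.time() < deadline:
    pass
t.join(timeout=1)
```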
Additional comments by Tony Rossignol (mailto:[email protected]), 2000-04-25
Background:
We run three Red Hat Linux servers as our Zope server farm. Two of these servers have kernel 2.2.12 with glibc 2.1.2, and the third has kernel 2.2.5 with glibc 2.0.7. All three are dual Pentium III servers, ranging in speed from 400 to 500 MHz. The machine with the older kernel is the slowest box.
Both servers with the newer kernel/glibc experience unexplained restarts; server 3, the slower/older system, does not. Frequently server 3 will remain up for 24 hours (we restart nightly when a ZODB clone is copied over).
Results:
Running Neil's script (from above) on the various servers gave the following results: on servers 1 and 2, between 2 and 20 stars were printed before the new thread seemed to stop running, and the CPU was completely consumed by the Python process; server 3 printed stars until the process was killed.
Meaning:
I don't know. But this is the first solid example I've seen that illustrates the observed differences between our servers and offers some indication as to the cause.