The Python world has this peculiar characteristic that for nearly all ideas that exist out there in the current programming world, there is a prepared solution that appears well-rounded at first and when you then try to “just plug it together”, after a certain while you encounter some most specific cases that make all the “just” go up in smoke and leave you with some hours of research.
That is also, why now, for quite some of these “solutions” there seem to be various similar-but-distinct packages that you better care to understand; which is again, hard, because every Python developer always likes to advertise their package with very easy sounding words.
But that is the fun of Python. If I choose it as suitable for a project, this is the contract I sign 🙂
So I recently ran into a surprisingly non-straightforward case with no simple go-to-solution. Maybe you know one and then I’ll be thankful to try that and discuss it through; sometimes one is just blind from all the options. (It’s also what you sign when choosing Python. I cope.)
Now, I thought a lot about my use case and I will split these findings up into multiple blog posts – thinking asynchronously (“async”) is tricky in many languages, but the Python way, again, is to hide its intricacies in very carefully selected places.
A desktop app with TPC socket HANDLING
As long as your program is a linear script, you do not need async features. But if there is some thing to be done for a longer time (or endless), you cannot afford to wait for it or otherwise intervene.
E.g for our desktop application (employing Python’s very basic tkinter) we sooner or later run into the tk.mainloop() , which, from the OS thread it was called from, is a endless loop that draws the interface, handles input events, updates the interface, repeat. This is blocking, i.e. that thread can now only do other stuff from the event handlers acting on that interface.
You might know: Any desktop UI framework really hates if you try to update its interface from “outside the UI thread”. Just think of any order of unpredictable behaviour if you would e.g. draw a button, then handle its click event and while you’re at it, a background thread removes that button – etc.
The good thing is, you will quickly be told not to do such a thing, the bad thing is, you might end up with an unexpected or difficult to parse error and some question marks about what else to do.
The UI-thread problem is a specific case of “doing stuff in parallel requires you to really manage who can actually access a given resource”. Just google race condition. If you think about it, this holds for managing projects / humans in general, but also for our desktop app that allows simple network connections via TCP socket.
Now the first clarifications have to be done. Python gives you
- on any tkinter widget, you have “.after()” to call any function some time in the future, i.e. you enqueue that function call to be executed after a time has passed; so this is nice for making stuff happen in the UI thread.
- But even small stuff like writing some characters to a file might delay the interface reaction time and people nowadays have no time for that (maybe I’m just particularly impatient.)
- Python’s standard library gives us packages like
threading,asyncioandmultiprocessingpackage, now for long enough to consider them mature. - There are also more advanced solutions, like looking into PyQt, or the mindset “everything is a Web App nowadays”, and they might equip you with asynchronous handling from the start, but remember – The Schneiderei in Softwareschneiderei means that we prefer tailor-made software over having to integrating bloated dependencies that we neither know nor want to maintain all year long.
We conclude with shining our light to the general choice in this project, and a few reasons why.
- What Python calls “threads” are not the same thing as what operating systems understands as a thread (which also differs) among them. See also the Python Global Interpreter Lock. We do not want to dive into all of that.
multiprocessingis the only way to do anything outside the current sytem process, which is what you need for running CPU-heavy tasks in parallel. It’s out our use case, and while it is more “true parallel” it comes with slower startup, more expensive communication costs and also some constraints for such exchange (e.g. to serializable data).- We are IO-bound, i.e. we have no idea when the next network package arrives, so we want to wait in separate event loops that would be blocking on their own (i.e. “while True”, similar to what tkinter does with our main thread.).
- Because we have the main thread stolen by tkinter anyway, we would use a threading.Thread anyway to allow concurrency, but then we face the choice to construct everything with Threads or to employ the newer asyncio features.
The basic block to do that would use a threading.Thread:
class ListenerThread(threading.Thread):
# initialize somehow
def run(self):
while True:
wait_for_input()
process_input()
communicate_with_ui_thread()
# usage like
# listener = ListenerThread(...)
# listener.start(...)
Now to think naively, the pseudo-code communicate_with_ui_thread() could then either lead to the tkinter .after() calls, or employ callback functions passed either in the initialization or the .start() call of the thread. But sadly, it was not as easy as that, because you run several risks of
- just masking your intention and still executing your callbacks blockingly (that thread can freeze the UI)
- still pass UI widget references to your background loop (throwing bad not-the-main-thread-error exceptions)
- have memory leaks in the background gobble up more than you ever wanted
- Lock starvation: bad error telling you
RuntimeError: can't allocate lock - deadlocks, by interdependent callbacks
This list is surely not exhaustive.
So what are the options? Let me discuss these in an upcoming post. For now, let’s just linger all these complications in your brain for a while.