Plone Parallelization—Scale Correctly and Improve Site Capacity

In this article, Studio theYANG shares our experience of correctly scaling a Plone application, from the viewpoint of a seasoned Python web consultant. By analyzing Plone's architecture, we hope to mark out common pitfalls in its parallelization and help you unlock the full power of your hardware.

Plone is an enterprise-level CMS built upon the Zope framework, and it can be extended to meet many specific business needs. Plone's architecture is more complicated than that of other Python frameworks, and its resource needs may sound scary: Studio theYANG has a client running its Plone system on 16 CPU cores and 32 GB of memory!

In fact, despite the many optimizations within every layer, we can describe Plone's architecture in a very simple way. Data is stored as Python objects in the ZODB, the Zope object database. These objects are served by a process called the "ZEO server." Once the ZEO server is up, client processes are started; as you may guess, they are called "ZEO clients." Communication between the server and the clients goes over localhost TCP connections.

That said, when site capacity falls short, it's definitely not as simple as blindly throwing in more ZEO clients and more hardware. Let's take a proper look.

Step Back: Does Your Application Rely on Other Services?

Before getting our hands dirty with Plone, as seasoned system consultants we have to consider and rule out other possible sources of the bottleneck. It's important to identify any other services that the application communicates with:

  • Pure CPU computation: this is rare but may happen in a system with a scientific background, where a computational procedure is incorrectly placed inside the request-response loop. As an application grows, it is important to move this computationally heavy logic into background tasks. Fortunately, there are many solutions in the Python ecosystem, the most notable one being Celery (see the sketch after this list).
  • Disk or network I/O: carelessly implemented I/O is blocking in Python, and a hanging network request will tie up the thread waiting on it. Using a task queue, as above, is one solution; a more Pythonic alternative is async I/O. As for disk I/O, an occasional disk hiccup or failure leads to issues that are both serious and hard to debug.
  • Non-Zope databases: Plone applications tend to incorporate SQL or NoSQL databases alongside the Zope database to better satisfy business needs. It's necessary to ensure these external databases provide sufficient throughput for the Plone application.
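
To make the first bullet concrete, here is a minimal, hypothetical Celery sketch; the broker URL and the task name generate_report are illustrative assumptions, not part of Plone or of any specific project.

    # tasks.py -- minimal Celery sketch for moving heavy computation out of the
    # request-response loop; the broker URL and task name are illustrative only.
    from celery import Celery

    app = Celery("background_jobs", broker="redis://localhost:6379/0")

    @app.task
    def generate_report(document_uid):
        """CPU-heavy work that should never run inside a web request."""
        # ... look up the document by UID and crunch the numbers here ...
        return {"uid": document_uid, "status": "done"}

In the browser view we would only enqueue the work with generate_report.delay(document_uid) and return immediately, letting a separate Celery worker process do the heavy lifting.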

We also want to check resource usage, most prominently CPU, memory and swap, before reaching a conclusion. If we can exclude all peripheral issues and still observe relatively low CPU and memory usage, it's reasonable to turn to Plone itself.

Number of Threads, Number of ZEO Clients

In the simplified architecture above, we skipped an important point: every ZEO client actually spawns a few threads to serve HTTP requests. The Plone application we work on may have already customized this value; it lives in the ZEO client's zope.conf configuration file, under the parameter `zserver-threads`.
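
For reference, the directive sits at the top level of the client's zope.conf (the exact file path depends on your buildout layout; the value shown here is the default discussed below):

    # zope.conf of a ZEO client -- illustrative excerpt
    zserver-threads 4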

Our general suggestion from Studio theYANG is: delete any customization of this value and keep the default. The default of four threads is already a good compromise between parallelism and memory usage. Why? It's tempting to blindly increase this number, but doing so runs into Python's GIL (Global Interpreter Lock), which prevents multiple threads in the same process from executing Python code simultaneously. If that sounds complex, remember that while one thread is using the CPU, the other threads have to wait until it releases the GIL, typically when it turns to I/O. This is a fundamental limitation of CPython, the standard C implementation of Python, and piling on threads can actually slow down the processing of requests in our application. On the other hand, trading threads for extra ZEO clients leads to much higher memory usage, because threads share memory within a process while separate processes do not (that shared memory is also the major reason the GIL exists). Since we're discussing web applications, there are bound to be network I/Os, and a small number of threads per client is the right way to go.
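
As a quick standalone illustration of the GIL effect, unrelated to Plone itself, the following sketch shows that a CPU-bound loop gains nothing from a second thread under CPython:

    # gil_demo.py -- CPU-bound work does not speed up with more threads in CPython.
    import time
    from threading import Thread

    def burn(n=10000000):
        # Pure-Python busy loop with no I/O, so threads just take turns holding the GIL.
        while n:
            n -= 1

    start = time.time()
    burn()
    burn()
    print("sequential :", round(time.time() - start, 2), "s")

    start = time.time()
    threads = [Thread(target=burn) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print("two threads:", round(time.time() - start, 2), "s")  # roughly the same, often worse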

Once the number of threads is settled, we can determine the number of ZEO clients by observing global memory usage and the CPU usage of the ZEO server. But before making any observations, let's not forget about the caching parameters.

Cache

Caching in Zope applications is sophisticated, and we can even find two confusingly similar parameters in zope.conf:

  • Under <zodb_db>, the parameter cache-size sets the number of objects to keep in each connection's in-memory cache.
  • Under <zeoclient>, a parameter with the same name, cache-size, means something different: the size of the ZEO client's cache, in bytes. Both are shown in the excerpt after this list.
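
Here is roughly where the two parameters sit in a ZEO client's zope.conf; the values and the server address are placeholders, not recommendations:

    # zope.conf of a ZEO client -- illustrative excerpt, values are placeholders
    <zodb_db main>
        mount-point /
        # number of objects kept in each connection's in-memory cache
        cache-size 30000
        <zeoclient>
            server localhost:8100
            # ZEO client cache, sized in bytes (suffixes such as MB are accepted)
            cache-size 128MB
        </zeoclient>
    </zodb_db>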

Unlike zserver-threads above, these two caching parameters need to be estimated. The number of objects is the easier estimate: simply count the content types we have and how many objects exist under each type. The other parameter is harder to reason about, so at Studio theYANG we take an empirical approach: measure how much CPU usage one extra ZEO client adds on the ZEO server. For example, if two ZEO clients already account for 40% CPU usage on the ZEO server, and adding a third identical client brings the total to 55%, then each additional client costs about 15% of the server's CPU, so we can run six ZEO clients in total before saturating it. We then estimate the total memory we can devote to those six clients, divide it by six and leave a safe margin. That is the number that eventually goes into zope.conf; the goal is fundamentally to fill up the hardware we have.
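
The arithmetic is simple enough to put into a small helper; every number below is either taken from the example above or invented for illustration (the 24 GB budget and the 20% margin are assumptions, not recommendations):

    # capacity_estimate.py -- back-of-the-envelope sizing; all numbers are illustrative.
    baseline_clients = 2          # ZEO clients already running
    baseline_cpu = 40             # % CPU of the ZEO server with those two clients
    cpu_with_extra_client = 55    # % CPU after adding one more identical client

    per_client_cpu = cpu_with_extra_client - baseline_cpu        # 15% per client
    headroom = 100 - baseline_cpu                                 # 60% left on the ZEO server
    max_clients = baseline_clients + headroom // per_client_cpu   # 2 + 4 = 6 clients

    memory_budget_mb = 24 * 1024  # memory we are willing to give to all ZEO clients
    safety_margin = 0.8           # keep roughly 20% in reserve
    per_client_mb = memory_budget_mb * safety_margin / max_clients

    print(max_clients, "clients, about", int(per_client_mb), "MB each")  # 6 clients, about 3276 MB each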

Load Balancing

With multiple ZEO clients, it's natural to set up a load balancer. Here lies the final reminder: we have to ensure that a given user is always served by the same ZEO client. In a reverse proxy such as Apache, we can set up "sticky" session cookies for this purpose; otherwise, a user's subsequent requests will be scattered across random ZEO clients.
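
A sketch of what this looks like with Apache's mod_proxy_balancer and mod_headers follows; the ports, the balancer name and the ROUTEID cookie are assumptions for two ZEO clients running locally, so adapt them to your own setup:

    # Apache httpd excerpt -- sticky sessions via a route cookie (illustrative)
    Header add Set-Cookie "ROUTEID=.%{BALANCER_WORKER_ROUTE}e; path=/" env=BALANCER_ROUTE_CHANGED

    <Proxy "balancer://plone">
        BalancerMember "http://127.0.0.1:8081" route=1
        BalancerMember "http://127.0.0.1:8082" route=2
        ProxySet stickysession=ROUTEID
    </Proxy>

    ProxyPass        "/" "balancer://plone/"
    ProxyPassReverse "/" "balancer://plone/"

In a real Plone deployment the ProxyPass target would typically also include the VirtualHostMonster path, omitted here for brevity.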

Why is this necessary? First, a human user tends to keep accessing similar content, which touches only a small portion of the Zope objects; sticking the user to the same client therefore keeps the caches effective. More importantly, it's common for a user to click a form submission twice in a row, especially in a business context where users are in a hurry. If both submissions modify the same object and are handled by separate ZEO clients, a ConflictError is likely, which drastically slows down the processing of Zope objects. Zope raises ConflictError when there are two simultaneous write attempts on the same object; it is usually resolved by retrying the request until one write lands after the other, or until the maximum number of retries is reached.
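
To illustrate the mechanism, here is a minimal ZODB-level sketch of that retry loop; in a real Plone site Zope's publisher does this for you, and the server address, key name and retry count below are assumptions:

    # conflict_demo.py -- what a ConflictError retry looks like at the ZODB level.
    import transaction
    from ZEO.ClientStorage import ClientStorage
    from ZODB import DB
    from ZODB.POSException import ConflictError

    root = DB(ClientStorage(("localhost", 8100))).open().root()  # assumed ZEO address

    MAX_RETRIES = 3
    for attempt in range(MAX_RETRIES):
        try:
            # Two clients doing this to the same key at the same time will race.
            root["counter"] = root.get("counter", 0) + 1
            transaction.commit()   # raises ConflictError if a concurrent write won
            break
        except ConflictError:
            transaction.abort()    # drop our stale state and try again
    else:
        raise RuntimeError("still conflicting after %d retries" % MAX_RETRIES)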


We hope this article is helpful for scaling your Plone web system. If you ever need further assistance, please don’t hesitate to contact us.

Published by


Ling YANG

Lead consultant at Studio theYANG, an independent web software consulting studio based in Montreal, Canada, focused on the maintenance and support of Python and Linux systems.
