What hardware do community managers use

Hardware scaling guidelines

These sizing guidelines give an approximation of the hardware required to implement an AEM project. The estimates depend on the architecture of the project, the complexity of the solution, the expected traffic, and the project requirements. This guide helps you determine the hardware needs of a particular solution, or find an upper and lower estimate of those needs.

The following factors should be considered:

  • Network speed

    • Network latency
    • Available bandwidth
  • Computing speed

    • Caching efficiency
    • Expected traffic volume
    • Complexity of templates, applications and components
    • Simultaneous authors
    • Complexity of the creation process (basic content editing, MSM rollout, etc.)
  • I/O performance

    • File or database storage performance and efficiency
  • Hard disk

    • At least two or three times larger than the size of the repository
  • Random access memory

    • Size of the website (number of content objects, pages and users)
    • Number of simultaneously active users / sessions

Architecture

A typical AEM setup consists of an authoring and a publishing environment. These environments have different requirements in terms of the underlying hardware size and system configuration. Both environments are covered in detail in the sections on the authoring environment and the publishing environment.

In a typical project setup, there are several environments available in which you can stage project phases:

  • Development environment: To develop new features or make significant changes. It is best to work with one development environment per developer (usually local installations on the individual system).

  • Authoring test environment: To review changes. The number of test environments can vary depending on the project requirements (e.g. separately for QA, integration tests or user acceptance tests).

  • Publishing test environment: Mainly for testing social collaboration use cases and/or the interaction between the author and multiple publish instances.

  • Authoring production environment: For authors to edit content.

  • Publishing production environment: To serve published content.

The environments also vary, from a single server system with AEM and an application server to a scaled-up set of multi-server and multi-CPU clusters. It is recommended that you use a separate computer for each production system and that you do not run any other applications on these computers.

Notes on scaling generic hardware

The following sections contain guidelines for calculating the hardware requirements, taking various considerations into account. For large systems, we recommend that you run a simple set of internal benchmark tests on a reference configuration.

Optimizing performance is a fundamental task that must be performed before benchmarking a particular project. Note the information in the documentation on performance optimization before you carry out benchmark tests and use the results for hardware design calculations.

Hardware scaling for advanced use cases should be based on a detailed performance assessment of the project. The characteristics of advanced use cases that require exceptional hardware resources include combinations of:

  • High payload / throughput
  • Extensive use of custom code, custom workflows or third-party software libraries
  • Integration with unsupported external systems

Hard disk space / hard disk

The required storage space depends heavily on the volume and type of your web application. The calculations should take into account:

  • the number and size of pages, assets and other content stored in the repository such as workflows, profiles, etc.
  • the estimated frequency of content changes and thus the creation of content versions
  • the volume of DAM output to be generated
  • the growth of all content over time

Storage space is monitored continuously during online and offline revision cleanup. If the available space falls below a critical value, the process is canceled. The critical value is 25% of the repository's current disk footprint and cannot be configured. It is recommended that you size the disk at least two to three times larger than the repository, including expected growth.
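The two-to-three-times rule above can be sketched as a small calculation. This is only an illustration under stated assumptions: the function name `recommended_disk_gb` and the example sizes are hypothetical and do not come from any AEM tool.

```python
def recommended_disk_gb(repo_gb, expected_growth_gb=0.0):
    """Return the (lower, upper) recommended disk size in GB.

    Applies the two-to-three-times rule to the repository size
    including expected growth. Hypothetical helper, not an AEM API.
    """
    if repo_gb < 0 or expected_growth_gb < 0:
        raise ValueError("sizes must be non-negative")
    total = repo_gb + expected_growth_gb
    return 2 * total, 3 * total


# Example values are assumptions chosen for illustration only.
low, high = recommended_disk_gb(repo_gb=200, expected_growth_gb=50)
print(f"Provision between {low:.0f} GB and {high:.0f} GB")
```

With a 200 GB repository and 50 GB of expected growth, the rule gives a range of 500 GB to 750 GB.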

For data redundancy, redundant arrays of independent hard drives (RAID, e.g. RAID10) are a good choice.

The temporary directory of a production instance should have at least 6 GB of free space.

Virtualization

AEM works well in virtualized environments, but there may be factors such as CPU or I/O that cannot be directly equated with physical hardware. It is generally advisable to select a higher I/O speed, as this is a critical factor in most cases. Benchmarks for your environment are necessary to have a clear understanding of what resources are required.

Parallelization of AEM instances

Resilience

A fail-safe website is deployed on at least two separate systems. If one system fails, another system can take over and compensate for the failure.

System resource scalability

While all systems are running, increased computing power is available. This additional performance does not necessarily grow linearly with the number of cluster nodes, since the relationship depends heavily on the technical environment; see the cluster documentation for more information.

The estimate of how many cluster nodes are necessary is based on the basic requirements and specific use cases of the respective web project:

  • From a resilience perspective, it is necessary to determine for all environments how critical a failure is and how long it will take to restore a cluster node.
  • For the aspect of scalability, the number of write operations is basically the most important factor; see Parallel Working by Authors for the Authoring Environment, and Social Networking for the Publishing Environment. Load balancing can be set up for operations that only access the system to process reads; see dispatcher for details.

Special calculations for the authoring environment

For benchmarking purposes, Adobe has developed some benchmark tests for stand-alone author instances.

  • Benchmark test 1

    Calculate the maximum throughput with a load profile in which users perform a simple page creation exercise on top of a base load of 300 existing pages, all of which are similar. The steps were to log in to the website, create a page with a SWF and image / text, add a tag cloud, and activate the page.

    • Result

      The maximum throughput for a simple page creation exercise like above (considered as a transaction) was 1730 transactions / hour.

  • Benchmark test 2

    Calculate maximum throughput when the load profile has a mixture of creating new pages (10%), modifying an existing page (80%), and creating and then modifying a page (10%). The complexity of the pages remains the same as in the profile of benchmark test 1. The fundamental change to the page is made by adding an image and changing the text content. The exercise was again carried out on a base load of 300 pages with the same complexity as in benchmark test 1.

    • Result

      The maximum throughput for such a mix process was 3252 transactions per hour.

The throughput rate does not distinguish between transaction types within a load profile. The measurement approach ensures that a fixed proportion of each transaction type is included in the workload.

The two tests above show that throughput varies with the type of operation. Use the activities in your environment as the basis for sizing your system. You get better throughput with less intensive actions such as modifying pages (which are also more common).

Caching

In the authoring environment, caching is usually much less efficient, because changes to the website occur more frequently and the content is highly interactive and personalized. With the Dispatcher you can cache AEM libraries, JavaScript, CSS files and layout images. This speeds up some aspects of the editing process. Configuring the web server to set additional headers for browser caching on these resources reduces the number of HTTP requests and thus improves the responsiveness of the system as perceived by authors.

Parallel work by authors

In the authoring environment, the number of authors working in parallel and the load on the system from their interactions are the most important limiting factors. We therefore recommend that you scale your system based on the shared data throughput.

For such scenarios, Adobe performed benchmark tests on a shared-nothing cluster of author instances with two nodes.

  • Benchmark test 1a

    With an active-active shared-nothing cluster of 2 author instances, calculate the maximum throughput with a load profile in which users perform a simple page creation exercise on top of a base load of 300 existing pages, all of which are similar.

    • Result

      The maximum throughput for a simple page creation exercise as above (considered as a transaction) was 2016 transactions / hour. This is an increase of approx. 16% compared to a stand-alone author instance for the same benchmark test.

  • Benchmark test 2b

    With an active-active shared-nothing cluster of 2 author instances, calculate the maximum throughput when the load profile has a mix of creating new pages (10%), modifying existing pages (80%), and creating and then modifying a page in succession (10%). The complexity of the page remains the same as in the profile of benchmark test 1. The basic modification to a page is made by adding an image and changing the text content. Here, too, the exercise was carried out on a base load of 300 pages with the same complexity as in benchmark test 1.

    • Result

      The maximum throughput for such a mixed operating scenario was 6288 transactions / hour. This is an increase of approx. 93% compared to an independent author instance for the same benchmark test.
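The percentage gains quoted for the two-node cluster can be checked directly from the four throughput figures reported in this document. The sketch below is plain arithmetic; the helper name `percent_increase` is hypothetical.

```python
def percent_increase(baseline, measured):
    """Relative gain of `measured` over `baseline`, in percent."""
    return (measured - baseline) / baseline * 100.0


# Transactions per hour, as reported in the benchmark results.
standalone = {"page creation": 1730, "mixed operations": 3252}
cluster = {"page creation": 2016, "mixed operations": 6288}

for name in standalone:
    gain = percent_increase(standalone[name], cluster[name])
    print(f"{name}: {gain:.1f}% more transactions per hour")
```

The results (about 16.5% and 93.4%) agree with the approximate 16% and 93% stated in the results, and show how much more the second node helps for modification-heavy work than for pure page creation.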

The throughput rate does not distinguish between transaction types within a load profile. The measurement approach ensures that a fixed proportion of each transaction type is included in the workload.

The two tests above show that AEM scales well for authors performing basic editing operations. In general, AEM is most effective at scaling read operations.

On a typical website, most authoring happens during the project phase. After launch, the number of authors working in parallel usually drops to a lower (regular operation) average.

You can calculate the number of computers (or CPUs) required for your authoring environment as follows:

This formula can serve as a general guideline for scaling CPUs when authors perform basic operations with AEM. It is assumed that the system and application are optimized. However, the formula does not apply to advanced features such as MSM or Assets (see below).

Also see the additional comments on parallelization and performance tuning.

Hardware recommendations

Typically, you can use the same hardware for your authoring environment that is recommended for your publishing environment. Usually website traffic on authoring systems is much lower, but cache efficiency is also lower. However, the number of authors working in parallel and the type of actions that are carried out on the system are decisive. In general, AEM (the authoring environment) clustering is most effective at scaling read operations; in other words, an AEM cluster scales well with authors performing basic editing operations.

The Adobe benchmark tests were performed using the RedHat 5.5 operating system, which ran on a Hewlett-Packard ProLiant DL380 G5 hardware platform with the following configuration:

  • Two quad-core Intel Xeon X5450 CPUs at 3.00 GHz
  • 8 GB of RAM
  • Broadcom NetXtreme II BCM5708 Gigabit Ethernet
  • HP Smart Array RAID Controller, 256 MB cache
  • Two 146 GB 10,000 RPM SAS hard drives configured as a RAID0 stripe set
  • SPEC CINT2006 rate benchmark score is 110

AEM instances ran with a minimum heap size of 256M and a maximum heap size of 1024M.

Special calculations for the publishing environment

Caching efficiency and traffic

The efficiency of the caching is critical to the speed of the website. The following table shows how many pages per second an optimized AEM system can process with a reverse proxy like Dispatcher:

Cache ratio   Pages / s (peak value)   Million pages / day (average)
100%          1000-2000                35-70
99%           910                      32
95%           690                      25
90%           520                      18
60%           220                      8
0%            100                      3.5

Note: The numbers are based on a standard hardware configuration and may vary depending on the hardware used.

The cache ratio indicates the percentage of pages the Dispatcher can return without accessing AEM. 100% means that the Dispatcher answers all requests; 0% means that AEM computes every single page.
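To make the cache ratio concrete, the following sketch derives it from request counts and looks up the matching row of the table. The function names and request counts are hypothetical. Rounding down to the nearest listed ratio, and using the lower bound of the 100% row, are conservative assumptions of this sketch rather than part of the guidance.

```python
# Peak pages per second by cache ratio, copied from the table above.
# The 100% row is a range (1000-2000); the lower bound is used here.
PEAK_PAGES_PER_SECOND = {
    1.00: 1000,
    0.99: 910,
    0.95: 690,
    0.90: 520,
    0.60: 220,
    0.00: 100,
}


def cache_ratio(served_from_cache, total_requests):
    """Fraction of page requests the Dispatcher answered without AEM."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return served_from_cache / total_requests


def conservative_peak_pages(ratio):
    """Peak pages/s of the nearest table row at or below `ratio`."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    eligible = [r for r in PEAK_PAGES_PER_SECOND if r <= ratio]
    return PEAK_PAGES_PER_SECOND[max(eligible)]


# Hypothetical counts: 9,300 of 10,000 page requests served from cache.
ratio = cache_ratio(9_300, 10_000)
print(f"cache ratio {ratio:.0%}: plan for about "
      f"{conservative_peak_pages(ratio)} pages/s at peak")
```

A measured ratio of 93% falls between the 90% and 95% rows, so the sketch reports the 90% figure of 520 pages per second.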

Complexity of templates and applications

If you use complex templates, AEM will take more time to render a page. Pages from the cache are not affected, but the page size is relevant to the overall response time. A complex page can easily take ten times longer to render than a simple page.

Formula

You can use the following formula to calculate an estimate of the overall complexity of your AEM solution:

The complexity can help you determine the number of servers (or CPU cores) you need for your publishing environment:

The variables in the equation are as follows:

traffic: The expected peak traffic per second. This can be estimated as the number of page views per day divided by 35,000.
applicationComplexity: Use 1 for a simple application, 2 for a complex application, or any value in between:

  • 1 - a completely anonymous, content-driven site
  • 1.1 - a completely anonymous, content-driven site with client-side / audience personalization
  • 1.5 - a content-driven site with anonymous and logged-in areas, client-side / audience personalization
  • 1.7 - a content-driven site with anonymous and logged-in areas, client-side / user-specific personalization, and some user-generated content
  • 2 - a site that requires login throughout, with extensive user-generated content and a variety of personalization techniques
cacheRatio: The proportion of pages served from the Dispatcher cache. Use 1 if all pages come from the cache or 0 if every page is computed by AEM.
templateComplexity: Use a value between 1 and 10 to indicate the complexity of your templates. Higher numbers indicate more complex templates, with a value of 1 for sites with an average of 10 components per page, 5 for a page average of 40 components, and 10 for an average of over 100 components.
activations: The average number of activations (replication of medium-sized pages and assets from the author to the publish tier) per hour divided by x, where x is the number of activations a system can perform without affecting the performance of the other tasks it processes. You can predefine a pessimistic initial value such as x = 100.
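Two of the variables above are themselves derived from raw measurements. The sketch below shows only those two derivations and does not reproduce the sizing formula itself. The function names and sample figures are hypothetical.

```python
def peak_traffic_per_second(page_views_per_day):
    """Estimate `traffic`: page views per day divided by 35,000."""
    if page_views_per_day < 0:
        raise ValueError("page_views_per_day must be non-negative")
    return page_views_per_day / 35_000


def activations_term(activations_per_hour, x=100):
    """Estimate `activations`: hourly activations divided by x.

    x defaults to the pessimistic initial value of 100 suggested above.
    """
    if x <= 0:
        raise ValueError("x must be positive")
    return activations_per_hour / x


# Hypothetical site: 3.5 million page views/day, 250 activations/hour.
print(peak_traffic_per_second(3_500_000))  # 100.0
print(activations_term(250))               # 2.5
```

The first result matches the caching table: 100 pages per second at peak corresponds to 3.5 million pages per day.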

If you have a more complex website, you also need more powerful web servers so that AEM can respond to a request in a reasonable time.

  • Complexity below 4:

    • 1024 MB JVM RAM*
    • Low to medium performance CPU
  • Complexity between 4 and 8:

    • 2048 MB JVM RAM*
    • Mid-to-high performance CPU
  • Complexity over 8:

    • 4096 MB JVM RAM*
    • High-end CPU

* In addition to the memory required by your JVM, set up enough memory for your operating system.
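The three tiers above amount to a simple lookup. In the sketch below the function name is hypothetical, and treating the boundary values 4 and 8 as part of the middle tier is an assumption, since the text only says "below 4", "between 4 and 8" and "over 8".

```python
def jvm_recommendation(complexity):
    """Map a complexity score to (JVM RAM in MB, CPU class).

    Boundary values 4 and 8 are assigned to the middle tier, which is
    an assumption of this sketch. Operating system memory is extra.
    """
    if complexity < 4:
        return 1024, "low to medium performance CPU"
    if complexity <= 8:
        return 2048, "mid-to-high performance CPU"
    return 4096, "high-end CPU"


for score in (2.5, 6, 9.5):
    ram_mb, cpu = jvm_recommendation(score)
    print(f"complexity {score}: {ram_mb} MB JVM RAM, {cpu}")
```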

Additional application-specific calculations

In addition to the calculation for a standard web application, you may need to consider specific factors for the following use cases. The calculated values are to be added to the standard calculation.

Asset-specific notices

Optimized hardware resources are required for extensive processing of digital assets; the most important factors here are the image size and the peak throughput of processed images.

Allocate at least 16 GB of heap and configure the DAM Update Asset workflow to use the Camera Raw package for ingesting raw images.

A higher data throughput means that computing resources have to keep pace with system I/O and vice versa. For example, if workflows are started by importing images, uploading many images via WebDAV can lead to a backlog of workflows.

The use of separate hard disks for TarPM, the data store and the search index can help to optimize the I/O behavior of the system (however, it usually makes sense to keep the search index local).

Multi-site manager

The consumption of resources when using MSM in AEM in an authoring environment depends heavily on the specific use cases. The basic factors are:

  • Number of live copies
  • Frequency of the rollout
  • Size of the content tree to be rolled out
  • Associated functionality of the rollout actions

Testing the planned use case with a representative excerpt can help you to better understand resource consumption. By extrapolating the results with the planned throughput, you can estimate the additional resource requirements for the MSM in AEM.

Please also note that authors working in parallel will perceive performance side effects if MSM use cases for AEM consume more resources than planned.

Considerations for Sizing AEM Communities

AEM sites that include AEM Communities features (community sites) experience a high level of interaction from site visitors (members) in the publishing environment.

The sizing considerations for a community site depend on the expected interaction from community members and on whether optimal performance matters more for the page content.

User generated content (UGC) is saved separately from the page content. While the AEM platform uses a node store that replicates website content from the authoring to the publishing environment, AEM Communities uses a single, shared store for UGC that is never replicated.

UGC storage requires choosing a storage resource provider (SRP), and this choice influences the deployment you select.
Please refer