Planning your XRootD Hardware Deployment

As XRootD is meant to be only a small perturbation on top of the total site storage usage, the hardware requirements are small compared to deploying a storage element.

  • Host count: A beginning site can start with a single node. However, we recommend that the final set of hardware involve three nodes: two for load balancing and one for failover.
  • Network connectivity: Each host needs public-Internet connectivity (DNS, a host certificate, and open ports; the required ports are specified in the installation documents). It will also need to be able to read from your storage element.
  • Hardware requirements: XRootD needs no significant amount of CPU, memory, disk, or network. Sites typically reuse old worker nodes; a node with 4 cores, 8 GB RAM, 20 GB disk, and 1 Gbps connectivity will suffice.

NOTE: If your site hosts more than two XRootD servers, consider deploying an XRootD site redirector that subscribes upstream; all site servers then subscribe to the site redirector instead. This reduces the number of managers (cmsd processes) subscribed to the single upstream manager in the hierarchy. Please refer to the subscription details here.
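For illustration only, here is a minimal sketch of the cmsd subscription directives involved; the hostnames are placeholders, and the linked subscription details remain the authoritative reference:

# On the site redirector (manager) node; subscribe upstream as a meta-manager member:
all.role manager
all.manager meta upstream-redirector.example.org:1213

# On each data server; subscribe to the site redirector instead of the upstream manager:
all.role server
all.manager site-redirector.example.org:1213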

TIP: For large clusters with 65 or more data servers, a supervisor mode is available to configure; see the document here.

Recommended tweaks

There are a couple of places where you can tweak the configuration if you have evidence that an XRootD instance (the xrootd or cmsd process) has trouble serving client requests or is otherwise functionally limited.

System (OS) level

Data servers under high demand might require some changes in system-wide settings to behave normally. If you encounter problems like thread limit reached (in cmsd process messages) or Config maximum number of connections restricted to 65536 (in xrootd process messages), here is a set of things to check and adjust accordingly:

$ cat /proc/`pidof xrootd`/limits
$ cat /proc/`pidof cmsd`/limits
Max open files is usually set to 65536; check whether that is the case and raise it if not. Also, do not limit Max core file size to 0, so that you get a core file when XRootD crashes. Make sure you set the core file size to unlimited, either per process or system-wide:
$ ulimit -c unlimited
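To make such limits persistent for the systemd-managed services, one option is a drop-in file for the xrootd (and, analogously, cmsd) unit. A minimal sketch, assuming the clustered instance name used later on this page:

# /etc/systemd/system/xrootd@clustered.service.d/limits.conf
[Service]
LimitNOFILE=65536
LimitCORE=infinity

Remember to run systemctl daemon-reload after creating or changing drop-in files.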
For a running process, you can generate a core file as follows:
$ gcore $(pidof cmsd)
$ gcore $(pidof xrootd)

Usually, when XRootD crashes, the core file is created under /var/spool/xrootd/. If you get one, report it to hn-cms-wanaccess@cern.ch.
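When reporting, it helps to attach a stack trace extracted from the core file. A sketch, assuming the default binary location and that the core file name carries the PID (<pid> is a placeholder; installing the matching debuginfo packages gives readable symbols):

$ gdb -batch -ex 'thread apply all bt' /usr/bin/xrootd /var/spool/xrootd/core.<pid>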

If your firewall performs connection tracking (i.e. the conntrack module is loaded), running XRootD consumes roughly twice the number of tracked sockets and can exhaust the tracking table. Please check the following values and set them to the recommended ones (consider reasonable values depending on the hardware resources of the machine):

$ cat /proc/sys/net/netfilter/nf_conntrack_max          
65536
$ cat /proc/sys/net/netfilter/nf_conntrack_buckets    
16384
$ cat /proc/sys/net/ipv4/ip_local_port_range             
1024 65535
$ cat /proc/sys/kernel/pid_max                                 
131072
Out of curiosity, check the nf_conntrack_count and somaxconn values to see the utilization:
$ cat /proc/sys/net/netfilter/nf_conntrack_count        
$ cat /proc/sys/net/core/somaxconn

TIP: Unless you already use one, running the Ganglia Monitoring System or another piece of monitoring software is recommended; in case of XRootD troubles it makes it easier to correlate system resource usage with XRootD crashes, etc.

Enable overcommit_memory:

$ sysctl vm.overcommit_memory 
vm.overcommit_memory = 1
TIP: If it is disabled and the machine lacks virtual memory (or has other memory issues), the system limits thread creation for the process, which often leads to a crash of cmsd. When it is enabled, thread creation will only fail when physical memory is fully occupied.
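If it is currently disabled, a minimal way to enable it at runtime and persist it across reboots (reusing the hypothetical sysctl.d file from above):

$ sysctl -w vm.overcommit_memory=1
$ echo 'vm.overcommit_memory = 1' >> /etc/sysctl.d/90-xrootd.conf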

XRootD (configuration) level

XRootD consuming a lot of memory

Look in the system logs for any sign of messages like: xrootd: page allocation failure: order:0, mode:0x20

TIP: This is usually the case on EL7 systems, and you might consider a custom memory allocator instead of the standard glibc one, e.g. installing jemalloc (jemalloc rpm) or tcmalloc (gperftools-libs rpm). Then you need to add the following to your /etc/sysconfig/xrootd:

LD_PRELOAD=/usr/lib64/libjemalloc.so.1     # you may need to adjust the library path depending on the linux flavor
MALLOC_ARENA_MAX=4
Then, on EL7 systems, you might want to enable it in the systemd unit as well, via /etc/systemd/system/xrootd@clustered.service.d/override.conf:
# systemd override file for xrootd@clustered
[Service]
EnvironmentFile=-/etc/sysconfig/xrootd
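After creating the override, reload systemd and restart the instance so the environment file is picked up (assuming the clustered instance name):

$ systemctl daemon-reload
$ systemctl restart xrootd@clustered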

Intermittent SAM xrootd-access test failures

Sometimes the xrootd-access test result marks a site red with critical status, while the site admins believe the SAM test file is present on their system and reachable when they try a manual xrdcp copy. The typical symptom of such a failed SAM xrootd-access test in the log file is: [ERROR] Server responded with an error: [3011] No servers are available to read the file
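To reproduce what the test does, you can run a manual check against your site redirector. This is only a sketch; replace the placeholder host and path with your redirector and the current SAM test file:

$ xrdfs site-redirector.example.org:1094 locate /store/path/to/sam-test-file
$ xrdcp root://site-redirector.example.org:1094//store/path/to/sam-test-file /dev/null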

Depending on your XRootD cms.dfs configuration, you may want to tweak your setup, especially if you are a big site with more than 10 servers, where it is harder to ensure 100% data server availability. What people usually set:

  cms.dfs lookup distrib mdhold 20m redirect immed
TIP: This means the file lookup is broadcast to all servers and redirection is done on the first response received. In redirect immed mode, the site redirector picks a node at random without checking whether the file exists. Hence, in this mode it is important that 100% of the data servers are working; otherwise a "file not found" from a broken server might get cached, and new requests will return the file as missing even though it is present on another site server.

If you are experiencing the situation described above, the recommended change is:

  cms.dfs lookup central mdhold 20m redirect verify
TIP: With lookup central only the redirector performs the file lookup, and with redirect verify it confirms that the file actually exists on the selected server before redirecting the client, so a single broken data server no longer poisons the cache with "file not found" answers.