SYSTEMD.EXEC(5) systemd.exec SYSTEMD.EXEC(5)
NAME
systemd.exec - Execution environment configuration
SYNOPSIS
service.service, socket.socket, mount.mount, swap.swap
DESCRIPTION
Unit configuration files for services, sockets, mount points, and swap
devices share a subset of configuration options which define the
execution environment of spawned processes.
This man page lists the configuration options shared by these four unit
types. See systemd.unit(5) for the common options of all unit
configuration files, and systemd.service(5), systemd.socket(5),
systemd.swap(5), and systemd.mount(5) for more information on the
specific unit configuration files. The execution-specific configuration
options are configured in the [Service], [Socket], [Mount], or [Swap]
sections, depending on the unit type.
OPTIONS
WorkingDirectory=
Takes an absolute directory path. Sets the working directory for
executed processes. If not set, defaults to the root directory when
systemd is running as a system instance and to the respective user's
home directory when running as a user instance.
RootDirectory=
Takes an absolute directory path. Sets the root directory for
executed processes, with the chroot(2) system call. If this is
used, it must be ensured that the process and all its auxiliary
files are available in the chroot() jail.
User=, Group=
Sets the Unix user or group that the processes are executed as,
respectively. Takes a single user or group name or ID as argument.
If no group is set, the default group of the user is chosen.
SupplementaryGroups=
Sets the supplementary Unix groups the processes are executed as.
This takes a space-separated list of group names or IDs. This
option may be specified more than once in which case all listed
groups are set as supplementary groups. When the empty string is
assigned the list of supplementary groups is reset, and all
assignments prior to this one will have no effect. In any case, this
option does not override, but extends, the list of supplementary
groups configured in the system group database for the user.
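Example (a minimal sketch; the binary path, the working directory
and the user and group names are hypothetical):

```ini
[Service]
# Hypothetical daemon binary and data directory.
ExecStart=/usr/bin/example-daemon
WorkingDirectory=/var/lib/example
# Run as an unprivileged (hypothetical) user and group.
User=example
Group=example
# Both assignments are combined: audio, video and dialout are added
# to the groups the user already has in the group database.
SupplementaryGroups=audio video
SupplementaryGroups=dialout
```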
Nice=
Sets the default nice level (scheduling priority) for executed
processes. Takes an integer between -20 (highest priority) and 19
(lowest priority). See setpriority(2) for details.
OOMScoreAdjust=
Sets the adjustment level for the Out-Of-Memory killer for executed
processes. Takes an integer between -1000 (to disable OOM killing
for this process) and 1000 (to make killing of this process under
memory pressure very likely). See proc.txt[1] for details.
IOSchedulingClass=
Sets the IO scheduling class for executed processes. Takes an
integer between 0 and 3 or one of the strings none, realtime,
best-effort or idle. See ioprio_set(2) for details.
IOSchedulingPriority=
Sets the IO scheduling priority for executed processes. Takes an
integer between 0 (highest priority) and 7 (lowest priority). The
available priorities depend on the selected IO scheduling class
(see above). See ioprio_set(2) for details.
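Example (a sketch for a low-priority background job; the binary path
is hypothetical and the values are illustrative only):

```ini
[Service]
# Hypothetical batch job that should yield to interactive work.
ExecStart=/usr/bin/example-backup
# Lower CPU priority (19 is the lowest).
Nice=10
# Make this process a more likely victim of the OOM killer.
OOMScoreAdjust=500
# Lowest priority within the best-effort IO class.
IOSchedulingClass=best-effort
IOSchedulingPriority=7
```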
CPUSchedulingPolicy=
Sets the CPU scheduling policy for executed processes. Takes one of
other, batch, idle, fifo or rr. See sched_setscheduler(2) for
details.
CPUSchedulingPriority=
Sets the CPU scheduling priority for executed processes. The
available priority range depends on the selected CPU scheduling
policy (see above). For real-time scheduling policies an integer
between 1 (lowest priority) and 99 (highest priority) can be used.
See sched_setscheduler(2) for details.
CPUSchedulingResetOnFork=
Takes a boolean argument. If true, elevated CPU scheduling
priorities and policies will be reset when the executed processes
fork, and can hence not leak into child processes. See
sched_setscheduler(2) for details. Defaults to false.
CPUAffinity=
Controls the CPU affinity of the executed processes. Takes a
space-separated list of CPU indices. This option may be specified
more than once in which case the specified CPU affinity masks are
merged. If the empty string is assigned, the mask is reset and all
assignments prior to this will have no effect. See
sched_setaffinity(2) for details.
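Example (a sketch for a real-time service; the binary path and the
chosen priority and CPU indices are assumptions for illustration):

```ini
[Service]
# Hypothetical real-time workload.
ExecStart=/usr/bin/example-rt
# Real-time FIFO policy with a mid-range priority (1..99).
CPUSchedulingPolicy=fifo
CPUSchedulingPriority=50
# Do not let the elevated policy leak into forked children.
CPUSchedulingResetOnFork=true
# The two masks are merged: the process may run on CPUs 0, 1 and 2.
CPUAffinity=0 1
CPUAffinity=2
```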
UMask=
Controls the file mode creation mask. Takes an access mode in octal
notation. See umask(2) for details. Defaults to 0022.
Environment=
Sets environment variables for executed processes. Takes a
space-separated list of variable assignments. This option may be
specified more than once in which case all listed variables will be
set. If the same variable is set twice, the later setting will
override the earlier setting. If the empty string is assigned to
this option, the list of environment variables is reset and all prior
assignments have no effect. Variable expansion is not performed
inside the strings; however, specifier expansion is possible. The $
character has no special meaning. If you need to assign a value
containing spaces to a variable, use double quotes (") for the
assignment.
Example:
Environment="VAR1=word1 word2" VAR2=word3 "VAR3=$word 5 6"
gives three variables "VAR1", "VAR2", "VAR3" with the values "word1
word2", "word3", "$word 5 6".
See environ(7) for details about environment variables.
EnvironmentFile=
Similar to Environment= but reads the environment variables from a
text file. The text file should contain new-line-separated variable
assignments. Empty lines and lines starting with ; or # will be
ignored, which may be used for commenting. A line ending with a
backslash will be concatenated with the following one, allowing
multiline variable definitions. The parser strips leading and
trailing whitespace from the values of assignments, unless you use
double quotes (").
The argument passed should be an absolute filename or wildcard
expression, optionally prefixed with "-", which indicates that if
the file does not exist, it will not be read and no error or
warning message is logged. This option may be specified more than
once in which case all specified files are read. If the empty
string is assigned to this option, the list of files to read is
reset and all prior assignments have no effect.
The files listed with this directive will be read shortly before
the process is executed (more specifically, after all processes
from a previous unit state have terminated; this means you can
generate these files in one unit state and read them with this
option in the next). Settings from these files override settings
made with Environment=. If the same variable is set twice from
these files, the files will be read in the order they are specified
and the later setting will override the earlier setting.
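Example (a sketch of how Environment= and EnvironmentFile= interact;
the file /etc/example/env, its contents and the binary path are
hypothetical):

```ini
# Assumed contents of the hypothetical file /etc/example/env:
#
#   # lines starting with # or ; are ignored
#   MODE=production
#   GREETING="hello world"
#
[Service]
# Fallback value, used only if the file does not set MODE.
Environment=MODE=default
# The leading "-" suppresses the error if the file does not exist.
EnvironmentFile=-/etc/example/env
ExecStart=/usr/bin/example-daemon
```

With the file present, the process sees MODE=production, because
settings from the file override those made with Environment=. Without
the file, it sees MODE=default.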
StandardInput=
Controls where file descriptor 0 (STDIN) of the executed processes
is connected to. Takes one of null, tty, tty-force, tty-fail or
socket. If null is selected, standard input will be connected to
/dev/null, i.e. all read attempts by the process will result in
immediate EOF. If tty is selected, standard input is connected to a
TTY (as configured by TTYPath=, see below) and the executed process
becomes the controlling process of the terminal. If the terminal is
already being controlled by another process, the executed process
waits until the current controlling process releases the terminal.
tty-force is similar to tty, but the executed process is forcefully
and immediately made the controlling process of the terminal,
potentially removing previous controlling processes from the
terminal. tty-fail is similar to tty, but if the terminal already
has a controlling process, start-up of the executed process fails.
The socket option is only valid in socket-activated services, and
only when the socket configuration file (see systemd.socket(5) for
details) specifies a single socket only. If this option is set,
standard input will be connected to the socket the service was
activated from, which is primarily useful for compatibility with
daemons designed for use with the traditional inetd(8) daemon. This
setting defaults to null.
StandardOutput=
Controls where file descriptor 1 (STDOUT) of the executed processes
is connected to. Takes one of inherit, null, tty, syslog, kmsg,
journal, syslog+console, kmsg+console, journal+console or socket.
If set to inherit, the file descriptor of standard input is
duplicated for standard output. If set to null, standard output
will be connected to /dev/null, i.e. everything written to it will
be lost. If set to tty, standard output will be connected to a tty
(as configured via TTYPath=, see below). If the TTY is used for
output only, the executed process will not become the controlling
process of the terminal, and will not fail or wait for other
processes to release the terminal. syslog connects standard output
to the syslog(3) system syslog service. kmsg connects it with the
kernel log buffer which is accessible via dmesg(1). journal
connects it with the journal which is accessible via journalctl(1)
(note that everything that is written to syslog or kmsg is
implicitly stored in the journal as well; those options are hence
supersets of this one). syslog+console, journal+console and
kmsg+console work similarly but copy the output to the system
console as well. socket connects standard output to a socket from
socket activation; the semantics are similar to the respective option
of StandardInput=. This setting defaults to the value set with
DefaultStandardOutput= in systemd-system.conf(5), which defaults to
journal.
StandardError=
Controls where file descriptor 2 (STDERR) of the executed processes
is connected to. The available options are identical to those of
StandardOutput=, with one exception: if set to inherit the file
descriptor used for standard output is duplicated for standard
error. This setting defaults to the value set with
DefaultStandardError= in systemd-system.conf(5), which defaults to
inherit.
TTYPath=
Sets the terminal device node to use if standard input, output, or
error are connected to a TTY (see above). Defaults to /dev/console.
TTYReset=
Reset the terminal device specified with TTYPath= before and after
execution. Defaults to "no".
TTYVHangup=
Disconnect all clients which have opened the terminal device
specified with TTYPath= before and after execution. Defaults to
"no".
TTYVTDisallocate=
If the terminal device specified with TTYPath= is a virtual console
terminal, try to deallocate the TTY before and after execution.
This ensures that the screen and scrollback buffer is cleared.
Defaults to "no".
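Example (a sketch of a service bound to a virtual console; the binary
path and the choice of /dev/tty9 are assumptions for illustration):

```ini
[Service]
# Hypothetical interactive program that owns a virtual console.
ExecStart=/usr/bin/example-console-app
# Become the controlling process of the terminal named in TTYPath=.
StandardInput=tty
StandardOutput=tty
TTYPath=/dev/tty9
# Reset, hang up and clear the terminal before and after execution.
TTYReset=yes
TTYVHangup=yes
TTYVTDisallocate=yes
```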
SyslogIdentifier=
Sets the process name to prefix log lines sent to syslog or the
kernel log buffer with. If not set, defaults to the process name of
the executed process. This option is only useful when
StandardOutput= or StandardError= are set to syslog or kmsg.
SyslogFacility=
Sets the syslog facility to use when logging to syslog. One of
kern, user, mail, daemon, auth, syslog, lpr, news, uucp, cron,
authpriv, ftp, local0, local1, local2, local3, local4, local5,
local6 or local7. See syslog(3) for details. This option is only
useful when StandardOutput= or StandardError= are set to syslog.
Defaults to daemon.
SyslogLevel=
Default syslog level to use when logging to syslog or the kernel
log buffer. One of emerg, alert, crit, err, warning, notice, info,
debug. See syslog(3) for details. This option is only useful when
StandardOutput= or StandardError= are set to syslog or kmsg. Note
that individual lines output by the daemon might be prefixed with a
different log level which can be used to override the default log
level specified here. The interpretation of these prefixes may be
disabled with SyslogLevelPrefix=, see below. For details see
sd-daemon(3). Defaults to info.
SyslogLevelPrefix=
Takes a boolean argument. If true and StandardOutput= or
StandardError= are set to syslog, kmsg or journal, log lines
written by the executed process that are prefixed with a log level
will be passed on to syslog with this log level set but the prefix
removed. If set to false, the interpretation of these prefixes is
disabled and the logged lines are passed on as-is. For details
about this prefixing see sd-daemon(3). Defaults to true.
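Example (a sketch that routes output to syslog; the binary path, the
identifier and the facility are hypothetical choices):

```ini
[Service]
ExecStart=/usr/bin/example-daemon
# Send stdout to syslog; stderr follows stdout via inherit.
StandardOutput=syslog
StandardError=inherit
# Prefix log lines with "example" instead of the process name.
SyslogIdentifier=example
SyslogFacility=local0
# Default level for lines that carry no level prefix of their own.
SyslogLevel=notice
SyslogLevelPrefix=true
```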
TimerSlackNSec=
Sets the timer slack in nanoseconds for the executed processes. The
timer slack controls the accuracy of wake-ups triggered by timers.
See prctl(2) for more information. Note that in contrast to most
other time span definitions this parameter takes an integer value
in nano-seconds if no unit is specified. The usual time units are
understood too.
LimitCPU=, LimitFSIZE=, LimitDATA=, LimitSTACK=, LimitCORE=, LimitRSS=,
LimitNOFILE=, LimitAS=, LimitNPROC=, LimitMEMLOCK=, LimitLOCKS=,
LimitSIGPENDING=, LimitMSGQUEUE=, LimitNICE=, LimitRTPRIO=,
LimitRTTIME=
These settings control various resource limits for executed
processes. See setrlimit(2) for details. Use the string infinity to
configure no limit on a specific resource.
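Example (a sketch; the binary path is hypothetical and the numeric
limits are illustrative, not recommendations):

```ini
[Service]
ExecStart=/usr/bin/example-daemon
# Allow up to 16384 open file descriptors.
LimitNOFILE=16384
# Do not limit the size of core dumps.
LimitCORE=infinity
# Cap the number of processes, see RLIMIT_NPROC in setrlimit(2).
LimitNPROC=64
```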
PAMName=
Sets the PAM service name to set up a session as. If set, the
executed process will be registered as a PAM session under the
specified service name. This is only useful in conjunction with the
User= setting. If not set, no PAM session will be opened for the
executed processes. See pam(8) for details.
CapabilityBoundingSet=
Controls which capabilities to include in the capability bounding
set for the executed process. See capabilities(7) for details.
Takes a whitespace-separated list of capability names as read by
cap_from_name(3), e.g. CAP_SYS_ADMIN, CAP_DAC_OVERRIDE,
CAP_SYS_PTRACE. Capabilities listed will be included in the
bounding set, all others are removed. If the list of capabilities
is prefixed with "~", all but the listed capabilities will be
included, i.e. the effect of the assignment is inverted. Note that this
option also affects the respective capabilities in the effective,
permitted and inheritable capability sets, on top of what
Capabilities= does. If this option is not used, the capability
bounding set is not modified on process execution, hence no limits
on the capabilities of the process are enforced. This option may
appear more than once in which case the bounding sets are merged.
If the empty string is assigned to this option, the bounding set is
reset to the empty capability set, and all prior settings have no
effect. If set to "~" (without any further argument), the bounding
set is reset to the full set of available capabilities, also
undoing any previous settings.
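Example (a sketch; the binary path and the selection of capabilities
are assumptions chosen for illustration):

```ini
[Service]
ExecStart=/usr/bin/example-daemon
# Keep only these capabilities in the bounding set; all others are
# removed.
CapabilityBoundingSet=CAP_NET_BIND_SERVICE CAP_SETUID CAP_SETGID
#
# Alternative, inverted form: keep everything except the listed
# capabilities. Shown commented out so the two forms are not mixed.
#CapabilityBoundingSet=~CAP_SYS_ADMIN CAP_SYS_PTRACE
```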
SecureBits=
Controls the secure bits set for the executed process. See
capabilities(7) for details. Takes a list of strings: keep-caps,
keep-caps-locked, no-setuid-fixup, no-setuid-fixup-locked, noroot
and/or noroot-locked. This option may appear more than once in
which case the secure bits are ORed. If the empty string is
assigned to this option, the bits are reset to 0.
Capabilities=
Controls the capabilities(7) set for the executed process. Takes a
capability string describing the effective, permitted and inheritable
capability sets as documented in cap_from_text(3). Note that these
capability sets are usually influenced by the capabilities attached
to the executed file. Due to that CapabilityBoundingSet= is
probably the much more useful setting.
ReadWriteDirectories=, ReadOnlyDirectories=, InaccessibleDirectories=
Sets up a new file system namespace for executed processes. These
options may be used to limit access a process might have to the
main file system hierarchy. Each setting takes a space-separated
list of absolute directory paths. Directories listed in
ReadWriteDirectories= are accessible from within the namespace with
the same access rights as from outside. Directories listed in
ReadOnlyDirectories= are accessible for reading only, writing will
be refused even if the usual file access controls would permit
this. Directories listed in InaccessibleDirectories= will be made
inaccessible for processes inside the namespace. Note that
restricting access with these options does not extend to submounts
of a directory. You must list submounts separately in these
settings to ensure the same limited access. These options may be
specified more than once in which case all directories listed will
have limited access from within the namespace. If the empty string
is assigned to this option, the specific list is reset, and all
prior assignments have no effect.
Paths in ReadOnlyDirectories= and InaccessibleDirectories= may be
prefixed with "-", in which case they will be ignored when they do
not exist. Note that using this setting will disconnect propagation
of mounts from the service to the host (propagation in the opposite
direction continues to work). This means that this setting may not
be used for services which shall be able to install mount points in
the main mount namespace.
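Example (a sketch; the binary path and all directory choices are
hypothetical):

```ini
[Service]
ExecStart=/usr/bin/example-daemon
# Configuration and system binaries can be read but not modified.
ReadOnlyDirectories=/etc /usr
# The service's own state directory keeps its normal access rights.
ReadWriteDirectories=/var/lib/example
# /home is hidden entirely; /srv/private is hidden too, but ignored
# if it does not exist because of the "-" prefix.
InaccessibleDirectories=/home -/srv/private
```

Submounts below the listed directories are not covered and would have
to be listed separately.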
PrivateTmp=
Takes a boolean argument. If true, sets up a new file system
namespace for the executed processes and mounts private /tmp and
/var/tmp directories inside it that are not shared by processes
outside of the namespace. This is useful to secure access to
temporary files of the process, but makes sharing between processes
via /tmp or /var/tmp impossible. If this is enabled all temporary
files created by a service in these directories will be removed
after the service is stopped. Defaults to false. It is possible to
run two or more units within the same private /tmp and /var/tmp
namespace by using the JoinsNamespaceOf= directive, see
systemd.unit(5) for details. Note that using this setting will
disconnect propagation of mounts from the service to the host
(propagation in the opposite direction continues to work). This
means that this setting may not be used for services which shall be
able to install mount points in the main mount namespace.
PrivateDevices=
Takes a boolean argument. If true, sets up a new /dev namespace for
the executed processes and only adds API pseudo devices such as
/dev/null, /dev/zero or /dev/random (as well as the pseudo TTY
subsystem) to it, but no physical devices such as /dev/sda. This is
useful to securely turn off physical device access by the executed
process. Defaults to false. Enabling this option will also remove
CAP_MKNOD from the capability bounding set for the unit (see
above), and set DevicePolicy=closed (see
systemd.resource-control(5) for details). Note that using this
setting will
disconnect propagation of mounts from the service to the host
(propagation in the opposite direction continues to work). This
means that this setting may not be used for services which shall be
able to install mount points in the main mount namespace.
PrivateNetwork=
Takes a boolean argument. If true, sets up a new network namespace
for the executed processes and configures only the loopback network
device "lo" inside it. No other network devices will be available
to the executed process. This is useful to securely turn off
network access by the executed process. Defaults to false. It is
possible to run two or more units within the same private network
namespace by using the JoinsNamespaceOf= directive, see
systemd.unit(5) for details. Note that this option will disconnect
all socket families from the host, this includes AF_NETLINK and
AF_UNIX. The latter has the effect that AF_UNIX sockets in the
abstract socket namespace will become unavailable to the processes
(however, those located in the file system will continue to be
accessible).
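Example (a sketch of an isolated worker that needs neither physical
devices nor network access; the binary path is hypothetical):

```ini
[Service]
ExecStart=/usr/bin/example-worker
# Private /tmp and /var/tmp, removed when the service stops.
PrivateTmp=true
# Only API pseudo devices such as /dev/null are visible.
PrivateDevices=true
# Only the loopback device "lo" is available.
PrivateNetwork=true
```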
MountFlags=
Takes a mount propagation flag: shared, slave or private, which
control whether mounts in the file system namespace set up for this
unit's processes will receive or propagate mounts or unmounts. See
mount(2) for details. Defaults to shared. Use shared to ensure that
mounts and unmounts are propagated from the host to the container
and vice versa. Use slave to run processes so that none of their
mounts and unmounts will propagate to the host. Use private to also
ensure that no mounts and unmounts from the host will propagate
into the unit processes' namespace. Note that slave means that file
systems mounted on the host might stay mounted continuously in the
unit's namespace, and thus keep the device busy. Note that the file
system namespace related options (PrivateTmp=, PrivateDevices=,
ReadOnlyDirectories=, InaccessibleDirectories= and
ReadWriteDirectories=) require that mount and unmount propagation
from the unit's file system namespace is disabled, and hence
downgrade shared to slave.
UtmpIdentifier=
Takes a four character identifier string for a utmp/wtmp entry for
this service. This should only be set for services such as getty
implementations where utmp/wtmp entries must be created and cleared
before and after execution. If the configured string is longer than
four characters, it is truncated and the terminal four characters
are used. This setting interprets %I style string replacements.
This setting is unset by default, i.e. no utmp/wtmp entries are
created or cleaned up for this service.
SELinuxContext=
Set the SELinux security context of the executed process. If set,
this will override the automated domain transition. However, the
policy still needs to authorize the transition. This directive is
ignored if SELinux is disabled. If prefixed by "-", all errors will
be ignored. See setexeccon(3) for details.
AppArmorProfile=
Takes a profile name as argument. The process executed by the unit
will switch to this profile when started. Profiles must already be
loaded in the kernel, or the unit will fail. This is a no-op if
AppArmor is not enabled. If prefixed by "-", all errors will be
ignored.
IgnoreSIGPIPE=
Takes a boolean argument. If true, causes SIGPIPE to be ignored in
the executed process. Defaults to true because SIGPIPE generally is
useful only in shell pipelines.
NoNewPrivileges=
Takes a boolean argument. If true, ensures that the service process
and all its children can never gain new privileges. This option is
more powerful than the respective secure bits flags (see above), as
it also prohibits UID changes of any kind. This is the simplest,
most effective way to ensure that a process and its children can
never elevate privileges again.
SystemCallFilter=
Takes a space-separated list of system call names. If this setting
is used, all system calls executed by the unit processes except for
the listed ones will result in immediate process termination with
the SIGSYS signal (whitelisting). If the first character of the
list is "~", the effect is inverted: only the listed system calls
will result in immediate process termination (blacklisting). If
running in user mode and this option is used, NoNewPrivileges=yes
is implied. This feature makes use of the Secure Computing Mode 2
interfaces of the kernel ('seccomp filtering') and is useful for
enforcing a minimal sandboxing environment. Note that the execve,
rt_sigreturn, sigreturn, exit_group, exit system calls are
implicitly whitelisted and do not need to be listed explicitly.
This option may be specified more than once in which case the
filter masks are merged. If the empty string is assigned, the
filter is reset and all prior assignments will have no effect.
If you specify both types of this option (i.e. whitelisting and
blacklisting), the first encountered will take precedence and will
dictate the default action (termination or approval of a system
call). Then the next occurrences of this option will add or delete
the listed system calls from the set of the filtered system calls,
depending on its type and the default action. (For example, if you
have started with a whitelisting of read and write, and right after
it add a blacklisting of write, then write will be removed from the
set.)
SystemCallErrorNumber=
Takes an "errno" error number name to return when the system call
filter configured with SystemCallFilter= is triggered, instead of
terminating the process immediately. Takes an error name such as
EPERM, EACCES or EUCLEAN. When this setting is not used, or when
the empty string is assigned, the process will be terminated
immediately when the filter is triggered.
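Example (a sketch of a blacklist; the binary path is hypothetical and
the selection of system calls is an assumption for illustration):

```ini
[Service]
ExecStart=/usr/bin/example-daemon
# The leading "~" inverts the list: only these calls are filtered.
SystemCallFilter=~ptrace mount umount2
# Return EPERM to the caller instead of terminating the process.
SystemCallErrorNumber=EPERM
# Stated explicitly here; implied anyway when running in user mode.
NoNewPrivileges=true
```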
SystemCallArchitectures=
Takes a space-separated list of architecture identifiers to include
in the system call filter. The known architecture identifiers are
x86, x86-64, x32, arm as well as the special identifier native.
Only system calls of the specified architectures will be permitted
to processes of this unit. This is an effective way to disable
compatibility with non-native architectures for processes, for
example to prohibit execution of 32-bit x86 binaries on 64-bit
x86-64 systems. The special native identifier implicitly maps to
the native architecture of the system (or more strictly: to the
architecture the system manager is compiled for). If running in
user mode and this option is used, NoNewPrivileges=yes is implied.
Note that setting this option to a non-empty list implies that
native is included too. By default, this option is set to the empty
list, i.e. no architecture system call filtering is applied.
RestrictAddressFamilies=
Restricts the set of socket address families accessible to the
processes of this unit. Takes a space-separated list of address
family names to whitelist, such as AF_UNIX, AF_INET or AF_INET6.
When prefixed with ~ the listed address families will be applied as
blacklist, otherwise as whitelist. Note that this restricts access
to the socket(2) system call only. Sockets passed into the process
by other means (for example, by using socket activation with socket
units, see systemd.socket(5)) are unaffected. Also, sockets created
with socketpair() (which creates connected AF_UNIX sockets only)
are unaffected. Note that this option has no effect on 32-bit x86
and is ignored (but works correctly on x86-64). If running in user
mode and this option is used, NoNewPrivileges=yes is implied. By
default no restriction applies, i.e. all address families are
accessible to processes. If assigned the empty string, any previous
list changes are undone.
Use this option to limit exposure of processes to remote systems,
in particular via exotic network protocols. Note that in most cases
the local AF_UNIX address family should be included in the
configured whitelist as it is frequently used for local
communication, including for syslog(2) logging.
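Example (a sketch for a typical IP network service; the binary path is
hypothetical):

```ini
[Service]
ExecStart=/usr/bin/example-daemon
# Whitelist local sockets plus IPv4 and IPv6. AF_UNIX is kept
# because it is commonly needed for local communication and logging.
RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6
# Permit only system calls of the native architecture.
SystemCallArchitectures=native
```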
Personality=
Controls which kernel architecture uname(2) shall report, when
invoked by unit processes. Takes one of x86 and x86-64. This is
useful when running 32-bit services on a 64-bit host system. If not
specified the personality is left unmodified and thus reflects the
personality of the host system's kernel.
RuntimeDirectory=, RuntimeDirectoryMode=
Takes a list of directory names. If set, one or more directories by
the specified names will be created below /run (for system
services) or below $XDG_RUNTIME_DIR (for user services) when the
unit is started and removed when the unit is stopped. The
directories will have the access mode specified in
RuntimeDirectoryMode=, and will be owned by the user and group
specified in User= and Group=. Use this to manage one or more
runtime directories of the unit and bind their lifetime to the
daemon runtime. The specified directory names must be relative, and
may not include a "/", i.e. must refer to simple directories to
create or remove. This is particularly useful for unprivileged
daemons that cannot create runtime directories in /run due to lack
of privileges, and to make sure the runtime directory is cleaned up
automatically after use. For runtime directories that require more
complex or different configuration or lifetime guarantees, please
consider using tmpfiles.d(5).
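Example (a sketch for a system service; the binary, its --pidfile
flag and the user and group names are hypothetical):

```ini
[Service]
# The daemon is assumed to accept a --pidfile option.
ExecStart=/usr/bin/example-daemon --pidfile /run/example/example.pid
User=example
Group=example
# Creates /run/example on start, owned by example:example, and
# removes it again when the unit is stopped.
RuntimeDirectory=example
RuntimeDirectoryMode=0750
```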
ENVIRONMENT VARIABLES IN SPAWNED PROCESSES
Processes started by the system are executed in a clean environment in
which select variables listed below are set. System processes started
by systemd do not inherit variables from PID 1, but processes started
by user systemd instances inherit all environment variables from the
user systemd instance.
$PATH
Colon-separated list of directories to use when launching
executables. Systemd uses a fixed value of
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin.
$LANG
Locale. Can be set in locale.conf(5) or on the kernel command line
(see systemd(1) and kernel-command-line(7)).
$USER, $LOGNAME, $HOME, $SHELL
User name (twice), home directory, and the login shell. The
variables are set for the units that have User= set, which includes
user systemd instances. See passwd(5).
$XDG_RUNTIME_DIR
The directory for volatile state. Set for the user systemd
instance, and also in user sessions. See pam_systemd(8).
$XDG_SESSION_ID, $XDG_SEAT, $XDG_VTNR
The identifier of the session, the seat name, and virtual terminal
of the session. Set by pam_systemd(8) for login sessions.
$XDG_SEAT and $XDG_VTNR will only be set when attached to a seat
and a tty.
$MAINPID
The PID of the unit's main process if it is known. This is only set
for control processes as invoked by ExecReload= and similar.
$MANAGERPID
The PID of the user systemd instance, set for processes spawned by
it.
$LISTEN_FDS, $LISTEN_PID
Information about file descriptors passed to a service for socket
activation. See sd_listen_fds(3).
$TERM
Terminal type, set only for units connected to a terminal
(StandardInput=tty, StandardOutput=tty, or StandardError=tty). See
termcap(5).
Additional variables may be configured by the following means: for
processes spawned in specific units, use the Environment= and
EnvironmentFile= options above; to specify variables globally, use
DefaultEnvironment= (see systemd-system.conf(5)) or the kernel option
systemd.setenv= (see systemd(1)). Additional variables may also be set
through PAM, cf. pam_env(8).
SEE ALSO
systemd(1), systemctl(1), journalctl(1), systemd.unit(5),
systemd.service(5), systemd.socket(5), systemd.swap(5),
systemd.mount(5), systemd.kill(5), systemd.resource-control(5),
systemd.directives(7), tmpfiles.d(5), exec(3)
NOTES
1. proc.txt
https://www.kernel.org/doc/Documentation/filesystems/proc.txt
systemd 212 SYSTEMD.EXEC(5)