What does sudo do? Most people have a basic understanding: it lets you temporarily run a command as root (or some other user, based on the options passed in).

People often describe sudo as better for security than logging in as root, because you scope your use of the root account to the commands that really need it. And with the default sudo setup you aren't handing out the root password; instead, you give certain users permission to use sudo for privilege escalation.

One might think that sudo is actually some binary deeply integrated into the kernel, relying on a special purpose-built system call to achieve its functionality. After all, it lets you use root without even providing the password for that account! But thanks to one bit inside file permissions, sudo can exist without any of this.


Let's write our own version of sudo to prove this point. First, let's make a program that serves as our test bed:

#!/usr/bin/env python3
# say_hello.py
import sys
import os
import pwd

user_id = os.getuid()
# the username for a user ID is looked up in the password database
username = pwd.getpwuid(user_id).pw_name

print(f"Called with: {sys.argv}")
print(f"Hello, {username}")

(Aside: file permissions store the user ID, not the username. So taking a hard drive full of files and sticking it into another machine might leave a lot of those files inaccessible to "your user" on that machine!)
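
To see the numeric ID for yourself, here is a minimal sketch. The helper name describe_owner is my own, not part of any library; it reads the owner of a file with os.stat and only afterwards tries to translate the number into a name:

```python
import os
import pwd


def describe_owner(path):
    """Return (uid, name) for the owner of path; name is None if unmapped."""
    uid = os.stat(path).st_uid  # the file system stores only this number
    try:
        name = pwd.getpwuid(uid).pw_name  # looked up in the local user database
    except KeyError:
        name = None  # no user on this machine has that ID
    return uid, name


if __name__ == "__main__":
    print(describe_owner("/"))
```

On a machine where nobody has that user ID, the name lookup fails while the number is still there, which is the situation the aside describes.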

Let's sanity check that this program at least does what we think it does:

>>> chmod +x say_hello.py
>>> ./say_hello.py
Called with: ['./say_hello.py']
Hello, rtpg
>>> sudo ./say_hello.py
Called with: ['./say_hello.py']
Hello, root

This looks to be working correctly!


Now we will work on building our own sudo variant.

Let's first just make a program that will run whatever is provided to it, using exec:

#!/usr/bin/env python3
# mysudo.py
import os
import sys


if sys.argv[0].split("/")[-1] != "mysudo.py":
    raise ValueError("This program must be called directly")

if len(sys.argv) < 2:
    raise ValueError("Missing program to call")

# call sys.argv[1], and pass the arguments to the program
# (you need to pass in argv[1] in the argument list
#  as well, it is not prepended)
os.execvp(sys.argv[1], sys.argv[1:])

This is a good enough approximation of what we want.

execvp is part of the exec family of Unix functions. Unlike spinning up a new subprocess or starting a thread, exec lets you replace the existing program running in a process with a new one.

In practice this means that you run the indicated program while keeping the existing process ID, user ID, and so on. There are a loooooot of asterisks involved here (check out man execve for the details). The general idea is that the program you were running is replaced with another one, but with many of the OS-level properties of the process (namely, permissions) retained. Except for the properties that don't stick around, of course!
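
A small experiment makes the "same process, new program" idea concrete. This is a sketch under my own assumptions: the function name is made up, and it forks first only so that the calling script survives the exec. The child replaces itself with a fresh Python interpreter that prints its own process ID, and the parent compares that with the ID it got back from fork:

```python
import os
import sys


def pid_before_and_after_exec():
    """Fork a child, have it exec a new interpreter, and compare PIDs."""
    read_fd, write_fd = os.pipe()
    child_pid = os.fork()
    if child_pid == 0:
        try:
            os.close(read_fd)
            os.dup2(write_fd, 1)  # the new program's stdout goes into the pipe
            # Replace this child with a fresh interpreter that prints its PID.
            os.execv(sys.executable,
                     [sys.executable, "-c", "import os; print(os.getpid())"])
        finally:
            os._exit(127)  # only reached if the exec failed
    os.close(write_fd)
    with os.fdopen(read_fd) as pipe:
        pid_seen_after_exec = int(pipe.read())
    os.waitpid(child_pid, 0)
    return child_pid, pid_seen_after_exec


if __name__ == "__main__":
    print(pid_before_and_after_exec())
```

The two numbers should match: the program changed, the process did not.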

Running this on our test program will give:

>>> ./mysudo.py ./say_hello.py foo bar
Called with: ['./say_hello.py', 'foo', 'bar']
Hello, rtpg

Now for the part we actually want to do: privilege escalation.

We had os.getuid. There is os.setuid. But of course just adding os.setuid(0) gets us:

Traceback (most recent call last):
  File ".../mysudo.py", line 14, in <module>
    os.setuid(0)
PermissionError: [Errno 1] Operation not permitted

It feels obvious that you can't just give yourself more privileges. But in that case... what would setuid even exist for?


There are three important user IDs:

  • the real user ID
  • the effective user ID
  • the saved set-user-ID
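
On Linux you can look at all three from Python. A minimal sketch, assuming os.getresuid is available (it is on Linux, but not on every Unix):

```python
import os

# os.getresuid() returns the real, effective and saved set-user-ID, in that order.
real_uid, effective_uid, saved_uid = os.getresuid()
print(f"real={real_uid} effective={effective_uid} saved={saved_uid}")
```

For an ordinary program started from a shell, all three are normally the same number.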

The real user ID is meant to track the "real" user, in essence who is logged into the machine.

The effective user ID is used to determine file permissions. This is what allows a process run by root to run "as another user" (for example when using process coordination programs like Supervisor), and for files written by those processes to have the expected permissions and owners.

With just the real user ID and the effective user ID, it feels like we can do a lot! If you have the root effective user ID, calling setuid(user) will set the process' real user ID, effective user ID, and saved set-user-ID to this new ID.

setuid lets root "become" any user, but a non-privileged user only has two choices: setuid(real_user_id) or setuid(saved_set_user_id), and either one changes only the effective user ID.
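
These rules can be poked at safely from a throwaway child process. The helper below is a sketch of my own (setuid_allowed is not a standard function); it forks so that the calling process keeps its IDs whatever happens:

```python
import os


def setuid_allowed(target_uid):
    """Attempt os.setuid(target_uid) in a throwaway child process.

    Returns True if the call succeeded. Forking first keeps the caller's
    own user IDs untouched whatever the outcome.
    """
    child_pid = os.fork()
    if child_pid == 0:
        exit_code = 1
        try:
            os.setuid(target_uid)
            exit_code = 0
        except OSError:
            pass  # typically PermissionError (EPERM)
        finally:
            os._exit(exit_code)
    _, status = os.waitpid(child_pid, 0)
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0


if __name__ == "__main__":
    print("setuid(own uid):", setuid_allowed(os.getuid()))
    print("setuid(0):", setuid_allowed(0))
```

Run as an ordinary user, I would expect the first call to be allowed and the second to be refused.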

So you can write programs like:

data = read_files_as_root()
os.setuid(unprivileged_user_id)
write_files_as_unprivileged_user(data)

and have your expected results. The problem, though, is that this is effectively a one-shot operation. After the setuid call, you no longer have the permissions to return to your root privileges. So something like:

for uid in all_users:
    data = fetch_user_data_as_root(uid)
    os.setuid(uid)
    write_user_data(data)
    os.setuid(0) # try to go back to root

becomes impossible, as the second setuid call will fail. You could fork off separate processes before your setuid call, but it's not fun.
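
For completeness, the fork approach might look something like this. It is a sketch under my own assumptions: run_as is a made-up helper, and a real tool would also have to deal with group IDs and supplementary groups, which I'm leaving out here:

```python
import os


def run_as(uid, task):
    """Run task() in a child that has permanently switched to uid.

    The parent never calls setuid, so it keeps whatever privileges it had.
    Returns True if the child switched users and task() finished cleanly.
    """
    child_pid = os.fork()
    if child_pid == 0:
        exit_code = 1
        try:
            os.setuid(uid)  # one-way, but only for this child
            task()
            exit_code = 0
        finally:
            os._exit(exit_code)  # never fall back into the parent's code
    _, status = os.waitpid(child_pid, 0)
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
```

The loop above would then call run_as(uid, ...) once per user instead of calling os.setuid in the parent. It works, but every write now costs a process.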

This is where seteuid comes in. seteuid only changes the effective user ID, leaving the real user ID and the saved set-user-ID untouched. This means that we can hold onto the fact that we are root, change to an unprivileged user, and then call seteuid(real_user_id) to become root again.

data = fetch_user_data_as_root(uid)
os.seteuid(uid)  # file permissions now with uid
write_user_data(data)
# real user ID still 0, so following is allowed
os.seteuid(0) # real user ID
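
One way to keep this toggling tidy is a context manager that always switches back. This is a sketch, and effective_user is my own name for it, not something from the standard library:

```python
import os
from contextlib import contextmanager


@contextmanager
def effective_user(uid):
    """Temporarily switch the effective user ID, restoring it on exit."""
    previous_euid = os.geteuid()
    os.seteuid(uid)
    try:
        yield
    finally:
        # Permitted as long as the previous ID is still the real
        # or saved user ID of this process.
        os.seteuid(previous_euid)
```

Assuming the process started as root, the write step becomes a `with effective_user(uid):` block, and the switch back to root happens even if the write raises.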

We can thus see that there's a lot of flexibility for programs that are being run as root. But sudo is started by users who are not root! Our real user ID is not 0, so we can't do this toggling back and forth.

This is where the saved set-user-ID comes into play.

When calling exec on a program (for example by running a program in your shell), by default all three user IDs are set as:

  • real user ID: unchanged
  • effective user ID: unchanged
  • saved set-user-ID: copy the current effective user ID

But, if your program has the setuid flag set on the program file, instead your program will load in the following way:

  • real user ID: unchanged
  • effective user ID: the user ID of the program owner (the file owner)
  • saved set-user-ID: copied from the new effective user ID (so also the file owner)
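
The two cases can be written down as a tiny model. To be clear, this is just my reading of the rules as described in man execve, not kernel code, and it ignores the exceptions (such as file systems mounted with nosuid). The names ProcessIds and ids_after_exec are made up for the sketch:

```python
from typing import NamedTuple


class ProcessIds(NamedTuple):
    real: int
    effective: int
    saved: int


def ids_after_exec(before, file_owner, setuid_bit):
    """Model how exec derives the new user IDs from the old ones."""
    # The real user ID never changes across exec.
    real = before.real
    # Only a set-user-ID file replaces the effective user ID.
    effective = file_owner if setuid_bit else before.effective
    # The saved set-user-ID is copied from the resulting effective user ID.
    saved = effective
    return ProcessIds(real, effective, saved)
```

For user 1000 running a root-owned file with the flag set, ids_after_exec(ProcessIds(1000, 1000, 1000), file_owner=0, setuid_bit=True) gives ProcessIds(real=1000, effective=0, saved=0).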

What this means is that if a program file is owned by root and has the setuid flag set (through chmod u+s, or chmod +s, which also sets the setgid flag), then exec-ing the program runs it with the permissions of the file owner, in this case root.
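
The flag is just one more bit in the file mode, so it can be inspected and set from Python too. A minimal sketch with two helper names of my own choosing:

```python
import os
import stat


def has_setuid_bit(path):
    """Return True if the set-user-ID bit is set in the file's mode."""
    return bool(os.stat(path).st_mode & stat.S_ISUID)


def set_setuid_bit(path):
    """Rough equivalent of chmod u+s: add the bit, keep the other mode bits."""
    current_mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, current_mode | stat.S_ISUID)
```

On a typical Linux install I would expect has_setuid_bit to return True for the sudo binary itself, though the path to it varies between systems.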

sudo is owned by root, and the binary has the setuid permission set, so when I run it as a non-privileged user, sudo has its effective user ID set to 0 and can do whatever it wants. But it will set up the right effective user ID before running the program the user asked for.


Because we have the real user ID and the effective one, I've edited my test program a bit to also display the effective user ID:

#!/usr/bin/env python3
import sys
import os
import pwd

user_id = os.getuid()
e_uid = os.geteuid()
# the username for a user ID is looked up in the password database
username = pwd.getpwuid(user_id).pw_name
effective_username = pwd.getpwuid(e_uid).pw_name

print(f"Called with: {sys.argv}")
print(f"Hello, {username}")
print(f"(Effective user is {effective_username})")

After setting the set-user-ID bit, I call this and...

>>> ./say_hello.py
Called with: ['./say_hello.py']
Hello, rtpg
(Effective user is rtpg)
>>> sudo ./say_hello.py
Called with: ['./say_hello.py']
Hello, root
(Effective user is root)

After a bit of searching around I figured out that the set-user-ID bit is ignored on Linux for interpreter scripts (programs that rely on the #!, called "shebang", to execute), and I'm on Ubuntu so....

Here is a port of my test program in C:

// say_hello.c
#include <sys/types.h>
#include <unistd.h>
#include <pwd.h>
#include <stdio.h>

// print the username for uid, or the raw number if no user has that ID
void print_username(uid_t uid){
    struct passwd* p = getpwuid(uid);
    if (p == NULL){
        // uid_t is unsigned, so cast it and print with %lu
        printf("(Unknown, user ID %lu)", (unsigned long)uid);
    } else {
        printf("%s", p->pw_name);
    }
}

int main(int argc, char** argv){
    uid_t uid = getuid();
    uid_t euid = geteuid();

    printf("Real user: "); print_username(uid);
    printf("\nEffective user: "); print_username(euid);
    printf("\n");
    return 0;
}

Then I compiled and ran this, to at least confirm I really understood how the setuid bit worked:

>>> gcc -o say_hello say_hello.c
>>> chmod +s say_hello
>>> ./say_hello
Real user: rtpg
Effective user: rtpg
>>> sudo ./say_hello
[sudo] password for rtpg:
Real user: root
Effective user: rtpg

Even when running with sudo, because the binary has the setuid bit, our effective user is set to the file owner.


So in order to write our own sudo with this bit, I opted for just writing some C instead of Python:

// mysudo.c
#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>


int main(int argc, char** argv){
    if(argc<2){
        fprintf(stderr, "Need to be provided arguments!\n");
        exit(EXIT_FAILURE);
    }
    uid_t euid = geteuid();
    if(euid != 0){
        fprintf(stderr, "Not root! Does mysudo have the setuid bit set?\n");
        exit(EXIT_FAILURE);
    }
    // our effective uid is already root,
    // let's also set the real user ID
    // (-1 leaves the effective user ID as it is)
    if(setreuid(0, -1) != 0){
        perror("setreuid");
        exit(EXIT_FAILURE);
    }

    // argc is at least 2...
    execvp(argv[1], argv + 1);
    // execvp only returns if it failed
    perror("execvp");
    return EXIT_FAILURE;
}

Then I tried it out:

>>> gcc -o mysudo mysudo.c
>>> sudo chown root:root ./mysudo
>>> sudo chmod +s mysudo
>>> ./mysudo ./say_hello.py foo bar
Called with: ['./say_hello.py', 'foo', 'bar']
Hello, root
(Effective user is root)

(I called my original say_hello.py script, whose setuid bit is not respected).

This was successful! An unprivileged user can run a program "as root" thanks to this binary.


Well, it seems like a bad idea to have a binary on our system that lets any unprivileged user get root access. This is where sudo brings in things like the sudoers file, which grants permission only to specific users, and perhaps only to do very specific things.

This is why people deal with sudo or doas rather than write their own 10-line scripts to do this, of course.

Let's add some really basic checks: only my user (user ID 1000, found by running id -u in a shell) should be allowed to use this:

    if(argc<2){
        fprintf(stderr, "Need to be provided arguments!\n");
        exit(EXIT_FAILURE);
    }
    // added extra security!
    uid_t uid = getuid();
    if(uid != 1000){
        fprintf(stderr, "Only user 1000 has these powers!\n");
        exit(EXIT_FAILURE);
    }

With this even root can't use our binary!

>>> sudo ./mysudo ./say_hello.py foo bar
Only user 1000 has these powers!
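
A single hard-coded user ID is about the smallest policy imaginable. One step up would be a table of who may run what. The sketch below is purely illustrative: it is not the sudoers format, and user 1001 and the command path are made up for the example.

```python
# Hypothetical policy: user ID -> commands that user may run as root.
# None means "any command"; a set lists the only commands allowed.
POLICY = {
    1000: None,
    1001: {"/usr/bin/systemctl"},
}


def is_permitted(uid, command, policy=POLICY):
    """Return True if uid may run command under the given policy."""
    if uid not in policy:
        return False  # unknown users get nothing
    allowed_commands = policy[uid]
    return allowed_commands is None or command in allowed_commands
```

Even this toy has sharp edges: comparing command strings only means something if the path has been resolved to the file that will actually be executed.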

There's a lot more stuff that sudo does (including auditing of commands through audit plugins, or password prompts with timestamped files to avoid having to type your password over and over). The codebase is actually relatively flat C, and looking through it helped me to understand a couple things about how it works.

But really the challenge is in trying to figure out all the ways in which things can be made insecure unintentionally. There are a lot of ways you might try to scope down the security features, but things like user ID checks are not the end-all, especially if your security model goes a bit beyond "root and everyone else".

Further Reading

For people interested in Unix stuff in general, "Advanced Programming in the UNIX Environment" is a must-have book. It might look old, but it covers more than enough ground to be a valuable read for any programmer. Ostensibly a reference book, you can basically read it front to back and find out a lot of little bits of information about how Unix works. It might feel costly for something that doesn't feel up to date (especially when we have man pages for everything covered), but the organization of the material is key.

For me it has been essential in cleaning up my mental model of what the OS is doing, and reduces bad surprises when I'm doing operational work.

The sudo codebase is some legible C, and kinda fun to scroll through. The main function is very straightforward, and you can jump to various files to see a lot of engineering to try and solve many different edge cases from the millions of sudo calls happening every hour.

This very old email thread discussing set-user-ID as a feature and its history is very revealing about how Unix today exists as a combination of a lot of pragmatism, much more than as the result of some perfect design from the outset.