This topic is well worth covering if you want deep knowledge of Linux. And even if you are not interested in Linux at all, some of the following concepts are universal in the world of operating systems. What are file descriptors? Why were they needed? Do they only matter on Unix/Linux operating systems? Where do we use file descriptors in our systems? I am going to answer all of these questions in this article.

In Unix/Linux operating systems, a file descriptor is a small non-negative integer that identifies an open resource within a process, and that resource could be a file or some other I/O resource. You can write this definition in the exam, or tell your professor to impress him/her. But if you want to know the real magic of file descriptors, you will need to read every part of this article. Most of us have heard that everything in Linux is a file. That is largely true because, in the Linux kernel, files, directories, sockets, etc. are all accessed through file descriptors via the virtual file system. The virtual file system defines how a user-level program accesses the actual physical file system, and it is implemented in the Linux kernel. Just think about it: what do you mostly see in your Kali or Parrot Linux distro? Most of the content in your Linux system is a file. For example, what do you have in /bin? What's in /etc or in /var? By the way, the entries in /dev are device files rather than regular files, but let's not go into that.
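To make the "everything is a file" idea a bit more concrete, here is a minimal sketch, assuming a Linux machine with Python 3. It opens a device file, a directory, a pipe and a socket pair, and shows that each one is handed to the process as a plain integer. The variable names are only illustrative.

```python
import os
import socket

# Four very different kinds of resources...
dev_fd = os.open(os.devnull, os.O_RDONLY)   # a device file (/dev/null)
dir_fd = os.open("/", os.O_RDONLY)          # a directory
pipe_r, pipe_w = os.pipe()                  # an anonymous pipe (two ends)
sock_a, sock_b = socket.socketpair()        # two connected sockets

# ...all show up in the process as small integers.
descriptors = {
    "device": dev_fd,
    "directory": dir_fd,
    "pipe (read end)": pipe_r,
    "pipe (write end)": pipe_w,
    "socket": sock_a.fileno(),
}
for kind, fd in descriptors.items():
    print(f"{kind:18} -> fd {fd}")

# Release everything again.
for fd in (dev_fd, dir_fd, pipe_r, pipe_w):
    os.close(fd)
sock_a.close()
sock_b.close()
```

The exact numbers depend on what the process already has open, so they will differ from run to run and from machine to machine.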

Let's get more practical. Open the Python interpreter on your Linux machine. You can use Python 2 or Python 3, it won't matter. Open a file using the open('FILE_PATH') function.
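A minimal sketch of that step, assuming Python 3. To keep it self-contained, it creates a throwaway temporary file instead of using a real FILE_PATH; the file content and variable names are made up for illustration.

```python
import os
import tempfile

# Create a throwaway file so the example does not depend on any real path.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("hello from a file\n")
    path = tmp.name

f = open(path)        # the kernel hands the process a new file descriptor
fd = f.fileno()       # the integer hidden behind the Python file object
content = f.read()    # reading goes through that descriptor
print("descriptor:", fd)
print("content:", content, end="")

f.close()             # releases the descriptor
os.remove(path)       # clean up the temporary file
```

The .fileno() method is the quickest way to see which descriptor number the open file received.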

What we have just done is create a file descriptor and, using that file descriptor, read the file's content from disk with the .read() method. We can also look into the resources the Python3 process is using and acquiring over time. Until you close that file descriptor with the .close() method, you can see it listed among the process's open files, pointing to the file we opened. In order to see those resources, you will need to find the process ID of the right Python3 interpreter. You can use the pidof command to do so.

There are other python3 processes running on my system, so I will guess that the most recent python3 process has a larger PID than the previous ones, and I will check 804565. Every running process has its own directory under /proc, so we can find the resources of our process in the /proc/804565 directory.
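If you would rather not guess the PID, the interpreter can tell you its own. This is a small sketch, assuming Linux with /proc mounted, that prints the PID and the first few entries of the matching /proc directory.

```python
import os

pid = os.getpid()                        # PID of this very interpreter
proc_dir = f"/proc/{pid}"                # its directory under /proc
entries = sorted(os.listdir(proc_dir))   # the resources of the process, as files

print("pid:", pid)
print("directory:", proc_dir)
print("some entries:", entries[:10])
```

Running os.getpid() in the same interpreter where the file was opened gives exactly the number to look up under /proc.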

As I said earlier, everything is a file, and you can see that the resources of a process are files too. You can google more about each file in this directory, but for now we will focus on the fd directory. Go into this directory and list every file.
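The same listing can be produced from Python. Here is a rough sketch, assuming Linux, that reads /proc/self/fd (a shortcut for the current process's own fd directory) and resolves each entry to its target. The helper name list_fds is made up for this example.

```python
import os
import tempfile

def list_fds():
    """Return {descriptor: target} for the current process, read from /proc/self/fd."""
    table = {}
    for name in os.listdir("/proc/self/fd"):
        try:
            table[int(name)] = os.readlink(f"/proc/self/fd/{name}")
        except OSError:
            # The descriptor used to scan the directory itself is already closed.
            continue
    return table

# Keep a temporary file open while taking the snapshot.
with tempfile.NamedTemporaryFile("w") as f:
    fds = list_fds()
    for fd, target in sorted(fds.items()):
        print(fd, "->", target)
    opened_target = fds[f.fileno()]      # where our own descriptor points
```

Each entry in the fd directory is a symbolic link, which is why os.readlink() is enough to find out what a descriptor refers to.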

We can see that a file descriptor is pointing to the file I opened in my python3. Descriptors 0, 1 and 2 are the default file descriptors of a process, and you can see that they point to a pts (pseudo-terminal) device. This is actually my terminal window (or pane in tmux). File descriptor 0 is for standard input, 1 for standard output and 2 for error output. If your process has subprocesses (created with the fork syscall), you will see that each subprocess, along with the root process, has the same pts because they are all running in the same terminal. This is also why output from several threads or processes can get mixed up on STDOUT: they all write through the same file descriptor to the same terminal, so when one of them wants to print something, another one may be in the middle of writing.
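Here is a hedged sketch of both points, assuming Linux and Python 3. It writes straight to descriptors 1 and 2, then forks a child and compares where descriptor 1 points in the parent and in the child. The helper fd_target and the pipe used to report back from the child are just scaffolding for this example.

```python
import os

# Write directly to the default descriptors, bypassing print().
os.write(1, b"to standard output (fd 1)\n")
os.write(2, b"to standard error (fd 2)\n")

def fd_target(fd):
    """Resolve where a descriptor of the calling process points."""
    return os.readlink(f"/proc/self/fd/{fd}")

parent_target = fd_target(1)

r, w = os.pipe()                 # channel for the child to report back
pid = os.fork()
if pid == 0:
    # Child: /proc/self now refers to the child process.
    os.close(r)
    os.write(w, fd_target(1).encode())
    os._exit(0)

os.close(w)
child_target = os.read(r, 4096).decode()
os.close(r)
os.waitpid(pid, 0)

print("parent fd 1 ->", parent_target)
print("child  fd 1 ->", child_target)
```

In a terminal both lines should name the same pts device, because the child inherits the parent's descriptors when it is forked.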

You can even see how cat works with file descriptors by tracing its syscalls with strace.

If you have worked with C, you must have heard about the read() and write() syscalls. cat uses the same syscalls to read and to print. For example, the write() syscall takes a file descriptor as its first argument, so in cat that would be 1, which points to the pts device (terminal) it was started in.
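To see the same pattern without C, here is a tiny cat-like sketch in Python. It is not how the real cat is implemented, only an illustration that uses os.open(), os.read() and os.write(), which are thin wrappers around the corresponding syscalls. The function name and the demo file are invented for this example.

```python
import os
import tempfile

def cat(path, out_fd=1, chunk_size=4096):
    """Copy a file to out_fd using only raw descriptors."""
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:                  # empty result means end of file
                break
            while chunk:                   # write() may accept fewer bytes than given
                written = os.write(out_fd, chunk)
                chunk = chunk[written:]
    finally:
        os.close(fd)

# Demo on a throwaway file; descriptor 1 is standard output.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("printed by a tiny cat\n")
    demo_path = tmp.name

cat(demo_path)
os.remove(demo_path)
```

Passing a different out_fd, for example 2, sends the same bytes to standard error instead.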

If you want to learn more about Unix/Linux file descriptors, you should learn about FIFO files. As the name suggests, they follow the First In First Out concept. To understand it easily, we'll quickly create a FIFO file using the mkfifo command and echo a string into it while a cat command prints the content from the FIFO file.

The FIFO file lol is working as a pipe: a string is pushed into it from one end and, from the other end, it is written out on the terminal. To understand what's happening here, we can again use the strace command to get all the syscalls required to run it.
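The same experiment can be sketched in a single Python script, assuming Linux. A background thread plays the role of echo and the main thread plays the role of cat. The FIFO is named lol as in the article, but it is created in a temporary directory, and the message text is made up.

```python
import os
import tempfile
import threading

fifo_dir = tempfile.mkdtemp()
fifo_path = os.path.join(fifo_dir, "lol")
os.mkfifo(fifo_path)                     # same idea as running `mkfifo lol`

def writer():
    # Opening for writing blocks until a reader opens the other end.
    fd = os.open(fifo_path, os.O_WRONLY)
    os.write(fd, b"hello through the fifo\n")
    os.close(fd)

t = threading.Thread(target=writer)
t.start()

# Reader side: blocks until the writer has opened the FIFO.
fd = os.open(fifo_path, os.O_RDONLY)
received = b""
while True:
    chunk = os.read(fd, 4096)
    if not chunk:                        # the writer closed its end
        break
    received += chunk
os.close(fd)
t.join()

os.write(1, received)                    # print what came out of the FIFO

os.remove(fifo_path)
os.rmdir(fifo_dir)
```

Both ends are ordinary file descriptors, so the reader and the writer use the same read() and write() calls they would use on a regular file.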

Now it is clearly visible how a file descriptor is used in the read() and write() syscalls.

In Windows, we have file handles, which work much like file descriptors but are different at a low level. A handle is declared as a void pointer, but it is an opaque value rather than a pointer to a memory block, whereas a Unix/Linux file descriptor is a small integer. I am not an expert in the Windows kernel, so I won't be able to explain it that well. You can check out this amazing blog.
