Device Drivers, Part 6: Decoding Character Device File Operations

Gearing up for character drivers

This article, part of the series on Linux device drivers, continues to cover the concepts of character drivers and their implementation, which the previous two articles [1, 2] introduced.

So, what was your guess on how Shweta would crack the problem? Obviously, with the help of Pugs. Wasn’t it obvious? In our previous article, we saw how Shweta was puzzled by not being able to read any data, even after writing into the /dev/mynull character device file. Suddenly, a bell rang — not inside her head, but a real one at the door. And for sure, there was Pugs.

“How come you’re here?” exclaimed Shweta.

“I saw your tweet. It’s cool that you cracked your first character driver all on your own. That’s amazing. So, what are you up to now?” asked Pugs.

“I’ll tell you, on the condition that you do not play spoilsport,” replied Shweta.

Pugs smiled, “Okay, I’ll only give you advice.”

“And that too, only if I ask for it! I am trying to understand character device file operations,” said Shweta.

Pugs perked up, saying, “I have an idea. Why don’t you decode and then explain what you’ve understood about it?”

Shweta felt that was a good idea. She tail’ed the dmesg log to observe the printk output from her driver. Alongside, she opened her null driver code on her console, specifically observing the device file operations my_open, my_close, my_read, and my_write.

static int my_open(struct inode *i, struct file *f)
{
    printk(KERN_INFO "Driver: open()\n");
    return 0;
}
static int my_close(struct inode *i, struct file *f)
{
    printk(KERN_INFO "Driver: close()\n");
    return 0;
}
static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: read()\n");
    return 0;
}
static ssize_t my_write(struct file *f, const char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: write()\n");
    return len;
}

Based on the earlier understanding of return values of kernel functions, my_open() and my_close() are trivial: both have the return type int, and both return zero, which means success.

However, the return type of both my_read() and my_write() is not int but ssize_t. On further digging through the kernel headers, that turns out to be a signed word. So, returning a negative number would be the usual way to signal an error. But a non-negative return value would have additional meaning. For the read operation, it would be the number of bytes read, and for the write operation, it would be the number of bytes written.

Reading the device file

To understand this in detail, the complete flow has to be given a relook. Let’s take the read operation first. When the user does a read from the device file /dev/mynull, that system call comes to the virtual file system (VFS) layer in the kernel. VFS decodes the <major, minor> tuple, and figures out that it needs to redirect it to the driver’s function my_read(), that’s registered with it. So from that angle, my_read() is invoked as a request to read, from us — the device-driver writers. And hence, its return value would indicate to the requesters (i.e., the users), how many bytes they are getting from the read request.

In our null driver example, we returned zero — which meant no bytes available, or in other words, the end of the file. And hence, when the device file is being read, the result is always nothing, independent of what is written into it.
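Seen from user space, that zero is what ends a read loop. The following is a minimal user-space sketch, not part of the driver: read_until_eof() is a hypothetical helper name, and it treats a negative return value as an error, zero as the end of the file, and a positive value as the number of bytes delivered.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: drain fd and return the total number of bytes
 * read, or -1 on error. */
long read_until_eof(int fd)
{
    char chunk[64];
    long total = 0;

    assert(fd >= 0);
    for (;;) {
        ssize_t n = read(fd, chunk, sizeof(chunk));

        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal: try again */
            return -1;      /* negative return: a real error */
        }
        if (n == 0)
            break;          /* zero: end of file, stop reading */
        total += n;         /* positive: that many bytes arrived */
    }
    return total;
}
```

Run against a device whose read handler always returns zero, such as the null driver here, the loop would be expected to stop at the very first call and report 0 bytes.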

“Hmmm… So, if I change it to 1, would it start giving me some data?” asked Pugs, by way of verifying.

Shweta paused for a while, looked at the parameters of the function my_read() and answered in the affirmative, but with a caveat — the data sent would be some junk data, since my_read() is not really populating data into buf (the buffer variable that is the second parameter of my_read(), provided by the user). In fact, my_read() should write data into buf, according to len (the third parameter to the function), the count in bytes requested by the user.

To be more specific, it should write less than, or equal to, len bytes of data into buf, and the number of bytes written should be passed back as the return value. No, this is not a typo — in the read operation, device-driver writers “write” into the user-supplied buffer. We read the data from (possibly) an underlying device, and then write that data into the user buffer, so that the user can read it. “That’s really smart of you,” said Pugs, sarcastically.
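Shweta’s caveat can be modelled in plain user-space C. This is only a sketch under stated assumptions: model_read_unfilled() and model_read_filled() are hypothetical names, memcpy() stands in for the kernel’s copy into the user buffer, and the data and avail parameters stand in for whatever an underlying device holds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Model of a handler that claims one byte but never touches buf:
 * the caller sees whatever its buffer already contained. */
ssize_t model_read_unfilled(char *buf, size_t len)
{
    (void)buf;
    return len > 0 ? 1 : 0;
}

/* Model of a handler that copies at most len bytes of the available
 * data into buf and returns the number of bytes it copied. */
ssize_t model_read_filled(char *buf, size_t len,
                          const char *data, size_t avail)
{
    size_t n = len < avail ? len : avail;

    assert(buf != NULL && data != NULL);
    memcpy(buf, data, n);
    return (ssize_t)n;
}
```

With a four-byte buffer pre-filled with X, the first model returns 1 and leaves the X in place (the “junk”), while the second returns 4 for five available bytes and 2 when only two are available.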

Writing into the device file

The write operation is the reverse. The user provides len (the third parameter of my_write()) bytes of data to be written, in buf (the second parameter of my_write()). The my_write() function would read that data and possibly write it to an underlying device, and return the number of bytes that have been successfully written.

“Aha!! That’s why all my writes into /dev/mynull have been successful, without actually doing any read or write,” exclaimed Shweta, filled with happiness at understanding the complete flow of device file operations.
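Because the return value is the number of bytes actually accepted, a careful user-space caller compares it with what it asked for and retries with the remainder. A minimal sketch of that caller side, assuming a hypothetical helper named write_all():

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all len bytes are
 * accepted. Returns 0 on success, -1 on error. */
int write_all(int fd, const char *data, size_t len)
{
    size_t done = 0;

    assert(data != NULL);
    while (done < len) {
        ssize_t n = write(fd, data + done, len - done);

        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal: try again */
            return -1;
        }
        if (n == 0)
            return -1;      /* no progress: give up rather than spin */
        done += (size_t)n;  /* short write: continue with the rest */
    }
    return 0;
}
```

Since my_write() returns len, a caller like this would finish after a single write() call on /dev/mynull.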

Preserving the last character

With Shweta not giving Pugs any chance to correct her, he came up with a challenge. “Okay. Seems like you are thoroughly clear with the read/write fundamentals; so, here’s a question for you. Can you modify these my_read() and my_write() functions such that whenever I read /dev/mynull, I get the last character written into /dev/mynull?”

Confidently, Shweta took on the challenge, and modified my_read() and my_write() as follows, adding a static global character variable:

static char c;

static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: read()\n");
    buf[0] = c;
    return 1;
}
static ssize_t my_write(struct file *f, const char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: write()\n");
    c = buf[len - 1];
    return len;
}

“Almost there, but what if the user has provided an invalid buffer, or if the user buffer is swapped out? Wouldn’t this direct access of the user-space buf just crash and oops the kernel?” pounced Pugs.

Shweta, refusing to be intimidated, dived into her collated material and figured out that there are two APIs, copy_to_user() and copy_from_user(), meant for accessing user-space buffers safely. With a complete understanding of these APIs, she rewrote the above code snippet as follows:

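#include <linux/uaccess.h> /* copy_to_user(), copy_from_user() */

static char c;

static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: read()\n");
    /* copy_to_user() returns the number of bytes it could not copy */
    if (copy_to_user(buf, &c, 1) != 0)
        return -EFAULT;
    else
        return 1;
}
static ssize_t my_write(struct file *f, const char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: write()\n");
    if (len == 0) /* nothing written, so there is no last character */
        return 0;
    /* copy only the final byte of the user's buffer into c */
    if (copy_from_user(&c, buf + len - 1, 1) != 0)
        return -EFAULT;
    else
        return len;
}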
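Both copy helpers return the number of bytes that could not be copied, so zero means success and anything else is mapped to -EFAULT. The write path can be modelled in user space to see that convention at work. This is a sketch only: fake_copy_from_user(), simulate_fault, model_write() and model_last_char() are hypothetical names, and the fault is simulated with a flag because an ordinary program cannot produce a real bad user pointer safely. It also returns early when len is zero, since buf + len - 1 would otherwise point before the buffer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

static char last_char;  /* plays the role of the driver's static char c */
int simulate_fault;     /* test knob: non-zero makes the copy "fail" */

/* Stand-in for copy_from_user(): returns the number of bytes NOT copied. */
static unsigned long fake_copy_from_user(void *to, const void *from,
                                         unsigned long n)
{
    if (simulate_fault)
        return n;       /* pretend the user pointer was invalid */
    memcpy(to, from, n);
    return 0;
}

/* Model of my_write(): remember the last byte and report len as written. */
ssize_t model_write(const char *buf, size_t len)
{
    assert(buf != NULL);
    if (len == 0)
        return 0;       /* nothing written, so no last character */
    if (fake_copy_from_user(&last_char, buf + len - 1, 1) != 0)
        return -EFAULT;
    return (ssize_t)len;
}

char model_last_char(void)
{
    return last_char;
}
```

After model_write("Pugs", 4), the remembered character is s, and a simulated fault leaves it unchanged while the call returns -EFAULT.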

Then Shweta repeated the usual build-and-test steps as follows:

  1. Build the modified “null” driver (.ko file) by running make.
  2. Load the driver using insmod.
  3. Write into /dev/mynull, say, using echo -n "Pugs" > /dev/mynull
  4. Read from /dev/mynull using cat /dev/mynull (stop by using Ctrl+C)
  5. Unload the driver using rmmod.

On cat’ing /dev/mynull, the output was a non-stop, infinite sequence of the character s, as my_read() gives the last character forever. So, Pugs intervened and pressed Ctrl+C to stop the infinite read, and tried to explain, “If this is to be changed to ‘the last character only once’, my_read() needs to return 1 the first time, and zero from the second time onwards. This can be achieved using off (the fourth parameter of my_read()).”

Shweta nodded her head obligingly, just to bolster Pugs’ ego.
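Pugs’ hint about off can be sketched as a user-space model. The names here are assumptions for illustration: model_read() is hypothetical, off_t stands in for the kernel’s loff_t, memcpy() stands in for copy_to_user(), and c is preset to s as if “Pugs” had just been written. The key point is that the offset changes only if the handler itself updates it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

static char c = 's';    /* assumed: the last character written earlier */

/* Model of my_read(): deliver the character once, then report end of file. */
ssize_t model_read(char *buf, size_t len, off_t *off)
{
    assert(buf != NULL && off != NULL);
    if (*off > 0 || len == 0)
        return 0;       /* already delivered (or nothing requested) */
    memcpy(buf, &c, 1); /* copy_to_user(buf, &c, 1) in the real driver */
    (*off)++;           /* remember that one byte has been handed out */
    return 1;
}
```

The first call with an offset of zero returns 1 and advances the offset; the second call sees a non-zero offset and returns 0, which is what lets cat stop. In the driver itself, the -EFAULT check around copy_to_user() would stay in place.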

  • Jerrin Shaji George

    Could you explain how the fourth parameter ‘off’ can be used to prevent the infinite output sequence? Thanks in advance.

    • http://twitter.com/anil_pugalia Anil Pugalia

      For that, please first understand why there is an infinite sequence in the first place. It is because the read never signals end of file by returning zero, but keeps on giving data whenever asked. With this, user code that reads until end of file would go into an infinite loop. Hence, to fix this, you need a case that returns zero. In our case, we chose that case to be “when you try to read the second time or the second byte”, which is very well captured by the fourth parameter ‘off’, which tells us where exactly the reading had already reached.

  • Guest

    You might want to add the linux header uaccess.h to access those api calls.

    • http://twitter.com/anil_pugalia Anil Pugalia

      You are right. I missed mentioning that.

  • http://sosaysharis.wordpress.com/ Haris Ibrahim K. V.

    Please mention to add the header in order for the copy_to_user and copy_from_user function calls to work properly. I’m following each and every step of your guide. :)

    • http://twitter.com/anil_pugalia Anil Pugalia

      Thanks for the addendum.

  • http://sosaysharis.wordpress.com/ Haris Ibrahim K. V.

    I made the changes in my write and read functions and built the driver. However, after writing to the driver using echo, when I tried “cat /dev/mynull” nothing happened. The output was same as in last article. Any idea where I might have gone wrong?

  • PeterHiggs

    I am able to compile successfully, then I insmod the module, then I did the write operation as you mentioned: “$ echo -1 “Pugs” > /dev/mynull”. Now I tried to read the file using “$ sudo cat /dev/mynull” and it seems to go into an infinite loop, but I am not able to see any character. I can’t see anything, and the read() operation keeps looping. Where am I going wrong?

    • anil_pugalia

      It should be “echo -n” not “echo -1”. Please correct it & try.

  • Sab

    Firstly, great article. I bought the Linux device driver book by O’Reilly but could not understand much. Most of my learning has been from your website. If you write a book please do let me know. I will be the first to buy it :-). I tried to use the value of *off and printed it to my logs, but it appears that it is always 0. I had to use a separate variable to make it print only once. Could you please explain how to do it using the long offset pointer?

    • http://twitter.com/anil_pugalia Anil Pugalia

      Thanks for reading & appreciating the article. *off would change
      only if the driver changes it. So, in read when it is 0, you need to put
      the value in buf & then increment the *off, i.e. do a (*off)++;

      • Sab

        Okay. That clears things up. Thanks

  • Amit

    Thanks, sir, for your contribution.
    Sir, is there any way to track how control goes from our call to open to the .open in the operations struct?
    Is there any tool or other way to trace the control flow in a device driver?

    • http://twitter.com/anil_pugalia Anil Pugalia

      You may try strace, printk in kernel, … to name a few.

  • Rahul

    Awesome article ….

    • http://twitter.com/anil_pugalia Anil Pugalia

      Thanks for reading & appreciating.

All published articles are released under Creative Commons Attribution-NonCommercial 3.0 Unported License, unless otherwise noted.