
The bcm2835 library uses direct memory access to the GPIO and other peripherals. In this chapter we look at how this works. You don't need to know this but if you need to modify the library or access features that the library doesn't expose this is the way to go. 

 

 

 

Now On Sale!

You can now buy a print or ebook edition of Raspberry Pi IoT in C from Amazon.

 

For Errata and Listings Visit: IO Press

 

 

This is our ebook on using the Raspberry Pi to implement IoT devices using the C programming language. The full contents can be seen below. Notice this is a first draft and a work in progress. 

Chapter List

  1. Introducing Pi (paper book only)

  2. Getting Started With NetBeans In this chapter we look at why C is a good language to work in when you are creating programs for the IoT and how to get started using NetBeans. Of course this is where Hello C World makes an appearance.

  3. First Steps With The GPIO
    The bcm2835 C library is the easiest way to get in touch with the Pi's GPIO lines. In this chapter we take a look at the basic operations involved in using the GPIO lines with an emphasis on output. How fast can you change a GPIO line, how do you generate pulses of a given duration and how can you change multiple lines in sync with each other? 

  4. GPIO The SYSFS Way
    There is a Linux-based approach to working with GPIO lines and serial buses that is worth knowing about because it provides an alternative to using the bcm2835 library. Sometimes you need this because you are working in a language for which direct access to memory isn't available. It is also the only way to make interrupts available in a C program.

  5. Input and Interrupts
    There is no doubt that input is more difficult than output. When you need to drive a line high or low you are in command of when it happens but input is in the hands of the outside world. If your program isn't ready to read the input or if it reads it at the wrong time then things just don't work. What is worse is that you have no idea what your program was doing relative to the event you are trying to capture - welcome to the world of input.

  6. Memory Mapped I/O
    The bcm2835 library uses direct memory access to the GPIO and other peripherals. In this chapter we look at how this works. You don't need to know this but if you need to modify the library or access features that the library doesn't expose this is the way to go. 

  7. Near Realtime Linux
    You can write real time programs using standard Linux as long as you know how to control scheduling. In fact it turns out to be relatively easy and it enables the Raspberry Pi to do things you might not think it capable of. There are also some surprising differences between the one and quad core Pis that make you think again about real time Linux programming.

  8. PWM
    One way around the problem of getting a fast response from a microcontroller is to move the problem away from the processor. In the case of the Pi's processor there are some built-in devices that can use GPIO lines to implement protocols without the CPU being involved. In this chapter we take a close look at pulse width modulation (PWM) including sound, driving LEDs and servos.

  9. I2C Temperature Measurement
    The I2C bus is one of the most useful ways of connecting moderately sophisticated sensors and peripherals to any processor. The only problem is that it can seem like a nightmare confusion of hardware, low level interaction and high level software. There are few general introductions to the subject because at first sight every I2C device is different, but here we present one.

  10. A Custom Protocol - The DHT11/22
    In this chapter we make use of all of the ideas introduced in earlier chapters to create a raw interface with the low cost DHT11/22 temperature and humidity sensor. It is an exercise in implementing a custom protocol directly in C. 

  11. One Wire Bus Basics
    The Raspberry Pi is fast enough to be used to directly interface to a 1-Wire bus without the need for drivers. The advantage of programming our own 1-Wire bus protocol is that it doesn't depend on the uncertainties of a Linux driver.

  12. iButtons
    If you haven't discovered iButtons then you are going to find lots of uses for them. At its simplest an iButton is an electronic key providing a unique code stored in its ROM which can be used to unlock or simply record the presence of a particular button. What is good news is that they are easy to interface to a Pi. 

  13. The DS18B20
    Using the software developed in previous chapters we show how to connect and use the very popular DS18B20 temperature sensor without the need for external drivers. 

  14. The Multidrop 1-wire bus
    Sometimes it is just easier from the point of view of hardware to connect a set of 1-wire devices to the same GPIO line but this makes the software more complex. Find out how to discover what devices are present on a multi-drop bus and how to select the one you want to work with.

  15. SPI Bus
    The SPI bus can be something of a problem because it doesn't have a well defined standard that every device conforms to. Even so if you only want to work with one specific device it is usually easy to find a configuration that works - as long as you understand what the possibilities are. 

  16. SPI MCP3008/4 AtoD  (paper book only)

  17. Serial (paper book only)

  18. Getting On The Web - After All It Is The IoT (paper book only)

  19. WiFi (paper book only)

 

Accessing the hardware directly isn't something that everyone wants or needs to do but knowing how it all works gives you a different perspective. It means you can think about what you are doing, even if it is only using the bcm2835 library, in a broader and deeper way. In this chapter we look in more detail at the GPIO, its hardware and how it is controlled by the software. In particular we look at the ingenious method that Linux uses to allow you to access peripherals or any memory you want to. This is useful in a wider context because you can use the same techniques to map any file into memory. The same techniques can be used to work with other hardware in the BCM2835 that perhaps the library doesn't cater for. 

All of the peripherals that are directly connected to the processor are memory mapped. What this means is that there is a set of addresses that correspond to "registers" that control the devices and give their status. Using these is just a matter of knowing what addresses to use, what the format of the registers is and how to directly use memory under Linux.

Easy to say - slightly more difficult to get right. 

However after you have got it right you can't understand what the fuss was about.

The best way to understand how all of this works is to find out about a particular peripheral - the GPIO.

The GPIO Registers

If you look at the manual for the BCM2835 processor you will find a long section on the registers that are connected to the GPIO lines. This looks very complicated but in fact it comes down to a very simple pattern.

There are 54 GPIO lines arranged as two banks, not all of which are usable on the Pi.

For each GPIO line there is a three bit configuration code that sets it to input, output or one of the alternate functions:

000 = GPIO Pin is an input
001 = GPIO Pin is an output
100 = GPIO Pin takes alternate function 0
101 = GPIO Pin takes alternate function 1
110 = GPIO Pin takes alternate function 2
111 = GPIO Pin takes alternate function 3
011 = GPIO Pin takes alternate function 4
010 = GPIO Pin takes alternate function 5

These three bits are packed into six 32 bit function select or configuration registers, FSEL0 to FSEL5. The first function select register holds the configuration bits for GPIO 0 to 9, i.e. 10 GPIO lines, with GPIO 0 as the three low order bits and GPIO 9 as bits 27, 28 and 29. Bits 30 and 31 are unused in each of the registers.

The first register is:

bits 0-2     GPIO 0
bits 3-5     GPIO 1
...
bits 27-29   GPIO 9
bits 30-31   unused

Basically the three configuration bits are packed into the 32 bit registers as best they can be. This arrangement continues for the next four registers, which control ten GPIO lines each, but the sixth register, FSEL5, only holds the configuration bits for GPIO 50 to 53 and bits 12 to 31 are unused. 

Packing 54 GPIO lines into multiples of 32 bits is always going to leave some bits over. 
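
The packing rule comes down to a little arithmetic. What follows is a minimal sketch, not code taken from the bcm2835 library. The function names fsel_index, fsel_shift and fsel_set are invented for illustration and the register is modeled as an ordinary variable so that the logic can be tried without any hardware:

```c
#include <assert.h>
#include <stdint.h>

/* Index of the function select register (0 to 5) holding a GPIO line's code. */
static unsigned fsel_index(unsigned gpio) {
    assert(gpio < 54);
    return gpio / 10;          /* ten lines per register */
}

/* Position of the lowest of the three configuration bits in that register. */
static unsigned fsel_shift(unsigned gpio) {
    assert(gpio < 54);
    return (gpio % 10) * 3;    /* three bits per line */
}

/* Return reg with the three bits for gpio replaced by code (0 to 7). */
static uint32_t fsel_set(uint32_t reg, unsigned gpio, uint32_t code) {
    uint32_t mask = UINT32_C(7) << fsel_shift(gpio);
    return (reg & ~mask) | ((code & 7u) << fsel_shift(gpio));
}
```

For GPIO 4 this gives register 0 and a shift of 12, so selecting output (001) in an otherwise zero register produces 0x1000.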

Once the GPIO lines are configured into either input or output you can use the set and clear registers to set them high or low and the Level registers to read the state of inputs. 

There are two Set and two Clear registers. 

The first register of each pair controls GPIO lines 0 to 31 and the second controls GPIO lines 32 to 53. One bit is assigned per line and the second register of each pair has bits 22 to 31 unused. That is, bit zero in Set 0 or Clear 0 controls GPIO 0, bit one controls GPIO 1 and so on. 

 

If you write to either register then the lines that correspond to one bits are set or cleared according to which register you write to. There is no simple Out register that you can write to simultaneously set and clear bits. The reason is that you can use Set and Clear to set or clear any GPIO lines without changing the state of the others. 

For example suppose you want to set GPIO 0 to a one then you would write 0x01 to the Set 0 register. In this case the zero bits have no effect. A general Out register would also set all of the other lines it controlled to low in response to the zeros. This means controlling GPIO lines with a general Out register usually involves a read to establish the current state of all of the GPIO lines, then a logical operation to set or clear a particular line followed by a write. Having Set and Clear registers means you can set any group of lines to high or low in one write operation. However what you cannot do is set lines high and low at the same time. 
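
To make the behavior concrete, here is a small model of the Set and Clear registers. It is only a sketch under the assumption that the current line levels can be represented by a plain 32 bit value. The names gpio_bank, gpio_bit, after_set_write and after_clr_write are hypothetical and nothing here touches real hardware:

```c
#include <assert.h>
#include <stdint.h>

/* Which register of a Set/Clear/Level pair a GPIO line belongs to: 0 or 1. */
static unsigned gpio_bank(unsigned gpio) {
    assert(gpio < 54);
    return gpio / 32;
}

/* The single bit that represents the line within that register. */
static uint32_t gpio_bit(unsigned gpio) {
    assert(gpio < 54);
    return UINT32_C(1) << (gpio % 32);
}

/* Line levels after writing value to a Set register: one bits go high,
   zero bits leave their lines alone. */
static uint32_t after_set_write(uint32_t levels, uint32_t value) {
    return levels | value;
}

/* Line levels after writing value to a Clear register: one bits go low,
   zero bits leave their lines alone. */
static uint32_t after_clr_write(uint32_t levels, uint32_t value) {
    return levels & ~value;
}
```

Writing 0x01 to the modeled Set register changes only GPIO 0, which is exactly why no read is needed before the write.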

To summarize the main registers controlling the GPIO lines are

FSEL0-FSEL5    configuration registers three bits 
               for each GPIO line

SET0     Set any group of GPIO lines high 
SET1 

CLR0  Set any group of GPIO lines low
CLR1

LEV0  read the state of all GPIO lines
LEV1

There are also pairs of registers that control event detection and interrupts. These follow the description of the functions in the interrupt section of the previous chapter. 

These are:

EDS0  Event detect status
EDS1

REN0 Rising edge detect enable
REN1

FEN0 Falling edge detect enable
FEN1

HEN0 High detect enable
HEN1

LEN0 Low detect enable
LEN1 

and two async versions of the edge detection enables:

AREN0 async rising edge detect enable
AREN1

AFEN0 async falling edge detect enable
AFEN1

There are also three registers that let you set the pullup/pulldown behavior of any GPIO line:

UD      Pull up/down enable
UDCLK0  Enable clock
UDCLK1

You will notice that unlike the other enable registers there is only a single 32 bit Pull up/down register to control all 54 GPIO lines. In fact only the two low order bits control anything.

b1 b0
 0   0   off (pull up/down disabled)
 0   1   pull down
 1   0   pull up
 1   1   reserved

To set which GPIO line the new state refers to you have to use the UDCLK pair of registers. The procedure is surprisingly complicated. 

  1. set the state required in the UD register
  2. wait 150 clock cycles for the hardware to settle
  3. write a 1 to each bit position in UDCLK0/1 to determine which GPIO lines the new state will apply to
  4. wait 150 clock cycles for the hardware to settle
  5. write all zeros to UD and UDCLK0/1 to clear the set state

The bcm2835 library provides functions that expose this clocking procedure and a function that hides it from you. The

bcm2835_gpio_set_pud (uint8_t pin, uint8_t pud)

function simply sets the pin you select to the specified pullup/down. 
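
The clocking procedure can be sketched directly in C. This is an illustration of the five steps and not the library's implementation. It assumes that gpio points at the start of the GPIO register block, that the UD and UDCLK0 registers are at byte offsets 0x94 and 0x98 as listed later in this chapter, and that a simple counting loop is long enough to serve as the 150 cycle wait. The names set_pull and short_wait are invented:

```c
#include <assert.h>
#include <stdint.h>

#define GPPUD      (0x94 / 4)   /* word offsets from the GPIO base */
#define GPPUDCLK0  (0x98 / 4)

/* Crude delay; assumes each loop iteration costs at least one clock cycle. */
static void short_wait(void) {
    for (volatile int i = 0; i < 150; i++) { }
}

/* Apply pull setting pud (0 off, 1 down, 2 up) to one GPIO line.
   gpio points at the start of the GPIO register block. */
static void set_pull(volatile uint32_t *gpio, unsigned pin, uint32_t pud) {
    assert(pin < 54 && pud < 3);
    gpio[GPPUD] = pud;                                       /* step 1 */
    short_wait();                                            /* step 2 */
    gpio[GPPUDCLK0 + pin / 32] = UINT32_C(1) << (pin % 32);  /* step 3 */
    short_wait();                                            /* step 4 */
    gpio[GPPUD] = 0;                                         /* step 5 */
    gpio[GPPUDCLK0 + pin / 32] = 0;
}
```

Because the function only needs a pointer to the register block, it can be exercised against an ordinary array before being pointed at the real registers.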

There is one more set of registers to consider that is not listed in the standard documentation for reasons that are not clear. There are three PAD registers that set the fine details of the GPIO drive. You can consider the PAD registers as additional to the PUD registers. The three registers control groups of GPIO lines:

PAD0 GPIO 0-27
PAD1 GPIO 28-45
PAD2 GPIO 46-53

The configuration is set in bits 0 to 4:

bits 2,1,0   Drive Strength
000          = 2mA
001          = 4mA
n            = (n+1)*2mA

bit 3 controls hysteresis: 0 = disabled, 1 = enabled

bit 4 controls slew rate: 0 = slew rate limited, 1 = slew rate not limited

The top 8 bits of the PAD registers have to be set to 0x5A to allow the PAD register to be written to. This is misleadingly called a PASSWRD. 

There isn't much documentation on hysteresis and slew rate but, broadly speaking, hysteresis makes the input a Schmitt trigger and slew rate puts a small capacitive load on the input. 

The Drive Strength setting deserves explanation. This isn't the amount of current that the pin can supply. It is the effective output resistance. Each time the drive current is increased by 2mA another transistor is used in the drive so lowering the output resistance. This has the effect of increasing or decreasing the voltage. 

This is a subtle idea and related to how much current is needed to make the output 3.3V or 0V. For example, if you have a 2mA drive current then a load that draws 2mA will have a voltage across it within the tolerance for 3.3V logic. If the load draws more than 2mA then the output voltages will not be high and low enough to meet the tolerances 0.8V for zero and 1.3V for one.

The drive current is not a maximum current that can be supplied and it is certainly not a current limiting value. 

If you short an output to 0V or 3.3V then it will supply as much current as it can before overheating and failing. 

The power up defaults for PAD registers are Slew rate unlimited, hysteresis enabled and drive 8mA.
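
Putting the PAD fields together is a matter of shifting and ORing. The following sketch builds the value you would write to a PAD register from the description above. The macro and function names are invented for illustration and the value is only computed, not written to the hardware:

```c
#include <assert.h>
#include <stdint.h>

#define PAD_PASSWRD   (UINT32_C(0x5A) << 24)  /* must be in the top 8 bits */
#define PAD_SLEW_FAST (UINT32_C(1) << 4)      /* 1 = slew rate not limited */
#define PAD_HYST_ON   (UINT32_C(1) << 3)      /* 1 = hysteresis enabled */

/* Build a PAD register value. drive_ma must be 2, 4, ..., 16. */
static uint32_t pad_value(unsigned drive_ma, int hysteresis, int fast_slew) {
    assert(drive_ma >= 2 && drive_ma <= 16 && drive_ma % 2 == 0);
    uint32_t n = drive_ma / 2 - 1;            /* inverse of (n+1)*2mA */
    return PAD_PASSWRD | (fast_slew ? PAD_SLEW_FAST : 0)
                       | (hysteresis ? PAD_HYST_ON : 0) | n;
}
```

The power up defaults of unlimited slew rate, hysteresis enabled and 8mA drive correspond to pad_value(8, 1, 1), which is 0x5A00001B.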


Where Are The Registers?

The only question we have to answer now is where are the registers?

This turns out to be a difficult question to answer because the processor implements memory mapping, which essentially means that any physical address can appear almost anywhere in the memory map of a running program. 

All of the peripheral registers, including the GPIO registers we have been describing, are in a block of memory from 0x2000 0000 to 0x20FF FFFF according to the documentation, but the start of the block has been changed to 0x3F00 0000 for the Pi 2 and 3, which is a source of much confusion. 

So in the Pi 1 the peripheral registers start at 0x2000 0000. For a Pi 2 and later the start of the registers is given by the contents of a file set up by the Linux Device Tree. So to find out where the registers are you have to read:

/proc/device-tree/soc/ranges

which contains three four-byte values, each stored most significant byte first. The second value, bytes 4 to 7, is the address of the start of the peripheral memory and the third value, bytes 8 to 11, is its size. In practice reading the file returns 0x3F00 0000 for a Pi 2 and 3 but this could change in the future.

The best way to discover where the peripheral registers are located is to try to read the file and use the default 0x2000 0000 if it isn't present. For example:

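/* default for the Pi 1, used if the file can't be read */
uint32_t *bcm2835_peripherals_base = (uint32_t *)0x20000000;
FILE *fp;
if ((fp = fopen("/proc/device-tree/soc/ranges" , "rb")))
{
    unsigned char buf[8];
    /* bytes 4 to 7 hold the start address, most significant byte first */
    if (fread(buf, 1, sizeof(buf), fp) == sizeof(buf))
        bcm2835_peripherals_base =
            (uint32_t *)(uintptr_t)((uint32_t)buf[4] << 24 |
                                    (uint32_t)buf[5] << 16 |
                                    (uint32_t)buf[6] << 8  |
                                    (uint32_t)buf[7] << 0);
    fclose(fp);
}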

 

In practice it is easier to rely on the bcm2835 library and the 

bcm2835_peripherals_base 

variable which is set to the start of the peripherals area when you initialize the library - i.e. it is set correctly for all Pis. 

The register addresses can be specified as offsets from bcm2835_peripherals_base but it is easier to take the offsets from where the first register belonging to the device is.

So for the GPIO the first GPIO register is at offset 0x20 0000, which is BCM2835_GPIO_BASE in the bcm2835 library.

That is, the starting address of the GPIO registers is given by:

gpio address = bcm2835_peripherals_base +
               BCM2835_GPIO_BASE

The offsets of each of the registers from this address are:

0x0000 GPFSEL0 GPIO Function Select 0 
0x0004 GPFSEL1 GPIO Function Select 1 
0x0008 GPFSEL2 GPIO Function Select 2 
0x000C GPFSEL3 GPIO Function Select 3
0x0010 GPFSEL4 GPIO Function Select 4
0x0014 GPFSEL5 GPIO Function Select 5
   
0x001C GPSET0 GPIO Pin Output Set 0
0x0020 GPSET1 GPIO Pin Output Set 1
   
0x0028 GPCLR0 GPIO Pin Output Clear 0
0x002C GPCLR1 GPIO Pin Output Clear 1
   
0x0034 GPLEV0 GPIO Pin Level 0
0x0038 GPLEV1 GPIO Pin Level 1
   
0x0040 GPEDS0 GPIO Pin Event Detect Status 0
0x0044 GPEDS1 GPIO Pin Event Detect Status 1
   
0x004C GPREN0 GPIO Pin Rising Edge Detect Enable 0
0x0050 GPREN1 GPIO Pin Rising Edge Detect Enable 1
0x0058 GPFEN0 GPIO Pin Falling Edge Detect Enable 0 
0x005C GPFEN1 GPIO Pin Falling Edge Detect Enable 1
0x0064 GPHEN0 GPIO Pin High Detect Enable 0
0x0068 GPHEN1 GPIO Pin High Detect Enable 1
0x0070 GPLEN0 GPIO Pin Low Detect Enable 0
0x0074 GPLEN1 GPIO Pin Low Detect Enable 1
0x007C GPAREN0 GPIO Pin Async. Rising Edge Detect 0
0x0080 GPAREN1 GPIO Pin Async. Rising Edge Detect 1
0x0088 GPAFEN0 GPIO Pin Async. Falling Edge Detect 0
0x008C GPAFEN1 GPIO Pin Async. Falling Edge Detect 1
   
0x0094 GPPUD GPIO Pin Pull-up/down Enable
0x0098 GPPUDCLK0 GPIO Pin Pull-up/down Enable Clock 0
0x009C GPPUDCLK1 GPIO Pin Pull-up/down Enable Clock 1

 

The only registers missing from this list are the PAD control registers and these are located in a different area of memory at:

pads address = bcm2835_peripherals_base + BCM2835_GPIO_PADS
             = 0x20100000 or
             = 0x3F100000 for Pi 2

And the offsets are:

0x002c PADS0 (GPIO 0-27)
0x0030 PADS1 (GPIO 28-45)
0x0034 PADS2 (GPIO 46-53)

You can work out the address of any register listed in the documentation by finding the base address for the registers and adding its offset. 
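
As a worked example of this arithmetic, the following sketch computes physical register addresses from the values quoted in this chapter. The macro names GPIO_BLOCK and PADS_BLOCK and the function reg_address are invented for illustration and are not part of the bcm2835 library:

```c
#include <assert.h>
#include <stdint.h>

#define GPIO_BLOCK 0x200000u   /* offset of the GPIO registers from the base */
#define PADS_BLOCK 0x100000u   /* offset of the PAD registers from the base */

/* Physical address of a register: peripheral base + block + register offset. */
static uint32_t reg_address(uint32_t base, uint32_t block, uint32_t offset) {
    assert(offset % 4 == 0);   /* registers are 32 bits wide */
    return base + block + offset;
}
```

For example, on a Pi 2 the SET0 register at offset 0x1C is at reg_address(0x3F000000, GPIO_BLOCK, 0x1C), which is 0x3F20001C.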

It is worth knowing that you can use

cat /proc/iomem

to discover where devices are in memory. On a Pi 2 you will see:

00000000-3affffff : System RAM
  00008000-007e641f : Kernel code
  0085e000-0098c1ab : Kernel data
3f006000-3f006fff : dwc_otg
3f007000-3f007eff : /soc/dma@7e007000
3f00b840-3f00b84e : /soc/vchiq
3f00b880-3f00b8bf : /soc/mailbox@7e00b800
3f200000-3f2000b3 : /soc/gpio@7e200000
3f201000-3f201fff : /soc/uart@7e201000
  3f201000-3f201fff : /soc/uart@7e201000
3f202000-3f2020ff : /soc/sdhost@7e202000
3f980000-3f98ffff : dwc_otg

 


Linux Memory Access /dev/mem

If you recall the discussion in the chapter on SYSFS, Linux likes to have every I/O device look as much like a file as possible. This is a basic principle of Unix that Linux has taken to heart and in the main it sort of works. However sometimes you have to think that we are going the long way round to get a job done. 

For example, how do you think Linux gives your user mode program access to raw memory?

Yes, that is correct. It represents the memory as a single binary file /dev/mem. 

This is a character device file that is an image of the machine's physical memory. When you read or write byte n this is the same as reading or writing the memory location at byte address n. You can move the file pointer to any memory location using lseek and read and write blocks of bytes using read and write. 

It is simple but it takes some time to get used to the idea. 

For example to open the file to read and write it you might use:

int memfd = open("/dev/mem", O_RDWR | O_SYNC);

The O_RDWR flag opens the file for read and write and the O_SYNC flag makes writes synchronous, i.e. a write doesn't return until the data has been transferred. 

After this you can lseek to the memory location you want to work with. For example to go to the start of the GPIO registers you would use:

  off_t p = lseek(memfd, (off_t) 0x3f200000, SEEK_SET);

for the Pi 2 or 

  off_t p = lseek(memfd, (off_t) 0x20200000, SEEK_SET);

for the Pi 1. Notice that these are the base address plus 0x20 0000.

Next you could read the 32 bits starting at that location, i.e. the FSEL0 register, into a buffer of at least four bytes:

    int n = read(memfd, buffer, 4);

Unfortunately at the moment if you try this then you will find that it doesn't work and you get a bad address error. You can read and write other memory locations but the peripheral registers don't seem to work. If they did this would be a perfectly good way to read and write the registers. 

As this doesn't work we need to move on to what does. 

Memory Mapping Files

The Linux approach to I/O places the emphasis on files but there are times when reading and writing a file to an external device like a disk drive is too slow. To solve this problem Linux has a memory mapping function which will read any portion of a file into user memory so that you can work with it directly using pointers. This in principle is a very fast way to access any file - including /dev/mem. 

This seems like a crazy roundabout route to get at memory. First implement memory access as a file you can read and then map that file into memory so that you can read it as if it was memory - which it is. However if you follow the story it is logical. What is more it solves a slightly different problem very elegantly. It allows the fixed physical addresses of the peripherals to be mapped into user space virtual addresses. In other words when you memory map the /dev/mem file into user memory it can be located anywhere and the address of the start of the register area will be within your program's allocated address space. This means that all of the addresses we have been listing will change. Of course as long as we work with offsets from the start of memory this is no problem - we update the starting value and use the same offsets. 

Let's see how this works in practice. 

The key function is mmap

void *mmap(
 void *addr,
 size_t length,
 int prot, 
 int  flags,
 int fd, 
 off_t offset
);

The function memory maps the file corresponding to the file descriptor fd into memory and returns its start address. The offset and length parameters control the portion of the file mapped i.e. the mapped portions starts at the byte given by offset and continues for length bytes.

There is a small complication in that for efficiency reasons the file is always mapped in units of the page size of the machine. So if you ask for a 1Kbyte file to be loaded into memory then, on the Pi with a 4Kbyte page size, 4Kbytes of memory will be allocated. The file will occupy the first 1Kbytes and the rest will be zeroed. 
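
The rounding involved is easy to compute for yourself. This is a minimal sketch with invented function names. It uses the standard sysconf call to ask for the page size rather than assuming 4Kbytes, although it falls back to that value if the call fails:

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Round length up to a whole number of pages of the given size. */
static size_t round_up_to_page(size_t length, size_t page) {
    assert(page > 0);
    return (length + page - 1) / page * page;
}

/* Page size of the machine the program is running on. */
static size_t system_page_size(void) {
    long size = sysconf(_SC_PAGESIZE);
    return size > 0 ? (size_t)size : 4096;   /* fall back to 4Kbytes */
}
```

With a 4Kbyte page a request for 1Kbyte rounds up to 4096 bytes and a request for 4097 bytes rounds up to 8192.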

You can also specify the address that you would like the file loaded to in your program's address space but the system doesn't have to honor this request - it just uses it as a hint. Some programmers reserve an area of memory, using malloc say, and then ask the system to load the file into it - as this might not happen it seems simpler to let the system allocate the memory and pass NULL as the starting address. 

Prot and flags specify various ways the file can be memory mapped and there are a lot of options - see the man page for details. 

Notice that this is a completely general mechanism and you can use it to map any file into memory. For example if you have a graphics file - image.gif - then you could load it into memory to make working with it faster. Many databases use this technique to speed up their processing. 
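
To show the general mechanism away from the hardware, here is a sketch that maps an ordinary file read-only. The function name map_file is invented, the whole file is mapped from offset zero and error handling is reduced to returning NULL:

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map a whole file read-only. Returns NULL on failure, otherwise the
   start of the mapping, with its size stored in *length. */
static const unsigned char *map_file(const char *path, size_t *length) {
    assert(path != NULL && length != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0) return NULL;
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                       /* the mapping survives the close */
    if (p == MAP_FAILED) return NULL;
    *length = (size_t)st.st_size;
    return p;
}
```

After a successful call the bytes of the file can be read as if they were an array and the mapping is released with munmap when you have finished with it.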

Now all we have to do is map /dev/mem into memory. 

First we need to open the /dev/mem device as usual:

int memfd = open("/dev/mem", O_RDWR | O_SYNC);

As long as this works we can map the file into memory. 

We want to map the file starting at either 0x2020 0000 for the Pi 1 or 0x3F20 0000 for the Pi 2. If we only want to work with the GPIO registers then we only need offsets of 0000 to 00B0, i.e. 176 bytes, but as we get a complete 4K page we might as well map 4KBytes worth of address space:

 uint32_t * map = (uint32_t *)mmap(
                   NULL,
                   4*1024,
                   (PROT_READ | PROT_WRITE),
                   MAP_SHARED,
                   memfd,
                   0x3f200000);

If you try this remember to change the offset to be correct for the Pi you are using or, better, use bcm2835_peripherals_base to specify the address.

Notice also that we haven't set an address for the file to be loaded into - the system will take care of it and return the address in map. We have also asked for read/write permission and allowed other processes to share the map. This makes map a very important variable because it now gives the location of the start of the GPIO register area, but in user space. The bcm2835 library has a similar standard variable:

bcm2835_peripherals

which holds the user space address of the start of the whole peripheral area rather than just the GPIO registers.

Now we can read and write a 4KByte block of addresses starting at the first GPIO register i.e. FSEL0. 

For example to read FSEL0 we would use:

printf("fsel0 %X \n\r",*map);

To access the other registers we need to add their offset but there is one subtle detail. The pointer to the start of the memory has been cast to a uint32_t pointer because we want to read and write 32 bit registers. However by the rules of pointer arithmetic when you add one to a pointer you actually add the size of the data type the pointer is pointing to. In this case when you add one to map you increment the location it is pointing at by four i.e. the size of a 32 bit unsigned integer. 

The rule is that with this cast we are using word addresses which are byte addresses divided by 4. 

Thus when we add the offsets we need to add the offset divided by 4. 
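
One way to avoid forgetting the division is to wrap it in a helper. This is only a sketch and the name reg_ptr is invented. It takes the mapped start of the register block and a byte offset as given in the documentation:

```c
#include <assert.h>
#include <stdint.h>

/* Convert a byte offset from the documentation into a pointer to the
   register, given the mapped start of the register block. */
static volatile uint32_t *reg_ptr(volatile uint32_t *map, uint32_t byte_offset) {
    assert(byte_offset % 4 == 0);      /* registers sit on 4 byte boundaries */
    return map + byte_offset / 4;      /* pointer arithmetic counts in words */
}
```

So the SET0 register at byte offset 0x1C is the word at index 7 and CLR0 at 0x28 is the word at index 10.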

With this all clear let's write a program that toggles GPIO 4 as fast as possible. 


The Fastest Pulse

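First we need to set it to output. GPIO 4 is controlled by bits 12, 13 and 14 of FSEL0, which we want to set to 001, i.e. we need to store 0x1000 in FSEL0, which is the first register in the mapped block. Notice that this also sets the other GPIO lines that FSEL0 controls to input:

volatile uint32_t* paddr = map;
*paddr=0x1000;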

With GPIO 4 set to output we now need to use the SET0 and CLR0 registers to set the line high and low. As we want to do this as fast as possible we need to precompute the addresses of SET0 and CLR0:

volatile uint32_t* paddr1 = map + 0x1C/4;
volatile uint32_t* paddr2 = map + 0x28/4;
for(;;){
  *paddr1=0x10;
  *paddr2=0x10;
}

The complete program is:

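#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <errno.h>

int main(int argc, char** argv) {
    /* opening /dev/mem needs root permissions */
    int memfd = open("/dev/mem", O_RDWR | O_SYNC);
    if (memfd < 0) {
        printf("open /dev/mem failed: %s\n", strerror(errno));
        return (EXIT_FAILURE);
    }
    /* 0x3f200000 is correct for a Pi 2 or 3, use 0x20200000 for a Pi 1 */
    uint32_t * map = (uint32_t *)mmap(
                        NULL,
                        4*1024,
                        (PROT_READ | PROT_WRITE),
                        MAP_SHARED, 
                        memfd, 
                        0x3f200000);
    if (map == MAP_FAILED) {
        printf("mmap failed: %s\n", strerror(errno));
        close(memfd);
        return (EXIT_FAILURE);
    }
    close(memfd);

    volatile uint32_t* paddr = map;            /* FSEL0 */
    *paddr=0x1000;                             /* GPIO 4 to output */
    volatile uint32_t* paddr1 = map + 0x1C/4;  /* SET0 */
    volatile uint32_t* paddr2 = map + 0x28/4;  /* CLR0 */
    for(;;){
        *paddr1=0x10;                          /* GPIO 4 high */
        *paddr2=0x10;                          /* GPIO 4 low */
    }
    return (EXIT_SUCCESS);
}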

If you run this program you will discover that it generates pulses that are as small as 0.25 microseconds (Pi 2 and Zero). This is as fast as you can go using memory mapped file access. 

Because of all of the complexities and differences between the Pi 1 and Pi 2 you are much better off using the bcm2835 library, which uses exactly this technique to work with the GPIO and isn't much slower than a custom code approach.

Let's look at the lower level functions that the library provides.

Low Level Register Access

The bcm2835 library provides a small number of functions that will access any register you need to. It makes use of the /dev/mem file and the mmap function and it works in more or less the way described above. The big advantage is that it sets things up so that the addressing is correct for the current and presumably future versions of the Pi. 

There are two read and two write functions:

uint32_t bcm2835_peri_read (volatile uint32_t *paddr)
uint32_t bcm2835_peri_read_nb (volatile uint32_t *paddr)

void bcm2835_peri_write (volatile uint32_t *paddr, uint32_t value)
void bcm2835_peri_write_nb (volatile uint32_t *paddr, uint32_t value)

The difference between them is the use of read/write barriers. This is something that has been ignored until now. The processor allows operations on peripherals to complete asynchronously. This means that it is possible for results to arrive out of the order in which you programmed them. This can only happen on the first access to a peripheral. If you read or write to a peripheral for the first time you need to use a barrier. Subsequent reads and writes don't need a barrier. If you write to another peripheral and then go back to the first you need to use a barrier again. In short you need a barrier at the start of any consecutive peripheral accesses. 

It is always safer to use the standard read/write functions that apply a barrier than the nb, i.e. non-barrier, functions, although the non-barrier functions are slightly faster. 
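
To give an idea of what a barrier access looks like, here is a sketch that brackets each access with a full memory barrier. It assumes a GCC or Clang compiler, which provide the __sync_synchronize builtin, and it is an illustration rather than a copy of the library's code. The function names are invented:

```c
#include <assert.h>
#include <stdint.h>

/* Read a register with a full memory barrier before and after the access. */
static uint32_t read_with_barrier(volatile uint32_t *paddr) {
    assert(paddr != 0);
    __sync_synchronize();
    uint32_t value = *paddr;
    __sync_synchronize();
    return value;
}

/* Write a register with a full memory barrier before and after the access. */
static void write_with_barrier(volatile uint32_t *paddr, uint32_t value) {
    assert(paddr != 0);
    __sync_synchronize();
    *paddr = value;
    __sync_synchronize();
}
```

A non-barrier version would simply omit the two __sync_synchronize calls, which is where the small speed advantage comes from.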

As well as the four basic read/write functions we also have a set function:

void bcm2835_peri_set_bits (volatile uint32_t *paddr, uint32_t value, uint32_t mask)

This will set the bits defined in the mask to the value specified in the corresponding bit in the value parameter.  For example

bcm2835_peri_set_bits (paddr,0x01,0x01)

will set bit 0 to a 1 leaving all other bits unchanged and

bcm2835_peri_set_bits (paddr,0x00,0x01)

will set bit 0 to a 0 leaving all other bits unchanged. 
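
The effect of the mask can be expressed as a read, a logical operation and a write. The following is a sketch of that idea using an invented function name, set_bits, and it leaves out the barriers that a real peripheral access would need:

```c
#include <assert.h>
#include <stdint.h>

/* Change only the bits selected by mask, taking their new state from value. */
static void set_bits(volatile uint32_t *paddr, uint32_t value, uint32_t mask) {
    assert(paddr != 0);
    uint32_t v = *paddr;                     /* read the current contents */
    v = (v & ~mask) | (value & mask);        /* replace the masked bits */
    *paddr = v;                              /* write the result back */
}
```

For example, a mask of 0x7000 and a value of 0x1000 sets the three configuration bits for GPIO 4 in FSEL0 to 001 without disturbing the other lines.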

Finally we have the problem of specifying the addresses we want to use. The problem is of course what is the base address?

There is a useful function that will return the base address of any of the standard registers:

uint32_t * bcm2835_regbase (uint8_t regbase)

and regbase is one of

BCM2835_REGBASE_ST 

Base of the ST (System Timer) registers.

BCM2835_REGBASE_GPIO 

Base of the GPIO registers.

BCM2835_REGBASE_PWM 

Base of the PWM registers.

BCM2835_REGBASE_CLK 

Base of the CLK registers.

BCM2835_REGBASE_PADS 

Base of the PADS registers.

BCM2835_REGBASE_SPI0 

Base of the SPI0 registers.

BCM2835_REGBASE_BSC0 

Base of the BSC0 registers.

BCM2835_REGBASE_BSC1 

Base of the BSC1 registers.

 

So to get the address in user memory of the GPIO registers you can use bcm2835_regbase(BCM2835_REGBASE_GPIO).

Alternatively you can use bcm2835_peripherals and simply add the known offsets e.g.

bcm2835_peripherals + BCM2835_GPIO_BASE/4;

The library also provides a set of precomputed starting addresses for the standard sets of registers:

bcm2835_gpio = bcm2835_peripherals + BCM2835_GPIO_BASE/4;
bcm2835_pwm  = bcm2835_peripherals + BCM2835_GPIO_PWM/4;
bcm2835_clk  = bcm2835_peripherals + BCM2835_CLOCK_BASE/4;
bcm2835_pads = bcm2835_peripherals + BCM2835_GPIO_PADS/4;
bcm2835_spi0 = bcm2835_peripherals + BCM2835_SPI0_BASE/4;
bcm2835_bsc0 = bcm2835_peripherals + BCM2835_BSC0_BASE/4; /* I2C */
bcm2835_bsc1 = bcm2835_peripherals + BCM2835_BSC1_BASE/4; /* I2C */
bcm2835_st   = bcm2835_peripherals + BCM2835_ST_BASE/4;

Notice that the addresses all refer to the location in user space where the file has been mapped and all of the offsets have to be converted to word addresses by being divided by 4.

An Almost Fastest Pulse

As an example of using the low level functions let's repeat the toggling of GPIO 4 using them. This doesn't give you the fastest possible time because there is the overhead of the function calls - but it is almost as good. 

This time we need to initialize the library - this is where the mapping is set up.

    if (!bcm2835_init())
        return 1;

Next we can get the address of the start of the GPIO registers in user memory:

uint32_t* gpioBASE = bcm2835_regbase(BCM2835_REGBASE_GPIO);

Finally we can set GPIO 4 to output and set and clear it:

bcm2835_peri_write(gpioBASE, 0x1000);
for (;;) {
 bcm2835_peri_write(gpioBASE + BCM2835_GPSET0 / 4, 0x10);
 bcm2835_peri_write(gpioBASE + BCM2835_GPCLR0 / 4, 0x10);
}

If you run this version of the program you will find that the smallest pulses are around 0.5 microseconds (Pi 2 and Zero).

If you change the writes for non-barrier writes then the pulses do get shorter - typically 0.3 microseconds (Pi 2 and Zero) but there is much more variability. 

In practice using the barrier read/writes seems adequate.

GPIO Clocks - An Example

This is an advanced topic.

The GPIO clocks are a facility that isn't as well known as it deserves to be. There are no functions that let you work with them in the bcm2835 library, but it is fairly easy to add one.

There are three general purpose GPIO clocks and two special purpose clocks, the PWM and PCM clocks. You can set any of the clocks to run at a given rate and the general purpose clocks can be routed to a subset of GPIO pins. The outputs are pulse trains of the specified frequency, which can be modulated by changing the clock divider.

The frequency division can include a fractional part which, if you know your digital logic, is surprising. Dividing by 2, 4 or 8 is easy but how do you divide by 2.5? The answer is that you use a MASH filter. Exactly how this works is beyond the scope of this book, but it is a digital processing technique that can generate a signal with the frequency desired. The problem is that it also generates additional error frequencies that, with luck, are outside of the band required and easy to remove. If you opt for no MASH filter then you cannot use a fractional divider. Selecting one, two or three MASH stages produces the required frequency but with different noise properties associated with the signal.

The three general purpose clocks can be used with the following GPIO lines:

GPCLK0 GPIO4 GPIO20 GPIO32 GPIO34 GPIO44

GPCLK1 GPIO5 GPIO21 GPIO42

GPCLK2 GPIO6 GPIO43

The only GPIO pin available on early Pis is GPIO4 but on B+/2 you can use the following:

GPIO4  GPCLK0 ALT0
GPIO5  GPCLK1 ALT0 (reserved for system use)
GPIO6  GPCLK2 ALT0 
GPIO20 GPCLK0 ALT5 

Each clock is controlled by two registers: a control register and a register used to specify the clock divider.

The word offsets of the registers (multiply by 4 for the byte address) are:

#define CLK_GP0_CTL 28
#define CLK_GP0_DIV 29

#define CLK_GP1_CTL 30
#define CLK_GP1_DIV 31

#define CLK_GP2_CTL 32
#define CLK_GP2_DIV 33

#define CLK_PCM_CTL 38
#define CLK_PCM_DIV 39

#define CLK_PWM_CTL 40
#define CLK_PWM_DIV 41

The offsets are all relative to

BCM2835_CLOCK_BASE              0x101000

which is a byte offset from the start of the peripheral area, or to

bcm2835_clk

which is a word pointer to the start of the clock registers as mapped into user memory.

The safest way to form the address of a clock register, GP0 control for example, is to use

bcm2835_clk+CLK_GP0_CTL 

because bcm2835_clk already allows for where the peripherals have been mapped in memory.
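If you ever need the byte offset instead, for example to check a register against the datasheet, the conversion is a multiplication by 4. A small sketch, using a hypothetical helper name and the base value quoted above:

```c
#include <assert.h>
#include <stdint.h>

/* Value of BCM2835_CLOCK_BASE as quoted in the text. */
#define CLOCK_BASE_BYTES UINT32_C(0x101000)

/* Hypothetical helper: byte offset, from the start of the peripheral
   area, of the clock register at the given word offset. */
static uint32_t clk_byte_offset(uint32_t word_offset) {
    assert(word_offset <= 41); /* largest offset in the list above */
    return CLOCK_BASE_BYTES + word_offset * 4;
}
```

So CLK_GP0_CTL at word offset 28 is at byte offset 0x101070 from the start of the peripheral area.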

The control register has a simple layout:

31-24 PASSWD Clock Manager password “5a”
10-9  MASH   MASH control
      0 = integer division
      1 = 1-stage MASH
      2 = 2-stage MASH
      3 = 3-stage MASH
8     FLIP   Invert the clock generator output
7     BUSY   Clock generator is running
6     -      Unused
5     KILL   Kill the clock generator
      0 = no action
      1 = stop
4     ENAB   Enable the clock generator
3-0   SRC    Clock source
      0 = GND
      1 = oscillator        19.2MHz
      2 = testdebug0
      3 = testdebug1
      4 = PLLA              0MHz
      5 = PLLC              1000MHz
      6 = PLLD              500MHz
      7 = HDMI auxiliary    216MHz
      8-15 = GND

The important points are that you use ENAB to start and stop the clock, that you don't change the settings while the clock is running, and that you don't change them in the same write that enables the clock.
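To make the layout concrete, here is a sketch of how a control register value could be assembled from its fields. The macro and function names are hypothetical, chosen for this example only. The bit positions are taken from the layout above.

```c
#include <assert.h>
#include <stdint.h>

#define CM_PASSWD     (UINT32_C(0x5a) << 24) /* bits 31-24 */
#define CM_MASH_SHIFT 9                      /* bits 10-9  */
#define CM_BUSY       (UINT32_C(1) << 7)     /* bit 7, status */
#define CM_ENAB       (UINT32_C(1) << 4)     /* bit 4      */

/* Hypothetical helper: build a control register value from the clock
   source, the number of MASH stages and the enable flag. */
static uint32_t clk_ctl_word(uint32_t src, uint32_t mash, int enable) {
    assert(src <= 15 && mash <= 3);
    return CM_PASSWD | (mash << CM_MASH_SHIFT)
         | (enable ? CM_ENAB : 0) | src;
}
```

For PLLD with one MASH stage, clk_ctl_word(6, 1, 0) gives 0x5A000206 and clk_ctl_word(6, 1, 1) gives 0x5A000216, so the two values differ only in the ENAB bit.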

The divider register has the format:

31-24 PASSWD Clock Manager password “5a”
23-12 DIVI Integer part of divisor
11-0 DIVF Fractional part of divisor

and as for the control register you do not change this while BUSY=1.
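As an illustration, the following sketch works out a divider register value from a source frequency and a wanted output frequency. It assumes that the 12-bit DIVF field counts in steps of 1/4096, which is an assumption made for this example, and clk_div_word is a hypothetical name. Integer arithmetic is used throughout to avoid rounding surprises.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: divider register value for a wanted output rate.
   Assumes DIVF counts in steps of 1/4096 (it is a 12-bit field). */
static uint32_t clk_div_word(uint32_t source_hz, uint32_t target_hz) {
    assert(target_hz > 0 && target_hz <= source_hz);
    /* Divisor as fixed point with 12 fractional bits, rounded to nearest */
    uint64_t fixed = ((uint64_t)source_hz * 4096 + target_hz / 2)
                   / target_hz;
    uint32_t divi = (uint32_t)(fixed >> 12);
    uint32_t divf = (uint32_t)(fixed & 0xfff);
    assert(divi <= 0xfff); /* DIVI is only 12 bits wide */
    return (UINT32_C(0x5a) << 24) | (divi << 12) | divf;
}
```

Dividing the 500MHz PLLD down to 10MHz needs DIVI=50 and DIVF=0, giving 0x5A032000. Dividing the 19.2MHz oscillator by 2.5 needs DIVI=2 and DIVF=2048, giving 0x5A002800, and this only works with at least one MASH stage selected.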

So to configure the clock you:

  • Set ENAB low
  • Wait for BUSY to go low
  • Set the values you want to change including the divider register but with ENAB low
  • Set ENAB high with the same set of values so as not to change them.

Now we can put this together to write a function that will set the clock associated with GPIO 4. You can easily change this to work with any of the valid GPIO lines:

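#include <sys/mman.h>  /* MAP_FAILED */
#include <bcm2835.h>
#define CLK_GP0_CTL 28
#define CLK_GP0_DIV 29
void bcm2835_GPIO4_set_clock_source(
           uint32_t source,
           uint32_t divisorI,
           uint32_t divisorF) {
 if (bcm2835_clk == MAP_FAILED)
       return;
 /* Keep each value within the width of its field */
 divisorI &= 0xfff;
 divisorF &= 0xfff;
 source &= 0xf;

 /* Current settings with ENAB (bit 4) and the password byte cleared.
    This has to be 32 bits wide or the MASH and FLIP bits are lost. */
 uint32_t mask = bcm2835_peri_read(bcm2835_clk + CLK_GP0_CTL)
           & 0x00ffffef;

 /* Stop the clock and wait for BUSY (bit 7) to clear */
 bcm2835_peri_write(bcm2835_clk + CLK_GP0_CTL,
                     BCM2835_PWM_PASSWRD | mask);
 while ((bcm2835_peri_read(bcm2835_clk + CLK_GP0_CTL) & 0x80) != 0){};

 /* Set the divider, then the source and 1-stage MASH with ENAB low */
 bcm2835_peri_write(bcm2835_clk + CLK_GP0_DIV,
  BCM2835_PWM_PASSWRD | (divisorI << 12) | divisorF);
 bcm2835_peri_write(bcm2835_clk + CLK_GP0_CTL,
  BCM2835_PWM_PASSWRD | source | 0x200);
 /* Enable the clock, repeating the same settings */
 bcm2835_peri_write(bcm2835_clk + CLK_GP0_CTL,
  BCM2835_PWM_PASSWRD | 0x0210 | source);
}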

At the start of the function we make sure that the divisors and the source are within the legal range by ANDing them with masks. Next we read the control register to create a mask, so that we don't change any of the bits until the clock is stopped. Notice that the enable bit of the mask is set to zero, which is why the clock stops when the mask is written back to the control register. The while loop waits for the clock to stop, and then the divisor is written to the divisor register and the source to the control register. Finally the source is written to the control register again, this time with the enable bit set to 1. The 0x200 selects one stage of MASH, which is necessary if you want to use the fractional divider. Also notice that we are using the predefined BCM2835_PWM_PASSWRD, which places the 0x5A password in the top byte, to allow a write to the registers.
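The same sequence can be generalized to any of the three general purpose clocks. The sketch below is one possible way to do it and is not part of the library. The name gpclk_configure and its parameters are hypothetical, and it assumes that each divider register sits one word after its control register, as in the offset list given earlier. It uses plain volatile accesses so that it can be tried against an ordinary array. On real hardware you would probably want to substitute bcm2835_peri_read and bcm2835_peri_write to keep the memory barriers.

```c
#include <assert.h>
#include <stdint.h>

#define CM_PASSWD (UINT32_C(0x5a) << 24)
#define CM_ENAB   UINT32_C(0x10)
#define CM_BUSY   UINT32_C(0x80)

/* Hypothetical generalization: `clk` is a word-addressed base such as
   bcm2835_clk and `ctl` is a control register word offset (28, 30 or
   32). The divider register is assumed to be at ctl + 1. */
static void gpclk_configure(volatile uint32_t *clk, unsigned ctl,
                            uint32_t source, uint32_t divi,
                            uint32_t divf, uint32_t mash) {
    assert(clk != 0 && mash <= 3);
    source &= 0xf;
    divi &= 0xfff;
    divf &= 0xfff;

    /* 1. Clear ENAB, keeping the other settings as they are */
    clk[ctl] = CM_PASSWD | (clk[ctl] & 0x00ffffef);
    /* 2. Wait for the clock generator to stop */
    while (clk[ctl] & CM_BUSY) { }
    /* 3. Set the divider and the new settings with ENAB still low */
    clk[ctl + 1] = CM_PASSWD | (divi << 12) | divf;
    uint32_t settings = CM_PASSWD | (mash << 9) | source;
    clk[ctl] = settings;
    /* 4. Write the same settings again, now with ENAB set */
    clk[ctl] = settings | CM_ENAB;
}
```

A call such as gpclk_configure(bcm2835_clk, 32, 6, 50, 0, 1) would then set up GPCLK2 in the same way that the function above sets up GPCLK0.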

You can try this out with a main program something like:

    if (!bcm2835_init())
        return 1;
    bcm2835_gpio_fsel(4, BCM2835_GPIO_FSEL_ALT0);
    bcm2835_GPIO4_set_clock_source(6, 50, 0);

If you look at the output of GPIO 4 with a logic analyzer you might be surprised how jittery the 10MHz, i.e. 500MHz/50, clock is. Part of this is likely to be due to the poor wave shape at such a high frequency. If you look at the pulse stream on an oscilloscope then you should see something like:

[Figure: oscilloscope trace of the clock output on GPIO 4]

The maximum frequency that can be produced depends on the loading on the GPIO pin and stray capacitance. In theory you should be able to get to over 100MHz, but in practice this is difficult. As the frequency goes up the losses due to capacitance increase and the output voltage falls. Add to this the fact that many oscilloscopes cannot record a signal at 100MHz and it gets very difficult to work with dividers smaller than 20 because you can't see the output.

With a divider of 5 you can still get enough of a signal to use the output as a 100MHz FM transmitter. There are programs that make use of this effect to send an FM audio signal by changing the frequency via the fractional divider.

Conclusion

Learning how to access the GPIO registers using memory mapping is important not so much for using the GPIO lines as for extending what can be done to the other registers. Once you understand how Linux performs memory mapping you can see that it is a much more general mechanism that can be applied in other situations. However, most of the time you can start out using the library and only think about doing anything more complicated if it proves necessary. Before implementing your own memory mapping make sure you try the low level register functions, as they are almost as fast as anything you can create.

 

 

 


 

 
