Atomic operations on the x86 processors

x86 processors, from both Intel and AMD, increasingly ship with multiple CPU cores that run in parallel.

In the old days when there was a single processor, the operation:

++i; 

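would be thread safe, provided the compiler emitted it as a single read-modify-write instruction, because a thread switch can only happen between instructions and never in the middle of one. These days even laptops have several CPU cores, and a single instruction that reads, modifies, and writes memory is no longer safe: another core can access the same location between the read and the write. What do you do? Do you need to wrap every such operation in a mutex or semaphore? Well, maybe you don't need to.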
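To make the failure concrete, here is a minimal sketch that replays one unlucky interleaving in a single thread, so the outcome is deterministic. The function name lost_update_demo and the temporaries t1 and t2 (standing in for the private registers of two cores) are made up for illustration.

```c
/* Deterministic, single-threaded replay of a lost update.
   t1 and t2 stand in for the private registers of two cores. */
int lost_update_demo(void)
{
    int i = 0;      /* the shared counter in memory */
    int t1, t2;

    t1 = i;         /* core 1 loads 0 */
    t2 = i;         /* core 2 loads 0 before core 1 has stored */
    t1 = t1 + 1;    /* core 1 increments its private copy */
    t2 = t2 + 1;    /* core 2 increments its private copy */
    i = t1;         /* core 1 stores 1 */
    i = t2;         /* core 2 stores 1, overwriting core 1's update */

    return i;       /* 1, although two increments ran */
}
```

Two increments ran, but the counter only moved by one. On real hardware the same thing happens whenever two cores execute an unlocked increment of the same location at nearly the same time.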

Fortunately, the x86 has an instruction prefix, lock, that makes certain read-modify-write instructions with a memory destination execute atomically on that memory location.

Here are a few basic functions built on it.

For the GNU compiler (GCC inline assembly):

/* "+m" tells the compiler that the operand is both read and written.
   The "memory" clobber keeps it from reordering other memory accesses
   across the asm; "cc" says the flags are modified. */
void atom_inc(volatile int *num)
{
        __asm__ __volatile__ ( "lock incl %0" : "+m" (*num) : : "cc", "memory");
}
void atom_dec(volatile int *num)
{
        __asm__ __volatile__ ( "lock decl %0" : "+m" (*num) : : "cc", "memory");
}
/* Stores inval in *m and returns the previous value. xchg with a
   memory operand is always locked, so the prefix is redundant here. */
int atom_xchg(volatile int *m, int inval)
{
        int val = inval;
        __asm__ __volatile__ ( "lock xchgl %1,%0" : "+m" (*m), "+r" (val) : : "memory");
        return val;
}
void atom_add(volatile int *m, int inval)
{
        __asm__ __volatile__ ( "lock addl %1,%0" : "+m" (*m) : "ir" (inval) : "cc", "memory");
}
void atom_sub(volatile int *m, int inval)
{
        __asm__ __volatile__ ( "lock subl %1,%0" : "+m" (*m) : "ir" (inval) : "cc", "memory");
}
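As a quick sanity check of what these functions do, the following sketch exercises an exchange and an add from a single thread. It assumes GCC or Clang. The name atom_ops_selftest is hypothetical, and the `__atomic` builtin fallback for non-x86 targets is my addition so the sketch builds elsewhere.

```c
/* Returns the previous value of *m after storing inval into it. */
static int atom_xchg(volatile int *m, int inval)
{
#if defined(__i386__) || defined(__x86_64__)
    /* xchg with a memory operand is implicitly locked */
    __asm__ __volatile__("xchgl %1,%0" : "+m"(*m), "+r"(inval) : : "memory");
    return inval;
#else
    /* assumption: fall back to the compiler builtin on other CPUs */
    return __atomic_exchange_n(m, inval, __ATOMIC_SEQ_CST);
#endif
}

static void atom_add(volatile int *m, int inval)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__("lock addl %1,%0" : "+m"(*m) : "ir"(inval) : "cc", "memory");
#else
    __atomic_fetch_add(m, inval, __ATOMIC_SEQ_CST);
#endif
}

/* Returns 1 if the operations behave as described, 0 otherwise. */
int atom_ops_selftest(void)
{
    volatile int x = 5;
    int old = atom_xchg(&x, 9);   /* x becomes 9, old receives 5 */

    if (old != 5 || x != 9)
        return 0;
    atom_add(&x, 3);              /* x becomes 12 */
    return x == 12;
}
```

Note that atom_xchg is the only one of these functions that returns a value: it hands back whatever was in memory before the swap.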

For the Microsoft compiler (32-bit x86 targets only, since its inline assembler is not available when compiling for x64):

void atom_inc(volatile int *num)
{
        _asm
        {
                        mov     esi,    num             ; esi = address of the counter
                lock    inc     DWORD PTR [esi]         ; atomic increment
        };
}
void atom_dec(volatile int *num)
{
        _asm
        {
                        mov     esi,    num
                lock    dec     DWORD PTR [esi]         ; atomic decrement
        };
}
int atom_xchg(volatile int *m, int inval)
{
        _asm
        {
                        mov     eax,    inval
                        mov     esi,    m
                lock    xchg    eax,    DWORD PTR [esi] ; swap eax with *m
                        mov     inval,  eax             ; previous value of *m
        }
        return inval;
}
void atom_add(volatile int *num, int val)
{
        _asm
        {
                        mov     esi,    num
                        mov     eax,    val
                lock    add     DWORD PTR [esi], eax    ; *num += val, atomically
        };
}
void atom_sub(volatile int *num, int val)
{
        _asm
        {
                        mov     esi,    num
                        mov     eax,    val
                lock    sub     DWORD PTR [esi], eax    ; *num -= val, atomically
        };
}

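The lock prefix applies only to the instruction that carries it. That one instruction becomes atomic, but nothing stops other code from touching the same location without it. If another section of code performs an ordinary, unlocked read-modify-write, or simply overwrites the value with a plain store, updates can still be lost. The scheme is cooperative, much like a mutex: it protects the data only if every piece of code that modifies the location plays by the same rules.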
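The following sketch replays, deterministically and in one thread, what happens when a locked increment lands between the load and the store of an unlocked update. It assumes GCC or Clang. The name mixed_access_demo is hypothetical, and the `__atomic` builtin fallback for non-x86 targets is my addition.

```c
static void atom_inc(volatile int *num)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__("lock incl %0" : "+m"(*num) : : "cc", "memory");
#else
    /* assumption: fall back to the compiler builtin on other CPUs */
    __atomic_fetch_add(num, 1, __ATOMIC_SEQ_CST);
#endif
}

/* Replays one interleaving of an unlocked update with a locked one. */
int mixed_access_demo(void)
{
    volatile int count = 0;
    int tmp;

    tmp = count;        /* unlocked code loads 0 */
    atom_inc(&count);   /* a locked increment runs in between: count is 1 */
    count = tmp + 1;    /* unlocked code stores 1, discarding that increment */

    return count;       /* 1, not 2 */
}
```

The locked increment itself was atomic, yet its effect vanished because the other update did not use an atomic operation.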

Basic usage:


class poll
{
    volatile int m_pollCount;
    ....
    ....

public:
    void pollAdd()
    {
        atom_inc(&m_pollCount);
    }
};

The above example atomically increments a poll object's count by one.
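To see the counter hold up under real contention, here is a rough C sketch that has several POSIX threads increment one counter and returns the final total. It assumes GCC or Clang and a pthreads platform (compile with -pthread). The names poll_counter, poll_add, worker and run_poll_demo and the two constants are all made up for illustration, and the `__atomic` builtin fallback for non-x86 targets is my addition.

```c
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS     4
#define INCS_PER_THREAD 100000

struct poll_counter {
    volatile int count;
};

static void atom_inc(volatile int *num)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__("lock incl %0" : "+m"(*num) : : "cc", "memory");
#else
    /* assumption: fall back to the compiler builtin on other CPUs */
    __atomic_fetch_add(num, 1, __ATOMIC_SEQ_CST);
#endif
}

static void poll_add(struct poll_counter *p)
{
    atom_inc(&p->count);
}

/* Each thread bumps the shared counter INCS_PER_THREAD times. */
static void *worker(void *arg)
{
    struct poll_counter *p = arg;

    for (int n = 0; n < INCS_PER_THREAD; n++)
        poll_add(p);
    return NULL;
}

/* Returns the final count, or -1 if a thread could not be started. */
int run_poll_demo(void)
{
    struct poll_counter p = { 0 };
    pthread_t tid[NUM_THREADS];
    int started = 0;

    for (int t = 0; t < NUM_THREADS; t++) {
        if (pthread_create(&tid[t], NULL, worker, &p) != 0)
            break;
        started++;
    }
    for (int t = 0; t < started; t++)
        pthread_join(tid[t], NULL);

    return started == NUM_THREADS ? p.count : -1;
}
```

Calling run_poll_demo() from main should yield NUM_THREADS * INCS_PER_THREAD. Swapping the body of poll_add for a plain p->count++ will typically produce a smaller total on a multi-core machine, because some increments get lost.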