Is it possible to do any better?
Aug. 27th, 2011 12:19 pm
#if __GNUC__ && __i386__
/* This is not entirely satisfactory: the xchgl would be unnecessary, if only
 * we had some way of communicating the detailed input and output assignments
 * of the registers to the compiler. */
#define BSWAP64(N) \
  ({uint64_t __n = (N); __asm__("xchgl %%eax,%%edx\n" \
                                "\tbswap %%eax\n" \
                                "\tbswap %%edx" \
                                : "+A"(__n)); \
    __n;})
#endif
The trouble is, the result ends up looking something like this:
movl 0x04(%eax),%edx
movl (%eax),%eax
xchgl %edx,%eax
bswap %eax
bswap %edx
movl 0x0c(%ebp),%esi
movl %eax,(%esi)
movl %edx,0x04(%esi)
…when obviously this would be better:
movl 0x04(%eax),%edx
movl (%eax),%eax
bswap %eax
bswap %edx
movl 0x0c(%ebp),%esi
movl %edx,(%esi)
movl %eax,0x04(%esi)
However, as far as I can see, "A" is the only available constraint letter for 64-bit values on x86.
Date: 2011-08-27 11:49 am (UTC)
uint32 ah = a >> 32;
uint32 al = (uint32)a;
BSWAP32(ah);
BSWAP32(al);
a = ((uint64)al << 32) | ah;
where BSWAP32 expands to an __asm__ statement encoding a single bswap instruction on an arbitrary register. With any luck the compiler will optimise the shifts-by-32 into 'just access the top register of the two allocated to this 64-bit value', and will then be able to register-allocate the bswaps however it turns out easiest.
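For anyone who wants to try the split-into-halves idea, here is a minimal sketch of how it might look. The names bswap32_sketch and bswap64_sketch are made up for illustration, and the non-x86 fallback is an addition so the snippet compiles elsewhere. Whether the compiler really elides the shifts will depend on the compiler version and optimisation level.

```c
#include <stdint.h>

/* Hypothetical helper: byte-swap one 32-bit value.  On x86 with GCC-style
 * inline asm this is a single bswap on whichever register the compiler
 * picks (the "+r" constraint); elsewhere it falls back to shifts and masks. */
static inline uint32_t bswap32_sketch(uint32_t x)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__("bswap %0" : "+r"(x));
    return x;
#else
    return (x >> 24) | ((x >> 8) & 0x0000ff00u)
         | ((x << 8) & 0x00ff0000u) | (x << 24);
#endif
}

/* Hypothetical helper: byte-swap a 64-bit value by swapping each half and
 * exchanging the halves.  The cast to uint64_t before the shift matters:
 * shifting a 32-bit value left by 32 is undefined. */
static inline uint64_t bswap64_sketch(uint64_t n)
{
    uint32_t hi = (uint32_t)(n >> 32);
    uint32_t lo = (uint32_t)n;
    /* The swapped low half becomes the new high half. */
    return ((uint64_t)bswap32_sketch(lo) << 32) | bswap32_sketch(hi);
}
```

Because each asm statement only names one register via "+r", the register allocator is free to decide which of the pair ends up where, which is the point of the suggestion.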
Date: 2011-08-27 12:11 pm (UTC)
Date: 2011-08-28 09:08 pm (UTC)
Date: 2011-08-27 12:27 pm (UTC)
# define __bswap_64(x) \
     (__extension__ \
      ({ union { __extension__ unsigned long long int __ll; \
                 unsigned int __l[2]; } __w, __r; \
         if (__builtin_constant_p (x)) \
           __r.__ll = __bswap_constant_64 (x); \
         else \
           { \
             __w.__ll = (x); \
             __r.__l[0] = __bswap_32 (__w.__l[1]); \
             __r.__l[1] = __bswap_32 (__w.__l[0]); \
           } \
         __r.__ll; }))
Date: 2011-08-27 12:37 pm (UTC)
Date: 2011-08-27 12:38 pm (UTC)
Date: 2011-08-27 12:48 pm (UTC)
Then again, given that that part of glibc is headers-only with no object code, and given that it's LGPL, why not just use glibc?
Date: 2011-08-28 09:21 pm (UTC)
__builtin_bswap64 instead.
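The start of that comment is missing, so the following assumes the suggestion is simply to let the compiler do the work. A minimal sketch, where bswap64_builtin_sketch is a made-up name and the shift-and-mask fallback is only there for compilers that lack the builtin:

```c
#include <stdint.h>

/* Hypothetical wrapper: use the compiler builtin where it exists (GCC and
 * Clang both provide __builtin_bswap64), otherwise swap by hand. */
static inline uint64_t bswap64_builtin_sketch(uint64_t n)
{
#if defined(__GNUC__)
    return __builtin_bswap64(n);
#else
    /* Swap adjacent bytes, then adjacent 16-bit units, then the halves. */
    n = ((n & 0x00ff00ff00ff00ffULL) << 8)  | ((n >> 8)  & 0x00ff00ff00ff00ffULL);
    n = ((n & 0x0000ffff0000ffffULL) << 16) | ((n >> 16) & 0x0000ffff0000ffffULL);
    return (n << 32) | (n >> 32);
#endif
}
```

Since the compiler understands the builtin, it can pick registers itself and fold constant arguments, neither of which an opaque __asm__ block allows.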
Date: 2011-08-27 12:33 pm (UTC)
I like Simon's idea, but might an alternative be to have __asm__ give you the address of __n, tell it you're going to trash %eax and %edx, then do the stack manipulation yourself?
I would provide a worked example, but I'm pretty rusty on __asm__, especially in x86-land rather than ARM-land.
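One way the pass-the-address idea might look, as a rough sketch rather than a tested recommendation: bswap64_inplace_sketch is a made-up name, the blanket "memory" clobber is a simplifying assumption, and the non-x86 branch is only there so the snippet builds elsewhere.

```c
#include <stdint.h>

/* Hypothetical helper: byte-swap the 64-bit value at *p in place.  The asm
 * receives only the address, declares %eax and %edx as clobbered, and does
 * its own loads and stores, writing the swapped halves back crossed over. */
static inline void bswap64_inplace_sketch(uint64_t *p)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__("movl (%0),%%eax\n\t"    /* low half  */
            "movl 4(%0),%%edx\n\t"   /* high half */
            "bswap %%eax\n\t"
            "bswap %%edx\n\t"
            "movl %%edx,(%0)\n\t"    /* swapped high half becomes the low half */
            "movl %%eax,4(%0)"       /* swapped low half becomes the high half */
            :
            : "r"(p)
            : "eax", "edx", "memory");
#else
    uint64_t n = *p;
    n = ((n & 0x00ff00ff00ff00ffULL) << 8)  | ((n >> 8)  & 0x00ff00ff00ff00ffULL);
    n = ((n & 0x0000ffff0000ffffULL) << 16) | ((n >> 16) & 0x0000ffff0000ffffULL);
    *p = (n << 32) | (n >> 32);
#endif
}
```

This does get rid of the xchgl, but it forces the value through memory even when it was already sitting in registers, so it is not obviously a win over the original macro.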
Date: 2011-08-27 05:00 pm (UTC)
Date: 2011-08-28 01:45 am (UTC)
I was assuming this was because the compiler couldn't cope with register tracking for 64-bit values?
Date: 2011-08-28 08:53 am (UTC)