I have a digital board intended for a flight control system on RC aircraft, with grand plans for a home-built UAV. I have been doing a lot of study of the math and algorithms that go into such systems. Some of the discussions cover the efficiency of the algorithms, in terms of how many multiply and add operations are required per update. The 8-bit AVR devices have a built-in hardware multiply for 8-bit operands; 16-bit operations require a couple of cycles. What they do NOT have is a divide operation. A divide is the same as a multiplication by the reciprocal, but how do you get the reciprocal? The other operation that is not represented is the square root.
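For what it's worth, the usual textbook answer to "how do you get the reciprocal" is Newton-Raphson iteration, x = x * (2 - m * x), which needs only multiplies and subtracts. Below is a minimal fixed-point sketch of that idea, not anything taken from avr-libc. The name `recip_q14`, the Q0.16 input and Q2.14 output formats, and the choice of two iterations are all my own assumptions for illustration. It also assumes the divisor has already been shifted into the range [0.5, 1.0), and I have not measured its cycle count on an AVR.

```c
#include <stdint.h>

/* Hypothetical helper, not part of avr-libc.
 * m       : divisor as unsigned Q0.16, assumed already normalized to
 *           [0.5, 1.0), i.e. 32768..65535.
 * returns : approximately 1/m as unsigned Q2.14 (16384 == 1.0). */
uint16_t recip_q14(uint16_t m)
{
    /* Linear first guess x0 = 48/17 - (32/17) * m, constants in Q2.14. */
    uint16_t x = (uint16_t)(46261u - (uint16_t)(((uint32_t)30840u * m) >> 16));

    /* Two Newton-Raphson steps: x = x * (2 - m * x). */
    for (uint8_t i = 0; i < 2; i++) {
        uint16_t mx = (uint16_t)(((uint32_t)m * x) >> 16); /* m*x in Q2.14    */
        uint16_t t  = (uint16_t)(32768u - mx);             /* 2 - m*x, Q2.14  */
        x = (uint16_t)(((uint32_t)x * t) >> 14);           /* back to Q2.14   */
    }
    return x;
}
```

As a usage example, a divisor of 0.75 is passed as 49152 and should come back close to 1.3333 in Q2.14. Each step roughly doubles the number of correct bits, so two steps from this first guess are about what a 16-bit result can use. A full divide would still need the normalizing shift before and the matching shift after.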
The avr-libc library includes divide and sqrt functions, but they are fairly costly according to the documentation. The __divsf3 function, which I assume is the divide (but documentation is limited here), takes 465 clock cycles. That is about 23 µs at 20 MHz, assuming one instruction cycle per clock (I think that is correct). That seems like an awfully long time to perform a division, especially when a hardware mul takes only 2 cycles! sqrt takes 492 cycles, which is also pretty high. For some reason, I would expect sqrt to take more.
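On the square root side, one approach that avoids both multiply and divide is the classic bit-by-bit integer square root, which uses only shifts, adds, and compares. This is a minimal sketch under the assumption that the value can be kept as an unsigned 32-bit integer (or fixed-point) rather than a float, so it is not a drop-in replacement for the avr-libc sqrt. The name `isqrt32` is hypothetical, and I have not timed it on an AVR.

```c
#include <stdint.h>

/* Hypothetical helper, not part of avr-libc.
 * Returns floor(sqrt(n)) using only shifts, adds and compares. */
uint16_t isqrt32(uint32_t n)
{
    uint32_t root = 0;
    uint32_t bit  = (uint32_t)1 << 30;  /* highest power of four in 32 bits */

    /* Start at the highest power of four that does not exceed n. */
    while (bit > n)
        bit >>= 2;

    /* Decide one result bit per pass, from the top down. */
    while (bit != 0) {
        if (n >= root + bit) {
            n   -= root + bit;
            root = (root >> 1) + bit;
        } else {
            root >>= 1;
        }
        bit >>= 2;
    }
    return (uint16_t)root;
}
```

The loop runs at most 16 passes for a 32-bit input, and a version specialized to 16-bit inputs would halve that, which is the kind of saving a single-purpose routine might offer.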
I'm sure I can dig into the details of these routines, but if anyone has some insight, that would be great. Perhaps if I knew a bit more about how these particular routines work, I might be able to optimize a version specifically for my task. I'm sure that the avr-libc versions are somewhat generic, so making one that is specific to a single purpose might save some of those cycles. Does anyone have any experience in this area?