HP5508A interferometer replacement hardware 2.0 - Heterodyne Interpolation

As a next step, it would be nice if the hardware could detect the phase angle between REF and MEAS. The consumer of the data stream could certainly do this by oversampling, but that is relatively slow, so let's try to make it happen in the Teensy. At first glance, the GPT timers seem like just what we need. There are two on the chip. They are 32-bit counters that can be clocked as high as 150 MHz, and each counter has two capture registers that can trigger on external signals. So, if we can route the signals to the right pins, we can perform the following measurements:

  • measure rising followed by falling edge of REF to get a count for a half period

  • measure rising edge of REF followed by rising edge of MEAS to get a count of the phase difference delay count

  • the ratio of the delay count to the full-period count (twice the half-period count) is then the phase difference, expressed as a fraction of one cycle.

If not, we have to use some other scheme...
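As a minimal sketch of the arithmetic behind the three measurements listed above, the function below turns the two captured counts into a phase angle. The name `phaseDegrees` and its parameters are hypothetical, and it assumes REF has a 50% duty cycle so that the full period is twice the half-period count.

```cpp
#include <cstdint>

// Phase of MEAS relative to REF in degrees, from two timer-capture counts.
// halfPeriodCount: ticks between a rising and the following falling edge of REF
//                  (assumption: 50% duty cycle, so the full period is twice this)
// delayCount:      ticks between a rising edge of REF and the next rising edge of MEAS
double phaseDegrees(uint32_t halfPeriodCount, uint32_t delayCount) {
    if (halfPeriodCount == 0) {
        return 0.0; // no valid period captured
    }
    const double fullPeriod = 2.0 * halfPeriodCount;
    return 360.0 * delayCount / fullPeriod;
}
```

For example, a half-period count of 50 and a delay count of 25 correspond to a quarter of a cycle, i.e. 90 degrees.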

So can we connect these timers to the right pins? And is the Teensy already using either of them to provide Arduino API functionality? As for the API functions, this page:

https://github.com/luni64/TeensyTimerTool/wiki/Supported-Timers#gpt---general-purpose-timer

states that the GPT blocks are free.

Next, let's look at the pin assignments for the inputs.

To determine which pins those are on the Teensy 4.0 we have to look at its documentation. There is a handy file here:

Or we can find it here:

Or we can use the Teensy 4.0 Schematic:

Thus,

GPT1_CAPTURE1 - GPIO_EMC_24 -> Unavailable on Teensy 4.0

GPT1_CAPTURE1 - GPIO_B1_05 -> Unavailable on Teensy 4.0

GPT1_CAPTURE2 - GPIO_EMC_23 -> Unavailable on Teensy 4.0

GPT1_CAPTURE2 - GPIO_B1_06 -> Pin17/A3

GPT2_CAPTURE1 - GPIO_EMC_41 -> Unavailable on Teensy 4.0

GPT2_CAPTURE1 - GPIO_AD_B1_03 -> Pin 15/A1

GPT2_CAPTURE2 - GPIO_EMC_40 -> Unavailable on Teensy 4.0

GPT2_CAPTURE2 - GPIO_AD_B1_04 -> Unavailable on Teensy 4.0

So the GPT timers cannot be utilized because they cannot be connected to all the signals we need.

Instead, let us try to read the ports directly. Here is the information about where the pins lie in memory:

  • GPIO1-5 are standard-speed GPIOs that run off the IPG_CLK_ROOT, while GPIO6-9 are high-speed GPIOs that run at the AHB_CLK_ROOT frequency.

  • Regular GPIO and high speed GPIO are paired (GPIO1 and GPIO6 share the same pins, GPIO2 and GPIO7 share, etc). The IOMUXC_GPR_GPR26-29 registers are used to determine if the regular or high-speed GPIO module is used for the GPIO pins on a given port.
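The pairing described in the two points above can be modelled in a few lines. This is only a sketch of the selection rule, assuming that a set bit in the matching IOMUXC_GPR_GPR26-29 register selects the high-speed module and a clear bit selects the regular one. The function name `activeGpioModule` is hypothetical.

```cpp
#include <cstdint>

// Which GPIO module serves bit `bit` of regular port `port` (1..4),
// given the value of the matching IOMUXC_GPR_GPR26..29 register.
// Assumption: a set bit selects the high-speed partner (port + 5),
// a clear bit selects the regular module.
int activeGpioModule(int port, uint32_t gprValue, unsigned bit) {
    const bool fast = ((gprValue >> bit) & 1u) != 0;
    return fast ? port + 5 : port;
}
```

With all bits set, bit 11 of port 2 is served by GPIO7. With all bits clear, it stays on GPIO2.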

According to this discussion here:

https://forum.pjrc.com/threads/58432-Nanosecond-Resolution-Interrupts-on-Teensy-4-0

the Teensy 4.0 defaults to fast mapping.

This is interesting because it indicates that these pins could maybe be sampled at the full 600+ MHz core frequency. At the same time,

So does that mean that there is a constant 2 cycle delay between the pin state and reading it, or does it mean that it takes 3 cycles to settle one read? We'll see...

Previously we had chosen the following pins for heterodyne function:

REF: Quadtimer4 - D9 - GPIO_B0_11

Which we can find on this table:

Which means we should be able to read pin D9 directly using GPIO2/7

Like this:

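void setup() {
  Serial.begin(115200);
  pinMode(9, INPUT);
}

void loop() {
  // 0x42004008 is the pad status register (PSR) of GPIO7, the fast partner of GPIO2;
  // volatile forces a real hardware read on every access
  volatile uint32_t * pointerToGPIO7_PSR = (volatile uint32_t *) 0x42004008;
  Serial.println(*pointerToGPIO7_PSR);
  delay(100);
}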
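The address 0x42004008 in the sketch above can be reconstructed from the port number. The constants below are assumptions taken from my reading of the i.MX RT1060 memory map: GPIO6 at 0x42000000, one module every 0x4000 bytes, and PSR at offset 0x08. The helper names `psrAddress` and `pinMask` are hypothetical.

```cpp
#include <cstdint>

constexpr uint32_t kGpio6Base  = 0x42000000u; // assumed base address of GPIO6
constexpr uint32_t kGpioStride = 0x4000u;     // assumed spacing between GPIO6..GPIO9
constexpr uint32_t kPsrOffset  = 0x08u;       // PSR follows DR (0x00) and GDIR (0x04)

// Address of the pad status register of a high-speed port (6..9).
constexpr uint32_t psrAddress(int fastPort) {
    return kGpio6Base + (fastPort - 6) * kGpioStride + kPsrOffset;
}

// Value that a single pin contributes to the 32-bit port word.
constexpr uint32_t pinMask(unsigned bit) {
    return 1u << bit;
}
```

D9 is GPIO_B0_11, which is bit 11 of GPIO2 and therefore of GPIO7 in fast mode, so `psrAddress(7)` gives 0x42004008 and `pinMask(11)` gives 2048.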

Next, let's see how fast we can read this port in a tight loop using just C:

void setup() {
  Serial.begin(115200);
  pinMode(9, INPUT);
}

void loop() {
  uint32_t gpioData[100];
  // volatile keeps the compiler from hoisting the port read out of the loop
  volatile uint32_t * pointerToGPIO7_PSR = (volatile uint32_t *) 0x42004008;

  // first read 100 values as fast as possible
  for (int i = 0; i < 100; i++)
  {
    gpioData[i] = *pointerToGPIO7_PSR;
  }

  // now write the 100 values
  for (int j = 0; j < 100; j++)
  {
    Serial.println(gpioData[j]);
  }
  Serial.println(); // separator
  delay(100);
}

Which results in the following serial data when driven with a 1 MHz signal:

We are reading a 32-bit port. The bit describing D9 has the value 2048 (bit 11). Thus about 67 samples per oscillation. We could go maybe 30 times faster ... 30 MHz. This is really too slow for what we are trying to do. A 5 MHz signal could only be interpolated to 6X resolution. Let's try to use assembly code. In order to avoid branching overhead, I just unrolled the loop.


void setup() {
  Serial.begin(115200);
  pinMode(9, INPUT);
}

void loop() {
  uint32_t gpioData[100];

  // pre-fill with a sentinel so that slots the assembly did not write are easy to spot
  for (int i = 0; i < 100; i++)
  {
    gpioData[i] = 5;
  }

  // read 100 values as fast as possible; .rept makes the assembler emit the
  // ldr/str pair 100 times, so the loop is fully unrolled and contains no branches
  asm volatile("ldr r0, =0x42004008 \n\t"   // load address of GPIO7_PSR into r0
               "mov r1, %0 \n\t"            // copy address of array into r1, index = 0
               ".rept 100 \n\t"             // repeat the following pair 100 times
               "ldr r2, [r0] \n\t"          // load value of GPIO7_PSR into r2
               "str r2, [r1], #4 \n\t"      // store value into gpioData, then advance r1 by 4 bytes
               ".endr \n\t"
               :                            // output operand list
               : "r" (gpioData)             // input operand list
               : "r0", "r1", "r2", "memory" // clobbers; "memory" because the asm writes gpioData
  );

  // now write the 100 values
  for (int j = 0; j < 100; j++)
  {
    Serial.println(gpioData[j]);
  }
  Serial.println(); // separator
  delay(100);
}


The serial console shows this for a 10 MHz signal:


So we get either 7 or 8 samples per period of the 10 MHz signal, which corresponds to a sampling rate of about 70-80 MHz. To sample even faster, it may be feasible to hold the readings in registers and store them only afterwards:


void setup() {
  Serial.begin(115200);
  pinMode(9, INPUT);
}

void loop() {
  uint32_t gpioData[11];

  asm volatile("ldr r0, =0x42004008 \n\t" // load address of GPIO7_PSR into r0
               "mov r1, %0 \n\t"          // copy address of array into r1, index = 0
               // first take 11 back-to-back readings of GPIO7_PSR into r2..r12
               "ldr r2, [r0] \n\t"
               "ldr r3, [r0] \n\t"
               "ldr r4, [r0] \n\t"
               "ldr r5, [r0] \n\t"
               "ldr r6, [r0] \n\t"
               "ldr r7, [r0] \n\t"
               "ldr r8, [r0] \n\t"
               "ldr r9, [r0] \n\t"
               "ldr r10, [r0] \n\t"
               "ldr r11, [r0] \n\t"
               "ldr r12, [r0] \n\t"
               // then store each reading into gpioData, advancing r1 by 4 bytes per store
               "str r2, [r1], #4 \n\t"
               "str r3, [r1], #4 \n\t"
               "str r4, [r1], #4 \n\t"
               "str r5, [r1], #4 \n\t"
               "str r6, [r1], #4 \n\t"
               "str r7, [r1], #4 \n\t"
               "str r8, [r1], #4 \n\t"
               "str r9, [r1], #4 \n\t"
               "str r10, [r1], #4 \n\t"
               "str r11, [r1], #4 \n\t"
               "str r12, [r1], #4 \n\t"
               :                          // output operand list
               : "r" (gpioData)           // input operand list
               : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7", "r8", "r9", "r10", "r11", "r12", "memory"
  );

  // now write the 11 values
  for (int j = 0; j < 11; j++)
  {
    Serial.println(gpioData[j]);
  }
  Serial.println(); // separator
  delay(100);
}

And the result is:

Absolutely no improvement. The array stores are very effectively cached by the processor. Still, this gives a sampling rate of about 75 MHz, or a resolution of roughly 13 ns, for the phase shift between the signals.

So, a branch condition probably won't add any overhead, either. Let us try:

void setup() {
  Serial.begin(115200);
  pinMode(9, INPUT);
}

void loop() {
  uint32_t gpioData[100];

  // pre-fill with a sentinel so that slots the assembly did not write are easy to spot
  for (int j = 0; j < 100; j++)
  {
    gpioData[j] = 5;
  }

  asm volatile("ldr r0, =0x42004008 \n\t" // load address of GPIO7_PSR into r0
               "mov r1, %0 \n\t"          // copy address of array into r1, index = 0
               "mov r2, r1 \n\t"          // copy r1 as the starting point for the end-of-loop condition
               "add r2, #396 \n\t"        // end-of-loop condition: 100 4-byte values, so the last one starts 400 - 4 = 396 bytes after the beginning of the array
               "1: \n\t"                  // local label to loop to
               "ldr r3, [r0] \n\t"        // load value of GPIO7_PSR into r3
               "str r3, [r1], #4 \n\t"    // store value into gpioData, then advance r1 by 4 bytes
               "cmp r1, r2 \n\t"          // check write pointer against loop limit
               "ble 1b \n\t"              // loop if limit not reached
               :                          // output operand list
               : "r" (gpioData)           // input operand list
               : "r0", "r1", "r2", "r3", "cc", "memory" // cmp changes the flags, the stores change gpioData
  );

  // now write the 100 values
  for (int j = 0; j < 100; j++)
  {
    Serial.println(gpioData[j]);
  }
  Serial.println(); // separator
  delay(100);
}

And, sure enough, the sample frequency stays the same!
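Counting samples per period by eye in the serial output gets tedious, so here is a hedged sketch of how the captured port words could be evaluated instead. It looks for rising edges of one pin, selected by a mask such as 2048 for D9, and averages the distance between them. The function names are hypothetical, and the input frequency is assumed to be known.

```cpp
#include <cstddef>
#include <cstdint>

// Average number of samples between consecutive rising edges of the pin
// selected by `mask`. Returns 0.0 if fewer than two rising edges are found.
double samplesPerPeriod(const uint32_t* words, size_t count, uint32_t mask) {
    size_t firstEdge = 0;
    size_t lastEdge = 0;
    size_t edges = 0;
    for (size_t i = 1; i < count; ++i) {
        const bool prev = (words[i - 1] & mask) != 0;
        const bool curr = (words[i] & mask) != 0;
        if (!prev && curr) { // low-to-high transition between two samples
            if (edges == 0) {
                firstEdge = i;
            }
            lastEdge = i;
            ++edges;
        }
    }
    if (edges < 2) {
        return 0.0;
    }
    return static_cast<double>(lastEdge - firstEdge) / static_cast<double>(edges - 1);
}

// Sampling rate implied by a measured samples-per-period value and a known input frequency.
double sampleRateHz(double samplesPerPeriodValue, double signalHz) {
    return samplesPerPeriodValue * signalHz;
}
```

With 7.5 samples per period on a 10 MHz input, `sampleRateHz` gives 75 MHz, which matches the figure quoted earlier.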

Unfortunately the GPIO read does not yield any sane data when these pins are assigned to the timer counters, so we have to copy the signals to available spare pins.

Some easy candidates are GPIO_B0_10, GPIO_B1_00, GPIO_B0_02 and GPIO_B0_01, on pins D6, D8, D11 and D12 respectively.

In the same order, these are on bits GPIO2_IO10, GPIO2_IO16, GPIO2_IO02 and GPIO2_IO01.
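To show how these bit positions would be used, the sketch below pulls the four spare pins out of one 32-bit port word and packs them into a nibble. The function name `packSparePins` and the packing order are my own choices for illustration. Only the bit numbers 10, 16, 2 and 1 come from the list above.

```cpp
#include <cstdint>

// Bit positions within the GPIO2/GPIO7 port word, in the order D6, D8, D11, D12.
constexpr unsigned kBitD6  = 10;
constexpr unsigned kBitD8  = 16;
constexpr unsigned kBitD11 = 2;
constexpr unsigned kBitD12 = 1;

// Pack the four pin states into the low nibble of the result:
// bit 0 = D6, bit 1 = D8, bit 2 = D11, bit 3 = D12.
constexpr uint32_t packSparePins(uint32_t portWord) {
    return ((portWord >> kBitD6) & 1u)
         | (((portWord >> kBitD8) & 1u) << 1)
         | (((portWord >> kBitD11) & 1u) << 2)
         | (((portWord >> kBitD12) & 1u) << 3);
}
```

A port word with only bits 10 and 2 set (D6 and D11 high) packs to 5, and any other bits in the word, such as bit 11 for D9, are ignored.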