top of page

Forum Comments

SOLVED: In Search of Reliable Bidirectional Bluetooth Serial Port Profile (SPP) Communication
In Clinic
Timothy McCarthy
May 13, 2024
"BTconnected is only true within the BiBoard boot session that successful pairing occurs between the laptop and the BiBoard. In all other boot sessions, BTconnected is false so it cannot be used as is written in line 117 of io.h." What makes you say this? Do you have evidence that BTconnected becomes true before a connection is made? The log you give doesn't show that. I assume you've modified the code to output BTconnected at discrete points. The log doesn't show a successful pairing during the boot session; it shows a successful initialization of the Bluetooth object.,i.e., // io.h void blueSspSetup() { PTH("SSP: ", strcat(readLongByBytes(EEPROM_BLE_NAME), "_SSP")); SerialBT.enableSSP(); SerialBT.onConfirmRequest(BTConfirmRequestCallback); SerialBT.onAuthComplete(BTAuthCompleteCallback); SerialBT.begin(strcat(readLongByBytes(EEPROM_BLE_NAME), "_SSP")); //Bluetooth device name Serial.println("The SSP device is started, now you can pair it with Bluetooth!"); } That's not the same as the message that should appear when a successful pairing occurs, i.e., // io.h void BTAuthCompleteCallback(boolean success) { confirmRequestPending = false; if (success) { BTconnected = true; Serial.println("SSP Pairing success!!"); } else { BTconnected = false; Serial.println("SSP Pairing failed, rejected by user!!"); } } "In all other boot sessions, BTconnected is false so it cannot be used as is written in line 117 of io.h." I'm not sure what you mean by "other boot sessions"; there's only one, to my knowledge. The bot only boots once per session. I don't have a working Bluetooth setup, (my phone never sees the BittleX or any other device) so I can't run the test, but the test should be very easy to perform if you have a working Bluetooth setup. Open the Arduino serial monitor and then connect the USB cable to the bot. You should see the boot message appear in the Serial Monitor; specifically, you should see the "The SSP device is started, now you can pair it with Bluetooth!". For completeness, turn the bot battery on. Now, go to your mobile device and connect via Bluetooth to the bot. When you have successfully connected, in the Serial Monitor you should see the message "SSP Pairing success!!" At that point, you know the value of BTconnected is true. I don't know how you disconnect Bluetooth from the box, but if you do, I expect some message in the Serial Monitor to indicate that.
0
SOLVED: In Search of Reliable Bidirectional Bluetooth Serial Port Profile (SPP) Communication
In Clinic
Time to see the Vet
In Clinic
Timothy McCarthy
May 10, 2024
Prof. Farnsworth: "Good news, everyone!" Bender: "Uh-oh! I don't like the sound of that." Prof. Farnsworth: "I have two resolutions and one clue ..." Bender: "Here it comes ..." Prof. Farnsworth: "... and two apologies" Bender: "And I'm outta here!" Apology #1: The shoemaker's wife has no shoes. Dr. Petoi recently posted a video of a test of a T_SKILL_DATA command with the maximum number of elements that would fill up the newCmd buffer to demonstrate it could handle the data. In my reply I asked if he could increase the data amout in order to force the buffer overflow code to execute. I wasn't thinking. I could have done that myself with the test software I wrote. My bot wasn't working and had a hardware problem but I have been testing it with the battery off for some commands for a while. I could have easily written the stress test to overflow the newCmd buffer. But I wasn't thinking. So my apologies for asking you to do something I could have done myself. Apology #2: Do the whole job, not just half. The test is fairly trivial but the results ... well, that's getting ahead of things. Here's my first version of the test TEST_F(ftfBittleXProtocol, newCmd_overflow) { const unsigned BUFF_LEN{ 2507 }; // 1524 =125*20+7=2507 vector < int8_t > symphony(BUFF_LEN, 32); cmd_def_t command{ BEEP, "b", "BEEP" }; command += symphony; EXPECT_TRUE(on_command(command)); } When I ran the test I got a long beep and the output (trimmed for space) 1>[ RUN ] ftfBittleXProtocol.newCmd_overflow 1>ftBittleX::ftfBittleX::on_send 1>TX command : b32 32 32 32 32 ... 32 32 32 32 1>write count : 7522 1>G:\AppDev\Robotics\Petoi\test\ftBittleX\ftBittleX.cpp(423): error : Value of: result 1> Actual: false 1>Expected: true 1>ftSerial.write FAILED expected: 7522 actual: 2304 1>expected : 7522 1>actual : 2304 1>G:\AppDev\Robotics\Petoi\test\ftBittleX\ftBittleX.cpp(654): error : Value of: result 1> Actual: false 1>Expected: true 1>G:\AppDev\Robotics\Petoi\test\ftBittleX\ftprotocol.cpp(565): error : Value of: on_command(command) 1> Actual: false 1>Expected: true 1>[ FAILED ] ftfBittleXProtocol.newCmd_overflow (308 ms) I was surprised by the number of bytes sent and we saw the write failure message before in a previous test. First I cut down the data size to get the byte count closer to the buffer size (I'd figure out why the size was large later). But once I got the byte count down to just below 2507 I was still getting the write failure. So I needed to look more closely into that. I recalled the posts about the problem of BittleX reading the Bluetooth input and how Bluetooth would breakup the data into chunks that caused a BittleX read timeout error. Dr. Petoi said the solution was to increase the timeout value for those commands and asked if I could try that and play with the value. Well, I did try it and matched the BittleX TIMEOUT_LONG value in the OpenCat_serial code and tested it and didn't see any effect. But I didn't "play" with the value. This time I did by increasing the timeout value by 10 ms. And the result ... no more write failure! So, my apologies for slacking off and not doing as I'm told. Resolutions From my POV, this issue is on the BittleX side where the UART microcode needs to manage the input queue during read. The write timeout is the time to wait for the receiving device to signal ready to receive the next byte. But it could be on the Windows side where it manages the output queue during write. With that resolved I could explain the size of the data being sent. For an ASCII command like 'b' each data element is a number. An ASCII number has one byte for each digit in the number. I used a 2 digit number (32) for all elements. There is a space between each number. Thus I need 3 bytes for each element. The maximum number of elements I can have is then BUFF_LEN/3. That test gives 1>[ RUN ] ftfBittleXProtocol.newCmd_overflow 1>ftBittleX::ftfBittleX::on_send 1>TX command : b32 32 32 32 ... 32 32 32 32 1>write count : 2506 1>expected : 2506 1>actual : 2506 1>ftBittleX::ftfBittleX::on_response 1>Command completed (normal) 1>description : BEEP 1>cmd : b 1>id : 1 1>data : 32 32 32 32 ... 32 32 32 32 1> 12984 ms : RX_latency 1> 748 ms : RX_elapsed 1>response : 2 lines 1>Changing volume to 10/10 1> 1>b 1> 1>response : end //... 1>[ OK ] ftfBittleXProtocol.newCmd_overflow (14792 ms) To force the overflow condition I add 1 to the buffer size. vector < int8_t > symphony(BUFF_LEN / 3 + 1, 32); // overflow  to get the output I'm looking for 1>[ RUN ] ftfBittleXProtocol.newCmd_overflow 1>ftBittleX::ftfBittleX::on_send 1>TX command : b32 32 32 32 ... 32 32 32 32 1>write count : 2509 1>expected : 2509 1>actual : 2509 1>ftBittleX::ftfBittleX::on_response 1>Command completed (normal) 1>description : BEEP 1>cmd : b 1>id : 1 1>data : 32 32 32 32 ... 32 32 32 32 1> 46 ms : RX_latency 1> 750 ms : RX_elapsed 1>response : 2 lines 1>OVFb 1> 1>response : end This is progress. I see the overflow response and I hear the error beep. I also see confirmation of my calculation of the response time (750 ms). The RX_latency value is an indicator of how long it the code to detect the overflow. But what about the rest of it? The overflow code replaces the command in error with the "standup" command and lets it get executed. In order to get that I need act as if I sent the "standup" command and I'm looking for the response. if (!HasFailure()) { // process overflow standup command cmd_def_t cmd_standup{ SKILL, "k", "SKILL" }; cmd_standup += string("up"); EXPECT_TRUE(on_response(cmd_standup)); } to get the output: 1>ftBittleX::ftfBittleX::on_response 1>Command completed (normal) 1>description : SKILL 1>cmd : k 1>id : 9 1>data : up 1> 0 ms : RX_latency 1> 23 ms : RX_elapsed 1>response : 3 lines 1>up 1> 1>k 1> 1>k 1> 1>response : end We're not done yet. This demonstrates that an ASCII command overflows properly. It should be identical for a binary command. If so, then the test for the binary test should be identical to the ASCII test and trivial to write and has identical behavior, TEST_F(ftfBittleXProtocol, BIN_newCmd_overflow) { const unsigned BUFF_LEN{ 2507 }; // 1524 =125*20+7=2507 vector < int8_t > symphony(BUFF_LEN + 1, 32); // overflow cmd_def_t command{ BEEP_BIN, "B", "BEEP" }; command += symphony; unsigned latency{ 14000 }; EXPECT_TRUE(on_command(command, latency)); if (!HasFailure()) { // process overflow standup command cmd_def_t cmd_standup{ SKILL, "k", "SKILL" }; cmd_standup += string("up"); EXPECT_TRUE(on_response(cmd_standup)); } } the results: 1>ftBittleX::ftfBittleX::on_send 1>TX command : 4220202020... 202020207e 1>write count : 2510 1>expected : 2510 1>actual : 2510 1>ftBittleX::ftfBittleX::on_response 1>Command completed (normal) 1>description : BEEP_BIN 1>cmd : B 1>id : 3 1>data : 32323232 ... 32323232 1> 46 ms : RX_latency 1> 749 ms : RX_elapsed 1>response : 2 lines 1>OVFB 1> 1>response : end 1>ftBittleX::ftfBittleX::on_response 1>Command completed (normal) 1>description : SKILL 1>cmd : k 1>id : 9 1>data : up 1> 0 ms : RX_latency 1> 228 ms : RX_elapsed 1>response : 3 lines 1>up 1> 1>k 1> 1>k 1> 1>response : end More progress. I was also able to expose a known issue in the binary data stream. A binary command is terminated by the tilde character ('~'). I believe the documentation cautions you to avoid using a tilde as a data value. The obvious reason is that the tilde will be interpreted as the end of the command but the command processing will see it as a data value and "strange" things may happen. As it turns out, my first version of the binary beep test used the same data length and values as the ASCII version TEST_F(ftfBittleXProtocol, ASC_newCmd_overflow) { const unsigned BUFF_LEN{ 2507 }; // 1524 =125*20+7=2507 vector < int8_t > symphony(BUFF_LEN / 3 + 1, 32); // no overflow cmd_def_t command{ BEEP_BIN, "B", "BEEP_BIN" }; command += symphony; //... This won't overflow the newCmd buffer but is a malformed command. The buzzer tone never stops. The data for the beep command comes in pairs, "note duration", We can calculate if the test data is a properly formed data stream by subtracting the command suffix ('~') from the length of the data stream we provide, (BUFF_LEN / 3 + 1). The result must be even value for a properly formed data stream. (BUFF_LEN / 3 + 1) - 1. = 836 -1 = 835. This is more directly shown by simply using BUFF_LEN - 1 for the data vector < int8_t > symphony(BUFF_LEN - 1, 32); // malformed This yields 3 data points for testing: vector < int8_t > malformed_symphony(BUFF_LEN - 1, 32); vector < int8_t > no_overflow_symphony(BUFF_LEN, 32); vector < int8_t > overflow_symphony(BUFF_LEN + 1, 32); I don't recommend the running the malformed test. The buzzer never ends and "annoys the pig." Here's the exciting footage of the T_BEEP buffer overflow test: https://youtu.be/vbEehKzfEt4 These results resolve two issues: 1. Why the Windows WriteFile call fails? (Thanks to Doc Petoi) 2. What happens when you do overflow newCmd buffer? The Last Test We need to do one last test before we put this issue to rest. We need to test when the T_SKILL_DATA command overflows the newCmd buffer the program doesn't crash. Logically, this should no different than the binary BEEP command but it is the issue under test so we want to explicitly test it. We just have to change the command we send; the data or its format doesn't matter. All that matters is the size of the data. The response should be the same as that for the binary BEEP command. There's an interesting twist to this test TEST_F(ftfBittleXProtocol, SKILL_DATA_newCmd_overflow) { const unsigned BUFF_LEN{ 2507 }; // 1524 =125*20+7=2507 vector  nonsense(BUFF_LEN + 1, 32); // overflow cmd_def_t command{ SKILL_DATA, "K", "SKILL_DATA" }; command += nonsense; unsigned latency{ 14000 }; EXPECT_TRUE(on_command(command, latency)); if (!HasFailure()) { // process overflow standup command cmd_def_t cmd_standup{ SKILL, "k", "SKILL" }; cmd_standup += string("up"); EXPECT_TRUE(on_response(cmd_standup)); } } The results confirm our analysis but uncover a problem in processing some commands 1>[ RUN ] ftfBittleXProtocol.SKILL_DATA_newCmd_overflow 1>ftBittleX::ftfBittleX::on_send 1>TX command : 4b20202020 ... 202020207e 1>write count : 2510 1>expected : 2510 1>actual : 2510 1>ftBittleX::ftfBittleX::on_response 1>Command completed (toggle) 1>description : SKILL_DATA 1>cmd : K 1>id : 10 1>data : 32323232 ... 32323232 1> 45 ms : RX_latency 1> 1027 ms : RX_elapsed 1>response : 4 lines 1>OVFK 1> 1>up 1> 1>k 1> 1>response : end 1>ftBittleX::ftfBittleX::on_response 1>ftBittleX::ftfBittleX::on_response elapse timeout! 1>description : SKILL 1>cmd : k 1>id : 9 1>data : up 1> 0 ms : RX_latency 1> 2028 ms : RX_elapsed 1>response : 1 lines 1>k 1> 1>response : end When most commands complete they echo the command to the serial monitor. However, some commands, like T_SKILL_DATA, do some software "sleight-of-hand" and echo back one or more lowercase values of the command. The T_SKILL_DATA is one of these and normally responds with two 'k' values. But it doesn't do that here. For overflow it responds with "OVFK". It then issues the "kup" command which does respond with two 'k' valuss when it completes. So the test harness becomes confused about the responses. This is part of why I'm not fond of  behavior I didn't ask for. It's inconsistent. I think an argument can be made for the error handling to be elsewhere. But that's a design, not a testing, issue. The tests here confirm the expected behavior and not a crash. Here's the dramatic and astounding video of the T_SKILL_DATA buffer overflow test: https://youtu.be/DIS5Efdx258   A Possible Clue As a  result of the diagnosis of a possible hardware failure, I started to think about how to test the hardware to confirm the diagnosis. This brought back memories of  my early days working in a computer manufacturing plant when I worked with various hardware components and microprocessors. It's been so long since then (40 years) that I've forgotten the day-to-day, simple tasks you do when a hardware issue comes up. But  I  suddenly recalled one of them,  a task so simple you do it by reflex: visual  inspection. You look at the component to see if there's a visible  flaw. It takes a trained eye to see some flaws but the folks who make them know where to  look and can spot them at a glance. The component under suspicion here is the power supply circuit. The documentation identifies where that circuit is located on the board. So I took a picture of Malfunctioning Eddie's board: Here's a enlarged view of the power circuit (if I've located it correctly): My eye isn't educated enough to know if there's a visible flaw. My eye is drawn  to the one of the two circular soldered connections. The one on the right; the upper right hand  corner seems odd to me; as if a connection point has been uncovered due to solder  fatigue or something. But Idunno. But I do think there's someone who could quickly tell if there's an obvious flaw.
Content media
1
Time to see the Vet
In Clinic
Timothy McCarthy
May 09, 2024
"The T_SKILL_DATA overflow is already handled by cmdLen >= BUFF_LEN alone because it's an "or" condition." That condition handles just the raw overflow of the newCmd buffer. No command may exceed the BUFF_LEN limit. By definition, read_serial would directly overflow the newCmd buffer in that case. But that's not the test case we're interested in. We're interested in the case of spaceAfterStoringData <= cmdLen. The command length exceeds the threshold spaceAfterStoringData of the move operation but not the size of the command buffer newCmd. I provided a test of that with T_SKILL_DATA that demonstrates the condition isn't detected 1>G:\AppDev\Robotics\Petoi\test\utOpenCatEsp32\readoverflow.cpp(120): error : Value of: is_overflowex() 1> Actual: false 1>Expected: true 1>Overflow undetected for: 1> K : token 1> 17 : cmdLen 1> 16 : spaceAfterStoringData I'm focused on the value of spaceAfterStoringData when the expression is evaluated. In read_serial, during each iteration of the do-while loop, what is the value of spaceAfterStoringData? What skill command does it pertain to? Get Out Jail Free: Fortunately, or unfortunately, this analysis is mute due to the sheer size of newCmd (2507 bytes). I don't have a test case to demonstrate the error state. We can calculate where the threshold value is for a given skill command, and I can't find a pair that would trigger the error.  In addition, I've been mischaracterizing the error as a buffer overflow. It's not a buffer overflow but a data overlap. The duty angle data is inserted into the newCmd buffer by moving the skill command frame list to the end of the buffer. If the duty angle data overlap the frame it will overwrite the frame data.
0
Time to see the Vet
In Clinic
Timothy McCarthy
May 08, 2024
see Skill::inplaceShift
0
Time to see the Vet
In Clinic
Timothy McCarthy
May 08, 2024
Part 3 More Tests An additional test is to remove the dependence on the order of precidence of the logical operators by adding parentheses to the expression. This insures the compiler sees the expression as we intend. My assuption is the actual intent was for the expression to be bool is_overflowex(int8_t skill_token = T_SKILL) { bool result = ( (skill_token == token || T_INDEXED_SIMULTANEOUS_ASC == lowerToken || T_INDEXED_SEQUENTIAL_ASC == lowerToken) && (cmdLen >= spaceAfterStoringData || cmdLen >= BUFF_LEN)); return result; } This shouldn't produce any new results TEST_F(utfinput_overflow, on_inputex_SKILL_DATA_overflow_undetected) { token = T_SKILL_DATA; spaceAfterStoringData = BUFF_LEN / 2; cmdLen = spaceAfterStoringData + 1; EXPECT_TRUE(is_overflowex()) << "Overflow undetected for:\n" << setw(4) << token << " : token\n" << setw(4) << cmdLen << " : cmdLen\n" << setw(4) << spaceAfterStoringData << " : spaceAfterStoringData\n" ; } 1>[ RUN ] utfinput_overflow.on_inputex_SKILL_DATA_overflow_undetected 1>G:\AppDev\Robotics\Petoi\test\utOpenCatEsp32\readoverflow.cpp(120): error : Value of: is_overflowex() 1> Actual: false 1>Expected: true 1>Overflow undetected for: 1> K : token 1> 17 : cmdLen 1> 16 : spaceAfterStoringData 1> 1>[ FAILED ] utfinput_overflow.on_inputex_SKILL_DATA_overflow_undetected (1 ms) The test of T_SKILL should detect the oveflow TEST_F(utfinput_overflow, is_overflowex_SKILL_overflow_detected) { token = T_SKILL; spaceAfterStoringData = BUFF_LEN / 2; cmdLen = spaceAfterStoringData + 1; EXPECT_TRUE(is_overflowex()) << "Overflow undetected for:\n" << setw(4) << token << " : token\n" << setw(4) << cmdLen << " : cmdLen\n" << setw(4) << spaceAfterStoringData << " : spaceAfterStoringData\n" ; } 1>[ RUN ] utfinput_overflow.is_overflowex_SKILL_overflow_detected 1>[ OK ] utfinput_overflow.is_overflowex_SKILL_overflow_detected (0 ms) These tests demonstrate adding parentheses doesn't fix the defect. The T_SKILL == token comparison cripples the expression. As a last test, I removed the exception checks and performed a simple overflow test ool is_overflow_simple() { bool result = BUFF_LEN <= cmdLen; return result; } TEST_F(utfinput_overflow, is_overflow_simple_SKILL_DATA_overflow_detected) { token = T_SKILL_DATA; spaceAfterStoringData = BUFF_LEN / 2; cmdLen = BUFF_LEN; EXPECT_TRUE(is_overflow_simple()) << "Overflow undetected for:\n" << setw(4) << token << " : token\n" << setw(4) << cmdLen << " : cmdLen\n" << setw(4) << spaceAfterStoringData << " : spaceAfterStoringData\n" ; } TEST_F(utfinput_overflow, is_overflow_simple_SKILL_overflow_undetected) { token = T_SKILL; spaceAfterStoringData = BUFF_LEN / 2; cmdLen = BUFF_LEN; EXPECT_TRUE(is_overflow_simple()) << "Overflow undetected for:\n" << setw(4) << token << " : token\n" << setw(4) << cmdLen << " : cmdLen\n" << setw(4) << spaceAfterStoringData << " : spaceAfterStoringData\n" ; } 1>[ RUN ] utfinput_overflow.is_overflow_simple_SKILL_DATA_overflow_detected 1>[ OK ] utfinput_overflow.is_overflow_simple_SKILL_DATA_overflow_detected (0 ms) 1>[ RUN ] utfinput_overflow.is_overflow_simple_SKILL_overflow_undetected 1>[ OK ] utfinput_overflow.is_overflow_simple_SKILL_overflow_undetected (0 ms) This implies that the test of spaceAfterStoringData should be performed after reading the command. The read_seriial function only insures the raw command is completely read. I would also make the argument that read_serial should return a boolean result. Conclusion This ends the first part of the code analysis. I think this demonstrates the existence of the defect in the code. I said this wasn't definiative because there's an accomplice to this defect. Demonstrating that a buffer overflow might happen isn't the same as finding where it does happen. Luckily we know the name of the accomplice: spaceAfterStoringData. Part 2 will focus on tracking him down and finding where the buffer overrun happens. In the conclusion I'll propose code changes to resolve the issue.
0
0
Time to see the Vet
In Clinic
Timothy McCarthy
May 08, 2024
Part 2 Let's write a test Before I spend energy and brain cells on code analysis, I want to write a test to veify my analysis. GTest makes this fairly easy to write. The test I want checks if the conditional statement if ( (token == T_SKILL || lowerToken == T_INDEXED_SIMULTANEOUS_ASC || lowerToken == T_INDEXED_SEQUENTIAL_ASC) && cmdLen >= spaceAfterStoringData || cmdLen >= BUFF_LEN) { //... } evaluates to true when a buffer oveflow would occur and false otherwise. The function is trivial to write bool is_overflow() { bool result = ( (T_SKILL == token || T_INDEXED_SIMULTANEOUS_ASC == lowerToken || T_INDEXED_SEQUENTIAL_ASC == lowerToken) && spaceAfterStoringData <= cmdLen || BUFF_LEN <= cmdLen); return result; } Style: I prefer comparison expressions operators to be left to right, smaller to larger, constants on the left;. logical expressions on separate lines. The function uses global variables token, lowerToken, spaceAfterStoringData, and cmdLen, so we define and initialize them along with the constants they use. const uint32_t BUFF_LEN{ 32 }; const int8_t T_SKILL{ 'k' }; const int8_t T_SKILL_DATA{ 'K' }; const int8_t T_INDEXED_SIMULTANEOUS_ASC{ 'i' }; const int8_t T_INDEXED_SEQUENTIAL_ASC{ 'm' }; const int8_t T_QUERY{ '?' }; int8_t token{ T_SKILL_DATA }; int8_t lowerToken{ int8_t(tolower(token)) }; uint32_t cmdLen{}; uint32_t spaceAfterStoringData{ BUFF_LEN }; We wrap the function and variables into a class, called a test fixture, that our test willl use. class utfinput_overflow : public testing::Test { protected: int8_t token{ T_SKILL_DATA }; int8_t lowerToken{ int8_t(tolower(token)) }; uint32_t cmdLen{}; uint32_t spaceAfterStoringData{ BUFF_LEN }; public: utfinput_overflow() {} ~utfinput_overflow() {} bool is_overflow() { bool result = ( (T_SKILL == token || T_INDEXED_SIMULTANEOUS_ASC == lowerToken || T_INDEXED_SEQUENTIAL_ASC == lowerToken) && spaceAfterStoringData <= cmdLen || BUFF_LEN <= cmdLen); return result; } }; The class is derived from a class that GTest provides to handle the interface to the GTest framework. We aren't concerned about the details for that class here but we need it to connect our test to the fixture. The test is trivial TEST_F(utfinput_overflow, SKILL_DATA_overflow_undetected) { token = T_SKILL_DATA; lowerToken = int8_t(tolower(token)); spaceAfterStoringData = BUFF_LEN/2; cmdLen = spaceAfterStoringData + 1; EXPECT_TRUE(is_overflow()) << "Overflow undetected for:\n" << setw(4) << token << " : token\n" << setw(4) << cmdLen << " : cmdLen\n" << setw(4) << spaceAfterStoringData << " : spaceAfterStoringData\n" ; } We build and run the test and get the output 1>[ RUN ] utfinput_overflow.SKILL_DATA_overflow_undetected 1>G:\AppDev\Robotics\Petoi\test\utOpenCatEsp32\readoverflow.cpp(70): error : Value of: is_overflow() 1> Actual: false 1>Expected: true 1>Overflow undetected for: 1> K : token 1> 17 : cmdLen 1> 16 : spaceAfterStoringData 1> 1>[ FAILED ] utfinput_overflow.SKILL_DATA_overflow_undetected (0 ms) This confirms our analysis that the expression doesn't do it's job. The T_SKILL_DATA command can produce an undetected buffer overflow. It also confirms our analysis of the order of precedence for the logical operators. That's progress When we test using the expected T_SKILL token, the test passes. TEST_F(utfinput_overflow, SKILL_overflow_detected) { token = T_SKILL; spaceAfterStoringData = BUFF_LEN / 2; cmdLen = spaceAfterStoringData + 1; EXPECT_TRUE(is_overflow()) << "Overflow undetected for:\n" << setw(4) << token << " : token\n" << setw(4) << cmdLen << " : cmdLen\n" << setw(4) << spaceAfterStoringData << " : spaceAfterStoringData\n" ; } output: 1>[ RUN ] utfinput_overflow.SKILL_overflow_detected 1>[ OK ] utfinput_overflow.SKILL_overflow_detected (0 ms) However, the T_SKILL caommand uses a named, built-in skill and doesn't have variable data, This test case should never happen. If it does happen, it's detected but is a false-postive and means that spaceAfterStoringData isn't being maintained properly. That observation will be useful later on. The test is helpful but could be more helpful and flexible if we could control the token value being sought in addition to the input token.This is easily accomplished by adding a default token value to the is_overflow function. bool is_overflow(int8_t skill_token = T_SKILL) { bool result = ( (skill_token == token || T_INDEXED_SIMULTANEOUS_ASC == lowerToken || T_INDEXED_SEQUENTIAL_ASC == lowerToken) && cmdLen >= spaceAfterStoringData || cmdLen >= BUFF_LEN); return result; } The current tests are unaffected and should behave as before but we can now test what would appear to be the obvious fix: change the test expression to use T_SKILL_DATA TEST_F(utfinput_overflow, SKILL_DATA_overflow_detected) { token = T_SKILL_DATA; spaceAfterStoringData = BUFF_LEN / 2; cmdLen = spaceAfterStoringData + 1; EXPECT_TRUE(is_overflow(T_SKILL_DATA)) << "Overflow undetected for:\n" << setw(4) << token << " : token\n" << setw(4) << cmdLen << " : cmdLen\n" << setw(4) << spaceAfterStoringData << " : spaceAfterStoringData\n" ; } output: 1>[ RUN ] utfinput_overflow.SKILL_DATA_overflow_detected 1>[ OK ] utfinput_overflow.SKILL_DATA_overflow_detected (0 ms) That's great, right? Not exactly. We need to look at how T_SKILL behaves TEST_F(utfinput_overflow, SKILL_overflow_undetected) { token = T_SKILL; spaceAfterStoringData = BUFF_LEN / 2; cmdLen = spaceAfterStoringData + 1; EXPECT_TRUE(is_overflow(T_SKILL_DATA)) << "Overflow undetected for:\n" << setw(4) << token << " : token\n" << setw(4) << cmdLen << " : cmdLen\n" << setw(4) << spaceAfterStoringData << " : spaceAfterStoringData\n" ; } output: 1>[ RUN ] utfinput_overflow.SKILL_overflow_undetected 1>G:\AppDev\Robotics\Petoi\test\utOpenCatEsp32\readoverflow.cpp(106): error : Value of: is_overflow(T_SKILL_DATA) 1> Actual: false 1>Expected: true 1>Overflow undetected for: 1> k : token 1> 17 : cmdLen 1> 16 : spaceAfterStoringData 1> 1>[ FAILED ] utfinput_overflow.SKILL_overflow_undetected (1 ms) "Well." sez me, "that's not really a problem cuz T_SKILL doesn't have variable data so there won't be an overflow." Disturbia Maybe the programmer didn't make a typing error by using the value T_SKILL. Maybe they did it for a reason that we haven't discovered yet. There may be a case where the T_SKILL command uses the spaceAfterStoringData value. We don't need to speculate about that because there are other commands that can have variable number of parameters that aren't included in the expression. In addition to T_INDEXED_SIMULTANEOUS_ASC and T_INDEXED_SEQUENTIAL_ASC there are matching binary versions that can also take a variable number of parameters. How many for all of them? It's not 16 but an arbitrary number, i.e., you can repeat the jointindex as long you wish. There's BEEP and BEEP_BIN. The T_JOINTS command takes a list of joints that can be arbitrarily long and have repeats as well, e.g., j 0 8 9 10 0 8 9 10 0 8 9 10 ... Not to mention any new commands added later on. The point is the fragile nature of "exception coding" like that of the conditional expression. More Disturbia. What's the intent of the statement? We're trying to prevent buffer overflow when reading a command. Does it really matter what the command is that overflows the buffer? The T_READ command should only accept one pin number to read from but that doesn't mean you can't send a long list of pins. I don't know what it does with such a list because my test only checked a single pin but I don't expect it to "brick" the bot. Up until now I've ignored what happens when a buffer overflow condition is detected. What happens is if ( /* detect overflow */ ) { PTF("OVF"); beep(5, 100, 50, 5); do { serialPort->read(); } while (serialPort->available()); printToAllPorts(token); token = T_SKILL; strcpy(newCmd, "up"); cmdLen = 2; return; } This doesn't look too bad until you look closer and think about it. Remember, this all happens as the sender is trying to send data and you're trying to read the input. The PTF macro does I/O to the serial monitor, which is slow and uses the same line that the USB is on that the sender is writing to. I beleive the USB is full-duplex (can send an receive at the same time) but it is the speed that is of more concern. It's the call to beep that boggles me: beep(5, 100, 50, 5) says, "Play the note 5 for 100 ms, wait for 50 ms, and do that 5 times." That's 150 ms * 5 = 750 ms! Three quarters of a second! This is 5 times longer than our largest timeout for reads. If we timeout after 150 ms, what would you expect to happen on the sender side if we delay for 750 ms? It's only at this point do we drain the input buffer; we probably should have done that first. The command response is sent out by echoing the token to the serial port. Finally, for added emphasis that something has gone wrong, the skill command for "stand up" is placed into the command buffer for subsequent processing by reaction(). From my POV, out of all the poses to take to indicate a problem, "stand up" isn't the one I'd select. "Rest" I would think might be more indicative. However, I don't like doing anything you weren't told to do. Doing nothing would be a safer course of action. end Part 2
0
0
Time to see the Vet
In Clinic
Timothy McCarthy
May 08, 2024
Preface: This is a 3 part post. You should leave now or settle in. Part 1 Corrupted memory location Before analyzing the code, I wanted to get an idea of the kind of memory that would be corrupted by a buffer overflow.This might help guage the severity and behavior symptoms. The definition of the buffer newCmd is char *newCmd = new char[BUFF_LEN + 1]; In my experience, dynamic memory corruption is the diagnostic "kiss of death". Without a debugger, unless you're an expert on memory layout, et al., it's fruitless to try to trace. As the saying goes, "And it annoys the pig." Analysis The expression (token == T_SKILL || lowerToken == T_INDEXED_SIMULTANEOUS_ASC || lowerToken == T_INDEXED_SEQUENTIAL_ASC) && cmdLen >= spaceAfterStoringData || cmdLen >= BUFF_LEN needs to be examined in some detail in order to understand the conditions necessary to replicate the defect. First, I want to reorder the expression terms for clarity. (T_SKILL == token || T_INDEXED_SIMULTANEOUS_ASC == lowerToken || T_INDEXED_SEQUENTIAL_ASC == lowerToken) && spaceAfterStoringData <= cmdLen || BUFF_LEN <= cmdLen This code style brings "what I'm looking for" to the forefront when reading the code.This helps me keep my mental model of the algorithm being used. The test case of the expression is: T_SKILL_DATA == token. The questions are: What is the intent of the code? How is the expression evaluated? How do the values of lowerToken, spaceAfterStoringData, and cmdLen affect the expression. What is the intent of the code? The intent is to detect a buffer overflow for those commands that accept a variable length argument list. How is the expression evaluated? The code depends upon the order of precedence of the logial operators. (Parentesies would have made the intent unambiguous.) The order of precedence for the logical operators is: NOT (!), AND (&&), OR (||). The expression is then equivalent to ( ( T_SKILL == token || T_INDEXED_SIMULTANEOUS_ASC == lowerToken || T_INDEXED_SEQUENTIAL_ASC == lowerToken ) && (spaceAfterStoringData <= cmdLen) ) || (BUFF_LEN <= cmdLen) The precondition, T_SKILL_DATA == token ('K') controls the value of lowerToken, i.e., T_SKILL == lowerToken ('k'). Substitution these values into the first term of the AND expression gives, (T_SKILL == T_SKILL_DATA || T_INDEXED_SIMULTANEOUS_ASC == T_SKILL || T_INDEXED_SEQUENTIAL_ASC == T_SKILL) which evaluates to false. This causes the entire expression to collapse to the last term (BUFF_LEN <= cmdLen) which is the check for buffer overflow. Isn't that what we want, after all? Not rxactly. The reason is that the first term of the AND expression should be (T_SKILL_DATA == T_SKILL_DATA || T_INDEXED_SIMULTANEOUS_ASC == T_SKILL || T_INDEXED_SEQUENTIAL_ASC == T_SKILL) which evaluates to true. So the AND expression should be (TRUE) && (spaceAfterStoringData <= cmdLen) This shows the effect of spaceAfterStoringData on the test for buffer overflow. If cmdLen exceeds spaceAfterStoringData then buffer overflow will occur. Note that the code distinguishes between BUFF_LEN and spaceAfterStoringData. It certainly appears their relationship is: spaceAfterStoringData <= BUFF_LEN. The initialization of spaceAfterStoringData confirms this. int spaceAfterStoringData = BUFF_LEN; We want to know how and when spaceAfterStoringData is calculated. We know it doesn't happen as the input is read, so it should be calculated before the call to read_serial. Disturbia The data is coming through the protocol so we don't know the length or content until after we've received it. (We'd have to be psychic to do that.) However, we might, at the start of reading a command, reset spaceAfterStoringData back to it's initialization state. (Pssst! We don't.) But that doesn't really help us determine if the buffer will oveflow as we're reading the data. It's only after we've read the command and data, and we're processing it, that we can determine if the buffer will overflow. The use of spaceAfterStoringData during the read operation is suspicious. We're almost ready to track down spaceAfterStoringData but there's another clue about it that we have. The buffer ovrflow check is restricted by the 3 token types. We can expect that spaceAfterStoringData is affected by or affects the processing of those tokens. end Part 1
0
5

Timothy McCarthy

更多動作
bottom of page