Hello,

I'm developing an application for the XDK and have run into an issue. The application takes sensor readings and sends them to a REST server with an HTTP GET.

All works well for a period of time. I have had it run as long as 3 hours without issue. However, it always hangs at some point. This can happen after as little as 20 seconds or, as mentioned, after as long as 3 hours.

 

I seem to be failing to receive an HTTP response. Specifically, this callback

 

static retcode_t onHTTPResponseReceived(HttpSession_T *httpSession, Msg_T *msg_ptr, retcode_t status){
    (void) (httpSession);

    if (status == RC_OK && msg_ptr != NULL) {
        Http_StatusCode_T statusCode = HttpMsg_getStatusCode(msg_ptr);
        char const *contentType = HttpMsg_getContentType(msg_ptr);
        char const *content_ptr;
        unsigned int contentLength = 0;
        HttpMsg_getContent(msg_ptr, &content_ptr, &contentLength);
        /* copy the body and terminate the copy so it can be printed as a string */
        char content[contentLength+1];
        strncpy(content, content_ptr, contentLength);
        content[contentLength] = 0;
        printf("HTTP RESPONSE: %d [%s]\r\n", statusCode, contentType);
        printf("%s\r\n", content);
    } else {
        printf("Failed to receive HTTP response!\r\n");
        printf("Error code: %d\r\n", status);
    }
    return RC_OK;
}

 

is hitting the else block with status code 1387, which is the retcode_t value RC_HTTP_CLIENT_NO_RESPONSE.
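
A side note on the callback above: `char content[contentLength+1]` is a variable-length array, so the whole response body lands on the stack of whichever task runs the callback. Whether that plays any role in this hang is not established here. A minimal sketch of a bounded copy into a fixed-size buffer, where `copyBounded` is a hypothetical helper and not part of the XDK API:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy at most dstSize - 1 bytes from a source that is
 * not NUL-terminated, always terminate the destination, and return the
 * number of bytes copied. */
static size_t copyBounded(char *dst, size_t dstSize, const char *src, size_t srcLen)
{
    size_t n = 0u;

    if (dst == NULL || dstSize == 0u) {
        return 0u;
    }
    if (src != NULL) {
        n = (srcLen < dstSize - 1u) ? srcLen : dstSize - 1u;
        memcpy(dst, src, n);
    }
    dst[n] = '\0';
    return n;
}
```

In the callback this would replace the variable-length array with something like `char content[128];` followed by `copyBounded(content, sizeof content, content_ptr, contentLength);`. The size 128 is an arbitrary assumption, and longer bodies would be truncated.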

 

So at some point, the server fails to respond to me (or perhaps does not respond quickly enough). I wouldn't expect this to cause an issue, except that my next HTTP GET attempt fails at:

    retcode_t test1 = HttpClient_initRequest(&serverAddr, port, &msg);
    if (RC_OK != test1){
        printf("Failed to create MSG_T struct. Retcode: %d\r\n", test1);
        assert(false);
    }

 

***Worth noting*** This is not always the failure case. I seem to sometimes simply hang after successfully receiving a server response. I haven't been able to pin down where that might be happening.

Since that wasn't exceptionally helpful I purchased a JTAG debugger and have been looking at the register values. I rewrote the Hard Fault handler function to dump the register values to variables so I can view them in the debugger. 
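
For anyone who wants to do the same, here is a minimal sketch of decoding the stacked registers. It assumes a Cortex-M core, which pushes r0 to r3, r12, lr, pc and xPSR (in that order) onto the active stack on exception entry. The names `FaultFrame_T`, `FaultFrame_decode` and `FaultFrame_pcIsZero` are made up for illustration, and the handler glue that obtains the stack pointer is not shown.

```c
#include <stdint.h>

/* Registers pushed automatically on exception entry (assumed Cortex-M),
 * in ascending address order starting at the stack pointer. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
} FaultFrame_T;

/* Hypothetical helper: copy the eight stacked words into a struct that is
 * easy to inspect in the debugger. */
static FaultFrame_T FaultFrame_decode(const uint32_t *stackedWords)
{
    FaultFrame_T frame;

    frame.r0   = stackedWords[0];
    frame.r1   = stackedWords[1];
    frame.r2   = stackedWords[2];
    frame.r3   = stackedWords[3];
    frame.r12  = stackedWords[4];
    frame.lr   = stackedWords[5];   /* return address of the faulting call */
    frame.pc   = stackedWords[6];   /* address that was being executed */
    frame.xpsr = stackedWords[7];
    return frame;
}

/* A stacked pc of 0 is consistent with a branch to address 0, for example
 * a call through a NULL function pointer. */
static int FaultFrame_pcIsZero(const FaultFrame_T *frame)
{
    return frame->pc == 0u;
}
```

With a frame like this, the stacked lr points back into the function that made the bad call, and the stacked pc shows where execution ended up.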

 

So when I hit the hard fault, my lr register holds 0x2df0f. If I "Go to line" for that address in the disassembly, I find that I am in the "Callable_callback()" function. I do not have the source for this available, so I am unable to debug further. Worth noting is that upon fault, my pc register is 0.

 

I have suspicions now that at some point a callback function is being set as NULL, and I am causing a hard fault by attempting to execute an instruction at address 0.
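
A minimal sketch of the kind of guard that applies to callbacks invoked from application code. It cannot protect the call inside `Callable_callback()`, since that lives in the library. `Callback_T`, `invokeGuarded` and `stubCallback` are hypothetical names used only for illustration.

```c
#include <stddef.h>

typedef int (*Callback_T)(void *context);

#define CALLBACK_NOT_SET (-1)   /* hypothetical error value for this sketch */

/* Call the callback only if it is set; report a missing callback to the
 * caller instead of branching to address 0. */
static int invokeGuarded(Callback_T callback, void *context)
{
    if (callback == NULL) {
        return CALLBACK_NOT_SET;
    }
    return callback(context);
}

/* Stub callback for demonstration: returns the pointed-to int plus one. */
static int stubCallback(void *context)
{
    return *(int *) context + 1;
}
```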

 

Things I have tried:

Sanity checks for NULL pointers everywhere I assign a callback function.

Semaphores surrounding any memory access between tasks (even read-only accesses).

I have a great deal of confidence it is a networking issue. The sensor tasks alone have run overnight multiple times without issue.

Increasing delays between making HTTP calls. 

 

I'm kind of banging my head against the wall at this point, especially given that this seems incredibly non-deterministic. It can take me (as stated) up to 3 hours to reproduce, but it usually only takes a few minutes at most.

 

If anyone can offer advice to move forward, I'd greatly appreciate it.

 

edit: I've just observed a new behavior in testing, where I hit the failure at HttpClient_initRequest(&serverAddr, port, &msg); but my sensor tasks keep running, as does the task I use to print data to the console. It looks as if the http_GET task has asserted and stopped running at that point.

RE: Hitting Hard Fault (Possibly during an HTTPReceived Callback)
10/11/17 2:52 PM as a reply to Brennan Ruthardt.

Hello Brennan,

Since you have already sanitized the code, you can go one step further and remove the assert(false) calls. This keeps your task from halting the first time it gets a bad return code. Of course, you then have to prevent subsequent functions from being called if they depend on a successful return code. Currently, it seems that the application breaks the first time a response is not received or a message cannot be initialized. Asserts are mainly useful when continuing the application could end in disaster. In this case, continuing would not be harmful.
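
A minimal sketch of that pattern, using stand-in types and stub functions rather than the real XDK calls. `SketchRc_T`, `httpGetOnce` and the stubs are hypothetical; in the real task the two steps would correspond to the request initialization and the call that sends the request.

```c
#include <stdio.h>

typedef int SketchRc_T;          /* stand-in for retcode_t */
#define SKETCH_RC_OK     0       /* stand-in for RC_OK */
#define SKETCH_RC_FAILED 1       /* arbitrary failure value for the stubs */

typedef SketchRc_T (*SketchStep_T)(void);

/* Stubs that stand in for the real init and send calls. */
static int g_pushCalls = 0;
static SketchRc_T stubInitOk(void)    { return SKETCH_RC_OK; }
static SketchRc_T stubInitFails(void) { return SKETCH_RC_FAILED; }
static SketchRc_T stubPush(void)      { g_pushCalls++; return SKETCH_RC_OK; }

/* One GET attempt: report a failure to the caller instead of asserting,
 * and skip the send step when the init step did not succeed. */
static SketchRc_T httpGetOnce(SketchStep_T initRequest, SketchStep_T pushRequest)
{
    SketchRc_T rc = initRequest();

    if (rc != SKETCH_RC_OK) {
        printf("initRequest failed, retcode: %d\r\n", (int) rc);
        return rc;   /* pushRequest depends on a valid message, so skip it */
    }
    rc = pushRequest();
    if (rc != SKETCH_RC_OK) {
        printf("pushRequest failed, retcode: %d\r\n", (int) rc);
    }
    return rc;
}
```

The calling task can log the return value and simply try again on its next cycle, so a single failed request no longer stops the task.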

Secondly, have you checked whether the XDK is still connected to the WiFi at the point where the http_GET task stops? A lost connection would be one reason why the XDK does not receive an answer from the server. The XDK can lose its connection for various reasons, such as an overburdened access point, but it may just as well be an arbitrary loss of signal.

Is your code based on one of the examples from the XDK-Workbench, on a guide, or is it your own code?

In addition, if this is a possibility for you, I would like to use your code to reproduce your issue for further analysis. Would you be so kind as to post your e-mail address here so that I can contact you directly?

Kind regards,
Manuel

10/11/17 10:00 PM as a reply to Manuel Cerny.

Hey Manuel,

Thank you for the response!

I will remove the asserts and add some logic to skip further function calls which might depend on the offending function exiting successfully. 

I am not currently checking on the status of the WiFi connection. I'll add a check at the end of the http_GET task. It's reproducible on multiple networks but it's always good to cover every base.

My code is largely based on the examples in the guides here. Of course, it's been tailored to my needs but the API calls remain the same. 

Let me try your suggestions first, and I'll report back. If issues persist I can look into possibly sharing the project with you. :)

10/12/17 2:26 PM as a reply to Brennan Ruthardt.

Hello Brennan,

I will be waiting for your results, then. My best guess as to the cause of this issue is a disconnect from the WiFi network. This has been a common problem for many users, since the XDK WiFi examples do not automatically reconnect after a lost connection. We recommend implementing this feature in your own applications.
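
A minimal sketch of such a reconnect check, to be called before each request. The connection check and the connect call are passed in as function pointers because the concrete XDK calls depend on the application. `ensureConnected` and the stubs are hypothetical names, not part of the XDK API.

```c
#include <stdbool.h>
#include <stdio.h>

typedef bool (*WifiCheck_T)(void);     /* returns true if the link is up */
typedef bool (*WifiConnect_T)(void);   /* returns true if connecting worked */

/* Stub link that comes up after a configurable number of connect attempts. */
static bool g_linkUp = false;
static unsigned g_attemptsUntilUp = 2u;

static bool stubIsConnected(void) { return g_linkUp; }
static bool stubConnect(void)
{
    if (g_attemptsUntilUp > 0u) {
        g_attemptsUntilUp--;
    }
    if (g_attemptsUntilUp == 0u) {
        g_linkUp = true;
    }
    return g_linkUp;
}

/* Returns true if the link is up, reconnecting at most maxAttempts times. */
static bool ensureConnected(WifiCheck_T isConnected, WifiConnect_T connect,
                            unsigned maxAttempts)
{
    if (isConnected()) {
        return true;
    }
    for (unsigned attempt = 1u; attempt <= maxAttempts; attempt++) {
        printf("WiFi down, reconnect attempt %u of %u\r\n", attempt, maxAttempts);
        if (connect() && isConnected()) {
            return true;
        }
        /* A real task would wait here before retrying, e.g. with vTaskDelay. */
    }
    return false;
}
```

If `ensureConnected` returns false, the task can skip the request for this cycle and try again later instead of sending into a dead connection.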

Kind regards,
Manuel

10/17/17 5:51 PM as a reply to Manuel Cerny.

Hi Manuel,

I'm still having some strange behavior, but I think I've made progress. I'd like to discuss further and I can send you some information to give you more context. You can email me at brennan.ruthardt@c-labs.com

10/18/17 3:23 PM as a reply to Brennan Ruthardt.

Hello Brennan,

We would like to keep the discussion regarding this topic in the forum. This will benefit other users who have similar issues and are looking for answers via the search function. Were you able to verify whether the XDK is still connected to the WiFi when the issue occurs? You can double-check this by pinging the XDK's IP address from the terminal/command line interface of a computer in the same network.

You can print the IP-address to the console by using the following code after connecting to the WiFi network:

// if not already included, include BCDS_NetworkConfig.h

NetworkConfig_IpSettings_T myIp;
NetworkConfig_GetIpSettings(&myIp);

// insert a delay here, if the IP is not properly printed
printf("The IP was retrieved: %u.%u.%u.%u \r\n",
        (unsigned int) (NetworkConfig_Ipv4Byte(myIp.ipV4, 3)),
        (unsigned int) (NetworkConfig_Ipv4Byte(myIp.ipV4, 2)),
        (unsigned int) (NetworkConfig_Ipv4Byte(myIp.ipV4, 1)),
        (unsigned int) (NetworkConfig_Ipv4Byte(myIp.ipV4, 0)));
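
To illustrate what the byte indices in the printf above do, here is a small stand-alone sketch. It assumes the address is held in a 32-bit value whose byte 3 is the first octet, which matches the order of the arguments above but is still an assumption about the XDK's internal layout. `ipv4Byte` and `ipv4ToString` are hypothetical helpers, not XDK functions.

```c
#include <stdint.h>
#include <stdio.h>

/* Extract byte `index` (0 = least significant) from a 32-bit address. */
static uint8_t ipv4Byte(uint32_t ip, unsigned index)
{
    return (uint8_t) ((ip >> (8u * index)) & 0xFFu);
}

/* Format the address as dotted decimal, first octet taken from byte 3.
 * Returns the value of snprintf, i.e. the length of the full string. */
static int ipv4ToString(uint32_t ip, char *out, size_t outSize)
{
    return snprintf(out, outSize, "%u.%u.%u.%u",
            (unsigned int) ipv4Byte(ip, 3),
            (unsigned int) ipv4Byte(ip, 2),
            (unsigned int) ipv4Byte(ip, 1),
            (unsigned int) ipv4Byte(ip, 0));
}
```

For example, the value 0xC0A8010A formats as 192.168.1.10 under this assumption.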

Additionally, I will contact you directly.

Kind regards,
Franjo
