ESP32+IDF: Unstable operation (resetting,hanging,hacking attack,backdoor)

dondu 1938 26
Content translated from Polish to English.
  • #1 17957118
    dondu
    Moderator on vacation ...
    Hi.

    I have started exploring the ESP-WROOM-32 using the original ESP-IDF in the Stable version: https://docs.espressif.com/projects/esp-idf/en/stable/index.html

    The version I have:

    [image]

    Based on the original esp_http_client examples: https://docs.espressif.com/projects/esp-idf/e.../api-reference/protocols/esp_http_client.html.
    I have prepared two modules.

    Both modules send HTTP requests to a server at cba.pl, where a PHP script receives the data and stores it in a database:
    - module A - every 10 seconds
    - module B - every 15 minutes

    Module A is very unstable: it resets itself several times a day (sometimes every 2 minutes, sometimes it runs for many hours without a reset). Sometimes it also hangs completely, and only the reset button helps. I suspect that it may not be able to keep up with the requests, particularly when the cba.pl server holds it up.

    Module B is working much more stably, but it has also reset a few times over a few days.

    Both modules are the same version and both are new.
    Both modules have a good power supply, plus module A has a 220uF LOW ESR capacitor soldered in.

    I'll paste the program code tomorrow, as I don't have access to the computer I wrote it on. It is possible that I have written something incorrectly.

    I have not deliberately implemented a watchdog yet.

    My question: what is your experience with the stability of the ESP32 under the original IDF? Have you had any problems?
  • Helpful post
    #2 17957192
    khoam
    Level 42  
    dondu wrote:
    Module A is very unstable: it resets itself several times a day (sometimes every 2 minutes, sometimes it runs for many hours without a reset).

    Probably caused by a hardware watchdog (WD). It is worth "catching" these resets; I can usually trace them to specific lines of source code with high probability. I use PlatformIO (PIO) with VS Code, so such diagnostics go (almost) automatically to the serial port monitor. I don't know which IDE you use, but I would first focus on configuring the monitor to catch such situations; the IDF code itself contains many ESP_LOGE and ESP_LOGW calls.
    Link

    dondu wrote:
    For now, I have not deliberately implemented a watchdog yet.

    They are already there :) At most you can turn them off, but this is not recommended.
  • #3 17960368
    dondu
    Moderator on vacation ...
    I have the monitor on. I do use ESP_LOGE, ESP_LOGW, etc.; they were in the example I adapted for my own purposes, so I recognized them right away, along with the description on the page you pointed to.

    khoam wrote:
    dondu wrote:
    For now, I have not deliberately implemented a watchdog yet.

    They are already there :) At most you can turn them off, but this is not recommended.


    Indeed, I checked the settings in menuconfig and the watchdog is enabled.

    Because of the resets and, worse, the hangs, today I changed the IDF version from Latest to Stable, which is intended for production use. When I first installed the IDF a couple of weeks ago I chose Latest, and when I resumed work later I thought I had Stable :)

    I did this on module A, which was adding records to the database every 10 seconds and was resetting and hanging the most. I'll leave it running for a few days again and see if there is a difference.

    Since it can be assumed that the WD is causing the resets, I will add code to detect this. There are two types of WD.

    However, I am more concerned about the module hanging.

    I still have too little time to quickly determine what is causing the problems :(
  • #4 17960517
    khoam
    Level 42  
    dondu wrote:
    However, I am more concerned about the module hanging.

    The hang can be caused by a so-called core panic (a crash). This case can be diagnosed too: Link
  • #5 17961381
    dondu
    Moderator on vacation ...
    Have you seen an example anywhere of dumping to flash and then analysing that data after the ESP32 reboots?
  • #6 17961394
    khoam
    Level 42  
    If it is program data, then preferably to SPIFFS. You mount the /spiffs partition and can then operate on files there using fgets(), fputs(), etc.
    Examples are here: https://github.com/espressif/esp-idf/tree/a20d02b7f/examples/storage/spiffs
    Unfortunately you cannot create directories.

    Added after 4 [minutes]:

    Here is another example: https://github.com/loboris/ESP32_spiffs_example
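A minimal sketch of the file approach described above, using only standard C stdio. It assumes the SPIFFS partition has already been mounted at /spiffs (in ESP-IDF this is done with esp_vfs_spiffs_register(), which is omitted here). The helper names, the file name and the record format are hypothetical and do not come from the thread.

```c
#include <stdio.h>
#include <string.h>

/* Append one text line to a log file. Returns 0 on success, -1 on failure.
 * On the ESP32 the path would be something like "/spiffs/diag.log"
 * (hypothetical name), with the partition mounted beforehand. */
int diag_log_append(const char *path, const char *line)
{
    FILE *f = fopen(path, "a");
    if (f == NULL) {
        return -1;
    }
    int ok = (fputs(line, f) >= 0) && (fputs("\n", f) >= 0);
    if (fclose(f) != 0) {
        ok = 0;   /* a failed close may mean the data was not written */
    }
    return ok ? 0 : -1;
}

/* Copy the last line of the log into buf, without the trailing newline.
 * Returns 0 if a line was found, -1 if the file is missing or empty.
 * Lines longer than 127 characters are read in pieces, so only their
 * tail would be returned; keep diagnostic records short. */
int diag_log_read_last(const char *path, char *buf, size_t buf_size)
{
    if (buf_size == 0) {
        return -1;
    }
    FILE *f = fopen(path, "r");
    if (f == NULL) {
        return -1;
    }
    int found = 0;
    char tmp[128];
    while (fgets(tmp, sizeof tmp, f) != NULL) {
        tmp[strcspn(tmp, "\r\n")] = '\0';   /* strip the line ending */
        strncpy(buf, tmp, buf_size - 1);
        buf[buf_size - 1] = '\0';
        found = 1;
    }
    fclose(f);
    return found ? 0 : -1;
}
```

Because only fopen(), fputs() and fgets() are used, the same functions can be tried on a PC with an ordinary file path before moving them to the module.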
  • #7 17961408
    dondu
    Moderator on vacation ...
    Perhaps I can clarify. We have two cases:

    1. resetting,
    2. the module hanging despite the watchdog being enabled by default.

    In both cases, I would like the ESP to be able to analyse what happened after a reset and react accordingly (e.g. send the relevant data to the database). The second case is especially important.
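One way to recognise the first case after a restart is to read the reset reason at boot and include it in the next record sent to the database. The sketch below assumes an ESP-IDF version that provides esp_reset_reason() in esp_system.h; the helper names reset_reason_name() and reset_was_abnormal() are hypothetical. So that the logic can also be compiled on a PC, a stand-in enum is used when ESP_PLATFORM is not defined; it reproduces only the names, and its numeric values are an assumption.

```c
#include <stdbool.h>

#ifdef ESP_PLATFORM
#include "esp_system.h"   /* esp_reset_reason(), esp_reset_reason_t */
#else
/* Host-side stand-in so the helpers can be compiled and tested on a PC.
 * Names follow esp_system.h; the numeric values are an assumption. */
typedef enum {
    ESP_RST_UNKNOWN, ESP_RST_POWERON, ESP_RST_EXT, ESP_RST_SW,
    ESP_RST_PANIC, ESP_RST_INT_WDT, ESP_RST_TASK_WDT, ESP_RST_WDT,
    ESP_RST_DEEPSLEEP, ESP_RST_BROWNOUT, ESP_RST_SDIO
} esp_reset_reason_t;
#endif

/* Short text label suitable for a database column or URL parameter. */
const char *reset_reason_name(esp_reset_reason_t r)
{
    switch (r) {
    case ESP_RST_POWERON:   return "POWERON";
    case ESP_RST_EXT:       return "EXT";
    case ESP_RST_SW:        return "SW";
    case ESP_RST_PANIC:     return "PANIC";
    case ESP_RST_INT_WDT:   return "INT_WDT";
    case ESP_RST_TASK_WDT:  return "TASK_WDT";
    case ESP_RST_WDT:       return "WDT";
    case ESP_RST_DEEPSLEEP: return "DEEPSLEEP";
    case ESP_RST_BROWNOUT:  return "BROWNOUT";
    case ESP_RST_SDIO:      return "SDIO";
    default:                return "UNKNOWN";
    }
}

/* True for resets that indicate a fault rather than a normal start. */
bool reset_was_abnormal(esp_reset_reason_t r)
{
    switch (r) {
    case ESP_RST_PANIC:
    case ESP_RST_INT_WDT:
    case ESP_RST_TASK_WDT:
    case ESP_RST_WDT:
    case ESP_RST_BROWNOUT:
        return true;
    default:
        return false;
    }
}
```

On the target, esp_reset_reason() would be called once early in app_main() and the resulting label added to the HTTP request. A hang that is only cleared with the reset button is not covered: that restart will typically be reported as a power-on reset, so it cannot be told apart from a normal start this way.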
  • Helpful post
    #8 17961435
    khoam
    Level 42  
    Ideally, download Kolban's book on the ESP32; it is free on the web: https://leanpub.com/kolban-ESP32
    It describes nicely how to configure the system to save core dumps to flash, along with many other useful things related to debugging. Unless, that is, you already have it but haven't read it yet ;)

    Added after 5 [minutes]:

    There is also a video version :)
    https://www.youtube.com/watch?v=MpD_3oVJAEs&a...e&list=PLB-czhEQLJbWMOl7Ew4QW1LpoUUE7QMOo
  • #9 17961594
    dondu
    Moderator on vacation ...
    Yes, of course I have it, but I only skimmed it for 5 minutes and set it aside, focusing on the API description on the manufacturer's website. It didn't occur to me to look in the book for this topic, which I will now do :)

    I did not come across this video on YT - thank you for the link.
  • #10 17968721
    dondu
    Moderator on vacation ...
    khoam wrote:
    .
    Unfortunately this is a dump to the UART, not to flash.

    In general, changing the IDF to Stable 3.2 has ensured stable operation of the modules, but ...

    ... it depends on the module version:

    So far I have used ESP32 modules in the version with the white board on the underside:

    [image]

    On Saturday I tried others, which are black on the underside. When I uploaded the BLINK example there was no problem, but when I started using WiFi transmission, e.g. in station mode (increased current demand), the modules reported a brownout reset. I checked the supply voltage directly on the VCC pin of the ESP32 with an oscilloscope, and this is indeed the problem.

    Probably the tantalum capacitor on the board is too small or of poor quality. So I added a 220 µF low-ESR electrolytic capacitor (220 mΩ) directly on the ESP32 pins and now it is OK. I checked two such modules and both had the same problem.


    The white module has been running for 5 days, has already added over 43200 records (one every 10 seconds) to the database on my server, and still works fine. This is the version with the white board underneath, without the extra capacitor.
  • #11 17968804
    khoam
    Level 42  
    dondu wrote:
    Unfortunately this is a dump to the UART, not to flash.

    "Kolban's book on ESP32" - Chapter "Debugging", subsection "Core Dump Processing" and beyond - in my case page 122 :)

    dondu wrote:
    ... it depends on the module version:

    This is why I don't use off-the-shelf modules, but a devboard, as shown below:

    [image]
  • #12 17968809
    dondu
    Moderator on vacation ...
    khoam wrote:
    dondu wrote:
    Unfortunately this is a dump to the UART, not to flash.

    "Kolban's book on ESP32" - Chapter "Debugging", subsection "Core Dump Processing" and beyond - in my case page 122 :)

    I've read it, but there is no example of how to read the dump back from flash later.

    By the way please have another look here: https://www.elektroda.pl/rtvforum/topic3583050.html
  • #14 17968830
    dondu
    Moderator on vacation ...
    I've seen that too, but it only points to a Python script.
    I'm looking for an example in C where a running ESP after a reset could analyse the core dump stored in its own flash.
  • #15 17968897
    khoam
    Level 42  
    dondu wrote:
    I've seen that too, but it only points to a Python script.
    I'm looking for an example in C where a running ESP after a reset could analyse the core dump stored in its own flash.

    I don't really understand. Do you want to add a function to the ESP32 program that, after a core dump and reboot, loads the core dump from the flash partition, then decodes it and displays the result? Where should it display it?
  • #16 17968928
    dondu
    Moderator on vacation ...
    Yes, except that I wrote nothing about displaying, only about an appropriate response by the ESP32:

    dondu wrote:
    In both cases, I would like the ESP to be able to analyse what happened after a reset and react accordingly (e.g. send the relevant data to the database).
  • #19 17969028
    khoam
    Level 42  
    This is fairly generic information and is unlikely to indicate where in the code the critical error occurs.
    You can, however, register your own function with esp_register_shutdown_handler() and write diagnostic information before the reset occurs.
  • #20 17969053
    dondu
    Moderator on vacation ...
    khoam wrote:
    This is fairly generic information and is unlikely to indicate where in the code the critical error occurs.

    For the time being, merely detecting the reason must be enough for me.

    khoam wrote:
    You can, however, register your own function with esp_register_shutdown_handler() and write diagnostic information before the reset occurs.

    But here it is important to check whether this is safe. Depending on the cause (e.g. a brownout or a stack-related fault), such a write operation may be unsafe.
  • #21 17971825
    khoam
    Level 42  
    dondu wrote:
    So far I have used ESP32 modules in the version with the white board on the underside:
    On Saturday I tried others, which are black on the underside.

    Which chip versions are on these two types of modules? ESP32-S or ESP-WROOM-32?
  • #22 17972456
    dondu
    Moderator on vacation ...
    khoam wrote:
    Which chip versions are on these two types of modules? ESP32-S or ESP-WROOM-32?

    Both versions of the board have an ESP-WROOM-32 soldered on (as pictured in the first post).
  • #23 17989697
    dondu
    Moderator on vacation ...
    An interesting case ... A hacking attack, a backdoor, or what is the cause?

    ... I am testing 16 of the above modules deployed in different locations (different ISPs, different routers and, of course, different end-user IPs, all in one area within a 10 km radius). The Internet links from providers to routers also vary, from fibre to WiFi.

    Every minute, each of them adds a record of control data, including the reason for the last reset, to the database on the cba.pl server. It is a simple program based on the original esp_http_client example: https://docs.espressif.com/projects/esp-idf/e.../api-reference/protocols/esp_http_client.html

    A couple of days ago there was an incident, the cause of which I cannot explain.

    Three of these 16 modules reset at nearly the same time, within 30 seconds of each other, with ESP_RST_INT_WDT ("Reset (software or hardware) due to interrupt watchdog") as the cause.

    The Interrupt Watchdog Timeout is set to 300ms.

    All three:
    - have the exact same software uploaded, written using IDF Stable 3.2,
    - are not connected to each other in any way (separate network, separate internet provider),
    - were launched on different days and at different times,
    - do not communicate with each other,
    - apart from this incident, have been working stably for several days, each adding 1440 records a day to the database.

    The database was ruled out as the culprit, because at that time the other dozen or so ESPs were adding records to the database correctly.

    So I kept monitoring the modules and ... today 4 others (also unrelated) did the same thing: within 30 seconds of each other they reset, reporting ESP_RST_INT_WDT.

    Both events occurred between 5 and 6am.

    An attempted hacking attack?
    The Chinese have a backdoor in the ESP32? :)
    ... or perhaps some more mundane reason?
    Where to look for the cause?
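Whether such coincidences recur can be checked mechanically on the records already in the database. The sketch below finds the largest number of distinct modules that reported a reset inside any 30-second window. The record layout (module id plus a timestamp in seconds, sorted by time) and all names are hypothetical, since the thread does not show the database schema.

```c
#include <stddef.h>

#define MAX_MODULES 16   /* number of deployed modules in the test */

typedef struct {
    int  module_id;      /* 0 .. MAX_MODULES-1 (hypothetical numbering) */
    long timestamp;      /* seconds, e.g. Unix time of the reset record */
} reset_event_t;

/* Returns the largest number of distinct modules that reset within any
 * window of window_s seconds. ev must be sorted by timestamp, ascending.
 * If best_start is not NULL, it receives the start time of that window. */
int max_simultaneous_resets(const reset_event_t *ev, size_t n,
                            long window_s, long *best_start)
{
    int best = 0;
    for (size_t i = 0; i < n; i++) {
        int seen[MAX_MODULES] = {0};
        int distinct = 0;
        /* Scan forward while events still fall inside the window. */
        for (size_t j = i;
             j < n && ev[j].timestamp - ev[i].timestamp <= window_s;
             j++) {
            int id = ev[j].module_id;
            if (id >= 0 && id < MAX_MODULES && !seen[id]) {
                seen[id] = 1;
                distinct++;
            }
        }
        if (distinct > best) {
            best = distinct;
            if (best_start != NULL) {
                *best_start = ev[i].timestamp;
            }
        }
    }
    return best;
}
```

A result of 3 or 4 for a 30-second window, as in the two incidents described, would point to a shared external trigger rather than independent faults; comparing the window start times across days would show whether the events cluster around the same hour.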
  • #24 17989928
    khoam
    Level 42  
    dondu wrote:
    An attempted hacking attack?
    The Chinese have a backdoor in the ESP32?

    You should first verify the operation of several such modules with the same software on a WiFi network without internet access.
    As for hacking, or systematic attempts at surveillance of any device on the internet, the Chinese would have to join a rather long queue :) It's also a good idea to check the logs on the internet access routers themselves - you know the 'suspicious' time frame, so it shouldn't be that difficult.
  • #25 17989937
    dondu
    Moderator on vacation ...
    khoam wrote:
    It's also a good idea to check the logs on the internet access routers themselves - you know the 'suspicious' time frame, so it shouldn't be that difficult.

    Unfortunately:
    dondu wrote:
    ... I am testing 16 of the above modules deployed in different locations (different ISPs, different routers and, of course, different end-user IPs, all in one area within a 10 km radius). The Internet links from providers to routers also vary, from fibre to WiFi.

    Hence I do not have access to logs from routers that are not mine.
    The only one I have access to unfortunately only stores a small amount of information.


    khoam wrote:
    You should first verify the operation of several such modules with the same software on a WiFi network without internet access.

    That is a good tip; in my spare time I'll set up a database and a PHP interpreter and run a test.
  • #26 17989945
    khoam
    Level 42  
    dondu wrote:
    The only one I have access to unfortunately only stores a small amount of information.

    Alternatively, you could insert a sniffer in the IP subnet where the ESP is and collect logs of the traffic to and from the ESP that way.
  • #27 20976179
    Wodz23
    Level 11  
    Hello
    To keep the ESP32 from hanging, all you need to do is bypass the UART circuit. Feed 3.3 V directly to the 3.3 V pin from a separate, good power supply.
    Use the USB connector only for programming and do not power the chip from it.
    Adding capacitors doesn't do much, but it is good to have them. You can add a 10 kΩ resistor between +3.3 V and EN (RESET)
    and a 1 µF electrolytic capacitor (I used 22 µF) from EN to GND. After this change, the ESP32 starts immediately when power is applied.
    It works without hang-ups, connects to the network almost instantly rather than after a minimum of 30 seconds as before, and runs stably.

    Regards.

Topic summary

The discussion revolves around the unstable operation of ESP32 modules, specifically the ESP-WROOM-32, when using the ESP-IDF framework. The user reports frequent resets and hangs in Module A, which sends HTTP requests every 10 seconds, while Module B operates more stably with a 15-minute interval. Responses suggest that the issues may stem from hardware watchdog (WD) resets, power supply problems, or core panics. Recommendations include monitoring logs for errors, adjusting the IDF version to Stable, and ensuring adequate power supply with additional capacitors. The conversation also touches on potential hacking concerns, with suggestions to analyze network traffic and check router logs. A final solution proposed involves bypassing the UART circuit and directly powering the ESP32 to improve stability.
Summary generated by the language model.